The Google Citation User ID Search algorithm is implemented in Java. It is a web reader algorithm that retrieves the Google Citation user ID for each specified author by querying Google Author Search directly.
Warning: Google restricts crawling activity on Google Scholar search. To avoid having your IP address blocked by Google, search for no more than 100 records per day with this algorithm. However, Google does allow unlimited requests for the remaining Google citation algorithms (Attach Citation Indices from Google Scholar, and Attach Citation Table from Google Scholar) once the user IDs are known. The limitations on crawling activity for Google Scholar are listed in the http://scholar.google.com/robots.txt file.
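The 100-records-per-day recommendation can be enforced on the caller's side before any query is sent. The following is a minimal sketch only: the class DailyLookupBudget and its methods are hypothetical names introduced for illustration and are not part of the algorithm itself.

```java
import java.time.LocalDate;

/** Hypothetical helper (not part of the algorithm): caps lookups per calendar day. */
final class DailyLookupBudget {
    private final int maxPerDay;
    private LocalDate day;
    private int used;

    DailyLookupBudget(int maxPerDay, LocalDate today) {
        this.maxPerDay = maxPerDay;
        this.day = today;
        this.used = 0;
    }

    /** Records one lookup and returns true, or returns false if today's budget is spent. */
    synchronized boolean tryAcquire(LocalDate today) {
        if (!today.equals(day)) {
            // A new calendar day resets the counter.
            day = today;
            used = 0;
        }
        if (used >= maxPerDay) {
            return false;
        }
        used++;
        return true;
    }

    /** Lookups still available for the most recently seen day. */
    synchronized int remaining() {
        return maxPerDay - used;
    }

    public static void main(String[] args) {
        // 100 is the daily limit recommended in the warning above.
        DailyLookupBudget budget = new DailyLookupBudget(100, LocalDate.now());
        String[] authors = {"Author One", "Author Two"}; // placeholder names
        for (String author : authors) {
            if (!budget.tryAcquire(LocalDate.now())) {
                System.out.println("Daily budget exhausted; stopping before " + author);
                break;
            }
            System.out.println("Would look up: " + author);
        }
    }
}
```

The counter lives in memory, so it only protects a single run. A budget that must survive restarts would need to be persisted, for example in a small file.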
Given a table with a column of author names as input, the algorithm returns a table containing the Google Citation user ID and associated information for each author. When multiple results are found for one author, they share the same unique index, and '*' marks the default author. Once the user IDs are retrieved, the following algorithms can be invoked:
The algorithm takes 2 parameters.
The results will be generated into a CSV file with the following fields:
Because Google limits usage, users are responsible for preventing their own IP from being blacklisted by Google. Once the Google Citation user IDs are retrieved, it is safe to invoke the remaining Google Citation algorithms without limitation.
The Google Citation User ID algorithm retrieves the Google Citation user ID for the given author names. The retrieved user ID can then be used to query the specified author's h-index, citation information, etc.
The Google Citation User ID Search algorithm is implemented in Java. Each given author name is wrapped in a Google Scholar Search query, and the returned results are parsed into a table. A query may return multiple results, and users are responsible for deleting duplicated results.
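The query-wrapping and parsing steps might look roughly like the sketch below. It is an illustration under stated assumptions, not the algorithm's actual code: the class AuthorQuerySketch, its method names, the endpoint in BASE, and the "user=" link parameter are all assumptions and are not taken from this page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of wrapping an author name into a query and reading a user ID. */
final class AuthorQuerySketch {
    // Assumed author-search endpoint; the URL the algorithm really uses is not documented here.
    static final String BASE =
        "https://scholar.google.com/citations?view_op=search_authors&mauthors=";

    // Assumed shape of a profile link: the ID is carried in a "user" query parameter.
    private static final Pattern USER_PARAM = Pattern.compile("[?&]user=([^&#]+)");

    /** Normalizes whitespace in the name and URL-encodes it into a search URL. */
    static String buildQueryUrl(String authorName) {
        String normalized = authorName.trim().replaceAll("\\s+", " ");
        try {
            return BASE + URLEncoder.encode(normalized, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Returns the value of the "user" parameter in a link, or null if it has none. */
    static String extractUserId(String href) {
        Matcher m = USER_PARAM.matcher(href);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(buildQueryUrl("Albert Einstein"));
        System.out.println(extractUserId("/citations?user=EXAMPLE_ID&hl=en")); // made-up link
    }
}
```

This sketch only builds the URL and reads an ID out of a link. It sends no request, so running it does not count against the daily search limit.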
Please read the Description section before continuing, to avoid having your machine's IP blacklisted by Google.