Data Preparation > Database > ISI > Extract Authors By Year
Documents in ISI datasets often specify a publication year and a set of authors. From these, the number of works each author published in each year can be inferred.
This algorithm extracts from an ISI database the names of authors, the years in which they authored documents included in the given dataset, and the number of works they authored in each of those years. The result is a table with the following columns:
- Author: The author label is generated from the author's full name, if supplied. If the full name was not supplied, the label falls back to the author's abbreviated name, and then to the author's e-mail address. If none of these was supplied, a generic label is generated.
- Year: One of potentially several years in which the corresponding author authored works listed in the ISI dataset.
- Author ID: The primary key of the author's People Table record in the ISI database.
- Number Of Documents: The number of works listed in the ISI dataset that the corresponding author authored in the corresponding year.
- Citations Received In This Year: This author's citation count for this year only. This count includes all of this author's works, including those written in other years.
- Times Documents From This Year Are Cited: This author's citation count for papers written in this year only. This count disregards all of this author's works not written in this year.
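The label fallback and the Number Of Documents column can be illustrated with a minimal Python sketch. This is not the query the tool runs: the record layout (`year`, `author_ids`), the function names, and the generic label text `"Unknown author"` are all assumptions made for illustration, and the citation columns are left out.

```python
from collections import Counter


def author_label(full_name=None, abbreviated_name=None, email=None):
    """Return the first non-empty value in the fallback order:
    full name, then abbreviated name, then e-mail address."""
    for candidate in (full_name, abbreviated_name, email):
        if candidate and candidate.strip():
            return candidate.strip()
    # Hypothetical generic label; the tool's actual wording is not documented here.
    return "Unknown author"


def count_documents_by_author_year(documents):
    """Count how many documents each author has in each year.

    `documents` is a list of dicts with the assumed keys
    'year' (int) and 'author_ids' (list of author primary keys).
    """
    counts = Counter()
    for doc in documents:
        # set() so an author listed twice on one document is counted once
        for author_id in set(doc["author_ids"]):
            counts[(author_id, doc["year"])] += 1
    return counts


# Made-up sample records, for illustration only.
documents = [
    {"year": 2005, "author_ids": [1, 2]},
    {"year": 2005, "author_ids": [1]},
    {"year": 2006, "author_ids": [2]},
]
counts = count_documents_by_author_year(documents)
```

Each key of `counts` is an (Author ID, Year) pair, which corresponds to one row of the result table; the value is that row's Number Of Documents.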
Load an ISI file into the tool, then create a database from it using the ISI database loader.
It is strongly recommended that the database be cleaned before extracting the authors by year.
For a quick analysis of a small dataset, you may wish to merge author entities with identical names. For a scientifically sound analysis of a larger dataset, you can find author entity merging suggestions (or manually set your own merging orders from scratch) and then perform the merge.
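The idea behind merging identically named author entities can be sketched as follows. This is only an illustration, not the tool's merge algorithm: the input layout (a mapping of author ID to name), the function name, the choice of the lowest ID as the surviving record, and the case- and whitespace-insensitive comparison are all assumptions.

```python
from collections import defaultdict


def merge_identical_names(authors):
    """Map every author ID to the ID that survives the merge.

    `authors` is an assumed dict of author_id -> name. Names are compared
    after lowercasing and collapsing whitespace (an assumption; a stricter
    comparison would use the raw strings).
    """
    groups = defaultdict(list)
    for author_id, name in authors.items():
        key = " ".join(name.lower().split())
        groups[key].append(author_id)

    merge_map = {}
    for ids in groups.values():
        survivor = min(ids)  # arbitrary but deterministic choice of surviving record
        for author_id in ids:
            merge_map[author_id] = survivor
    return merge_map


# Made-up sample records, for illustration only.
authors = {1: "Smith, J", 2: "smith,  j", 3: "Jones, A"}
merge_map = merge_identical_names(authors)
```

Here records 1 and 2 collapse into record 1, while record 3 is left alone. Two different people who share a name would be merged as well, which is why this shortcut suits only quick analyses of small datasets.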
The specific query run by the tool can be found in the source code.