- Loading Data (Files)
- Data Preparation
  - Text Files
- Preprocessing
- Analysis
  - Temporal
  - Geospatial
  - Topical
  - Networks
- Modeling
  - Networks
- R
- Visualization
**Facebook**

- Access Token - Allows the user to authenticate with Facebook, simplifying the process of using the other Facebook algorithms.
- Facebook Friends Data - Pulls the name, user ID, latest update, gender, current location, hometown, birth date, interests, religion, political views, relationship status, and attended events of all the friends in a user's network.
- Mutual Friends - Returns a CSV with every connection between a user and his or her friends, and calculates the number of mutual friends for each connection.

**Google Scholar**

- Google Citation User ID Search Algorithm - A web reader algorithm that retrieves the Google Citation user ID for the specified author.
- Attach Citation Indices from Google Scholar - Reads the citation indices from the Google Scholar profile for a given set of authors and creates a CSV file with the corresponding citation indices.
- Attach Citation Table from Google Scholar - Reads citations in the Google Scholar profile for a given author and returns the citation information in the form of a series of tables for all the queried Google Citation user IDs.

**Flickr Reader**

- Flickr Reader - Reads a list of Flickr user IDs from a CSV file loaded into Sci2 and attempts to gather the image URLs of every image those users uploaded.

**Twitter Reader**

- Twitter Reader - Reads a list of Twitter handles from a CSV file and attempts to pull their recent status updates and information about the individual status updates. The user may also specify hashtags.

*Text Files*

- Convert to Generic Publication - A plugin that takes as input a path to a .hmap properties file, then uses the header mapping in the file to standardize the headings in a table.
- Remove ISI Duplicate Records – Removes duplicate publications from ISI records based on the ISI Unique ID attribute.
- Remove Rows with Multitudinous Fields – Removes rows having at least N entries within a given field.

---------------------------------------------

- Extract Directed Network – Creates a directed network by placing a directed edge from the values in a given column to the values of a different column.
- Extract Bipartite Network – Creates an unweighted bipartite network by placing a directed edge from the values in a given column to the values of a different column.
- Extract Paper Citation Network – Extracts an unweighted directed network from papers to their citations.
- Extract Author Paper Network – Extracts an unweighted directed network from authors to their papers.
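The column-to-column extraction described above can be sketched in a few lines. This is an illustrative sketch only, not Sci2's implementation: the column names, the `|` delimiter, and the function name `extract_directed_edges` are assumptions made for the example.

```python
import csv
import io


def extract_directed_edges(rows, source_col, target_col, delimiter="|"):
    """Return the sorted, de-duplicated (source, target) edges of a table.

    Each cell may hold several values separated by `delimiter` (an assumed
    convention); every source value is linked to every target value in the
    same row.
    """
    edges = set()
    for row in rows:
        sources = [s.strip() for s in row[source_col].split(delimiter) if s.strip()]
        targets = [t.strip() for t in row[target_col].split(delimiter) if t.strip()]
        for source in sources:
            for target in targets:
                edges.add((source, target))
    return sorted(edges)


# Hypothetical two-column table: authors pointing to the journal they published in.
SAMPLE = """Authors,Journal
Smith|Lee,Nature
Lee,Science
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
edges = extract_directed_edges(rows, "Authors", "Journal")
print(edges)
```

Because the two node sets come from different columns, the same edge list can also be read as a bipartite network.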

---------------------------------------------

- Extract Co-Occurrence Network – Extracts a network from a delimited table.
- Extract Word Co-Occurrence Network – Creates a weighted network where each node is a word and edges connect words to each other. The strength of an edge represents how often two words occur together in the same body of text.
- Extract Co-Author Network – Extracts a weighted network with authors as nodes and edge weights as the number of times those authors co-wrote a paper.
- Extract Reference Co-Occurrence (Bibliographic Coupling) Network – Extracts a weighted network from a Paper Citation network, with papers as nodes and edge weights as the number of citations two papers share.
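As a rough illustration of how a weighted co-occurrence network such as the co-author network can be built, the sketch below counts how often each pair of authors appears on the same paper. The function name and the sample data are hypothetical, and Sci2's own handling of delimiters and name normalization is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def co_author_weights(papers):
    """Count co-authored papers for every unordered author pair.

    `papers` is a list of author lists, one list per paper.
    """
    weights = Counter()
    for authors in papers:
        # De-duplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return weights


papers = [["Ann", "Bob", "Cy"], ["Ann", "Bob"], ["Cy"]]
weights = co_author_weights(papers)
for pair, weight in sorted(weights.items()):
    print(pair, weight)
```

The same counting scheme applies to word co-occurrence if each inner list holds the words of one text instead of the authors of one paper.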

---------------------------------------------

- Extract Document Co-Citation Network – Extracts a weighted network from a Paper Citation network, with papers as nodes and edge weights as the number of times two papers are cited together.
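Bibliographic coupling and document co-citation are two views of the same paper citation network: the first counts references that two citing papers share, the second counts how often two cited papers appear in the same reference list. A minimal sketch, assuming the citation network is given as a dictionary from each paper to the set of papers it cites (the names and data are hypothetical):

```python
from collections import Counter
from itertools import combinations


def bibliographic_coupling(citations):
    """Weight each pair of citing papers by the number of references they share."""
    weights = Counter()
    for a, b in combinations(sorted(citations), 2):
        shared = len(citations[a] & citations[b])
        if shared:
            weights[(a, b)] = shared
    return weights


def co_citation(citations):
    """Weight each pair of cited papers by how often they are cited together."""
    weights = Counter()
    for references in citations.values():
        for a, b in combinations(sorted(references), 2):
            weights[(a, b)] += 1
    return weights


citations = {
    "P1": {"R1", "R2"},
    "P2": {"R1", "R2", "R3"},
    "P3": {"R3"},
}
coupling = bibliographic_coupling(citations)
cocited = co_citation(citations)
print(dict(coupling))
print(dict(cocited))
```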

---------------------------------------------

- Detect Duplicate Nodes – Cleans graph data by detecting and preparing to merge nodes that are likely to represent the same entity.
- Update Network by Merging Nodes – Creates a new network by running the algorithm with both the Merge Table from "*Detect Duplicate Nodes*" and the original network selected.

*General*

- Extract Top N% Records – Returns the top N% of the rows of a table by selecting the percentage of rows to keep and the column to sort by.
- Extract Top N Records – Returns the top N rows of a table by selecting the number of rows to keep and the column to sort by.
- Aggregate Data – Summarizes the input table by column, allowing the aggregation of values such as "Cited Reference Count," "Number of Pages," "Publication Year," "Times Cited," as well as values represented by many other delimiters.

*Temporal*

*Geospatial*

- Extract ZIP Code – Extracts a ZIP code from a given address.

*Topical*

- Reconciled Journal Names - Maps the given journal names to the equivalent journal names in the UCSD Map of Science standard.
- Lowercase, Tokenize, Stem, and Stopword Text – Replaces spaces and punctuation in a field with a standard delimiter of the user's choosing.

*Networks*

- Extract Top Nodes – Extracts the top N nodes from a graph, based on a given attribute.
- Extract Nodes Above or Below Value – Extracts nodes with an attribute above or below a certain value.
- Delete Isolates – Removes nodes that are not connected to any other node in the graph.
- Extract Top Edges – Extracts the top N edges from a graph, based on a given attribute.
- Extract Edges Above or Below Value – Extracts all edges with an attribute above or below a certain number from a graph.
- Remove Self Loops – Removes edges whose source and target nodes are the same from a graph.
- Trim by Degree – Deletes edges at random until each node has at most N edges.
- MST-Pathfinder Network Scaling – Prunes a network using the MST-Pathfinder algorithm.
- Fast Pathfinder Network Scaling – Prunes a network using the Fast Pathfinder algorithm.

---------------------------------------------

- Snowball Sampling (N nodes) – Picks a random node and traverses its edges iteratively until N nodes are extracted.
- Node Sampling – Extracts N random nodes and their intervening edges, and then deletes isolates.
- Edge Sampling – Extracts N random edges and their target and source nodes.
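Snowball sampling, as described above, can be sketched as a breadth-first expansion from a random start node that stops once N nodes have been collected. The adjacency-dictionary input, the `seed` parameter, and the function name are assumptions for this example; Sci2's traversal order may differ.

```python
import random


def snowball_sample(adjacency, n, seed=None):
    """Return up to `n` nodes reached by expanding outward from a random node.

    `adjacency` maps each node to the set of its neighbors.
    """
    rng = random.Random(seed)
    start = rng.choice(sorted(adjacency))
    sampled = [start]
    seen = {start}
    frontier = [start]
    while frontier and len(sampled) < n:
        next_frontier = []
        for node in frontier:
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    sampled.append(neighbor)
                    next_frontier.append(neighbor)
                    if len(sampled) == n:
                        return sampled
        frontier = next_frontier
    # Fewer than n nodes are returned when the start node's component is small.
    return sampled


graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
sample = snowball_sample(graph, 3, seed=0)
print(sample)
```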

---------------------------------------------

- Symmetrize – Turns a directed network into an undirected network.
- Dichotomize – Trims edges above, equal to, or below a certain value.
- Multipartite Joining – Joins a multipartite graph for one node type across another node type.
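The sketch below illustrates symmetrizing and dichotomizing on a weighted edge dictionary. Summing the weights of opposite edges and resetting kept edges to weight 1 are assumptions made for the example, not documented Sci2 behavior, and the function names are hypothetical.

```python
import operator


def symmetrize(edges):
    """Collapse directed weighted edges into undirected ones, summing weights."""
    undirected = {}
    for (u, v), weight in edges.items():
        key = tuple(sorted((u, v)))
        undirected[key] = undirected.get(key, 0) + weight
    return undirected


def dichotomize(edges, threshold, keep=operator.gt):
    """Keep edges whose weight satisfies `keep(weight, threshold)`, as weight 1."""
    return {edge: 1 for edge, weight in edges.items() if keep(weight, threshold)}


directed = {("a", "b"): 2, ("b", "a"): 3, ("b", "c"): 1}
undirected = symmetrize(directed)
strong_ties = dichotomize(undirected, 1)
print(undirected)
print(strong_ties)
```

Passing `operator.lt` or `operator.eq` as `keep` covers the "below" and "equal to" variants mentioned in the description.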

---------------------------------------------

- Merge 2 Networks – Merges identical networks. Emphasis is put on merging edge attributes.

*Temporal*

- Burst Detection – Determines periods of increased activity in a table with dates/timestamps.

*Geospatial*

- Geocoder – Converts place names to latitudes and longitudes.
- Congressional District Geocoder - Converts the given **9-digit U.S. ZIP codes (ZIP+4 codes)** into their congressional districts and geographical coordinates (latitude and longitude).
- Yahoo! Geocoder - Converts place names and addresses into latitudes and longitudes (requires a Yahoo! API key).

*Topical*

- Burst Detection – Determines periods of increased activity in a table with dates/timestamps.

*Networks*

- Network Analysis Toolkit (NAT) – Calculates basic network statistics, such as number of nodes (and isolated nodes), node attributes, number of edges, presence of self loops and parallel edges, average degree ("total," "in," and "out"), strength of component connections, and overall density.

*Unweighted & Undirected*

- Node Degree – Calculates the number of edges adjacent to a node, and then appends that value to each node.
- Degree Distribution – Builds a histogram of the degree values of all nodes.
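Node degree and the degree distribution can be computed directly from an undirected edge list. A minimal sketch, assuming a list of node pairs without self loops (the function names are hypothetical):

```python
from collections import Counter


def node_degrees(edges):
    """Count the edges adjacent to each node of an undirected edge list."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return dict(degrees)


def degree_distribution(degrees):
    """Map each degree value to the number of nodes that have it."""
    return dict(Counter(degrees.values()))


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = node_degrees(edges)
histogram = degree_distribution(degrees)
print(degrees)
print(histogram)
```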

---------------------------------------------

- K-Nearest Neighbor – Calculates the correlation between the degree of a node and that of its neighbors, and then appends that value to each node.
- Watts-Strogatz Clustering Coefficient – Calculates the degree to which nodes tend to cluster together, and then appends that value to each node.
- Watts Strogatz Clustering Coefficient over K – Correlates the clustering coefficient and the degree of the nodes of a network.
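The local clustering coefficient of a node is the fraction of pairs of its neighbors that are themselves connected. The sketch below computes it for an undirected network stored as an adjacency dictionary; assigning 0.0 to nodes with fewer than two neighbors is a convention assumed for the example.

```python
from itertools import combinations


def local_clustering(adjacency):
    """Return the local clustering coefficient of every node."""
    coefficients = {}
    for node, neighbors in adjacency.items():
        k = len(neighbors)
        if k < 2:
            coefficients[node] = 0.0
            continue
        # Count neighbor pairs that are directly linked to each other.
        links = sum(
            1 for u, v in combinations(sorted(neighbors), 2) if v in adjacency[u]
        )
        coefficients[node] = 2.0 * links / (k * (k - 1))
    return coefficients


# A triangle a-b-c with a pendant node d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
clustering = local_clustering(graph)
print(clustering)
```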

---------------------------------------------

- Diameter – Calculates the length of the longest shortest path between pairs of nodes in a network.
- Average Shortest Path – Calculates the average length of the shortest path between pairs of nodes in a network.
- Shortest Path Distribution – Builds a histogram of the lengths of shortest paths between pairs of nodes in a network.
- Node Betweenness Centrality – Appends a value to each node which correlates to the number of shortest paths that node resides on. The more shortest paths between node-pairs a certain node resides on, the higher its betweenness centrality.
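For an unweighted network, the diameter, the average shortest path, and the shortest path distribution all follow from breadth-first searches started at every node. A minimal sketch under that assumption, ignoring unreachable pairs (the function names are hypothetical):

```python
from collections import Counter, deque


def bfs_distances(adjacency, source):
    """Return hop counts from `source` to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def path_statistics(adjacency):
    """Return (diameter, average shortest path, histogram of path lengths)."""
    lengths = []
    for source in adjacency:
        for target, distance in bfs_distances(adjacency, source).items():
            if target != source:
                lengths.append(distance)
    if not lengths:
        return 0, 0.0, {}
    return max(lengths), sum(lengths) / len(lengths), dict(Counter(lengths))


# Path graph a-b-c-d.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
diameter, average, histogram = path_statistics(graph)
print(diameter, average, histogram)
```

Each connected pair is counted once per direction, which leaves the diameter and the average unchanged for an undirected network.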

---------------------------------------------

- Weak Component Clustering – Extracts the N largest weakly connected components of a network.
- Global Connected Components – Calculates the number of connected components, or subgraphs with a path between each pair of nodes.

---------------------------------------------

- Blondel Community Detection – Extracts a hierarchical community structure for a large network.
- Louvain Community Detection - Detects communities in large networks.
- Louvain Multilevel Refinement Community Detection - Detects communities in large networks.
- SLM Community Detection - Detects communities in large networks.

---------------------------------------------

- Extract K-Core – Extracts the kth K-Core from a graph. The kth K-Core is what remains of the graph after every node with fewer than k edges connected to it is removed from the graph recursively.
- Annotate K-Coreness – Appends to each node the K-Core that node belongs to.
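The recursive pruning that defines a K-Core can be sketched as repeated passes that drop every node with fewer than k remaining neighbors. This is an illustration on an adjacency dictionary with a hypothetical function name, not Sci2's code.

```python
def k_core(adjacency, k):
    """Return the adjacency of the k-core of an undirected network."""
    core = {node: set(neighbors) for node, neighbors in adjacency.items()}
    changed = True
    while changed:
        changed = False
        # Nodes that currently have fewer than k neighbors are removed.
        for node in [n for n, neighbors in core.items() if len(neighbors) < k]:
            for neighbor in core.pop(node):
                if neighbor in core:
                    core[neighbor].discard(node)
            changed = True
    return core


# A triangle a-b-c with a pendant node d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
two_core = k_core(graph, 2)
print(sorted(two_core))
```

Removing one node can push its neighbors below k, which is why the passes repeat until nothing changes.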

---------------------------------------------

- HITS – Computes authority and hub score for every node.

*Weighted & Undirected*

- Clustering Coefficient – Calculates the degree to which nodes tend to cluster together, and then appends that value to each node.
- Nearest Neighbor Degree - Determines the average nearest neighbor degree.
- Strength vs Degree - Determines the strength distribution.
- Degree & Strength - Determines the degree and strength of each node.
- Average Weight vs End-point Degree - Determines the average weight as a function of end-point degree.
- Strength Distribution - Determines the strength distribution.
- Weight Distribution - Determines the weight distribution.
- Randomize Weights - Redistributes the weights, while keeping the topology invariant.

---------------------------------------------

- Node Betweenness Centrality - Uses Brandes' algorithm to calculate the betweenness centrality of vertices.

---------------------------------------------

- Blondel Community Detection – Extracts a hierarchical community structure for a large network.
- Louvain Community Detection - Detects communities in large networks.
- Louvain Multilevel Refinement Community Detection - Detects communities in large networks.
- SLM Community Detection - Detects communities in large networks.

---------------------------------------------

- HITS – Computes authority and hub score for every node.

*Unweighted & Directed*

- Node Indegree – Appends the number of incoming edges to each node.
- Node Outdegree – Appends the number of outgoing edges to each node.
- Indegree Distribution – Builds a histogram of the values of the indegree of all nodes.
- Outdegree Distribution – Builds a histogram of the values of the outdegree of all nodes.

---------------------------------------------

- K-Nearest Neighbor – Calculates the correlation between the degree of a node and that of its neighbors, and then appends that value to each node.
- Single Node In-Out Degree Correlations – Calculates the correlations between indegree and outdegree of a node.

---------------------------------------------

- Dyad Reciprocity – The ratio of dyads with a reciprocated tie to dyads with any tie.
- Arc Reciprocity – The ratio of reciprocal edges to total edges.
- Adjacency Transitivity – The ratio of transitive triads to intransitive triads (triads missing one edge).
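The two reciprocity ratios defined above differ only in what they count: connected node pairs (dyads) or individual directed edges (arcs). A minimal sketch, assuming a directed edge list and ignoring self loops (the function name is hypothetical):

```python
def reciprocity(edges):
    """Return (dyad reciprocity, arc reciprocity) of a directed edge list."""
    arcs = {(u, v) for u, v in edges if u != v}
    if not arcs:
        return 0.0, 0.0
    reciprocated_arcs = sum(1 for u, v in arcs if (v, u) in arcs)
    # Every connected pair of nodes is one dyad, whatever the edge direction.
    dyads = {frozenset(arc) for arc in arcs}
    mutual_dyads = reciprocated_arcs // 2
    return mutual_dyads / len(dyads), reciprocated_arcs / len(arcs)


edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
dyad_ratio, arc_ratio = reciprocity(edges)
print(dyad_ratio, arc_ratio)
```

In this example one of three dyads is mutual, while two of four arcs have a counterpart in the opposite direction.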

---------------------------------------------

- Weak Component Clustering – Extracts the N largest weakly connected components of a network.
- Strong Component Clustering – Extracts the N largest strongly connected components of a network.

---------------------------------------------

- Blondel Community Detection – Extracts a hierarchical community structure for a large network.
- Louvain Community Detection - Detects communities in large networks.
- Louvain Multilevel Refinement Community Detection - Detects communities in large networks.
- SLM Community Detection - Detects communities in large networks.

---------------------------------------------

- Extract K-Core – Extracts the kth K-Core from a graph. The kth K-Core is what remains of the graph after every node with fewer than k edges connected to it is removed from the graph recursively.
- Annotate K-Coreness – Appends to each node the K-Core that node belongs to.

---------------------------------------------

- HITS – Computes authority and hub score for every node.
- PageRank – Ranks the importance of a node by how many other important nodes point to it.
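The idea that a node is important when important nodes point to it can be made concrete with a power iteration. The sketch below is a textbook-style PageRank, not Sci2's implementation; the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from nodes without outgoing edges are assumptions.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Return a PageRank score per node for a directed network.

    `out_links` maps a node to the list of nodes it points to.
    """
    nodes = sorted(set(out_links) | {t for ts in out_links.values() for t in ts})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is shared by all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = out_links.get(v, [])
            for target in targets:
                new_rank[target] += damping * rank[v] / len(targets)
        rank = new_rank
    return rank


links = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
print(ranks)
```

Node `c` receives links from both `a` and `b`, so it ends up with the highest score, and `a` inherits most of that score through the link from `c`.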

*Weighted & Directed*

- Blondel Community Detection – Extracts a hierarchical community structure for a large network.
- Louvain Community Detection - Detects communities in large networks.
- Louvain Multilevel Refinement Community Detection - Detects communities in large networks.
- SLM Community Detection - Detects communities in large networks.

---------------------------------------------

### Networks

- Random Graph – Generates a graph with a fixed number of nodes connected randomly by undirected edges.
- Watts-Strogatz Small World – Generates a graph in which most nodes are not *directly* connected to one another, but are still connected to one another via relatively few edges.
- Barabási-Albert Scale-Free – Generates a scale-free network by incorporating growth and preferential attachment.
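Growth and preferential attachment can be sketched as follows: nodes arrive one at a time, and each new node links to `m` existing nodes chosen with probability proportional to their current degree. The parameter names, the seeding scheme, and the function name are assumptions for this example, not the parameters of the Sci2 plugin.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Return the edge list of a preferential-attachment graph.

    Requires 1 <= m < n. Nodes are the integers 0 .. n-1.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # Each node appears in `pool` once per incident edge, so a uniform draw
    # from the pool selects nodes in proportion to their degree.
    pool = []
    targets = list(range(m))
    for new_node in range(m, n):
        for target in targets:
            edges.append((new_node, target))
        pool.extend(targets)
        pool.extend([new_node] * m)
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        targets = sorted(chosen)
    return edges


edges = barabasi_albert(50, 2, seed=1)
print(len(edges))
```

Every node after the first `m` contributes exactly `m` edges, so the graph has `(n - m) * m` edges in total.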

---------------------------------------------

- TARL (Topics, Aging and Recursive Linking) – Incorporates "aging" to generate bipartite coevolving networks of authors and papers. Can also be applied to other datasets with different aging distributions.

- Create an R Instance - Creates an R instance that is usable from the CIShell environment.
- Run Rgui - Opens the RGui for an already created R instance.
- Send a Table to R - Imports a CSV into a running R instance.
- Get a Table from R - Exports a CSV from a running R instance back to the Data Manager.

### General

- GnuPlot – Used to plot two-dimensional functions and data points in many different formats. For full documentation of this open-source software, please visit http://www.gnuplot.info/documentation.html

### Temporal

- Temporal Bar Graph
- Horizontal Bar Graph
- Horizontal Bar Graph (not included version) – Uses CSV (tabular) datasets (including NSF grant data and the output of burst detections) to visualize numeric data over time, generating labeled horizontal bars that correspond to records in the original dataset.

### Geospatial

- Proportional Symbol Map – Maps geospatial coordinates as circles that can be size- and color-coded in proportion to associated numeric data. The result is a PostScript file.
- Choropleth Map – Color-codes named regions on a geographical map in proportion to associated numeric data. The result is a PostScript file.
- Geospatial Network Layout with Base Map – Allows for the geospatial visualization of network data by producing a network file and a corresponding blank map.

### Topical

- Map of Science via Journals - The Map of Science is a visual representation of a network of 554 subdisciplines of science (grouped into 13 overarching disciplines) and their relationships to one another.
- Map of Science via 554 - Works exactly like Map of Science via Journals, except that instead of taking a collection of journal names as input and mapping them to the 554 fields, it directly takes the IDs of the 554 fields, which are integers from 1 to 554.

### Networks

- GUESS – Interactive data analysis and visualization tool.
- Gephi - Interactive data analysis and visualization tool at http://gephi.org.

---------------------------------------------

- Radial Tree/Graph (prefuse alpha) – A single node is placed at the center and all others are laid around it in a tree structure.
- Radial Tree/Graph with Annotation (prefuse beta) – A single node is placed at the center and all others are laid around it in a tree structure, with labels.
- Tree View (prefuse beta) – Visualizes directory hierarchies in a tree structure. Warning: Does not work on Macs.
- Tree Map (prefuse beta) – Visualizes hierarchies using the Treemap algorithm. Warning: Does not work on Macs.
- Force Directed with Annotation (prefuse beta) – Sorts randomly placed nodes into a more aesthetically pleasing visual layout.
- Fruchterman-Reingold with Annotation (prefuse beta) – Visualization which lays out nodes based on some force between them.

---------------------------------------------

- DrL (VxOrd) – A force-directed graph layout toolbox focused on real-world large-scale graphs.
- Specified (prefuse beta) – Visualization tool for use with graphs having pre-specified node coordinates.

---------------------------------------------

- Circular Hierarchy – Generates a circular visualization of the output produced by a multi-level aggregation method such as Blondel Community Detection. The result is a PostScript file.
- Bipartite Network Graph - Generates a bipartite network visualization of the output produced by the Extract Bipartite Network algorithm.