Network scaling algorithms such as the Pathfi=
nder algorithm are used to prune many different kinds of networks, incl=
uding citation networks, random networks, and social networks. However, thi=
s algorithm suffers from run time problems for large networks and online pr=
ocessing due to its O(n^{4}) time complexity. Hence another alterna=
tive, Fast Pathfinder algorithm, is used which prunes the original network =
to get its PFNET(r,n-1) in just O(n^{3}) time. Since it essentially=
employs different algorithm than MST based PAthfinder scaling its resultsm=
ight differ from it.

The underlying idea of this approach is based on a classical algorithm i= n graph theory for shortest path computation - Floyd-Warshall's Shortest Path Algorithm.

=20There is a substantial speed increase due to the changes made in shortes=
t path distance matrix computation. This leads to a reduction in the *ti=
me complexity* from O(n^{4}) to O(n^{3}). Also the *=
space complexity* is drastically reduced from (2 * n) - 1 matrices in t=
he original algorithm to 2 matrices.

However, the user cannot control the ** q** paramet=
er, which constrains the number of indirect proximities examined in generat=
ing the network. The q parameter is an integer value between 2 and n-1, inc=
lusive where n is the number of nodes or items. Value of q is fixed to n-1.=
Also we have found that networks that have a low standard deviation for ed=
ge weights or if many of the edge weights are equal to the minimum edge wei=
ght, then the network might not be scaled as much as expected. This causes =
unweighted networks to be not scaled at all.

If the networks being processed are undirected then MST based Pathfinder Network scalin= g algorithm can be used. This will give results in 30 times faster than= Fast pathfinder algorithm. Also it does not face the disadvantages faced b= y Fast Pathfinder algorithm.

=20Once the input file is validated the algorithm is started. It works as f= ollows,

=20- =20
- Initialize matrices for edge weights & minimum distance costs with = dimensions being number of nodes x number of nodes. Initialize their defaul= t values as POSITIVE_INFINITY. =20
- When reading through the nodes, map node ids to matrix indexes. =20
- Next, when reading through the edges set weight & distance matrices= with edge weights corresponding to a pair of nodes. Depending upon users c= hoice of how edge weight should be represented, normalize the edge weight i= .e. If they chose similarity, normalize edge weights to be dissimilarity ba= sed by inverting the edge weight. =20
- Now run 3 nested "for" loops with indexes i, j, k such that it returns =
the shortest possible path from node i to node j using only vertices 1 thro=
ugh k as intermediate points along the way. Now, given this function, our g=
oal is to find the shortest path from each node i to each node j using only=
nodes 1 through k.=20
- =20
- For this, either the true shortest path only uses nodes in the set (1..= .k); or there exists some path that goes from node i to node k, then from k= to j that is better. =20
- Use Minkowski's distance, r parameter, to calculate the distance betwee= n the intermediate nodes. =20

=20
- Once the distances for all pairs of nodes are calculated, during genera= ting the output file, output only those edges whose distance & weight m= atrix values are equal. =20

The user has to provide 4 inputs; the file storing the information about=
the network, value of ** r** parameter, the column na=
me that represents the edge weight and how the edge weight should be consid=
ered (Dissimilarity or Similarity). The provided network should not contain=
edge weights of less than equal to 0 else a warning is generated and defau=
lt edge weight is assumed for that edge. The value of

- =20
- Source Code =20

The original Pathfinder Network Scaling algorithm authored by Roger Schvaneveldt et al. was=
adapted with a Fast based approach by *Arnaud Quirin et al.* is imp=
lemented, integrated and documented by Chintan Tank. Also thanks to Micah L=
innemeier and Russell Duhon for providing guidance during implementation. F=
or the description I acknowledge Wikipedia.

- =20
- Schvaneveldt, Roger. Pathfinder associative networks : Studies in Knowl= edge organization. 1990. Link =20
- Quirin, Arnaud., Cordon, Oscar., Guerrero-Bote, Vicente., Vargas-Quesad=
a, Benjamin., Moya-Anegnon, Felix. A quick MST-based algorithm to obtain Pa=
thfinder networks ((infinity), n - 1). In
*Journal of the American Socie= ty for Information Science and Technology*, volume 59, pages 1912-1924,= 2008. Link =20
- Quirin, Arnaud., Cordon, Oscar., Santamaria, J., Vargas-Quesada, Benjam=
in., Moya-Anegnon, Felix. A new variant of the Pathfinder algorithm to gene=
rate large visual science maps in cubic time. In
*Information processing= & management*, volume 44, pages 1611-1623, 2008. Link =20

The license could not be verified: License Certificate has expired!=20
Generate a Free license now.