Message-ID: <71622274.11909.1606414735429.JavaMail.confluence@wiki.cns.iu.edu> Subject: Exported From Confluence MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_Part_11908_1919195623.1606414735428" ------=_Part_11908_1919195623.1606414735428 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Content-Location: file:///C:/exported.html K-Nearest Neighbor (Java)

# K-Nearest Neighbor (Java)

###### Description
=20

The K-Nearest-Neighbor is a measure of the correlation between = the degree of a node and that of its neighbors. In many systems strong corr= elations are observed, and one distinguishes networks into assortative<= /em> and disassortative. A network is assortative if large (small)= degree nodes tend to be linked with large (small) degree nodes. Social net= works are typical examples of assortative networks. A network is disassorta= tive if large (small) degree nodes tend to be linked with small (large) deg= ree nodes. Biological and technological networks (like the Internet) are ex= amples of disassortative networks.
The undirected K-Nearest-Neighbor is= defined as follows: a node is selected and the average degree of all its n= eighbors is calculated. By repeating the procedure for all nodes of the net= work one derives a pair for each node, where is the degree of the node. By = averaging over nodes with equal degree one derives the function , which all= ows to study the correlation.
If grows with , the network is assortativ= e; if decreases with , the network is disassortative. A flat curve would in= dicate the absence of correlation.
In the absence of correlations, the = function would be the same as for a random network with equal degree distri= bution . In this case it is possible to prove that

=20

where is the second moment of the distribution (i.e. the expected value = of the degree squared) and is the average degree of the network. For this r= eason, the function calculated by the algorithm is normalized in that we di= vide it by the constant .

=20
###### Pros & Cons
=20

The network to analyze must be undirected, otherwise there are no specia= l constraints.

=20
###### Applications
=20

The algorithm is used to disclose affinities/diversities between neighbo= ring nodes. Many properties of networks and of processes that take place on= networks are affected by the presence of degree-degree correlations.

= =20
###### Implementation Det= ails
=20

The algorithm requires two inputs, the file where the edges of the netwo= rk are listed and the number of points one wishes to have in the binned cor= relation function described below. A first read-in of the inputfile will se= t the values of the number of nodes and edges of the network. In the second= read-in the degrees of all nodes will be calculated and the edges are stor= ed in an array. Then the of all nodes are calculated.
The program gener= ates two output files, corresponding to two different ways of partitioning = the interval spanned by the values of degree. In the first output, the is a= veraged among all nodes with equal degree; the output displays all degree v= alues with their average s (normalized as described in the Description sect= ion).
The second output gives the binned correlation function,= i.e. the interval spanned by the values of degree is divided into bins who= se size grows while going to higher values of the variable. The size of eac= h bin is obtained by multiplying by a fixed number the size of the previous= bin. The program calculates the average for all nodes whose degree falls w= ithin each bin.
This technique is particularly suitable to study skewed= correlation functions: the fact that the size of the bins grows large for = large degree values compensates for the fact that not many nodes have high = degree values, so it suppresses the fluctuations that one would observe by = using bins of equal size. On a double logarithmic scale the points of will = appear equally spaced on the x-axis.
The program runs in a time O(m), m= being the number of edges of the network.

=20
###### Usage Hints
=20

A simple application of this algorithm could be to calculate the for net= works created by the modeling algorithms of the NWB. For instance, the inpu= tfile can be created through the Barabasi-Albert model.

=20
=20
=20
###### Acknowledgements
= =20

The algorithm was implemented and documented by S. Fortunato, integrated= by S. Fortunato and W. Huang.

=20
###### References
=20

Pastor-Satorras, R., Vazquez, A., Vespignani, A. (2001) Dynamical and Correlation Properties of the Intern= et.
Physical Review Letters 87:258701.

=20