Pros & Cons
The network to analyze must be undirected, otherwise there are no special constraints.
Basic analysis tool, not particular for special disciplines or problems.
The algorithm requires two inputs, the file where the edges of the network are listed and the number of points for the binned distribution described below. A first read-in of the inputfile will set the values of the number of nodes and edges of the network. In the second read-in the degree of all nodes is calculated and the edges are stored in an array. Then the clustering coefficients for all nodes are calculated and listed in one of the three output files, together with the corresponding node index. The average of the clustering coefficients is determined and displayed in the NWB console.
Next the distribution of the clustering coefficients is calculated. The program generates two other output files, corresponding to two different ways of partitioning the interval spanned by the values of the clustering coefficient. In the first output, one divides the range of variation of the clustering coefficient in equal bins and determines the number of nodes with clustering coefficients inside each bin: the scores are then divided by the number of nodes of the network, so to obtain the probability: the output displays such probability values together with the center points of the bins they refer to.
The second output gives the binned distribution, i.e. the interval spanned by the values of the clustering coefficient is divided into bins whose size grows while going to higher values of the variable. The size of each bin is obtained by multiplying by a fixed number the size of the previous bin. The program calculates the fraction of nodes whose clustering coefficient falls within each bin. Because of the different sizes of the bins, these fractions must be divided by the respective bin size, to have meaningful averages.
This technique is particularly suitable to study skewed distributions: the fact that the size of the bins grows large for large values of the variable compensates for the fact that not many nodes have high values, so it suppresses the fluctuations that one would observe by using bins of equal size. On a double logarithmic scale, which is very useful to determine the possible power law behavior of the distribution, the points of the latter will appear equally spaced on the x-axis.
The algorithm runs in a time , where is the number of nodes of the network and is the average degree squared.
A simple application of this algorithm could be to calculate the clustering coefficient and its distribution for networks created by the modeling algorithms of the NWB. For instance, the inputfile can be created through the Barabasi-Albert model.
The algorithm was implemented and documented by S. Fortunato, integrated by S. Fortunato and W. Huang. For the description we acknowledge Wikipedia.
Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.