Delta modularity calculation (singleton)
π¨ This page is a work in progress. TODO: Maybe add a graphic here to explain the various variables.
In the modularity optimization phase, we only rely on local information. Recalculating the global modularity value for every possible neighbor of each vertex would significantly degrade the performance of the Louvain algorithm. This is why Blondel et al. employ a delta modularity formula that can be applied to millions of nodes. In this section, we will explain and derive the formula and show how it can be simplified for usage in a program.
We have seen here that the modularity for a partition is given by:
We are only intrested in the contribution of a single community to the overall modularity. With a slight abuse of notation, let denote modularity of community 1, such that . Therefore, for the modularity of a community , we get:
Next, we look closely at the process of βmovingβ a vertex to the community of one of its neighbors . We assume the vertex to be located in a singleton community. The formula Blondel et al. presented is for exactly this case, where we remove from its singleton community and then insert it into the neighborβs community. This is actually only applicable for the first iteration of the modularity optimization phase where in the beginning of every new pass we deal with singleton communities. The authors state that they use a similar formula to calculate the modularity change when is removed from any community (also those with more than one vertex in it), but do not reveal the expression used. We will therefore derive the generalized version in the next section.
For now, let us consider the case where we move a vertex from its singleton community that we will denote by to any other community . The modularity change (delta) is then given by
where is the quality of community before the merge (and hence without vertex in that community) and is the quality of after the merge (and hence after vertex was integrated into community ). As the singleton community does not exist anymore after the merge, we also subtract the modularity of the singleton community . In the following, let the prime symbol always denote the state of a variable after the merge.
With the above formula, this gives us:
which is the equation presented in the original Louvain paper. If we consider that after the merging process, the sum of the weights of the edges between node and community β that we will denote2 by β now adds to the community accommodating vertex , we obtain
Likewise, we find that the weighted vertex degree adds to the total weighted vertex degree of after the merge, thus:
With this, we can simplify our expression for the modularity change. Note that we use , as there are no edges inside a singleton community since we do not allow self-loops in the original graph.
Finally, we have
We can ignore the constant since we only compare different modularity increases with each other and therefore merely require a relative measure. This saves us one division in the algorithm. After one complete pass, the new global modularity is calculated using the formula from here3.
Our short formula is used in the algorithm to efficiently calculate the delta modularity gain . To remove a vertex from its previous community or to insert it into a new community , only and have to be updated. is adjusted for the global modularity calculation and not for the delta modularity . Note that we can precalculate the sum of the weights of edges between and community or ( and ) before calling the remove
- or insert
-function. This is crucial for the speed of Louvain as has to be computed frequently. As stated above, we also dropped
the factor to save one division. For the calculation of global modularity, we do not omit the factor in order to obtain the absolute global modularity .
We can distinguish the two functions by looking at the argument, which is either a community or a partition , i.e. a set of communities .
This is done in conformity with these notes on modularity. The arrow does not indicate a direction; we still deal with an undirected graph.
In newer versions of the algorithm, this is not the case anymore and we only use the relative modularity calculation. TODO