Girvan–Newman algorithm: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 06:27, 30 October 2009 edit Michael Hardy (talk \| contribs) Administrators 210,577 edits No edit summary ← Previous edit		Latest revision as of 19:55, 12 October 2024 edit undo JJMC89 bot III (talk \| contribs) Bots, Administrators 4,309,085 edits m Moving Category:Algorithms in graph theory to Category:Graph algorithms per Wikipedia:Categories for discussion/Log/2024 October 4 Tag: Manual revert
(27 intermediate revisions by 27 users not shown)
Line 1: {{short description\|Community detection algorithm}} The '''Girvan–Newman algorithm''' is one of the methods used to detect [[communities]] in [[complex system]]s.<ref name=newman>Girvan M. and Newman M. E. J., [http://dx.doi.org/10.1073/pnas.122653799 Community structure in social and biological networks], Proc. Natl. Acad. Sci. USA '''99''', 7821–7826 (2002)</ref> The notion of a "community structure" is related to that of clustering, though it isn't quite the same. A community consists of a subset of nodes within which the node-node connections are dense, and the edges to nodes in other communities are less dense.<ref name=newman/> There are numerous alternative method for detecting communities in networks. These include [[Hierarchical Clustering \| hierarchical clustering]], partitioning graphs to maximize quality functions such as [[Modularity (networks)\|network modularity]], [[clique percolation method\|''k''-clique percolation]], etc. The '''Girvan–Newman algorithm''' (named after [[Michelle Girvan]] and [[Mark Newman]]) is a hierarchical method used to detect [[Community structure\|communities]] in [[complex system]]s.<ref name=newman>Girvan M. and Newman M. E. J., [https://dx.doi.org/10.1073/pnas.122653799 Community structure in social and biological networks], Proc. Natl. Acad. Sci. USA '''99''', 7821–7826 (2002)</ref> == Edge betweenness and community structure == The Girvan–Newman algorithm detects communities by progressively removing edges from the original network. The connected components of the remaining network are the communities. Instead of trying to construct a measure that tells us which edges are the most central to communities, the Girvan–Newman algorithm focuses on edges that are most likely "between" communities. [[Betweenness centrality\|Vertex betweenness]] is an indicator of highly [[centrality\|central]] nodes in networks. For any node <math>i</math>, vertex betweenness is defined as the fraction of shortest paths between pairs of nodes that run through it. It is relevant to models where the network modulates transfer of goods between known start and end points, under the assumption that such transfer seeks the shortest available route. The hierarchical clustering method is based on assigning a weight for every edge and placing these edges into an initially empty network, starting from edges with strong weights and progressing towards the weakest ones. The edges with the greatest weights within the community are the most central ones. Although traditional in community detection, the method presents some pathologies. One of them for instance, is the inability to classify in a community a node which is connected to the network with only one edge. The Girvan–Newman algorithm ~~works~~extends this definition to the ~~opposite~~case ~~way.~~of ~~Instead~~edges, defining the "edge betweenness" of ~~trying~~an toedge ~~construct~~as the number of shortest paths between pairs of nodes that run along it. If there is more than one shortest path between a ~~measure~~pair of nodes, each path is assigned equal weight such that ~~tells~~the ustotal ~~which~~weight ~~edges~~of ~~are~~all of the ~~most~~paths ~~central~~is equal to ~~communities,~~unity. itIf ~~focuses~~a onnetwork ~~these~~contains ~~edges~~communities or groups that are ~~least~~only ~~central,~~loosely ~~the~~connected by a few inter-group edges, ~~that~~then ~~are~~all ~~most~~shortest paths "between" different communities must go along one of these few edges. ~~The~~Thus, the edges connecting communities ~~are~~will ~~detected~~have byhigh ~~progressively~~edge betweenness (at least '''one''' of them). By removing these edges ~~from~~, the ~~original~~groups ~~graph,~~are ~~rather~~separated ~~than~~from byone ~~adding~~another ~~the~~and ~~strongest~~so ~~edges~~the tounderlying ancommunity ~~initially~~structure ~~empty~~of the network is revealed. Vertex betweenness has been studied in the past as a measure of the [[Centrality \| centrality]] and influence of nodes in networks. For any node <math>i</math>, vertex betweenness is defined as the number of shortest paths between pairs of nodes that run through it. It is a measure of the influence of a node over the flow of information between other nodes, especially in cases where information flow over a network primarily follows the shortest available path. The Girvan–Newman algorithm extends this definition to the case of edges, defining the "edge betweenness" of an edge as the number of shortest paths between pairs of nodes that run along it. If there is more than one shortest path between a pair of nodes, each path is assigned equal weight such that the total weight of all of the paths is equal to unity. If a network contains communities or groups that are only loosely connected by a few intergroup edges, then all shortest paths between different communities must go along one of these few edges. Thus, the edges connecting communities will have high edge betweenness (at least '''one''' of them). By removing these edges, the groups are separated from one another and so the underlying community structure of the network is revealed. The algorithm's steps for community detection are summarized below # The betweenness of all existing edges in the network is calculated first. # The edge(s) with the highest betweenness isare removed. # The betweenness of all edges affected by the removal is recalculated. # Steps 2 and 3 are repeated until no edges remain. Line 21: == See also == * [[Closeness (mathematics)\|Closeness]] * [[Betweenness]] * [[Closeness]] * [[Hierarchical clustering]] * [[Modularity (networks)\|Modularity]] == References == Line 30 ⟶ 29: <references/> {{DEFAULTSORT:Girvan-Newman algorithm}} [[Category:Graph algorithms]] [[Category:Networks]] [[Category:Network analysis]]