Tarjan's strongly connected components algorithm

Tarjan's Algorithm (named for its discoverer, Robert Tarjan) is a graph theory algorithm for finding the strongly connected components of a graph. Although it precedes it chronologically, it can be seen as an improved version of Kosaraju's algorithm, and is comparable in efficiency to Gabow's algorithm.

Idea

The basic idea of the algorithm is this: a depth-first search begins from a start node. The strongly connected components form the subtrees of the search tree, the roots of which are the roots of the strongly connected components.

The nodes are placed on a stack in the order in which they are visited. When the search returns from a subtree, the nodes are taken from the stack and it is determined whether each node is the root of a strongly connected component. If a node is the root of a strongly connected component, then it and all of the nodes taken off before it form that strongly connected component.

The root property

The crux of the algorithm comes in determining whether a node is the root of a strongly connected component. To do this, each node is given a depth search index v.index, which numbers the nodes consecutively in the order in which they are discovered. In addition, each node is assigned a value v.lowlink that satisfies v.lowlink := min {v'.index: v' is reachable from v}. Therefore v is the root of a strongly connected component if and only if v.lowlink = v.index. The value v.lowlink is computed during the depth first search such that it is always known when needed.

The algorithm in pseudocode

Input: Graph G = (V, E)

index = 0                         // DFS node number counter 
S = empty                         // An empty stack of nodes
forall v in V do
  if (v.index is undefined)       // Start a DFS at each node
    tarjan(v)                     // we haven't visited yet

procedure tarjan(v)
  v.index = index                 // Set the depth index for v
  v.lowlink = index
  index = index + 1
  S.push(v)                       // Push v on the stack
  forall (v, v') in E do          // Consider successors of v
    if (v'.index is undefined)  // Was successor v' visited?
        tarjan(v')                // Recurse
        v.lowlink = min(v.lowlink, v'.lowlink)
    elif (v' is in S)           // Was successor v' in stack S? 
        v.lowlink = min(v.lowlink, v'.index)
  if (v.lowlink == v.index)       // Is v the root of an SCC?
    print "SCC:"
    repeat
      v' = S.pop
      print v'
    until (v' == v)

Remarks

Complexity: The tarjan procedure is called once for each node; the forall statement considers each edge at most twice. The algorithm's running time is therefore linear in the number of edges in G, ie $O(|V|+|E|)$ .

The test for whether v' is on the stack should be done in constant time, for example, by testing a flag stored on each node that indicates whether it is on the stack.

Literature

Robert Tarjan: Depth-first search and linear graph algorithms. In: SIAM Journal on Computing. Vol. 1 (1972), No. 2, P. 146-160.

Links

Description of Tarjan's Algorithm
Implementation of Tarjan's Algorithm in Java