Main path analysis

Main path analysis was first proposed by Hummon and Doreian[1]. It is a mathematical tool for identifying the major paths in a directed acyclic graph (DAG), typically a citation network. The method begins by measuring the significance of every link in a citation network through the concept of ‘traversal count’ and then sequentially chains the most significant links into a ‘main path’, which is deemed the most significant historical path in the target DAG. The method is applicable to any human activity that can be organized in the form of a DAG. The most common applications are tracing the knowledge flow paths or development trajectories of a science or technology field through bibliographic citations or patent citations[2][3][4]. It has also been applied to judicial decisions to trace the evolution of legal opinion[5].

History

TBW

The method

Main path analysis operates in two steps. The first step obtains the traversal count of each link in a DAG; several algorithms for calculating traversal counts are described in the literature. The second step searches for the main paths by chaining together the significant links according to the size of their traversal counts. Hereafter, the method is explained assuming that the DAG is a citation network.
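The second step can be illustrated with a minimal sketch. It assumes one simple search strategy, chosen here for illustration and not necessarily the one used in the cited literature: start from a node with no incoming links and repeatedly follow the outgoing link with the largest traversal count until a node with no outgoing links is reached. The function name `greedy_main_path` and the traversal counts are made up for the example, and ties are broken arbitrarily.

```python
def greedy_main_path(counts):
    """Chain links from a start node to an end node, always taking the largest count.

    `counts` maps each link (tail, head) to its traversal count.
    """
    tails = {tail for tail, _ in counts}
    heads = {head for _, head in counts}
    start_nodes = tails - heads  # nodes with no incoming links

    # Begin with the heaviest link leaving any start node.
    link = max((lk for lk in counts if lk[0] in start_nodes), key=counts.get)
    path = [link[0], link[1]]

    # Keep extending while the last node still has outgoing links.
    while path[-1] in tails:
        link = max((lk for lk in counts if lk[0] == path[-1]), key=counts.get)
        path.append(link[1])
    return path


# Made-up traversal counts for a hypothetical six-link network.
counts = {
    ("A", "B"): 3, ("A", "C"): 2, ("B", "C"): 2,
    ("B", "E"): 1, ("C", "D"): 3, ("C", "E"): 2,
}
main_path = greedy_main_path(counts)
print(main_path)  # ['A', 'B', 'C', 'D']
```

Because the graph is acyclic, the loop always terminates. A search that also follows tied links would return several main paths instead of one.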

Citation Network

A citation network is constructed xxx.

Several terms are defined here before proceeding further.

Heads and tails: Heads are the nodes that the direction arrow leads to. Tails are the nodes on the other end of the direction arrow.

Sources and sinks: Sources are the nodes that are cited, but cite no others. Sinks cite other nodes, but are not cited.

Ancestors and descendants: The ancestors of a node are the nodes that can be reached by tracing the links backward from it. The descendants of a node are the nodes that can be reached by following the links forward from it.
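The terms above can be made concrete with a short sketch. It assumes a made-up six-link network in which each link is stored as a (tail, head) pair pointing from the cited node to the citing node; the node labels and helper names are illustrative only.

```python
from collections import defaultdict

# Hypothetical toy citation network; each link (tail, head) points from
# the cited node to the citing node.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "E"), ("C", "D"), ("C", "E")]

successors = defaultdict(set)    # tail -> set of heads
predecessors = defaultdict(set)  # head -> set of tails
for tail, head in links:
    successors[tail].add(head)
    predecessors[head].add(tail)

nodes = {node for link in links for node in link}
sources = {n for n in nodes if not predecessors[n]}  # cited, but cite no others
sinks = {n for n in nodes if not successors[n]}      # cite others, but are not cited


def reachable(start, neighbours):
    """Return every node reachable from `start` by repeatedly following `neighbours`."""
    seen, stack = set(), [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def ancestors(node):
    return reachable(node, predecessors)


def descendants(node):
    return reachable(node, successors)


print(sorted(sources), sorted(sinks))  # ['A'] ['D', 'E']
print(sorted(ancestors("C")))          # ['A', 'B']
print(sorted(descendants("B")))        # ['C', 'D', 'E']
```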

Traversal counts

A traversal count measures the significance of a link. The literature discusses several types of traversal counts, including search path count (SPC), search path link count (SPLC), search path node pair (SPNP), and other variations.

Search path count (SPC)

A link’s SPC is the number of times the link is traversed if one runs through all possible paths from all the sources to all the sinks. SPC was first proposed by Vladimir Batagelj[6].
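As a rough illustration, SPC can be computed without listing every path. The sketch below relies on the observation that each source-to-sink path through a link consists of a path from a source to the tail, the link itself, and a path from the head to a sink, so the two path counts can be multiplied. The toy network and the function names are assumptions made for the example.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical toy citation network; links point from cited to citing node.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "E"), ("C", "D"), ("C", "E")]

successors, predecessors = defaultdict(list), defaultdict(list)
for tail, head in links:
    successors[tail].append(head)
    predecessors[head].append(tail)


@lru_cache(maxsize=None)
def paths_from_sources(node):
    """Number of distinct paths that start at a source and end at `node`."""
    if not predecessors[node]:
        return 1  # `node` is itself a source
    return sum(paths_from_sources(p) for p in predecessors[node])


@lru_cache(maxsize=None)
def paths_to_sinks(node):
    """Number of distinct paths that start at `node` and end at a sink."""
    if not successors[node]:
        return 1  # `node` is itself a sink
    return sum(paths_to_sinks(s) for s in successors[node])


spc = {(t, h): paths_from_sources(t) * paths_to_sinks(h) for t, h in links}
for link, value in spc.items():
    print(link, value)
```

In this network there are five source-to-sink paths (A-B-C-D, A-B-C-E, A-B-E, A-C-D, A-C-E); the link A→B lies on three of them, so its SPC is 3.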

Search path link count (SPLC)

A link’s SPLC is the number of times the link is traversed if one runs through all possible paths from all the ancestors of the tail node (including the tail node itself) to all the sinks. SPLC was first proposed by Hummon and Doreian[1].
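A minimal sketch of SPLC under the same multiplication idea: count the paths that end at the tail and start at the tail or any of its ancestors, then multiply by the number of paths from the head to a sink. The toy network and the function names are assumptions made for the example.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical toy citation network; links point from cited to citing node.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "E"), ("C", "D"), ("C", "E")]

successors, predecessors = defaultdict(list), defaultdict(list)
for tail, head in links:
    successors[tail].append(head)
    predecessors[head].append(tail)


@lru_cache(maxsize=None)
def paths_from_ancestors(node):
    """Paths ending at `node` that start at `node` itself or at any ancestor."""
    # The leading 1 counts the trivial path that starts at `node` itself.
    return 1 + sum(paths_from_ancestors(p) for p in predecessors[node])


@lru_cache(maxsize=None)
def paths_to_sinks(node):
    """Number of distinct paths that start at `node` and end at a sink."""
    if not successors[node]:
        return 1  # `node` is itself a sink
    return sum(paths_to_sinks(s) for s in successors[node])


splc = {(t, h): paths_from_ancestors(t) * paths_to_sinks(h) for t, h in links}
for link, value in splc.items():
    print(link, value)
```

For the link C→D, the qualifying paths are A-B-C-D, A-C-D, B-C-D and C-D, giving an SPLC of 4.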

Search path node pair (SPNP)

A link’s SPNP is the number of times the link is traversed if one runs through all possible paths from all the ancestors of the tail node (including the tail node itself) to all the descendants of the head node (including the head node itself). SPNP was first proposed by Hummon and Doreian[1].
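SPNP can be sketched in the same way, with the paths on the head side now allowed to stop at the head or at any of its descendants rather than only at sinks. The toy network and the function names are assumptions made for the example.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical toy citation network; links point from cited to citing node.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "E"), ("C", "D"), ("C", "E")]

successors, predecessors = defaultdict(list), defaultdict(list)
for tail, head in links:
    successors[tail].append(head)
    predecessors[head].append(tail)


@lru_cache(maxsize=None)
def paths_from_ancestors(node):
    """Paths ending at `node` that start at `node` itself or at any ancestor."""
    return 1 + sum(paths_from_ancestors(p) for p in predecessors[node])


@lru_cache(maxsize=None)
def paths_to_descendants(node):
    """Paths starting at `node` that end at `node` itself or at any descendant."""
    return 1 + sum(paths_to_descendants(s) for s in successors[node])


spnp = {(t, h): paths_from_ancestors(t) * paths_to_descendants(h) for t, h in links}
for link, value in spnp.items():
    print(link, value)
```

For the link B→C there are two ways to arrive at B (A-B, or starting at B) and three ways to leave C (stopping at C, C-D, or C-E), giving an SPNP of 6.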

The Variants

TBW

Key-route approach

Decay diffusion approach

Applications

TBW

Bibliographic citation

Patent citation

Judicial case citation

Software Implementation

Main path analysis is implemented in Pajek, a widely used freeware program written by Vladimir Batagelj and Andrej Mrvar.

References

  1. Hummon, Norman P.; Doreian, Patrick. "Connectivity in a citation network: The development of DNA theory". Social Networks. 11 (1): 39–63. doi:10.1016/0378-8733(89)90017-8.
  2. Liu, John S.; Lu, Louis Y.Y.; Lu, Wen-Min; Lin, Bruce J.Y. "Data envelopment analysis 1978–2010: A citation-based literature survey". Omega. 41 (1): 3–15. doi:10.1016/j.omega.2010.12.006.
  3. Verspagen, Bart (2007-03-01). "Mapping technological trajectories as patent citation networks: a study on the history of fuel cell research". Advances in Complex Systems. 10 (01): 93–115. doi:10.1142/S0219525907000945. ISSN 0219-5259.
  4. Lucio-Arias, Diana; Leydesdorff, Loet (2008-10-01). "Main-path analysis and path-dependent transitions in HistCite™-based historiograms". Journal of the American Society for Information Science and Technology. 59 (12): 1948–1962. doi:10.1002/asi.20903. ISSN 1532-2890.
  5. Liu, John S.; Chen, Hsiao-Hui; Ho, Mei Hsiu-Ching; Li, Yu-Chen (2014-12-01). "Citations with different levels of relevancy: Tracing the main paths of legal opinions". Journal of the Association for Information Science and Technology. 65 (12): 2479–2488. doi:10.1002/asi.23135. ISSN 2330-1643.
  6. Batagelj, V. (2003). Efficient algorithms for citation network analysis. arXiv preprint cs/0309023.