m →Definition: Add originally proposed note |
Streamlined text; corrected grammar and punctuation |
Line 2:
'''Cyclomatic complexity''' is a [[software metric]] used to indicate the [[Programming complexity|complexity of a program]]. It is a quantitative measure of the number of linearly independent paths through a program's [[source code]]. It was developed by [[Thomas J. McCabe, Sr.]] in 1976.
Cyclomatic complexity is computed using the [[control-flow graph]] of the program
One [[software testing|testing]] strategy, called [[basis path testing]] by McCabe, who first proposed it, is to test each linearly independent path through the program
url=http://users.csc.calpoly.edu/~jdalbey/206/Lectures/BasisPathTutorial/index.html|
title=Basis Path Testing|
Line 12:
===Definition===
[[Image:control flow graph of function with loop and an if statement without loop back.svg|thumb|upright=1.1|alt=See caption|A control-flow graph of a simple program. The program begins executing at the red node, then enters a loop (group of three nodes immediately below the red node). Exiting the loop, there is a conditional statement (group below the loop) and the program exits at the blue node. This graph has nine edges, eight nodes and one [[connected component (graph theory)|connected component]], so the program's cyclomatic complexity is {{math|1=9 − 8 + 2×1 = 3}}.]]
There are
Another way to define the cyclomatic complexity of a program is to look at
<math display="block">M = E - N + 2P,</math>
Line 25:
[[Image:control flow graph of function with loop and an if statement.svg|thumb|upright=1.1|The same function, represented using the alternative formulation in which each exit point is connected back to the entry point. This graph has ten edges, eight nodes and one [[connected component (graph theory)|connected component]], which also results in a cyclomatic complexity of 3 ({{math|1=10 − 8 + 1 = 3}}).<!-- Do not change this to 4. This has been done thrice, but 3 is the correct answer.-->]]
An alternative formulation of this, as originally proposed, is to use a graph in which each exit point is connected back to the entry point. In this case, the graph is [[strongly connected]]
<math display="block">M = E - N + P.</math>
This may be seen as calculating the number of [[linearly independent cycle]]s that exist in the graph: those cycles that do not contain other cycles within themselves. Because each exit point loops back to the entry point, there is at least one such cycle for each exit point.
For a single program (or subroutine or method), {{mvar|P}}
<math display="block">M = E - N + 2.</math>
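As a minimal sketch, the two formulations can be checked against the edge and node counts given in the figure captions. The function name and its parameters are illustrative assumptions, not part of McCabe's presentation.

```python
def cyclomatic_complexity(edges, nodes, components=1, exits_joined_to_entry=False):
    """Return M for a control-flow graph described only by its counts.

    Hypothetical helper: uses M = E - N + 2P for the plain graph and
    M = E - N + P when every exit point is connected back to the entry
    point (the strongly connected formulation).
    """
    if exits_joined_to_entry:
        return edges - nodes + components
    return edges - nodes + 2 * components


# Counts taken from the two figure captions of the same function.
print(cyclomatic_complexity(9, 8, 1))                               # 3
print(cyclomatic_complexity(10, 8, 1, exits_joined_to_entry=True))  # 3
```

Both calls describe the same function, so the two formulations agree on a value of 3.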
Cyclomatic complexity may be applied to several such programs or subprograms at the same time (to all of the methods in a class, for example)
McCabe showed that the cyclomatic complexity of a structured program with only one entry point and one exit point is equal to the number of decision points ("if" statements or conditional loops) contained in that program plus one. This is true only for decision points counted at the lowest, machine-level instructions.<ref>{{cite web|
url=https://www.froglogic.com/blog/tip-of-the-week/what-is-cyclomatic-complexity/| title=What exactly is cyclomatic complexity?|quote=To compute a graph representation of code, we can simply disassemble its assembly code and create a graph following the rules: ... |first=Sébastien|last=Fricker|date=April 2018|website=froglogic GmbH|access-date=October 27, 2018}}</ref> Decisions involving compound predicates, such as <code>IF cond1 AND cond2 THEN ...</code> in high-level languages, should be counted in terms of predicate variables involved
title=Encyclopedia of Computer Science and Technology|
author1=J. Belzer |author2=A. Kent |author3=A. G. Holzman |author4=J. G. Williams|
Line 42:
pages=367–368}}</ref>
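The "decision points plus one" rule, including the treatment of compound predicates, can be illustrated with a rough sketch for Python source code. This is an assumption-laden simplification, not McCabe's tooling: it only counts <code>if</code>, <code>while</code> and <code>for</code> statements, and adds one extra decision for each additional operand of an <code>and</code>/<code>or</code> expression. The names <code>decision_points</code> and <code>SAMPLE</code> are hypothetical.

```python
import ast


def decision_points(source: str) -> int:
    """Count decision points in Python source (simplified, illustrative)."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.While, ast.For)):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b" has two operands, so it adds one more decision
            # beyond the enclosing statement.
            count += len(node.values) - 1
    return count


def complexity(source: str) -> int:
    """Decision points plus one, for single-entry, single-exit code."""
    return decision_points(source) + 1


# A single-entry, single-exit function with one compound predicate.
SAMPLE = """
def f(cond1, cond2):
    result = 0
    if cond1 and cond2:
        result = 1
    return result
"""

print(decision_points(SAMPLE))  # 2
print(complexity(SAMPLE))       # 3
```

The compound condition is counted as two decision points, matching the count that would be obtained from the equivalent nested form.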
Cyclomatic complexity may be extended to a program with multiple exit points
<math display="block">\pi - s + 2,</math>
where <math>\pi</math> is the number of decision points in the program and {{mvar|s}} is the number of exit points.<ref name="ecst" /><ref name="harrison">{{cite journal | journal=Software: Practice and Experience | title=Applying Mccabe's complexity measure to multiple-exit programs | author=Harrison | date=October 1984 | doi=10.1002/spe.4380141009 | volume=14 | issue=10 | pages=1004–1007 | s2cid=62422337}}</ref>
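A short sketch of the multiple-exit formula follows. The function name is an illustrative assumption; the arithmetic is exactly <math>\pi - s + 2</math>.

```python
def multi_exit_complexity(decision_points: int, exit_points: int) -> int:
    """Return pi - s + 2 for a program with several exit points."""
    return decision_points - exit_points + 2


# With a single exit the formula reduces to "decision points plus one".
print(multi_exit_complexity(2, 1))  # 3
# Hypothetical program with three decision points and two exit points.
print(multi_exit_complexity(3, 2))  # 3
```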
==={{anchor|Explanation in terms of algebraic topology}}Algebraic topology===
An even subgraph of a graph (also known as an [[Eulerian path|Eulerian subgraph]]) is one
The set of all even subgraphs of a graph is closed under [[symmetric difference]], and may thus be viewed as a vector space over [[GF(2)]]
A basis for the cycle space is easily constructed by first fixing a [[Glossary of graph theory#Trees|spanning forest]] of the graph, and then considering the cycles formed by one edge not in the forest and the path in the forest connecting the endpoints of that edge
|last=Diestel
|first=Reinhard
Line 68:
which is read as "the rank of the first [[Homology (mathematics)|homology]] group of the graph ''G'' relative to the [[Tree (data structure)#Terminology|terminal nodes]] ''t''". This is a technical way of saying "the number of linearly independent paths through the flow graph from an entry to an exit", where:
* "linearly independent" corresponds to homology
* "paths" corresponds to first homology
* "relative" means the path must begin and end at an entry (or exit) point.
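The spanning-forest construction of a cycle-space basis can be sketched as follows. The graph is treated as undirected, the forest is built with a union-find structure, and every edge left out of the forest closes exactly one basis cycle. The example graph and all names are hypothetical and chosen only for illustration.

```python
def non_forest_edges(num_nodes, edges):
    """Return the edges that are not part of a greedily built spanning forest.

    Each returned edge, together with the forest path joining its endpoints,
    forms one cycle of a basis for the cycle space.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chords = []
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            chords.append((u, v))   # endpoints already connected: closes a cycle
        else:
            parent[root_u] = root_v  # edge joins two trees of the forest
    return chords


# Hypothetical graph: two triangles sharing node 2 (5 nodes, 6 edges, 1 component).
EDGES = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
chords = non_forest_edges(5, EDGES)
print(chords)       # [(2, 0), (4, 2)]
print(len(chords))  # 2, which equals E - N + P = 6 - 5 + 1
```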
Line 78:
=== Interpretation ===
In his presentation "Software Quality Metrics to Identify Risk"<ref>{{cite web |url=http://www.mccabe.com/ppt/SoftwareQualityMetricsToIdentifyRisk.ppt |title=Software Quality Metrics to Identify Risk |author=Thomas McCabe Jr. |year=2008 |archive-url=https://web.archive.org/web/20220329072759/http://www.mccabe.com/ppt/SoftwareQualityMetricsToIdentifyRisk.ppt |archive-date=2022-03-29 |url-status=live}}</ref> for the Department of Homeland Security, Tom McCabe
* 1–10: Simple procedure, little risk
Line 87:
==Applications==
===Limiting complexity during development===
One of McCabe's original applications was to limit the complexity of routines during program development
url=http://www.mccabe.com/pdf/mccabe-nist235r.pdf| title=Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric|author1=Arthur H. Watson |author2=Thomas J. McCabe | year=1996|publisher=NIST Special Publication 500-235}}</ref>
===Measuring the "structuredness" of a program===
{{Main|Essential complexity (numerical measure of "structuredness")}} <!-- please update the link when that article is split, as it should be -->
Section VI of McCabe's 1976 paper is concerned with determining what the control-flow graphs (CFGs) of non-[[structured programming|structured program]]s look like in terms of their subgraphs, which McCabe
===Implications for software testing===
Line 120:
[[File:Control flow graph of function with two if else statements.svg|thumb|250px|right|The control-flow graph of the source code above; the red circle is the entry point of the function, and the blue circle is the exit point. The exit has been connected to the entry to make the graph strongly connected.]]
In this example, two test cases are sufficient to achieve complete branch coverage, while four are necessary for complete path coverage. The cyclomatic complexity of the program is 3, as the strongly connected graph for the program contains 9 edges, 7 nodes, and 1 connected component ({{math|9 − 7 + 1}}).
In general, fully testing a module requires exercising all execution paths through it. A module with a high complexity number therefore requires more testing effort than a module with a lower value, since the higher number indicates more pathways through the code. This also implies that a module with higher complexity is more difficult
Unfortunately, it is not always practical to test all possible paths through a program. Considering the example above, each time an additional if-then-else statement is added, the number of possible paths grows by a factor of 2. As the program grows in this fashion, it quickly reaches the point where testing all of the paths becomes impractical.
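The gap between the two measures can be made concrete with a small sketch. It assumes a function made of <code>n</code> sequential, independent if-then-else statements, so that the path count doubles with each statement while the cyclomatic complexity (decision points plus one) grows by one. The function names are illustrative.

```python
def path_count(n_if_else: int) -> int:
    """Distinct execution paths through n sequential if-then-else statements."""
    return 2 ** n_if_else


def complexity(n_if_else: int) -> int:
    """Cyclomatic complexity of the same code: one decision per statement, plus one."""
    return n_if_else + 1


for n in (2, 10, 20):
    print(n, path_count(n), complexity(n))
# 2 4 3
# 10 1024 11
# 20 1048576 21
```

For two statements this reproduces the figures of the example above: four paths and a cyclomatic complexity of 3.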
One common testing strategy, espoused for example by the NIST Structured Testing methodology, is to use the cyclomatic complexity of a module to determine the number of [[white-box testing|white-box tests]] that are required to obtain sufficient coverage of the module. In almost all cases, according to such a methodology, a module should have at least as many tests as its cyclomatic complexity
As an example of a function that requires more than
* <code>c1()</code> returns true and <code>c2()</code> returns true
Line 141:
===Correlation to number of defects===
|journal=IEEE Transactions on Software Engineering|author1=Norman E Fenton |author2=Martin Neil |
url=http://www.eecs.qmul.ac.uk/~norman/papers/defects_prediction_preprint105579.pdf|
title=A Critique of Software Defect Prediction Models|
year=1999|volume=25|issue=3|pages=675–689|doi=10.1109/32.815326|citeseerx=10.1.1.548.2998 }}</ref> Some studies<ref name="schroeder99">{{cite journal| title=A Practical guide to object-oriented metrics|author=Schroeder, Mark|s2cid=14945518|year=1999|volume=1|issue=6|pages=30–36|journal=IT Professional |doi=10.1109/6294.806902}}</ref> find a positive correlation between cyclomatic complexity and defects
{{cite web |url=http://www.leshatton.org/TAIC2008-29-08-2008.html |title=The role of empiricism in improving the reliability of future software |author=Les Hatton |year=2008 |at=version 1.1}}</ref> that complexity has the same predictive ability as lines of code.
Studies that controlled for program size (i.e., comparing modules that have different complexities but similar size) are generally less conclusive, with many finding no significant correlation, while others do find a correlation. Some researchers question the validity of the methods used by the studies finding no correlation.<ref name="kan">
|