Cyclomatic complexity: Difference between revisions

Thashley (talk | contribs)
Streamlined text; corrected grammar and punctuation
Vstephen B (talk | contribs)
Line 1:
{{Short description|Measure of the structural complexity of a software program}}
'''Cyclomatic complexity''' is a [[software metric]] used to indicate the [[Programming complexity|complexity of a program]]. It is a quantitative measure of the number of linearly independent [[path (graph theory)|path]]s through a program's [[source code]]. It was developed by [[Thomas J. McCabe, Sr.]] in 1976.
 
Cyclomatic complexity is computed using the [[control-flow graph]] of the program. The nodes of the [[directed graph|graph]] correspond to indivisible groups of commands of a program, and a directed edge connects two nodes if the second command might be executed immediately after the first command. Cyclomatic complexity may also be applied to individual [[function (computer science)|function]]s, [[modular programming|modules]], [[method (computer science)|method]]s, or [[class (computer science)|class]]es within a program.
 
One [[software testing|testing]] strategy, called [[basis path testing]] by McCabe, who first proposed it, is to test each linearly independent path through the program. In this case, the number of test cases will equal the cyclomatic complexity of the program.<ref>{{cite web|
Line 12:
===Definition===
[[Image:control flow graph of function with loop and an if statement without loop back.svg|thumb|upright=1.1|alt=See caption|A control-flow graph of a simple program. The program begins executing at the red node, then enters a loop (group of three nodes immediately below the red node). Exiting the loop, there is a conditional statement (group below the loop) and the program exits at the blue node. This graph has nine edges, eight nodes and one [[connected component (graph theory)|connected component]], so the program's cyclomatic complexity is {{math|1=9 − 8 + 2×1 = 3}}.]]
There are multiple ways to define the cyclomatic complexity of a section of [[source code]]. One common way is the number of linearly independent [[path (graph theory)|paths]] within it. A set <math>S</math> of paths is linearly independent if the edge set of any path <math>P</math> in <math>S</math> is not the union of edge sets of the paths in some subset of <math>S \setminus \{P\}</math>. If the source code contained no [[Control flow|control flow statements]] (conditionals or decision points), the complexity would be 1, since there would be only a single path through the code. If the code had one single-condition [[if statement|IF statement]], there would be two paths through the code: one where the IF statement is TRUE and another one where it is FALSE. Here, the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.
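
As a rough illustration of the counts in the preceding paragraph, the following Python sketch tallies decision points in a piece of source code with the standard <code>ast</code> module and adds one. The function name <code>decision_points_plus_one</code> and the choice of which syntax nodes count as decisions are assumptions made for this example, not part of the definition; constructs such as exception handlers and comprehensions are deliberately ignored.

```python
import ast


def decision_points_plus_one(source: str) -> int:
    """Estimate complexity as (number of decision points) + 1.

    Hypothetical helper: only if/while/for statements, conditional
    expressions, and the extra operands of and/or are counted.
    """
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.While, ast.For, ast.IfExp)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b" has two operands but adds only one extra branch.
            decisions += len(node.values) - 1
    return decisions + 1


STRAIGHT_LINE = "x = 1\ny = x + 1\n"
ONE_IF = "if a:\n    x = 1\nelse:\n    x = 2\n"
NESTED_IFS = "if a:\n    if b:\n        x = 1\n"
COMPOUND_IF = "if a and b:\n    x = 1\n"

print(decision_points_plus_one(STRAIGHT_LINE))  # 1
print(decision_points_plus_one(ONE_IF))         # 2
print(decision_points_plus_one(NESTED_IFS))     # 3
print(decision_points_plus_one(COMPOUND_IF))    # 3
```

The four sample snippets mirror the four cases in the text: no control flow, a single IF, two nested IFs, and one IF with two conditions.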
 
Another way to define the cyclomatic complexity of a program is to look at its [[control-flow graph]], a [[directed graph]] containing the [[basic block]]s of the program, with an edge between two basic blocks if control may pass from the first to the second. The complexity {{mvar|M}} is then defined as<ref name="mccabe76">{{cite journal| last=McCabe|date=December 1976| journal=IEEE Transactions on Software Engineering|issue=4| pages=308–320| title=A Complexity Measure | volume=SE-2| doi=10.1109/tse.1976.233837|s2cid=9116234}}</ref>
Line 21:
*{{mvar|E}} = the number of edges of the graph.
*{{mvar|N}} = the number of nodes of the graph.
*{{mvar|P}} = the number of [[connected component (graph theory)|connected components]].
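
The quantities {{mvar|E}}, {{mvar|N}} and {{mvar|P}} can be counted mechanically. The Python sketch below assumes the formula takes the form used in the figure caption, {{math|1=''M'' = ''E'' − ''N'' + 2''P''}}, and applies it to a small graph. The node names and edge list are a hypothetical reconstruction chosen only to match the caption's counts (nine edges, eight nodes, one component); they are not taken from the figure itself.

```python
def connected_components(nodes, edges):
    """Count connected components, ignoring edge direction (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(n) for n in nodes})


def cyclomatic_complexity(nodes, edges):
    """Return M = E - N + 2P for a control-flow graph."""
    e = len(edges)
    n = len(nodes)
    p = connected_components(nodes, edges)
    return e - n + 2 * p


# Hypothetical graph: an entry node, a three-node loop, a conditional
# with two arms, and an exit node.
NODES = ["entry", "loop_test", "loop_body1", "loop_body2",
         "cond", "then", "else", "exit"]
EDGES = [
    ("entry", "loop_test"),
    ("loop_test", "loop_body1"),
    ("loop_body1", "loop_body2"),
    ("loop_body2", "loop_test"),   # back edge of the loop
    ("loop_test", "cond"),         # loop exit
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(NODES, EDGES))  # 3
```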
 
[[Image:control flow graph of function with loop and an if statement.svg|thumb|upright=1.1|The same function, represented using the alternative formulation where each exit point is connected back to the entry point. This graph has 10 edges, eight nodes and one [[connected component (graph theory)|connected component]], which also results in a cyclomatic complexity of 3 using the alternative formulation ({{math|1=10 − 8 + 1 = 3}}).<!-- Do not change this to 4. This has been done thrice, but 3 is the correct answer.-->]]
Line 36:
 
McCabe showed that the cyclomatic complexity of a structured program with only one entry point and one exit point is equal to the number of decision points ("if" statements or conditional loops) contained in that program plus one. This is true only for decision points counted at the lowest, machine-level instructions.<ref>{{cite web|
url=https://www.froglogic.com/blog/tip-of-the-week/what-is-cyclomatic-complexity/| title=What exactly is cyclomatic complexity?|quote=To compute a graph representation of code, we can simply disassemble its assembly code and create a graph following the rules:&nbsp;... |first=Sébastien|last=Fricker|date=April 2018|website=froglogic GmbH|access-date=October 27, 2018}}</ref> Decisions involving compound predicates like those found in [[high-level language]]s, such as <code>IF cond1 AND cond2 THEN ...</code>, should be counted in terms of the predicate variables involved. In this example, one should count two decision points because at machine level it is equivalent to <code>IF cond1 THEN IF cond2 THEN ...</code>.<ref name="mccabe76"/><ref name="ecst">{{cite book|
title=Encyclopedia of Computer Science and Technology|
author1=J. Belzer |author2=A. Kent |author3=A. G. Holzman |author4=J. G. Williams|
Line 47:
 
==={{anchor|Explanation in terms of algebraic topology}}Algebraic topology===
An even subgraph of a graph (also known as an [[Eulerian path|Eulerian subgraph]]) is one in which every [[vertex (graph theory)|vertex]] is [[Graph (discrete mathematics)#Graph|incident]] with an even number of edges. Such subgraphs are unions of cycles and isolated vertices. Subgraphs will be identified with their edge sets, which is equivalent to only considering those even subgraphs which contain all vertices of the full graph.
 
The set of all even subgraphs of a graph is closed under [[symmetric difference]], and may thus be viewed as a [[vector space]] over [[GF(2)]]. This vector space is called the cycle space of the graph. The [[cyclomatic number]] of the graph is defined as the [[dimension (vector space)|dimension]] of this space. Since GF(2) has two elements and the cycle space is necessarily finite, the cyclomatic number is also equal to the [[binary logarithm|2-logarithm]] of the number of elements in the cycle space.
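
For a small graph, the cycle space can be enumerated by brute force. The Python sketch below lists every edge subset in which all vertices have even degree, checks closure under symmetric difference, and takes the 2-logarithm of the count. The example graph (two triangles sharing an edge) and the function name <code>even_subgraphs</code> are assumptions chosen for illustration; the enumeration is exponential in the number of edges and is only meant for tiny inputs.

```python
from itertools import combinations
from math import log2


def even_subgraphs(vertices, edges):
    """Return all edge subsets in which every vertex has even degree."""
    result = []
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            degree = {v: 0 for v in vertices}
            for u, v in subset:
                degree[u] += 1
                degree[v] += 1
            if all(d % 2 == 0 for d in degree.values()):
                result.append(frozenset(subset))
    return result


# Hypothetical example: triangles 1-2-3 and 2-3-4 sharing the edge (2, 3).
VERTICES = [1, 2, 3, 4]
EDGES = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 3)]

cycle_space = even_subgraphs(VERTICES, EDGES)

# Closure under symmetric difference: the sum of two elements over GF(2).
closed = all((a ^ b) in cycle_space for a in cycle_space for b in cycle_space)

print(len(cycle_space))        # 4
print(closed)                  # True
print(log2(len(cycle_space)))  # 2.0
```

The four elements are the empty subgraph, the two triangles, and the outer four-cycle, which is the symmetric difference of the two triangles.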
 
A [[basis (linear algebra)|basis]] for the cycle space is easily constructed by first fixing a [[Glossary of graph theory#Trees|spanning forest]] of the graph, and then considering the cycles formed by one edge not in the forest and the path in the forest connecting the endpoints of that edge. These cycles form a basis for the cycle space. The cyclomatic number also equals the number of edges not in a maximal spanning forest of a graph. Since the number of edges in a maximal spanning forest of a graph is equal to the number of vertices minus the number of components, the formula <math>E-N+P</math> defines the cyclomatic number.<ref>{{cite book
|last=Diestel
|first=Reinhard