{{Short description|Measure of the structural complexity of a software program}}
'''Cyclomatic complexity''' is a [[software metric]] used to indicate the [[Programming complexity|complexity of a program]]. It is a quantitative measure of the number of linearly independent [[path (graph theory)|paths]] through a program's [[source code]], and was developed by Thomas J. McCabe Sr. in 1976.

Cyclomatic complexity is computed using the [[control-flow graph]] of the program: the nodes of the graph correspond to indivisible groups of commands, and a directed edge connects two nodes if the second command might be executed immediately after the first. Cyclomatic complexity may also be applied to individual functions, modules, [[method (computer programming)|methods]], or [[class (computer programming)|classes]] within a program.
One [[software testing|testing]] strategy, called [[basis path testing]] by McCabe who first proposed it, is to test each linearly independent path through the program. In this case, the number of test cases will equal the cyclomatic complexity of the program.<ref>{{cite web|
url=http://users.csc.calpoly.edu/~jdalbey/206/Lectures/BasisPathTutorial/index.html|
title=Basis Path Testing}}</ref>
==Description==
=== Definition ===
[[Image:control flow graph of function with loop and an if statement without loop back.svg|thumb|250px|right|A control-flow graph of a simple program. The program begins executing at the red node, then enters a loop (group of three nodes immediately below the red node). On exiting the loop, there is a conditional statement (group below the loop), and finally the program exits at the blue node. This graph has 9 edges, 8 nodes, and 1 [[connected component (graph theory)|connected component]], so the cyclomatic complexity of the program is 9 − 8 + 2×1 = 3.]]
The cyclomatic complexity of a section of [[source code]] is the number of linearly independent [[path (graph theory)|paths]] within it—where "linearly independent" means that each path has at least one edge that is not in one of the other paths. For instance, if the source code contained no [[Control flow|control flow statements]] (conditionals or decision points), the complexity would be 1, since there would be only a single path through the code. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement evaluates to TRUE and another one where it evaluates to FALSE, so the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.
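For illustration, the following C fragments (hypothetical examples, not taken from McCabe's paper; <code>c1()</code>, <code>c2()</code>, <code>f1()</code> and <code>f2()</code> are assumed to be defined elsewhere) have cyclomatic complexities of 1, 2 and 3, respectively:
<syntaxhighlight lang="c">
extern int c1(void), c2(void);
extern void f1(void), f2(void);

void straight_line(void) {   /* no decision points: complexity 1 */
    f1();
    f2();
}

void one_if(void) {          /* one single-condition if: complexity 2 */
    if (c1())
        f1();
    else
        f2();
}

void nested_ifs(void) {      /* two nested single-condition ifs: complexity 3 */
    if (c1()) {
        if (c2())
            f1();
        else
            f2();
    }
}
</syntaxhighlight>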
Mathematically, the cyclomatic complexity of a [[Structured programming|structured program]]{{efn|1=Here "structured" means in particular "with a single exit ([[return statement]]) per function".}} is defined with reference to the [[control flow graph]] of the program, a [[directed graph]] containing the [[basic block]]s of the program, with an edge between two basic blocks if control may pass from the first to the second. The complexity '''M''' is then defined as<ref name="mccabe76">{{cite journal|last=McCabe|first=T. J.|date=December 1976|title=A Complexity Measure|journal=IEEE Transactions on Software Engineering|volume=SE-2|issue=4|pages=308–320}}</ref>
<math display="block">M = E - N + 2P,</math>
where
:''E'' = the number of edges of the graph.
:''N'' = the number of nodes of the graph.
:''P'' = the number of [[connected component (graph theory)|connected components]].
[[Image:control flow graph of function with loop and an if statement.svg|thumb|250px|right|The same function as above, represented using the alternative formulation, where each exit point is connected back to the entry point. This graph has 10 edges, 8 nodes, and 1 [[connected component (graph theory)|connected component]], which also results in a cyclomatic complexity of 3 using the alternative formulation (10 − 8 + 1 = 3).]]
An alternative formulation of this, as originally proposed, is to use a graph in which each exit point is connected back to the entry point. In this case, the graph is [[strongly connected]]. Here, the cyclomatic complexity of the program is equal to the [[cyclomatic number]] of its graph (also known as the first [[Betti number]]), which is defined as<ref name="mccabe76"/>
<math display="block">M = E - N + P.</math>
This may be seen as calculating the number of [[linearly independent cycle]]s that exist in the graph: those cycles that do not contain other cycles within themselves. Because each exit point loops back to the entry point, there is at least one such cycle for each exit point.
For a single program (or subroutine or method), ''P'' always equals 1, so a simpler formula for a single subroutine is
<math display="block">M = E - N + 2.</math>
Cyclomatic complexity may, however, be applied to several such programs or subprograms at the same time (e.g., to all of the methods in a class), and in these cases ''P'' will be equal to the number of programs in question, as each subprogram will appear as a disconnected subset of the graph.
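As a minimal illustration of the formula (not part of McCabe's definition), the following C sketch computes ''E'', ''N'', and ''P'' for a hard-coded graph similar to the one in the figure above and evaluates <math>M = E - N + 2P</math>; the exact edge list is an illustrative assumption:
<syntaxhighlight lang="c">
#include <stdio.h>

#define NODES 8
static const int edges[][2] = {
    {0, 1},                        /* entry into the loop                  */
    {1, 2}, {2, 3}, {3, 1},        /* the three-node loop                  */
    {1, 4},                        /* loop exit into the conditional       */
    {4, 5}, {4, 6}, {5, 7}, {6, 7} /* if/else branches joining at the exit */
};
#define EDGES ((int)(sizeof edges / sizeof edges[0]))

/* Union-find, used to count the (weakly) connected components. */
static int parent[NODES];
static int find(int x) { return parent[x] == x ? x : (parent[x] = find(parent[x])); }

int main(void) {
    for (int i = 0; i < NODES; i++) parent[i] = i;
    for (int i = 0; i < EDGES; i++) parent[find(edges[i][0])] = find(edges[i][1]);

    int components = 0;                      /* P: number of connected components */
    for (int i = 0; i < NODES; i++) if (find(i) == i) components++;

    int m = EDGES - NODES + 2 * components;  /* M = E - N + 2P */
    printf("E=%d N=%d P=%d M=%d\n", EDGES, NODES, components, m);  /* prints E=9 N=8 P=1 M=3 */
    return 0;
}
</syntaxhighlight>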
McCabe showed that the cyclomatic complexity of a structured program with only one entry point and one exit point is equal to the number of decision points (i.e., "if" statements or conditional loops) contained in that program plus one. This is true only for decision points counted at the lowest, machine-level instructions.<ref>{{cite web|
url=https://www.froglogic.com/blog/tip-of-the-week/what-is-cyclomatic-complexity/|
title=What exactly is cyclomatic complexity?|quote=To compute a graph representation of code, we can simply disassemble its assembly code and create a graph following the rules: ... |first=Sébastien|last=Fricker|date=April 2018|website=froglogic GmbH|access-date=October 27, 2018}}</ref> Decisions involving compound predicates like those found in [[high-level programming language|high-level languages]], such as <code>IF cond1 AND cond2 THEN ...</code>, should be counted in terms of the predicate variables involved: in this example, one should count two decision points, because at machine level it is equivalent to <code>IF cond1 THEN IF cond2 THEN ...</code>.<ref name="mccabe76"/><ref name="ecst">{{cite book|
title=Encyclopedia of Computer Science and Technology|
publisher=CRC Press|year=1992|
pages=367–368}}</ref>
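For example, the following C fragments (hypothetical, with <code>c1()</code>, <code>c2()</code> and <code>f1()</code> assumed to be defined elsewhere) have identical control flow because of short-circuit evaluation, which is why the compound condition contributes two decision points and both functions have a cyclomatic complexity of 3:
<syntaxhighlight lang="c">
extern int c1(void), c2(void);
extern void f1(void);

/* One compound predicate: two decision points, complexity 3. */
void compound_predicate(void) {
    if (c1() && c2())
        f1();
}

/* The machine-level equivalent: two single-condition branches, also complexity 3. */
void nested_predicates(void) {
    if (c1())
        if (c2())
            f1();
}
</syntaxhighlight>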
Cyclomatic complexity may be extended to a program with multiple exit points; in this case it is equal to
<math display="block">\pi - s + 2,</math>
where <math>\pi</math> is the number of decision points in the program, and ''s'' is the number of exit points.<ref name="ecst"/>
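As a consistency check (an observation, not a statement from the cited sources), a single-exit structured subroutine has <math>s = 1</math>, so the formula reduces to the "decision points plus one" characterization given above:
<math display="block">\pi - 1 + 2 = \pi + 1.</math>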
=== Interpretation ===
In his presentation "Software Quality Metrics to Identify Risk"<ref>{{cite web |url=http://www.mccabe.com/ppt/SoftwareQualityMetricsToIdentifyRisk.ppt |title=Software Quality Metrics to Identify Risk |author=Thomas McCabe Jr. |year=2008 |archive-url=https://web.archive.org/web/20220329072759/http://www.mccabe.com/ppt/SoftwareQualityMetricsToIdentifyRisk.ppt |archive-date=2022-03-29 |url-status=live}}</ref> for the Department of Homeland Security, Tom McCabe introduced the following categorization of cyclomatic complexity:
* 1 - 10: Simple procedure, little risk
* 11 - 20: More complex, moderate risk
* 21 - 50: Complex, high risk
* > 50: Untestable code, very high risk

=== Explanation in terms of algebraic topology ===
An ''even subgraph'' of a graph (also known as an [[Eulerian path|Eulerian subgraph]]) is one where every [[Vertex (graph theory)|vertex]] is incident with an even number of edges; such subgraphs are unions of cycles and isolated vertices. In the following, even subgraphs will be identified with their edge sets, which is equivalent to only considering those even subgraphs which contain all vertices of the full graph.

The set of all even subgraphs of a graph is closed under [[symmetric difference]], and may thus be viewed as a [[vector space]] over [[GF(2)]]; this vector space is called the ''cycle space'' of the graph. The [[cyclomatic number]] of the graph is defined as the dimension of this space. Since GF(2) has two elements and the cycle space is necessarily finite, the cyclomatic number is also equal to the 2-logarithm of the number of elements in the cycle space.

A [[basis (linear algebra)|basis]] for the cycle space is easily constructed by first fixing a [[Glossary of graph theory#Trees|spanning forest]] of the graph, and then considering the cycles formed by one edge not in the forest and the path in the forest connecting the endpoints of that edge; these cycles constitute a basis for the cycle space. Hence, the cyclomatic number also equals the number of edges not in a maximal spanning forest of a graph. Since the number of edges in a maximal spanning forest of a graph is equal to the number of vertices minus the number of components, the formula <math>E-N+P</math> above for the cyclomatic number follows.

For the more topologically inclined, cyclomatic complexity can alternatively be defined as a relative [[Betti number]], the size of a [[relative homology]] group:
<math display="block">M := b_1(G,t) := \operatorname{rank}H_1(G,t),</math>
which is read as "the rank of the first [[Homology (mathematics)|homology]] group of the graph ''G'', relative to the terminal nodes ''t''". This is a technical way of saying "the number of linearly independent paths through the flow graph from an entry to an exit", where:
* "linearly independent" corresponds to homology, and means one does not double-count backtracking;
* "paths" corresponds to ''first'' homology: a path is a one-dimensional object;
* "relative" means the path must begin and end at an entry or exit point.

Cyclomatic complexity may also be computed via an absolute [[Betti number]] by identifying the terminal nodes on a given component, or drawing paths connecting the exits to the entrance. Calling the new, augmented graph <math>\tilde G</math>, one obtains
<math display="block">M = b_1(\tilde G) = \operatorname{rank}H_1(\tilde G).</math>

It can also be computed via [[homotopy]]. If one considers the control-flow graph as a one-dimensional [[CW complex]] <math>X</math>, then the [[fundamental group]] of <math>X</math> is <math>\pi_1(X) \cong \Z^{*n}</math>, the [[free group]] on <math>n</math> generators. The value of <math>n+1</math> is the cyclomatic complexity. The fundamental group counts how many loops there are through the graph, up to homotopy, aligning with what one would intuitively expect.
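For instance, for the program in the figures above (a worked check, not from the cited sources), connecting the exit back to the entry yields a connected augmented graph <math>\tilde G</math> with 10 edges and 8 nodes; since the first Betti number of a connected graph is <math>E - N + 1</math>, this gives
<math display="block">M = b_1(\tilde G) = 10 - 8 + 1 = 3,</math>
matching the value obtained above from <math>M = E - N + 2P</math>.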
==Applications==
=== Limiting complexity during development ===
One of McCabe's original applications was to limit the complexity of routines during program development; he recommended that programmers should count the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10.<ref name="mccabe76" /> This practice was adopted by the [[NIST]] Structured Testing methodology, with an observation that since McCabe's original publication, the figure of 10 had received substantial corroborating evidence, but that in some circumstances it may be appropriate to relax the restriction and permit modules with a complexity as high as 15. As the methodology acknowledged that there were occasional reasons for going beyond the agreed-upon limit, it phrased its recommendation as "For each module, either limit cyclomatic complexity to [the agreed-upon limit] or provide a written explanation of why the limit was exceeded."<ref name="nist">{{cite web|
url=http://www.mccabe.com/pdf/mccabe-nist235r.pdf| title=Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric|author1=Arthur H. Watson |author2=Thomas J. McCabe | year=1996|publisher=NIST Special Publication 500-235}}</ref>
=== Measuring the "structuredness" of a program ===
{{Main|Essential complexity (numerical measure of "structuredness")}}
Section VI of McCabe's 1976 paper is concerned with determining what the control-flow graphs (CFGs) of non-structured programs look like in terms of their subgraphs, which McCabe identified. (For details, see [[structured program theorem]].) McCabe concluded that section by proposing a numerical measure of how close to the structured programming ideal a given program is, i.e. its "structuredness", using McCabe's neologism. McCabe called the measure he devised for this purpose [[essential complexity (numerical measure of "structuredness")|essential complexity]].<ref name="mccabe76"/>
In order to calculate this measure, the original CFG is iteratively reduced by identifying subgraphs that have a single-entry and a single-exit point, which are then replaced by a single node. This reduction corresponds to what a human would do if they extracted a subroutine from the larger piece of code. (Nowadays such a process would fall under the umbrella term of [[refactoring]].) McCabe's reduction method was later called ''condensation'' in some textbooks, because it was seen as a generalization of the [[condensation (graph theory)|condensation to components used in graph theory]].<ref>{{cite book|author=Paul C. Jorgensen|title=Software Testing: A Craftsman's Approach, Second Edition|url=https://books.google.com/books?id=Yph_AwAAQBAJ&pg=PA150|year=2002|publisher=CRC Press|isbn=978-0-8493-0809-3|pages=150–153|edition=2nd}}</ref> If a program is structured, then McCabe's reduction/condensation process reduces it to a single CFG node. In contrast, if the program is not structured, the iterative process will identify the irreducible part. The essential complexity measure defined by McCabe is simply the cyclomatic complexity of this irreducible graph, so it will be precisely 1 for all structured programs, but greater than one for non-structured programs.<ref name="nist"/>{{rp|80}}<!--note that the original paper has an error here in that it subtracts the number of nodes from that [and thus can give negative essential complexity measures for some non-structured programs], but the later technical report fixed that-->
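As an illustration (hypothetical functions, not taken from McCabe's paper or the NIST report), the first C function below is fully structured, so the reduction process collapses it to a single node and its essential complexity is 1; the second branches out of the middle of the loop with a <code>goto</code>, leaving an irreducible subgraph and therefore an essential complexity greater than one:
<syntaxhighlight lang="c">
/* Structured: single exit, and every loop/decision is a single-entry,
 * single-exit region, so the CFG reduces to one node (essential complexity 1). */
int find_structured(const int *a, int n, int key) {
    int found = 0;
    for (int i = 0; i < n && !found; i++) {
        if (a[i] == key)
            found = 1;
    }
    return found;
}

/* Non-structured: the goto branches out of the loop, so the loop plus the
 * branch cannot be reduced to a single-entry, single-exit subgraph
 * (essential complexity greater than 1). */
int find_unstructured(const int *a, int n, int key) {
    for (int i = 0; i < n; i++) {
        if (a[i] == key)
            goto found;
    }
    return 0;
found:
    return 1;
}
</syntaxhighlight>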
=== Implications for software testing ===
Another application of cyclomatic complexity is in determining the number of test cases that are necessary to achieve thorough test coverage of a particular module.
It is useful because of two properties of the cyclomatic complexity, ''M'', for a specific module:
* ''M'' is an upper bound for the number of test cases that are necessary to achieve a complete [[branch coverage]].
* ''M'' is a lower bound for the number of paths through the control-flow graph (CFG). Assuming each test case takes one path, the number of cases needed to achieve [[path coverage]] is equal to the number of paths that can actually be taken. But some paths may be impossible, so although the number of paths through the CFG is clearly an upper bound on the number of test cases, it is possible that the actual number of feasible paths is lower than ''M''.
All three of the above numbers may be equal: branch coverage <math>\leq</math> cyclomatic complexity <math>\leq</math> number of paths.
Consider a program that consists of two sequential if-then-else statements.
<syntaxhighlight lang="c">
if (c1())
    f1();
else
    f2();

if (c2())
    f3();
else
    f4();
</syntaxhighlight>
[[File:Control flow graph of function with two if else statements.svg|thumb|250px|right|The control-flow graph of the source code above; the red circle is the entry point of the function, and the blue circle is the exit point. The exit has been connected to the entry to make the graph strongly connected.]]
In this example, two test cases are sufficient to achieve a complete branch coverage, while four are necessary for complete path coverage. The cyclomatic complexity of the program is 3, as the strongly connected graph for the program contains 9 edges, 7 nodes, and 1 connected component (9 − 7 + 1 = 3).
In general, in order to fully test a module, all execution paths through the module should be exercised. This implies a module with a high complexity number requires more testing effort than a module with a lower value, since the higher complexity number indicates more pathways through the code. This also implies that a module with higher complexity is more difficult to understand, since the programmer must understand the different pathways and the results of those pathways.
Unfortunately, it is not always practical to test all possible paths through a program. Considering the example above, each time an additional if-then-else statement is added, the number of possible paths grows by a factor of 2. As the program grows in this fashion, it quickly reaches the point where testing all of the paths becomes impractical.
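Concretely (an illustrative calculation rather than a quoted result), a function consisting of <math>n</math> sequential if-then-else statements has <math>2^n</math> execution paths but only <math>n + 1</math> linearly independent paths:
<math display="block">\text{paths} = 2^n, \qquad M = n + 1.</math>
This is why testing a basis of paths remains tractable even when exhaustive path testing is not.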
One common testing strategy, espoused for example by the NIST Structured Testing methodology, is to use the cyclomatic complexity of a module to determine the number of [[white-box testing|white-box tests]] that are required to obtain sufficient coverage of the module. In almost all cases, according to such a methodology, a module should have at least as many tests as its cyclomatic complexity; in most cases, this number of tests is adequate to exercise all the relevant paths of the function.<ref name="nist"/>
As an example of a function that requires more than simply branch coverage to test accurately, reconsider the above function, but assume that to avoid a bug occurring, any code that calls either <code>f1()</code> or <code>f3()</code> must also call the other (a fairly common situation: for example, <code>f1()</code> might allocate a resource that <code>f3()</code> releases). Assuming that the results of <code>c1()</code> and <code>c2()</code> are independent, the function as presented above contains a bug. Branch coverage would allow the method to be tested with just two tests, such as the following test cases:
* <code>c1()</code> returns true and <code>c2()</code> returns true
* <code>c1()</code> returns false and <code>c2()</code> returns false
Neither of these cases exposes the bug. If, however, we use cyclomatic complexity to indicate the number of tests required, the number increases to 3. We must therefore test one of the following paths:
* <code>c1()</code> returns true and <code>c2()</code> returns false
* <code>c1()</code> returns false and <code>c2()</code> returns true
Either of these tests will expose the bug.
===Correlation to number of defects===
A number of studies have investigated the correlation between McCabe's cyclomatic complexity number and the frequency of defects occurring in a function or method.<ref>{{cite journal|
journal=IEEE Transactions on Software Engineering|author1=Norman E Fenton |author2=Martin Neil |
url=http://www.eecs.qmul.ac.uk/~norman/papers/defects_prediction_preprint105579.pdf|
title=A Critique of Software Defect Prediction Models|
year=1999|volume=25|issue=3|pages=675–689}}</ref> Some studies find a positive correlation between cyclomatic complexity and defects: functions and methods that have the highest complexity tend to also contain the most defects. However, the correlation between cyclomatic complexity and program size (typically measured in [[Source lines of code|lines of code]]) has been demonstrated many times. [[Les Hatton]] has claimed<ref>
{{cite web |url=http://www.leshatton.org/TAIC2008-29-08-2008.html |title=The role of empiricism in improving the reliability of future software |author=Les Hatton |year=2008 |at=version 1.1}}</ref> that complexity has the same predictive ability as lines of code.
Studies that controlled for program size (i.e., comparing modules that have different complexities but similar size) are generally less conclusive, with many finding no significant correlation, while others do find correlation. Some researchers question the validity of the methods used by the studies finding no correlation.<ref>
{{cite book |title=Metrics and Models in Software Quality Engineering |author=Kan |pages=316–317 |publisher=Addison-Wesley |year=2003 |isbn=978-0-201-72915-3}}</ref> Although this relation is probably true, it isn't easily utilized.<ref>{{cite journal|
journal=Journal of Software Quality|
author=G.S. Cherf|
title=An Investigation of the Maintenance and Support Characteristics of Commercial Software|
year=1992|volume=1|issue=3|pages=147–158|
issn=1573-1367|doi=10.1007/bf01720922}}</ref>
==See also==
* [[Software testing]]
* [[Static program analysis]]
* [[Maintainability]]
* [http://www.mathworks.com/discovery/cyclomatic-complexity.html Generating cyclomatic complexity metrics with Polyspace]
* [http://www.leshatton.org/Documents/TAIC2008-29-08-2008.pdf The role of empiricism in improving the reliability of future software]
* [https://web.archive.org/web/20230203195937/https://www.cqse.eu/en/news/blog/mccabe-cyclomatic-complexity// McCabe's Cyclomatic Complexity and Why We Don't Use It]
[[Category:Software metrics]]