Bisection (software engineering): Difference between revisions

Content deleted Content added
Hdu hh (talk | contribs)
OAbot (talk | contribs)
m Open access bot: doi added to citation with #oabot.
 
(39 intermediate revisions by 28 users not shown)
Line 1:
{{Short description|Software engineering}}
Code Bisection is a method used in [[Software_development|software development]] to identify change sets that result in a specific behavior change. It is mostly employed for finding the patch that introduced a bug. Another application area is finding the patch that indirectly fixed a bug.
{{Other uses|Bisect (disambiguation){{!}}Bisect}}
Code '''Bisection''' is a method used in [[Software_development|software development]] to identify [[Patch (computing)|change sets]] that result in a specific behavior change. It is mostly employed for finding the patch that introduced a [[software bug|bug]]. Another application area is finding the patch that indirectly fixed a bug.
 
==Overview==
The process of locating the [[changeset]] that introduced a specific [[software regression|regression]] was described as "source change isolation" in 1997 by Brian Ness and Viet Ngo of [[Cray Research]]. [[Regression testing]] was performed on Cray's [[compiler]]s in editions comprising one or more changesets. Editions with known regressions could not be validated until developers addressed the problem. Source change isolation narrowed the cause to a single changeset that could then be excluded from editions, unblocking them with respect to this problem, while the author of the change worked on a fix. Ness and Ngo outlined [[linear search]] and [[binary search]] methods of performing this isolation.<ref name="Ness 97">{{cite conference |title=Regression containment through source change isolation |last=Ness |first=Brian |last2=Ngo |first2=Viet |date=1997 |conference=Computer Software and Applications Conference |doi=10.1109/CMPSAC.1997.625082 |publisher=IEEE}}</ref>
The code bisection method is a [[Divide_and_conquer_algorithm|divide and conquer]] algorithm that tries to minimize the effort to find a specific change set.
 
Code bisection has the goal of minimizing the effort to find a specific change set.
It depends on the having access to the code history which is usually preserved by [[Revision_control|revision control]] in a [[Repository_(version_control)|code repository]].
It employs a [[divide and conquer algorithm]] that
It depends on the having access to the code history which is usually preserved by [[Revision_control|revision control]] in a [[Repository_(version_control)|code repository]].
[[revision control]] in a [[Repository (version control)|code repository]].
 
==The Bisection Methodmethod==
===Code Bisectionbisection Algorithmalgorithm===
Code history has the structure of ana [[Acyclic_directed_graph|acyclic directed acyclic graph]] which can be [[Topological_orderingTopological ordering|topologically sorted]]. This makes it possible to use a divide and conquer search algorithm which:
* splits up athe [[Search_spaceComputational geometry#Geometric query problems|search space]] of candidate revisions
* tests for the behavior in question
* reduces the search space depending on the test result
* re-iterates the steps above until a range with at most one bisectable [[Candidate_solutionCandidate solution|patch candidate]] remains
 
===Algorithmic complexity===
Code bisection as a divide and conquer algorithmBisection is in [[L_L (complexity)|LSPACE]] having an [[Algorithmic_complexityAnalysis of algorithms|algorithmic complexity]] of <math>O(\log_2log N)</math> with <math>N</math> denoting the number of revisions in the search space, and is similar to a [[binary search]].
 
===Desirable Repositoryrepository Propertiesproperties===
For code bisection it is desirable that each revision in the search space can be compiledbuilt and tested independently.
 
===Monotonicity===
==Support by Revision Control Systems==
Revision control systems like [[Git_(software)|Git]] or [[Mercurial]] directly support <ref>http://www.kernel.org/pub/software/scm/git/docs/v1.7.10/git-bisect.html</ref><ref>http://www.selenic.com/mercurial/hg.1.html#bisect</ref> code bisection.
 
For the bisection algorithm to identify a single changeset which caused the behavior being tested to change, the behavior must change [[Monotonic function|monotonically]] across the search space. For a Boolean function such as a pass/fail test, this means that it only changes once across all changesets between the start and end of the search space.
Other revision control systems like [[Bazaar_(software)|Bazaar] or [[Apache_Subversion|Subversion]] support it indirectly employing plugins<ref>http://doc.bazaar.canonical.com/plugins/en/bisect-plugin.html</ref> or external scripts<ref>http://search.cpan.org/dist/App-SVN-Bisect/bin/svn-bisect</ref>.
 
If there are multiple changesets across the search space where the behavior being tested changes between false and true, then the bisection algorithm will find one of them, but it will not necessarily be the [[root cause analysis|root cause]] of the change in behavior between the start and the end of the search space. The root cause could be a different changeset, or a combination of two or more changesets across the search space. To help deal with this problem, automated tools allow specific changesets to be ignored during a bisection search.
 
==Automation support==
Although the bisection method can be completed manually, one of its main advantages is that it can be easily automated.<ref name="Ness 97" /> It can thus fit into existing [[test automation]] processes: failures in exhaustive automated regression tests can trigger automated bisection to localize faults. Ness and Ngo focused on its potential in Cray's [[continuous delivery]]-style environment in which the automatically isolated bad changeset could be automatically excluded from builds.<ref name="Zeller 99">{{cite conference |first=Andreas |last=Zeller |title=Yesterday, my program worked. Today, it does not. Why? |date=1999 |conference=European Software Engineering Conference |___location=Toulouse, France |doi=10.1145/318774.318946|doi-access=free }}</ref>
 
The revision control systems [[Fossil SCM|Fossil]], [[Git (software)|Git]] and [[Mercurial]] have built-in functionality for code bisection.<ref>{{Cite web|title=Fossil: Help: bisect|url=https://www.fossil-scm.org/fossil/help/bisect|access-date=2020-09-03|website=www.fossil-scm.org}}</ref><ref>{{cite web|url=https://git-scm.com/docs/git-bisect |title=git-bisect(1) |website=git-scm.com |accessdate=2017-08-05}}</ref><ref>{{cite web|url=https://www.selenic.com/mercurial/hg.1.html#bisect |title=hg |website=Selenic.com |date= |accessdate=2017-01-09}}</ref> The user can start a bisection session with a specified range of revisions from which the revision control system proposes a revision to test, the user tells the system whether the revision tested as "good" or "bad", and the process repeats until the specific "bad" revision has been identified. Other revision control systems, such as [[Bazaar (software)|Bazaar]] or [[Apache Subversion|Subversion]], support bisection through plugins<ref>{{cite web|url=http://doc.bazaar.canonical.com/plugins/en/bisect-plugin.html |title=bisect - Find the revision introducing a bug using a binary search — Bazaar 2.8.0dev1 documentation |website=Doc.bazaar.canonical.com |date= |accessdate=2017-01-09}}</ref> or external scripts.<ref>{{cite web|url=https://metacpan.org/dist/App-SVN-Bisect/view/bin/svn-bisect |title=svn-bisect |website=Metacpan.org |date= |accessdate=2022-08-03}}</ref>
 
[[Phoronix Test Suite]] can do bisection automatically to find performance regressions.
 
==See also==
* [[Delta debugging]] (generalization of finding a minimal cause of a bug)
* {{section link|Annotation|Source control}} (determining changesets that edited a line in a file)
 
==References==
{{Reflist}}
 
[[Category:RevisionVersion control]]
[[Category:RevisionVersion control systems]]
[[Category:Software development process]]
[[Category:Algorithms]]