Content deleted Content added
Mentions techniques to filter out overfitting patches in a top-level section "Overfitting" |
Jnestorius (talk | contribs) |
||
(45 intermediate revisions by 22 users not shown) | |||
Line 1:
{{short description|Automatic repair of software bugs}}
'''Automatic bug-fixing''' is the automatic [[Patch (computing)|repair]] of [[software bug]]s without the intervention of a human programmer.<ref>{{Cite journal |last=Rinard |first=Martin C. |year=2008 |title=Technical perspective ''Patching'' program errors |journal=Communications of the ACM |volume=51 |issue=12 |pages=86 |doi=10.1145/1409360.1409381 |s2cid=28629846}}</ref><ref>{{Cite journal |last=Harman |first=Mark |year=2010 |title=Automated patching techniques |journal=Communications of the ACM |volume=53 |issue=5 |pages=108 |doi=10.1145/1735223.1735248 |s2cid=9729944}}</ref>
== Specification ==
Automatic bug fixing is made according to a specification of the expected behavior which can be for instance a [[formal specification]] or a [[test suite]].<ref name="genprog2009">{{Cite book |
A test-suite – the input/output pairs specify the functionality of the program, possibly captured in [[Assertion (software development)|assertions]] can be used as a [[test oracle]] to drive the search. This oracle can in fact be divided between the ''bug oracle'' that exposes the faulty behavior, and the ''regression oracle'', which encapsulates the functionality any program repair method must preserve. Note that a test suite is typically incomplete and does not cover all possible cases. Therefore, it is often possible for a validated patch to produce expected outputs for all inputs in the test suite but incorrect outputs for other inputs.<ref name="kali">{{Cite book |
Another way to specify the expected behavior is to use [[formal specification]]s<ref name="autofixe">{{Cite journal |
== Techniques ==
Line 14:
=== Generate-and-validate ===
Generate-and-validate approaches compile and test each candidate patch to collect all validated patches that produce expected outputs for all inputs in the test suite.<ref name="genprog2009" /><ref name="kali" /> Such a technique typically starts with a test suite of the program, i.e., a set of [[test
<!-- mutation based repair -->
One way to generate candidate patches is to apply [[program mutation|mutation operators]] on the original program. Mutation operators manipulate the original program, potentially via its [[abstract syntax tree]] representation, or a more coarse-grained representation such as operating at the [[Statement (programming)|statement]]-level or [[Block (programming)|block]]-level. Earlier [[Genetic improvement (computer science)|genetic improvement]] approaches operate at the statement level and carry out simple delete/replace operations such as deleting an existing statement or replacing an existing statement with another statement in the same source file.<ref name=genprog2009 /><ref name="genprog2012">{{Cite book |
<!-- fix templates -->
Another way to generate candidate patches consists of using fix templates. Fix templates are typically predefined changes for fixing specific classes of bugs.<ref name=par /> Examples of fix templates include inserting a [[Conditional (computer programming)|conditional statement]] to check whether the value of a variable is null to fix null pointer exception, or changing an integer constant by one to fix off-by-one errors.<ref name="par" />
=== Synthesis-based ===
Repair techniques exist that are based on symbolic execution. For example, Semfix<ref name="semfix">{{Cite book |
<!-- repair and synthesis -->
Under certain assumptions, it is possible to state the repair problem as a synthesis problem.
SemFix<ref name="semfix
Dynamoth
S3<ref>{{Cite book |
SearchRepair<ref name="searchrepair">{{Cite book |
=== Data-driven ===
Line 40 ⟶ 37:
[[Machine learning]] techniques can improve the effectiveness of automatic bug-fixing systems.<ref name="prophet" /> One example of such techniques learns from past successful patches from human developers collected from [[open-source software|open source]] [[software repository|repositories]] in [[GitHub]] and [[SourceForge]].<ref name="prophet" /> It then use the learned information to recognize and prioritize potentially correct patches among all generated candidate patches.<ref name="prophet" /> Alternatively, patches can be directly mined from existing sources. Example approaches include mining patches from donor applications<ref name="codephage" /> or from QA web sites.<ref name="QAFix" />
Getafix<ref name=":0">{{Cite journal |
=== Template-based repair ===
▲Getafix<ref name=":0">{{Cite journal |last=Bader |first=Johannes |last2=Scott |first2=Andrew |last3=Pradel |first3=Michael |last4=Chandra |first4=Satish |date=2019-10-10 |title=Getafix: learning to fix bugs automatically |url=https://doi.org/10.1145/3360585 |journal=Proceedings of the ACM on Programming Languages |volume=3 |issue=OOPSLA |pages=159:1–159:27 |doi=10.1145/3360585|doi-access=free }}</ref> is a language-agnostic approach developed and used in production at [[Facebook, Inc.|Facebook]]. Given a sample of [[Commit (version control)|code commits]] where engineers fixed a certain kind of bug, it learns human-like fix patterns that apply to future bugs of the same kind. Besides using Facebook's own [[Repository (version control)|code repositories]] as training data, Getafix learnt some fixes from [[open source]] Java repositories. When new bugs get detected, Getafix applies its previously learnt patterns to produce candidate fixes and ranks them within seconds. It presents only the top-ranked fix for final validation by tools or an engineer, in order to save resources and ideally be so fast that no human time was spent on fixing the same bug, yet.
For specific classes of errors, targeted automatic bug-fixing techniques use specialized templates:
* [[integer overflow]] repair<ref name="codephage">{{Cite book |last1=Sidiroglou |first1=Stelios |title=Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation |last2=Lahtinen |first2=Eric |last3=Long |first3=Fan |last4=Rinard |first4=Martin |date=2015 |chapter=Automatic Error Elimination by Multi-Application Code Transfer}}</ref>
▲Targeted automatic bug-fixing techniques generate repairs for specific classes of errors such as [[null pointer exception]]<ref name="rcv">{{Cite book |last=Long |first=Fan |title=Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation |last2=Sidiroglou-Douskos |first2=Stelios |last3=Rinard |first3=Martin |date=2014 |publisher=ACM |isbn=978-1-4503-2784-8 |series=PLDI '14' |___location=New York, New York |pages=227–238 |chapter=Automatic Runtime Error Repair and Containment via Recovery Shepherding |doi=10.1145/2594291.2594337 |s2cid=6252501}}</ref><ref name="nullfix">{{Cite book |last=Dobolyi |first=Kinga |title=2008 19th International Symposium on Software Reliability Engineering (ISSRE) |last2=Weimer |first2=Westley |year=2008 |pages=47–56 |chapter=Changing Java's Semantics for Handling Null Pointer Exceptions |citeseerx=10.1.1.147.6158 |doi=10.1109/ISSRE.2008.59 |s2cid=1454939}}</ref><ref name="par">{{Cite book |last=Kim |first=Dongsun |title=Proceedings of the 2013 International Conference on Software Engineering |last2=Nam |first2=Jaechang |last3=Song |first3=Jaewoo |last4=Kim |first4=Sunghun |date=2013 |publisher=IEEE Press |isbn=978-1-4673-3076-3 |series=ICSE '13' |pages=802–811 |chapter=Automatic Patch Generation Learned from Human-written Patches}}</ref> [[integer overflow]] ,<ref name="codephage">{{Cite book |last=Sidiroglou |first=Stelios |title=Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation |last2=Lahtinen |first2=Eric |last3=Long |first3=Fan |last4=Rinard |first4=Martin |date=2015 |chapter=Automatic Error Elimination by Multi-Application Code Transfer}}</ref> [[buffer overflow]] ,<ref name="codephage" /> [[memory leak]] ,<ref name="leakfix">{{Cite book |last=Gao |first=Qing |title=Proceedings of the 37th International Conference on Software Engineering – Volume 1 |last2=Xiong |first2=Yingfei |last3=Mi |first3=Yaqing |last4=Zhang |first4=Lu |last5=Yang |first5=Weikun |last6=Zhou |first6=Zhaoping |last7=Xie |first7=Bing |last8=Mei |first8=Hong |date=2015 |publisher=IEEE Press |isbn=978-1-4799-1934-5 |series=ICSE '15' |___location=Piscataway, New Jersey |pages=459–470 |chapter=Safe Memory-leak Fixing for C Programs}}</ref> etc.. Such techniques often use empirical fix templates to fix bugs in the targeted scope. For example, insert a [[Conditional (computer programming)|conditional statement]] to check whether the value of a variable is null<ref name="par" /> or insert missing memory deallocation statements.<ref name="leakfix" /> Comparing to generate-and-validate techniques, targeted techniques tend to have better bug-fixing accuracy but a much narrowed scope.<ref name="kali" /><ref name="leakfix" />
* [[buffer overflow]] repair<ref name="codephage" />
* [[memory leak]] repair,<ref name="leakfix">{{Cite book |last1=Gao |first1=Qing |title=Proceedings of the 37th International Conference on Software Engineering – Volume 1 |last2=Xiong |first2=Yingfei |last3=Mi |first3=Yaqing |last4=Zhang |first4=Lu |last5=Yang |first5=Weikun |last6=Zhou |first6=Zhaoping |last7=Xie |first7=Bing |last8=Mei |first8=Hong |date=2015 |publisher=IEEE Press |isbn=978-1-4799-1934-5 |series=ICSE '15' |___location=Piscataway, New Jersey |pages=459–470 |chapter=Safe Memory-leak Fixing for C Programs}}</ref> with automated insertion of missing memory deallocation statements.
Comparing to generate-and-validate techniques, template-based techniques tend to have better bug-fixing accuracy but a much narrowed scope.<ref name="kali" /><ref name="leakfix" />
== Use ==
There are multiple uses of automatic bug fixing:
* In a development environment: When encountering a bug the developer activates a feature to search for a patch (for instance by clicking on a button). This search can also happen in the background, when the IDE proactively searches for solutions to potential problems, without waiting for explicit action from the developer.<ref>{{Cite journal |
* At runtime: When a failure happens at runtime, a binary patch can be searched for and [[Self-modifying code|applied online]]. An example of such a repair system is ClearView,<ref name="clearview">{{Cite book |last=Perkins |first=Jeff H. |title=Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles |date=2009 |publisher=ACM |isbn=978-1-60558-752-3 |pages=87–102 |chapter=Automatically patching errors in deployed software |citeseerx=10.1.1.157.5877 |doi=10.1145/1629575.1629585 |display-authors=etal |s2cid=7597529}}</ref> which does repair on x86 code, with x86 binary patches.
== Search space ==
In essence, automatic bug fixing is a search activity, whether deductive-based or heuristic-based. The search space of automatic bug fixing is composed of all edits that can be possibly made to a program. There have been studies to understand the structure of this search space. Qi et al.<ref>{{Cite book |
== Overfitting ==
Sometimes, in test-suite based program repair, tools generate patches that pass the test suite, yet are actually incorrect, this is known as the "overfitting" problem.<ref name="overfitting">{{Cite book |
One way to avoid overfitting is to filter out the generated patches. This can be done based on dynamic analysis.<ref>{{Cite
Alternatively, Tian et al. propose heuristic approaches to assess patch correctness.<ref>{{cite news |last1=Tian |first1=Haoye |last2=Liu |first2=Kui |last3=Kaboré |first3=Abdoul Kader |last4=Koyuncu |first4=Anil |last5=Li |first5=Li |last6=Klein |first6=Jacques |last7=Bissyandé |first7=Tegawendé F. |title=Evaluating representation learning of code changes for predicting patch correctness in program repair |url=https://dl.acm.org/doi/abs/10.1145/3324884.3416532 |work=Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering |publisher=Association for Computing Machinery |date=27 January 2021 |pages=981–992 |doi=10.1145/3324884.3416532|isbn=9781450367684 }}</ref><ref>{{cite book |last1=Tian |first1=Haoye |last2=Tang |first2=Xunzhu |last3=Habib |first3=Andrew |last4=Wang |first4=Shangwen |last5=Liu |first5=Kui |last6=Xia |first6=Xin |last7=Klein |first7=Jacques |last8=BissyandÉ |first8=TegawendÉ F. |title=Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering |chapter=Is this Change the Answer to that Problem?: Correlating Descriptions of Bug and Code Changes for Evaluating Patch Correctness |date=5 January 2023 |pages=1–13 |doi=10.1145/3551349.3556914 |chapter-url=https://dl.acm.org/doi/abs/10.1145/3551349.3556914 |publisher=Association for Computing Machinery|s2cid=251403079 |arxiv=2208.04125 |isbn=9781450394758 }}</ref>
== Limitations of automatic bug-fixing ==
Automatic bug-fixing techniques that rely on a test suite do not provide patch correctness guarantees, because the test suite is incomplete and does not cover all cases.<ref name="kali" /> A weak test suite may cause generate-and-validate techniques to produce validated but incorrect patches that have negative effects such as eliminating desirable functionalities, causing memory leaks, and introducing security vulnerabilities.<ref name="kali" /> One possible approach is to amplify the failing test suite by automatically generating further test cases that are then labelled as passing or failing. To minimize the human labelling effort, an automatic [[test oracle]] can be trained that gradually learns to automatically classify test cases as passing or failing and only engages the bug-reporting user for uncertain cases.<ref name="learn2fix">{{Cite book |
A limitation of generate-and-validate repair systems is the search space explosion.<ref name="spaceanalysis">{{Cite book |
The limitation of approaches based on symbolic analysis<ref name="semfix" /><ref name="angelix" /> is that real world programs are often converted to intractably large formulas especially for modifying statements with [[side effect (computer science)|side effects]].
Line 79:
In C, the Manybugs benchmark collected by GenProg authors contains 69 real world defects and it is widely used to evaluate many other bug-fixing tools for C.<ref name=genprog2012 /><ref name=prophet /><ref name=spr /><ref name=angelix />
In [[Java (programming language)|Java]], the main benchmark is Defects4J
== Example tools ==
Line 93:
* LeakFix:<ref name=leakfix /> A tool that automatically fixes memory leaks in C programs.
* Prophet:<ref name=prophet /> The first generate-and-validate tool that uses machine learning techniques to learn useful knowledge from past human patches to recognize correct patches. It is evaluated on the same benchmark as GenProg and generate correct patches (i.e., equivalent to human patches) for 18 out of 69 cases.<ref name=prophet />
* SearchRepair:<ref name=searchrepair /> A tool for replacing buggy code using snippets of code from elsewhere. It is evaluated on the IntroClass benchmark<ref name="introclassmanybugs">{{Cite journal |
* Angelix:<ref name=angelix /> An improved solver-based bug-fixing tool. It is evaluated on the GenProg benchmark. For 10 out of the 69 cases, it generate patches that is equivalent to human patches.
* Learn2Fix:<ref name=learn2fix /> The first human-in-the-loop semi-automatic repair tool. Extends GenProg to learn the condition under which a semantic bug is observed by systematic queries to the user who is reporting the bug. Only works for programs that take and produce integers.
Line 99:
=== Java ===
* PAR:<ref name=par /> A generate-and-validate tool that uses a set of manually defined fix templates.
*
*
* NpeFix:<ref name="npefix">{{Cite book |last=Durieux |first=Thomas |title=2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER) |date=2017 |isbn=978-1-5090-5501-2 |pages=349–358 |chapter=Dynamic Patch Generation for Null Pointer Exceptions Using Metaprogramming |arxiv=1812.00409 |doi=10.1109/SANER.2017.7884635 |s2cid=2736203}}</ref> An automatic repair tool for NullPointerException in Java, available [https://github.com/Spirals-Team/npefix on Github].
Line 113 ⟶ 111:
=== Proprietary ===
* DeepCode integrates public and private [[GitHub]], [[GitLab]] and [[Bitbucket]] [[Software repository|repositories]] to identify code-fixes and improve software.<ref>{{Cite web |title=AI is coming for your coding job |url=https://sifted.eu/articles/ai-is-coming-for-your-coding-job/ |access-date=2019-04-15 |website=Sifted |date=13 March 2019 |language=en-US}}</ref>
* [[Kodezi, Inc.|Kodezi]] utilizes opensource data from [[GitHub]] [[Software repository|repositories]], [[Stack Overflow]], and private trained models to analyze code, provide solutions, and descriptions about the coding bugs instantly.<ref>{{Cite web |title=Ishraq Khan, Revolutionizing the Programming Scene in 2021 |url=https://www.techtimes.com/articles/265325/20210913/ishraq-khan-revolutionizing-the-programming-scene-in-2021.htm |access-date=2022-10-15 |website=TechTimes|date=13 September 2019 |language=en-US}}</ref>
== References ==
Line 120:
== External links ==
* {{URL|
[[Category:Debugging]]
|