Automatic bug fixing: Difference between revisions

Content deleted Content added
WikiCleanerBot (talk | contribs)
m v2.04b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation)
m General fixes, typo(s) fixed: eg. → e.g. (2)
Line 38:
=== Data-driven ===
 
[[Machine learning]] techniques can improve the effectiveness of automatic bug-fixing systems.<ref name="prophet" /> One example of such techniques learns from past successful patches from human developers collected from [[open-source software|open source]] [[software repository|repositories]] in [[GitHub]] and [[SourceForge]].<ref name="prophet" /> It then use the learned information to recognize and prioritize potentially correct patches among all generated candidate patches.<ref name="prophet" /> Alternatively, patches can be directly mined from existing sources. Example approaches include mining patches from donor applications<ref name="codephage" /> or from QA web sites.<ref name="QAFix" /> Learning can done online, aka continual learning, with the known precedent of online learning of patches from the stream of open source build results from continuous integration.<ref>{{Cite journal|last=Baudry|first=Benoit|last2=Chen|first2=Zimin|last3=Etemadi|first3=Khashayar|last4=Fu|first4=Han|last5=Ginelli|first5=Davide|last6=Kommrusch|first6=Steve|last7=Martinez|first7=Matias|last8=Monperrus|first8=Martin|last9=Ron Arteaga|first9=Javier|last10=Ye|first10=He|last11=Yu|first11=Zhongxing|date=2021|title=A Software-Repair Robot Based on Continual Learning|url=https://arxiv.org/abs/2012.06824|journal=IEEE Software|volume=38|issue=4|pages=28–35|doi=10.1109/MS.2021.3070743|issn=0740-7459|arxiv=2012.06824}}</ref>
 
SequenceR uses [[Neural machine translation|sequence-to-sequence learning]] on source code in order to generate one-line patches.<ref>{{Cite journal |last=Chen |first=Zimin |last2=Kommrusch |first2=Steve James |last3=Tufano |first3=Michele |last4=Pouchet |first4=Louis-Noel |last5=Poshyvanyk |first5=Denys |last6=Monperrus |first6=Martin |date=2019 |title=SEQUENCER: Sequence-to-Sequence Learning for End-to-End Program Repair |journal=IEEE Transactions on Software Engineering |pages=1 |arxiv=1901.01808 |doi=10.1109/TSE.2019.2940179 |issn=0098-5589 |s2cid=57573711}}</ref> It defines a neural network architecture that works well with source code, with the copy mechanism that allows to produce patches with tokens that are not in the learned vocabulary. Those tokens are taken from the code of the Java class under repair.
Line 51:
There are multiple uses of automatic bug fixing:
* In a development environment: When encountering a bug the developer activates a feature to search for a patch (for instance by clicking on a button). This search can also happen in the background, when the IDE proactively searches for solutions to potential problems, without waiting for explicit action from the developer.<ref>{{Cite journal |last=Muşlu |first=Kıvanç |last2=Brun |first2=Yuriy |last3=Holmes |first3=Reid |last4=Ernst |first4=Michael D. |last5=Notkin |first5=David |last6=Muşlu |first6=Kıvanç |last7=Brun |first7=Yuriy |last8=Holmes |first8=Reid |last9=Ernst |first9=Michael D. |last10=Notkin |first10=David |date=19 October 2012 |title=Speculative analysis of integrated development environment recommendations, Speculative analysis of integrated development environment recommendations |journal=ACM SIGPLAN Notices |volume=47 |issue=10 |pages=669, 669–682, 682 |citeseerx=10.1.1.259.6341 |doi=10.1145/2384616.2384665 |issn=0362-1340 |s2cid=5795141}}</ref>
* In a continuous integration server: When a build fails during continuous integration, a patch search can be attempted as soon as the build has failed. If the search is successful, the patch is provided to the developer.<ref>{{Cite book |last=Urli |first=Simon |title=How to design a program repair bot?: insights from the repairnator project |last2=Yu |first2=Zhongxing |last3=Seinturier |first3=Lionel |last4=Monperrus |first4=Martin |date=27 May 2018 |isbn=9781450356596 |pages=95–104 |chapter=How to design a program repair bot? |arxiv=1811.09852 |doi=10.1145/3183519.3183540 |chapter-url=https://hal.archives-ouvertes.fr/hal-01691496/document |s2cid=49237449}}</ref> When a synthesized patch is suggested to the developers as pull-request, an explanation has to be provided in addition to the code changes (ege.g. a pull request title and description).<ref>{{Cite book |last=Monperrus |first=Martin |title=2019 IEEE/ACM 1st International Workshop on Bots in Software Engineering (BotSE) |year=2019 |isbn=978-1-7281-2262-5 |pages=12–15 |chapter=Explainable Software Bot Contributions: Case Study of Automated Bug Fixes |arxiv=1905.02597 |bibcode=2019arXiv190502597M |doi=10.1109/BotSE.2019.00010 |s2cid=146808763}}</ref> An experiment has shown that generated patches can be accepted by open-source developers and merged in the code repository.<ref>{{Cite journal |last=Monperrus |first=Martin |last2=Urli |first2=Simon |last3=Durieux |first3=Thomas |last4=Martinez |first4=Matias |last5=Baudry |first5=Benoit |last6=Seinturier |first6=Lionel |date=2019 |title=Repairnator patches programs automatically |url=https://hal.archives-ouvertes.fr/hal-02267512/document |journal=Ubiquity |volume=2019 |issue=July |pages=1–12 |arxiv=1910.06247 |bibcode=2019arXiv191006247M |doi=10.1145/3349589 |s2cid=198986312}}</ref>
* At runtime: When a failure happens at runtime, a binary patch can be searched for and [[Self-modifying code|applied online]]. An example of such a repair system is ClearView,<ref name="clearview">{{Cite book |last=Perkins |first=Jeff H. |title=Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles |date=2009 |publisher=ACM |isbn=978-1-60558-752-3 |pages=87–102 |chapter=Automatically patching errors in deployed software |citeseerx=10.1.1.157.5877 |doi=10.1145/1629575.1629585 |display-authors=etal |s2cid=7597529}}</ref> which does repair on x86 code, with x86 binary patches. The Itzal system<ref>{{Cite book |last=Durieux |first=Thomas |title=2017 IEEE/ACM 39th International Conference on Software Engineering: New Ideas and Emerging Technologies Results Track (ICSE-NIER) |last2=Hamadi |first2=Youssef |last3=Monperrus |first3=Martin |year=2017 |isbn=978-1-5386-2675-7 |pages=23–26 |chapter=Production-driven patch generation |arxiv=1812.04475 |doi=10.1109/icse-nier.2017.8 |chapter-url=https://hal.archives-ouvertes.fr/hal-01463689/document |s2cid=7737476}}</ref> is different from Clearview: while the repair search happens at runtime, in production, the produced patches are at the source code level. The BikiniProxy system does online repair of JavaScript errors happening in the browser.<ref>{{Cite book |last=Durieux |first=Thomas |title=2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE) |last2=Hamadi |first2=Youssef |last3=Monperrus |first3=Martin |year=2018 |isbn=978-1-5386-8321-7 |pages=1–12 |chapter=Fully Automated HTML and Javascript Rewriting for Constructing a Self-Healing Web Proxy |arxiv=1803.08725 |bibcode=2018arXiv180308725D |doi=10.1109/ISSRE.2018.00012 |s2cid=4268784}}</ref>
 
Line 58:
In essence, automatic bug fixing is a search activity, whether deductive-based or heuristic-based. The search space of automatic bug fixing is composed of all edits that can be possibly made to a program. There have been studies to understand the structure of this search space. Qi et al.<ref>{{Cite book |last=Qi |first=Yuhua |title=The strength of random search on automated program repair |last2=Mao |first2=Xiaoguang |last3=Lei |first3=Yan |last4=Dai |first4=Ziying |last5=Wang |first5=Chengsong |date=2014-05-31 |publisher=ACM |isbn=9781450327565 |pages=254–265 |doi=10.1145/2568225.2568254 |s2cid=14976851}}</ref> showed that the original fitness function of Genprog is not better than random search to drive the search. Martinez et al.<ref>{{Cite journal |last=Martinez |first=Matias |last2=Monperrus |first2=Martin |date=2013-11-28 |title=Mining software repair models for reasoning on the search space of automated program fixing |url=https://hal.archives-ouvertes.fr/hal-00903808/document |journal=Empirical Software Engineering |language=en |volume=20 |issue=1 |pages=176–205 |arxiv=1311.3414 |bibcode=2013arXiv1311.3414M |doi=10.1007/s10664-013-9282-8 |issn=1382-3256 |s2cid=1676168}}</ref> explored the imbalance between possible repair actions, showing its significant impact on the search. Long et al.'s<ref name="spaceanalysis" /> study indicated that correct patches can be considered as sparse in the search space and that incorrect overfitting patches are vastly more abundant (see also discussion about overfitting below).
 
If one explicitly enumerates all possible variants in a repair algorithm, this defines a design space for program repair.<ref name="Martinez2019">{{Cite journal |last=Martinez |first=Matias |last2=Monperrus |first2=Martin |date=2019 |title=Astor: Exploring the design space of generate-and-validate program repair beyond GenProg |journal=Journal of Systems and Software |volume=151 |pages=65–80 |arxiv=1802.03365 |doi=10.1016/j.jss.2019.01.069 |s2cid=3619320}}</ref> Each variant selects an algorithm involved at some point in the repair process (ege.g. the fault localization algorithm), or selects a specific heuristic which yields different patches. For instance, in the design space of generate-and-validate program repair, there is one variation point about the granularity of the program elements to be modified: an expression, a statement, a block, etc.<ref name="Martinez2019" />
 
== Overfitting ==
Line 103:
* QACrashFix:<ref name="QAFix">{{Cite book |last=Gao |first=Qing |title=2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE) |last2=Zhang |first2=Hansheng |last3=Wang |first3=Jie |last4=Xiong |first4=Yingfei |last5=Zhang |first5=Lu |last6=Mei |first6=Hong |date=2015 |publisher=IEEE |isbn=978-1-5090-0025-8 |pages=307–318 |chapter=Fixing Recurring Crash Bugs via Analyzing Q&A Sites |doi=10.1109/ASE.2015.81 |s2cid=2513924}}</ref> A tool that fixes Java crash bugs by mining fixes from Q&A web site.
* Astor:<ref name="astor">{{Cite book |last=Martinez |first=Matias |title=Proceedings of ISSTA, Demonstration Track |last2=Monperrus |first2=Martin |date=2016 |isbn=978-1-4503-4390-9 |pages=441–444 |chapter=ASTOR: A Program Repair Library for Java |doi=10.1145/2931037.2948705 |chapter-url=https://hal.archives-ouvertes.fr/hal-01321615/file/astor.pdf |s2cid=7322935}}</ref> An automatic repair library for Java, containing jGenProg, a Java implementation of GenProg.
* ARJA:<ref name="arja">{{Cite journal |last=Yuan |first=Yuan |last2=Banzhaf |first2=Wolfgang |date=2020 |title=ARJA: Automated Repair of Java Programs via Multi-Objective Genetic Programming |url=https://doi.org/10.1109/TSE.2018.2874648 |journal=IEEE Transactions on Software Engineering |volume=46 |issue=10 |pages=1040-10671040–1067 |arxiv=1712.07804 |doi=10.1109/TSE.2018.2874648}}</ref> A repair tool for Java based on multi-objective genetic programming.
* NpeFix:<ref name="npefix">{{Cite book |last=Durieux |first=Thomas |title=2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER) |date=2017 |isbn=978-1-5090-5501-2 |pages=349–358 |chapter=Dynamic Patch Generation for Null Pointer Exceptions Using Metaprogramming |arxiv=1812.00409 |doi=10.1109/SANER.2017.7884635 |s2cid=2736203}}</ref> An automatic repair tool for NullPointerException in Java, available [https://github.com/Spirals-Team/npefix on Github].