Automatic bug fixing

Automatic bug fixing is performed according to a specification of the expected behavior, which can be, for instance, a [[formal specification]] or a [[test suite]].<ref name="genprog2009">{{Cite book |last1=Weimer |first1=Westley |title=Proceedings of the 31st International Conference on Software Engineering |last2=Nguyen |first2=ThanhVu |last3=Le Goues |first3=Claire|author3-link=Claire Le Goues |last4=Forrest |first4=Stephanie |date=2009 |publisher=IEEE |isbn=978-1-4244-3453-4 |pages=364–374 |chapter=Automatically finding patches using genetic programming |citeseerx=10.1.1.147.8995 |doi=10.1109/ICSE.2009.5070536 |s2cid=1706697}}</ref>
 
A test suite – the input/output pairs that specify the functionality of the program, possibly captured in [[Assertion (software development)|assertions]] – can be used as a [[test oracle]] to drive the search. This oracle can be divided into the ''bug oracle'', which exposes the faulty behavior, and the ''regression oracle'', which encapsulates the functionality any program repair method must preserve. Note that a test suite is typically incomplete and does not cover all possible cases. Therefore, it is often possible for a validated patch to produce expected outputs for all inputs in the test suite but incorrect outputs for other inputs.<ref name="kali">{{Cite book |last1=Qi |first1=Zichao |title=Proceedings of the 2015 International Symposium on Software Testing and Analysis |last2=Long |first2=Fan |last3=Achour |first3=Sara |last4=Rinard |first4=Martin |date=2015 |publisher=ACM |isbn=978-1-4503-3620-8 |chapter=An Analysis of Patch Plausibility and Correctness for Generate-and-Validate Patch Generation Systems |citeseerx=10.1.1.696.5616 |doi=10.1145/2771783.2771791 |s2cid=6845282}}</ref> The existence of such validated but incorrect patches is a major challenge for generate-and-validate techniques.<ref name="kali" /> Recent successful automatic bug-fixing techniques often rely on additional information beyond the test suite, such as information learned from previous human patches, to further identify correct patches among validated patches.<ref name="prophet">{{Cite book |last1=Long |first1=Fan |title=Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages |last2=Rinard |first2=Martin |date=2016 |publisher=ACM |isbn=978-1-4503-3549-2 |pages=298–312 |chapter=Automatic patch generation by learning correct code |doi=10.1145/2837614.2837617 |s2cid=6091588}}</ref>
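The split between the two oracles, and the overfitting problem, can be illustrated with a minimal sketch (the functions and test suite below are hypothetical, not taken from any real repair system): a "validated" patch passes every test in the suite yet remains incorrect on untested inputs.

```python
def buggy_abs(x):      # defect: wrong result for negative inputs
    return x

def patched_abs(x):    # "validated" patch that overfits the test suite
    if x == -3:
        return 3
    return x

# (input, expected output) pairs acting as the test oracle
test_suite = [(-3, 3), (2, 2), (0, 0)]

def bug_oracle(f):
    # tests the buggy program fails; a fix must make them pass
    return all(f(x) == y for x, y in test_suite if buggy_abs(x) != y)

def regression_oracle(f):
    # tests the buggy program already passes; a fix must keep them passing
    return all(f(x) == y for x, y in test_suite if buggy_abs(x) == y)

# the patch satisfies both oracles...
assert bug_oracle(patched_abs) and regression_oracle(patched_abs)
# ...yet patched_abs(-5) == -5: correct on the suite, wrong in general
```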
 
Another way to specify the expected behavior is to use [[formal specification]]s.<ref name="autofixe">{{Cite journal |last1=Pei |first1=Yu |last2=Furia |first2=Carlo A. |last3=Nordio |first3=Martin |last4=Wei |first4=Yi |last5=Meyer |first5=Bertrand |last6=Zeller |first6=Andreas |date=May 2014 |title=Automated Fixing of Programs with Contracts |journal=IEEE Transactions on Software Engineering |volume=40 |issue=5 |pages=427–449 |arxiv=1403.1117 |bibcode=2014arXiv1403.1117P |doi=10.1109/TSE.2014.2312918 |s2cid=53302638}}</ref><ref>{{Cite journal |title=Contract-based Data Structure Repair Using Alloy |citeseerx=10.1.1.182.4390}}</ref> Verification against full specifications that describe the whole program behavior, including its functionality, is less common, because such specifications are typically not available in practice and because the computational cost of such [[formal verification|verification]] is prohibitive. For specific classes of errors, however, implicit partial specifications are often available. For example, there are targeted bug-fixing techniques that validate that the patched program can no longer trigger overflow errors on the same execution path.<ref name="codephage" />
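The idea of an implicit partial specification can be sketched as follows (the checker and programs below are illustrative assumptions, not the cited technique itself): the only property verified is that the patched program no longer overflows on the failure-triggering input, with no full functional specification required.

```python
INT32_MAX = 2**31 - 1  # 32-bit signed integer limit, as in C

def overflows(f, x):
    # implicit partial specification: "does not overflow on input x"
    try:
        f(x)
        return False
    except OverflowError:
        return True

def buggy(n):
    # in C the addition itself would wrap; modeled here as an exception
    if n + 1 > INT32_MAX:
        raise OverflowError
    return n + 1

def patched(n):
    if n >= INT32_MAX:   # guard inserted before the arithmetic
        return INT32_MAX
    return n + 1

crashing_input = INT32_MAX
assert overflows(buggy, crashing_input)
assert not overflows(patched, crashing_input)
```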
=== Generate-and-validate ===
 
Generate-and-validate approaches compile and test each candidate patch to collect all validated patches that produce expected outputs for all inputs in the test suite.<ref name="genprog2009" /><ref name="kali" /> Such a technique typically starts with a test suite of the program, i.e., a set of [[test case (software)|test case]]s, at least one of which exposes the bug.<ref name="genprog2009" /><ref name="prophet" /><ref name="rsrepair">{{Cite book |last1=Qi |first1=Yuhua |title=Proceedings of the 36th International Conference on Software Engineering |last2=Mao |first2=Xiaoguang |last3=Lei |first3=Yan |last4=Dai |first4=Ziying |last5=Wang |first5=Chengsong |date=2014 |publisher=ACM |isbn=978-1-4503-2756-5 |series=ICSE 2014 |___location=Austin, Texas |pages=254–265 |chapter=The Strength of Random Search on Automated Program Repair |doi=10.1145/2568225.2568254 |s2cid=14976851}}</ref><ref name="spr">{{Cite book |last1=Long |first1=Fan |title=Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering |last2=Rinard |first2=Martin |date=2015 |publisher=ACM |isbn=978-1-4503-3675-8 |series=ESEC/FSE 2015 |___location=Bergamo, Italy |pages=166–178 |chapter=Staged Program Repair with Condition Synthesis |citeseerx=10.1.1.696.9059 |doi=10.1145/2786805.2786811 |s2cid=5987616}}</ref> An early generate-and-validate bug-fixing system was GenProg.<ref name="genprog2009" /> The effectiveness of generate-and-validate techniques remains controversial, because they typically do not provide [[#Limitations of automatic bug-fixing|patch correctness guarantees]].<ref name="kali" /> Nevertheless, the reported results of recent state-of-the-art techniques are generally promising. For example, on a systematically collected set of 69 real-world bugs in eight large [[C (programming language)|C software programs]], the state-of-the-art bug-fixing system Prophet generates correct patches for 18 of the 69 bugs.<ref name="prophet" />
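The generate-and-validate loop can be sketched minimally as follows (a hypothetical example: real systems such as GenProg mutate abstract syntax trees, whereas here the candidates simply enumerate comparison operators at an assumed fault ___location). Note that more than one candidate passes the whole suite, illustrating why validation alone does not guarantee correctness.

```python
def buggy_max(a, b):
    return a if a < b else b          # defect: comparison is inverted

# test suite: ((inputs), expected output); the first case exposes the bug
test_suite = [((3, 5), 5), ((7, 2), 7), ((1, 1), 1)]

def validate(f):
    # a patch is "validated" if it passes every test in the suite
    return all(f(*args) == out for args, out in test_suite)

# generate: candidate patches varying the operator at the fault ___location
candidates = [
    lambda a, b: a if a < b else b,
    lambda a, b: a if a <= b else b,
    lambda a, b: a if a > b else b,
    lambda a, b: a if a >= b else b,
]

# validate: keep every candidate that passes the whole suite
validated = [f for f in candidates if validate(f)]
assert not validate(buggy_max) and len(validated) == 2
```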
 
<!-- mutation based repair -->
In C, the ManyBugs benchmark collected by the GenProg authors contains 69 real-world defects and is widely used to evaluate many other bug-fixing tools for C.<ref name=genprog2012 /><ref name=prophet /><ref name=spr /><ref name=angelix />
 
In [[Java (programming language)|Java]], the main benchmark is Defects4J, which is now extensively used in most research papers on program repair for Java.<ref name="capgen">{{Cite book |last1=Wen |first1=Ming |last2=Chen |first2=Junjie |last3=Wu |first3=Rongxin |last4=Hao |first4=Dan |last5=Cheung |first5=Shing-Chi |title=Proceedings of the 40th International Conference on Software Engineering |chapter=Context-aware patch generation for better automated program repair |date=2018 |___location=New York, New York, USA |publisher=ACM Press |pages=1–11 |doi=10.1145/3180155.3180233 |isbn=9781450356381 |s2cid=3374770|url=https://repository.hkust.edu.hk/ir/Record/1783.1-92186 |chapter-url=https://repository.ust.hk/ir/Record/1783.1-92186 }}</ref><ref>{{Cite book |last1=Hua |first1=Jinru |last2=Zhang |first2=Mengshi |last3=Wang |first3=Kaiyuan |last4=Khurshid |first4=Sarfraz |title=Proceedings of the 40th International Conference on Software Engineering |chapter=Towards practical program repair with on-demand candidate generation |date=2018 |___location=New York, New York, USA |publisher=ACM Press |pages=12–23 |doi=10.1145/3180155.3180245 |isbn=9781450356381 |s2cid=49666327|doi-access=free }}</ref> Alternative benchmarks exist, such as the QuixBugs benchmark,<ref>{{Cite book |last1=Lin |first1=Derrick |last2=Koppel |first2=James |last3=Chen |first3=Angela |last4=Solar-Lezama |first4=Armando |title=Proceedings Companion of the 2017 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity |chapter=QuixBugs: A multi-lingual program repair benchmark set based on the quixey challenge |date=2017 |___location=New York, New York, USA |publisher=ACM Press |pages=55–56 |doi=10.1145/3135932.3135941 |isbn=9781450355148 |doi-access=free}}</ref> which contains original bugs for program repair. Other benchmarks of Java bugs include Bugs.jar,<ref>{{Cite book |last1=Saha |first1=Ripon K. |last2=Lyu |first2=Yingjun |last3=Lam |first3=Wing |last4=Yoshida |first4=Hiroaki |last5=Prasad |first5=Mukul R. |title=Proceedings of the 15th International Conference on Mining Software Repositories |chapter=Bugs.jar |date=2018 |chapter-url=http://dl.acm.org/citation.cfm?doid=3196398.3196473 |series=MSR '18 |language=en |pages=10–13 |doi=10.1145/3196398.3196473 |isbn=9781450357166 |s2cid=50770093}}</ref> based on past commits.
 
== Example tools ==
== External links ==
 
* {{URL|https://program-repair.org/}} datasets, tools, etc., related to automated program repair research.
 
[[Category:Debugging]]