Revision as of 09:41, 11 August 2021 by Martin.monperrus (→Overfitting: mentions test generation based on reference patch)
Revision as of 11:33, 12 August 2021 by Martin.monperrus (→Data-driven: add continual learning for repair)
=== Data-driven ===
[[Machine learning]] techniques can improve the effectiveness of automatic bug-fixing systems.<ref name="prophet" /> One such technique learns from past successful patches written by human developers, collected from [[open-source software\|open source]] [[software repository\|repositories]] on [[GitHub]] and [[SourceForge]].<ref name="prophet" /> It then uses the learned information to recognize and prioritize potentially correct patches among all generated candidate patches.<ref name="prophet" />

Alternatively, patches can be mined directly from existing sources. Example approaches include mining patches from donor applications<ref name="codephage" /> or from QA web sites.<ref name="QAFix" />

Learning can be done online, also known as continual learning; a known precedent is the online learning of patches from the stream of open-source build results produced by continuous integration.<ref>{{Cite journal\|last=Baudry\|first=Benoit\|last2=Chen\|first2=Zimin\|last3=Etemadi\|first3=Khashayar\|last4=Fu\|first4=Han\|last5=Ginelli\|first5=Davide\|last6=Kommrusch\|first6=Steve\|last7=Martinez\|first7=Matias\|last8=Monperrus\|first8=Martin\|last9=Ron Arteaga\|first9=Javier\|last10=Ye\|first10=He\|last11=Yu\|first11=Zhongxing\|date=2021\|title=A Software-Repair Robot Based on Continual Learning\|url=https://arxiv.org/abs/2012.06824\|journal=IEEE Software\|volume=38\|issue=4\|pages=28–35\|doi=10.1109/MS.2021.3070743\|issn=0740-7459}}</ref>

SequenceR uses [[Neural machine translation\|sequence-to-sequence learning]] on source code to generate one-line patches.<ref>{{Cite journal \|last=Chen \|first=Zimin \|last2=Kommrusch \|first2=Steve James \|last3=Tufano \|first3=Michele \|last4=Pouchet \|first4=Louis-Noel \|last5=Poshyvanyk \|first5=Denys \|last6=Monperrus \|first6=Martin \|date=2019 \|title=SEQUENCER: Sequence-to-Sequence Learning for End-to-End Program Repair \|journal=IEEE Transactions on Software Engineering \|pages=1 \|arxiv=1901.01808 \|doi=10.1109/TSE.2019.2940179 \|issn=0098-5589 \|s2cid=57573711}}</ref> It defines a neural network architecture suited to source code, with a copy mechanism that allows it to produce patches containing tokens that are not in the learned vocabulary. Those tokens are taken from the code of the Java class under repair.
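
The idea behind a copy mechanism can be illustrated with a minimal sketch. It assumes one common formulation, in which the output distribution is a weighted mix of a fixed-vocabulary distribution and an attention distribution over the input tokens; it is not taken from the SequenceR implementation. The function name <code>copy_mixture</code>, the parameter <code>p_gen</code>, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def copy_mixture(
    vocab_probs: Dict[str, float],
    source_tokens: List[str],
    attention: List[float],
    p_gen: float,
) -> Dict[str, float]:
    """Blend a fixed-vocabulary distribution with a copy distribution.

    vocab_probs   -- probabilities over the learned vocabulary (hypothetical values)
    source_tokens -- tokens of the code under repair
    attention     -- one attention weight per source token, summing to 1
    p_gen         -- probability of generating from the vocabulary rather than copying
    """
    if len(source_tokens) != len(attention):
        raise ValueError("need exactly one attention weight per source token")
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")

    # Share of probability mass assigned to generating a vocabulary token.
    final = {tok: p_gen * p for tok, p in vocab_probs.items()}

    # Remaining mass is spread over source tokens according to attention,
    # so identifiers outside the vocabulary can still be produced.
    for tok, weight in zip(source_tokens, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


# Illustrative input: "maxRetryCount" is absent from the learned vocabulary.
vocab_probs = {"return": 0.5, "<unk>": 0.3, ";": 0.2}
source_tokens = ["return", "maxRetryCount", ";"]
attention = [0.1, 0.8, 0.1]

result = copy_mixture(vocab_probs, source_tokens, attention, p_gen=0.4)
best_token = max(result, key=result.get)
```

In this toy input the identifier <code>maxRetryCount</code> does not appear in the vocabulary, yet it receives the largest share of probability because the attention weights point at it in the source tokens. A model limited to the fixed vocabulary could only have emitted the placeholder <code>&lt;unk&gt;</code> at that position.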

Automatic bug fixing: Difference between revisions