Content deleted Content added
mention the concept if AST based merge |
introduce section about usages with "AST differencing", "Structured Merging", "Clone Detection" |
||
Line 44:
To support compiler verification it should be possible to unparse an AST into source code form. The source code produced should be sufficiently similar to the original in appearance and identical in execution, upon recompilation.
The AST is used intensively during [[Semantic analysis (compilers)|semantic analysis]], where the compiler checks for correct usage of the elements of the program and the language. The compiler also generates [[symbol table]]s based on the AST during semantic analysis. A complete traversal of the tree allows verification of the correctness of the program.
After verifying correctness, the AST serves as the base for code generation. The AST is often used to generate an intermediate representation (IR), sometimes called an [[intermediate language]], for the code generation.
== AST differencing ==▼
AST differencing, or for short tree differencing, consists of computing the list of differences between two ASTs. This list of differences is typically called an edit script. The edit script directly refers to the AST of the code. For instance, an edit action may the addition of a new AST node representing a function. The problem of computing good AST edit scripts is hard, with two main challenges: handling move actions, and scaling to fine-grained ASTs with thousands of nodes.<ref>{{Cite journal|last=Falleri|first=Jean-Rémy|last2=Morandat|first2=Floréal|last3=Blanc|first3=Xavier|last4=Martinez|first4=Matias|last5=Monperrus|first5=Martin|date=2014-09-15|title=Fine-grained and accurate source code differencing|url=https://dl.acm.org/doi/10.1145/2642937.2642982|journal=Proceedings of the 29th ACM/IEEE international conference on Automated software engineering|language=en|___location=Vasteras Sweden|publisher=ACM|pages=313–324|doi=10.1145/2642937.2642982|isbn=978-1-4503-3013-8}}</ref>▼
== Other usages ==
▲=== AST differencing ===
▲AST differencing, or for short tree differencing, consists of computing the list of differences between two ASTs.<ref>{{Cite journal |last=Fluri |first=Beat |last2=Wursch |first2=Michael |last3=PInzger |first3=Martin |last4=Gall |first4=Harald |date=2007 |title=Change Distilling:Tree Differencing for Fine-Grained Source Code Change Extraction |url=http://dx.doi.org/10.1109/tse.2007.70731 |journal=IEEE Transactions on Software Engineering |volume=33 |issue=11 |pages=725–743 |doi=10.1109/tse.2007.70731 |issn=0098-5589}}</ref> This list of differences is typically called an edit script. The edit script directly refers to the AST of the code. For instance, an edit action may the addition of a new AST node representing a function. The problem of computing good AST edit scripts is hard, with two main challenges: handling move actions, and scaling to fine-grained ASTs with thousands of nodes.<ref>{{Cite journal|last=Falleri|first=Jean-Rémy|last2=Morandat|first2=Floréal|last3=Blanc|first3=Xavier|last4=Martinez|first4=Matias|last5=Monperrus|first5=Martin|date=2014-09-15|title=Fine-grained and accurate source code differencing|url=https://dl.acm.org/doi/10.1145/2642937.2642982|journal=Proceedings of the 29th ACM/IEEE international conference on Automated software engineering|language=en|___location=Vasteras Sweden|publisher=ACM|pages=313–324|doi=10.1145/2642937.2642982|isbn=978-1-4503-3013-8}}</ref>
=== Structured merge ===
With proper AST differencing, ASTs can be used for merging versions and branches, this is known as [[Merge (version control)|structured merge]]. The advantage is to avoid spurious conflicts due to lines and formatting.<ref>{{Cite journal|last=Larsen|first=Simon|last2=Falleri|first2=Jean-Remy|last3=Baudry|first3=Benoit|last4=Monperrus|first4=Martin|date=2022|title=Spork: Structured Merge for Java with Formatting Preservation|url=https://ieeexplore.ieee.org/document/9684709/|journal=IEEE Transactions on Software Engineering|pages=1–1|doi=10.1109/TSE.2022.3143766|issn=0098-5589}}</ref>
=== Clone detection ===
An AST is a powerful abstraction to perform code [[clone detection]].<ref>{{Cite journal |last=Koschke |first=Rainer |last2=Falke |first2=Raimar |last3=Frenzel |first3=Pierre |date=2006 |title=Clone Detection Using Abstract Syntax Suffix Trees |url=http://dx.doi.org/10.1109/wcre.2006.18 |journal=2006 13th Working Conference on Reverse Engineering |publisher=IEEE |doi=10.1109/wcre.2006.18}}</ref>
== See also ==
Line 77 ⟶ 81:
* {{cite web |last=Jones |first=Joel |title=Abstract Syntax Tree Implementation Idioms |url=http://www.hillside.net/plop/plop2003/Papers/Jones-ImplementingASTs.pdf }} (overview of AST implementation in various language families)
* {{cite conference |last1=Neamtiu |first1=Iulian |last2=Foster |first2=Jeffrey S. |last3=Hicks |first3=Michael |date=May 17, 2005 |title=Understanding Source Code Evolution Using Abstract Syntax Tree Matching |conference=MSR'05 |___location=Saint Louis, Missouri |publisher=ACM |citeseerx=10.1.1.88.5815}}
* {{cite thesis |last=Würsch |first=Michael |degree=Diploma |title=Improving Abstract Syntax Tree based Source Code Change Detection |url=http://www.ifi.uzh.ch/seal/research/tools/archive/changeDetection.html}}
* {{cite web |last=Lucas |first=Jason |title=Thoughts on the Visual C++ Abstract Syntax Tree (AST) |date=16 August 2006 |url=https://devblogs.microsoft.com/cppblog/thoughts-on-the-visual-c-abstract-syntax-tree-ast/}}
|