Abstract syntax tree

This is an old revision of this page, as edited by 128.226.4.39 (talk) at 17:25, 23 September 2005. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer science, an abstract syntax tree (AST) is a finite, labeled, directed tree, where the nodes are labeled by operators, and the edges represent the operands of the node operators. Thus, the leaves have nullary operators, i.e., variables or constants. In computing, it is used in a parser as an intermediate between a parse tree and a data structure, the latter which is often used as a compiler or interpreter's internal representation of a computer program while it is being optimized and from which code generation is performed. The range of all possible such structures is described by the abstract syntax. An AST differs from a parse tree by omitting nodes and edges for syntax rules that do not affect the semantics of the program. The classic example of such an omission is grouping parentheses, since in an AST the grouping of operands is explicit in the tree structure.

Creating an AST in a parser for a language described by a context free grammar, as nearly all programming languages are, is straightforward. Most rules in the grammar create a new node with the nodes edges being the symbols in the rule. Rules that do not contribute to the AST, such as grouping rules, merely pass through the node for one of their symbols. Alternatively, a parser can create a full parse tree, and a post-pass over the parse tree can convert it to an AST by removing the nodes and edges not used in the abstract syntax.

See also

References

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.