DMS Software Reengineering Toolkit: Difference between revisions

Disambiguated: GLR → GLR parser using Dab solver
m rm some extra spaces throughout the article
Line 24:
| website = {{URL|http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html}}
}}
 
The '''DMS Software Reengineering Toolkit'''<ref>[http://portal.acm.org/citation.cfm?id=999466&dl=GUIDE&coll=GUIDE&CFID=55567354&CFTOKEN=76359207 ''DMS: Program Transformations for Practical Scalable Software Evolution''. Proceedings International Conference on Software Engineering 2004] [http://www.semanticdesigns.com/Company/Publications/DMS-for-ICSE2004-reprint.pdf Reprint]</ref> is a proprietary set of [[program transformation]] tools for automating custom analysis, modification, translation, or generation of source code in large-scale software systems written in arbitrary mixtures of source languages.
 
DMS has been used to implement a wide variety of practical tools, including [[domain-specific language]]s (such as code generation for factory control), test coverage<ref>[http://www.semanticdesigns.com/Company/Publications/TestCoverage.pdf Branch Coverage for Arbitrary Languages Made Easy]</ref> and profiling tools, [[Duplicate code|clone detection]],<ref>[http://www.computer.org/portal/web/csdl/doi/10.1109/ICSM.1998.738528 ''Clone Detection Using Abstract Syntax Trees''. Proceedings International Conference on Software Maintenance 1998]</ref> language migration tools, and C++ component reengineering.<ref>[http://linkinghub.elsevier.com/retrieve/pii/S0950584906001856 ''Case study: Re-engineering C++ component models via automatic program transformation''. Information and Software Technology 2007]</ref>
 
The toolkit provides means for defining language grammars and will produce [[parser]]s which automatically construct [[abstract syntax trees]] (ASTs), and [[prettyprinter]]s to convert original or modified ASTs back into compilable source text. The parse trees capture, and the prettyprinters regenerate, complete detail about the original source program, including source position, comments, radix and format of numbers, etc., to ensure that regenerated source text is as recognizable to a programmer as the original text modulo any applied transformations.
Line 34 ⟶ 35:
DMS uses [[GLR parser|GLR]] parsing technology, enabling it to handle all practical context-free grammars. Semantic predicates extend this capability to interesting non-context-free grammars ([[Fortran]] requires matching of multiple DO loops with shared CONTINUE statements by label; GLR with semantic predicates enables the DMS Fortran parser to produce ASTs for correctly nested loops as it parses).
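
The nesting that the Fortran label matching must produce can be illustrated with a minimal Python sketch. It does not implement GLR parsing or DMS semantic predicates; it only shows how one labeled statement can close several open DO loops at once. The names <code>Stmt</code>, <code>DoLoop</code> and <code>nest_do_loops</code> are hypothetical, and the input is assumed to be already split into (label, text) pairs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Stmt:
    text: str                      # statement text, without its label
    label: Optional[str] = None    # numeric statement label, if any


@dataclass
class DoLoop:
    end_label: str                 # label named in the DO header
    header: str
    body: List[Union["DoLoop", Stmt]] = field(default_factory=list)


def nest_do_loops(lines):
    """Nest a flat list of (label, text) pairs into a loop tree.

    A labeled statement closes every open loop that names its label,
    innermost first, so one CONTINUE can end several loops at once.
    """
    root: List[Union[DoLoop, Stmt]] = []
    open_loops: List[DoLoop] = []          # innermost loop is last

    for label, text in lines:
        target = open_loops[-1].body if open_loops else root
        words = text.split()
        if len(words) > 1 and words[0] == "DO" and words[1].isdigit():
            loop = DoLoop(end_label=words[1], header=text)
            target.append(loop)
            open_loops.append(loop)
            continue
        # By choice, the terminal statement is stored in the innermost body.
        target.append(Stmt(text, label))
        # The label check plays the role of the semantic predicate here.
        while label is not None and open_loops and open_loops[-1].end_label == label:
            open_loops.pop()
    return root


program = [
    (None, "DO 10 I = 1, N"),
    (None, "DO 10 J = 1, M"),
    (None, "A(I,J) = 0"),
    ("10", "CONTINUE"),
    (None, "X = 1"),
]
tree = nest_do_loops(program)
```

In this sketch the single <code>10 CONTINUE</code> closes both loops, so the final assignment ends up at the top level next to the outer loop rather than inside it.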
 
DMS provides [[attribute grammar]] evaluators for computing custom analyses over ASTs, such as metrics, with special support for [[symbol table]] construction. Other program facts can be extracted by built-in control- and data-[[flow analysis]] engines, local and global [[pointer analysis]], whole-program [[call graph]] extraction, and symbolic range analysis by [[abstract interpretation]].
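
The idea of computing a metric as an attribute over an AST can be sketched in Python as follows. This is not the DMS attribute grammar notation; it is a minimal illustration of a synthesized (bottom-up) attribute, in which each node's value is computed from its children's values. The node kinds, the <code>Node</code> class and the decision-counting metric are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)


# One rule per node kind: combine the children's attribute values.
Rule = Callable[[List[int]], int]

DECISION_RULES: Dict[str, Rule] = {
    "if": lambda kids: 1 + sum(kids),      # each branch point counts once
    "while": lambda kids: 1 + sum(kids),
}


def evaluate(node: Node, rules: Dict[str, Rule]) -> int:
    """Compute a synthesized attribute bottom-up over the tree."""
    child_values = [evaluate(child, rules) for child in node.children]
    rule = rules.get(node.kind, sum)       # default: add up the children
    return rule(child_values)


function_ast = Node("function", [
    Node("if", [
        Node("expr"),
        Node("while", [Node("expr"), Node("assign")]),
    ]),
    Node("assign"),
])
decisions = evaluate(function_ast, DECISION_RULES)
```

Here <code>decisions</code> counts the <code>if</code> and <code>while</code> nodes in the tree. Keeping the rules in a table keyed by node kind loosely mirrors how an attribute grammar attaches one computation to each grammar rule.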
 
Changes to ASTs can be accomplished both by procedural methods coded in PARLANSE and by source-to-source tree transformations coded as rewrite rules using surface syntax, conditioned by any extracted program facts. The rewrite rule engine handles associative and commutative rules. A rewrite rule for C to replace a complex condition by the '''?:''' operator can be written as:
 
rule simplify_conditional_assignment(v:left_hand_side,e1:expression,e2:expression)
Line 44 ⟶ 45:
if no_side_effects(v);
 
Rewrite '''rule'''s have names, e.g. '''simplify_conditional_assignment'''. Each rule has a ''"match this"'' and ''"replace by that"'' pattern pair separated by '''->''', in our example, on separate lines for readability. The patterns must correspond to language syntax categories; in this case, both patterns must be of syntax category '''statement''', with the two categories also separated by '''->''' in sympathy with the patterns. Target language (e.g., C) surface syntax is coded inside meta-quotes '''"''', to separate rewrite-rule syntax from that of the target language. Backslashes inside meta-quotes represent domain escapes, to indicate pattern metavariables (e.g., '''\v''', '''\e1''', '''\e2''') that match any language construct corresponding to the metavariable declaration in the signature line, e.g., '''e1''' must be of syntactic category: ''(any) expression''. If a metavariable is mentioned multiple times in the ''match'' pattern, it must match identical subtrees; the same identically shaped '''v''' must occur in both assignments in the match pattern in this example. Metavariables in the ''replace'' pattern are replaced by the corresponding matches from the left side. A conditional clause '''if''' provides an additional condition that must be met for the rule to apply, e.g., that the matched metavariable '''v''', being an arbitrary left-hand side, must not have a side effect (e.g., cannot be of the form of '''a[i++]'''; the '''no_side_effects''' predicate is defined by an analyzer built with other DMS mechanisms).
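
The matching behavior described above can be approximated by a small Python sketch over tuple-shaped trees. It is not the DMS rule engine and does not handle associative or commutative matching. The tree encoding, the pattern shapes <code>LHS</code> and <code>RHS</code>, and the stand-in <code>no_side_effects</code> check are all assumptions made for illustration; in particular, the check here only looks for a <code>++</code> node, whereas the real predicate is supplied by a DMS analyzer.

```python
from typing import Dict, Optional, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]
Bindings = Dict[str, Tree]


def is_metavar(pattern: Tree) -> bool:
    # Metavariables are written with a leading backslash, as in \v.
    return isinstance(pattern, str) and pattern.startswith("\\")


def match(pattern: Tree, tree: Tree, env: Bindings) -> Optional[Bindings]:
    """Return extended bindings if pattern matches tree, else None."""
    if is_metavar(pattern):
        name = pattern[1:]
        if name in env:
            # A repeated metavariable must match an identical subtree.
            return env if env[name] == tree else None
        return {**env, name: tree}
    if isinstance(pattern, str) or isinstance(tree, str):
        return env if pattern == tree else None
    if len(pattern) != len(tree):
        return None
    for sub_pattern, sub_tree in zip(pattern, tree):
        env = match(sub_pattern, sub_tree, env)
        if env is None:
            return None
    return env


def substitute(pattern: Tree, env: Bindings) -> Tree:
    """Instantiate the replace pattern with the matched subtrees."""
    if is_metavar(pattern):
        return env[pattern[1:]]
    if isinstance(pattern, str):
        return pattern
    return tuple(substitute(p, env) for p in pattern)


def apply_rule(lhs: Tree, rhs: Tree, condition, tree: Tree) -> Tree:
    """Rewrite tree if lhs matches and the side condition holds."""
    env = match(lhs, tree, {})
    if env is None or not condition(env):
        return tree
    return substitute(rhs, env)


def no_side_effects(env: Bindings) -> bool:
    # Stand-in check (assumption): reject any subtree containing "++".
    def clean(t: Tree) -> bool:
        if isinstance(t, str):
            return t != "++"
        return all(clean(c) for c in t)
    return clean(env["v"])


# Assumed shapes: if (e1) v = e2; else v = e3;  ->  v = e1 ? e2 : e3;
LHS = ("if", "\\e1", ("=", "\\v", "\\e2"), ("=", "\\v", "\\e3"))
RHS = ("=", "\\v", ("?:", "\\e1", "\\e2", "\\e3"))

before = ("if", ("<", "x", "0"), ("=", "y", "a"), ("=", "y", "b"))
after = apply_rule(LHS, RHS, no_side_effects, before)
```

With these inputs the conditional statement is rewritten into a single assignment whose right-hand side is a <code>?:</code> node. A tree that assigns to two different targets, or whose target contains <code>++</code>, is returned unchanged.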
 
Achieving a complex transformation on code is accomplished by providing a number of rules that cooperate to achieve the desired effect. The ruleset is focused on portions of the program by metaprograms coded in PARLANSE.
 
A [http://www.semanticdesigns.com/Products/DMS/SimpleDMSDomainExample.html complete example] of a language definition and source-to-source transformation rules defined and applied is shown using high school [[algebra]] and a bit of [[calculus]] as a domain-specific language.
Line 52 ⟶ 53:
DMS has a variety of predefined language front ends, covering most real dialects of [[C (programming language)|C]] and [[C++]] including [[C++0x]], [[C Sharp (programming language)|C#]], [[Java (programming language)|Java]], [[Python (programming language)|Python]], [[PHP]], [[EGL (programming language)|EGL]], [[Fortran]], [[COBOL]], [[Visual Basic]], [[Verilog]], [[VHDL]] and some 20 or more other languages. Predefined languages enable customizers to immediately focus on their reengineering task rather than on the details of the languages to be processed.
 
DMS is additionally unusual in being implemented in a [[parallel programming]] language, PARLANSE, that uses [[symmetric multiprocessor]]s available on commodity [[workstations]]. This enables DMS to provide faster answers for large system analyses and conversions.
 
DMS was originally motivated by a theory for maintaining designs of software called ''Design Maintenance Systems.''<ref>[http://portal.acm.org/citation.cfm?id=129859 ''Design Maintenance Systems''. Communications of the ACM 1992] [http://www.semanticdesigns.com/Company/Publications/DMS-CACM-1992-baxter.pdf Reprint]