Definite assignment analysis: Difference between revisions

Revision as of 11:28, 14 May 2009 by Rich Farmbrough (m: clean up, spelling "et al." using AWB). Latest revision as of 17:40, 11 May 2020 by DannyS712 bot (m: Task 70: Update syntaxhighlight tags, remove use of deprecated source tags; tag: AWB).
(15 intermediate revisions by 13 users not shown)
In [[computer science]], '''definite assignment analysis''' is a [[data-flow analysis]] used by [[compiler]]s to conservatively ensure that a variable or location is always assigned to before it is used.

==Motivation==
In [[C programming language|C]] and [[C++]] programs, a source of particularly difficult-to-diagnose errors is the nondeterministic behavior that results from reading [[uninitialized variable]]s; this behavior can vary between platforms, builds, and even from run to run.

There are two common ways to solve this problem. One is to ensure that all locations are written before they are read. [[Rice's theorem]] establishes that this problem cannot be solved in general for all programs; however, it is possible to create a conservative (imprecise) analysis that accepts only programs that satisfy this constraint, while rejecting some correct programs, and definite assignment analysis is such an analysis. The [[Java programming language|Java]]<ref>{{cite web | author1= J. Gosling |author2=B. Joy |author3=G. Steele |author4=G. Bracha | title = The Java Language Specification, 3rd Edition | url= http://java.sun.com/docs/books/jls/third_edition/html/defAssign.html | accessdate = December 2, 2008 | pages = Chapter 16 (pp.527–552)}}</ref> and [[C Sharp (programming language)|C#]]<ref>{{cite web | title = Standard ECMA-334, C# Language Specification | work = ECMA International | url = http://www.ecma-international.org/publications/standards/Ecma-334.htm | accessdate = December 2, 2008 | pages = Section 12.3 (pp.122–133)}}</ref> programming language specifications require that the compiler report a compile-time error if the analysis fails. Both languages require a specific form of the analysis that is spelled out in meticulous detail.
In Java, this analysis was formalized by Stärk et al.,<ref>{{cite book |title=Java and the Java Virtual Machine: Definition, Verification, Validation |last=Stärk |first=Robert F. |author2=E. Börger |author3=Joachim Schmid |year=2001 |publisher=Springer-Verlag New York, Inc. |location=Secaucus, NJ, USA |isbn=3-540-42088-6 |pages=Section 8.3}}</ref> and some correct programs are rejected and must be altered to introduce explicit unnecessary assignments. In C#, this analysis was formalized by Fruja, and is precise as well as sound, in the sense that all variables assigned along all control flow paths will be considered definitely assigned.<ref name="fruja">{{cite journal |doi=10.5381/jot.2004.3.9.a2 |last=Fruja |first=Nicu G. |date=October 2004 |title=The Correctness of the Definite Assignment Analysis in C# |journal=Journal of Object Technology |volume=3 |issue=9 |pages=29–52 |url=http://www.jot.fm/issues/issue_2004_10/article2 |accessdate=2008-12-02 | quote=We actually prove more than correctness: we show that the solution of the analysis is a perfect solution (and not only a safe approximation). |citeseerx=10.1.1.165.6696 }}</ref> The [[Cyclone (programming language)|Cyclone]] language also requires programs to pass a definite assignment analysis, but only on variables with pointer types, to ease porting of C programs.<ref>{{cite web | title = Cyclone: Definite Assignment | work = Cyclone User's Manual | url = http://cyclone.thelanguage.org/wiki/Definite%20Assignment | accessdate = December 16, 2008 }}</ref>

The second way to solve the problem is to automatically initialize all locations to some fixed, predictable value at the point at which they are defined, but this introduces new assignments that may impede performance.
In this case, definite assignment analysis enables a [[compiler optimization]] where redundant assignments (assignments followed only by other assignments with no possible intervening reads) can be eliminated. With this approach, no programs are rejected, but programs for which the analysis fails to recognize definite assignment may contain redundant initialization. The [[Common Language Infrastructure]] relies on this approach.<ref>{{cite web | title = Standard ECMA-335, Common Language Infrastructure (CLI) | work = ECMA International | url = http://www.ecma-international.org/publications/standards/Ecma-335.htm | accessdate = December 2, 2008 | pages=Section 1.8.1.1 (Partition III, pg. 19)}}</ref>

==Terminology==
|}

We supply data-flow equations that define the values of these functions on various expressions and statements, in terms of the values of the functions on their syntactic subexpressions. Assume for the moment that there are no ''goto'', ''break'', ''continue'', ''return'', or [[exception handling]] statements. Following are a few examples of these equations:
* Any expression or statement ''e'' that does not affect the set of variables definitely assigned: ''after''(''e'') = ''before''(''e'')

The algorithm is complicated by the introduction of control-flow jumps like ''goto'', ''break'', ''continue'', ''return'', and exception handling. Any statement that can be the target of one of these jumps must intersect its ''before'' set with the set of definitely assigned variables at the jump source. When these are introduced, the resulting data-flow equations may have multiple fixed points, as in this example:

<syntaxhighlight lang="c" line>
int i = 1;
L:
goto L;
</syntaxhighlight>

Since the label L can be reached from two locations, the control-flow equation for goto dictates that ''before''(2) = ''after''(1) intersect ''before''(3).
But ''before''(3) = ''before''(2), so ''before''(2) = ''after''(1) intersect ''before''(2). This has two fixed points for ''before''(2): {i} and the empty set. However, it can be shown that because of the monotonic form of the data-flow equations, there is a unique maximal fixed point (fixed point of largest size) that provides the most possible information about the definitely assigned variables. Such a maximal (or maximum) fixed point may be computed by standard techniques; see [[data-flow analysis]].

An additional issue is that a control-flow jump may render certain control flows infeasible; for example, in this code fragment the variable ''i'' is definitely assigned before it is used:

<syntaxhighlight lang="c" line>
int i;
if (j < 0) return; else i = j;
print(i);
</syntaxhighlight>

The data-flow equation for ''if'' says that ''after''(2) = ''after''('''return''') intersect ''after''(''i'' = ''j''). To make this work out correctly, we define ''after''(''e'') = ''vars''(''e'') for all control-flow jumps; this is vacuously valid in the same sense that the equation ''false''('''true''') = ''vars''(''e'') is valid, because it is not possible for control to reach a point immediately after a control-flow jump. With this definition, the intersection reduces to ''after''(''i'' = ''j''), which contains ''i'', so ''i'' is definitely assigned at line 3.

<references />

{{DEFAULTSORT:Definite Assignment Analysis}}
[[Category:Data-flow analysis]]
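The maximal fixed point discussed above for the ''goto'' example can be computed by a simple iteration. The following is a minimal sketch, not the algorithm of any particular compiler or specification. It assumes a control-flow-graph formulation (each statement lists its predecessors) rather than the syntax-directed equations used in this article, so a jump simply passes its ''before'' set on to its target. The names <code>VarSet</code>, <code>Stmt</code> and <code>definite_assignment</code> are hypothetical and chosen for illustration only.

```c
#include <assert.h>

/* One bit per variable; bit k set means variable k is definitely assigned. */
typedef unsigned int VarSet;

enum { MAX_STMTS = 16, MAX_PREDS = 4 };

/* Hypothetical statement record (an assumption of this sketch): the variables
   the statement assigns and the statements from which control can reach it. */
typedef struct {
    VarSet assigns;        /* variables written by this statement */
    int preds[MAX_PREDS];  /* indices of predecessor statements */
    int npreds;            /* 0 marks the entry statement */
} Stmt;

/* Computes the maximal fixed point of
     before(s) = intersection of after(p) over all predecessors p of s
     after(s)  = before(s) union assigns(s)
   Every set except the entry's before set starts at all_vars and can only
   shrink from one pass to the next, so the loop terminates. */
void definite_assignment(const Stmt *stmts, int n, VarSet all_vars,
                         VarSet *before, VarSet *after)
{
    assert(n >= 0 && n <= MAX_STMTS);

    /* Optimistic start: assume everything is assigned, except at entry. */
    for (int s = 0; s < n; s++) {
        before[s] = (stmts[s].npreds == 0) ? 0u : all_vars;
        after[s] = all_vars;
    }

    int changed = 1;
    while (changed) {
        changed = 0;
        for (int s = 0; s < n; s++) {
            /* Nothing is assigned on entry; otherwise intersect all inputs. */
            VarSet in = (stmts[s].npreds == 0) ? 0u : all_vars;
            for (int k = 0; k < stmts[s].npreds; k++)
                in &= after[stmts[s].preds[k]];

            VarSet out = in | stmts[s].assigns;
            if (in != before[s] || out != after[s]) {
                before[s] = in;
                after[s] = out;
                changed = 1;
            }
        }
    }
}
```

Encoding the three-statement example as statement 0 (<code>int i = 1;</code>, assigning bit 0, no predecessors), statement 1 (the label, with predecessors 0 and 2) and statement 2 (the ''goto'', with predecessor 1) should yield {i} as the ''before'' set of the label. Starting every set at the empty set instead would converge to the other, less informative fixed point.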