Live-variable analysis: Difference between revisions

Content deleted Content added
m I feel in state and out state are interchanged in the table. Can you check it once
improve typesetting of constants and variables
 
(40 intermediate revisions by 30 users not shown)
Line 1:
In [[compiler theorycompilers]], '''live variable analysis''' (or simply '''liveness analysis''') is a classic [[data-flow analysis]] performed by [[compiler]]s to calculate for each program point the [[Variable (programming)|variables]] that mayare be''live'' potentiallyat readeach beforepoint theirin next write,the thatprogram. is,A thevariable variables that areis ''live'' at some point if it holds a value that may be needed in the exitfuture, fromor eachequivalently programif pointits value may be read before the next time the variable is written to.
 
== Example ==
Stated simply: a variable is '''live''' if it holds a value that may be needed in the future.
Consider the following program:
 
L1: b := 3;
It is a "backwards way" analysis. The analysis is done in a backwards order, and the dataflow [[confluence operator]] is [[set union]].
L2: c := 5;
L3: a := f(b +* c);
 
The set of live variables atbetween linelines 2 and L33 is {<code>b</code>, <code>c</code>} because both are used in the addition,multiplication andon therebyline 3. But the callset toof live variables after line 1 is only {<code>fb</code>}, andsince assignment tovariable <code>ac</code>. is Butupdated thelater, seton ofline live2. variablesThe atvalue lineof L1variable <code>a</code> is not used in this code.
<table>
<tr>
<td>
The set of live variables at line L3 is {<code>b</code>, <code>c</code>} because both are used in the addition, and thereby the call to <code>f</code> and assignment to <code>a</code>. But the set of live variables at line L1 is
only {<code>b</code>} since variable <code>c</code> is updated in L2. The value of variable <code>a</code> is never used. Note that <code>f</code> may be stateful, so the never-live assignment to <code>a</code> can be eliminated, but there is insufficient information to rule on the entirety of <code>L3</code>.
</td>
 
Note that the assignment to <code>a</code> may be eliminated as <code>a</code> is not used later, but there is insufficient information to justify removing all of line 3 as <code>f</code> may have [[Side effect (computer science)|side effects]] (printing <code>b * c</code>, perhaps).
<td>
<code>
L1: b := 3;
L2: c := 5;
L3: a := f(b + c);
goto L1;
</code>
</td>
</tr>
</table>
 
== Expression in terms of dataflow equations ==
The dataflow equations used for a given basic block ''s'' and exiting block ''f'' in live variable analysis are the following:
 
Liveness analysis is a "backwards may" analysis. The analysis is done in a [[Data-flow_analysis#Backward_Analysis|backwards]] order, and the dataflow [[confluence operator]] is [[set union]]. In other words, if applying liveness analysis to a function with a particular number of logical branches within it, the analysis is performed starting from the end of the function working towards the beginning (hence "backwards"), and a variable is considered live if any of the branches moving forward within the function might potentially (hence "may") need the variable's current value. This is in contrast to a "backwards must" analysis which would instead enforce this condition on all branches moving forward.
 
The dataflow equations used for a given basic block ''<math>s''</math> and exiting block ''f''<math>\mathit{final}</math> in live variable analysis are the following:
 
:<math>
{\mbox{GEN}}[s] </math>: The set of variables that are used in s before any assignment in the same basic block.
 
:<math>
{\mbox{KILL}}[s] </math>: The set of variables that are assigned a value in s (in many books that discuss compiler design, KILL (s) is also defined as the set of variables assigned a value in s ''before any use'', but this doesn'tdoes not change the solution of the dataflow equation):
 
 
:<math>
{\mbox{LIVE}}_{\mathrm{in}}[s] = {\mbox{GEN}}[s] \cup ({\mbox{LIVE}}_{\mathrm{out}}[s] - {\mbox{KILL}}[s])
</math>
:<math>
{\mbox{LIVE}}_{\mathrm{out}}[\mathit{final}] = {\emptyset}
</math>
:<math>
{\mbox{LIVE}}_{\mathrm{out}}[s] = \bigcup_{p \in \mathrm{succ}[Ss]} {\mbox{LIVE}}_{\mathrm{in}}[p]
</math>
 
Line 51 ⟶ 44:
The in-state of a block is the set of variables that are live at the start of the block. Its out-state is the set of variables that are live at the end of it. The out-state is the union of the in-states of the block's successors. The transfer function of a statement is applied by making the variables that are written dead, then making the variables that are read live.
 
== Second example ==
<table>
{|
<tr><td>
|
// in: {}
// in: {}; predecessor blocks: none
b1: a = 3;
b = 5;
Line 61 ⟶ 55:
// out: {a,b,d} //union of all (in) successors of b1 => b2: {a,b}, and b3:{b,d}
// in: {a,b}; predecessor blocks: b1
b2: c = a + b;
d = 2;
// out: {b,d}
// in: {b,d}; predecessor blocks: b1 and b2
b3: endif
c = 4;
return b * d + c;
// out:{}
|}
</td>
</tr>
</table>
 
The in-state of b3 only contains ''b'' and ''d'', since ''c'' has been written. The out-state of b1 is the union of the in-states of b2 and b3. The definition of ''c'' in b2 can be removed, since ''c'' is not live immediately after the statement.
 
Solving the data flow equations starts with initializing all in-states and out-states to the empty set. The work list is initialized by inserting the exit point (b3) in the work list (typical for backward flow). Its computed outin-state differs from the previous one, so its predecessors b1 and b2 are inserted and the process continues. The progress is summarized in the table below.
 
{| class="wikitable"
Line 115 ⟶ 107:
 
Initializing with the empty set is an optimistic initialization: all variables start out as dead. Note that the out-states cannot shrink from one iteration to the next, although the out-state can be smaller than the in-state. This can be seen from the fact that after the first iteration the out-state can only change by a change of the in-state. Since the in-state starts as the empty set, it can only grow in further iterations.
 
<!-- Is this paragraph relevant? -->
Recently {{As of|2006|lc=on}}, various program analyses such as live variable analysis have been solved using [[Datalog]]. The Datalog specifications for such analyses are generally an order of magnitude shorter than their imperative counterparts (e.g. [[iterative analysis]]), and are at least as efficient.<ref>{{cite book | author=Whaley | title=Using Datalog with Binary Decision Diagrams for Program Analysis | year=2004 | display-authors=1}}</ref>
 
==References==
{{reflist|30em}}
<references/>
{{cite book|last1=Aho|first1=Alfred|last2=Lam|first2=Monica|last3=Sethi|first3=Ravi|last4=Ullman|first4=Jeffrey|year=2007|edition=2|title=Compilers: Principles, Techniques, and Tools|page=608}}
 
{{Compiler optimizations}}
[[Category:Compiler optimizations]]
[[Category:Data-flow analysis]]
[[Category:Static program analysis]]