Content deleted Content added
Citation bot (talk | contribs) Removed parameters. | Use this bot. Report bugs. | #UCB_CommandLine |
For clarity, added reminder that the RAW dependency cannot be removed. |
||
(19 intermediate revisions by 13 users not shown) | |||
Line 1:
{{Short description|Programming situation where an instruction refers to a prior instruction's data}}
{{More citations needed|date=September 2024}}
A '''data dependency''' in [[computer science]] is a situation in which a [[program statement]] (instruction) refers to the data of a preceding statement. In [[compiler theory]], the technique used to discover data dependencies among statements (or instructions) is called [[dependence analysis]].
== Description ==
Assuming statement <math>S_1</math> and <math>S_2</math>, <math>S_2</math> depends on <math>S_1</math> if:
Line 15:
* there is a feasible run-time execution path from <math>S_1</math> to {{nobr|1=<math>S_2</math>.}}
These conditions are called Bernstein's Conditions, named after Arthur J. Bernstein.<ref>{{cite journal|last=Bernstein|first=Arthur J.|title=Analysis of Programs for Parallel Processing|journal=IEEE Transactions on Electronic Computers|date=1 October 1966|volume=EC-15|issue=5|pages=757–763|doi=10.1109/PGEC.1966.264565}}</ref>
Three cases exist:
Line 23:
* Output dependence: <math>O(S_1) \cap O(S_2) \neq \varnothing</math>, <math>S_1 \rightarrow S_2</math> and both write the same memory ___location.
== Types ==
=== Data hazards ===
Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline. Ignoring potential data hazards can result in [[Race condition|race conditions]] (also termed race hazards). There are three situations in which a data hazard can occur:
# read after write (RAW), a ''true dependency''
# write after read (WAR), an ''anti-dependency''
# write after write (WAW), an ''output dependency''
# read after read (RAR), a false dependency
Read after read (RAR) is not a hazard case.
Consider two instructions {{Mono|i1}} and {{Mono|i2}}, with {{Mono|i1}} occurring before {{Mono|i2}} in program order.
==== Read after write (RAW) ====
({{Mono|i2}} tries to read a source before {{Mono|i1}} writes to it) A read after write (RAW) data hazard refers to a situation where an instruction refers to a result that has not yet been calculated or retrieved. This can occur because even though an instruction is executed after a prior instruction, the prior instruction has been processed only partly through the pipeline.
===== Example =====
For example:
i1. '''R2''' <- R5 + R8
i2. R4 <- '''R2''' + R8
The first instruction is calculating a value to be saved in register {{Mono|R2}}, and the second is going to use this value to compute a result for register {{Mono|R4}}. However, in a [[Pipeline (computing)|pipeline]], when operands are fetched for the 2nd operation, the results from the first have not yet been saved, and hence a data dependency occurs.
A data dependency occurs with instruction {{Mono|i2}}, as it is dependent on the completion of instruction {{Mono|i1}}.
==== Write after read (WAR) ====
({{Mono|i2}} tries to write a destination before it is read by {{Mono|i1}}) A write after read (WAR) data hazard represents a problem with concurrent execution.
===== Example =====
For example:
i1. R4 <- R1 + '''R5'''
i2. '''R5''' <- R1 + R2
In any situation with a chance that {{Mono|i2}} may finish before {{Mono|i1}} (i.e., with concurrent execution), it must be ensured that the result of register {{Mono|R5}} is not stored before {{Mono|i1}} has had a chance to fetch the operands.
==== Write after write (WAW) ====
({{Mono|i2}} tries to write an operand before it is written by {{Mono|i1}}) A write after write (WAW) data hazard may occur in a [[Concurrent computing|concurrent execution]] environment.
===== Example =====
For example:
i1. '''R5''' <- R4 + R7
i2. '''R5''' <- R1 + R3
The write back (WB) of {{Mono|i2}} must be delayed until {{Mono|i1}} finishes executing.
=== True dependency (read-after-write) ===
A true dependency, also known as a '''flow''' '''dependency''' or '''data dependency''', occurs when an instruction depends on the result of a previous instruction. A violation of a true dependency leads to a '''read-after-write (RAW)''' hazard.
1. A = 3
Line 31 ⟶ 74:
3. C = B
Instruction 3 is truly dependent on instruction 2, as the final value of C depends on the instruction updating B. Instruction 2 is truly dependent on instruction 1, as the final value of B depends on the instruction updating A. Since instruction 3 is truly dependent upon instruction 2 and instruction 2 is truly dependent on instruction 1, instruction 3 is also truly dependent on instruction 1. [[Instruction level parallelism]] is therefore not an option in this example.<ref name="architecture">{{cite book | author=[[John L. Hennessy]]; [[David A. Patterson (scientist)|David A. Patterson]] | title=Computer Architecture: a quantitative approach (3rd ed.) | publisher=[[Morgan Kaufmann]] | year=2003 | isbn=1-55860-724-2}}</ref>
=== Anti-dependency (write-after-read) ===
An anti-dependency
In the following example, instruction 2 anti-depends on instruction 3 — the ordering of these instructions cannot be changed, nor can they be executed in parallel (possibly changing the instruction ordering), as this would affect the final value of A. 1. B = 3
Line 42 ⟶ 86:
3. B = 7
Example
MUL R3,R1,R2
ADD R2,R5,R6
It is clear that there is anti-dependence between these 2 instructions. At first we read R2 then in second instruction we are Writing a new value for it.
An anti-dependency is an example of a ''name dependency''. That is, renaming of variables could remove the dependency, as in the next example:
Line 56 ⟶ 99:
3. B = 7
A new variable, B2, has been declared as a copy of B in a new instruction, instruction N.
Note that there is still a read-after-write dependency: instruction 2 is truly dependent on instruction N, which is truly dependent upon instruction 1. This dependency existed in the original version, where instruction 2 was truly dependent on instruction 1. This dependency cannot be safely removed.<ref name="architecture"/>
=== Output dependency (write-after-write) ===
An output dependency occurs when the ordering of instructions will affect the final output value of a variable. A violation of an output dependency leads to an '''write-after-write (WAW)''' hazard.
In the example below, there is an output dependency between instructions 3 and 1 — changing the ordering of instructions in this example will change the final value of A, thus these instructions cannot be executed in parallel.
1. B = 3
Line 67 ⟶ 113:
3. B = 7
As with anti-dependencies, output dependencies are ''name dependencies''.
1. B2 = 3
Line 73 ⟶ 119:
3. B = 7
==Implications==
Conventional programs are written assuming the [[sequential execution model]]. Under this model, instructions execute one after the other, atomically (i.e., at any given point in time, only one instruction is executed) and in the order specified by the program.
However, dependencies among statements or instructions may hinder parallelism — parallel execution of multiple instructions, either by a parallelizing compiler or by a processor exploiting [[instruction-level parallelism]]. Recklessly executing multiple instructions without considering related dependences may cause danger of getting wrong results, namely [[Hazard (computer architecture)|hazards]].
==Relevance in computing==
Data dependencies are relevant in various areas of computing, particularly in processor design, compiler construction, parallel computing, and concurrent programming.
=== Processor design ===
* [[Instruction pipelining]]: In pipelined processors, multiple instruction are executed in parallel in multiple pipeline stages. Thereby, data dependencies between [[Processor register|registers]] must be respected and handled in the processor pipeline. Most relevant are true dependencies that are resolved e.g. by [[Pipeline stall|stalling the pipeline]] or [[operand forwarding]].
* [[Out-of-order execution]]: Modern processors often execute instructions out of their original order to improve performance. Thereby, name dependencies between registers must be respected (in addition to data dependencies) and are resolved e.g. by [[register renaming]] or [[scoreboarding]]. Data dependencies are also relevant for memory accesses and must be respected by [[memory disambiguation]] techniques that execute [[Primary storage|memory]] access [[Instruction (computer science)|instructions]] (loads and stores) out of program order.
=== Compiler construction ===
Data dependencies are relevant for various [[compiler optimizations]], e.g.
* [[Instruction scheduling]]: Compilers must schedule instructions in a way that respects data dependencies. This is crucial in optimizing compilers that rearrange code for better performance.
* [[Loop transformation|Loop transformations]]: In optimizing loops, compilers need to consider data dependencies to apply transformations like loop unrolling, fusion, or tiling without changing the semantics of the program.
* [[Code motion]]: When a compiler considers moving a piece of code, it must ensure that data dependencies are not violated.
==See also==
* [[Dependency analysis]]
* [[Control dependency]]
* [[Loop_dependence_analysis#Data_dependence|Loop dependence analysis § Data dependence]]
* [[Hazard (computer architecture)#Data hazards|Hazard (computer architecture) § Data hazards]]
==References==
|