{{Short description|Data processing chain}}
{{Multiple issues|
{{More citations needed|date=September 2019}}
{{Lead too short|date=July 2024}}
}}
In [[computing]], a '''pipeline''', also known as a '''data pipeline''',<ref>[https://www.dativa.com/data-pipelines/ Data Pipeline Development] Published by Dativa, retrieved 24 May 2018</ref> is a set of [[data]] processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion. Some amount of [[buffer (computer science)|buffer storage]] is often inserted between elements.
Computer-related pipelines include:
* [[Instruction pipeline]]s, such as the [[classic RISC pipeline]], which are used in [[central processing unit]]s (CPUs) and other [[Microprocessor|microprocessors]] to allow overlapping execution of multiple instructions with the same [[digital electronics|circuitry]]. The circuitry is usually divided up into stages and each stage processes a specific part of one instruction at a time, passing the partial results to the next stage. Examples of stages are instruction decode, arithmetic/logic and register fetch. They are related to the technologies of [[superscalar execution]], [[operand forwarding]], [[speculative execution]] and [[out-of-order execution]].
* [[Graphics pipeline]]s, found in most [[graphics processing unit]]s (GPUs), which consist of multiple [[arithmetic and logical unit|arithmetic unit]]s, or complete [[central processing unit|CPU]]s, that implement the various stages of common rendering operations ([[perspective projection]], window [[clipping (computer graphics)|clipping]], [[color]] and [[light]] calculation, rendering, etc.).
* [[pipeline (software)|Software pipeline]]s, which consist of a sequence of computing [[process (computing)|processes]] (commands, program runs, tasks, threads, procedures, etc.), conceptually executed in parallel, with the output stream of one process being automatically fed as the input stream of the next one. The [[Unix]] system call [[pipeline (Unix)|pipe]] is a classic example of this concept.
* [[HTTP pipelining]], the technique of issuing multiple [[HTTP]] requests through the same [[TCP connection]], without waiting for the previous one to finish before issuing a new one.
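The Unix-pipe form of a software pipeline can be illustrated in Python, which can connect the standard output of one process to the standard input of the next (a minimal sketch, assuming a POSIX environment where <code>printf</code> and <code>sort</code> are available):

```python
import subprocess

# Emulate the shell pipeline:  printf 'b\na\n' | sort
p1 = subprocess.Popen(["printf", "b\na\n"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["sort"], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()          # allow p1 to receive SIGPIPE if p2 exits early
out, _ = p2.communicate()  # out == b"a\nb\n"
```

Both processes run concurrently; the operating-system pipe buffers the bytes flowing between them.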
== Concept and motivation ==
As this example shows, pipelining does not decrease the [[latency (engineering)|latency]], that is, the total time for one item to go through the whole system. It does however increase the system's [[throughput]], that is, the rate at which new items are processed after the first one.
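The latency/throughput distinction can be checked with a small calculation. In this hypothetical sketch, a 3-stage pipeline with 1-second stages processes 100 items: one item still takes 3 seconds end to end, but a new item completes every second once the pipeline is full.

```python
stages = 3        # number of pipeline stages (hypothetical)
stage_time = 1.0  # seconds per stage
items = 100

latency = stages * stage_time                  # time for one item: 3.0 s
serial_total = items * stages * stage_time     # without pipelining: 300.0 s
pipelined_total = (stages + items - 1) * stage_time  # fill + drain: 102.0 s
throughput = 1 / stage_time                    # steady state: 1 item/s
```

Pipelining leaves the latency at 3.0 s but cuts the total time for the batch from 300 s to 102 s.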
== Design considerations ==
=== Buffering ===
Under ideal circumstances, if all processing elements are synchronized and take the same amount of time to process, then each item can be received by each element just as it is released by the previous one, in a single [[clock signal|clock]] cycle.
More generally, buffering between the pipeline stages is necessary when the processing times are irregular, or when items may be created or destroyed along the pipeline. For example, in a graphics pipeline that processes triangles to be rendered on the screen, an element that checks the visibility of each triangle may discard the triangle if it is invisible, or may output two or more triangular pieces of the element if they are partly hidden. Buffering is also needed to accommodate irregularities in the rates at which the application feeds items to the first stage and consumes the output of the last one.
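A buffered stage of this kind can be sketched with a thread and bounded queues. The drop/duplicate rules below are hypothetical stand-ins for a visibility check that discards an invisible triangle or splits a partly hidden one into several pieces:

```python
import queue
import threading

buf_in, buf_out = queue.Queue(maxsize=4), queue.Queue()

def stage(get_q, put_q):
    """A pipeline stage that may destroy or multiply items."""
    while True:
        x = get_q.get()
        if x is None:          # sentinel: end of stream
            put_q.put(None)
            return
        if x < 0:
            continue           # item destroyed inside the pipeline
        put_q.put(x)
        if x % 2 == 0:
            put_q.put(x)       # item split into two outputs

t = threading.Thread(target=stage, args=(buf_in, buf_out))
t.start()
for v in [1, -2, 3, 4, None]:
    buf_in.put(v)
results = []
while (v := buf_out.get()) is not None:
    results.append(v)
t.join()
print(results)  # [1, 3, 4, 4]
```

The queues absorb the mismatch between input and output rates: one input item may yield zero, one, or two outputs without stalling the neighbouring stages.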
== Typical software implementations ==
To be effectively implemented, data pipelines need a CPU [[scheduling]] strategy to dispatch work to the available CPU cores, and [[data structures]] on which the pipeline stages operate. For example, [[UNIX]] derivatives may pipeline commands, connecting the standard I/O of various processes using the pipes implemented by the operating system. Some [[operating systems]]{{Such as?|date=July 2020}} may provide [[Unix-like|UNIX-like]] syntax to string several program runs in a pipeline, but implement the latter as simple serial execution, rather than true pipelining—namely, by waiting for each program to finish before starting the next one.{{Citation needed|date=July 2020}}
Lower-level approaches may rely on the threads provided by the operating system to schedule work on the stages: both [[thread pool]]-based and one-thread-per-stage implementations are viable, and exist.<ref>{{cite web | url=https://github.com/dteod/mtdp.git/ | title=MTDP | website=[[GitHub]] | date=September 2022 }}</ref>
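A thread-pool variant can be sketched as follows. The stage functions (<code>parse</code>, <code>scale</code>, <code>render</code>) are hypothetical; the pool dispatches each stage's work across the available workers:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pipeline stages
def parse(s):
    return int(s)

def scale(x):
    return x * 10

def render(x):
    return f"<{x}>"

stages = [parse, scale, render]
data = ["1", "2", "3"]

with ThreadPoolExecutor(max_workers=4) as pool:
    for fn in stages:
        # the pool's workers process this stage's items in parallel
        data = list(pool.map(fn, data))

print(data)  # ['<10>', '<20>', '<30>']
```

This simple arrangement parallelizes within each stage; a fuller implementation would also overlap different stages on different items.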
Other strategies relying on [[cooperative multitasking]] exist that do not need multiple threads of execution, and hence additional CPU cores, such as using a round-robin scheduler with a coroutine-based framework. In this context, each stage may be instantiated with its own coroutine, yielding control back to the scheduler after finishing its task for the round. This approach may need careful control over the stages to prevent them from abusing their time slice.
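The coroutine approach can be sketched with Python generators, which yield control cooperatively. The scheduler below is a bare round-robin loop; the stage names and mailboxes are illustrative only:

```python
def stage(name, inbox, outbox):
    """A stage as a coroutine: process one item, then yield to the scheduler."""
    while True:
        if inbox:
            outbox.append(f"{name}({inbox.pop(0)})")
        yield  # hand control back after each time slice

a, b, out = ["x", "y"], [], []
coros = [stage("A", a, b), stage("B", b, out)]

# Round-robin scheduler: each coroutine gets one time slice per round
for _ in range(4):
    for c in coros:
        next(c)

print(out)  # ['B(A(x))', 'B(A(y))']
```

All stages run on a single thread; if a stage did unbounded work before yielding, it would starve the others, which is why cooperative schemes need each stage to bound its per-slice work.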
== Bibliography ==
* {{cite book| last1=Perez Garcia |first1=Pablo |title=Pipeline DSL a dsl to create a CI/CD pipeline for your projects| isbn=978-0-134-69147-3 |year=2018 |publisher=Addison-Wesley | url=https://github.com/politrons/Pipeline_DSL/ }}
* For a standard discussion on pipelining in parallel computing see {{cite book |title=Parallel Programming in C with MPI and OpenMP |first=Michael J. |last=Quinn |publisher=McGraw-Hill Professional |year=2004 |isbn=0072822562 |___location=Dubuque, Iowa |url-access=registration |url=https://archive.org/details/parallelprogramm0000quin }}
* {{cite web |url=https://www.datapipelines.com/blog/what-is-a-data-pipeline/ |title=What is a Data Pipeline? |last=Pogonyi |first=Roland |date=February 2021 |access-date=March 11, 2021}}