{{Short description|Algorithm in a thread whose failure cannot cause another thread to fail}}
In [[computer science]], an [[algorithm]] is called '''non-blocking''' if failure or [[Scheduling (computing)|suspension]] of any [[thread (computing)|thread]] cannot cause failure or suspension of another thread;<ref>{{cite book|last1=Goetz|first1=Brian|last2=Peierls|first2=Tim|last3=Bloch|first3=Joshua|last4=Bowbeer|first4=Joseph|last5=Holmes|first5=David|last6=Lea|first6=Doug|title=Java concurrency in practice|date=2006|publisher=Addison-Wesley|___location=Upper Saddle River, NJ|isbn=9780321349606|page=[https://archive.org/details/javaconcurrencyi00goet/page/41 41]|url-access=registration|url=https://archive.org/details/javaconcurrencyi00goet/page/41}}</ref> for some operations, these algorithms provide a useful alternative to traditional [[lock (computer science)|blocking implementations]]. A non-blocking algorithm is '''lock-free''' if there is guaranteed system-wide [[Resource starvation|progress]], and '''wait-free''' if there is also guaranteed per-thread progress. "Non-blocking" was used as a synonym for "lock-free" in the literature until the introduction of obstruction-freedom in 2003.<ref name=obs-free>{{cite conference|last1=Herlihy|first1=M.|last2=Luchangco|first2=V.|last3=Moir|first3=M.|title=Obstruction-Free Synchronization: Double-Ended Queues as an Example|conference=23rd [[International Conference on Distributed Computing Systems]]|year=2003|pages=522|url=https://www.cs.brown.edu/people/mph/HerlihyLM03/main.pdf}}</ref>
The word "non-blocking" was traditionally used to describe [[telecommunications network]]s that could route a connection through a set of relays "without having to re-arrange existing calls"
== Motivation ==
{{Main|Lock (computer science)#Disadvantages|l1=Disadvantages of locks}}
The traditional approach to multi-threaded programming is to use [[Lock (computer science)|locks]] to synchronize access to shared [[System resource|resources]]. Synchronization primitives such as [[mutex]]es, [[semaphore (programming)|semaphores]], and [[critical section]]s are all mechanisms by which a programmer can ensure that certain sections of code do not execute concurrently if doing so would corrupt shared memory structures. If one thread attempts to acquire a lock that is already held by another thread, the thread will block until the lock is free.
Blocking a thread can be undesirable for many reasons. An obvious reason is that while the thread is blocked, it cannot accomplish anything: if the blocked thread had been performing a high-priority or [[real-time computing|real-time]] task, it would be highly undesirable to halt its progress.
Other problems are less obvious. For example, certain interactions between locks can lead to error conditions such as [[Deadlock (computer science)|deadlock]], [[livelock]], and [[priority inversion]]. Using locks also involves a trade-off between coarse-grained locking, which can significantly reduce opportunities for [[parallel computing|parallelism]], and fine-grained locking, which requires more careful design, increases locking overhead and is more prone to bugs.
Unlike blocking algorithms, non-blocking algorithms do not suffer from these downsides, and in addition are safe for use in [[interrupt handler]]s: even though the [[Pre-emptive multitasking|preempted]] thread cannot be resumed, progress is still possible without it. In contrast, global data structures protected by mutual exclusion cannot safely be accessed in an interrupt handler, as the preempted thread may be the one holding the lock.
A lock-free data structure can be used to improve performance: it increases the amount of time spent in parallel execution rather than serial execution, improving performance on a [[multi-core processor]], because access to the shared data structure does not need to be serialized to stay coherent.<ref>
Guillaume Marçais and Carl Kingsford.
[https://web.archive.org/web/20140518060917/http://bioinformatics.oxfordjournals.org/content/27/6/764.abstract "A fast, lock-free approach for efficient parallel counting of occurrences of k-mers"].
Bioinformatics (2011) 27(6): 764-770.
{{doi|10.1093/bioinformatics/btr011}}
</ref>
== Implementation ==
With few exceptions, non-blocking algorithms use [[Linearizability|atomic]] [[read-modify-write]] primitives that the hardware must provide, the most notable of which is [[Compare-and-swap|compare and swap (CAS)]]. [[Critical section]]s are almost always implemented using standard interfaces over these primitives (in the general case, critical sections will be blocking, even when implemented with these primitives).
Much research has also been done in providing basic [[data structure]]s such as [[stack (data structure)|stacks]], [[Queue (data structure)|queues]], [[Set (computer science)|sets]], and [[hash table]]s. These allow programs to easily exchange data between threads asynchronously.
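The following C++11 sketch illustrates the CAS retry loop at the heart of many such structures, here a push onto a Treiber-style lock-free stack; the names (<code>Node</code>, <code>push</code>) are illustrative rather than taken from any particular library:

<syntaxhighlight lang="cpp">
#include <atomic>

// Illustrative Treiber-style lock-free stack push built on the
// compare-and-swap primitive exposed by C++11 std::atomic.
struct Node {
    int value;
    Node* next;
};

std::atomic<Node*> head{nullptr};

void push(int value) {
    Node* node = new Node{value, head.load(std::memory_order_relaxed)};
    // On failure, compare_exchange_weak reloads the current head into
    // node->next, so the loop simply retries with fresh state.
    while (!head.compare_exchange_weak(node->next, node,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
        // another thread updated head first; retry
    }
}
</syntaxhighlight>

A matching lock-free <code>pop</code> is considerably harder to write correctly, since reusing popped nodes exposes the [[ABA problem]].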
Additionally, some non-blocking data structures are weak enough to be implemented without special atomic primitives. These exceptions include:
* a single-reader single-writer [[Circular buffer|ring buffer]] [[FIFO (computing and electronics)|FIFO]], with a size which evenly divides the overflow of one of the available unsigned integer types, can unconditionally be [[Producer–consumer problem#Without semaphores or monitors|implemented safely]] using only a [[memory barrier]] (a sketch appears after this list)
* [[Read-copy-update]] with a single writer and any number of readers. (The readers are wait-free; the writer is usually lock-free, until it needs to reclaim memory).
* [[Read-copy-update]] with multiple writers and any number of readers. (The readers are wait-free; multiple writers generally synchronize with a lock and are not obstruction-free).
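As an illustration of the first exception, the following C++11 sketch shows a single-reader single-writer FIFO in which the capacity is a power of two, so that wrap-around of the unsigned indices divides evenly; the names and the fixed capacity are assumptions made for this example:

<syntaxhighlight lang="cpp">
#include <atomic>
#include <cstddef>
#include <cstdint>

constexpr std::size_t SIZE = 1024;   // must evenly divide 2^32

int buffer[SIZE];
std::atomic<std::uint32_t> writeIdx{0};
std::atomic<std::uint32_t> readIdx{0};

// Called only by the single producer thread.
bool try_push(int v) {
    std::uint32_t w = writeIdx.load(std::memory_order_relaxed);
    if (w - readIdx.load(std::memory_order_acquire) == SIZE)
        return false;                               // queue is full
    buffer[w % SIZE] = v;
    // The release store acts as the barrier that publishes the element.
    writeIdx.store(w + 1, std::memory_order_release);
    return true;
}

// Called only by the single consumer thread.
bool try_pop(int& v) {
    std::uint32_t r = readIdx.load(std::memory_order_relaxed);
    if (writeIdx.load(std::memory_order_acquire) == r)
        return false;                               // queue is empty
    v = buffer[r % SIZE];
    readIdx.store(r + 1, std::memory_order_release);
    return true;
}
</syntaxhighlight>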
Several libraries internally use lock-free techniques,<ref>
[http://concurrencykit.org Concurrency Kit] - A C library for non-blocking system design and implementation
</ref> but it is difficult to write lock-free code that is correct.<ref name="A_FALSE_SENSE_OF_SECURITY">Herb Sutter. {{cite web | url=http://www.drdobbs.com/article/print?articleId=210600279&siteSectionName=cpp | title=Lock-Free Code: A False Sense of Security | archive-url=https://web.archive.org/web/20150901211737/http://www.drdobbs.com/article/print?articleId=210600279&siteSectionName=cpp | archive-date=2015-09-01 |url-status=dead}}</ref><ref name="A_CORRECTED_QUEUE">Herb Sutter. {{cite web | archive-url=https://web.archive.org/web/20081205072023/http://www.ddj.com/cpp/210604448 | title=Writing Lock-Free Code: A Corrected Queue | archive-date=2008-12-05 | url-status=dead | url=http://www.ddj.com/cpp/210604448 }}</ref><ref>
</ref><ref>▼
Herb Sutter. [http://www.ddj.com/cpp/211601363 "Writing a Generalized Concurrent Queue"].
</ref><ref>
Herb Sutter. [http://www.ddj.com/cpp/184401930 "The Trouble With Locks"].
</ref>
Non-blocking algorithms generally involve a series of read, read-modify-write, and write instructions in a carefully designed order.
Optimizing compilers can aggressively re-arrange operations.
Even when the compiler does not, many modern CPUs re-arrange such operations themselves (they have a "weak [[consistency model]]"),
unless a [[memory barrier]] is used to tell the CPU not to reorder.
[[C++11]] programmers can use <code>std::atomic</code> in <code><atomic></code>,
and [[C11 (C standard revision)|C11]] programmers can use <code><stdatomic.h></code>,
both of which supply types and functions that tell the [[compiler]] not to re-arrange such instructions, and to insert the appropriate memory barriers.<ref>
Bruce Dawson.
[https://randomascii.wordpress.com/2020/11/29/arm-and-lock-free-programming/ "ARM and Lock-Free Programming"].
</ref>
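For instance, in the following C++11 sketch (an illustrative example, with hypothetical <code>producer</code>/<code>consumer</code> names), the release/acquire pair on the atomic flag inserts exactly the barriers needed so that the consumer cannot observe the flag without also observing the data written before it:

<syntaxhighlight lang="cpp">
#include <atomic>

int data = 0;                      // plain, non-atomic payload
std::atomic<bool> ready{false};

void producer() {
    data = 42;                     // may not be reordered past the store below
    ready.store(true, std::memory_order_release);
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        // spin until the flag is published
    }
    // The acquire load synchronizes with the release store,
    // so data is guaranteed to be 42 here.
}
</syntaxhighlight>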
== Wait-freedom ==
Wait-freedom is the strongest non-blocking guarantee of progress, combining guaranteed system-wide throughput with [[Resource starvation|starvation]]-freedom. An algorithm is wait-free if every operation has a bound on the number of steps the algorithm will take before the operation completes.
This property is critical for real-time systems and is desirable whenever the performance cost is not too high.
It was shown in the 1980s<ref name=imp>{{cite conference |last=Herlihy |first=Maurice P. |conference=Proc. 7th Annual ACM Symp. on Principles of Distributed Computing |isbn=0-89791-277-2 |pages=276–290 |doi=10.1145/62546.62593 |title=Impossibility and universality results for wait-free synchronization |year=1988|doi-access=free }}</ref> that all algorithms can be implemented wait-free, and many transformations from serial code, called ''universal constructions'', have been demonstrated. However, the resulting performance does not in general match even naïve blocking designs. Several papers have since improved the performance of universal constructions, but still, their performance is far below blocking designs.
Several papers have investigated the difficulty of creating wait-free algorithms. For example, it has been shown<ref name=cond-sync>{{cite conference |last1=Fich |first1=Faith |last2=Hendler |first2=Danny |last3=Shavit |first3=Nir |conference=Proc. 23rd Annual ACM Symp. on Principles of Distributed Computing (PODC) |year=2004 |title=On the inherent weakness of conditional synchronization primitives}}</ref> that the widely available atomic ''conditional'' primitives, [[Compare-and-swap|CAS]] and [[Load-link/store-conditional|LL/SC]], cannot provide starvation-free implementations of many common data structures without memory costs growing linearly in the number of threads.
Wait-free algorithms were rare until 2011, both in research and in practice. However, in 2011 Kogan and [[Erez Petrank|Petrank]]<ref name=wf-queue>{{cite conference |last1=Kogan |first1=Alex |last2=Petrank |first2=Erez |conference=Proc. 16th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPOPP) |year=2011 |isbn=978-1-4503-0119-0 |pages=223–234 |doi=10.1145/1941553.1941585 |title=Wait-free queues with multiple enqueuers and dequeuers|url=http://www.cs.technion.ac.il/~erez/Papers/wfquque-ppopp.pdf}}</ref> presented a wait-free queue building on the [[Compare-and-swap|CAS]] primitive, generally available on common hardware. Their construction expanded the lock-free queue of Michael and Scott,<ref name=lf-queue>{{cite conference |last1=Michael |first1=Maged |last2=Scott |first2=Michael |conference=Proc. 15th Annual ACM Symp. on Principles of Distributed Computing (PODC) |year=1996 |isbn=0-89791-800-2 |pages=267–275 |doi=10.1145/248052.248106 |title=Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms|doi-access=free }}</ref> which is an efficient queue often used in practice. A follow-up paper by Kogan and Petrank provided a method for making wait-free algorithms fast, and used it to make the wait-free queue practically as fast as its lock-free counterpart.
Under reasonable assumptions, Alistarh, Censor-Hillel, and Shavit showed that lock-free algorithms are practically wait-free.<ref name=lf-wf>{{cite conference |last1=Alistarh |first1=Dan |last2=Censor-Hillel |first2=Keren |last3=Shavit |first3=Nir |conference=Proc. 46th Annual ACM Symposium on Theory of Computing (STOC’14) | year=2014 | isbn=978-1-4503-2710-7 | pages = 714–723 | doi=10.1145/2591796.2591836 | title=Are Lock-Free Concurrent Algorithms Practically Wait-Free?|arxiv=1311.3200 }}</ref> Thus, in the absence of hard deadlines, wait-free algorithms may not be worth the additional complexity that they introduce.
== Lock-freedom ==
Lock-freedom allows individual threads to starve but guarantees system-wide throughput. An algorithm is lock-free if, when the program threads are run for a sufficiently long time, at least one of the threads makes progress (for some sensible definition of progress).
All wait-free algorithms are lock-free.
In particular, if one thread is suspended, then a lock-free algorithm guarantees that the remaining threads can still make progress. Hence, if two threads can contend for the same mutex lock or [[spinlock]], then the algorithm is ''not'' lock-free. (If we suspend one thread that holds the lock, then the second thread will block.)
An algorithm is lock-free if, infinitely often, operations by some processors will succeed in a finite number of steps. For instance, if {{var|N}} processors are trying to execute an operation, some of the {{var|N}} processes will succeed in finishing the operation in a finite number of steps, while others might fail and retry on failure. The difference between wait-free and lock-free is that wait-free operation by each process is guaranteed to succeed in a finite number of steps, regardless of the other processors.
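The difference can be sketched with a shared counter in C++11 (an illustrative example; whether <code>fetch_add</code> is truly wait-free depends on the hardware):

<syntaxhighlight lang="cpp">
#include <atomic>

std::atomic<int> counter{0};

// Lock-free: the loop may retry under contention, but a genuine CAS
// failure means another thread's update succeeded, so system-wide
// progress is preserved. (The weak form may also fail spuriously,
// which is another reason for the loop.)
void increment_lock_free() {
    int old = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(old, old + 1,
                                          std::memory_order_relaxed)) {
        // `old` was refreshed with the current value; retry
    }
}

// Wait-free on hardware with a native atomic add instruction: the call
// completes in a bounded number of its own steps, regardless of other
// threads. (On LL/SC architectures it compiles to a retry loop and is
// then only lock-free.)
void increment_wait_free() {
    counter.fetch_add(1, std::memory_order_relaxed);
}
</syntaxhighlight>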
In general, a lock-free algorithm can run in four phases: completing one's own operation, assisting an obstructing operation, aborting an obstructing operation, and waiting. Completing one's own operation is complicated by the possibility of concurrent assistance and abortion, but is invariably the fastest path to completion.
== Obstruction-freedom ==
Obstruction-freedom is the weakest natural non-blocking progress guarantee. An algorithm is obstruction-free if, at any point, a single thread executed in isolation (i.e., with all obstructing threads suspended) for a bounded number of steps will complete its operation. All lock-free algorithms are obstruction-free.
Some obstruction-free algorithms use a pair of "consistency markers" in the data structure. Processes reading the data structure first read one consistency marker, then read the relevant data into an internal buffer, then read the other marker, and then compare the markers. The data is consistent if the two markers are identical. Markers may be non-identical when the read is interrupted by another process updating the data structure. In such a case, the process discards the data in the internal buffer and tries again.
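A simplified C++11 sketch of this pattern (essentially a [[seqlock]]) is shown below; the single-writer assumption, the names, and the use of atomic payload fields are choices made for this example, and a production implementation requires care with exactly these fences:

<syntaxhighlight lang="cpp">
#include <atomic>

// The sequence number serves as the pair of consistency markers: it is
// odd while a write is in progress and changes across every write.
std::atomic<unsigned> seq{0};
std::atomic<int> x{0}, y{0};       // the shared data

void write(int nx, int ny) {       // single writer assumed
    unsigned s = seq.load(std::memory_order_relaxed);
    seq.store(s + 1, std::memory_order_relaxed);     // first marker: odd
    std::atomic_thread_fence(std::memory_order_release);
    x.store(nx, std::memory_order_relaxed);
    y.store(ny, std::memory_order_relaxed);
    seq.store(s + 2, std::memory_order_release);     // second marker: even
}

void read(int& rx, int& ry) {
    unsigned s1, s2;
    do {
        s1 = seq.load(std::memory_order_acquire);    // read first marker
        rx = x.load(std::memory_order_relaxed);      // copy into buffer
        ry = y.load(std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_acquire);
        s2 = seq.load(std::memory_order_relaxed);    // read second marker
    } while (s1 != s2 || (s1 & 1));  // inconsistent: discard and retry
}
</syntaxhighlight>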
== See also ==
* [[Deadlock (computer science)|Deadlock]]
* [[Java ConcurrentMap#Lock-free atomicity]]
* [[Liveness]]
* [[Lock (computer science)]]
* [[Mutual exclusion]]
* [[Priority inversion]]
* [[Resource starvation]]
* [[Non-lock concurrency control]]
* [[Optimistic concurrency control]]
== References ==