Non-blocking algorithm

{{distinguish|Non-blocking I/O}}
 
In [[computer science]], an [[algorithm]] is called '''non-blocking''' if failure or [[Scheduling (computing)|suspension]] of any [[thread (computing)|thread]] cannot cause failure or suspension of another thread;<ref>{{cite book|last1=Goetz|first1=Brian|last2=Peierls|first2=Tim|last3=Bloch|first3=Joshua|last4=Bowbeer|first4=Joseph|last5=Holmes|first5=David|last6=Lea|first6=Doug|title=Java concurrency in practice|date=2006|publisher=Addison-Wesley|___location=Upper Saddle River, NJ|isbn=9780321349606|page=[https://archive.org/details/javaconcurrencyi00goet/page/41 41]|url-access=registration|url=https://archive.org/details/javaconcurrencyi00goet/page/41}}</ref> for some operations, these algorithms provide a useful alternative to traditional [[lock (computer science)|blocking implementations]]. A non-blocking algorithm is '''lock-free''' if there is guaranteed system-wide [[Resource starvation|progress]], and '''wait-free''' if there is also guaranteed per-thread progress. "Non-blocking" was used as a synonym for "lock-free" in the literature until the introduction of obstruction-freedom in 2003.<ref name=obs-free>{{cite conference|last1=Herlihy|first1=M.|last2=Luchangco|first2=V.|last3=Moir|first3=M.|title=Obstruction-Free Synchronization: Double-Ended Queues as an Example|conference=23rd [[International Conference on Distributed Computing Systems]]|year=2003|pages=522|url=http://www.cs.brown.edu/people/mph/HerlihyLM03/main.pdf}}</ref>
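
The difference between the two guarantees can be illustrated with a shared counter (a minimal C11 sketch for illustration only; it assumes the hardware provides these atomic operations natively):

<syntaxhighlight lang="c">
#include <stdatomic.h>

atomic_int counter;

/* Lock-free but not wait-free: an individual thread may retry
 * indefinitely if other threads keep winning the race, but every
 * failed CAS means some other thread succeeded, so system-wide
 * progress is guaranteed. */
void increment_lock_free(void)
{
    int old = atomic_load(&counter);
    while (!atomic_compare_exchange_weak(&counter, &old, old + 1))
        ;  /* on failure, `old` is reloaded with the current value */
}

/* Wait-free, assuming the hardware implements fetch-and-add as a
 * single atomic instruction: every thread completes its increment
 * in a bounded number of steps, regardless of contention. */
void increment_wait_free(void)
{
    atomic_fetch_add(&counter, 1);
}
</syntaxhighlight>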
 
The word "non-blocking" was traditionally used to describe [[telecommunications network]]s that could route a connection through a set of relays "without having to re-arrange existing calls" (see [[Clos network]]). Also, if the telephone exchange "is not defective, it can always make the connection" (see [[nonblocking minimal spanning switch]]).

== Motivation ==

Other problems are less obvious. For example, certain interactions between locks can lead to error conditions such as [[deadlock]], [[livelock]], and [[priority inversion]]. Using locks also involves a trade-off between coarse-grained locking, which can significantly reduce opportunities for [[parallel computing|parallelism]], and fine-grained locking, which requires more careful design, increases locking overhead and is more prone to bugs.
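
As an illustration of one such hazard, a deadlock arises when two threads acquire the same pair of locks in opposite orders (a POSIX-threads sketch; the thread and lock names are invented for the example):

<syntaxhighlight lang="c">
#include <pthread.h>
#include <stddef.h>

pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

/* Thread 1 acquires lock_a, then lock_b. */
void *thread1(void *arg)
{
    pthread_mutex_lock(&lock_a);
    pthread_mutex_lock(&lock_b);  /* blocks if thread 2 holds lock_b */
    /* ... critical section ... */
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return NULL;
}

/* Thread 2 acquires the same locks in the opposite order. If each
 * thread obtains its first lock before attempting its second, both
 * block forever: a deadlock. */
void *thread2(void *arg)
{
    pthread_mutex_lock(&lock_b);
    pthread_mutex_lock(&lock_a);  /* blocks if thread 1 holds lock_a */
    /* ... critical section ... */
    pthread_mutex_unlock(&lock_a);
    pthread_mutex_unlock(&lock_b);
    return NULL;
}
</syntaxhighlight>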
 
Unlike blocking algorithms, non-blocking algorithms do not suffer from these downsides, and in addition are safe for use in [[interrupt handler]]s: even though the [[Pre-emptive multitasking|preempted]] thread cannot be resumed, progress is still possible without it. In contrast, global data structures protected by mutual exclusion cannot safely be accessed in an interrupt handler, as the preempted thread may be the one holding the lock—but this can be rectified easily by masking the interrupt request during the critical section.<ref name="monit">{{cite journal | doi = 10.1145/358818.358824 | url = http://research.microsoft.com/lampson/23-ProcessesInMesa/Abstract.html | title = Experience with Processes and Monitors in Mesa | author = Butler W. Lampson | author-link = Butler W. Lampson |author2=David D. Redell |author2-link=David D. Redell | journal = Communications of the ACM | volume = 23 | issue = 2 | pages = 105–117 |date=February 1980| citeseerx = 10.1.1.142.5765 | s2cid = 1594544 }}</ref>
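
A sketch of the masking approach (the interrupt-control and lock primitives here are hypothetical placeholders for whatever the platform actually provides):

<syntaxhighlight lang="c">
/* Hypothetical platform primitives, named only for illustration:
 * irq_save() masks interrupts on the current processor and returns
 * the previous interrupt state; irq_restore() restores that state. */
unsigned long irq_save(void);
void irq_restore(unsigned long flags);

/* Hypothetical lock guarding the global data structure. */
struct lock;
void acquire(struct lock *l);
void release(struct lock *l);

extern struct lock *structure_lock;

void update_shared_structure(void)
{
    unsigned long flags = irq_save();  /* mask interrupts first */
    acquire(structure_lock);
    /* Critical section: an interrupt handler can no longer run on
     * this processor and find the lock held by the thread it
     * preempted. */
    release(structure_lock);
    irq_restore(flags);                /* unmask interrupts */
}
</syntaxhighlight>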
 
A lock-free data structure can be used to improve performance: because access to the shared structure does not need to be serialized, more time is spent executing in parallel, improving throughput on a multi-core processor.
 
== Implementation ==
With few exceptions, non-blocking algorithms use [[Linearizability|atomic]] [[read-modify-write]] primitives that the hardware must provide, the most notable of which is [[Compare-and-swap|compare and swap (CAS)]]. [[Critical section]]s are almost always implemented using standard interfaces over these primitives (in the general case, critical sections will be blocking, even when implemented with these primitives). In the 1990s all non-blocking algorithms had to be written "natively" with the underlying primitives to achieve acceptable performance. However, the emerging field of [[software transactional memory]] promises standard abstractions for writing efficient non-blocking code.<ref name=lightweight-transactions>{{cite journal|last1=Harris|first1=Tim|last2=Fraser|first2=Keir|title=Language support for lightweight transactions|journal=ACM SIGPLAN Notices|date=26 November 2003|volume=38|issue=11|pages=388|doi=10.1145/949343.949340|url=http://research.microsoft.com/en-us/um/people/tharris/papers/2003-oopsla.pdf|citeseerx=10.1.1.58.8466}}</ref><ref name=composable-memory-transactions>{{cite book|last1=Harris|first1=Tim|last2=Marlow|first2=S.|last3=Peyton-Jones|first3=S.|last4=Herlihy|first4=M.|title=Proceedings of the 2005 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '05 : Chicago, Illinois|publisher=ACM Press|___location=New York, NY|isbn=978-1-59593-080-4|pages=48–60|chapter=Composable memory transactions|date=June 15–17, 2005|doi=10.1145/1065944.1065952}}</ref>
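
A simple spinlock built from CAS illustrates why critical sections remain blocking even when implemented with a non-blocking primitive: if the lock holder is suspended, every other thread spins without making progress (a minimal C11 sketch for illustration):

<syntaxhighlight lang="c">
#include <stdatomic.h>
#include <stdbool.h>

atomic_bool lock_taken;

/* A spinlock built from compare-and-swap. The primitive is atomic,
 * but the critical section it protects is still blocking: if the
 * holder is preempted, other threads spin without progress. */
void spin_lock(void)
{
    bool expected = false;
    while (!atomic_compare_exchange_weak(&lock_taken, &expected, true))
        expected = false;  /* CAS overwrote `expected`; reset and retry */
}

void spin_unlock(void)
{
    atomic_store(&lock_taken, false);
}
</syntaxhighlight>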
 
Much research has also been done in providing basic [[data structure]]s such as [[stack (data structure)|stacks]], [[Queue (data structure)|queues]], [[Set (computer science)|sets]], and [[hash table]]s. These allow programs to easily exchange data between threads asynchronously.
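
For example, the push operation of a Treiber-style lock-free stack can be written directly with CAS (a simplified C11 sketch; pop is omitted because a naïve version suffers from the [[ABA problem]] and from memory-reclamation issues):

<syntaxhighlight lang="c">
#include <stdatomic.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Top of the stack, updated only through CAS. */
static struct node *_Atomic top;

/* Lock-free push: prepare the node privately, then publish it with a
 * single CAS. If a concurrent push or pop changes `top` first, the
 * CAS fails and the loop retries against the new top. */
void push(int value)
{
    struct node *n = malloc(sizeof *n);  /* error handling omitted */
    n->value = value;
    n->next = atomic_load(&top);
    while (!atomic_compare_exchange_weak(&top, &n->next, n))
        ;  /* on failure, n->next now holds the current top */
}
</syntaxhighlight>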

== Wait-freedom ==

It was shown in the 1980s<ref name=imp>{{cite conference |last=Herlihy |first=Maurice P. |conference=Proc. 7th Annual ACM Symp. on Principles of Distributed Computing |isbn=0-89791-277-2 |pages=276–290 |doi=10.1145/62546.62593 |title=Impossibility and universality results for wait-free synchronization |year=1988}}</ref> that all algorithms can be implemented wait-free, and many transformations from serial code, called ''universal constructions'', have been demonstrated. However, the resulting performance does not in general match even naïve blocking designs. Several papers have since improved the performance of universal constructions, but it still falls far short of that of blocking designs.
 
Several papers have investigated the difficulty of creating wait-free algorithms. For example, it has been shown<ref name=cond-sync>{{cite conference |last1=Fich |first1=Faith |author1-link=Faith Ellen |last2=Hendler |first2=Danny |last3=Shavit |first3=Nir |conference=Proc. 23rd Annual ACM Symp. on Principles of Distributed Computing (PODC) |year=2004 |isbn=1-58113-802-4 |pages=80–87 |doi=10.1145/1011767.1011780 |title=On the inherent weakness of conditional synchronization primitives}}</ref> that the widely available atomic ''conditional'' primitives, [[Compare-and-swap|CAS]] and [[Load-link/store-conditional|LL/SC]], cannot provide starvation-free implementations of many common data structures without memory costs growing linearly in the number of threads.
 
In practice, however, these lower bounds do not present a real barrier: spending a cache line or exclusive reservation granule (up to 2&nbsp;KB on ARM) of store per thread in shared memory is not considered too costly for practical systems. Typically the amount of store logically required is a word, but physically CAS operations on the same cache line will collide, as will LL/SC operations in the same exclusive reservation granule, so the amount of store physically required{{citation needed|date=June 2014}} is greater.