Revision as of 23:10, 18 July 2021 edit InedibleHulk (talk \| contribs) Extended confirmed users 127,642 edits m →Caveat: Hyphen, comma, negative. ← Previous edit		Revision as of 17:38, 19 March 2023 edit undo Znok (talk \| contribs) 32 edits Poignancy, explanation of underlying mechanisms Next edit →
Line 1: In [[computer ~~science~~architecture]], the [[test-and-set]] CPU [[instruction (computer science)\|instruction]] (or instruction sequence) is ~~used~~designed to implement [[mutual exclusion]] in [[multiprocessor]] environments. Although a correct [[lock (computer science)\|lock]] can be implemented with test-and-set, itthe ~~can~~''test ~~lead~~and totest-and-set'' optimization lowers [[resource contention]] ~~in busy lock (~~caused by bus locking ~~and~~resp. [[cache ~~invalidation~~coherence ~~when~~\| ~~test-and-set~~cache ~~operation~~coherency ~~needs~~protocol]] tooverhead ~~access~~on ~~memory~~contended ~~[[atomic (computer science)\|atomically]])~~locks. ~~To lower the overhead a more elaborate locking protocol '''test and test-and-set''' is used.~~ Given a lock: ''boolean'' locked := false ''// shared lock variable'' ~~Entry~~the entry protocol is: '''procedure''' EnterCritical() { '''do''' { '''while''' ( locked == true ) '''skip''' ''// spin using normal instructions until the lock ~~'''''seems'''''~~is free'' } '''while''' ( ''TestAndSet''(locked) == true ) ''// attempt actual atomic locking using the [[test-and-set]] instruction'' } ~~Exit~~and the exit protocol is: '''procedure''' ExitCritical() { locked := false } The difference to the simple test-and-set protocol is the additional spin-loop (the ''test'' in ''test and test-and-set'') at the start of the entry protocol, which utilizes ordinary load instructions. The load in this loop executes with less overhead compared to an atomic operation (resp. a [[Load-link/store-conditional \| load-exclusive]] instruction). E.g., on a system utilizing the [[MESI]] cache coherency protocol, the cache line being loaded is moved to the Shared state, whereas a test-and-set instruction or a load-exclusive instruction moves it into the Exclusive state. The entry protocol uses normal memory reads to wait for the lock to become free. Test-and-set is only used to try to get the lock when normal memory read says it's free. Thus the expensive atomic memory operations happen less often than in a simple [[Busy waiting\|spin]] around test-and-set. This is particularly advantageous if multiple processors are contending for the same lock: whereas an atomic instruction or load-exclusive instruction requires a coherency-protocol transaction to give that processor exclusive access to the cache line (causing that line to ping-pong between the involved processors), ordinary loads on a line in Shared state require no protocol transactions at all: processors spinning in the inner loop operate purely locally. Cache-coherency protocol transactions are used only in the outer loop, after the initial check has ascertained that they have a reasonable likelihood of success. If the [[programming language]] used supports [[short-circuit evaluation]], the entry protocol could be implemented as: Line 29 ⟶ 32: ==Caveat== Although this [[Optimization (computer science)\|optimization]] is useful in [[system programming]], ittest-and-set ~~should~~is to be avoided in high-level [[concurrent programming]]: ~~when~~spinning ~~constraints~~in ~~are~~applications ~~unclear~~derives ~~and~~the ~~misunderstood.~~operating ~~One~~system example of bad usage is a similar [[programming idiom\|idiom]] called [[double-checked locking]], which is [[Double-checked_locking#Usage_in_Java\|unsafe without special precautions]] and can be an [[anti-pattern]].<ref>David Bacon et al. [http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html The "Double-Checked Locking is Broken" Declaration].</ref>scheduler the knowledge of who is blocking on what. Consequently, the scheduler will have to guess on how to allocate CPU time among the threads -- typically just allowing the threads to use up their timing quota. Threads will end up burning CPU cycles spinning, waiting for threads that are not scheduled. By using operating-system provided lock objects, such as mutexes, the OS can schedule exactly those threads ready to run. ==See also==

Test and test-and-set: Difference between revisions