{{
A '''distributed operating system''' is system software over a collection of independent, [[Computer network|networked]], [[Inter-process communication|communicating]], and physically separate computational nodes. It handles jobs that are serviced by multiple CPUs.<ref name="Tanenbaum1993">{{cite journal |last=Tanenbaum |first=Andrew S |date=September 1993 |title=Distributed operating systems anno 1992. What have we learned so far? |journal=Distributed Systems Engineering |volume=1 |issue=1 |pages=3–10 |doi=10.1088/0967-1846/1/1/001|bibcode=1993DSE.....1....3T |doi-access=free }}</ref> Each individual node holds a specific software subset of the global aggregate operating system. Each subset is a composite of two distinct service provisioners.<ref name="Nutt1992">{{cite book|last=Nutt|first=Gary J.|title=Centralized and Distributed Operating Systems|url=https://archive.org/details/centralizeddistr0000nutt |url-access=registration|year=1992|publisher=Prentice Hall|isbn=978-0-13-122326-4}}</ref> The first is a ubiquitous minimal [[Kernel (computing)|kernel]], or [[microkernel]], that directly controls that node's hardware. The second is a higher-level collection of ''system management components'' that coordinate the node's individual and collaborative activities. These components abstract microkernel functions and support user applications.<ref name="Gościński1991">{{cite book|last=Gościński|first=Andrzej|title=Distributed Operating Systems: The Logical Design|url=https://books.google.com/books?id=ZnYhAQAAIAAJ|year=1991|publisher=Addison-Wesley Pub. Co.|isbn=978-0-201-41704-3}}</ref>
The microkernel and the management components collection work together. They support the system's goal of integrating multiple resources and processing functionality into an efficient and stable system.<ref name="Fortier1986">{{cite book|last=Fortier|first=Paul J.|title=Design of Distributed Operating Systems: Concepts and Technology|url=https://books.google.com/books?id=F7QmAAAAMAAJ|year=1986|publisher=Intertext Publications|isbn=9780070216211}}</ref> This seamless integration of individual nodes into a global system is referred to as ''transparency'', or ''[[single system image]]'': the illusion, presented to users, that the global system is a single computational entity.<!-- is transparency required for membership in the "dos" group?-->
===The kernel===
At each [[Locale (computer hardware)|locale]] (typically a node), the kernel provides a minimally complete set of node-level utilities necessary for operating a node's underlying hardware and resources. These mechanisms include allocation, management, and disposition of a node's resources, processes, communication, and [[input/output]] management support functions.<ref name="Hansen2001">{{cite book|editor=Hansen, Per Brinch|title=Classic Operating Systems: From Batch Processing to Distributed Systems|url=https://books.google.com/books?id=-PDPBvIPYBkC|year=2001|publisher=Springer|isbn=978-0-387-95113-3}}</ref> Within the kernel, the communications sub-system is of foremost importance for a distributed OS.<ref name="Gościński1991"/>
In a distributed OS, the kernel often supports a minimal set of functions, including low-level [[address space]] management, [[thread (computing)|thread]] management, and [[inter-process communication]] (IPC). A kernel of this design is referred to as a
[[Image:System Management Components.PNG|thumbnail|right|175px|alt=General overview of system management components that reside above the microkernel.|System management components overview]]
System management components are software processes that define the node's ''policies''. They form the part of the OS outside the kernel and provide higher-level communication, process and resource management, reliability, performance, and security. The components match the functions of a single-entity system, adding the transparency required in a distributed environment.<ref name="Gościński1991"/>
The distributed nature of the OS requires additional services to support a node's responsibilities to the global system. In addition, the system management components accept the "defensive" responsibilities of reliability, availability, and persistence. These responsibilities can conflict with each other. A consistent approach, balanced perspective, and a deep understanding of the overall system can assist in identifying [[diminishing returns]].<!--this sentence is rhetoric. say what is meant. give an example.--> Separation of policy and mechanism mitigates such conflicts.<ref name="Chow1997">{{cite book|last1=Chow|first1=Randy|author2=Theodore Johnson|title=Distributed Operating Systems and Algorithms|url=https://books.google.com/books?id=J4MZAQAAIAAJ|year=1997|publisher=Addison Wesley|isbn=978-0-201-49838-7}}</ref>
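The separation of policy and mechanism mentioned above can be illustrated with a minimal sketch. It is not drawn from any of the cited systems: the names <code>Task</code>, <code>Dispatcher</code>, <code>fifo_policy</code> and <code>priority_policy</code> are hypothetical, and the convention that a larger priority number means a more urgent task is an assumption of the example. The dispatcher (mechanism) only maintains the ready list and removes whichever task it is told to run; the decision of which task to run (policy) is supplied as an interchangeable function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    priority: int  # assumption of this sketch: larger means more urgent
    arrival: int   # logical arrival time


# A policy is any function that picks one task from a non-empty ready list.
Policy = Callable[[List[Task]], Task]


def fifo_policy(ready: List[Task]) -> Task:
    """Policy: run the task that arrived first."""
    return min(ready, key=lambda t: t.arrival)


def priority_policy(ready: List[Task]) -> Task:
    """Policy: run the most urgent task."""
    return max(ready, key=lambda t: t.priority)


class Dispatcher:
    """Mechanism: keeps the ready list and removes the task the policy chose."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.ready: List[Task] = []

    def submit(self, task: Task) -> None:
        self.ready.append(task)

    def dispatch_all(self) -> List[str]:
        order = []
        while self.ready:
            task = self.policy(self.ready)  # the only point where policy is consulted
            self.ready.remove(task)
            order.append(task.name)
        return order


def run(policy: Policy) -> List[str]:
    """Submit the same three tasks and return the order the policy produces."""
    dispatcher = Dispatcher(policy)
    dispatcher.submit(Task("backup", priority=1, arrival=0))
    dispatcher.submit(Task("heartbeat", priority=9, arrival=1))
    dispatcher.submit(Task("report", priority=5, arrival=2))
    return dispatcher.dispatch_all()
```

Calling <code>run(fifo_policy)</code> and <code>run(priority_policy)</code> exercises the same dispatcher code with two different orderings, which is the point of the separation: a node could trade responsiveness against fairness by swapping the policy without touching the mechanism.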
===Working together as an operating system===
==History==
Research and experimentation efforts began in earnest in the 1970s and continued through the 1990s, with focused interest peaking in the late 1980s. A number of distributed operating systems were introduced during this period; however, very few of these implementations achieved even modest commercial success.
Fundamental and pioneering implementations of primitive distributed operating system component concepts date to the early 1950s.<ref name=dyseac>{{cite journal |last1=Leiner |first1=Alan L. |title=System Specifications for the DYSEAC |journal=Journal of the ACM |date=April 1954 |volume=1 |issue=2 |pages=57–81 |doi=10.1145/320772.320773 |s2cid=15381094 |doi-access=
In the mid-1970s, research produced important advances in distributed computing. These breakthroughs provided a solid, stable foundation for efforts that continued through the 1990s.
The accelerating proliferation of [[Multiprocessing|multi-processor]] and [[multi-core processor]] systems research led to a resurgence of the distributed OS concept.
===The DYSEAC===
One of the first efforts was the [[DYSEAC]], a general-purpose [[Synchronization (computer science)|synchronous]] computer. In one of the earliest publications of the [[Association for Computing Machinery]], in April 1954, a researcher at the [[National Bureau of Standards]]{{snd}} now the National [[nist|Institute of Standards and Technology]] ([[nist|NIST]]){{snd}} presented a detailed specification of the DYSEAC. The introduction focused upon the requirements of the intended applications, including flexible communications, but also mentioned other computers:
{{
The specification discussed the architecture of multi-computer systems, preferring peer-to-peer rather than master-slave.
{{
This is one of the earliest examples of a computer with distributed control. The [[United States Department of the Army|Dept. of the Army]] reports<ref>Martin H. Weik, "A Third Survey of Domestic Electronic Digital Computing Systems," Ballistic Research Laboratories Report No. 1115, pg. 234-5, Aberdeen Proving Ground, Maryland, March 1961</ref> certified it as reliable and recorded that it passed all acceptance tests in April 1954. It was completed and delivered on time, in May 1954. This was a "[[portable computer]]", housed in a [[Tractor-trailer#Types of trailers|tractor-trailer]], with 2 attendant vehicles and [[Refrigerator truck|6 tons of refrigeration]] capacity.
This [[Computer configuration|configuration]] was ideal for distributed systems. The constant-time projection through memory for storing and retrieval was inherently [[Atomic operation|atomic]] and [[Mutual exclusion|exclusive]]. The cellular memory's intrinsic distributed characteristics<!-- are these intrinsically distributed or merely abstract?--> would be invaluable. The impact on the [[User interface|user]], [[Computer hardware|hardware]]/[[Peripheral|device]], or [[Application programming interface]]s was indirect. The authors were considering distributed systems, stating:
{{
===Foundational work===
====Coherent memory abstraction====
{{pad|2em}} Algorithms for scalable synchronization on
====File System abstraction====
{{pad|2em}}Measurements of a
<br />{{pad|2em}}Memory coherence in shared virtual memory systems <ref>Li, K. and Hudak, P. 1989. Memory coherence in shared virtual memory systems. ACM Trans. Comput. Syst. 7, 4 (Nov. 1989), 321-359.</ref>
{{pad|2em}}''Transactional Memory''
<br />{{pad|4em}}Composable memory transactions<ref>Harris, T., Marlow, S., [[Simon Peyton Jones|Peyton-Jones, S.]], and Herlihy, M. 2005. [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.77.3476&rep=rep1&type=pdf Composable memory transactions]. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Chicago, IL, USA, June 15–17, 2005). PPoPP '05. ACM, New York, NY, 48-60.</ref>
<br />{{pad|4em}}Transactional memory: architectural support for lock-free data structures <ref>Herlihy, M. and Moss, J. E. 1993. [http://hpl.americas.hp.net/techreports/Compaq-DEC/CRL-92-7.pdf Transactional memory: architectural support for lock-free data structures]. In Proceedings of the 20th Annual international Symposium on Computer Architecture (San Diego, California, United States, May 16–19, 1993). ISCA '93. ACM, New York, NY, 289-300.</ref>
<br />{{pad|4em}}Software transactional memory for dynamic-sized data structures<ref>Herlihy, M., Luchangco, V., Moir, M., and Scherer, W. N. 2003. [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.8787&rep=rep1&type=pdf Software transactional memory for dynamic-sized data structures]. In Proceedings of the Twenty-Second Annual Symposium on Principles of Distributed Computing (Boston, Massachusetts, July 13–16, 2003). PODC '03. ACM, New York, NY, 92-101.</ref>
<br />{{pad|4em}}Software transactional memory<ref>Shavit, N. and Touitou, D. 1995. [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.474.5928&rep=rep1&type=pdf Software transactional memory]. In Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing (Ottawa, Ontario, Canada, August 20–23, 1995). PODC '95. ACM, New York, NY, 204-213.</ref>
====Persistence abstraction====
{{pad|2em}}OceanStore: an architecture for global-scale persistent storage <ref>Kubiatowicz, J., Bindel, D., Chen, Y., Czerwinski, S., Eaton, P., Geels, D., Gummadi, R., Rhea, S., Weatherspoon, H., Wells, C., and Zhao, B. 2000. [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.439.4822&rep=rep1&type=pdf OceanStore: an architecture for global-scale persistent storage]. In Proceedings of the Ninth international Conference on Architectural Support For Programming Languages and Operating Systems (Cambridge, Massachusetts, United States). ASPLOS-IX. ACM, New York, NY, 190-201.</ref>
====Coordinator abstraction====
{{pad|2em}} Weighted voting for replicated data <ref>Gifford, D. K. 1979. [http://pages.cs.wisc.edu/~remzi/Classes/739/Spring2004/Papers/p150-gifford.pdf Weighted voting for replicated data]. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 10–12, 1979). SOSP '79. ACM, New York, NY, 150-162</ref>
<br />{{pad|2em}} Consensus in the presence of partial synchrony <ref>Dwork, C., Lynch, N., and Stockmeyer, L. 1988. [https://groups.csail.mit.edu/tds/papers/Lynch/MIT-LCS-TM-270.pdf Consensus in the presence of partial synchrony]. J. ACM 35, 2 (Apr. 1988), 288-323.</ref>
====Reliability abstraction====
{{pad|2em}}''Sanity checks''
<br />{{pad|4em}}The Byzantine Generals Problem <ref>Lamport, L., Shostak, R., and Pease, M. 1982. [http://people.cs.uchicago.edu/~shanlu/teaching/33100_wi15/papers/byz.pdf The Byzantine Generals Problem]. ACM Trans. Program. Lang. Syst. 4, 3 (Jul. 1982), 382-401.</ref>
<br />{{pad|4em}}Fail-stop processors: an approach to designing fault-tolerant computing systems <ref>Schlichting, R. D. and Schneider, F. B. 1983. Fail-stop processors: an approach to designing fault-tolerant computing systems. ACM Trans. Comput. Syst. 1, 3 (Aug. 1983), 222-238.</ref>
==Distributed computing models==
{{More citations needed section|date=January 2012}}
===Three basic distributions===
* ''Location transparency'' – Location transparency comprises two distinct aspects of transparency, naming transparency and user mobility. Naming transparency requires that nothing in the physical or logical references to any system entity should expose any indication of the entity's location, or its local or remote relationship to the user or application. User mobility requires the consistent referencing of system entities, regardless of the system location from which the reference originates.<ref name="Sinha1997" />{{rp|20}}
* ''Access transparency'' – Local and remote system entities must remain indistinguishable when viewed through the user interface. The distributed operating system maintains this perception by exposing a single access mechanism for a system entity, regardless of whether that entity is local or remote to the user. Transparency dictates that any differences in methods of accessing any particular system entity—either local or remote—must be both invisible to, and undetectable by, the user.<ref name="Gościński1991"/>{{rp|84}}<!--what is the difference between referencing and access?-->
* ''Migration transparency'' – Resources and activities migrate from one element to another controlled solely by the system and without user/application knowledge or action.<ref name="Galli2000">{{cite book|last=Galli|first=Doreen L.|title=Distributed Operating Systems: Concepts and Practice|url=https://archive.org/details/distributedopera00gall |url-access=registration|year=2000|publisher=Prentice Hall|isbn=978-0-13-079843-5}}</ref>{{rp|16}}
* ''Replication transparency'' – The process or fact that a resource has been duplicated on another element occurs under system control and without user/application knowledge or intervention.<ref name="Galli2000" />{{rp|16}}
* ''Concurrency transparency'' – Users/applications are unaware of and unaffected by the presence/activities of other users.<ref name="Galli2000" />{{rp|16}}
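Location, access, and migration transparency as described in the list above can be sketched in a few lines. This is an illustrative toy, not the design of any cited system: <code>Node</code>, <code>NameService</code>, <code>Client</code>, <code>create</code> and <code>migrate</code> are hypothetical names, and the remote fetch is simulated by a direct dictionary access where a real system would exchange messages between nodes.

```python
class Node:
    """One machine in the system, holding some objects in local storage."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}


class NameService:
    """Maps location-free logical names to the node currently holding them."""

    def __init__(self):
        self._holder = {}

    def bind(self, name, node):
        self._holder[name] = node

    def lookup(self, name):
        return self._holder[name]


class Client:
    """A user or application attached to a home node."""

    def __init__(self, home, names):
        self.home = home
        self.names = names

    def read(self, name):
        # Access transparency: one call path whether the object is local or remote.
        node = self.names.lookup(name)
        # Simplification: a real system would send a request when node is not
        # self.home; here the "remote" store is read directly.
        return node.store[name]


def create(names, node, name, value):
    """Place an object on a node and register its logical name."""
    node.store[name] = value
    names.bind(name, node)


def migrate(names, name, destination):
    """Move an object under system control; the logical name does not change."""
    source = names.lookup(name)
    destination.store[name] = source.store.pop(name)
    names.bind(name, destination)
```

In this sketch the name <code>"payroll.db"</code> (an arbitrary example) carries no hint of where the object lives, and a client that calls <code>read("payroll.db")</code> before and after <code>migrate</code> receives the same value without changing its code.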
:* or a process must establish exclusive access to a shared resource.
Improper synchronization can lead to multiple failure modes including loss of [[ACID|atomicity, consistency, isolation and durability]], [[Deadlock (computer science)|deadlock]], [[livelock]] and loss of [[serializability]].{{Citation needed|date=January 2012}}
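One of the failure modes named above, deadlock, can arise when two processes each hold one shared resource and wait for the other. A common remedy is to acquire locks in a single global order. The following sketch assumes threads within one process stand in for cooperating processes, and the names <code>Resource</code>, <code>transfer</code> and <code>demo</code> are hypothetical; a distributed implementation would need a distributed lock service rather than <code>threading.Lock</code>.

```python
import threading


class Resource:
    """A shared resource guarded by its own lock."""

    def __init__(self, rid, value):
        self.rid = rid  # unique id that defines the global lock order
        self.value = value
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move `amount` from src to dst while holding both locks."""
    if src is dst:
        return  # nothing to do, and avoids re-acquiring a non-reentrant lock
    # Always lock the lower id first, whatever the direction of the transfer,
    # so two opposite transfers can never each hold one lock and wait forever.
    first, second = sorted((src, dst), key=lambda r: r.rid)
    with first.lock:
        with second.lock:
            src.value -= amount
            dst.value += amount


def demo(rounds=1000):
    """Run opposite transfers concurrently and return the final values."""
    a = Resource(1, 100)
    b = Resource(2, 100)

    def worker(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    t1 = threading.Thread(target=worker, args=(a, b))
    t2 = threading.Thread(target=worker, args=(b, a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.value, b.value
```

Because every update happens with both locks held, the two equal and opposite streams of transfers cancel out, and because the locks are always taken in id order, neither thread can block the other indefinitely.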
===Flexibility===
[[Flexibility (engineering)|Flexibility]] in a distributed operating system is enhanced through the modular
==Research==
===Effective and stable in multiple levels of complexity===
:Tessellation: Space-Time Partitioning in a Manycore Client OS.<ref>Rose Liu, Kevin Klues, and Sarah Bird, University of California at Berkeley; Steven Hofmeyr, Lawrence Berkeley National Laboratory; [[Krste Asanović]] and John Kubiatowicz, University of California at Berkeley. HotPar09.</ref>
==See also==
*
* {{annotated link|HarmonyOS}}
* [[Plan 9 from Bell Labs]]
* {{annotated link|OpenHarmony}}
* [[Inferno (operating system)|Inferno]]
* {{annotated link|BlueOS}}
* [[Single system image]] (SSI)
* {{annotated link|MINIX}}
* [[List of operating systems]]
*
* [[Multikernel]]
* [[Operating System Projects]]
* [[Edsger W. Dijkstra Prize in Distributed Computing]]
* [[List of distributed computing conferences]]
* [[List of distributed computing projects]]
==References==
{{Reflist
==Further reading==
* {{cite book|last1=Chow|first1=Randy|author2=Theodore Johnson|title=Distributed Operating Systems and Algorithms|url=https://books.google.com/books?id=J4MZAQAAIAAJ|year=1997|publisher=Addison Wesley|isbn=978-0-201-49838-7}}
* {{cite book|last=Sinha|first=Pradeep Kumar |title=Distributed Operating Systems: Concepts and Design|url=https://archive.org/details/distributedopera0000sinh|url-access=registration|year=1997|publisher=IEEE Press|isbn=978-0-7803-1119-0}}
* {{cite book|last=Galli|first=Doreen L.|title=Distributed Operating Systems: Concepts and Practice|url=https://archive.org/details/distributedopera00gall |url-access=registration|year=2000|publisher=Prentice Hall|isbn=978-0-13-079843-5}}
==External links==
{{Prone to spam|date=May 2022}}
<!-- {{No more links}}
Please be cautious adding more external links.
Wikipedia is not a collection of links and should not be used for advertising.
Excessive or inappropriate links will be removed.
See [[Wikipedia:External links]] and [[Wikipedia:Spam]] for details.
If there are already suitable links, propose additions or replacements on
the article's talk page.
-->
{{Operating system}}
{{Authority control}}
{{DEFAULTSORT:Distributed Operating System}}