A '''Distributed operating system''' is the minimal subset of software within a distributed system which, considered collectively, provides all the operating system services required to support higher-level components in empowering and maintaining the system.
== Description ==
A distributed operating system is an [[operating system]]. While this statement is somewhat trivial, it is not always obvious, because a distributed operating system is simultaneously a [[distributed system]]. The idea is analogous to a square: being a square, a shape might not immediately be recognized as a rectangle, but it nonetheless is one. The distributed operating system performs all requisite activities and supplies all necessary functionality in its capacity as an operating system, but it is at the same time something more.
An operating system, at its most basic level, is expected to isolate and manage the lower-level physical complexities of the [[hardware]] and associated resources. In turn, these low-level physical elements are organized into simplified logical [[abstractions]], which are presented as higher-level [[Interface (computer science)|interfaces]] to the underlying hardware and [[Resource (computer science)|resources]]. The distributed operating system not only fulfills this role, but does so in a manner indistinguishable from its more centralized counterparts. That is, although distributed in nature, it appears to the user as a singular, local entity, exposing its unique distributed attributes only when convenient or necessary for a specific purpose. Again, a square is a rectangle: the same in principle, different in configuration.
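The idea of a distributed system appearing to the user as a singular, local entity can be sketched in a few lines. The following is only an illustration, not any real system's API; the names <code>LocalStore</code>, <code>RemoteStore</code>, and <code>TransparentFS</code> are invented for this sketch, and the "remote" store is an in-process stand-in for a node that would actually be reached over a network.

```python
# Minimal sketch of access transparency: one namespace, one interface,
# regardless of where the data physically lives. All names are illustrative.

class LocalStore:
    """Data held on this machine."""
    def __init__(self):
        self._files = {"/etc/motd": b"hello"}
    def read(self, path):
        return self._files[path]

class RemoteStore:
    """Stand-in for a node reached over the network (e.g. via RPC)."""
    def __init__(self):
        self._files = {"/var/log/app": b"remote bytes"}
    def read(self, path):
        return self._files[path]

class TransparentFS:
    """Presents one namespace; callers cannot tell local from remote."""
    def __init__(self, local, remote):
        self._stores = [local, remote]
    def read(self, path):
        for store in self._stores:
            try:
                return store.read(path)
            except KeyError:
                continue  # not here; try the next element of the system
        raise FileNotFoundError(path)

fs = TransparentFS(LocalStore(), RemoteStore())
print(fs.read("/etc/motd"))      # served locally
print(fs.read("/var/log/app"))   # served "remotely", same interface
```

The caller uses a single <code>read()</code> interface throughout; the ___location of the data is an internal detail, which is precisely the transparency property described above.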
This is an exceptionally important and subtle distinction. It differentiates the distributed operating system from other decentralized operating systems, and, more importantly, it provides a foundation for many additional and beneficial services, described in detail below. Simply put, it is the dissemination of hardware elements that ''allows'' for additional benefits and services; it is the complex, sophisticated, and orchestrated exploitation of this separation that realizes those benefits.
== Overview ==
The unique nature of the distributed operating system is both subtle and complex. A distributed operating system's hardware infrastructure elements are not centralized; that is, the elements are not in close proximity to one another at a single ___location. The structural elements of a given distributed operating system could reside in various rooms within a building, or in various buildings around the world. This geographic dispersion defines its decentralization; however, the distributed operating system is a distributed system, not simply a decentralized one.
Lastly, as to the nature of the distributed system, some experts state that the distributed operating system is not an operating system at all, but merely a distributed system, because of the attention required to maintain the system. This article maintains the operating-system status of the distributed operating system by observation: as mentioned earlier, a [[Square_(geometry)#Other_facts|square]] is a [[rectangle]], and no amount of effort required to maintain four equal sides changes that fact.
== Architectural features ==
To remain transparent, a system's elements may copy (replicate) portions of themselves onto collections of host elements. In times of need, a failed element's information can be retrieved from these hosts to continue processing and eventually reconstitute the faulty element. This too adds complexity, and it does not end there. Replicating information throughout the system requires coordination, and therefore a coordinator. The coordinator oversees many aspects of a system's operation, unless that coordinator itself fails, in which event some other element must be chosen and constituted as coordinator. This process adds further complexity, and these examples by no means sum to a total. Transparency envelops a system in an abstraction of extremely complex construction, but provides the user with a complete, consistent, and simplified local interface to hardware, devices, and resources. The various facets of a system contributing to this complexity are discussed individually below.
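The coordinator-failure scenario described above is typically handled by an election protocol. Below is a single-process sketch loosely modeled on the classic "bully" election idea (the highest-ranked live node takes over); the <code>Node</code> class and the in-memory <code>cluster</code> dictionary are illustrative assumptions standing in for real networked elements and their message exchanges.

```python
# Hedged sketch of coordinator election among system elements.
# Real systems exchange messages over a network; here the "network"
# is just a shared dictionary, purely for illustration.

class Node:
    def __init__(self, node_id, cluster):
        self.node_id = node_id
        self.cluster = cluster      # node_id -> Node; stands in for the network
        self.alive = True
        self.coordinator = None

    def elect(self):
        """Highest-id live node becomes coordinator; all live nodes learn of it."""
        higher = [n for i, n in self.cluster.items()
                  if i > self.node_id and n.alive]
        if not higher:
            # No live node outranks us: announce ourselves as coordinator.
            for n in self.cluster.values():
                if n.alive:
                    n.coordinator = self.node_id
        else:
            # Defer to the highest-id live node, which runs its own election.
            max(higher, key=lambda n: n.node_id).elect()

cluster = {}
for i in range(1, 5):
    cluster[i] = Node(i, cluster)

cluster[4].alive = False   # the current coordinator "fails"
cluster[1].elect()         # any surviving node may start an election
print(cluster[1].coordinator)  # node 3: the highest-id node still alive
```

The point of the sketch is the structural one made in the text: detecting a failed coordinator and constituting a new one is itself a coordinated activity, which is exactly where the added complexity comes from.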
=== Modularity ===
A distributed operating system is inherently modular by definition. However, a system's '''modularity''' speaks more to its composition and configuration, the rationale behind them, and ultimately their effectiveness. A system element could be composed of multiple layers of components, each varying in subcomponent granularity. These layers and component compositions would each have a coherent and rational configuration toward some purpose in the system: a more simplified abstraction, raw communication efficiency, accommodation of heterogeneous elements, processing parallelism and concurrency, or support for an object-oriented programming paradigm. In any event, the scattered distribution of system elements is not random, but is most often the result of detailed design and careful planning.
=== Persistence of Entity state ===
{{pad|2em}}Existence is not time-bound; the entity persists and functions continuously, regardless of breaks in system operation
<br />{{pad|2em}}Subject to consistent and timely updates
<br />{{pad|2em}}Able to survive hardware failure
=== Efficiency ===
{{pad|2em}}Many issues can adversely affect system performance:
<br />{{pad|2em}}Workload variations, delays, interruptions, faults, and/or crashes of entities
<br />{{pad|4em}}Distributed processing community assists when needed
=== Replication ===
{{pad|2em}}Duplication of state among selected distributed entities, and the synchronization of that state
<br />{{pad|2em}}Remote communication required to effect synchronization
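The two points above (duplication of state plus remote synchronization) amount to a primary-backup pattern. The following is a minimal in-process sketch under that assumption; <code>Primary</code>, <code>Backup</code>, and <code>receive()</code> are invented names, and the direct method call stands in for the remote communication a real system would need.

```python
# Hedged sketch of state replication with synchronization: a primary applies
# an update locally, then pushes it to every backup. In a real distributed
# system the receive() call would be an RPC or message send, not a method call.

class Backup:
    def __init__(self):
        self.state = {}
    def receive(self, key, value):
        self.state[key] = value          # apply the replicated update

class Primary:
    def __init__(self, backups):
        self.state = {}
        self.backups = backups
    def update(self, key, value):
        self.state[key] = value          # apply locally first
        for b in self.backups:           # then synchronize every replica
            b.receive(key, value)        # stand-in for remote communication

backups = [Backup(), Backup()]
primary = Primary(backups)
primary.update("x", 42)
print(all(b.state == primary.state for b in backups))  # replicas in sync
```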
=== Reliability ===
{{pad|2em}}Inherent redundancy across the distributed entities provides fault-tolerance
<br />{{pad|2em}}Consistent synchronized redundancy across N nodes, tolerates up to N-1 node faults
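The "N nodes tolerate up to N-1 faults" claim can be made concrete for reads: as long as one synchronized replica survives, the state is still recoverable. The sketch below assumes fully synchronized replicas; <code>Replica</code> and <code>read_any</code> are illustrative names, not part of any real system.

```python
# Hedged sketch of fault tolerance through redundancy: with N synchronized
# replicas, a read succeeds as long as at least one replica is still alive,
# i.e. up to N-1 node faults are tolerated.

class Replica:
    def __init__(self, value):
        self.value = value      # assumed synchronized across all replicas
        self.alive = True

def read_any(replicas):
    """Return the value from the first live replica; fail only if all are down."""
    for r in replicas:
        if r.alive:
            return r.value
    raise RuntimeError("all replicas failed")

N = 4
replicas = [Replica("state-v7") for _ in range(N)]
for r in replicas[:N - 1]:      # fail N-1 of the N replicas
    r.alive = False
print(read_any(replicas))       # still recoverable: one live copy suffices
```

Note that the guarantee rests entirely on the "consistent synchronized" qualifier in the text: if replicas were allowed to diverge, the surviving copy might not hold the current state.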
=== Flexibility ===
{{pad|2em}}OS has latitude in degree of exposure to externals
<br />{{pad|4em}}Coordination of process activity
<br />{{pad|4em}}Where to run: near the user? near resources? on an available CPU? etc.
=== Scalability ===
{{pad|2em}}node expansion
<br />{{pad|2em}}process migration
== History ==
With a cursory glance around the internet, or a modest perusal of pertinent writings, one could very easily gain the notion that computer operating systems were a new phenomenon in the mid-twentieth century. In fact, important research in operating systems was being conducted at this time.<ref>Dreyfuss, P. 1958. System design of the Gamma 60. In Proceedings of the May 6-8, 1958, Western Joint Computer Conference: Contrasts in Computers (Los Angeles, California, May 06 - 08, 1958). IRE-ACM-AIEE '58 (Western). ACM, New York, NY, 130-133. </ref><ref>Leiner, A. L., Notz, W. A., Smith, J. L., and Weinberger, A. 1958. Organizing a network of computers to meet deadlines. In Papers and Discussions Presented At the December 9-13, 1957, Eastern Joint Computer Conference: Computers with Deadlines To Meet (Washington, D.C., December 09 - 13, 1957). IRE-ACM-AIEE '57</ref><ref>Leiner, A. L., Smith, J. L., Notz, W. A., and Weinberger, A. 1958. PILOT, the NBS multicomputer system. In Papers and Discussions Presented At the December 3-5, 1958, Eastern Joint Computer Conference: Modern Computers: Objectives, Designs, Applications (Philadelphia, Pennsylvania, December 03 - 05, 1958). AIEE-ACM-IRE '58 (Eastern). ACM, New York, NY, 71-75.</ref><ref>Bauer, W. F. 1958. Computer design from the programmer's viewpoint. In Papers and Discussions Presented At the December 3-5, 1958, Eastern Joint Computer Conference: Modern Computers: Objectives, Designs, Applications (Philadelphia, Pennsylvania, December 03 - 05, 1958). AIEE-ACM-IRE '58 (Eastern). ACM, New York, NY, 46-51.</ref><ref>Leiner, A. L., Notz, W. A., Smith, J. L., and Weinberger, A. 1959. PILOT—A New Multiple Computer System. J. ACM 6, 3 (Jul. 1959), 313-335. </ref><ref>Estrin, G. 1960. Organization of computer systems: the fixed plus variable structure computer. In Papers Presented At the May 3-5, 1960, Western Joint IRE-AIEE-ACM Computer Conference (San Francisco, California, May 03 - 05, 1960). IRE-AIEE-ACM '60 (Western). 
ACM, New York, NY, 33-40.</ref> While early exploration of operating systems took place in the years leading up to 1950, highly advanced research began shortly afterward on new systems designed to conquer new problems. In the first decade of the second half of the [[20th century]], many new questions were asked, many new problems were identified, and many solutions were developed and kept working for years in controlled production environments.
==== Aboriginal Distributed Computing ====
This is one of the earliest examples of a computer with distributed control. [[United States Department of the Army|Dept. of the Army]] reports<ref>Martin H. Weik, "A Third Survey of Domestic Electronic Digital Computing Systems," Ballistic Research Laboratories Report No. 1115, pg. 234-5, Aberdeen Proving Ground, Maryland, March 1961</ref> show it was certified reliable and passed all acceptance tests in April 1954. It was completed and delivered on time, in May 1954. It was also a [[portable computer]]: it was housed in a [[Tractor-trailer#Types_of_trailers|tractor-trailer]], with 2 attendant vehicles and [[Refrigerator truck|6 tons of refrigeration]] capacity.
==== Multi-programming abstraction ====
Similar to the previous system, the TX-2 discussion has a distinct decentralized theme until it is revealed that efficiencies in system operation are gained when separate programmed devices are operated simultaneously. It is also stated that the full power of the central unit can be utilized by any device; and it may be used for as long as the device's situation requires. In this, we see the TX-2 as another example of a system exhibiting distributed control, its central unit not having dedicated control.
==== Memory access abstraction ====
{{quote|We wanted to present here the basic ideas of a distributed logic system with... the macroscopic concept of logical design, away from scanning, from searching, from addressing, and from counting, is equally important. We must, at all cost, free ourselves from the burdens of detailed local problems which only befit a machine low on the evolutionary scale of machines.|Chung-Yeol (C. Y.) Lee|''Intercommunicating Cells, Basis for a Distributed Logic Computer''}}
==== Component abstraction ====
<br />
''Defining a kernel with all the attributes given above is difficult, and perhaps impractical... It is, nevertheless, the approach taken in the HYDRA system. Although we make no claim either that the set of facilities provided by the HYDRA kernel ... we do believe the set provides primitives which are both necessary and adequate for the construction of a large and interesting class of operating environments. It is our view that the set of functions provided by HYDRA will enable the user of C.mmp to create his own operating environment without being confined to predetermined command and file systems, execution scenarios, resource allocation policies, etc.''</font>
==== Initial composition ====
<font color="red">''The National Software Works (NSW) is a significant new step in the development of distributed processing systems and computer networks. NSW is an ambitious project to link a set of geographically distributed and diverse hosts with an operating system which appears as a single entity to a prospective user.''</font>
==== Complete instantiation ====
''The decision not to use logical or physical sharing of memory for communication is influenced both by the constraints of currently available hardware and by our perception of cost bottlenecks likely to arise as the number of processors increases. ''</font>
=== Foundational Work ===
{{pad|2em}}'''Algorithms for scalable synchronization on shared-memory multiprocessors'''<ref>Mellor-Crummey, J. M. and Scott, M. L. 1991. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst. 9, 1 (Feb. 1991), 21-65.</ref>
<br />{{pad|2em}}'''A √N algorithm for mutual exclusion in decentralized systems'''<ref>Maekawa, M. 1985. A √N algorithm for mutual exclusion in decentralized systems. ACM Trans. Comput. Syst. 3, 2 (May. 1985), 145-159.</ref>
==== File System abstraction ====
{{pad|2em}}'''Measurements of a distributed file system'''<ref>Baker, M. G., Hartman, J. H., Kupfer, M. D., Shirriff, K. W., and Ousterhout, J. K. 1991. Measurements of a distributed file system. In Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, October 13 - 16, 1991). SOSP '91. ACM, New York, NY, 198-212.</ref>
<br />{{pad|2em}}'''Memory coherence in shared virtual memory systems'''<ref>Li, K. and Hudak, P. 1989. Memory coherence in shared virtual memory systems. ACM Trans. Comput. Syst. 7, 4 (Nov. 1989), 321-359.</ref>
==== Transaction abstraction ====
{{pad|2em}}''Transactions''
<br />{{pad|4em}}'''Sagas'''<ref>Garcia-Molina, H. and Salem, K. 1987. Sagas. In Proceedings of the 1987 ACM SIGMOD international Conference on Management of Data (San Francisco, California, United States, May 27 - 29, 1987). U. Dayal, Ed. SIGMOD '87. ACM, New York, NY, 249-259.</ref>
{{pad|2em}}''Transactional Memory''
<br />{{pad|4em}}'''Software transactional memory for dynamic-sized data structures'''<ref>Herlihy, M., Luchangco, V., Moir, M., and Scherer, W. N. 2003. Software transactional memory for dynamic-sized data structures. In Proceedings of the Twenty-Second Annual Symposium on Principles of Distributed Computing (Boston, Massachusetts, July 13 - 16, 2003). PODC '03. ACM, New York, NY, 92-101.</ref>
<br />{{pad|4em}}'''Software transactional memory'''<ref>Shavit, N. and Touitou, D. 1995. Software transactional memory. In Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing (Ottawa, Ontario, Canada, August 20 - 23, 1995). PODC '95. ACM, New York, NY, 204-213.</ref>
==== Persistence abstraction ====
{{pad|2em}}'''OceanStore: an architecture for global-scale persistent storage'''<ref>Kubiatowicz, J., Bindel, D., Chen, Y., Czerwinski, S., Eaton, P., Geels, D., Gummadi, R., Rhea, S., Weatherspoon, H., Wells, C., and Zhao, B. 2000. OceanStore: an architecture for global-scale persistent storage. In Proceedings of the Ninth international Conference on Architectural Support For Programming Languages and Operating Systems (Cambridge, Massachusetts, United States). ASPLOS-IX. ACM, New York, NY, 190-201.</ref>
==== Coordinator abstraction ====
{{pad|2em}}'''Weighted voting for replicated data'''<ref>Gifford, D. K. 1979. Weighted voting for replicated data. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 10 - 12, 1979). SOSP '79. ACM, New York, NY, 150-162</ref>
<br />{{pad|2em}}'''Consensus in the presence of partial synchrony'''<ref>Dwork, C., Lynch, N., and Stockmeyer, L. 1988. Consensus in the presence of partial synchrony. J. ACM 35, 2 (Apr. 1988), 288-323.</ref>
==== Reliability abstraction ====
<br />{{pad|4em}}'''The Byzantine Generals Problem'''<ref>Lamport, L., Shostak, R., and Pease, M. 1982. The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst. 4, 3 (Jul. 1982), 382-401.</ref>
<br />{{pad|4em}}'''Fail-stop processors: an approach to designing fault-tolerant computing systems'''<ref>Schlichting, R. D. and Schneider, F. B. 1983. Fail-stop processors: an approach to designing fault-tolerant computing systems. ACM Trans. Comput. Syst. 1, 3 (Aug. 1983), 222-238.</ref>
{{pad|2em}}''Recoverability''
<br />{{pad|4em}}'''Optimistic recovery in distributed systems'''<ref>Strom, R. and Yemini, S. 1985. Optimistic recovery in distributed systems. ACM Trans. Comput. Syst. 3, 3 </ref>
=== Current Research ===
==== replicated model extended to a component object model ====
{{pad|2em}}Architectural Design of E1 Distributed Operating System<ref>L.B. Ryzhyk, A.Y. Burtsev. Architectural design of E1 distributed operating system. System Research and Information Technologies international scientific and technical journal, October 2004, Kiev, Ukraine.</ref>
<br />{{pad|2em}}Design and development of MINIX distributed operating system<ref>Ramesh, K. S. 1988. Design and development of MINIX distributed operating system. In Proceedings of the 1988 ACM Sixteenth Annual Conference on Computer Science (Atlanta, Georgia, United States). CSC '88. ACM, New York, NY, 685.</ref>
=== Future Directions ===
==== Systems able to provide low-level complexity exposure, in proportion to trust and accepted responsibility ====
{{pad|2em}}Application performance and flexibility on exokernel systems.<ref>M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, Héctor M. Briceño, Russell Hunt, David Mazières, Thomas Pinckney, Robert Grimm, John Jannotti, and Kenneth Mackenzie. In the Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP '97), Saint-Malô, France, October 1997.</ref>
<br />{{pad|2em}}Scale and performance in the Denali isolation kernel.<ref>Whitaker, A., Shaw, M., and Gribble, S. D. 2002. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation</ref>
==== Infrastructures focused on multi-processor/core processing ====
{{pad|2em}}The multikernel: a new OS architecture for scalable multicore systems.<ref>Baumann, A., Barham, P., Dagand, P., Harris, T., Isaacs, R., Peter, S., Roscoe, T., Schüpbach, A., and Singhania, A. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.</ref>
<br />{{pad|2em}}Corey: an Operating System for Many Cores.<ref>S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, F. Kashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. Proceedings of the 2008 Symposium on Operating Systems Design and Implementation (OSDI), December 2008.</ref>
==== Systems extending a consistent and stable impression of distributed processing over extremes in heterogeneity ====
{{pad|2em}}Helios: heterogeneous multiprocessing with satellite kernels.<ref>Nightingale, E. B., Hodson, O., McIlroy, R., Hawblitzel, C., and Hunt, G. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.</ref>
==== Systems able to provide effective, stable, and beneficial views of vastly increased complexity on multiple levels ====
{{pad|2em}}Tessellation
== References ==
<!--- See http://en.wikipedia.org/wiki/Wikipedia:Footnotes on how to create references using <ref></ref> tags which will then appear here automatically -->
{{Reflist}}
<!--- Categories --->