{{Userspace draft|source=ArticleWizard|date=March 2010}}
<br /><br />
{{notice|Several improvements are in progress:
* A basic outline of the structure is now in place
* The reference material supporting the parts of the structure has been entered
* All comments, observations, hints, corrections, additions, etc. are very welcome.
}}<br />
<br /><br />
I am [[user:JLSjr|JLSjr]]; please consider assisting in this effort.
<br /><br /><br />
A '''distributed operating system''' is a decentralized [[operating system]] whose functionality is disseminated across multiple, independent processing entities. The system projects a complete, consistent, and transparent abstraction, presenting a façade that a user understands in a familiar centralized context. This abstraction hides the system's distributed nature: it does not simply separate the user from the inherent complexity, but prevents any indication of it altogether.
<br />
<br />
== Architectural features ==
=== Transparency ===
<pre>
Common execution paradigm across multiple distributed entities
Complexity involves:
asynchronous communication
non-uniform memory access
probability of faults
possibility of heterogeneous remote resources
Distributed operating system must isolate users from these complexities
Provide a complete and consistent abstraction of resources across the distributed system
</pre>
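The notes above can be illustrated with a small sketch of access transparency: a caller uses one uniform interface and cannot tell whether the resource behind it is local or remote. All names here (<code>LocalStore</code>, <code>RemoteStore</code>, <code>TransparentStore</code>) are hypothetical, not from any cited system.

```python
# Hypothetical sketch of access transparency: the caller sees one
# interface regardless of where the resource actually lives.

class LocalStore:
    """Resource held in local memory."""
    def __init__(self):
        self._data = {}
    def read(self, key):
        return self._data.get(key)
    def write(self, key, value):
        self._data[key] = value

class RemoteStore:
    """Stand-in for a resource reached over the network; a real system
    would marshal requests and cope with latency and faults."""
    def __init__(self, transport):
        self._transport = transport  # e.g. an RPC stub (assumed callable)
    def read(self, key):
        return self._transport("read", key)
    def write(self, key, value):
        return self._transport("write", key, value)

class TransparentStore:
    """Uniform facade: callers cannot distinguish local from remote."""
    def __init__(self, backend):
        self._backend = backend
    def read(self, key):
        return self._backend.read(key)
    def write(self, key, value):
        return self._backend.write(key, value)
```

Either backend can be swapped in without changing caller code, which is the essence of the facade described above.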
=== Modularity ===
<pre>
System abstraction composed from modular subcomponents/entities
Entities usually smaller, functionally-cohesive, cooperative, and stateful
</pre>
=== Persistence of Entity state ===
<pre>
Existence not time-bound; state survives breaks in system function
Resides in nonvolatile storage; synchronized with the current, stable, active copy
Subject to consistent and timely updates
Able to survive hardware failure
</pre>
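A minimal sketch of the persistence idea above: an entity checkpoints its state to nonvolatile storage so a later incarnation can recover it. The class name and file layout are hypothetical; a real distributed OS would also synchronize the stored copy with replicas.

```python
# Hypothetical sketch: entity state that survives a restart by being
# checkpointed to nonvolatile storage (here, a JSON file).
import json
import os
import tempfile

class PersistentEntity:
    def __init__(self, path):
        self._path = path
        self.state = self._load()     # recover prior state if it exists

    def _load(self):
        if os.path.exists(self._path):
            with open(self._path) as f:
                return json.load(f)
        return {}

    def checkpoint(self):
        # Write to a temporary file, then atomically rename, so an
        # update is either fully applied or not applied at all.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self._path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp, self._path)
```

The atomic-rename step is one way to get the "consistent and timely updates" the outline calls for: a crash mid-checkpoint leaves the previous stable copy intact.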
=== Efficiency ===
<pre>
Many issues can adversely affect system performance:
latency in interactions among distributed entities
local-response façade requires remote entities' state to be cached locally
and consistently synchronized to maintain the paradigm
Workload variations, delays, interruptions, faults, and/or crashes of entities
Distributed processing community assists when needed
</pre>
=== Replication ===
<pre>
Duplication of state among selected distributed entities, and the synchronization of that state
Remote communication required to effect synchronization
</pre>
=== Reliability ===
<pre>
Inherent redundancy across the distributed entities provides fault-tolerance
Consistent, synchronized redundancy across N nodes tolerates up to N-1 node faults
</pre>
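The N-1 claim above can be demonstrated with a toy model, assuming crash (fail-stop) faults and synchronous writes to every live replica. The names (<code>Replica</code>, <code>replicate</code>) are hypothetical, not from any cited system.

```python
# Hypothetical sketch: with state kept consistently on N replicas,
# any N-1 crash faults still leave one survivor holding the state.

class Replica:
    def __init__(self):
        self.alive = True
        self.state = None

def replicate(value, replicas):
    # Synchronous write to every live replica keeps copies consistent.
    for r in replicas:
        if r.alive:
            r.state = value

def read(replicas):
    # Any surviving replica can answer the read.
    for r in replicas:
        if r.alive:
            return r.state
    raise RuntimeError("all replicas failed")
```

With N = 3, the value remains readable after two of the three replicas fail; only the loss of all N copies loses the state.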
=== Flexibility ===
<pre>
OS has latitude in degree of exposure to externals
Externals have latitude in degree of exposure they accept
Coordination of process activity
Where to run: near the user? near resources? where CPU is available? etc.
</pre>
=== Scalability ===
<pre>
node expansion
process migration
</pre>
== History ==
=== Pioneering inspirations ===
==== Connectivity abstraction ====
Ethernet: distributed packet switching for local computer networks<ref>Metcalfe, R. M. and Boggs, D. R. 1976. Ethernet: distributed packet switching for local computer networks. Commun. ACM 19, 7 (Jul. 1976), 395-404.</ref>
==== Memory access abstraction ====
Intercommunicating Cells, Basis for a Distributed Logic Computer<ref>Lee, C. Y. 1962. Intercommunicating cells, basis for a distributed logic computer. In Proceedings of the December 4-6, 1962, Fall Joint Computer Conference (Philadelphia, Pennsylvania, December 04 - 06, 1962). AFIPS '62 (Fall).</ref> (1962)
<pre>
The purpose of this paper is to describe an information storage and retrieval system in which logic is distributed throughout the system. The system is made up of cells. Each cell is a small finite state machine which can communicate with its neighboring cells ... The motivation stems from our contention that in the present generation of machines the scheme of locating a quantity of information by its "address" is fundamentally a weak one...
The association of an address with a quantity of information is very much the result of the type of computer organization we now have. Writing a program in machine language, one rarely deals with the quantities of information themselves. A programmer normally must know where these quantities of information are located. He then manipulates their addresses according to the problem at hand. An address in this way often assumes the role of the name of a piece of information...
</pre>
==== Process abstraction ====
The Structure of the "THE"-Multiprogramming System<ref>Dijkstra, E. W. 1968. The structure of the “THE”-multiprogramming system. Commun. ACM 11, 5 (May. 1968), 341-346.</ref> (1968)
<pre>
A multiprogramming system is described in which all activities are divided over a number of sequential processes. These sequential processes are placed at various hierarchical levels, in each of which one or more independent abstractions have been implemented. The hierarchical structure proved to be vital for the verification of the logical soundness of the design and the correctness of its implementation.
</pre>
==== Component abstraction ====
HYDRA:The Kernel of a Multiprocessor Operating System<ref>Wulf, W., Cohen, E., Corwin, W., Jones, A., Levin, R., Pierson, C., and Pollack, F. 1974. HYDRA: the kernel of a multiprocessor operating system. Commun. ACM 17, 6 (Jun. 1974), 337-345.</ref> (1974)
<pre>
The design philosophy of HYDRA ... suggest that, at the heart of the system, one should build a collection of facilities of "universal applicability" and "absolute reliability" -- a set of mechanisms from which an arbitrary set of operating system facilities and policies can be conveniently, flexibly, efficiently, and reliably constructed.
Defining a kernel with all the attributes given above is difficult, and perhaps impractical... It is, nevertheless, the approach taken in the HYDRA system. Although we make no claim either that the set of facilities provided by the HYDRA kernel ... we do believe the set provides primitives which are both necessary and adequate for the construction of a large and interesting class of operating environments. It is our view that the set of functions provided by HYDRA will enable the user of C.mmp to create his own operating environment without being confined to predetermined command and file systems, execution scenarios, resource allocation policies, etc.
</pre>
==== Initial composition ====
The National Software Works: A Distributed Processing System<ref>Millstein, R. E. 1977. The National Software Works: A distributed processing system. In Proceedings of the 1977 Annual Conference ACM '77. ACM, New York, NY, 44-52.</ref> (1975)
<pre>
The National Software Works (NSW) is a significant new step in the development of distributed processing systems and computer networks. NSW is an ambitious project to link a set of geographically distributed and diverse hosts with an operating system which appears as a single entity to a prospective user.
</pre>
==== Complete instantiation ====
The Roscoe Distributed Operating System<ref>Solomon, M. H. and Finkel, R. A. 1979. The Roscoe distributed operating system. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 10 - 12, 1979). SOSP '79.</ref> (1979)
<pre>
Roscoe is an operating system implemented at the University of Wisconsin that allows a network of microcomputers to cooperate to provide a general-purpose computing facility. The goal of the Roscoe network is to provide a general-purpose computation resource in which individual resources such as files and processors are shared among processes and control is distributed in a non-hierarchical fashion. All processors are identical. Similarly, all processors run the same operating system kernel. However, they may differ in the peripheral units connected to them. No memory is shared between processors. All communication involves messages explicitly passed between physically connected processors. No assumptions are made about the topology of interconnection.
The decision not to use logical or physical sharing of memory for communication is influenced both by the constraints of currently available hardware and by our perception of cost bottlenecks likely to arise as the number of processors increases.
</pre>
=== Foundational Work ===
==== Coherent memory abstraction ====
* Algorithms for scalable synchronization on shared-memory multiprocessors<ref>Mellor-Crummey, J. M. and Scott, M. L. 1991. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst. 9, 1 (Feb. 1991), 21-65.</ref>
* A √''N'' algorithm for mutual exclusion in decentralized systems<ref>Maekawa, M. 1985. A √N algorithm for mutual exclusion in decentralized systems. ACM Trans. Comput. Syst. 3, 2 (May. 1985), 145-159.</ref>
==== File System abstraction ====
* Measurements of a distributed file system<ref>Baker, M. G., Hartman, J. H., Kupfer, M. D., Shirriff, K. W., and Ousterhout, J. K. 1991. Measurements of a distributed file system. In Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, October 13 - 16, 1991). SOSP '91. ACM, New York, NY, 198-212.</ref>
* Memory coherence in shared virtual memory systems<ref>Li, K. and Hudak, P. 1989. Memory coherence in shared virtual memory systems. ACM Trans. Comput. Syst. 7, 4 (Nov. 1989), 321-359.</ref>
==== Transaction abstraction ====
* Transactions
** Sagas<ref>Garcia-Molina, H. and Salem, K. 1987. Sagas. In Proceedings of the 1987 ACM SIGMOD international Conference on Management of Data (San Francisco, California, United States, May 27 - 29, 1987). U. Dayal, Ed. SIGMOD '87. ACM, New York, NY, 249-259.</ref>
* Transactional Memory
** Composable memory transactions<ref>Harris, T., Marlow, S., Peyton-Jones, S., and Herlihy, M. 2005. Composable memory transactions. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Chicago, IL, USA, June 15 - 17, 2005). PPoPP '05. ACM, New York, NY, 48-60.</ref>
** Transactional memory: architectural support for lock-free data structures<ref>Herlihy, M. and Moss, J. E. 1993. Transactional memory: architectural support for lock-free data structures. In Proceedings of the 20th Annual international Symposium on Computer Architecture (San Diego, California, United States, May 16 - 19, 1993). ISCA '93. ACM, New York, NY, 289-300.</ref>
** Software transactional memory for dynamic-sized data structures<ref>Herlihy, M., Luchangco, V., Moir, M., and Scherer, W. N. 2003. Software transactional memory for dynamic-sized data structures. In Proceedings of the Twenty-Second Annual Symposium on Principles of Distributed Computing (Boston, Massachusetts, July 13 - 16, 2003). PODC '03. ACM, New York, NY, 92-101.</ref>
** Software transactional memory<ref>Shavit, N. and Touitou, D. 1995. Software transactional memory. In Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing (Ottawa, Ontario, Canada, August 20 - 23, 1995). PODC '95. ACM, New York, NY, 204-213.</ref>
==== Persistence abstraction ====
* OceanStore: an architecture for global-scale persistent storage<ref>Kubiatowicz, J., Bindel, D., Chen, Y., Czerwinski, S., Eaton, P., Geels, D., Gummadi, R., Rhea, S., Weatherspoon, H., Wells, C., and Zhao, B. 2000. OceanStore: an architecture for global-scale persistent storage. In Proceedings of the Ninth international Conference on Architectural Support For Programming Languages and Operating Systems (Cambridge, Massachusetts, United States). ASPLOS-IX. ACM, New York, NY, 190-201.</ref>
==== Coordinator abstraction ====
* Weighted voting for replicated data<ref>Gifford, D. K. 1979. Weighted voting for replicated data. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 10 - 12, 1979). SOSP '79. ACM, New York, NY, 150-162</ref>
* Consensus in the presence of partial synchrony<ref>Dwork, C., Lynch, N., and Stockmeyer, L. 1988. Consensus in the presence of partial synchrony. J. ACM 35, 2 (Apr. 1988), 288-323.</ref>
==== Reliability abstraction ====
* Sanity checks
** The Byzantine Generals Problem<ref>Lamport, L., Shostak, R., and Pease, M. 1982. The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst. 4, 3 (Jul. 1982), 382-401.</ref>
** Fail-stop processors: an approach to designing fault-tolerant computing systems<ref>Schlichting, R. D. and Schneider, F. B. 1983. Fail-stop processors: an approach to designing fault-tolerant computing systems. ACM Trans. Comput. Syst. 1, 3 (Aug. 1983), 222-238.</ref>
* Recoverability
** Distributed snapshots: determining global states of distributed systems<ref>Chandy, K. M. and Lamport, L. 1985. Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst. 3, 1 (Feb. 1985), 63-75.</ref>
** Optimistic recovery in distributed systems<ref>Strom, R. and Yemini, S. 1985. Optimistic recovery in distributed systems. ACM Trans. Comput. Syst. 3, 3 </ref>
=== Current Research ===
==== Replicated model extended to a component object model ====
* Architectural Design of E1 Distributed Operating System<ref>L.B. Ryzhyk, A.Y. Burtsev. Architectural design of E1 distributed operating system. System Research and Information Technologies international scientific and technical journal, October 2004, Kiev, Ukraine.</ref>
* The Cronus distributed operating system<ref>Vinter, S. T. and Schantz, R. E. 1986. The Cronus distributed operating system. In Proceedings of the 2nd Workshop on Making Distributed Systems Work (Amsterdam, Netherlands, September 08 - 10, 1986). EW 2. ACM, New York, NY, 1-3.</ref>
* Fine-grained mobility in the Emerald system<ref>Jul, E., Levy, H., Hutchinson, N., and Black, A. 1987. Fine-grained mobility in the emerald system. In Proceedings of the Eleventh ACM Symposium on Operating Systems Principles (Austin, Texas, United States, November 08 - 11, 1987). SOSP '87. ACM, New York, NY, 105-106.</ref>
* Design and development of MINIX distributed operating system<ref>Ramesh, K. S. 1988. Design and development of MINIX distributed operating system. In Proceedings of the 1988 ACM Sixteenth Annual Conference on Computer Science (Atlanta, Georgia, United States). CSC '88. ACM, New York, NY, 685.</ref>
=== Future Directions ===
==== System able to provide low-level complexity exposure, in proportion to trust and accepted responsibility ====
* Application performance and flexibility on exokernel systems.<ref>M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, Héctor M. Briceño, Russell Hunt, David Mazières, Thomas Pinckney, Robert Grimm, John Jannotti, and Kenneth Mackenzie. In the Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP '97), Saint-Malô, France, October 1997.</ref>
* Scale and performance in the Denali isolation kernel.<ref>Whitaker, A., Shaw, M., and Gribble, S. D. 2002. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation</ref>
==== Infrastructure focus on multi-processor/core ====
* The multikernel: a new OS architecture for scalable multicore systems.<ref>Baumann, A., Barham, P., Dagand, P., Harris, T., Isaacs, R., Peter, S., Roscoe, T., Schüpbach, A., and Singhania, A. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.</ref>
* Corey: an Operating System for Many Cores.<ref>S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, F. Kaashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. Proceedings of the 2008 Symposium on Operating Systems Design and Implementation (OSDI), December 2008.</ref>
==== System extends a consistent and stable impression of processing over extremes in heterogeneity ====
* Helios: heterogeneous multiprocessing with satellite kernels.<ref>Nightingale, E. B., Hodson, O., McIlroy, R., Hawblitzel, C., and Hunt, G. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.</ref>
==== System able to provide effective, stable, and beneficial view of vastly increased complexity on multiple levels ====
* Tessellation
* ...
== References ==
<!--- See http://en.wikipedia.org/wiki/Wikipedia:Footnotes on how to create references using <ref></ref> tags which will then appear here automatically -->
{{Reflist}}
== External links ==
* Coming Soon...
<!--- Categories --->
[[Category:Articles created via the Article Wizard]]