This article or section is in a state of significant expansion or restructuring, and is not yet ready for use. You are welcome to assist in its construction by editing it as well. If this article or section has not been edited in several days, please remove this template. If you are the editor who added this template and you are actively editing, please be sure to replace this template with {{in use}} during the active editing session. Click on the link for template parameters to use.
This article was last edited by JLSjr (talk | contribs) 15 years ago. (Update timer) |
There are several improvements in process (daily)
|
Revision/Update History |
![]() |
Added rough draft of "Lead" section... JLSjr (talk) 09:02, 16 March 2010 (UTC) |
![]() |
Added "Introduction" outline; framework of text to come... JLSjr (talk) 04:56, 15 March 2010 (UTC) |
![]() |
Added initial entry... JLSjr (talk) 12:48, 13 March 2010 (UTC) |
A Distributed operating system is an operating system. While this statement is somewhat trivial, it is not always overtly obvious; because a distributed operating system is simultaneously a distributed system. This idea is synonymous to a square being a square, as it might not immediately be recognized as a rectangle, but nonetheless is. The distributed operating system performs all requisite activities and supplies all necessary functionality in its capacity as operating system, but is at the same time more.
An operating system, at its most basic level, is expected to isolate and manage the lower-level physical complexities of the hardware and associated resources. In turn, these low-level physical elements are organized into simplified logical abstractions. These abstractions are finally presented as higher-level interfaces into the underlying hardware and resources. The Distributed operating system not only fulfills its role in this capacity; but is capable of doing so in a manner indistinguishable from its more centralized counterparts. That is, although distributed in nature, it appears to the user as a singular and local entity; only exposing its unique distributed attributes if convenient or necessary for a specific purpose. Again, a square is a rectangle, the same in principle, different in configuration.
In most graphic depictions, an operating system is illustrated as a discrete organizational container sandwiched between hardware and its resources below, and applications, programmers, and other "needy" entities above. A Distributed operating system, illustrated in a similar fashion, would completely contain the set of discrete and disseminated hardware elements within its container. It is the hardware that displays differing degrees of spatial separation, however it is the containment of these hardware elements that allow for a consistent, unified, singular, and local appearance.
This is an exceptionally important and subtle distinction. It will serve to differentiate the Distributed operating system from other decentralized operating systems. And more importantly, it provides a foundation for many additional and beneficial services, which will be described in detail below. Simply said, it is solely the dissemination of hardware elements that "allows" for additional benefits and services. It is the extremely complex, sophisticated, and orchestrated utilization of this separation that instantiates the benefits.
Overview
An approach to describing the unique nature of DOS
A Distributed OS is decentralized, but has additional specific attributes; it is distributed
compared to a standard OS which is centralized and local
or a Network OS which is decentralized, and while possibly spatially scattered, is not distributed
to demonstrate, consider an OSs constituent element organization: decentralized vs. distributed
three tightly interrelated aspects to this organization:
organizational hierarchy, degree of interconnection, sphere of control
organizational hierarchy:
centralized:
dictatorship - one level
all entities report directly to one higher-level entity
decentralized:
federated - one or more levels
any given sub-level's entities reporting directly to next higher-level entity
distributed:
autonomous - no levels required
all entities reporting directly/indirectly to an appointed coordinator
degree of interconnection:
centralized:
single direct connection
balloons on a string
decentralized:
single direct/indirect connection
corporate org-chart (no dotted lines)
twigs, limbs, branches of a tree
distributed:
multiple direct/indirect connections possible
string art, spiro-graph (completely connected)
spider web (closest neighbors)
interstate system (suited to purpose)
sphere of control:
centralized:
direct and absolute
decentralized:
shared by level, through delegation and follow-up
distributed:
collective, usually through mutual agreement
DOS is an OS with a unique set of constitutional characteristics
providing opportunity for an exceptional array of additional and extended services
while overcoming extreme quantities and degrees of complexity in the process
Architectural features
Transparency
Common execution paradigm across multiple distributed entities
Complexity involves:
asynchronous communication
non-uniform memory access
probability of faults
possibility of heterogeneous remote resources
Distributed operating system must be able to isolate users from complexities
Provide a complete and consistent abstraction of resources, across the distributed system
Modularity
System abstraction composed from modular subcomponents/entities
Entities usually smaller, functionally-cohesive, cooperative, and stateful
Persistence of Entity state
existance not time-bound, regardless of breaks in system functions continuously
resides in nonvolatile storage; synchronized with current, stable, active copy
Subject to consistent and timely updates
Able to survive hardware failure
Efficiency
Many issues can adversly affect system performance:
latency in interactions among distributed entities
local response facade requires remote entities' state be cached locally
and consistently synchronized to maintain the paradigm
Workload variations, delays, interruptions, faults, and/or crashes of entities
Distributed processing community assists when needed
Replication
Duplication of state among selected distributed entities, and the synchronization of that state
Remote communication required to effect synchronization
Reliability
Inherent redundancy across the distributed entities provides fault-tolerance
Consistent synchronized redundancy across N nodes, tolerates up to N-1 node faults
Flexibility
OS has lattitude in degree of exposure to externals
Externals have lattitude in degree of exposure they accept
Coordination of process activity
Where run; Near user?, resources?, avail. CPU?, etc...
Scalability
node expansion
process migration
History
Pioneering inspirations
Connectivity abstraction
Ethernet: distributed packet switching for local computer networks[1]
Memory access abstraction
Intercommunicating Cells, Basis for a Distributed Logic Computer[2] (1962)
The purpose of this paper is to describe an information storage and retrieval system in which logic is distributed throughout the system. The system is made up of cells. Each cell is a small finite state machine which can communicate with its neighboring cells ... The motivation stems from our contention that in the present generation of machines the scheme of locating a quantity of information by its "address" is fundamentally a weak one...
The association of an address with a quantity of information is very much the result of the type of computer organization we now have. Writing a program in machine language, one rarely deals with the quantities of information themselves. A programmer normally must know where these quantities of information are located. He then manipulates their addresses according to the problem at hand. An address in this way often assumes the role of the name of a piece of information...
Process abstraction
The Structure of the "THE"-Multiprogramming[3] (1968)
A multiprogramming system is described in which all activities are divided over a number of sequential processes. These sequential processes are placed at various hierarchical levels, in each of which one or more independent abstractions have been implemented. The hierarchical structure proved to be vital for the verification of the logical soundness of the design and the correctness of its implementation.
Component abstraction
HYDRA:The Kernel of a Multiprocessor Operating System[4] (1974)
The design philosophy of HYDRA ... suggest that, at the heart of the system, one should build a collection of facilities of "universal applicability" and "absolute reliability" -- a set of mechanisms from which an arbitrary set of operating system facilities and policies can be conveniently, flexibly, efficiently, and reliably constructed.
Defining a kernel with all the attributes given above is difficult, and perhaps impractical... It is, nevertheless, the approach taken in the HYDRA system. Although we make no claim either that the set of facilities provided by the HYDRA kernel ... we do believe the set provides primitives which are both necessary and adequate for the construction of a large and interesting class of operating environments. It is our view that the set of functions provided by HYDRA will enable the user of C.mmp to create his own operating environment without being confined to predetermined command and file systems, execution scenarios, resource allocation policies, etc.
Initial composition
The National Software Works: A Distributed Processing System[5] (1975)
The National Software Works (NSW) is a significant new step in the development of distributed processing systems and computer networks. NSW is an ambitious project to link a set of geographically distributed and diverse hosts with an operating system which appears as a single entity to a prospective user.
Complete instantiation
The Rosco Distributed Operating System[6] (1979)
Roscoe is an operating system implemented at the University of Wisconsin that allows a network of microcomputers to cooperate to provide a general-purpose computing facility. The goal of the Roscoe network is to provide a general-purpose computation resource in which individual resources such as files and processors are shared among processes and control is distributed in a non-hierarchical fashion. All processors are identical. Similarly, all processors run the same operating system kernel. However, they may differ in the peripheral units connected to them. No memory is shared between processors. All communication involves messages explicitly passed between physically connected processors. No assumptions are made about the topology of interconnection.
The decision not to use logical or physical sharing of memory for communication is influenced both by the constraints of currently available hardware and by our perception of cost bottlenecks likely to arise as the number of processors increases.
Foundational Work
Coherent memory abstraction
Algorithms for scalable synchronization on shared-memory multiprocessors[7]
A N algorithm for mutual exclusion in decentralized systems[8]
File System abstraction
Measurements of a distributed file system[9]
Memory coherence in shared virtual memory systems[10]
Transaction abstraction
Transactions
Sagas[11]
Transactional Memory
Composable memory transactions[12]
Transactional memory: architectural support for lock-free data structures[13]
Software transactional memory for dynamic-sized data structures[14]
Software transactional memory[15]
Persistence abstraction
OceanStore: an architecture for global-scale persistent storage[16]
Coordinator abvstraction
Weighted voting for replicated data[17]
Consensus in the presence of partial synchrony[18]
Reliability abstraction
Sanity checks
The Byzantine Generals Problem[19]
Fail-stop processors: an approach to designing fault-tolerant computing systems[20]
Recoverability
Distributed snapshots: determining global states of distributed systems[21]
Optimistic recovery in distributed systems[22]
Current Research
replicated model extended to a component object model
Architectural Design of E1 Distributed Operating System[23]
The Cronus distributed operating system[24]
Fine-grained mobility in the emerald system[25]
Design and development of MINIX distributed operating system[26]
Future Directions
System able to provide low-level complexity exposure, in proportion to trust and accepted responsibility
Application performance and flexibility on exokernel systems.[27]
Scale and performance in the Denali isolation kernel.[28]
Infrastructure focus on multi-processor/core
The multikernel: a new OS architecture for scalable multicore systems.[29]
Corey: an Operating System for Many Cores.[30]
System extends a consistent and stable impression of processing over extremes in heterogeneity
Helios: heterogeneous multiprocessing with satellite kernels.[31]
System able to provide effective, stable, and beneficial view of vastly inceased complexity on multiple levels
Tesselation
References
- ^ Metcalfe, R. M. and Boggs, D. R. 1976. Ethernet: distributed packet switching for local computer networks. Commun. ACM 19, 7 (Jul. 1976), 395-404.
- ^ Lee, C. Y. 1962. Intercommunicating cells, basis for a distributed logic computer. In Proceedings of the December 4-6, 1962, Fall Joint Computer Conference (Philadelphia, Pennsylvania, December 04 - 06, 1962). AFIPS '62 (Fall).
- ^ Dijkstra, E. W. 1968. The structure of the “THE”-multiprogramming system. Commun. ACM 11, 5 (May. 1968), 341-346.
- ^ Wulf, W., Cohen, E., Corwin, W., Jones, A., Levin, R., Pierson, C., and Pollack, F. 1974. HYDRA: the kernel of a multiprocessor operating system. Commun. ACM 17, 6 (Jun. 1974), 337-345.
- ^ Millstein, R. E. 1977. The National Software Works: A distributed processing system. In Proceedings of the 1977 Annual Conference ACM '77. ACM, New York, NY, 44-52.
- ^ Solomon, M. H. and Finkel, R. A. 1979. The Roscoe distributed operating system. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 10 - 12, 1979). SOSP '79.
- ^ Mellor-Crummey, J. M. and Scott, M. L. 1991. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst. 9, 1 (Feb. 1991), 21-65.
- ^ Maekawa, M. 1985. A N algorithm for mutual exclusion in decentralized systems. ACM Trans. Comput. Syst. 3, 2 (May. 1985), 145-159.
- ^ Baker, M. G., Hartman, J. H., Kupfer, M. D., Shirriff, K. W., and Ousterhout, J. K. 1991. Measurements of a distributed file system. In Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, October 13 - 16, 1991). SOSP '91. ACM, New York, NY, 198-212.
- ^ Li, K. and Hudak, P. 1989. Memory coherence in shared virtual memory systems. ACM Trans. Comput. Syst. 7, 4 (Nov. 1989), 321-359.
- ^ Garcia-Molina, H. and Salem, K. 1987. Sagas. In Proceedings of the 1987 ACM SIGMOD international Conference on Management of Data (San Francisco, California, United States, May 27 - 29, 1987). U. Dayal, Ed. SIGMOD '87. ACM, New York, NY, 249-259.
- ^ Harris, T., Marlow, S., Peyton-Jones, S., and Herlihy, M. 2005. Composable memory transactions. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (Chicago, IL, USA, June 15 - 17, 2005). PPoPP '05. ACM, New York, NY, 48-60.
- ^ Herlihy, M. and Moss, J. E. 1993. Transactional memory: architectural support for lock-free data structures. In Proceedings of the 20th Annual international Symposium on Computer Architecture (San Diego, California, United States, May 16 - 19, 1993). ISCA '93. ACM, New York, NY, 289-300.
- ^ Herlihy, M., Luchangco, V., Moir, M., and Scherer, W. N. 2003. Software transactional memory for dynamic-sized data structures. In Proceedings of the Twenty-Second Annual Symposium on Principles of Distributed Computing (Boston, Massachusetts, July 13 - 16, 2003). PODC '03. ACM, New York, NY, 92-101.
- ^ Shavit, N. and Touitou, D. 1995. Software transactional memory. In Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing (Ottowa, Ontario, Canada, August 20 - 23, 1995). PODC '95. ACM, New York, NY, 204-213.
- ^ Kubiatowicz, J., Bindel, D., Chen, Y., Czerwinski, S., Eaton, P., Geels, D., Gummadi, R., Rhea, S., Weatherspoon, H., Wells, C., and Zhao, B. 2000. OceanStore: an architecture for global-scale persistent storage. In Proceedings of the Ninth international Conference on Architectural Support For Programming Languages and Operating Systems (Cambridge, Massachusetts, United States). ASPLOS-IX. ACM, New York, NY, 190-201.
- ^ Gifford, D. K. 1979. Weighted voting for replicated data. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 10 - 12, 1979). SOSP '79. ACM, New York, NY, 150-162
- ^ Dwork, C., Lynch, N., and Stockmeyer, L. 1988. Consensus in the presence of partial synchrony. J. ACM 35, 2 (Apr. 1988), 288-323.
- ^ Lamport, L., Shostak, R., and Pease, M. 1982. The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst. 4, 3 (Jul. 1982), 382-401.
- ^ Schlichting, R. D. and Schneider, F. B. 1983. Fail-stop processors: an approach to designing fault-tolerant computing systems. ACM Trans. Comput. Syst. 1, 3 (Aug. 1983), 222-238.
- ^ Chandy, K. M. and Lamport, L. 1985. Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst. 3, 1 (Feb. 1985), 63-75.
- ^ Strom, R. and Yemini, S. 1985. Optimistic recovery in distributed systems. ACM Trans. Comput. Syst. 3, 3
- ^ L.B. Ryzhyk, A.Y. Burtsev. Architectural design of E1 distributed operating system. System Research and Information Technologies international scientific and technical journal, October 2004, Kiev, Ukraine.
- ^ Vinter, S. T. and Schantz, R. E. 1986. The Cronus distributed operating system. In Proceedings of the 2nd Workshop on Making Distributed Systems Work (Amsterdam, Netherlands, September 08 - 10, 1986). EW 2. ACM, New York, NY, 1-3.
- ^ Jul, E., Levy, H., Hutchinson, N., and Black, A. 1987. Fine-grained mobility in the emerald system. In Proceedings of the Eleventh ACM Symposium on Operating Systems Principles (Austin, Texas, United States, November 08 - 11, 1987). SOSP '87. ACM, New York, NY, 105-106.
- ^ Ramesh, K. S. 1988. Design and development of MINIX distributed operating system. In Proceedings of the 1988 ACM Sixteenth Annual Conference on Computer Science (Atlanta, Georgia, United States). CSC '88. ACM, New York, NY, 685.
- ^ M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, Héctor M. Briceño, Russell Hunt, David Mazières, Thomas Pinckney, Robert Grimm, John Jannotti, and Kenneth Mackenzie. In the Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP '97), Saint-Malô, France, October 1997.
- ^ Whitaker, A., Shaw, M., and Gribble, S. D. 2002. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation
- ^ Baumann, A., Barham, P., Dagand, P., Harris, T., Isaacs, R., Peter, S., Roscoe, T., Schüpbach, A., and Singhania, A. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.
- ^ S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, F. Kashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. Proceedings of the 2008 Symposium on Operating Systems Design and Implementation (OSDI), December 2008.
- ^ Nightingale, E. B., Hodson, O., McIlroy, R., Hawblitzel, C., and Hunt, G. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.
External links
- Coming Soon...