Distributed operating system: Difference between revisions

JLSjr (talk | contribs)
No edit summary
JLSjr (talk | contribs)
No edit summary
Line 98:
==Major Design Considerations==
===Transparency===
'''Transparency''', simply put, is the quality of a distributed operating system to be seen and understood as a '''single-system image''', and is the greatest overriding consideration in the high-level conceptual design of a distributed operating system. While a simple concept, the consideration of transparency directly affects decision making in every aspect of the design of a distributed operating system. Depending on the degree to which transparency is implemented in a system, certain requirements and/or restrictions may be imposed upon the many other design considerations and their relationships.
 
'''Transparency''' is a property of a system or application that allows a user to accomplish a system-related objective with little, if any, knowledge of the particular internal details related to the objective. A system or application may expose as much or as little transparency in a given area of functionality as deemed necessary. That is to say, the degree to which transparency is implemented can vary between subsets of functionality in a system or application. Many specific areas of a system can benefit from transparency: access, location, performance, naming, and migration, to name a few.

For example, a distributed operating system may present access to a hard drive as "C:" and access to a DVD as "G:". The user does not require any knowledge of device drivers or of the direct memory access techniques possibly used behind the scenes; both devices work the same way from the user's perspective. This example demonstrates a high level of transparency, and shows how low-level details are made somewhat "invisible" to the user through transparency. On the other hand, if a user desires to access another system or server, a host name or IP address may be required, along with a remote-machine user login and password. This would indicate a low degree of transparency, as detailed knowledge is required of the user in order to accomplish this task.

Generally, transparency and user-required knowledge form an inverse relation. As transparency is designed and implemented into various areas of a system, great care must be taken not to adversely affect other areas of transparency and other basic design concerns. Transparency, as a design concept, is one of the grand challenges in the design of a distributed operating system, since it is one reason a complete upfront understanding of the design is necessary.
 
*'''Location transparency''' - Location transparency comprises two distinct aspects, naming and user mobility. '''Naming transparency''' requires that nothing in the physical or logical references to an entity expose any indication of the entity's location. '''User mobility''' requires consistent referencing of an entity regardless of its location within the system. Together, these two related concepts remove the need for a user to know the specific details of entities within a system.

*'''Access transparency''' - Local and remote resources should remain indistinguishable through user interface system calls. The distributed operating system maintains a user's perception of these entities in a clean, clear, and consistent manner. System entities or processes maintain a consistent access/entry mechanism, regardless of whether they are local or remote.

*'''Migration transparency''' - Resources and processes can be migrated by the system to another node, without the user's knowledge, in an attempt to maximize efficiency, reliability, and security. This requires policy decision-making abilities and naming stability; in the event of a process migration, all IPC messages must be received or held pending the migration.
*'''Replication transparency''' - System entities can be copied to strategic points in the system to increase efficiency through better proximity, and to improve reliability by using the distributed replicas as back-ups; replication is prompted by a dynamic strategy.

*'''Concurrency transparency''' - The system should possess and exhibit properties that allow multiple simultaneous uses of system resources among users who are kept unaware of the concurrent usage. Required properties are synchronization mechanisms to keep events ordered and consistent, mutual-exclusion management for resources, and sufficient capabilities to detect and recover from both starvation and deadlock.

*'''Parallel transparency''' - The system should have stable performance characteristics, even if some nodes increase rapidly in workload, through properties of migration, replication, and concurrency. This requires an intelligent policy decision strategy to facilitate the timely and accurate allocation, migration, and disposition of resources.

*'''Failure transparency''' - The system should shield users from the knowledge of, and the effects resulting from, failures. In the event of a partial failure, the system is responsible for rapid and accurate detection and orchestration of a remedy with little, if any, imposition on users. These methods can range from static proactive posturing to dynamic and more flexible response mechanisms.

*'''Performance transparency''' - The system should create and maintain a reasonable, stable, and predictable performance expectation for the user, one that is both resilient to and helpful in situations where parts of the system may experience significant delay or even failure. While reasonableness and predictability are important, there should be no inherent expectation or expressed indication of fairness or equality.

*'''Name transparency''' - All system entities should maintain a complete decoupling of entity naming from any spatial or temporal location, as well as from any other system entity.
 
*'''Size/Scale transparency''' - A user's experience or perception of their system should remain stable and consistent in the face of system extension, scaling, or waning due to failure.
 
*'''Revision transparency''' - System users should be completely oblivious to system-software version changes and changes in the internal implementation of system infrastructure. While a user may become aware of, or discover, the availability of a new function or service, the implementation or alteration of the system's internal structure should in no way be the prompt for this discovery.
 
*'''Control transparency''' - All system constants, properties, configuration settings, etc. should be completely consistent in appearance, connotation, and denotation to all users and software applications aware of them.
 
*'''Data transparency''' - No system data-entity should expose itself as peculiar when required to interact remotely.
 
===Inter-process communication===
Line 122 ⟶ 154:
===Flexibility===
Flexibility in a distributed operating system is made possible through the modular characteristics of the microkernel. With the microkernel presenting a minimal, but complete, set of primitives and basic functionally cohesive services, the higher-level management components can be composed in a similar functionally cohesive manner. This capability leads to exceptional flexibility in the collection of management components; more importantly, it allows components above the kernel to be dynamically swapped, upgraded, or added.
 
==Transparency responsibilities==
'''Transparency''' is a property of a system or application that allows a user to accomplish an objective with little, if any, knowledge of the particular internal details related to the objective. A system or application may expose as much or as little transparency in a given area of functionality as deemed necessary. That is to say, the degree to which transparency is implemented can vary between subsets of functionality in a system or application. Many specific areas of a system can benefit from transparency: access, location, performance, naming, and migration, to name a few.

For example, a distributed operating system may present access to a hard drive as "C:" and access to a DVD as "G:". The user does not require any knowledge of device drivers or of the direct memory access techniques possibly used behind the scenes; both devices work the same way from the user's perspective. This example demonstrates a high level of transparency, and shows how low-level details are made somewhat "invisible" to the user through transparency. On the other hand, if a user desires to access another system or server, a host name or IP address may be required, along with a remote-machine user login and password. This would indicate a low degree of transparency, as detailed knowledge is required of the user in order to accomplish this task.

Generally, transparency and user-required knowledge form an inverse relation. As transparency is designed and implemented into various areas of a system, great care must be taken not to adversely affect other areas of transparency and other basic design concerns. Transparency, as a design concept, is one of the grand challenges in the design of a distributed operating system, since it is one reason a complete upfront understanding of the design is necessary.
 
===Location transparency===
'''Location transparency''' comprises two distinct aspects, naming and user mobility. '''Naming transparency''' requires that nothing in the physical or logical references to an entity expose any indication of the entity's location. '''User mobility''' requires consistent referencing of an entity regardless of its location within the system. Together, these two related concepts remove the need for a user to know the specific details of entities within a system.
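
The idea can be illustrated with a minimal sketch. The <code>NameService</code> class, the entity name, and the node identifiers below are hypothetical and are not drawn from any particular system; the sketch only shows a location-free name that continues to resolve after the entity moves.

```python
class NameService:
    """Hypothetical registry mapping location-free names to current nodes."""

    def __init__(self):
        self._locations = {}  # entity name -> node identifier

    def register(self, name, node):
        self._locations[name] = node

    def move(self, name, new_node):
        # The entity changes node; the name users hold does not change.
        if name not in self._locations:
            raise KeyError(name)
        self._locations[name] = new_node

    def resolve(self, name):
        return self._locations[name]


names = NameService()
names.register("reports/q3", "node-a")   # the name carries no node information
before = names.resolve("reports/q3")
names.move("reports/q3", "node-b")       # relocation is invisible to the name
after = names.resolve("reports/q3")
```

The user refers to <code>reports/q3</code> throughout; only the name service knows which node currently holds it.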
 
===Access transparency===
Local and remote resources should remain indistinguishable through user interface system calls. The distributed operating system maintains a user's perception of these entities in a clean, clear, and consistent manner.
 
System entities or processes maintain a consistent access/entry mechanism, regardless of whether they are local or remote.
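
A minimal sketch of a uniform access mechanism follows. The class names are hypothetical, and the "remote" resource uses a plain callable as a stand-in for a network request; the point is only that calling code cannot tell the two apart.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Hypothetical common interface for local and remote resources."""

    @abstractmethod
    def read(self) -> bytes:
        ...


class LocalResource(Resource):
    def __init__(self, data: bytes):
        self._data = data

    def read(self) -> bytes:
        return self._data


class RemoteResource(Resource):
    def __init__(self, fetch):
        self._fetch = fetch  # stand-in for a network request (assumption)

    def read(self) -> bytes:
        return self._fetch()


def show(resource: Resource) -> str:
    # The caller uses the same call whether the resource is local or remote.
    return resource.read().decode("utf-8")


local = LocalResource(b"hello")
remote = RemoteResource(lambda: b"hello")
```

Here <code>show(local)</code> and <code>show(remote)</code> are invoked identically, which is the property the section describes.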
 
===Migration transparency===
Resources and processes can be migrated by the system to another node, without the user's knowledge, in an attempt to maximize efficiency, reliability, and security. This requires policy decision-making abilities and naming stability; in the event of a process migration, all IPC messages must be received or held pending the migration.
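
The requirement that IPC messages be held pending a migration can be sketched as follows. The <code>MigratableProcess</code> class is a hypothetical single-machine model, not a real migration mechanism: messages sent during the migration window are queued and delivered in order once the process is running on its new node.

```python
from collections import deque


class MigratableProcess:
    """Hypothetical process model that buffers messages while migrating."""

    def __init__(self, node):
        self.node = node
        self.inbox = []           # messages delivered to the process
        self._pending = deque()   # messages held during migration
        self._migrating = False

    def send(self, message):
        if self._migrating:
            self._pending.append(message)   # hold, do not drop
        else:
            self.inbox.append(message)

    def begin_migration(self):
        self._migrating = True

    def finish_migration(self, new_node):
        self.node = new_node
        self._migrating = False
        while self._pending:                # deliver held messages in order
            self.inbox.append(self._pending.popleft())


proc = MigratableProcess("node-a")
proc.send("m1")
proc.begin_migration()
proc.send("m2")
proc.send("m3")
delivered_during_migration = len(proc.inbox)
proc.finish_migration("node-b")
```

Senders call <code>send</code> the same way before, during, and after the move, so the migration is not visible to them.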
 
===Replication transparency===
System entities can be copied to strategic points in the system to increase efficiency through better proximity, and to improve reliability by using the distributed replicas as back-ups; replication is prompted by a dynamic strategy.
 
===Concurrency transparency===
The system should possess and exhibit properties that allow multiple simultaneous uses of system resources among users who are kept unaware of the concurrent usage. Required properties are synchronization mechanisms to keep events ordered and consistent, mutual-exclusion management for resources, and sufficient capabilities to detect and recover from both starvation and deadlock.
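
As a small illustration of mutual-exclusion management, the sketch below uses Python's standard <code>threading.Lock</code> on a single machine; the <code>SharedCounter</code> class is hypothetical, and a distributed system would need a distributed equivalent of the lock. Each user simply calls <code>increment</code> and is unaware of the others.

```python
import threading


class SharedCounter:
    """Hypothetical shared resource guarded by a mutual-exclusion lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:          # only one user updates at a time
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


counter = SharedCounter()


def worker():
    for _ in range(1000):
        counter.increment()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = counter.value
```

The sketch covers only mutual exclusion; detecting and recovering from starvation or deadlock is outside its scope.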
 
===Parallel transparency===
The system should have stable performance characteristics, even if some nodes increase rapidly in workload, through properties of migration, replication, and concurrency. This requires an intelligent policy decision strategy to facilitate the timely and accurate allocation, migration, and disposition of resources.
===Failure transparency===
The system should shield users from the knowledge of, and the effects resulting from, failures. In the event of a partial failure, the system is responsible for rapid and accurate detection and orchestration of a remedy with little, if any, imposition on users. These methods can range from static proactive posturing to dynamic and more flexible response mechanisms.
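
One simple dynamic response is failover between replicas, sketched below. The function and node names are hypothetical, and each node is modeled as a plain callable that either returns a result or raises <code>ConnectionError</code>; this is an illustration of the idea, not a complete recovery mechanism.

```python
def call_with_failover(replicas, request):
    """Try each replica in turn; raise only if every one of them fails."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            last_error = exc   # hide the partial failure, try the next replica
    raise RuntimeError("all replicas failed") from last_error


def failed_node(request):
    # Hypothetical node that is currently unreachable.
    raise ConnectionError("node unreachable")


def healthy_node(request):
    # Hypothetical node that serves the request.
    return f"ok:{request}"


result = call_with_failover([failed_node, healthy_node], "read x")
```

The caller receives a normal result and never observes that the first node failed; an error surfaces only when no replica can serve the request.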
 
===Performance transparency===
The system should create and maintain a reasonable, stable, and predictable performance expectation for the user, one that is both resilient to and helpful in situations where parts of the system may experience significant delay or even failure. While reasonableness and predictability are important, there should be no inherent expectation or expressed indication of fairness or equality.
 
===Name transparency===
All system entities should maintain a complete decoupling of entity naming from any spatial or temporal location, as well as from any other system entity.
===Size/Scale transparency===
A user's experience or perception of their system should remain stable and consistent in the face of system extension, scaling, or waning due to failure.
===Revision transparency===
System users should be completely oblivious to system-software version changes and changes in the internal implementation of system infrastructure. While a user may become aware of, or discover, the availability of a new function or service, the implementation or alteration of the system's internal structure should in no way be the prompt for this discovery.
 
===Control transparency===
All system constants, properties, configuration settings, etc. should be completely consistent in appearance, connotation, and denotation to all users and software applications aware of them.
===Data transparency===
No system data-entity should expose itself as peculiar when required to interact remotely.
 
 
==Historical perspectives==
 
===Pioneering inspirations===
 
With a cursory glance around the internet, or a modest perusal of pertinent writings, one could very easily gain the notion that computer operating systems were a new phenomenon in the mid-twentieth century. In fact, important research in operating systems was being conducted at this time.<ref>Dreyfuss, P. 1958. System design of the Gamma 60. In Proceedings of the May 6-8, 1958, Western Joint Computer Conference: Contrasts in Computers (Los Angeles, California, May 06 - 08, 1958). IRE-ACM-AIEE '58 (Western). ACM, New York, NY, 130-133. </ref><ref>Leiner, A. L., Notz, W. A., Smith, J. L., and Weinberger, A. 1958. Organizing a network of computers to meet deadlines. In Papers and Discussions Presented At the December 9-13, 1957, Eastern Joint Computer Conference: Computers with Deadlines To Meet (Washington, D.C., December 09 - 13, 1957). IRE-ACM-AIEE '57</ref><ref>Leiner, A. L., Smith, J. L., Notz, W. A., and Weinberger, A. 1958. PILOT, the NBS multicomputer system. In Papers and Discussions Presented At the December 3-5, 1958, Eastern Joint Computer Conference: Modern Computers: Objectives, Designs, Applications (Philadelphia, Pennsylvania, December 03 - 05, 1958). AIEE-ACM-IRE '58 (Eastern). ACM, New York, NY, 71-75.</ref><ref>Bauer, W. F. 1958. Computer design from the programmer's viewpoint. In Papers and Discussions Presented At the December 3-5, 1958, Eastern Joint Computer Conference: Modern Computers: Objectives, Designs, Applications (Philadelphia, Pennsylvania, December 03 - 05, 1958). AIEE-ACM-IRE '58 (Eastern). ACM, New York, NY, 46-51.</ref><ref>Leiner, A. L., Notz, W. A., Smith, J. L., and Weinberger, A. 1959. PILOT—A New Multiple Computer System. J. ACM 6, 3 (Jul. 1959), 313-335. </ref><ref>Estrin, G. 1960. Organization of computer systems: the fixed plus variable structure computer. In Papers Presented At the May 3-5, 1960, Western Joint IRE-AIEE-ACM Computer Conference (San Francisco, California, May 03 - 05, 1960). IRE-AIEE-ACM '60 (Western). 
ACM, New York, NY, 33-40.</ref> While early exploration into operating systems took place in the years leading up to 1950, highly advanced research began shortly afterward on new systems to conquer new problems. In the first decade of the second half of the [[20th century]], many new questions were asked, many new problems were identified, and many solutions were developed and ran for years in controlled production environments.
 
Line 220 ⟶ 204:
 
{{quote|We wanted to present here the basic ideas of a distributed logic system with... the macroscopic concept of logical design, away from scanning, from searching, from addressing, and from counting, is equally important. We must, at all cost, free ourselves from the burdens of detailed local problems which only befit a machine low on the evolutionary scale of machines.|Chung-Yeol (C. Y.) Lee|''Intercommunicating Cells, Basis for a Distributed Logic Computer''}}
 
====Component abstraction====
'''HYDRA: The Kernel of a Multiprocessor Operating System'''<ref>Wulf, W., Cohen, E., Corwin, W., Jones, A., Levin, R., Pierson, C., and Pollack, F. 1974. HYDRA: the kernel of a multiprocessor operating system. Commun. ACM 17, 6 (Jun. 1974), 337-345.</ref> (1974)
<br />
<font color="red">''The design philosophy of HYDRA ... suggest that, at the heart of the system, one should build a collection of facilities of "universal applicability" and "absolute reliability" -- a set of mechanisms from which an arbitrary set of operating system facilities and policies can be conveniently, flexibly, efficiently, and reliably constructed.''
<br />
''Defining a kernel with all the attributes given above is difficult, and perhaps impractical... It is, nevertheless, the approach taken in the HYDRA system. Although we make no claim either that the set of facilities provided by the HYDRA kernel ... we do believe the set provides primitives which are both necessary and adequate for the construction of a large and interesting class of operating environments. It is our view that the set of functions provided by HYDRA will enable the user of C.mmp to create his own operating environment without being confined to predetermined command and file systems, execution scenarios, resource allocation policies, etc.''</font>
 
====Initial composition====
'''The National Software Works: A Distributed Processing System'''<ref>Millstein, R. E. 1977. The National Software Works: A distributed processing system. In Proceedings of the 1977 Annual Conference ACM '77. ACM, New York, NY, 44-52.</ref> (1975)
 
<font color="red">''The National Software Works (NSW) is a significant new step in the development of distributed processing systems and computer networks. NSW is an ambitious project to link a set of geographically distributed and diverse hosts with an operating system which appears as a single entity to a prospective user.''</font>
 
====Complete instantiation====
'''The Roscoe Distributed Operating System'''<ref>Solomon, M. H. and Finkel, R. A. 1979. The Roscoe distributed operating system. In Proceedings of the Seventh ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 10 - 12, 1979). SOSP '79.</ref> (1979)
 
<font color="red">''Roscoe is an operating system implemented at the University of Wisconsin that allows a network of microcomputers to cooperate to provide a general-purpose computing facility. The goal of the Roscoe network is to provide a general-purpose computation resource in which individual resources such as files and processors are shared among processes and control is distributed in a non-hierarchical fashion. All processors are identical. Similarly, all processors run the same operating system kernel. However, they may differ in the peripheral units connected to them. No memory is shared between processors. All communication involves messages explicitly passed between physically connected processors. No assumptions are made about the topology of interconnection.''
 
''The decision not to use logical or physical sharing of memory for communication is influenced both by the constraints of currently available hardware and by our perception of cost bottlenecks likely to arise as the number of processors increases. ''</font>
 
===Foundational work===
 
====Coherent memory abstraction====
{{pad|2em}}'''Algorithms for scalable synchronization on shared-memory multiprocessors'''<ref>Mellor-Crummey, J. M. and Scott, M. L. 1991. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst. 9, 1 (Feb. 1991), 21-65.</ref>