Revision as of 08:51, 3 May 2010 edit JLSjr (talk \| contribs) 106 edits No edit summary ← Previous edit		Revision as of 05:32, 6 May 2010 edit undo JLSjr (talk \| contribs) 106 edits No edit summary Next edit →
Line 1: == Headline text == {{Userspace draft\|date=April 2010}} <br /> Line 137 ⟶ 138: Transparency, simply put, is the quality that lets a distributed system be seen and understood as a single-system image. Transparency is the greatest overriding consideration in the high-level conceptual design of a distributed operating system. While a simple concept, the consideration of transparency directly affects decision making in every aspect of the design of a distributed operating system. Depending on the degree to which transparency is implemented in a system, certain requirements and/or restrictions may be imposed upon the many design considerations, and the relationships between them. ===Inter-~~Process~~process ~~Communication~~communication=== Inter-Process Communication (IPC) is the implementation of general communication, process interaction, and data flow between threads and/or processes both within a system node, and between all nodes in a distributed system. The distributed nature of a system's nodes and the multi-level considerations of intra-node and inter-node requirements provide the baseline for high-level IPC design considerations. However, IPC in a distributed operating system is a low-level implementation. IPC is the low-level critical complement to the high-level concept of transparency. Many of the requirements and restrictions imposed on a system as a result of transparency will be accomplished directly or indirectly through IPC. In this sense, IPC is the greatest underlying concept in the low-level design considerations of a distributed operating system. ===Process ~~Management~~management=== Process management provides policies and mechanisms for effective and efficient sharing of a system's distributed processing resources between that system's distributed processes. 
These policies and mechanisms support operations involving the allocation and de-allocation of processes and ports, as well as provisions to run, suspend, migrate, halt, or resume execution of processes, to mention a few. While these distributed system resources and the operations on them can be either local or remote with respect to each other, the distributed operating system must still maintain complete state of and synchronization over all processes in the system, and do so in a manner completely consistent from the user's unified system perspective. As an example, load balancing is a common process management function. One consideration of load balancing is which process should be moved. The kernel may have several mechanisms, one of which might be priority-based choice. This mechanism in the kernel defines '''what can be done'''; in this case, choose a process based on some priority. The system management components would have policies implementing the decision making for this context. One of these policies would define what priority means, and how it is to be used to choose a process in this instance. ===Resource ~~Management~~management=== System resources such as memory, files, devices, etc. are distributed throughout a system, and at any given moment, any of these nodes may have light to idle workloads. Load sharing and load balancing require many policy-oriented decisions, ranging from finding idle CPUs to deciding when to move a process and which process to move. Many algorithms exist to aid in these decisions; however, this calls for a second level of decision-making policy in choosing the algorithm best suited for the scenario, and the conditions surrounding the scenario. Line 160 ⟶ 161: Flexibility in a distributed system is made possible through the modular characteristics of the microkernel. 
With the microkernel presenting a minimal, but complete, set of primitives and basic functionally cohesive services, the higher-level management components can be composed in a similar functionally cohesive manner. This capability leads to exceptional flexibility in the collection of management components; but more importantly, it allows the opportunity to dynamically swap, upgrade, or install additional components above the kernel. ==Transparency ~~Responsibilities~~responsibilities== '''Transparency''' is a property of a system or application that allows a user to accomplish an objective with little, if any, knowledge of the particular internal details related to the objective. A system or application may expose as much or as little transparency in a given area of functionality as deemed necessary. That is to say, the degree to which transparency is implemented can vary between subsets of functionality in a system or application. There are many specific areas of a system that can benefit from transparency: access, location, performance, naming, and migration, to name a few. For example, a distributed operating system may present access to a hard drive as "C:" and access to a DVD as "G:". The user does not require any knowledge of device drivers or of the direct memory access techniques possibly used behind the scenes; both devices work the same way from the user's perspective. This example demonstrates a high level of transparency, and displays how low-level details are made somewhat "invisible" to the user through transparency. On the other hand, if a user desires to access another system or server, a host name or IP address may be required, along with a remote-machine user login and password. This would indicate a low degree of transparency, as there is much more detailed knowledge required of the user in order to accomplish this task. 
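The drive-letter example can be illustrated with a minimal Python sketch. It is not taken from the article or from any real operating system: the names `Device`, `HardDisk`, `DvdDrive`, and `Namespace` are hypothetical, and the in-memory dictionaries merely stand in for real device drivers. The point is only that the user addresses both devices through one uniform call, while the device-specific details stay behind the interface.

```python
from abc import ABC, abstractmethod


class Device(ABC):
    """Uniform interface the user sees; driver details stay hidden behind it."""

    @abstractmethod
    def read(self, path):
        """Return the bytes stored at `path` on this device."""


class HardDisk(Device):
    def __init__(self, files):
        self._files = dict(files)

    def read(self, path):
        # Stand-in for a disk driver (possibly using DMA behind the scenes).
        return self._files[path]


class DvdDrive(Device):
    def __init__(self, files):
        self._files = dict(files)

    def read(self, path):
        # Stand-in for an optical-drive driver; the caller sees no difference.
        return self._files[path]


class Namespace:
    """Maps user-visible names such as "C:" to devices (hypothetical helper)."""

    def __init__(self):
        self._mounts = {}

    def mount(self, name, device):
        self._mounts[name] = device

    def read(self, full_path):
        # Split "C:/notes.txt" into the mount name and the path on that device.
        name, _, path = full_path.partition("/")
        return self._mounts[name].read(path)


system = Namespace()
system.mount("C:", HardDisk({"notes.txt": b"from the hard drive"}))
system.mount("G:", DvdDrive({"readme.txt": b"from the DVD"}))

# The same call works for both devices.
print(system.read("C:/notes.txt"))
print(system.read("G:/readme.txt"))
```

In the low-transparency case described above, the host name, login, and password would instead appear as explicit arguments that the user must supply on every access.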
===Location Transparency===▼ System should create and maintain the user's perception and understanding of the entirety of the system, its devices, and resources as local entities. At no point in any user's system experience should there exist any expectation of any user to be Generally, transparency and user-required knowledge form an inverse relation. As transparency is designed and implemented into various areas of a system, great care must be taken not to adversely affect other areas of transparency and other basic design concerns. Transparency, as a design concept, is one of the grand challenges in the design of a distributed operating system, as it is a factor in the necessity for a complete upfront understanding. ▲===Location ~~Transparency~~transparency=== '''Location transparency''' comprises two distinct aspects, naming and user mobility. '''Naming transparency''' requires that nothing in the physical or logical references to an entity should expose any indication of the entity's location. '''User mobility''' requires consistent referencing of an entity regardless of its location within the system. These two, naming transparency and user mobility, work together to remove the need for a user's knowledge regarding specific entities' details within a system. ===Access ~~Transparency~~transparency===▼ Local and remote resources should remain indistinguishable through user interface system calls. The distributed operating system maintains a user's perception of these entities in a clean, clear, and consistent manner. ▲===Access Transparency=== System entities or processes maintain a consistent access/entry mechanism, regardless of being local or remote. ===Migration ~~Transparency~~transparency=== Resources and processes can be migrated, without user knowledge, by the system to another node in an attempt to maximize efficiency, reliability, and security. 
This requires policy decision-making abilities and naming stability; in the event of a process migration, all IPC messages must be received or held pending the migration. ===Replication ~~Transparency~~transparency=== System entities can be copied to strategic points in the system to increase efficiency through better proximity, and also to provide improved reliability through the distributed replication as a back-up; prompted by dynamic stratagem. ===Concurrency ~~Transparency~~transparency=== System should possess and exhibit properties to allow multiple simultaneous uses of system resources between users who are kept unaware of the concurrent usage. Required properties are synchronization mechanisms to keep events ordered and consistent, mutual-exclusivity management for resources, and sufficient capabilities to detect and recover from both starvation and deadlock. ===Parallel ~~Transparency~~transparency=== System should have stable performance characteristics, regardless of whether some nodes increase rapidly in workload, through properties of migration, replication, and concurrency. This requires an intelligent policy decision stratagem to facilitate the timely and accurate allocation, migration, and disposition of resources. ===Failure ~~Transparency~~transparency=== The system should shield users from the knowledge of and the effects resulting from failures. In the event of a partial failure, the system is responsible for rapid and accurate detection and orchestration of a remedy with little, if any, imposition on users. These methods can range from static proactive posturing to dynamic and more flexible response mechanisms. Line 185 ⟶ 193: System should create and maintain a reasonable, stable, and predictable performance expectation for the user, that is both resilient to and helpful in situations where parts of the system may experience significant delay or even failure. 
While reasonableness and predictability are important, there should be no inherent expectation or expressed indication of fairness or equality. ===Name ~~Transparency~~transparency=== All system entities should maintain a complete decoupling of entity naming from any spatial or temporal location, as well as from any other system entity. ===Size/Scale ~~Transparency~~transparency=== A user's experience or perception of their system should remain stable and consistent in the face of system extension, scaling, or waning due to failure. ===Revision ~~Transparency~~transparency=== System users should be completely oblivious to system-software version changes and changes in internal implementation of system infrastructure. While a user may become aware of, or discover the availability of, a new function or service, the implementation or alteration of the system's internal structure should in no way be the prompt for this discovery. ===Control ~~Transparency~~transparency=== All system constants, properties, configuration settings, etc. should be completely consistent in appearance, connotation, and denotation to all users and software applications aware of them. ===Data ~~Transparency~~transparency=== No system data-entity should expose itself as peculiar when required to interact remotely. ==Historical ~~Perspectives~~perspectives== === Pioneering inspirations === Line 205 ⟶ 213: With a cursory glance around the internet, or a modest perusal of pertinent writings, one could very easily gain the notion that computer operating systems were a new phenomenon in the mid-twentieth century. In fact, important research in operating systems was being conducted at this time.<ref>Dreyfuss, P. 1958. System design of the Gamma 60. In Proceedings of the May 6-8, 1958, Western Joint Computer Conference: Contrasts in Computers (Los Angeles, California, May 06 - 08, 1958). IRE-ACM-AIEE '58 (Western). ACM, New York, NY, 130-133. </ref><ref>Leiner, A. L., Notz, W. A., Smith, J. 
L., and Weinberger, A. 1958. Organizing a network of computers to meet deadlines. In Papers and Discussions Presented At the December 9-13, 1957, Eastern Joint Computer Conference: Computers with Deadlines To Meet (Washington, D.C., December 09 - 13, 1957). IRE-ACM-AIEE '57</ref><ref>Leiner, A. L., Smith, J. L., Notz, W. A., and Weinberger, A. 1958. PILOT, the NBS multicomputer system. In Papers and Discussions Presented At the December 3-5, 1958, Eastern Joint Computer Conference: Modern Computers: Objectives, Designs, Applications (Philadelphia, Pennsylvania, December 03 - 05, 1958). AIEE-ACM-IRE '58 (Eastern). ACM, New York, NY, 71-75.</ref><ref>Bauer, W. F. 1958. Computer design from the programmer's viewpoint. In Papers and Discussions Presented At the December 3-5, 1958, Eastern Joint Computer Conference: Modern Computers: Objectives, Designs, Applications (Philadelphia, Pennsylvania, December 03 - 05, 1958). AIEE-ACM-IRE '58 (Eastern). ACM, New York, NY, 46-51.</ref><ref>Leiner, A. L., Notz, W. A., Smith, J. L., and Weinberger, A. 1959. PILOT—A New Multiple Computer System. J. ACM 6, 3 (Jul. 1959), 313-335. </ref><ref>Estrin, G. 1960. Organization of computer systems: the fixed plus variable structure computer. In Papers Presented At the May 3-5, 1960, Western Joint IRE-AIEE-ACM Computer Conference (San Francisco, California, May 03 - 05, 1960). IRE-AIEE-ACM '60 (Western). ACM, New York, NY, 33-40.</ref> While early exploration into operating systems took place in the years leading to 1950, highly advanced research began shortly afterward on new systems to conquer new problems. In the first decade of the second half of the [[20th century]], many new questions were asked, many new problems were identified, and many solutions were developed and ran for years in controlled production environments. ==== Aboriginal ~~Distributed~~distributed ~~Computing~~computing ==== '''The DYSEAC'''<ref>Leiner, A. L. 1954. System Specifications for the DYSEAC. J. 
ACM 1, 2 (Apr. 1954), 57-81.</ref> (1954) Line 270 ⟶ 278: ''The decision not to use logical or physical sharing of memory for communication is influenced both by the constraints of currently available hardware and by our perception of cost bottlenecks likely to arise as the number of processors increases. ''</font> === Foundational ~~Work~~work === ==== Coherent memory abstraction ==== Line 306 ⟶ 314: <br />{{pad\|4em}}'''Optimistic recovery in distributed systems'''<ref>Strom, R. and Yemini, S. 1985. Optimistic recovery in distributed systems. ACM Trans. Comput. Syst. 3, 3 </ref> === Current ~~Research~~research === ==== replicated model extended to a component object model ==== Line 314 ⟶ 322: <br />{{pad\|2em}}Design and development of MINIX distributed operating system<ref>Ramesh, K. S. 1988. Design and development of MINIX distributed operating system. In Proceedings of the 1988 ACM Sixteenth Annual Conference on Computer Science (Atlanta, Georgia, United States). CSC '88. ACM, New York, NY, 685.</ref> === Future ~~Directions~~directions === ==== ~~Systems able to provide low-level complexity~~Complexity/Trust exposure~~, in proportion to trust~~ ~~and~~through accepted responsibility ==== ~~{{pad\|2em}}~~:Application performance and flexibility on exokernel systems.<ref>M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, Héctor M. Briceño, Russell Hunt, David Mazières, Thomas Pinckney, Robert Grimm, John Jannotti, and Kenneth Mackenzie. In the Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP '97), Saint-Malô, France, October 1997.</ref> ~~<br />{{pad\|2em}}~~:Scale and performance in the Denali isolation kernel.<ref>Whitaker, A., Shaw, M., and Gribble, S. D. 2002. 
In Proceedings of the 5th Symposium on Operating Systems Design and Implementation</ref> ==== ~~Infrastructures~~Multi/Many-core focused ~~on multi-processor/core processing~~systems ==== ~~{{pad\|2em}}~~:The multikernel: a new OS architecture for scalable multicore systems.<ref>Baumann, A., Barham, P., Dagand, P., Harris, T., Isaacs, R., Peter, S., Roscoe, T., Schüpbach, A., and Singhania, A. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.</ref> ~~<br />{{pad\|2em}}~~:Corey: an Operating System for Many Cores.<ref>S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, F. Kaashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. Proceedings of the 2008 Symposium on Operating Systems Design and Implementation (OSDI), December 2008.</ref> ==== ~~Systems extending a consistent and stable impression of distributed~~Distributed processing over extremes in heterogeneity ==== ~~{{pad\|2em}}~~:Helios: heterogeneous multiprocessing with satellite kernels.<ref>Nightingale, E. B., Hodson, O., McIlroy, R., Hawblitzel, C., and Hunt, G. 2009. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (Big Sky, Montana, USA, October 11 - 14, 2009). SOSP '09.</ref> ==== ~~Systems~~Effective ~~able to provide effective,~~and stable, ~~and~~in ~~beneficial~~multiple ~~views~~levels of ~~vastly increased~~ complexity ~~on multiple levels~~ ==== ~~{{pad\|2em}}~~:Tessellation == See Also == Line 337 ⟶ 345: {{Reflist}} == Further ~~Reading~~reading == * Coming Soon...

Distributed operating system: Difference between revisions