Distributed operating system: Difference between revisions

Content deleted Content added
JLSjr (talk | contribs)
No edit summary
JLSjr (talk | contribs)
No edit summary
Line 34:
== Overview ==
=== The kernel ===
The kernel is a minimal, but complete, set of node-level utilities necessary for access to a node’s underlying hardware and resources. These mechanisms provide the complete set of “building blocks” essential for node operation; mainly low-level allocation, management, and disposition of a node’s resources, processes, communication, and I/O management support functions.<ref name="COS">P. Brinch Hansen, Ed. 2000 Classic Operating Systems: from Batch Processing to Distributed Systems. Springer-Verlag New York, Inc.</ref> These functions are made possible by exposing a concise, yet comprehensive, array of primitive mechanisms and services. The kernel is arguably the primary consideration in a distributed operating system; however, within the kernel, the subject of foremost importance is a well-structured and highly efficient communications sub-system.<ref name="TLD"/>
 
In a distributed operating system, the kernel is often defined by a relatively to absolutely minimal architecture. A kernel of this design is referred to as a microkernel.<ref>Using LOTOS for specifying the CHORUS distributed operating system kernel Pecheur, C. 1992. Using LOTOS for specifying the CHORUS distributed operating system kernel. Comput. Commun. 15, 2 (Mar. 1992), 93-102.</ref> <ref>COOL: kernel support for object-oriented environments Habert, S. and Mosseri, L. 1990. COOL: kernel support for object-oriented environments. In Proceedings of the European Conference on Object-Oriented Programming on Object-Oriented Programming Systems, Languages, and Applications (Ottawa, Canada). OOPSLA/ECOOP '90. ACM, New York, NY, 269-275.</ref> The microkernel usually contains only the mechanisms and services which, if removed, would render a node or the global system functionally inoperable. The minimal nature of the microkernel strongly enhances a distributed operating system’s modular potential.<ref name="DCD">Distributed Operating Systems: Concepts and Design Sinha, P. K. 1996 Distributed Operating Systems: Concepts and Design. 1st. Wiley-IEEE Press.</ref> It is generally the case that the microkernel is implemented directly above its node’s hardware and resources; it is also common for a kernel to be identically replicated over all the nodes in a system.<ref name="DCP">Distributed Operating Systems Galli, D. L. 1999 Distributed Operating Systems: Concepts and Practice. 1st. Prentice Hall PTR.</ref> The combination of a microkernel’s minimal design and ubiquitous node coverage enhances the global system's extensibility, and the ability to dynamically introduce new nodes or services.<ref name="DSA">Distributed Operating Systems and Algorithms Chow, R. and Chow, Y. 1997 Distributed Operating Systems and Algorithms. Addison-Wesley Longman Publishing Co., Inc.</ref>
 
[[Image:System Management Components.PNG|thumbnail|right|175px|alt=General overview of system management components that reside above the microkernel.|System management components overview]]
=== System management components ===
A node’s system management components are a collection of software server processes that define the policies of a system node. These components are the composite of a node’s system software not directly required within the kernel. These software services support all of the needs of the node; namely communication, process and resource management, reliability, performance, security, scalability, and heterogeneity, to mention just a few. In this capacity, the system management components compare directly to the centralized operating software of a single-entity system.<ref name="TLD"/>
 
However, these system management components face additional challenges with respect to supporting a node's responsibilities to the global system. In addition, the system management components accept the defensive responsibilities of reliability, availability, and persistence inherent to the distributed operating system. Quite often, any effort to realize a high level of success in a particular area incites conflict with similar efforts in other areas. Therefore, a consistent approach, a balanced perspective, and a deep understanding of the overall system and its goals can help mitigate some complexity, and assist in quickly identifying potential points of diminishing returns. This is an example of why the separation of policy and mechanism is so critical.<ref name="DSA"/>
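The separation of policy and mechanism can be illustrated with a minimal sketch. All names here (<code>Dispatcher</code>, <code>fifo_policy</code>, <code>shortest_name_policy</code>) are hypothetical and are not taken from any cited system; the sketch only assumes that a kernel-like mechanism hands out ready tasks while a replaceable component decides which task is chosen.

```python
from typing import Callable, List, Optional

# A policy is any function that picks one task from the ready list.
# (Hypothetical type for illustration only.)
Policy = Callable[[List[str]], str]


class Dispatcher:
    """Mechanism: stores ready tasks and hands one out on request.

    It never decides *which* task runs; that choice belongs to the policy.
    """

    def __init__(self, policy: Policy) -> None:
        self._ready: List[str] = []
        self._policy = policy

    def add(self, task: str) -> None:
        self._ready.append(task)

    def next_task(self) -> Optional[str]:
        if not self._ready:
            return None
        chosen = self._policy(self._ready)  # the policy decides
        self._ready.remove(chosen)          # the mechanism acts
        return chosen


def fifo_policy(ready: List[str]) -> str:
    """Policy: run tasks in arrival order."""
    return ready[0]


def shortest_name_policy(ready: List[str]) -> str:
    """Policy: an arbitrary stand-in for 'cheapest task first'."""
    return min(ready, key=len)
```

Swapping <code>fifo_policy</code> for <code>shortest_name_policy</code> changes the node's behavior without touching <code>Dispatcher</code>, which is the property the paragraph above describes.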
 
=== Working together as an operating system ===
The architecture and design of a distributed operating system is specifically aligned with realizing both individual node and global system goals. Any architecture or design must be approached in a manner consistent with separating policy and mechanism. In doing so, a distributed operating system attempts to provide a highly efficient and reliable distributed computing framework allowing for an absolute minimal user awareness of the underlying command and control efforts.<ref name="DCD"/>
 
The multi-level collaboration between a kernel and the system management components, and in turn between the distinct nodes in a distributed system, is the functional challenge of the distributed operating system. This is the point in the system that must maintain a perfect harmony of purpose, and simultaneously maintain a complete disconnect of intent from implementation. This challenge is the distributed operating system's opportunity to produce the foundation and framework for a reliable, efficient, available, robust, extensible, and scalable system. However, this opportunity comes at a very high cost in complexity.
 
===The price of complexity===
In a distributed operating system, the exceptional degree of inherent complexity could easily render the entire system an anathema to any user. As such, the logical price of realizing a distributed system, including its operating system, must be calculated in terms of overcoming vast amounts of complexity in many areas, and on many levels. This calculation includes the depth, breadth, and range of design investment and architectural planning required in achieving even the most modest implementation.<ref>Surajbali, B., Coulson, G., Greenwood, P., and Grace, P. 2007. Augmenting reflective middleware with an aspect orientation support layer. In Proceedings of the 6th international Workshop on Adaptive and Reflective Middleware: Held At the ACM/IFIP/USENIX international Middleware Conference (Newport Beach, CA, November 26 - 30, 2007). ARM '07. ACM, New York, NY, 1-6.</ref>

These design and development considerations are critical and unforgiving. For instance, a deep understanding of a distributed operating system’s overall architectural and design detail is required at an exceptionally early point.<ref name="LSF"/> There is an exhaustive array of design considerations inherent to the development of a distributed operating system. Each of these design considerations can potentially affect many of the others to a significant degree. This leads to a massive effort in balanced approach, in terms of the individual design considerations, and many of their permutations. As an aid in this effort, most rely strongly on the immense amount of documented experience and research in distributed computing which exists, and continues even today.
 
===Perspectives: past, present, and future===
Many notable experts look to the early 1970s for meaningful beginnings in distributed operating system research, with systems complete in definition and capable of being considered and implemented wholly. Research and experimentation efforts began in earnest in the mid-to-late 1970s and continued through the 1990s, with focused interest peaking in the late 1980s. A number of distributed operating systems were introduced during this period; however, very few of these implementations achieved even modest commercial success.
 
The subject of distributed operating systems, however, has a much richer historical perspective. This is especially evident when considering distributed operating system design issues severally, and with respect to some of the primordial strides taken towards their realization. There are [[#Pioneering_inspirations|several instances of fundamental and pioneering implementations of primitive distributed operating system component concepts]] dating back to the early 1950s.<ref name=dyseac>Leiner, A. L. 1954. System Specifications for the DYSEAC. J. ACM 1, 2 (Apr. 1954), 57-81.</ref> <ref name=lincoln_tx2>Forgie, J. W. 1957. The Lincoln TX-2 input-output system. In Papers Presented At the February 26-28, 1957, Western Joint Computer Conference: Techniques For Reliability (Los Angeles, California, February 26 - 28, 1957). IRE-AIEE-ACM '57 (Western). ACM, New York, NY, 156-160.</ref> <ref name=intercomm_cells>Lee, C. Y. 1962. Intercommunicating cells, basis for a distributed logic computer. In Proceedings of the December 4-6, 1962, Fall Joint Computer Conference (Philadelphia, Pennsylvania, December 04 - 06, 1962). AFIPS '62 (Fall).</ref> Some of these very early individual steps were not focused directly on distributed computing, and at the time, many may not have realized their important impact. These pioneering efforts laid important groundwork, and inspired continued research in areas related to distributed computing.
 
Beginning in the mid-1970s, many research efforts produced extremely important advances in distributed computing. These breakthroughs provided a solid, stable foundation and for
 
Considering the modern distributed operating system and its future, one need look no further than the current challenges of many-core and multi-processor science. The accelerating proliferation of multi-processor and multi-core processor systems research has led to a resurgence of the distributed operating system concept. Many of these research efforts are investigating interesting and plausible paradigms impacting the future of distributed computing.
 
==Distributed computing models==
Line 89 ⟶ 99:
 
==Major Design Considerations==
 
===Transparency===
Transparency, simply put, is the quality of a distributed system to be seen and understood as a '''single-system image'''. Transparency is the greatest overriding consideration in the high-level conceptual design of a distributed operating system. While a simple concept, the consideration of transparency directly affects decision making in every aspect of the design of a distributed operating system. Depending on the degree to which transparency is implemented in a system, certain requirements and/or restrictions may be imposed upon the many design considerations, and the relationships between them.
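One facet of a single-system image, location transparency, can be sketched as follows. This is an illustrative toy under stated assumptions, not a description of any real system: the classes <code>Node</code> and <code>SingleSystemImage</code>, and the idea of tracking file locations in a single in-memory table, are hypothetical simplifications.

```python
from typing import Dict, Iterable


class Node:
    """A hypothetical node that holds some named files locally."""

    def __init__(self, name: str, files: Dict[str, str]) -> None:
        self.name = name
        self.files = dict(files)

    def read(self, path: str) -> str:
        return self.files[path]


class SingleSystemImage:
    """Presents many nodes as one system: callers name a file, never a node."""

    def __init__(self, nodes: Iterable[Node]) -> None:
        # Location table: which node currently holds each file.
        self._location: Dict[str, Node] = {}
        for node in nodes:
            for path in node.files:
                self._location[path] = node

    def read(self, path: str) -> str:
        # The caller does not know, or need to know, where the file lives.
        return self._location[path].read(path)

    def migrate(self, path: str, target: Node) -> None:
        # Moving a file changes its location but not how callers reach it.
        source = self._location[path]
        target.files[path] = source.files.pop(path)
        self._location[path] = target
```

In this sketch a call such as <code>read("report.txt")</code> returns the same contents before and after <code>migrate</code>, which is the sense in which the user sees one system rather than a collection of nodes.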
Line 169 ⟶ 178:
 
====Aboriginal distributed computing====
'''The DYSEAC'''<ref name=dyseac/> (1954)
 
One of the first solutions to these new questions was the [[DYSEAC]], a self-described general-purpose [[Synchronization (computer science)|synchronous]] computer that, at this point in history, exhibited signs of being much more than general-purpose. In one of the earliest publications of the [[ACM]], in April of 1954, a researcher at the [[National Bureau of Standards]] (now the National [[nist|Institute of Standards and Technology]], or [[nist|NIST]]) presented a detailed implementation design specification of the DYSEAC. Without carefully reading the entire specification, one could be misled by summary language in the introduction as to the nature of this machine. The initial section of the introduction advises that major emphasis will be placed upon the requirements of the intended applications, and that these applications would require flexible communication. However, the suggestion that the external devices could be typewriters, [[Magnetic storage|magnetic media]], and [[Cathode ray tube|CRTs]], together with repeated use of the term “[[Input/output|input-output operation]],” could quickly limit any paradigm of this system to a complex centralized “ensemble.” Seemingly saving the best for last, the author eventually describes the true nature of the system.
Line 182 ⟶ 191:
 
====Multi-programming abstraction====
'''The Lincoln TX-2'''<ref name=lincoln_tx2/> (1957)
 
Described as an input-output system of experimental nature, the Lincoln TX-2 placed a premium on flexibility in its association of simultaneously operational input-output devices. The design of the TX-2 was modular, supporting a high degree of modification and expansion, as well as flexibility in the operation and programming of its devices. The system employed the Multiple-Sequence Program Technique.
Line 191 ⟶ 200:
 
====Memory access abstraction====
'''Intercommunicating Cells, Basis for a Distributed Logic Computer'''<ref name=intercomm_cells/> (1962)
 
One early memory access paradigm was Intercommunicating Cells, where a cell is composed of a collection of [[Computer data storage|memory]] elements. A memory element was basically an electronic [[flip-flop]] or [[relay]], capable of holding one of two possible values. Within a cell there are two types of elements: symbol elements and cell elements. Each cell structure stores [[data]] in a [[String (computer science)|string]] of symbols, consisting of a [[Identifier|name]] and a set of associated [[parameter]]s. Consequently, a system's information is linked through various associations of cells.
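The cell structure described above, a name followed by its associated parameters, can be modeled loosely in software. This sketch is only an analogy: the names <code>Cell</code> and <code>CellMemory</code> are hypothetical, the original design was hardware logic rather than a program, and retrieval by matching on the name is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cell:
    """A string of symbols: a name plus its associated parameters."""

    name: str
    parameters: Tuple[str, ...]

    def symbols(self) -> Tuple[str, ...]:
        # The full symbol string stored by the cell structure.
        return (self.name,) + self.parameters


class CellMemory:
    """A collection of cells; information is reached by association."""

    def __init__(self) -> None:
        self._cells: List[Cell] = []

    def store(self, cell: Cell) -> None:
        self._cells.append(cell)

    def find(self, name: str) -> List[Cell]:
        # Assumed lookup: select every cell whose name matches,
        # rather than addressing a cell by its position.
        return [cell for cell in self._cells if cell.name == name]
```

Looking up a cell by its name instead of by a numeric position is what links information "through various associations of cells" in this simplified picture.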