{{Short description|Use of operating systems by type of extremely powerful computer}}
A '''supercomputer operating system''' is an [[operating system]] intended for [[supercomputer]]s. Since the end of the 20th century,
Given that modern [[massively parallel]] supercomputers typically separate computations from other services by using multiple types of [[Locale (computer hardware)|nodes]], they usually run different operating systems on different nodes, e.g., using a small and efficient [[
While in a traditional multi-user computer system [[job scheduling]] is in effect a [[task scheduling|tasking]] problem for processing and peripheral resources, in a massively parallel system the job management system needs to manage the allocation of both computational and communication resources, and to deal gracefully with the inevitable hardware failures that occur when tens of thousands of processors are present.<ref name=Yariv >Open Job Management Architecture for the Blue Gene/L Supercomputer by Yariv Aridor et al
[[File:Operating systems used on top 500 supercomputers.svg|thumb|right|Operating systems used on top 500 supercomputers]]
|url=http://www.zdnet.com/linux-continues-to-rule-supercomputers-7000016968/ |title=Linux continues to rule supercomputers |last=Vaughn-Nic<ref name=Padua426 /><ref>{{cite web |url=http://www.top500.org/overtime/list/32/os |title=Top500 OS chart |publisher=Top500.org |date= |accessdate=2010-10-31 |deadurl=yes |archiveurl=https://web.archive.org/web/20120305234455/http://www.top500.org/overtime/list/32/os |archivedate=2012-03-05 |df= }}</ref>
==Context and overview==
In the early days of supercomputing, the basic architectural concepts were evolving rapidly, and [[system software]] had to keep pace with frequent and abrupt hardware innovations.<ref name=Padua426 /> In the early systems, operating systems were custom-tailored to each supercomputer to gain speed. Yet in the rush to develop them, serious software quality problems surfaced, and in many cases the cost and complexity of system software development became as much of an issue as that of hardware.<ref name=Padua426 />
[[File:Pleiades supercomputer.jpg|thumb|240px|left|The supercomputer center at [[NASA Ames]]]]
In the 1980s, the cost of software development at [[Cray]] came to equal what the company spent on hardware, and that trend was partly responsible for a move away from in-house operating systems toward the adaptation of generic software.<ref name=MacKenzie >''Knowing machines: essays on technical change'' by Donald MacKenzie 1998 {{ISBN|0-262-63188-1}} page
By the early 1990s, major changes were occurring in supercomputing system software.<ref name=Padua426 >''Encyclopedia of Parallel Computing'' by David Padua 2011 {{ISBN|0-387-09765-1}} pages
The division of the operating system into separate components became necessary as supercomputers developed different types of nodes, e.g., compute nodes versus I/O nodes.
==Early systems==
[[File:Cray 1 IMG 9126.jpg|thumb|The first [[Cray-1]] (sample shown with internals) was delivered to the customer with no operating system.<ref>''Targeting the computer: government support and international competition'' by Kenneth Flamm 1987 {{ISBN|0-8157-2851-4}} page 82 [https://books.google.com/books?id=6sf0g4q5Ue8C
The [[CDC 6600]], generally considered the first supercomputer in the world, ran the [[Chippewa Operating System]], which was then deployed on various other [[CDC 6000 series]] computers.<ref name=Vardalas >''The computer revolution in Canada'' by John N. Vardalas 2001 {{ISBN|0-262-22064-4}} page 258.</ref> The Chippewa was a rather simple [[job control (computing)|job control]] oriented system derived from the earlier [[CDC 3000]], but it influenced the later [[CDC KRONOS|KRONOS]] and [[CDC SCOPE (software)|SCOPE]] systems.<ref name=Vardalas /><ref>''Design of a computer: the Control Data 6600'' by James E. Thornton, Scott, Foresman Press 1970 page 163.</ref>
The first [[Cray
In developing supercomputers, rising software costs soon became dominant, as shown by the cost of software development at Cray growing during the 1980s to equal the cost of hardware.<ref name="MacKenzie"/> That trend was partly responsible for a move away from the in-house [[Cray Operating System]] to the [[Unix]]-based [[UNICOS]] system.<ref name=MacKenzie /> In 1985, the [[Cray
Around the same time, the [[EOS (operating system)|EOS]] operating system was developed by [[ETA Systems]] for use in their [[ETA10]] supercomputers.<ref name=Thorndyke >
Lloyd M. Thorndyke, ''The Demise of the ETA Systems'' in ''Frontiers of Supercomputing II'' by Karyn R. Ames, Alan Brenner 1994 {{ISBN|0-520-08401-2}} pages
By the mid-1990s, despite the existing investment in older operating systems, the trend was toward the use of Unix-based systems, which also facilitated the use of interactive [[graphical user interface]]s (GUIs) for [[scientific computing]] across multiple platforms.<ref>''Frontiers of Supercomputing II'' by Karyn R. Ames, Alan Brenner 1994 {{ISBN|0-520-08401-2}} page 356.</ref> The move toward a ''commodity OS'' had opponents, who cited the fast pace and focus of Linux development as a major obstacle to adoption.<ref>{{cite web |url=http://www.sandia.gov/~rbbrigh/slides/conferences/commodity-os-ipdps03-slides.pdf |title=On the Appropriateness of Commodity Operating Systems for Large-Scale, Balanced Computing Systems |
==Modern approaches==
[[File:IBM Blue Gene P supercomputer.jpg|240px|thumb|The [[Blue Gene]]/P supercomputer at [[Argonne National Laboratory|Argonne National Lab]] ]]
The IBM [[Blue Gene]] supercomputer uses the [[CNK operating system]] on the compute nodes, but uses a modified [[Linux]]-based kernel called I/O Node Kernel ([[INK (operating system)|INK]]) on the I/O nodes.<ref name=EuroPar2004>''Euro-Par 2004 Parallel Processing: 10th International Euro-Par Conference'' 2004, by Marco Danelutto, Marco Vanneschi and Domenico Laforenza {{ISBN|3-540-22924-8}}
While in traditional multi-user computer systems and early supercomputers [[job scheduling]] was in effect a [[task scheduling]] problem for processing and peripheral resources, in a massively parallel system the job management system needs to manage the allocation of both computational and communication resources.<ref name=Yariv /> Both task scheduling and the operating system must be tuned for each configuration of a supercomputer. A typical parallel job scheduler has a [[Master/slave (technology)|master scheduler]] which instructs some number of slave schedulers to launch, monitor, and control [[Parallel computing|parallel jobs]], and periodically receives reports from them about the status of job progress.<ref name=Yariv />
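The master and subordinate scheduler arrangement described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the design of any particular job management system: the class names <code>MasterScheduler</code> and <code>WorkerScheduler</code>, the least-loaded placement rule, and the discrete time steps are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerScheduler:
    """Hypothetical subordinate scheduler that runs jobs for the master."""
    name: str
    running: dict = field(default_factory=dict)  # job id -> remaining steps

    def launch(self, job_id, steps):
        # steps is assumed to be a positive integer
        self.running[job_id] = steps

    def tick(self):
        """Advance every running job one step and return a status report."""
        report = {}
        for job_id in list(self.running):
            self.running[job_id] -= 1
            if self.running[job_id] == 0:
                del self.running[job_id]
                report[job_id] = "done"
            else:
                report[job_id] = "running"
        return report


class MasterScheduler:
    """Hypothetical master that places jobs and collects status reports."""

    def __init__(self, workers):
        self.workers = workers
        self.status = {}  # job id -> last reported state

    def submit(self, job_id, steps):
        # Assumed policy: hand the job to the worker with the fewest jobs.
        worker = min(self.workers, key=lambda w: len(w.running))
        worker.launch(job_id, steps)
        self.status[job_id] = "running"
        return worker.name

    def poll(self):
        # Periodic step: gather one report from every worker.
        for worker in self.workers:
            self.status.update(worker.tick())
        return dict(self.status)


master = MasterScheduler([WorkerScheduler("w0"), WorkerScheduler("w1")])
master.submit("a", 1)
master.submit("b", 2)
print(master.poll())
```

A real system would also handle worker failure and the allocation of communication resources, both of which this sketch omits.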
Some, but not all, supercomputer schedulers attempt to maintain locality of job execution. The [[PBS Pro|PBS Pro scheduler]] used on the [[Cray XT3]] and [[Cray XT4]] systems does not attempt to optimize locality on its three-dimensional [[torus interconnect]], but simply uses the first available processor.<ref name=Eitan/> On the other hand, IBM's scheduler on the Blue Gene supercomputers aims to exploit locality and minimize network contention by assigning tasks from the same application to one or more midplanes of an 8x8x8 node group.<ref name=Eitan>''Job Scheduling Strategies for Parallel Processing:'' by Eitan Frachtenberg and Uwe Schwiegelshohn 2010 {{ISBN|3-642-04632-0}} pages
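
The difference between taking the first available processor and preferring a local group of nodes can be sketched in a few lines. The example below is a toy model and does not reproduce the PBS Pro or Blue Gene algorithms: nodes are assumed to be numbered integers, a run of <code>block_size</code> consecutive numbers stands in for a midplane, and the function names are hypothetical.

```python
def first_available(free, count):
    """Take the first `count` free nodes in index order, ignoring topology."""
    chosen = sorted(free)[:count]
    return chosen if len(chosen) == count else None


def locality_aware(free, count, block_size):
    """Prefer `count` nodes that all lie in one block of consecutive ids.

    A block is an assumed stand-in for a midplane. If no single block has
    enough free nodes, fall back to first-available allocation.
    """
    blocks = {}
    for node in sorted(free):
        blocks.setdefault(node // block_size, []).append(node)
    candidates = [nodes for nodes in blocks.values() if len(nodes) >= count]
    if candidates:
        # Best fit: use the block with the fewest free nodes that still fits.
        return min(candidates, key=len)[:count]
    return first_available(free, count)


free_nodes = {0, 1, 5, 8, 9, 10, 11}
print(first_available(free_nodes, 3))    # nodes drawn from two blocks
print(locality_aware(free_nodes, 3, 4))  # nodes drawn from a single block
```

With these inputs, the first-available policy spreads the job across two blocks, while the locality-aware policy keeps it inside one, which is the property that reduces network contention in the description above.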
==See also==
* [[Distributed operating system]]
* [[List of the top supercomputers in the United States]]
* [[Supercomputer architecture]]
* [[Usage share of operating systems#Supercomputers|Usage share of supercomputer operating systems]]
==References==
{{Reflist}}
{{Supercomputer operating systems}}
{{Parallel computing}}
{{Operating system}}
[[Category:Supercomputer operating systems| ]]