{{Short description|Use of operating systems by a type of extremely powerful computer}}
[[File:JaguarXT5.jpg|thumb|330px|The [[Jaguar (supercomputer)|Jaguar XT5]] supercomputer at [[Oak Ridge National Laboratory|Oak Ridge National Labs]]]]
A '''supercomputer operating system''' is an [[operating system]] intended for [[supercomputer]]s. Since the end of the 20th century, supercomputer operating systems have undergone major transformations, as fundamental changes have occurred in [[supercomputer architecture]].<ref name=Padua426 /> While early operating systems were custom tailored to each supercomputer to gain speed, the trend has been moving away from in-house operating systems toward some form of [[Linux]],<ref name=MacKenzie /> which has run all the supercomputers on the [[TOP500]] list since November 2017. In 2021, the top 10 computers ran, for instance, [[Red Hat Enterprise Linux]] (RHEL) or some variant of it, or another [[Linux distribution]] such as [[Ubuntu]].
Modern supercomputers may run different operating systems on different nodes, e.g. using a small and efficient [[Lightweight Kernel Operating System|lightweight kernel]] such as [[CNK operating system|CNK]] or [[Compute Node Linux|CNL]] on compute nodes, but a larger and more full-fledged system such as a [[Linux]]-derivative on server and I/O nodes.<ref name=EuroPar2004/><ref name=Alam>''An Evaluation of the Oak Ridge National Laboratory Cray XT3'' by Sadaf R. Alam et al., ''International Journal of High Performance Computing Applications'', February 2008, vol. 22, no. 1, pp. 52–80</ref>
[[File:Operating systems used on top 500 supercomputers.svg|thumb|right|Operating systems used on top 500 supercomputers]]
Although most modern supercomputers use the [[Linux]] operating system, each manufacturer has made its own specific changes to the Linux-derivative it uses, and no industry standard exists, partly because differences in hardware architectures require changes to optimize the operating system to each design.<ref name=Padua426 /><ref>{{cite web|url=http://www.top500.org/overtime/list/32/os |title=Top500 OS chart |publisher=Top500.org |accessdate=2010-10-31}}</ref>
==Context and overview==
In the early days of supercomputing, the basic architectural concepts were evolving rapidly, and [[system software]] had to follow hardware innovations that usually took rapid turns.<ref name=Padua426 /> In the early systems, operating systems were custom tailored to each supercomputer to gain speed, yet in the rush to develop them, serious software quality challenges surfaced and in many cases the cost and complexity of system software development became as much of an issue as that of hardware.<ref name=Padua426 />
[[File:Pleiades supercomputer.jpg|thumb|240px|left|The supercomputer center at [[NASA Ames]]]]
In the 1980s the cost of software development at [[Cray]] came to equal what was spent on hardware, a trend that was partly responsible for the move away from in-house operating systems toward the adaptation of generic software.<ref name=MacKenzie />
By the early 1990s, major changes were occurring in supercomputing system software.<ref name=Padua426 />
The separation of the operating system into separate components became necessary as supercomputers developed different types of nodes, e.g., compute nodes versus I/O nodes. Thus modern supercomputers usually run different operating systems on different nodes, e.g., using a small and efficient [[lightweight kernel operating system|lightweight kernel]] such as [[CNK operating system|CNK]] or [[Compute Node Linux|CNL]] on compute nodes, but a larger system such as a [[Linux]]-derivative on server and I/O nodes.<ref name=EuroPar2004/><ref name=Alam/>
==Early systems==
[[File:Cray 1 IMG 9126.jpg|thumb|The first [[Cray-1]]]]
The [[CDC 6600]], generally considered the first supercomputer in the world, ran the [[Chippewa Operating System]], which was then deployed on various other [[CDC 6000 series]] computers.
The first [[Cray-1]] was shipped to [[Los Alamos National Laboratory]] without an operating system or any other software; Los Alamos then developed both the application software and the operating system for it.
Around the same time, the [[EOS (operating system)|EOS]] operating system was developed by [[ETA Systems]] for use in their [[ETA10]] supercomputers.<ref>Lloyd M. Thorndyke, ''The Demise of the ETA Systems'' in ''Frontiers of Supercomputing II'' by Karyn R. Ames, Alan Brenner 1994</ref>
By the middle of the 1990s, despite the existing investment in older operating systems, the trend was toward the use of Unix-based systems, which also facilitated the use of interactive [[graphical user interface]]s for scientific computing across multiple platforms.
==Modern approaches==
The IBM [[Blue Gene]] supercomputers use the [[CNK operating system]] on the compute nodes, but a modified [[Linux]]-based kernel called I/O Node Kernel (INK) on the I/O nodes.<ref name=EuroPar2004/> CNK is a [[Lightweight Kernel Operating System|lightweight kernel]] that runs on each node and supports a single application running for a single user on that node.
While in a traditional multi-user computer system, [[job scheduling]] is in effect a [[task scheduling|scheduling]] problem for processing and peripheral resources, in a massively parallel system the job management system needs to manage the allocation of both computational and communication resources.<ref name=Yariv /> It is essential to tune task scheduling, and the operating system itself, for the different configurations of a supercomputer. A typical parallel job scheduler has a master scheduler which instructs a number of slave schedulers to launch, monitor and control parallel jobs, and periodically receives reports from them about the status of job progress.<ref name=Yariv />
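The master–slave pattern described above can be shown schematically. The following Python sketch is an illustration only, not the code of any production scheduler; it uses in-process queues and threads to stand in for the network links between the master scheduler and its slave schedulers, and a short sleep to stand in for a running parallel job.

<syntaxhighlight lang="python">
import queue
import threading
import time

def slave_scheduler(node_group, jobs, reports):
    """Launch and monitor jobs for one group of compute nodes,
    reporting progress back to the master scheduler."""
    while True:
        job = jobs.get()
        if job is None:          # shutdown signal from the master
            return
        reports.put((node_group, job, "launched"))
        time.sleep(0.1)          # stand-in for the running parallel task
        reports.put((node_group, job, "finished"))

jobs, reports = queue.Queue(), queue.Queue()
slaves = [threading.Thread(target=slave_scheduler, args=(n, jobs, reports))
          for n in range(4)]
for s in slaves:
    s.start()

# The master hands out parallel jobs, then sends one shutdown
# signal per slave and collects the periodic status reports.
for job in ["job-a", "job-b", "job-c"]:
    jobs.put(job)
for _ in slaves:
    jobs.put(None)
while any(s.is_alive() for s in slaves) or not reports.empty():
    try:
        group, job, status = reports.get(timeout=0.5)
        print(f"node group {group}: {job} {status}")
    except queue.Empty:
        pass
for s in slaves:
    s.join()
</syntaxhighlight>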
Some, but not all, supercomputer schedulers attempt to maintain locality of job execution. The [[PBS Pro|PBS Pro scheduler]] used on the [[Cray XT3]] and [[Cray XT4]] systems does not attempt to optimize locality on its three-dimensional [[torus interconnect]], but simply uses the first available processor.<ref name=Eitan/> On the other hand, IBM's scheduler on the Blue Gene supercomputers aims to exploit locality and minimize network contention by assigning tasks from the same application to one or more midplanes of an 8×8×8 node group.<ref name=Eitan>''Job Scheduling Strategies for Parallel Processing'' by Eitan Frachtenberg and Uwe Schwiegelshohn 2010 {{ISBN|3-642-04632-0}} pages 138–144</ref> The [[Slurm Workload Manager]] scheduler uses a best fit algorithm and performs [[Hilbert curve scheduling]] to optimize locality of task assignments.<ref name=Eitan/> Several modern supercomputers such as the [[Tianhe-2]] use Slurm, which arbitrates contention for resources across the system. Slurm is [[Open-source software|open source]], Linux-based, very scalable, and can manage thousands of nodes in a computer cluster with a sustained throughput of over 100,000 jobs per hour.<ref>[http://slurm.schedmd.com/ SLURM at SchedMD]</ref><ref>Jette, M. and M. Grondona, ''SLURM: Simple Linux Utility for Resource Management'' in the Proceedings of ClusterWorld Conference, San Jose, California, June 2003 [http://www.schedmd.com/slurmdocs/slurm_design.pdf]</ref>
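Hilbert curve scheduling works by mapping the machine's nodes onto a one-dimensional [[space-filling curve]], so that tasks placed at consecutive positions along the curve land on physically nearby nodes. The following Python sketch shows the idea in two dimensions; it is illustrative only, and Slurm's actual implementation differs (for example, it must handle three-dimensional topologies).

<syntaxhighlight lang="python">
def hilbert_d2xy(order, d):
    """Map distance d along a Hilbert curve filling a 2**order x 2**order
    grid to (x, y) grid coordinates (the standard iterative algorithm)."""
    x = y = 0
    s = 1
    while s < (1 << order):
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:              # rotate the quadrant when needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        d //= 4
        s *= 2
    return x, y

# Order an 8x8 grid of nodes along the curve. Consecutive entries are
# physically adjacent, so giving a job a contiguous slice of this list
# keeps its tasks close together on the interconnect.
node_order = [hilbert_d2xy(3, d) for d in range(8 * 8)]

def allocate(job_size, cursor):
    """Assign the next job_size nodes along the Hilbert ordering."""
    return node_order[cursor:cursor + job_size], cursor + job_size

nodes, cursor = allocate(4, 0)
print(nodes)   # [(0, 0), (0, 1), (1, 1), (1, 0)] -- four adjacent nodes
</syntaxhighlight>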
==See also==
* [[Distributed operating system]]
* [[List of the top supercomputers in the United States]]
* [[Supercomputer architecture]]
* [[Usage share of operating systems#Supercomputers|Usage share of supercomputer operating systems]]
==References==
{{Reflist}}
{{Supercomputer operating systems}}
{{Parallel computing}}
{{Operating system}}
[[Category:Supercomputer operating systems| ]]
[[Category:Operating systems]]
[[Category:Supercomputers|*Supercomputer operating systems]]