Slurm Workload Manager: Difference between revisions

'''Simple Linux Utility for Resource Management''' (or simply '''SLURM''') is an [[open source]] [[job scheduler]] used by many of the world's [[supercomputer]]s and computer clusters. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job such as [[Message Passing Interface|MPI]]) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending jobs.
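
The three functions described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the names <code>ToyCluster</code>, <code>Job</code>, <code>submit</code>, <code>schedule</code>, and <code>finish</code> are hypothetical, the strict first-come, first-served policy is chosen for simplicity, and none of it reflects SLURM's actual code, API, or scheduling algorithm.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    nodes_needed: int


@dataclass
class ToyCluster:
    """Hypothetical toy model of a resource manager; not SLURM's implementation."""
    free_nodes: list
    pending: deque = field(default_factory=deque)
    running: dict = field(default_factory=dict)  # job name -> allocated nodes

    def submit(self, job):
        # Function 3: contention is resolved by queueing jobs until resources free up.
        self.pending.append(job)
        self.schedule()

    def schedule(self):
        # Assumed policy: strict first-come, first-served.
        # Scheduling stops at the first queued job that does not fit.
        while self.pending and self.pending[0].nodes_needed <= len(self.free_nodes):
            job = self.pending.popleft()
            # Function 1: allocate nodes to the job.
            allocated = [self.free_nodes.pop(0) for _ in range(job.nodes_needed)]
            # Function 2: record the job as started on its allocated nodes.
            self.running[job.name] = allocated

    def finish(self, name):
        # Return the job's nodes to the free pool, then try to start waiting jobs.
        self.free_nodes.extend(self.running.pop(name))
        self.schedule()


cluster = ToyCluster(free_nodes=["n1", "n2", "n3", "n4"])
cluster.submit(Job("a", 3))  # starts immediately on three nodes
cluster.submit(Job("b", 2))  # waits: only one node is free
cluster.submit(Job("c", 1))  # waits behind "b" under strict FIFO
cluster.finish("a")          # frees three nodes, so "b" and then "c" start
print(cluster.running)
```

In this sketch, job "c" would fit on the idle node while "a" is running, but it waits behind "b" because the assumed policy never lets a later job overtake an earlier one. Production schedulers typically offer more elaborate policies than this.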
 
SLURM is the batch system on many of the world's fastest [[TOP500]] supercomputers, including the fastest one in the world, China's [[Tianhe-I]]. SLURM is designed to handle thousands of nodes in a single cluster and can sustain a throughput of 120,000 jobs per hour. While complex configuration options are available, simple configurations can be established in a few minutes.
 
==History==
SLURM began as an [[open source]] resource manager developed collaboratively, primarily by [[Lawrence Livermore National Laboratory]], Linux NetworX, [[Hewlett-Packard]], and [[Groupe Bull]]. It has since evolved into a sophisticated batch scheduler capable of satisfying the requirements of many large computer centers. SLURM is currently used on many of the largest computers in the world.
 
==Structure==
SLURM's design is very modular, with dozens of optional plugins. In its simplest configuration, it can be installed and configured in a couple of minutes. More sophisticated configurations provide database integration for accounting, management of resource limits, and workload prioritization. SLURM also works with several meta-schedulers such as [[Moab Cluster Suite]], [[Maui Cluster Scheduler]], and [[Platform LSF]].
 
==License==