Load (computing): Difference between revisions

Revision as of 11:03, 12 October 2022 by 149.140.247.196. No edit summary. Tags: Reverted, Mobile edit, Mobile web edit.
Latest revision as of 17:40, 23 May 2025 by Comp.arch (extended confirmed user, 41,514 edits), minor edit. Edit summary: →Reckoning CPU load: Hz is correct upper case, while lower as hertz. Tag: 2017 wikitext editor.
(19 intermediate revisions by 14 users not shown)
Line 1:
{{Short description|Amount of computational work that a computer system performs}}
{{Use dmy dates|date=September 2022}}
{{more citations needed|date=November 2010}}
[[File:Big-load.png|thumb|[[htop]] displaying a significant computing load (top right: ''Load average:'')]]
In [[UNIX]] [[computing]], the system '''load''' is a measure of the amount of computational work that a computer system performs. The '''load average''' represents the average system load over a period of time. It conventionally appears in the form of three numbers which represent the system load during the last one-, five-, and fifteen-minute periods.

== Unix-style load calculation ==
All Unix and Unix-like systems generate a dimensionless [[Software metric|metric]] of three "load average" numbers in the [[kernel (operating system)|kernel]]. Users can easily query the current result from a [[Unix shell]] by running the <code>[[uptime]]</code> command:

<syntaxhighlight lang="console">
$ uptime
14:34:03 up 10:43, 4 users, load average: 0.06, 0.11, 0.09
</syntaxhighlight>

The [[W (Unix)|<code>w</code>]] and [[top (software)|<code>top</code>]] commands show the same three load average numbers, as do a range of [[graphical user interface]] utilities. In operating systems based on the [[Linux kernel]], this information can be easily accessed by reading the [[procfs|<code>/proc/loadavg</code>]] file.

According to the Linux [[Filesystem Hierarchy Standard]], more detailed, architecture-dependent information is exposed in the file <code>/proc/stat</code>.<ref>{{Cite web |url = https://www.kernel.org/doc/html/latest/admin-guide/cpu-load.html |title = CPU load |access-date=2023-10-04 }}</ref><ref>{{Cite web |url = https://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html |title = /proc |access-date=2023-10-04 |website = Linux Filesystem Hierarchy }}</ref><ref>{{Cite web |url = https://www.kernel.org/doc/html/latest/filesystems/proc.html#miscellaneous-kernel-statistics-in-proc-stat |title = Miscellaneous kernel statistics in /proc/stat |access-date=2023-10-04 }}</ref>

An idle computer has a load number of 0 (the idle process is not counted). Each [[process (computing)|process]] using or waiting for [[Central processing unit|CPU]] (the ''ready queue'' or [[run queue]]) increments the load number by 1. Each process that terminates decrements it by 1. Most UNIX systems count only processes in the ''running'' (on CPU) or ''runnable'' (waiting for CPU) [[Process state|states]]. However, Linux also includes processes in [[uninterruptible sleep]] states (usually waiting for [[Hard disk drive|disk]] activity), which can lead to markedly different results if many processes remain blocked in [[Input/output|I/O]] due to a busy or stalled I/O system.<ref>{{Cite web|url=https://linuxtechsupport.blogspot.com/2008/10/what-exactly-is-load-average.html|title=Linux Tech Support: What exactly is a load average?|date=23 October 2008}}</ref> This, for example, includes processes blocking due to an [[Network File System|NFS]] server failure or too slow [[Data storage|media]] (e.g., [[USB]] 1.x storage devices). Such circumstances can result in an elevated load average, which does not reflect an actual increase in CPU use (but still gives an idea of how long users have to wait).

Systems calculate the load ''average'' as the [[Moving average#Exponential moving average|exponentially damped/weighted moving average]] of the load ''number''. The three values of load average refer to the past one, five, and fifteen minutes of system operation.<ref name="drdobbs">{{cite web |url=https://www.linuxjournal.com/article/9001 |title=Examining Load Average |first=Ray |last=Walker |date=1 December 2006 |work=Linux Journal |access-date=13 March 2012 }}</ref>

Mathematically speaking, all three values always average all the system load since the system started up. They all decay exponentially, but at different ''speeds'': they decay by a factor of ''e'' after 1, 5, and 15 minutes, respectively. Hence, the 1-minute load average consists of 63% (more precisely: 1 − 1/''e'') of the load from the last minute and 37% (1/''e'') of the average load since start up, excluding the last minute.
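As a minimal sketch of the exponentially damped average described above (not the kernel's fixed-point implementation), the following Python assumes that the average is refreshed at a fixed 5-second interval. The function names and the 5-second default are illustrative assumptions, not taken from any particular system.

```python
import math


def decay_factor(sample_interval_s: float, window_s: float) -> float:
    """Weight that the previous average keeps after one sampling step."""
    return math.exp(-sample_interval_s / window_s)


def update_load_average(previous: float, active: float, factor: float) -> float:
    """One step of an exponentially damped moving average."""
    return previous * factor + active * (1.0 - factor)


def simulate(active: float, seconds: float, window_s: float,
             sample_interval_s: float = 5.0, start: float = 0.0) -> float:
    """Feed a constant load number into the average for `seconds` seconds."""
    factor = decay_factor(sample_interval_s, window_s)
    average = start
    for _ in range(round(seconds / sample_interval_s)):
        average = update_load_average(average, active, factor)
    return average


if __name__ == "__main__":
    # One minute of constant load 1.0 on a previously idle system.
    print(simulate(active=1.0, seconds=60, window_s=60))
    # One idle minute after a long-standing load of 2.0.
    print(simulate(active=0.0, seconds=60, window_s=60, start=2.0))
```

Under these assumptions, one minute of constant load 1 on a previously idle system brings the 1-minute figure to 1 − 1/''e'' (about 0.63), and one idle minute leaves 1/''e'' (about 37%) of the earlier average in place.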
For the 5- and 15-minute load averages, the same 63%/37% ratio is computed over 5 minutes and 15 minutes, respectively. Therefore, it is not technically accurate that the 1-minute load average only includes the last 60 seconds of activity, as it includes 37% of the activity from the past, but it is correct to state that it includes ''mostly'' the last minute.

=== Interpretation ===
Line 20 ⟶ 41:
For example, one can interpret a load average of "1.73 0.60 7.98" on a single-CPU system as:
* During the last minute, the system was overloaded by 73% on average (1.73 runnable processes, so that 0.73 processes had to wait for a turn on the single CPU, on average).
* During the last 5 minutes, the CPU was idling 40% of the time, on average.
* During the last 15 minutes, the system was overloaded by 698% on average (7.98 runnable processes, so that 6.98 processes had to wait for a turn on the single CPU, on average).

This means that this system (CPU, disk, memory, etc.) could have handled all of the work scheduled for the last minute if it were 1.73 times as fast.

In a system with four CPUs, a load average of 3.73 would indicate that there were, on average, 3.73 processes ready to run, and each one could be scheduled into a CPU.

On modern UNIX systems, the treatment of [[Thread (computing)|threading]] with respect to load averages varies. Some systems treat threads as processes for the purposes of load average calculation: each thread waiting to run will add 1 to the load. However, other systems, especially systems implementing so-called [[Thread (computing)#M:N (hybrid threading)|M:N threading]], use different strategies such as counting the process exactly once for the purpose of load (regardless of the number of threads), or counting only threads currently exposed by the user-thread scheduler to the kernel, which may depend on the level of concurrency set on the process.
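The single-CPU reading of "1.73 0.60 7.98" given earlier in this section can be reproduced with a few lines of Python. This is only a sketch: <code>interpret_load</code> and <code>interpret_triple</code> are hypothetical helper names, and dividing the load by the number of CPUs is a simplifying assumption that ignores the I/O-wait and threading effects discussed in this article.

```python
def interpret_load(load_average: float, cpu_count: int = 1) -> str:
    """Describe a load average relative to the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    per_cpu = load_average / cpu_count
    if per_cpu > 1.0:
        # More runnable processes than CPUs: the excess had to wait.
        return f"overloaded by {(per_cpu - 1.0) * 100:.0f}%"
    # Fewer runnable processes than CPUs: the remainder was idle capacity.
    return f"idle {(1.0 - per_cpu) * 100:.0f}% of the time"


def interpret_triple(text: str, cpu_count: int = 1) -> dict:
    """Interpret the first three whitespace-separated load average fields."""
    one, five, fifteen = (float(field) for field in text.split()[:3])
    return {
        "1 min": interpret_load(one, cpu_count),
        "5 min": interpret_load(five, cpu_count),
        "15 min": interpret_load(fifteen, cpu_count),
    }


if __name__ == "__main__":
    for period, reading in interpret_triple("1.73 0.60 7.98").items():
        print(period, reading)
```

For a single CPU this yields "overloaded by 73%", "idle 40% of the time", and "overloaded by 698%" for the three periods, matching the list above.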
Linux appears to count each thread separately as adding 1 to the load.<ref>See http://serverfault.com/a/524818/27813</ref>

== CPU load vis-à-vis CPU utilization ==
The comparative study of different load indices carried out by Ferrari et al.<ref name="Empirical load">Ferrari, Domenico; and Zhou, Songnian; "[http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-353.pdf An Empirical Investigation of Load Indices For Load Balancing Applications]", Proceedings of Performance '87, the 12th International Symposium on Computer Performance Modeling, Measurement, and Evaluation, North Holland Publishers, Amsterdam, the Netherlands, 1988, pp. 515–528</ref> reported that CPU load information based upon the CPU queue length does much better in load balancing compared to CPU utilization. The probable reason is that when a host is heavily loaded, its CPU utilization is likely to be close to 100%, so utilization cannot reflect the exact load level. In contrast, CPU queue lengths can directly reflect the amount of load on a CPU. As an example, two systems, one with 3 and the other with 6 processes in the queue, are both very likely to have utilizations close to 100%, although they obviously differ.{{original research inline|date=May 2013}}

== Reckoning CPU load ==
On Linux systems, the load average is not calculated on each clock tick, but driven by a variable value that is based on the HZ frequency setting and tested on each clock tick. This setting defines the kernel clock tick rate in [[hertz]] (times per second), and it defaults to 100 for 10 ms ticks. Kernel activities use this number of ticks to time themselves.
Specifically, the timer.c::calc_load() function, which calculates the load average, runs every {{tt|1=LOAD_FREQ = (5HZ+1)}} ticks, or about every five seconds:

<syntaxhighlight lang="c">
Line 73 ⟶ 94:
The "sampled" calculation of load averages is a somewhat common behavior; FreeBSD, too, only refreshes the value every five seconds. The interval is usually chosen to be slightly inexact so that sampling does not systematically coincide with processes that are scheduled to fire at a certain moment.<ref>{{cite web |title=How is load average calculated on FreeBSD? |url=https://unix.stackexchange.com/a/342778 |website=Unix & Linux Stack Exchange}}</ref>

A post on the Linux mailing list considers its {{tt|+1}} tick insufficient to avoid [[Moiré pattern|Moiré artifacts]] from such collection, and suggests an interval of 4.61 seconds instead.<ref>{{cite web |last1=Ripke |first1=Klaus |title=Linux-Kernel Archive: LOAD_FREQ (4HZ+61) avoids loadavg Moire |url=https://lkml.iu.edu/hypermail/linux/kernel/1111.1/02446.html |website=lkml.iu.edu |date=2011}} [https://ripke.com/loadavg/moire graph & patch<!-- Actual title at target: Moiré patterns in linux load average (Moiré mangled in title as MoirÃ©) -->]</ref> This change is common among [[Android (operating system)|Android system]] kernels, although the exact expression used assumes an HZ of 100.<ref>{{cite web |title=Patch kernel with the 4.61s load thing · Issue #2109 · AOSC-Dev/aosc-os-abbs |url=https://github.com/AOSC-Dev/aosc-os-abbs/issues/2109 |website=GitHub |language=en}}</ref>

== Other system performance commands ==
Other commands for assessing system performance include:
* <code>[[uptime]]</code>{{Snd}} the system reliability and load average
* <code>[[top (software)|top]]</code>{{Snd}} for an overall system view
* <code>[[vmstat]]</code>{{Snd}} reports information about runnable or blocked processes, memory, paging, block I/O, traps, and CPU.
* <code>[[htop]]</code>{{Snd}} interactive process viewer
* <code>dool</code> (formerly <code>dstat</code>),<ref>{{cite web |url=https://github.com/scottchiefbaker/dool |title=dool - Python3 compatible clone of dstat |last=Baker |first=Scott |date=September 28, 2022 |website=[[GitHub]] |access-date=November 22, 2022 |quote=...Dag Wieers ceased development of Dstat...}}</ref> <code>atop</code>{{Snd}} helps correlate all existing resource data for processes, memory, paging, block I/O, traps, and CPU activity.
* <code>[[iftop]]</code>{{Snd}} interactive network traffic viewer per interface
* <code>nethogs</code>{{Snd}} interactive network traffic viewer per process
* <code>iotop</code>{{Snd}} interactive I/O viewer<ref>{{Cite web|url=https://man7.org/linux/man-pages/man8/iotop.8.html|title = Iotop(8) - Linux manual page}}</ref>
* <code>[[iostat]]</code>{{Snd}} for storage I/O statistics
* <code>[[netstat]]</code>{{Snd}} for network statistics
* <code>[[mpstat]]</code>{{Snd}} for CPU statistics
* <code>tload</code>{{Snd}} load average graph for terminal
Line 95 ⟶ 117:
* [[CPU usage]]

== References ==
{{reflist}}

== External links ==
* {{cite web
|author = Brendan Gregg
|title = Linux Load Averages: Solving the Mystery
|url = https://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
|date = 8 August 2017
|access-date = 2018-01-22
Line 115 ⟶ 137:
* {{cite web
|title = Understanding Linux CPU Load{{Snd}} when should you be worried?
|url = https://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
|author = Andre Lewis
|date = 31 July 2009
Line 123 ⟶ 145:
|author = Ray Walker
|title = Examining Load Average
|url = https://www.linuxjournal.com/article/9001
|publisher = Linux Journal
|date = 1 December 2006