Load (computing)

The [[W (Unix)|<code>w</code>]] and [[Top (Unix)|<code>top</code>]] commands show the same three load average numbers, as do a range of [[graphical user interface]] utilities.
 
In operating systems based on the [[Linux (kernel)|Linux kernel]], this information can be accessed by reading the [[procfs|<code>/proc/loadavg</code>]] file.
 
For a more detailed picture, the kernel also exposes architecture-dependent statistics in the file <code>/proc/stat</code>, in line with Linux's [[Filesystem Hierarchy Standard]].<ref>{{Cite web
|url = https://www.kernel.org/doc/html/latest/admin-guide/cpu-load.html
|title = CPU load
}}</ref>
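The same three figures can also be retrieved programmatically. The following minimal sketch assumes a Linux or BSD system, where the <code>getloadavg(3)</code> library call reports the same values exposed through <code>/proc/loadavg</code>:

<syntaxhighlight lang="c">
/* Minimal sketch: print the 1-, 5- and 15-minute load averages.
   Assumes a Linux or BSD system providing getloadavg(3). */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    double avg[3];

    /* getloadavg fills the array and returns the number of
       samples retrieved, or -1 on error. */
    if (getloadavg(avg, 3) != 3) {
        fprintf(stderr, "getloadavg failed\n");
        return EXIT_FAILURE;
    }
    printf("%.2f %.2f %.2f\n", avg[0], avg[1], avg[2]);
    return EXIT_SUCCESS;
}
</syntaxhighlight>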
 
An idle computer has a load number of 0 (the idle process is not counted). Each [[process (computing)|process]] using or waiting for the [[Central processing unit|CPU]] (the ''ready queue'' or [[run queue]]) increments the load number by 1, and each process that terminates decrements it by 1. Most UNIX systems count only processes in the ''running'' (on CPU) or ''runnable'' (waiting for CPU) [[Process state|states]]. However, Linux also includes processes in uninterruptible sleep states (usually waiting for [[Hard disk drive|disk]] activity), which can lead to markedly different results if many processes remain blocked in [[Input/output|I/O]] due to a busy or stalled I/O system.<ref>{{Cite web|url=http://linuxtechsupport.blogspot.com/2008/10/what-exactly-is-load-average.html|title=Linux Tech Support: What exactly is a load average?|date=23 October 2008}}</ref> This includes, for example, processes blocked by an [[Network File System|NFS]] server failure or by excessively slow [[Data storage|media]] (e.g., [[USB]] 1.x storage devices). Such circumstances can result in an elevated load average that does not reflect an actual increase in CPU use, but still gives an idea of how long users have to wait.
 
Systems calculate the load ''average'' as the [[Moving average#Exponential moving average|exponentially damped/weighted moving average]] of the load ''number''. The three values of load average refer to the past one, five, and fifteen minutes of system operation.<ref name="drdobbs">{{cite web |url=http://www.linuxjournal.com/article/9001 |title=Examining Load Average |first=Ray |last=Walker |date=1 December 2006 |work=Linux Journal |access-date=13 March 2012 }}</ref>
 
Mathematically speaking, all three values incorporate the entire system load history since the system started up. They all decay exponentially, but at different ''speeds'': each decays by a factor of ''e'' after 1, 5, and 15 minutes, respectively. Hence, the 1-minute load average consists of 63% (more precisely: 1 - 1/''e'') of the load from the last minute and 37% (1/''e'') of the average load since startup, excluding the last minute. For the 5- and 15-minute load averages, the same 63%/37% split is computed over 5 and 15 minutes, respectively. Therefore, it is not technically accurate that the 1-minute load average includes only the last 60 seconds of activity, since it includes 37% of activity from before that, but it is correct to say that it covers ''mostly'' the last minute.
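As an illustration, one damping step can be sketched as follows. The five-second sampling interval matches the tick the Linux kernel uses, but the kernel performs the equivalent computation in fixed-point arithmetic; the function and variable names below are illustrative, not kernel code:

<syntaxhighlight lang="c">
#include <math.h>

/* One update step of the exponentially damped moving average.
   old_avg : previous value of the load average
   n       : current load number (runnable processes, plus, on
             Linux, processes in uninterruptible sleep)
   period  : averaging window in seconds (60, 300 or 900)
   Sampled every 5 seconds, the old average decays by a factor
   of e once `period` seconds have elapsed. */
static double damp_load(double old_avg, double n, double period)
{
    double decay = exp(-5.0 / period);      /* weight of the past */
    return old_avg * decay + n * (1.0 - decay);
}
</syntaxhighlight>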
 
=== Interpretation ===
For example, one can interpret a load average of "1.73 0.60 7.98" on a single-CPU system as:
 
* During the last minute, the system was overloaded by 73% on average: 1.73 runnable processes, so that 0.73 processes, on average, had to wait for a turn on the single CPU.
* During the last 5 minutes, the CPU was idle 40% of the time, on average.
* During the last 15 minutes, the system was overloaded by 698% on average: 7.98 runnable processes, so that 6.98 processes, on average, had to wait for a turn on the single CPU.
 
This means that this system (CPU, disk, memory, etc.) could have handled all of the work scheduled for the last minute if it were 1.73 times as fast.
 
In a system with four CPUs, a load average of 3.73 would indicate that there were, on average, 3.73 processes ready to run, and each one could be scheduled onto a CPU.
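Because the figure scales with the number of processors, it is common to normalize the load average by the CPU count before comparing machines. A sketch, assuming the POSIX <code>sysconf()</code> call and treating 1.0 per CPU as the conventional saturation threshold:

<syntaxhighlight lang="c">
/* Sketch: report the 1-minute load average per online CPU.
   A value above 1.0 means runnable processes had to wait. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    double avg[3];
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);  /* online logical CPUs */

    if (getloadavg(avg, 1) < 1 || ncpu < 1)
        return EXIT_FAILURE;

    printf("1-minute load per CPU: %.2f\n", avg[0] / (double)ncpu);
    return EXIT_SUCCESS;
}
</syntaxhighlight>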
 
== CPU load vs CPU utilization ==
The comparative study of different load indices carried out by Ferrari et al.<ref name="Empirical load">Ferrari, Domenico; and Zhou, Songnian; "[http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-353.pdf An Empirical Investigation of Load Indices For Load Balancing Applications]", Proceedings of Performance '87, the 12th International Symposium on Computer Performance Modeling, Measurement, and Evaluation, North Holland Publishers, Amsterdam, the Netherlands, 1988, pp. 515–528</ref> reported that load information based on CPU queue length performs much better in load balancing than CPU utilization, probably because when a host is heavily loaded its CPU utilization is likely to be close to 100%, at which point it can no longer reflect the exact load level. In contrast, CPU queue length can directly reflect the amount of load on a CPU. As an example, two systems, one with 3 and the other with 6 processes in the queue, are both very likely to have utilizations close to 100%, although they obviously differ in load.{{original research inline|date=May 2013}}
 
== Reckoning CPU load ==