Floating point operations per second: Difference between revisions

Added Nvidia Hopper
Partial restore with corrected model
 
(21 intermediate revisions by 14 users not shown)
Line 5:
'''Floating point operations per second''' ('''FLOPS''', '''flops''' or '''flop/s''') is a measure of [[computer performance]] in [[computing]], useful in fields of scientific computations that require [[floating-point]] calculations.<ref>{{cite web |title=Understand measures of supercomputer performance and storage system capacity |url=https://kb.iu.edu/d/apeq |website=kb.iu.edu |access-date=23 March 2024}}</ref>
 
For such cases, it is a more accurate measure than [[instructions per second]].{{cn|date=March 2024}}
 
==Floating-point arithmetic==
{{Anchor|multipliers}}
{| class="wikitable floatright sortable"
|+ Multipliers for flops
Line 23 ⟶ 24:
|-
| [[Giga-|giga]]FLOPS
| GFLOPS<ref>{{cite web | title = GPU GFLOPS Statistics 2007-2025: NVIDIA AMD Intel | url = https://gpus.axiomgaming.net/gflops-statistics | website = Axiom Gaming | publisher = Axiom Gaming | access-date = 14 August 2025}}</ref>
| 10<sup>9</sup>
|-
Line 72 ⟶ 73:
: <math>\text{FLOPS} = \text{cores} \times \frac{\text{cycles}}{ \text{second}} \times \frac{\text{FLOPs}}{\text{cycle}}.</math>
 
FLOPS can be recorded at different levels of precision. For example, the [[TOP500]] supercomputer list ranks computers by 64-bit ([[double-precision floating-point format]]) operations per second, abbreviated to ''FP64''.<ref name="top500faq">{{cite web |title=FREQUENTLY ASKED QUESTIONS |url=https://www.top500.org/resources/frequently-asked-questions/ |website=top500.org |access-date=June 23, 2020}}</ref> Similar measures are available for [[Single-precision floating-point format|32-bit]] (''FP32'') and [[Half-precision floating-point format|16-bit]] (''FP16'') operations.
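
As a rough illustration of the formula above, the following sketch computes a theoretical peak at two precisions. The core count and clock frequency are hypothetical values chosen for the arithmetic. The per-cycle figures (32 and 64) are the values this article's table lists for AMD Zen 5, assuming those columns are FP64 and FP32 respectively. The function name <code>peak_flops</code> is illustrative only.

```python
def peak_flops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: cores x (cycles / second) x (FLOPs / cycle)."""
    return cores * clock_hz * flops_per_cycle


# Hypothetical processor: 8 cores at 4.0 GHz (assumed values, not a real part).
CORES = 8
CLOCK_HZ = 4.0e9

# Per-core, per-cycle throughput, taken from the Zen 5 table row (assumed
# to be the FP64 and FP32 columns).
FP64_PER_CYCLE = 32
FP32_PER_CYCLE = 64

fp64_peak = peak_flops(CORES, CLOCK_HZ, FP64_PER_CYCLE)
fp32_peak = peak_flops(CORES, CLOCK_HZ, FP32_PER_CYCLE)

print(f"FP64 peak: {fp64_peak / 1e12:.3f} TFLOPS")
print(f"FP32 peak: {fp32_peak / 1e12:.3f} TFLOPS")
```

This is an upper bound only: it assumes every core issues its maximum number of floating-point operations on every cycle, which sustained workloads rarely achieve.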
 
{{anchor|FLOPSforProcessors}}
Line 90 ⟶ 91:
|-
|[[Intel 80486]]
|[[x87]] (80-bit)
| {{dunno}}
|0.128<ref name=":1" />
Line 99 ⟶ 100:
*Intel [[P6 (microarchitecture)|P6]] [[Pentium Pro]]
}}
|[[x87]] (80-bit)
| {{dunno}}
|0.5<ref name=":1">{{Cite web|title=home.iae.nl |url=http://home.iae.nl/users/mhx/flops_4.tbl|access-date=|website=}}</ref>
Line 108 ⟶ 109:
*Intel [[P6 (microarchitecture)|P6]] [[Pentium II]]
}}
|[[x87]] (80-bit)
| {{dunno}}
|1<ref name=":0">{{Cite web|title=Computing Power throughout History|url=https://www.alternatewars.com/BBOW/Computing/Computing_Power.htm|access-date=2021-02-13|website=alternatewars.com}}</ref>
Line 200 ⟶ 201:
|{{ublist|
|AMD [[Zen (microarchitecture)|Zen]]<br/>(Ryzen 1000 series, Threadripper 1000 series, Epyc [[Epyc|Naples]])
|AMD [[Zen+]]<ref name="tpeak_jos"/><ref>{{Cite web | url=http://www.agner.org/optimize/blog/read.php?i=838 | title=Agner's CPU blog - Test results for AMD Ryzen}}</ref><ref>https://arstechnica.com/gadgets/2017/03/amds-moment-of-zen-finally-an-architecture-that-can-compete/2/ "each core now has a pair of 128-bit FMA units of its own"</ref><ref>{{cite conference |url=https://www.hotchips.org/wp-content/uploads/hc_archives/hc28/HC28.23-Tuesday-Epub/HC28.23.90-High-Perform-Epub/HC28.23.930-X86-core-MikeClark-AMD-final_v2-28.pdf#page=7 |title=A New x86 Core Architecture for the Next Generation of Computing |author=Mike Clark |date=August 23, 2016 |publisher=AMD |conference=HotChips 28 |access-date=October 8, 2017 |archive-date=July 31, 2020 |archive-url=https://web.archive.org/web/20200731171730/https://www.hotchips.org/wp-content/uploads/hc_archives/hc28/HC28.23-Tuesday-Epub/HC28.23.90-High-Perform-Epub/HC28.23.930-X86-core-MikeClark-AMD-final_v2-28.pdf#page=7 |url-status=dead }} [https://web.archive.org/web/20161209125020/http://images.anandtech.com/doci/10591/HC28.AMD.Mike%20Clark.final-page-007.jpg page 7]</ref><br/>(Ryzen 2000 series, Threadripper 2000 series)
}}
| [[Advanced Vector Extensions|AVX2]] & [[FMA instruction set|FMA]]<br/>(128-bit, 256-bit decoding)<ref>{{Cite web |title=The microarchitecture of Intel and AMD CPUs |url=https://www.agner.org/optimize/microarchitecture.pdf}}</ref>
Line 211 ⟶ 212:
| [[Advanced Vector Extensions|AVX2]] & [[FMA instruction set|FMA]] (256-bit)
| 16 || 32 || 0
|-
|{{ublist|
|AMD [[Zen 4]]<br/>(Ryzen 7000 series, Threadripper 7000 series, Epyc [[Epyc|Genoa]], [[Epyc|Bergamo]], [[Epyc|Siena]])
}}
| [[Advanced Vector Extensions|AVX-512]] & [[FMA instruction set|FMA]] (256-bit)
| 16 || 32 || 0
|-
|{{ublist|
|AMD [[Zen 5]]<ref>{{Cite web | url=https://community.amd.com/t5/server-processors/leadership-hpc-performance-with-5th-generation-amd-epyc/ba-p/739498 | title=Leadership HPC Performance with 5th Generation AMD EPYC Processors}}</ref><br/>(Ryzen 9000 series, Threadripper 9000 series, Epyc [[Epyc|Turin]])
}}
| [[Advanced Vector Extensions|AVX-512]] & [[FMA instruction set|FMA]] (512-bit)
| 32 || 64 || 0
|-
! colspan="5" |ARM CPU
Line 388 ⟶ 402:
|[[ENIAC]] @ 100&nbsp;kHz in 1945
|
|385<ref>ENIAC @ 100&nbsp;kHz with 385 Flops {{Cite web|title=Computers of Yore|url=https://www.clear.rice.edu/comp201/08-spring/lectures/lec02/computers.shtml|access-date=2021-02-26|website=clear.rice.edu}}</ref><br/>(~{{val|2.6|e=-3|u=FLOPS|upl=W}})<ref>consumed 150 kilowatts of power {{Cite web|title=National Museum of the United States Army|url=https://www.thenmusa.org/armyinnovations/innovationeniaccomputer/|access-date=2025-08-08}}</ref>
|
|
Line 451 ⟶ 465:
==Performance records==
===Single computer records===
The [[NEC SX-2]], a [[supercomputer]] developed by [[NEC]] in 1983, achieved gigaFLOPS (GFLOPS) performance with 1.3 [[billion]] FLOPS.<ref>{{Cite web |title=【NEC】 SX-1, SX-2 |url=https://museum.ipsj.or.jp/en/computer/super/0008.html |access-date=2025-08-25 |website=IPSJ Computer Museum |publisher=[[Information Processing Society of Japan]]}}</ref>
 
In June 1997, [[Intel]]'s [[ASCI Red]] was the world's first computer to achieve one teraFLOPS and beyond. Sandia director Bill Camp said that ASCI Red had the best reliability of any supercomputer ever built, and "was supercomputing's high-water mark in longevity, price, and performance".<ref name="jacobsequity.com">{{cite web |title=Sandia's ASCI Red, world's first teraflop supercomputer, is decommissioned |url=http://www.jacobsequity.com/ASCI%20Red%20Supercomputer.pdf |access-date=November 17, 2011 |archive-url=https://web.archive.org/web/20101105131112/http://www.jacobsequity.com/ASCI%20Red%20Supercomputer.pdf |archive-date=November 5, 2010 }}</ref>
 
Line 553 ⟶ 569:
|-
| 1964
| $2.3B
| ${{Inflation|US|2.3|1964|r=3|fmt=c}}B
| Base model [[CDC 6600]] price: $6,891,300.
Line 601 ⟶ 617:
|-
| {{sort|2012/08|August 2012}}
| 75.00¢
| {{Inflation|US|75.00|2012|r=2|fmt=c}}¢
| Quad [[Radeon HD 7000 series|AMD Radeon 7970]] System
| A quad [[AMD]] [[Radeon HD 7000 series|Radeon 7970]] desktop computer reaching 16 TFLOPS of single-precision, 4 TFLOPS of double-precision computing performance. Total system cost was $3000; built using only commercially available hardware.<ref>{{cite web |url=http://www.overclock3d.net/reviews/gpu_displays/hd7970_quadfire_eyefinity_review/12 |title=HD7970 Quadfire Eyefinity Review |date=January 9, 2012 |website=OC3D.net |author=Tom Logan}}</ref>
Line 631 ⟶ 647:
|-
| {{sort|2017/06|June 2017}}
| 6.00¢
| {{Inflation|US|6.00|2017|r=2|fmt=c}}¢
| [[Zen (first generation)|AMD Ryzen 7 1700]] & [[Radeon Pro|AMD Radeon Vega Frontier Edition]] system
Line 730 ⟶ 746:
* [[Moore's law]]
* [[Multiply–accumulate operation]]
* [[Performance per watt#FLOPS per watt|Performance per watt § FLOPS per watt]]
* [[SPECfp]]
* [[SPECint]]