Floating point operations per second

'''Floating point operations per second''' ('''FLOPS''', '''flops''' or '''flop/s''') is a measure of [[computer performance]] in [[computing]], useful in fields of scientific computations that require [[floating-point]] calculations.<ref>{{cite web |title=Understand measures of supercomputer performance and storage system capacity |url=https://kb.iu.edu/d/apeq |website=kb.iu.edu |access-date=23 March 2024}}</ref>
 
For such cases, it is a more accurate measure than [[instructions per second]].{{cn|date=March 2024}}
 
==Floating-point arithmetic==
{{Anchor|multipliers}}
{| class="wikitable floatright sortable"
|+ Multipliers for flops
|-
| [[Giga-|giga]]FLOPS
| GFLOPS<ref>{{cite web | title = GPU GFLOPS Statistics 2007-2025: NVIDIA AMD Intel | url = https://gpus.axiomgaming.net/gflops-statistics | website = Axiom Gaming | publisher = Axiom Gaming | access-date = 14 August 2025}}</ref>
| 10<sup>9</sup>
|-
: <math>\text{FLOPS} = \text{cores} \times \frac{\text{cycles}}{ \text{second}} \times \frac{\text{FLOPs}}{\text{cycle}}.</math>
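
The formula above can be sketched in a few lines of Python. The function name <code>peak_flops</code> and the example processor (8 cores, 3.0 GHz, 16 FLOPs per cycle per core) are hypothetical values chosen for illustration, not figures taken from this article; the result is a theoretical peak, not a measured rate.

```python
def peak_flops(cores: int, clock_hz: float, flops_per_cycle: float) -> float:
    """Theoretical peak: cores x (cycles / second) x (FLOPs / cycle)."""
    return cores * clock_hz * flops_per_cycle


# Hypothetical processor (assumed values, for illustration only):
# 8 cores, 3.0 GHz clock, 16 floating-point operations per cycle per core.
example = peak_flops(cores=8, clock_hz=3.0e9, flops_per_cycle=16)
print(f"{example / 1e9:.0f} GFLOPS")  # prints "384 GFLOPS"
```

Sustained performance on real workloads is normally lower than this figure, because the formula assumes every core issues its maximum number of floating-point operations on every cycle.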
 
FLOPS can be recorded at different levels of precision. For example, the [[TOP500]] supercomputer list ranks computers by 64-bit ([[double-precision floating-point format]]) operations per second, abbreviated to ''FP64''.<ref name="top500faq">{{cite web |title=FREQUENTLY ASKED QUESTIONS |url=https://www.top500.org/resources/frequently-asked-questions/ |website=top500.org |access-date=June 23, 2020}}</ref> Similar measures are available for [[Single-precision floating-point format|32-bit]] (''FP32'') and [[Half-precision floating-point format|16-bit]] (''FP16'') operations.
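
As a rough illustration of what the three precisions mean, the following Python sketch uses the standard <code>struct</code> module, whose format codes <code>e</code>, <code>f</code> and <code>d</code> correspond to the IEEE 754 16-bit, 32-bit and 64-bit formats. The helper names <code>width_bits</code> and <code>round_trip</code> are assumptions made for this example. It shows only storage width and rounding error; it does not measure FLOPS.

```python
import struct

# struct format codes for the IEEE 754 binary16, binary32 and binary64 formats.
FORMATS = {"FP16": "e", "FP32": "f", "FP64": "d"}


def width_bits(label: str) -> int:
    """Storage width in bits of the given precision."""
    return struct.calcsize("<" + FORMATS[label]) * 8


def round_trip(value: float, label: str) -> float:
    """Store a value at the given precision and read it back as a Python float."""
    code = "<" + FORMATS[label]
    return struct.unpack(code, struct.pack(code, value))[0]


for label in FORMATS:
    stored = round_trip(0.1, label)
    print(label, width_bits(label), "bits, 0.1 is stored as", repr(stored))
```

The narrower the format, the larger the rounding error on a value such as 0.1, which is one reason a FLOPS figure is only meaningful together with the precision at which it was measured.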
 
{{anchor|FLOPSforProcessors}}
|-
|[[Intel 80486]]
|[[x87]] (80-bit)
| {{dunno}}
|0.128<ref name=":1" />
*Intel [[P6 (microarchitecture)|P6]] [[Pentium Pro]]
}}
|[[x87]] (80-bit)
| {{dunno}}
|0.5<ref name=":1">{{Cite web|title=home.iae.nl |url=http://home.iae.nl/users/mhx/flops_4.tbl|access-date=|website=}}</ref>
*Intel [[P6 (microarchitecture)|P6]] [[Pentium II]]
}}
|[[x87]] (80-bit)
| {{dunno}}
|1<ref name=":0">{{Cite web|title=Computing Power throughout History|url=https://www.alternatewars.com/BBOW/Computing/Computing_Power.htm|access-date=2021-02-13|website=alternatewars.com}}</ref>
|[[Advanced Vector Extensions|AVX]] (128-bit)<br/>(Bulldozer, Steamroller)
|[[AVX2]] (128-bit) (Excavator)
|[[FMA instruction set|FMA3]] (Bulldozer)<ref>{{Cite web|url=https://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf|title=New instructions support for Bulldozer (FMA3) and Piledriver (FMA3+4 and CVT, BMI, TB M)}}</ref>
|[[FMA instruction set|FMA3/4]] (Piledriver, Excavator)
}}
|{{ublist|
|AMD [[Zen (microarchitecture)|Zen]]<br/>(Ryzen 1000 series, Threadripper 1000 series, Epyc [[Epyc|Naples]])
|AMD [[Zen+]]<ref name="tpeak_jos"/><ref>{{Cite web | url=http://www.agner.org/optimize/blog/read.php?i=838 | title=Agner's CPU blog - Test results for AMD Ryzen}}</ref><ref>https://arstechnica.com/gadgets/2017/03/amds-moment-of-zen-finally-an-architecture-that-can-compete/2/ "each core now has a pair of 128-bit FMA units of its own"</ref><ref>{{cite conference |url=https://www.hotchips.org/wp-content/uploads/hc_archives/hc28/HC28.23-Tuesday-Epub/HC28.23.90-High-Perform-Epub/HC28.23.930-X86-core-MikeClark-AMD-final_v2-28.pdf#page=7 |title=A New x86 Core Architecture for the Next Generation of Computing |author=Mike Clark |date=August 23, 2016 |publisher=AMD |conference=HotChips 28 |access-date=October 8, 2017 |archive-date=July 31, 2020 |archive-url=https://web.archive.org/web/20200731171730/https://www.hotchips.org/wp-content/uploads/hc_archives/hc28/HC28.23-Tuesday-Epub/HC28.23.90-High-Perform-Epub/HC28.23.930-X86-core-MikeClark-AMD-final_v2-28.pdf#page=7 |url-status=dead }} [https://web.archive.org/web/20161209125020/http://images.anandtech.com/doci/10591/HC28.AMD.Mike%20Clark.final-page-007.jpg page 7]</ref><br/>(Ryzen 2000 series, Threadripper 2000 series)
}}
| [[Advanced Vector Extensions|AVX2]] & [[FMA instruction set|FMA]]<br/>(128-bit, 256-bit decoding)<ref>{{Cite web |title=The microarchitecture of Intel and AMD CPUs |url=https://www.agner.org/optimize/microarchitecture.pdf}}</ref>
| [[Advanced Vector Extensions|AVX2]] & [[FMA instruction set|FMA]] (256-bit)
| 16 || 32 || 0
|-
|{{ublist|
|AMD [[Zen 4]]<br/>(Ryzen 7000 series, Threadripper 7000 series, Epyc [[Epyc|Genoa]], [[Epyc|Bergamo]], [[Epyc|Siena]])
}}
| [[Advanced Vector Extensions|AVX-512]] & [[FMA instruction set|FMA]] (256-bit)
| 16 || 32 || 0
|-
|{{ublist|
|AMD [[Zen 5]]<ref>{{Cite web | url=https://community.amd.com/t5/server-processors/leadership-hpc-performance-with-5th-generation-amd-epyc/ba-p/739498 | title=Leadership HPC Performance with 5th Generation AMD EPYC Processors}}</ref><br/>(Ryzen 9000 series, Threadripper 9000 series, Epyc [[Epyc|Turin]])
}}
| [[Advanced Vector Extensions|AVX-512]] & [[FMA instruction set|FMA]] (512-bit)
| 32 || 64 || 0
|-
! colspan="5" |ARM CPU
}}
| [[Parallel Thread Execution|PTX]] || {{frac|1|32}} || {{nowrap|2&nbsp;(FP32) + 0&nbsp;(INT32)}}<br/>''or''<br/>{{nowrap|1&nbsp;(FP32) + 1&nbsp;(INT32)}} || 8
|-
| Nvidia [[Hopper (microarchitecture)|Hopper]] || [[Parallel Thread Execution|PTX]] || 2 || 2&nbsp;(FP32) + 1&nbsp;(INT32) || 32
|-
! colspan="5" |AMD GPU
|[[ENIAC]] @ 100&nbsp;kHz in 1945
|
|0.00385<ref>ENIAC @ 100&nbsp;kHz with 385 FLOPS {{Cite web|title=Computers of Yore|url=https://www.clear.rice.edu/comp201/08-spring/lectures/lec02/computers.shtml|access-date=2021-02-26|website=clear.rice.edu}}</ref><br/>(~{{val|2.6|e=-3|u=FLOPS|upl=W}})<ref>consumed 150 kilowatts of power {{Cite web|title=National Museum of the United States Army|url=https://www.thenmusa.org/armyinnovations/innovationeniaccomputer/|access-date=2025-08-08}}</ref>
|
|
==Performance records==
===Single computer records===
The [[NEC SX-2]], a [[supercomputer]] developed by [[NEC]] in 1983, achieved gigaFLOPS (GFLOPS) performance with 1.3 [[billion]] FLOPS.<ref>{{Cite web |title=【NEC】 SX-1, SX-2 |url=https://museum.ipsj.or.jp/en/computer/super/0008.html |access-date=2025-08-25 |website=IPSJ Computer Museum |publisher=[[Information Processing Society of Japan]]}}</ref>
 
In June 1997, [[Intel]]'s [[ASCI Red]] was the world's first computer to achieve one teraFLOPS and beyond. Sandia director Bill Camp said that ASCI Red had the best reliability of any supercomputer ever built, and "was supercomputing's high-water mark in longevity, price, and performance".<ref name="jacobsequity.com">{{cite web |title=Sandia's ASCI Red, world's first teraflop supercomputer, is decommissioned |url=http://www.jacobsequity.com/ASCI%20Red%20Supercomputer.pdf |access-date=November 17, 2011 |archive-url=https://web.archive.org/web/20101105131112/http://www.jacobsequity.com/ASCI%20Red%20Supercomputer.pdf |archive-date=November 5, 2010 }}</ref>
 
On October 25, 2007, [[NEC]] Corporation of Japan issued a press release announcing its SX series model [[SX-9]],<ref>{{cite news|url=http://www.nec.co.jp/press/en/0710/2501.html|title=NEC Launches World's Fastest Vector Supercomputer, SX-9|date=October 25, 2007|publisher=NEC|access-date=July 8, 2008}}</ref> claiming it to be the world's fastest vector supercomputer. The [[SX-9]] features the first CPU capable of a peak vector performance of 102.4 gigaFLOPS per single core.
 
On February 4, 2008, the [[National Science Foundation|NSF]] and the [[University of Texas at Austin]] opened full scale research runs on an [[AMD]], [[Sun Microsystems|Sun]] supercomputer named [[Texas Advanced Computing Center#Ranger|Ranger]],<ref>{{cite web
|url = http://www.tacc.utexas.edu/resources/hpcsystems/
|title = University of Texas at Austin, Texas Advanced Computing Center
Line 485 ⟶ 502:
In October 2010, China unveiled the [[Tianhe-1]], a supercomputer that operates at a peak computing rate of 2.5 petaFLOPS.<ref>{{cite news| url=https://www.bbc.co.uk/news/technology-11644252 | publisher=BBC News | title=China claims supercomputer crown | date=October 28, 2010}}</ref><ref>{{cite web|last=Dillow |first=Clay |url=http://www.popsci.com/technology/article/2010-10/china-unveils-2507-petaflop-supercomputer-worlds-fastest |title=China Unveils 2507 Petaflop Supercomputer, the World's Fastest |website=Popsci.com |date=October 28, 2010 |access-date=February 9, 2012 }}</ref>
 
{{As of|2010}} the fastest PC [[microprocessor|processor]] reached 109&nbsp;gigaFLOPS ([[Intel Core#Core i7|Intel Core i7]] [[Gulftown (microprocessor)|980 XE]])<ref>{{Cite web |url=http://techgage.com/article/intels_core_i7-980x_extreme_edition_-_ready_for_sick_scores/8 |title=Intel's Core i7-980X Extreme Edition – Ready for Sick Scores?: Mathematics: Sandra Arithmetic, Crypto, Microsoft Excel |website=Techgage |date=March 10, 2010 |access-date=February 9, 2012}}</ref> in double precision calculations. [[Graphics processing unit|GPU]]s are considerably more powerful. For example, [[Nvidia Tesla]] C2050 GPU computing processors perform around 515 gigaFLOPS<ref name="nvidia.com">{{cite web|url=http://www.nvidia.com/object/product_tesla_C2050_C2070_us.html |title=NVIDIA Tesla Personal Supercomputer |publisher=Nvidia.com |access-date=February 9, 2012}}</ref> in double precision calculations, and the AMD FireStream 9270 peaks at 240 gigaFLOPS.<ref name="ati.amd.com">{{cite web|url=https://www.amd.com/us/products/workstation/firestream/firestream-9270/pages/firestream-9270.aspx |title=AMD FireStream 9270 GPU Compute Accelerator |publisher=Amd.com |access-date=February 9, 2012}}</ref>
 
In November 2011, it was announced that Japan had achieved 10.51 petaFLOPS with its [[K computer]].<ref name="Petaflops">{{cite web|url=http://www.fujitsu.com/global/news/pr/archives/month/2011/20111102-02.html |title='K computer' Achieves Goal of 10 Petaflops |publisher=Fujitsu.com |access-date=February 9, 2012}}</ref> It has 88,128 [[SPARC64 VIIIfx]] [[central processing unit|processor]]s in 864 racks, with theoretical performance of 11.28 petaFLOPS. It is named after the Japanese word "[[wikt:京#Japanese|kei]]", which stands for 10 [[1,000,000,000,000,000|quadrillion]],<ref>See [[Japanese numerals#Large numbers|Japanese numbers]]</ref> corresponding to the target speed of 10 petaFLOPS.
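
The K computer figures quoted above can be cross-checked with a short calculation. This Python sketch uses only the numbers stated in the paragraph (10.51 petaFLOPS achieved, 11.28 petaFLOPS theoretical, 88,128 processors); the variable names and the "efficiency" ratio are illustrative choices, not terminology taken from the cited sources.

```python
ACHIEVED_FLOPS = 10.51e15      # 10.51 petaFLOPS achieved
THEORETICAL_FLOPS = 11.28e15   # 11.28 petaFLOPS theoretical performance
PROCESSORS = 88_128            # SPARC64 VIIIfx processors

# Fraction of the theoretical performance that was actually achieved.
efficiency = ACHIEVED_FLOPS / THEORETICAL_FLOPS

# Theoretical performance contributed by each processor, in gigaFLOPS.
per_processor_gflops = THEORETICAL_FLOPS / PROCESSORS / 1e9

print(f"achieved / theoretical: {efficiency:.1%}")             # prints 93.2%
print(f"per processor: {per_processor_gflops:.1f} gigaFLOPS")  # prints 128.0
```

In other words, the machine reached roughly 93% of its theoretical performance, and each processor accounts for about 128 gigaFLOPS of that theoretical total.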
|${{Inflation|US|1.265|1945|r=3|fmt=c}}T
|[[ENIAC]]: {{US$|long=no|487000}} in 1945 and ${{Inflation|US|487000|1945|fmt=c|r=-3}} in 2023.
|{{US$|long=no|487000}} / {{val|0.000000385|ul=GFLOPS}}. [[Vacuum-tube computer|First-generation]] ([[vacuum tube]]-based) electronic digital computer.
|-
| 1961
| ${{Inflation|US|18.672|1961|r=3|fmt=c}}B
| A basic installation of the [[IBM 7030 Stretch]] cost {{US$|7.78 million}} at the time.
| The [[IBM 7030 Stretch]] performs one floating-point multiply every {{val|2.4|ul=microseconds}}.<ref>{{cite web|url=http://computer-history.info/Page4.dir/pages/IBM.7030.Stretch.dir/ |title=The IBM 7030 (STRETCH) |publisher=Norman Hardy |access-date=February 24, 2017}}</ref> [[Transistor computer|Second-generation]] (discrete [[Transistor computer|transistor]]-based) computer.
|-
| 1964
| $2.3B
| ${{Inflation|US|2.3|1964|r=3|fmt=c}}B
| Base model [[CDC 6600]] price: $6,891,300.
|-
| {{sort|2012/08|August 2012}}
| 75.00¢
| ${{Inflation|US|.75|2012|r=2|fmt=c}}¢
| Quad [[Radeon HD 7000 series|AMD Radeon 7970]] System
| A quad [[AMD]] [[Radeon HD 7000 series|Radeon 7970]] desktop computer reaching 16 TFLOPS of single-precision and 4 TFLOPS of double-precision computing performance. Total system cost was $3000; it was built using only commercially available hardware.<ref>{{cite web |url=http://www.overclock3d.net/reviews/gpu_displays/hd7970_quadfire_eyefinity_review/12 |title=HD7970 Quadfire Eyefinity Review |date=January 9, 2012 |website=OC3D.net |author=Tom Logan}}</ref>
|-
| {{sort|2017/07|June 2017}}
| 6.00¢
| {{Inflation|US|6.00|2017|r=2|fmt=c}}¢
| [[Zen (first generation)|AMD Ryzen 7 1700]] & [[Radeon Pro|AMD Radeon Vega Frontier Edition]] system
* [[Moore's law]]
* [[Multiply–accumulate operation]]
* [[Performance per watt#FLOPS per watt|Performance per watt § FLOPS per watt]]
* [[SPECfp]]
* [[SPECint]]