Symmetric multiprocessing: Difference between revisions

{{Short description|Equal sharing of all resources by multiple identical processors}}
{{Multiple issues|
{{more citations needed|date=November 2012}}
{{Expert needed|Computing|date=June 2025}}
}}
[[File:SMP - Symmetric Multiprocessor System.svg|thumb|upright=2|Diagram of a symmetric multiprocessing system]]
'''Symmetric multiprocessing''' or '''shared-memory multiprocessing'''<ref>{{cite book |last1=Patterson |first1=David |last2=Hennessy |first2=John |author-link1=David Patterson (computer scientist) |author-link2=John L. Hennessy |date=2018 |title=Computer Organisation and Design: The Hardware/Software Interface |location=Cambridge, United States |publisher=Morgan Kaufmann |page=509 |isbn=978-0-12-812275-4|edition=RISC-V }}</ref> ('''SMP''') is a [[multiprocessor]] computer hardware and software architecture in which two or more identical processors are connected to a single, shared [[main memory]], have full access to all input and output devices, and are controlled by a single operating system instance that treats all processors equally, reserving none for special purposes. Most multiprocessor systems today use an SMP architecture. In the case of [[multi-core processor]]s, the SMP architecture applies to the cores, treating them as separate processors.
 
Professor John D. Kubiatowicz considers traditional SMP systems to contain processors without caches.<ref>{{cite conference|url=https://parlab.eecs.berkeley.edu/2013bootcampagenda|conference=2013 Short Course on Parallel Programming|author=John Kubiatowicz|title=Introduction to Parallel Architectures and Pthreads}}</ref> Culler and Pal-Singh in their 1998 book "Parallel Computer Architecture: A Hardware/Software Approach" state: "The term SMP is widely used but causes a bit of confusion. [...] The more precise description of what is intended by SMP is a shared memory multiprocessor where the cost of accessing a memory location is the same for all processors; that is, it has uniform access costs when the access actually is to memory. If the location is cached, the access will be faster, but cache access times and memory access times are the same on all processors."<ref>{{cite book|isbn=978-1-55860-343-1|author1=David Culler|author-link1=David Culler|author2=Jaswinder Pal Singh|author3=Anoop Gupta|title=Parallel Computer Architecture: A Hardware/Software Approach|url=https://books.google.com/books?id=MHfHC4Wf3K0C&pg=PA32|page=47|year=1999|publisher=[[Morgan Kaufmann]]}}</ref>
 
SMP systems are ''[[multiprocessing#Processor coupling|tightly coupled multiprocessor]] systems'' with a pool of homogeneous processors running independently of each other. Each processor, executing different programs and working on different sets of data, can share common resources (memory, I/O devices, the interrupt system and so on) that are connected using a [[system bus]] or a [[crossbar switch|crossbar]].
Processors may be interconnected using buses, [[crossbar switch]]es or on-chip mesh networks. The bottleneck in the scalability of SMP using buses or crossbar switches is the bandwidth and power consumption of the interconnect among the various processors, the memory, and the disk arrays. Mesh architectures avoid these bottlenecks, and provide nearly linear scalability to much higher processor counts at the sacrifice of programmability:
 
<blockquote>Serious programming challenges remain with this kind of architecture because it requires two distinct modes of programming, one for the CPUs themselves and one for the interconnect between the CPUs. A single programming language would have to be able to not only partition the workload, but also comprehend the memory locality, which is severe in a mesh-based architecture.<ref name="AutoMQ-1"/></blockquote>
 
SMP systems allow any processor to work on any task no matter where the data for that task is located in memory, provided that each task in the system is not in execution on two or more processors at the same time. With proper [[operating system]] support, SMP systems can easily move tasks between processors to balance the workload efficiently.
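
The scheduling property described above can be illustrated with a minimal sketch, in which identical worker threads stand in for processors and pull tasks from one shared queue. The function name <code>run_on_symmetric_workers</code> and its parameters are hypothetical, chosen only for this illustration; a real operating system scheduler is considerably more involved, and in CPython the global interpreter lock means these threads do not execute Python bytecode in parallel.

```python
import queue
import threading


def run_on_symmetric_workers(tasks, worker_count):
    """Run every callable in `tasks` exactly once on whichever worker is free.

    Hypothetical illustration: workers are interchangeable, and a task is
    removed from the shared queue before it runs, so no task is ever in
    execution on two workers at the same time.
    """
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))

    results = [None] * len(tasks)   # result of each task, by task index
    ran_on = [None] * len(tasks)    # which worker happened to run each task

    def worker(worker_id):
        while True:
            try:
                # get_nowait() hands each queued item to exactly one caller.
                index, task = pending.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            results[index] = task()
            ran_on[index] = worker_id

    threads = [
        threading.Thread(target=worker, args=(worker_id,))
        for worker_id in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, ran_on
```

Because no task is bound to a particular worker, the assignment recorded in <code>ran_on</code> may differ from run to run, which loosely mirrors how an SMP operating system is free to place a task on any processor.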
 
==History==
{{Expand section|date=April 2013}}
The earliest production system with multiple identical processors was the Burroughs [[B5000]], which was functional around 1961. However, at run time this was [[Asymmetric multiprocessing#Burroughs B5000 and B5500|asymmetric]], with one processor restricted to application programs while the other processor mainly handled the operating system and hardware interrupts. The Burroughs D825 first implemented SMP in 1962.<ref>{{cite web|url=http://ei.cs.vt.edu/~history/Parallel.html|title=The History of the Development of Parallel Computing|author=Gregory V. Wilson|date=October 1994}}</ref><ref>{{cite web|url=http://ed-thelen.org/comp-hist/BRL64-b.html#BURROUGHS-D825|title=A Fourth Survey of Domestic Electronic Digital Computing Systems|author=Martin H. Weik|publisher=[[Ballistic Research Laboratories]], [[Aberdeen Proving Grounds]]|at=Burroughs D825|date=January 1964}}</ref>
 
| version = Fourth Edition
| date = September 1968
| url = http://www.bitsavers.org/pdf/ibm/360/functional_characteristics/A22-6884-3_360-65_funcChar.pdf}}</ref> and 67–2.<ref>{{cite book
| publisher = IBM
| title = IBM System/360 Model 67 Functional Characteristics
| id = GA27-2719-2
| url = http://www.bitsavers.org/pdf/ibm/360/functional_characteristics/GA27-2719-2_360-67_funcChar.pdf
| version = Third Edition
| date = February 1972}}</ref> The operating systems that ran on these machines were [[OS/360]] M65MP<ref>[http://doi.acm.org/10.1145/800186.810634 M65MP: An Experiment in OS/360 multiprocessing]</ref> and [[TSS/360]]. Other software developed at universities, notably the [[Michigan Terminal System]] (MTS), used both CPUs. Both processors could access data channels and initiate I/O. In OS/360 M65MP, peripherals could generally be attached to either processor since the operating system kernel ran on both processors (though with a "big lock" around the I/O handler).<ref>{{cite book |url=http://bitsavers.org/pdf/ibm/360/os/R21.7_Apr73/plm/GY28-6616-9_OS_IO_Superv_PLM_R21.7_Apr73.pdf |title=Program Logic Manual, OS I/O Supervisor Logic, Release 21 (R21.7) |publisher=IBM |id=GY28-6616-9 |edition=Tenth |date=April 1973}}</ref> The MTS supervisor (UMMPS) can run on both CPUs of the IBM System/360 Model 67–2. Supervisor locks were small and used to protect individual common data structures that might be accessed simultaneously from either CPU.<ref>[https://1a9f2076-a-62cb3a1a-s-sites.googlegroups.com/site/michiganterminalsystem/documentation/documents/timeSharingSupervisorPrograms-1971.pdf?attachauth=ANoY7crPBadRVtxTmN8sqSjFc3xC84Q_pDpvpRo7VRWz0_Ql-UKQ2SVe6hJ7lVOjGZbLkOSXco8c9_ZI6TmQZS8EpBTMlByIPM4iByyUXlXE__YfWN0jqwIQglhyvR0oSxl0I_C0JenDItLzN4btLtkug9HSHRX1s-WtlkSQ-pzJLpczJYsuzTvZVIggSTW0arjTnQsls6xcrCsMcyl58Y98Q0Sw2yecmFLiTcYjnYrgAhLGSu9b2s28oV04R6_6p6fD8UUjvnRawHn7N6qFgRIEuGj4QuZlkthZM5_fZwaPyXvLxccgLCk%3D&attredirects=0 ''Time Sharing Supervisor Programs''] by Mike Alexander (May 1971) has information on MTS, TSS, CP/67, and Multics</ref>
 
Other mainframes that supported SMP included the [[UNIVAC 1100/2200 series#1108|UNIVAC 1108 II]], released in 1965, which supported up to three CPUs, and the [[GE-600 series|GE-635 and GE-645]],<ref>{{cite book|url=http://www.bitsavers.org/pdf/ge/GE-6xx/CPB-371A_GE-635_System_Man_Jul64.pdf|title=GE-635 System Manual|date=July 1964|publisher=[[General Electric]]}}</ref><ref>{{cite book|url=http://www.bitsavers.org/pdf/ge/GE-645/GE-645_SystemMan_Jan68.pdf|title=GE-645 System Manual|date=January 1968|publisher=General Electric}}</ref> although [[General Comprehensive Operating System|GECOS]] on multiprocessor GE-635 systems ran in a master-slave asymmetric fashion, unlike [[Multics]] on multiprocessor GE-645 systems, which ran in a symmetric fashion.<ref>{{cite newsgroup|url=https://groups.google.com/d/msg/alt.folklore.computers/v-hkdKaPTXc/MX7UI3DgOokJ|title=Fear of Multiprocessing?|author=Richard Shetron|date=May 5, 1998|newsgroup=alt.folklore.computers|message-id=354e95a9.0@news.wizvax.net}}</ref>
 
Starting with its version 7.0 (1972), [[Digital Equipment Corporation]]'s operating system [[TOPS-10]] implemented the SMP feature; the earliest system running SMP was the [[PDP-10|DECSystem 1077]] dual KI10 processor system.<ref>[http://www.ultimate.com/phil/pdp10/10periphs DEC 1077 and SMP]</ref> Later KL10 systems could aggregate up to 8 CPUs in an SMP manner. In contrast, DEC's first multi-processor [[VAX]] system, the VAX-11/782, was asymmetric,<ref>[http://www.bitsavers.org/pdf/dec/vax/EG-21731-18_VAX_Product_Sales_Guide_Apr82.pdf VAX Product Sales Guide, pages 1-23 and 1-24]: the VAX-11/782 is described as an asymmetric multiprocessing system in 1982</ref> but later VAX multiprocessor systems were SMP.<ref>[http://www.bitsavers.org/pdf/dec/vax/8800/EK-8840H-UG-001_88xx_System_Hardware_Users_Guide_Mar88.pdf VAX 8820/8830/8840 System Hardware User's Guide]: by 1988 the VAX operating system was SMP</ref>
 
Early commercial Unix SMP implementations included the [[Sequent Computer Systems]] Balance 8000 (released in 1984) and Balance 21000 (released in 1986).<ref>{{Cite book |last1 = Hockney |first1 = R.W. |last2 = Jesshope |first2 = C.R. |title = Parallel Computers 2: Architecture, Programming and Algorithms |publisher = Taylor & Francis |year = 1988 | page = 46 |isbn = 0-85274-811-6}}</ref> Both models were based on 10&nbsp;MHz [[National Semiconductor]] [[NS320xx|NS32032]] processors, each with a small write-through cache connected to a common memory to form a [[Shared memory architecture|shared memory]] system. Another early commercial Unix SMP implementation was the NUMA-based Honeywell Information Systems Italy XPS-100, designed by Dan Gielan of VAST Corporation in 1985. Its design supported up to 14 processors, but due to electrical limitations, the largest marketed version was a dual-processor system. The operating system was derived and ported by VAST Corporation from AT&T 3B20 Unix SysVr3 code used internally within AT&T.
 
Earlier non-commercial multiprocessing UNIX ports existed, including a port named MUNIX created at the [[Naval Postgraduate School]] by 1975.<ref>{{Cite web|url=https://core.ac.uk/download/pdf/36714194.pdf|title=MUNIX, A Multiprocessing Version Of UNIX|last=Hawley|first=John Alfred|date=June 1975|website=core.ac.uk|access-date=11 November 2018}}</ref>
Multithreaded programs can also be used in time-sharing and server systems that support multithreading, allowing them to make more use of multiple processors.
 
== Advantages/disadvantages ==
 
In current SMP systems, all of the processors are tightly coupled inside the same box with a bus or switch; on earlier SMP systems, a single CPU took an entire cabinet. Some of the components that are shared are global memory, disks, and I/O devices. Only one copy of an OS runs on all the processors, and the OS must be designed to take advantage of this architecture. Some of the basic advantages involve cost-effective ways to increase throughput. To solve a single problem faster, SMP can apply multiple processors to that one problem, an approach known as [[parallel programming]].
 
== Performance ==
When more than one program executes at the same time, an SMP system has considerably better performance than a uniprocessor system, because different programs can run on different CPUs simultaneously. Conversely, [[asymmetric multiprocessing]] (AMP) usually allows only one processor to run a program or task at a time. For example, AMP can be used to assign specific tasks to a CPU based on the priority and importance of task completion. AMP was created well before SMP as a way of handling multiple CPUs, which explains its lower performance in this example.
 
In cases where an SMP environment processes many jobs, administrators often experience a loss of hardware efficiency. Software programs have been developed to schedule jobs and other functions of the computer so that the processor utilization reaches its maximum potential. Good software packages can achieve this maximum potential by scheduling each CPU separately, as well as being able to integrate multiple SMP machines and clusters.
 
Access to RAM is serialized; this and [[cache coherency]] issues cause performance to lag slightly behind the number of additional processors in the system.
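
As a rough illustration of why speedup falls short of the processor count, the following sketch applies an Amdahl-style model in which some fraction of the work cannot be overlapped. The model itself, the function name <code>modeled_speedup</code>, and the <code>serialized_fraction</code> parameter are assumptions made for this example; they are not drawn from measurements of any particular SMP system, and the model ignores effects such as cache coherency traffic growing with processor count.

```python
def modeled_speedup(processors: int, serialized_fraction: float) -> float:
    """Estimate speedup over one processor under an Amdahl-style model.

    Assumption for illustration: `serialized_fraction` of the run time
    (for example, time spent waiting on serialized memory access) cannot
    be overlapped, while the remainder divides evenly across processors.
    """
    if processors < 1:
        raise ValueError("processors must be at least 1")
    if not 0.0 <= serialized_fraction <= 1.0:
        raise ValueError("serialized_fraction must be between 0 and 1")
    parallel_fraction = 1.0 - serialized_fraction
    return 1.0 / (serialized_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical figures: 5% of the run time is serialized.
    for count in (1, 2, 4, 8):
        print(count, round(modeled_speedup(count, 0.05), 2))
```

Under this model, any nonzero serialized fraction keeps the estimated speedup below the number of processors, and the gap widens as processors are added.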
 
== Alternatives ==
 
[[Image:Shared memory.svg|right|350px|thumb|Diagram of a typical SMP system. Three processors are connected to the same memory module through a [[system bus]] or [[crossbar switch]]. ]]
 
SMP uses a single shared [[system bus]] that represents one of the earliest styles of multiprocessor machine architectures, typically used for building smaller computers with up to 8 processors.
 
== Variable SMP ==
{{POV section|talk=Undue weight on vSMP|date=August 2017}}
 
Variable Symmetric Multiprocessing (vSMP) is a specific mobile use case technology initiated by NVIDIA. This technology includes an extra fifth core in a quad-core device, called the Companion core, built specifically for executing tasks at a lower frequency during mobile active standby mode, video playback, and music playback.
 
Project Kal-El ([[Tegra 3]]),<ref name="AutoMQ-4" /> patented by NVIDIA, was the first SoC (System on Chip) to implement this new vSMP technology. This technology reduces mobile power consumption during the active standby state and also maximizes quad-core performance during active usage for intensive mobile applications. Overall, this technology addresses the need for longer battery life during active and standby usage by reducing the power consumption of mobile processors.
 
Unlike current SMP architectures, the vSMP Companion core is OS-transparent, meaning that the operating system and the running applications are totally unaware of this extra core but are still able to take advantage of it. Some of the advantages of the vSMP architecture include cache coherency, OS efficiency, and power optimization. These advantages are explained below:
 
*Cache Coherency: There are no consequences for synchronizing caches between cores running at different frequencies since vSMP does not allow the Companion core and the main cores to run simultaneously.
*OS Efficiency: It is inefficient when multiple CPU cores are run at different asynchronous frequencies because this could lead to possible scheduling issues.{{How|date=August 2017}} With vSMP, the active CPU cores will run at similar frequencies to optimize OS scheduling.
*Power Optimization: In asynchronous clocking based architecture, each core is on a different power plane to handle voltage adjustments for different operating frequencies. The result of this could impact performance.{{How|date=August 2017}} vSMP technology is able to dynamically enable and disable certain cores for active and standby usage, reducing overall power consumption.
 
These advantages give the vSMP architecture a considerable benefit{{Peacock term|date=August 2017}} over other architectures using asynchronous clocking technologies.
 
== See also ==
* [[Asymmetric multiprocessing]]
* [[Binary Modular Dataflow Machine]]
* [[Cellular multiprocessing]]
* [[Locale (computer hardware)]]
* [[Massively parallel]]
== References ==
{{Reflist|refs=
<ref name="AutoMQ-1">{{cite journal |title= Trends in Multi-core DSP Platforms |author= Lina J. Karam |author2=Ismail AlKamal |author3=Alan Gatherer |author4=Gene A. Frantz |author5=David V. Anderson |author6=Brian L. Evans |journal= IEEE Signal Processing Magazine|volume= 26 |issue= 6 |pages= 38–49 |year= 2009 |url= http://users.ece.utexas.edu/~bevans/papers/2009/multicore/MulticoreDSPsForIEEESPMFinal.pdf |bibcode= 2009ISPM...26...38K |doi= 10.1109/MSP.2009.934113 |s2cid= 9429714 }}</ref>
<ref name="AutoMQ-4">[http://www.nvidia.com/content/pdf/tegra_white_papers/tegra-whitepaper-0911b.pdf Variable SMP – A Multi-Core CPU Architecture for Low Power and High Performance. NVIDIA. 2011.]</ref>
}}
== External links ==
* [http://ei.cs.vt.edu/~history/Parallel.html History of Multi-Processing]
* [https://web.archive.org/web/20120120160334/http://www.ibm.com/developerworks/library/l-linux-smp/ Linux and Multiprocessing]
* [https://www.amd.com/us-en/Processors/ProductInformation/0,,30_118,00.html AMD]