{{Short description|Low-level hardware direct memory access}}
In [[computing]], '''remote direct memory access''' ('''RDMA''') is [[direct memory access]] from the [[main memory|memory]] of one computer into that of another without involving either computer's [[operating system]]. This permits high-throughput, low-[[Network latency|latency]] memory access over a network, which is especially useful in massively parallel [[computer cluster]]s.
 
== Overview ==
RDMA supports [[zero-copy]] networking by enabling the [[network adapter]] to transfer data from the wire directly to application memory, or from application memory directly to the wire, eliminating the need to copy data between application memory and the data buffers in the operating system. Such transfers require no work by [[Central processing unit|CPUs]] and involve no [[CPU cache|caches]] or [[context switch]]es, and they continue in parallel with other system operations. This reduces latency in message transfer.
 
However, this strategy presents several problems stemming from the single-sided nature of the communication: in particular, the target node is not notified of the completion of the request.
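
As an illustration of the zero-copy path, the following minimal sketch uses the Linux libibverbs API to register an application buffer so that the adapter can transfer data to and from it by DMA. The choice of the first available device, the buffer size, and the reduced error handling are simplifying assumptions for illustration, not part of any particular RDMA implementation.

<syntaxhighlight lang="c">
/* Minimal memory-registration sketch with libibverbs.
 * Illustrative only: opens the first RDMA device found and registers
 * one buffer so the adapter can DMA to/from it without kernel copies. */
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    struct ibv_device **devs = ibv_get_device_list(NULL);
    if (!devs || !devs[0]) {
        fprintf(stderr, "no RDMA device found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    if (!ctx) {
        fprintf(stderr, "cannot open device\n");
        return 1;
    }
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Application buffer that the adapter will access directly. */
    size_t len = 4096;                 /* illustrative size */
    void *buf = malloc(len);

    /* Registration pins the pages and returns local/remote keys
     * (lkey/rkey) that later work requests use to address the buffer. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);
    if (!mr) {
        perror("ibv_reg_mr");
        return 1;
    }
    printf("registered %zu bytes, lkey=0x%x rkey=0x%x\n",
           len, mr->lkey, mr->rkey);

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
</syntaxhighlight>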
 
== Acceptance ==
By 2018, RDMA had achieved broader acceptance as a result of implementation enhancements that enable good performance over ordinary networking infrastructure.<ref>{{cite web|url=https://dl.acm.org/citation.cfm?id=3098588&dl=ACM&coll=DL|title=RoCE Rocks over Lossy Network}}</ref> For example, [[RDMA over Converged Ethernet]] (RoCE) can run over either lossy or lossless infrastructure. In addition, [[iWARP]] provides an [[Ethernet]]-based RDMA implementation that uses [[Transmission Control Protocol|TCP]]/[[Internet Protocol|IP]] as the transport, combining the performance and latency advantages of RDMA with a low-cost, standards-based solution.<ref>{{cite web|url=https://www.intel.com/content/dam/support/us/en/documents/network/sb/understanding_iwarp_final.pdf|title=Understanding iWARP|publisher=Intel Corporation|accessdate=16 May 2018}}</ref> The RDMA Consortium and the DAT Collaborative<ref>{{cite web|url=http://www.datcollaborative.org/|title=DAT Collaborative website|accessdate=14 October 2014|url-status=dead|archiveurl=https://web.archive.org/web/20150117180600/http://www.datcollaborative.org/|archivedate=17 January 2015}}</ref> have played key roles in the development of RDMA protocols and [[Application programming interface|APIs]] for consideration by standards groups such as the [[Internet Engineering Task Force]] and the Interconnect Software Consortium.<ref>[http://www.opengroup.org/icsc/ The Interconnect Software Consortium website] {{webarchive|url=https://web.archive.org/web/20050830201232/http://www.opengroup.org/icsc/ |date=2005-08-30 }}</ref>
 
Hardware vendors have started working on higher-capacity RDMA-based network adapters, with rates of 100&nbsp;Gbit/s reported.<ref>{{cite web|url=http://www.mellanox.com/page/file_storage/|title=Microsoft Based Solutions - Mellanox Technologies|accessdate=14 October 2014}}</ref><ref name="chelsio">{{cite web|url=http://www.chelsio.com/chelsio-to-demonstrate-40g-smb-direct-rdma-over-ethernet-for-windows-server-2012/|title=40Gbe SMB Direct RDMA Over Ethernet For Windows Server 2012 - Chelsio Communications|date=2013-04-02|publisher=Chelsio Communications|accessdate=2016-07-15|quote=The demonstration will show Microsoft's Windows Server 2012 SMB Direct running at line-rate 40Gb using RDMA over Ethernet (iWARP).}}</ref> Software vendors, such as [[IBM]],<ref>{{Cite web|url=https://www.openfabrics.org/wp-content/uploads/2022-workshop/2022-workshop-presentations/201_RPolig.pdf|title=SOFA-STORAGE: CREATING A VENDOR AGNOSTIC FRAMEWORK TO ENABLE SEAMLESS STORAGE OFFLOAD USING SMARTNICS}}</ref> [[Red Hat]] and [[Oracle Corporation]], support these APIs in their products,<ref>{{Cite web|url=https://access.redhat.com/solutions/22188|title=What RDMA hardware is supported in Red Hat Enterprise Linux?|date=2 June 2016}}</ref> and since 2013, engineers have been developing network adapters that implement RDMA over Ethernet.<ref name="chelsio"/>
Both [[Red Hat Enterprise Linux]] and [[Red Hat Enterprise MRG]]<ref>{{cite web|url=https://investors.redhat.com/news-and-events/press-releases/2011/06-23-2011|title=Red Hat Enterprise MRG 2.0 Now Available|accessdate=23 June 2011|url-status=dead|archiveurl=https://web.archive.org/web/20160825215016/https://investors.redhat.com/news-and-events/press-releases/2011/06-23-2011|archivedate=25 August 2016}}</ref> support RDMA. Microsoft supports RDMA in [[Windows Server 2012]] via [[Server Message Block|SMB Direct]], and [[VMware ESXi]] has supported RDMA since 2015.
 
Common RDMA implementations include the [[Virtual Interface Architecture]], [[RDMA over Converged Ethernet]] (RoCE), [[InfiniBand]], [[Omni-Path]], [[iWARP]] and Ultra Ethernet.
 
== Using RDMA ==
Applications access control structures using well-defined APIs originally designed for the InfiniBand protocol (although the APIs can be used with any of the underlying RDMA implementations). Using send and completion queues, applications perform RDMA operations by submitting work queue entries (WQEs) into the send queue (SQ) and receiving notifications of completed work from the completion queue (CQ).<ref>{{cite web|url=https://dl.acm.org/doi/abs/10.1145/3319647.3325827|title=Storm: a fast transactional dataplane for remote data structures}}</ref>
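
The following sketch illustrates that queue-based flow with the libibverbs API: a work request describing a one-sided RDMA write is posted to the send queue, and the completion queue is then polled for the result. The queue pair <code>qp</code>, completion queue <code>cq</code>, registered memory region <code>mr</code>, and the peer's <code>remote_addr</code> and <code>rkey</code> are assumed to have been created and exchanged beforehand (for example out of band during connection setup, which is not shown), and the helper name is illustrative rather than part of the API.

<syntaxhighlight lang="c">
/* Sketch of the work-queue flow described above, using libibverbs.
 * Assumes qp (a connected reliable queue pair), cq, mr (a registered
 * buffer), and the peer's remote_addr/rkey already exist; obtaining
 * them requires connection setup that is not shown here. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

int post_rdma_write_and_wait(struct ibv_qp *qp, struct ibv_cq *cq,
                             struct ibv_mr *mr,
                             uint64_t remote_addr, uint32_t rkey)
{
    /* Scatter/gather entry: which local bytes the adapter reads. */
    struct ibv_sge sge = {
        .addr   = (uintptr_t)mr->addr,
        .length = (uint32_t)mr->length,
        .lkey   = mr->lkey,
    };

    /* Work queue entry (work request) submitted to the send queue. */
    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.wr_id      = 1;                  /* echoed back in the completion */
    wr.sg_list    = &sge;
    wr.num_sge    = 1;
    wr.opcode     = IBV_WR_RDMA_WRITE;  /* one-sided write to the peer */
    wr.send_flags = IBV_SEND_SIGNALED;  /* request a completion entry */
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    /* Busy-poll the completion queue for the matching entry. */
    struct ibv_wc wc;
    int n;
    do {
        n = ibv_poll_cq(cq, 1, &wc);
    } while (n == 0);

    return (n < 0 || wc.status != IBV_WC_SUCCESS) ? -1 : 0;
}
</syntaxhighlight>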
 
== Transport types ==
RDMA can transport data reliably or unreliably over the Reliable Connection (RC) and Unreliable Datagram (UD) transport protocols, respectively. The former has the benefit of preserving requests (no requests are lost), while the latter requires fewer queue pairs when handling multiple connections, because UD is connectionless and allows a single queue pair on one host to communicate with any other host.<ref>{{cite web|url=https://dl.acm.org/doi/pdf/10.1145/3319647.3325827|title=Storm: a fast transactional dataplane for remote data structures}}</ref>
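
The transport type is fixed when the queue pair is created. In the hedged libibverbs sketch below, the two transports differ at creation time only in the <code>qp_type</code> field; the helper name, the pre-existing protection domain and completion queue, and the queue depths are illustrative assumptions.

<syntaxhighlight lang="c">
/* Creating queue pairs for the two transports discussed above.
 * pd and cq are assumed to exist already; capacities are illustrative.
 * With IBV_QPT_UD a single queue pair can address many peers, whereas
 * each IBV_QPT_RC pair is tied to exactly one peer. */
#include <infiniband/verbs.h>

struct ibv_qp *make_qp(struct ibv_pd *pd, struct ibv_cq *cq,
                       enum ibv_qp_type type)  /* IBV_QPT_RC or IBV_QPT_UD */
{
    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .qp_type = type,
        .cap = {
            .max_send_wr  = 64,   /* illustrative queue depths */
            .max_recv_wr  = 64,
            .max_send_sge = 1,
            .max_recv_sge = 1,
        },
    };
    return ibv_create_qp(pd, &attr);
}
</syntaxhighlight>

For example, <code>make_qp(pd, cq, IBV_QPT_RC)</code> would create a connected, reliable queue pair, while <code>make_qp(pd, cq, IBV_QPT_UD)</code> would create a datagram one; with UD, each posted send additionally names its destination through an address handle, which is what allows one queue pair to reach many peers.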
 
== References ==
{{Reflist}}
 
== External links ==
* [http://www.rdmaconsortium.org/home RDMA Consortium]
* {{IETF RFC|5040}}: A Remote Direct Memory Access Protocol Specification
* [http://www.hpcwire.com/2006/09/15/a_tutorial_of_the_rdma_model-1/ A Tutorial of the RDMA Model]
* [https://www.hpcwire.com/2006/10/06/why_compromise-1/ "Why Compromise?"] // HPCwire, Gilad Shainer (Mellanox Technologies), 2006
* [http://www.hpcwire.com/hpcwire/2006-08-18/a_critique_of_rdma-1.html A Critique of RDMA] for high-performance computing
* [https://www.cs.utah.edu/~stutsman/cs6450/public/papers/rdma.pdf RDMA Reads: To Use or Not to Use?]
* [https://www.openfabrics.org/wp-content/uploads/2022-workshop/2022-workshop-presentations/201_RPolig.pdf SOFA-STORAGE: Creating a vendor agnostic framework to enable seamless storage offload using SmartNICs]
 
[[Category:Computer memory]]
[[Category:Operating system technology]]
[[Category:Local area networks]]