Revision as of 20:37, 18 April 2025 by Michel Bakni (→Disk storage replication: Vectorial version)
Latest revision as of 21:24, 27 April 2025 by Citation bot (Alter: title, template type, pages. Add: url, isbn, page, chapter. Removed parameters. Formatted dashes. Suggested by Headbomb.)
Line 1:
{{short description\|Sharing information to ensure consistency in computing}}
{{More footnotes needed\|date=October 2012}}
'''Replication''' in [[computing]] refers to maintaining multiple copies of data, processes, or resources to ensure consistency across redundant components. This fundamental technique spans [[database management system\|databases]], [[file system\|file systems]], and [[distributed computing\|distributed systems]], serving to improve [[high availability\|availability]], [[fault-tolerance]], accessibility, and performance.<ref name="kleppmann"/> Through replication, systems can continue operating when components fail ([[failover]]), serve requests from geographically distributed locations, and balance load across multiple machines. The challenge lies in maintaining consistency between replicas while managing the fundamental tradeoffs between data consistency, system availability, and [[Network partition\|network partition tolerance]] – constraints known as the [[CAP theorem]].<ref>{{cite ~~journal~~book \|last=Brewer \|first=Eric A. \|~~title~~chapter=Towards robust distributed systems (Abstract) \|~~journal~~page=7 \|title=Proceedings of the ~~Annual~~nineteenth annual ACM ~~Symposium~~symposium on Principles of ~~Distributed~~distributed ~~Computing~~computing \|year=2000 \|doi=10.1145/343477.343502\|isbn=1-58113-183-6 }}</ref>

== {{Anchor\|MASTER-ELECTION}}Terminology ==

Line 31:
* '''Transactional replication''': used for replicating [[transactional data]], such as a database. The [[one-copy serializability]] model is employed, which defines valid outcomes of a transaction on replicated data in accordance with the overall [[ACID]] (atomicity, consistency, isolation, durability) properties that transactional systems seek to guarantee.
* '''[[State machine replication]]''': assumes that the replicated process is a [[deterministic finite automaton]] and that [[atomic broadcast]] of every event is possible.
It is based on [[Consensus (computer science)\|distributed consensus]] and has a great deal in common with the transactional replication model. The term is sometimes mistakenly used as a synonym for active replication. State machine replication is usually implemented by a replicated log consisting of multiple successive rounds of the [[Paxos algorithm]]. This approach was popularized by Google's Chubby system and is the core of the open-source [[Keyspace (data store)\|Keyspace data store]].<ref name=keyspace>{{cite web \| access-date=2010-04-18 \| year = 2009 \| url=http://scalien.com/whitepapers \|title=Keyspace: A Consistently Replicated, Highly-Available Key-Value Store \| author=Marton Trencseni, Attila Gazso}}</ref><ref name=chubby>{{cite web \| access-date=2010-04-18 \| year=2006 \| url=http://labs.google.com/papers/chubby.html \| title=The Chubby Lock Service for Loosely-Coupled Distributed Systems \| author=Mike Burrows \| url-status=dead \| archive-url=https://web.archive.org/web/20100209225931/http://labs.google.com/papers/chubby.html \| archive-date=2010-02-09 }}</ref>
* '''[[Virtual synchrony]]''': involves a group of processes which cooperate to replicate in-memory data or to coordinate actions. The model defines a distributed entity called a ''process group''. A process can join a group and is provided with a checkpoint containing the current state of the data replicated by group members. Processes can then send [[multicast]]s to the group and will see incoming multicasts in the identical order. Membership changes are handled as a special multicast that delivers a new "membership view" to the processes in the group.<ref>{{Cite book \|last1=Birman \|first1=K. \|last2=Joseph \|first2=T. 
\|title=Proceedings of the eleventh ACM Symposium on Operating systems principles - SOSP '87 \|chapter=Exploiting virtual synchrony in distributed systems \|date=1987-11-01 \|chapter-url=https://doi.org/10.1145/41457.37515 ~~\|series=SOSP '87~~ \|location=New York, NY, USA \|publisher=Association for Computing Machinery \|pages=123–138 \|doi=10.1145/41457.37515 \|isbn=978-0-89791-242-6\|s2cid=7739589 }}</ref>

== {{Anchor\|DATABASE}}Database replication ==
[[Database]] replication involves maintaining copies of the same data on multiple machines, typically implemented through three main approaches: single-leader, multi-leader, and leaderless replication.<ref name="kleppmann">{{cite book \|last=Kleppmann \|first=Martin \|title=Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems \|year=2017 \|publisher=O'Reilly Media \|isbn=9781491903100 \|pages=~~151-185~~151–185}}</ref>

In [[Master–slave (technology)\|single-leader]] (also called primary/replica) replication, one database instance is designated as the leader (primary), which handles all write operations. The leader logs these updates, which then propagate to replica nodes. Each replica acknowledges receipt of updates, enabling subsequent write operations. Replicas primarily serve read requests, though they may serve stale data due to replication lag – the delay in propagating changes from the leader.

Line 116:
Modern multi-primary replication protocols optimize for the common failure-free operation. Chain replication<ref>{{Cite journal \|last1=van Renesse \|first1=Robbert \|last2=Schneider \|first2=Fred B. 
\|date=2004-12-06 \|title=Chain replication for supporting high throughput and availability \|url=https://dl.acm.org/doi/abs/10.5555/1251254.1251261 \|journal=Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation - Volume 6 \|series=OSDI'04 \|location=USA \|publisher=USENIX Association \|pages=7 \|doi=}}</ref> is a popular family of such protocols. State-of-the-art protocol variants<ref>{{Cite journal \|last1=Terrace \|first1=Jeff \|last2=Freedman \|first2=Michael J. \|date=2009-06-14 \|title=Object storage on CRAQ: high-throughput chain replication for read-mostly workloads \|url=https://dl.acm.org/doi/abs/10.5555/1855807.1855818 \|journal=USENIX Annual Technical Conference \|series=USENIX'09 \|location=USA \|pages=11 \|doi=}}</ref> of chain replication offer high throughput and strong consistency by arranging replicas in a chain for writes. This approach enables local reads on all replica nodes but has high latency for writes that must traverse multiple nodes sequentially. A more recent multi-primary protocol, [https://hermes-protocol.com/ Hermes],<ref>{{Cite book \|last1=Katsarakis \|first1=Antonios \|last2=Gavrielatos \|first2=Vasilis \|last3=Katebzadeh \|first3=M.R. 
Siavash \|last4=Joshi \|first4=Arpit \|last5=Dragojevic \|first5=Aleksandar \|last6=Grot \|first6=Boris \|last7=Nagarajan \|first7=Vijay \|title=Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems \|chapter=Hermes: A Fast, Fault-Tolerant and Linearizable Replication Protocol \|date=2020-03-13 \|chapter-url=https://doi.org/10.1145/3373376.3378496 \|series=ASPLOS '20 \|location=New York, NY, USA \|publisher=Association for Computing Machinery \|pages=201–217 \|doi=10.1145/3373376.3378496 \|hdl=20.500.11820/c8bd74e1-5612-4b81-87fe-175c1823d693 \|isbn=978-1-4503-7102-5\|s2cid=210921224 \|url=https://www.pure.ed.ac.uk/ws/files/130434070/Hermes_a_Fast_KATASARAKIS_DOA02122019_AFV.pdf }}</ref> combines cache-coherence-inspired invalidations and logical timestamps to achieve strong consistency with local reads and high-performance writes from all replicas. During fault-free operation, its broadcast-based writes are non-conflicting and commit after just one multicast round-trip to replica nodes. This design results in high throughput and low latency for both reads and writes.

==See also==

Replication (computing): Difference between revisions