Copy-on-write: Difference between revisions

Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.3) (Whoop whoop pull up - 12943
reworded explanation. +comma. checked URLs.
{{More citations needed|date=August 2020}}
 
'''Copy-on-write''' ('''COW'''), sometimes referred to as '''implicit sharing'''<ref>{{cite web |title=Implicit Sharing |website=Qt Project |url=https://doc.qt.io/qt-5/implicit-sharing.html |access-date=4 August 2016}}</ref> or '''shadowing''',<ref>{{cite journal |last=Rodeh |first=Ohad |title=B-Trees, Shadowing, and Clones |journal=ACM Transactions on Storage |volume=3 |issue=4 |date=1 February 2008 |page=1 |citeseerx=10.1.1.161.6863 |s2cid=207166167 |doi=10.1145/1326542.1326544 |url=http://liw.fi/larch/ohad-btrees-shadowing-clones.pdf |access-date=4 August 2016 |url-status=dead |archive-url=https://web.archive.org/web/20170102212904/http://liw.fi/larch/ohad-btrees-shadowing-clones.pdf |archive-date=2 January 2017}}</ref> is a [[resource management (computing)|resource-management]] technique used in [[computer programming]] to efficiently implement a "duplicate" or "copy" operation on modifiable [[system resource#General resources|resources]]<ref name="Linux">{{cite book |title=Understanding the Linux Kernel |last1=Bovet |first1=Daniel Pierre |last2=Cesati |first2=Marco |date=2002-01-01 |publisher=O'Reilly Media |isbn=9780596002138 |page=295 |url=https://books.google.com/books?id=9yIEji1UheIC&q=%22copy%20on%20write%22&pg=PA295}}</ref> (most commonly memory pages, storage sectors, files, and data structures).
 
When a resource is duplicated but not modified, its contents need not be copied because, with a little care, every copy can share the same stored contents. (The original counts as one of the copies, since it does not matter which one came first.) If and when a computer program begins to write into any copy, the computer must first make a non-shared copy by actually copying the contents, and then write into the new copy; hence the name of the technique. The actual copy operation is deferred until something needs to write into a copy, which might never happen before the copy is discarded. Sharing storage in this way can substantially reduce the resources consumed by unmodified copies, at the cost of some overhead (a test on each write and possibly additional storage) in resource-modifying operations.
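
The deferred copy described above can be sketched in a few lines of C++. This is a minimal, single-threaded illustration rather than any particular library's implementation: the class name <code>CowHandle</code> and its members are hypothetical, and the reference count is simply borrowed from <code>std::shared_ptr</code>.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write handle: copies share one buffer until a write.
// Single-threaded sketch; use_count() is not a safe test under concurrency.
template <typename T>
class CowHandle {
public:
    explicit CowHandle(T value)
        : data_(std::make_shared<T>(std::move(value))) {}

    // Copying a handle copies only the pointer and bumps the reference count.
    CowHandle(const CowHandle&) = default;
    CowHandle& operator=(const CowHandle&) = default;

    // Read access never triggers a copy.
    const T& read() const { return *data_; }

    // Write access first detaches this handle from any other sharers.
    T& write() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<T>(*data_);  // the deferred copy happens here
        }
        return *data_;
    }

    // True while both handles still refer to the same stored contents.
    bool shares_with(const CowHandle& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<T> data_;
};
```

With this sketch, <code>CowHandle<std::string> b = a;</code> leaves <code>a</code> and <code>b</code> sharing one string, and the first call to <code>b.write()</code> gives <code>b</code> a private copy while <code>a</code> is left unchanged.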
 
==In virtual memory management==
Copy-on-write finds its main use in [[operating system]]s, sharing the [[physical memory]] of computers running multiple [[process (computing)|processes]], in the implementation of the [[fork (system call)|fork() system call]]. Typically, the new process does not modify any memory and immediately executes a new program, replacing the address space entirely. It would waste processor time and memory to copy all of the old process's memory during the fork, only to immediately discard the copy. Instead, the new process continues running from the same physical memory, in a cheap copy-on-write copy of the old process's memory, until it discards that copy.
 
Copy-on-write can be implemented efficiently using the [[page table]] by marking certain pages of [[computer storage|memory]] as read-only and keeping a count of the number of references to the page. When data is written to these pages, the operating-system [[kernel (operating system)|kernel]] intercepts the write attempt and allocates a new physical page, initialized with the copy-on-write data, although the allocation can be skipped if there is only one reference. The kernel then updates the page table with the new (writable) page, decrements the number of references, and performs the write. The new allocation ensures that a change in the memory of one process is not visible in another's.
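
The write-fault handling described above can be modelled in user space. The following is a toy simulation under stated assumptions, not kernel code: <code>Frame</code>, <code>PageTableEntry</code>, and <code>ToyMemory</code> are hypothetical names, a "frame" stands in for a physical page, and the read-only flag is checked in software rather than by the memory-management hardware.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy model only: every name below is hypothetical.
constexpr std::size_t kPageSize = 4096;

struct Frame {  // stands in for one physical page
    std::array<unsigned char, kPageSize> bytes{};
    int refs = 0;  // number of page-table entries mapping this frame
};

struct PageTableEntry {
    Frame* frame = nullptr;
    bool writable = false;
};

class ToyMemory {
public:
    // Allocate a fresh frame and return a writable mapping to it.
    PageTableEntry map_new() {
        PageTableEntry pte;
        pte.frame = allocate();
        pte.writable = true;
        return pte;
    }

    // Share a mapping as fork would: both entries become read-only.
    PageTableEntry share(PageTableEntry& parent) {
        parent.writable = false;
        parent.frame->refs += 1;
        PageTableEntry child;
        child.frame = parent.frame;
        return child;
    }

    // Store one byte, handling the simulated write fault if needed.
    void write(PageTableEntry& pte, std::size_t offset, unsigned char value) {
        if (!pte.writable) {
            if (pte.frame->refs > 1) {
                // Shared page: allocate a new frame initialized from the old one.
                Frame* copy = allocate();
                copy->bytes = pte.frame->bytes;
                pte.frame->refs -= 1;
                pte.frame = copy;
            }
            // Sole reference: no allocation, just restore write access.
            pte.writable = true;
        }
        pte.frame->bytes.at(offset) = value;
    }

    std::size_t frames_allocated() const { return frames_.size(); }

private:
    Frame* allocate() {
        frames_.push_back(std::make_unique<Frame>());
        frames_.back()->refs = 1;
        return frames_.back().get();
    }

    std::vector<std::unique_ptr<Frame>> frames_;
};
```

In this model, the first write through a shared entry allocates a second frame, while a later write through the remaining sole owner only flips the entry back to writable, which mirrors the skipped allocation mentioned above.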
 
The copy-on-write technique can be extended to support efficient [[memory allocation]] by keeping one page of [[physical memory]] filled with zeros. When the memory is allocated, all the pages returned refer to the page of zeros and are all marked copy-on-write. This way, physical memory is not allocated for the process until data is written, allowing processes to reserve more virtual memory than physical memory and use memory sparsely, at the risk of running out of virtual address space. The combined algorithm is similar to [[demand paging]].<ref name="Linux" />
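
The effect is observable from user space on systems that provide anonymous <code>mmap</code> mappings (Linux is assumed here; <code>MAP_ANONYMOUS</code> is a common extension rather than part of POSIX). The sketch below reserves a large zero-filled region and writes to a single page of it. The helper names <code>reserve_zeroed</code> and <code>sparse_demo</code> are hypothetical, and whether reads are actually backed by a shared page of zeros is an implementation detail of the kernel that the program cannot see directly.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>

// Reserve `bytes` of zero-filled anonymous memory; returns nullptr on failure.
inline unsigned char* reserve_zeroed(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : static_cast<unsigned char*>(p);
}

// Reserve 64 MiB, read from both ends, then write to one page in the middle.
inline bool sparse_demo() {
    const std::size_t bytes = std::size_t{64} << 20;
    unsigned char* mem = reserve_zeroed(bytes);
    if (mem == nullptr) return false;

    // Reads observe zeros without the program having initialized anything.
    bool ok = (mem[0] == 0) && (mem[bytes - 1] == 0);

    // The first write to a page is the point at which the kernel must give
    // this process a private, writable physical page.
    mem[bytes / 2] = 42;
    ok = ok && (mem[bytes / 2] == 42) && (mem[bytes / 2 + 1] == 0);

    munmap(mem, bytes);
    return ok;
}
```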
 
Copy-on-write pages are also used in the [[Linux kernel]]'s [[kernel same-page merging|same-page merging]] feature.<ref>{{cite web |last=Abbas |first=Ali |title=The Kernel Samepage Merging Process |website=alouche.net |url=http://alouche.net/blog/2011/07/18/the-kernel-samepage-merging-process/ |access-date=4 August 2016 |url-status=usurped |archive-url=https://web.archive.org/web/20160808174912/http://alouche.net/blog/2011/07/18/the-kernel-samepage-merging-process/ |archive-date=8 August 2016 }}</ref>
 
==In software==
{{Expand section|date=October 2017}}
 
COW is also used in [[library (computer science)|library]], [[application software|application]], and [[system software|system]] code.
 
===Examples===
The [[string (C++)|string]] class provided by the [[C++ standard library]] was specifically designed to allow copy-on-write implementations in the initial C++98 standard,<ref name="meyers">{{cite book |first=Scott |last=Meyers |author-link=Scott Meyers |date=2012 |title=Effective STL |publisher=Addison-Wesley |pages=64–65 |isbn=9780132979184 |url=https://books.google.com/books?id=U7lTySXdFk0C&pg=PT734 }}</ref> but not in the newer C++11 standard:<ref>{{cite web |title=Concurrency Modifications to Basic String |website=Open Standards |url=https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2534.html |access-date=13 February 2015}}</ref>
<syntaxhighlight lang="cpp">
std::string x("Hello");
</syntaxhighlight>
 
In the [[PHP]] programming language, all types except references are implemented as copy-on-write. For example, strings and arrays are passed by reference, but when modified, they are duplicated if they have non-zero reference counts. This allows them to act as value types without the performance problems of copying on assignment or making them immutable.<ref>{{cite web |last1=Pauli |first1=Julien |last2=Ferrara |first2=Anthony |last3=Popov |first3=Nikita |title=Memory management |website=PhpInternalsBook.com |date=2013 |url=https://www.phpinternalsbook.com/php5/zvals/memory_management.html#reference-counting-and-copy-on-write |access-date=4 August 2016 }}</ref>
 
In the [[Qt (software)|Qt]] framework, many types are copy-on-write ("implicitly shared" in Qt's terms). Qt uses atomic [[compare-and-swap]] operations to increment or decrement the internal reference counter. Since the copies are cheap, Qt types can often be safely used by [[multithreading (computer architecture)|multiple threads]] without the need of [[lock (computer science)|locking mechanisms]] such as [[mutual exclusion|mutexes]]. The benefits of COW are thus valid in both single- and multithreaded systems.<ref>{{cite web |title=Threads and Implicitly Shared Classes |website=Qt Project |url=https://doc.qt.io/qt-5/threads-modules.html#threads-and-implicitly-shared-classes |access-date=4 August 2016 }}</ref>
 
==In computer storage==
COW may also be used as the underlying mechanism for [[snapshot (computer storage)|snapshots]], such as those provided by [[logical volume management]], file systems such as [[Btrfs]] and [[ZFS]],<ref>{{cite web |title=Copy-on-Write Based File Systems Performance Analysis and Implementation |last=Kasampalis |first=Sakis |date=2010 |page=19 |url=https://sakisk.me/files/copy-on-write-based-file-systems.pdf |access-date=11 January 2013 }}</ref> and database servers such as [[Microsoft SQL Server#Replication Services|Microsoft SQL Server]]. Typically, the snapshots store only the modified data, and are stored close to the original, so they are only a weak form of [[incremental backup]] and cannot substitute for a [[full backup]].<ref>{{cite web |last=Chien |first=Tim |title=Snapshots Are NOT Backups |website=Oracle.com |publisher=Oracle |url=https://www.oracle.com/database/technologies/rman-fra-snapshot-322251.html |access-date=4 August 2016 }}</ref>
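
One way a snapshot can store only the modified data is to preserve a block's old contents the first time that block is overwritten after the snapshot is taken. The sketch below illustrates that arrangement for a single snapshot of an in-memory "volume". It is an assumption-laden toy, not the design of any of the systems named above: <code>ToyVolume</code> and its members are hypothetical, and real file systems may instead redirect new writes to fresh blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical block volume with a single copy-on-write snapshot.
class ToyVolume {
public:
    explicit ToyVolume(std::size_t blocks) : blocks_(blocks) {}

    // Start a snapshot; nothing is copied at this point.
    void take_snapshot() {
        saved_.clear();
        snapshot_active_ = true;
    }

    // Overwrite a block, preserving its old contents on the first overwrite.
    void write(std::size_t index, const std::string& data) {
        std::string& block = blocks_.at(index);  // throws on a bad index
        if (snapshot_active_) {
            saved_.emplace(index, block);  // no effect if already preserved
        }
        block = data;
    }

    // Live view of the volume.
    const std::string& read(std::size_t index) const {
        return blocks_.at(index);
    }

    // Snapshot view: the preserved copy if the block changed, else the live block.
    const std::string& read_snapshot(std::size_t index) const {
        auto it = saved_.find(index);
        return it != saved_.end() ? it->second : blocks_.at(index);
    }

    // Number of blocks the snapshot has had to store so far.
    std::size_t snapshot_blocks_stored() const { return saved_.size(); }

private:
    std::vector<std::string> blocks_;
    std::map<std::size_t, std::string> saved_;
    bool snapshot_active_ = false;
};
```

Because unchanged blocks are read from the live volume, losing the volume also loses the snapshot, which is why such a snapshot cannot stand in for a full backup.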
 
==See also==