Copy-on-write: Difference between revisions

Monkbot (talk | contribs)
m Task 18 (cosmetic): eval 10 templates: del empty params (1×); hyphenate params (1×); del |url-status= (1×);
Monkbot (talk | contribs)
m Task 18 (cosmetic): eval 10 templates: hyphenate params (8×);
Line 2:
{{More citations needed|date=August 2020}}
 
'''Copy-on-write''' ('''COW'''), sometimes referred to as '''implicit sharing'''<ref>{{cite web|title= Implicit Sharing|url= http://doc.qt.io/qt-5/implicit-sharing.html|website= Qt Project|access-date= 4 August 2016}}</ref> or '''shadowing''',<ref>{{cite journal|last= Rodeh|first= Ohad|title= B-Trees, Shadowing, and Clones|journal= ACM Transactions on Storage|date= 1 February 2008|volume= 3|issue= 4|page= 1|doi= 10.1145/1326542.1326544|url= http://liw.fi/larch/ohad-btrees-shadowing-clones.pdf |access-date= 4 August 2016|citeseerx= 10.1.1.161.6863|s2cid= 207166167}}</ref> is a resource-management technique used in [[computer programming]] to efficiently implement a "duplicate" or "copy" operation on modifiable resources.<ref name="Linux">{{cite book |title=Understanding the Linux Kernel |url=https://books.google.com/books?id=9yIEji1UheIC&q=%22copy%20on%20write%22&pg=PA295 |last1=Bovet |first1=Daniel Pierre |last2=Cesati |first2=Marco |date=2002-01-01 |publisher=O'Reilly Media |isbn=9780596002138 |page=295}}</ref> If a resource is duplicated but not modified, it is not necessary to create a new resource; the resource can be shared between the copy and the original. Modifications must still create a copy, hence the technique: the copy operation is deferred until the first write. By sharing resources in this way, it is possible to significantly reduce the resource consumption of unmodified copies, while adding a small overhead to resource-modifying operations.
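
The deferred copy described above can be illustrated with a minimal sketch. The class name <code>CowText</code> and its members are hypothetical and exist only for this illustration; the sketch assumes single-threaded use and relies on <code>std::shared_ptr</code> to track how many copies share the buffer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration: copies of CowText share one buffer
// until one of them is modified.
class CowText {
public:
    explicit CowText(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // The compiler-generated copy constructor copies only the pointer,
    // so "copying" a CowText does not duplicate the characters.

    const std::string& read() const { return *data_; }

    void append(const std::string& suffix) {
        detach();                 // the real copy happens here, on first write
        data_->append(suffix);
    }

    bool shares_buffer_with(const CowText& other) const {
        return data_ == other.data_;
    }

private:
    // Give this object a private buffer if the current one is shared.
    // Assumption: no other thread touches these objects concurrently.
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
        assert(data_.use_count() == 1);
    }

    std::shared_ptr<std::string> data_;
};
```

In this sketch an unmodified copy costs one pointer and one reference-count update, while the first write to a shared object pays for the full duplication.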
 
==In virtual memory management==
Line 11:
The copy-on-write technique can be extended to support efficient [[memory allocation]] by having a page of [[physical memory]] filled with zeros. When the memory is allocated, all the pages returned refer to the page of zeros and are all marked copy-on-write. This way, physical memory is not allocated for the process until data is written, allowing processes to reserve more virtual memory than physical memory and use memory sparsely, at the risk of running out of virtual address space. The combined algorithm is similar to [[demand paging]].<ref name="Linux" />
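
The zero-page scheme can be modelled in user space. The following is a simplified simulation, not kernel code: <code>SparseMemory</code>, <code>kPageSize</code> and the 4096-byte page size are assumptions made for the illustration, and the "page fault" is an ordinary branch in <code>write</code>.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::size_t kPageSize = 4096;  // assumed page size
using Page = std::array<std::uint8_t, kPageSize>;

// Hypothetical model of an allocation whose pages all start out
// referring to one shared page of zeros.
class SparseMemory {
public:
    explicit SparseMemory(std::size_t page_count)
        : zero_page_(std::make_shared<Page>()),  // value-initialised: all zeros
          pages_(page_count, zero_page_) {}

    std::uint8_t read(std::size_t addr) const {
        return (*pages_.at(addr / kPageSize))[addr % kPageSize];
    }

    void write(std::size_t addr, std::uint8_t value) {
        std::shared_ptr<Page>& slot = pages_.at(addr / kPageSize);
        if (slot == zero_page_) {
            // First write to this page: replace the shared zero page
            // with a private copy.
            slot = std::make_shared<Page>(*zero_page_);
        }
        assert(slot != zero_page_);
        (*slot)[addr % kPageSize] = value;
    }

    // Number of pages that actually consume their own storage.
    std::size_t private_pages() const {
        std::size_t n = 0;
        for (const auto& p : pages_) {
            if (p != zero_page_) ++n;
        }
        return n;
    }

private:
    std::shared_ptr<Page> zero_page_;
    std::vector<std::shared_ptr<Page>> pages_;
};
```

Reserving 1024 pages in this model allocates a single page of storage; each page gets its own storage only when it is first written.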
 
Copy-on-write pages are also used in the [[Linux kernel]]'s [[kernel same-page merging]] feature.<ref>{{cite web|last1=Abbas|first1=Ali|title=The Kernel Samepage Merging Process|url=http://alouche.net/blog/2011/07/18/the-kernel-samepage-merging-process/|website=alouche.net|access-date=4 August 2016|archive-url=https://web.archive.org/web/20160808174912/http://alouche.net/blog/2011/07/18/the-kernel-samepage-merging-process/|archive-date=8 August 2016|url-status=dead}}</ref>
 
Loading the libraries for an application is another use of the copy-on-write technique. The dynamic linker maps libraries as private, as follows. Any write to the mapped library pages triggers a COW in virtual memory management.
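
The effect of a private mapping can be observed directly on a POSIX system. The sketch below is an illustration only, not what a dynamic linker does internally: it maps a small temporary file (the <code>/tmp</code> path and the function name are assumptions) with <code>MAP_PRIVATE</code>, writes through the mapping, and checks that the file itself is unchanged.

```cpp
#include <cassert>
#include <string>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: returns true if a write through a MAP_PRIVATE
// mapping is visible in the mapping but not in the underlying file.
bool private_mapping_leaves_file_untouched() {
    char path[] = "/tmp/cow_demo_XXXXXX";  // assumed writable temp directory
    int fd = mkstemp(path);
    if (fd == -1) return false;

    const char original[] = "library";
    const size_t len = sizeof(original) - 1;
    bool ok = write(fd, original, len) == static_cast<ssize_t>(len);

    void* addr = MAP_FAILED;
    if (ok) {
        addr = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    }

    if (addr != MAP_FAILED) {
        char* view = static_cast<char*>(addr);
        view[0] = 'L';  // this write goes to a private copy of the page

        char on_disk[sizeof(original)] = {};
        ok = pread(fd, on_disk, len, 0) == static_cast<ssize_t>(len)
             && std::string(on_disk, len) == "library"   // file unchanged
             && std::string(view, len) == "Library";     // mapping changed
        munmap(addr, len);
    } else {
        ok = false;
    }

    close(fd);
    unlink(path);
    return ok;
}
```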
Line 27:
 
===Examples===
The [[String (C++)|string]] class provided by the [[C++ standard library]] was specifically designed to allow copy-on-write implementations in the initial C++98 standard,<ref name="meyers">{{citation |first=Scott |last=Meyers |author-link=Scott Meyers |year=2012 |title=Effective STL |publisher=Addison-Wesley |pages=64–65 |url=https://books.google.com/books?id=U7lTySXdFk0C&pg=PT734|isbn=9780132979184 }}</ref> but not in the newer C++11 standard:<ref>{{cite web|title=Concurrency Modifications to Basic String|url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2534.html|website=Open Standards|access-date=13 February 2015}}</ref>
<syntaxhighlight lang="cpp">
std::string x("Hello");
Line 36:
// x still uses the same old buffer
</syntaxhighlight>
In the [[PHP]] programming language, all types except references are implemented as copy-on-write. For example, strings and arrays are passed by reference, but when modified, they are duplicated if they have non-zero reference counts. This allows them to act as value types without the performance problems of copying on assignment or making them immutable.<ref>{{cite web|last1=Pauli|first1=Julien|last2=Ferrara|first2=Anthony|last3=Popov|first3=Nikita|title=Memory management|url=http://www.phpinternalsbook.com/zvals/memory_management.html#reference-counting-and-copy-on-write|website=www.phpinternalsbook.com|publisher=PHP Internals Book|access-date=4 August 2016|date=2013}}</ref>
 
In the [[Qt (software)|Qt]] framework, many types are copy-on-write ("implicitly shared" in Qt's terms). Qt uses atomic [[compare-and-swap]] operations to increment or decrement the internal reference counter. Since the copies are cheap, Qt types can often be safely used by multiple threads without the need for locking mechanisms such as [[Mutual exclusion|mutexes]]. The benefits of COW are thus valid in both single- and multithreaded systems.<ref>{{cite web|title=Threads and Implicitly Shared Classes|url=http://doc.qt.io/qt-5/threads-modules.html#threads-and-implicitly-shared-classes|website=Qt Project|access-date=4 August 2016}}</ref>
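
An atomically reference-counted, implicitly shared type can be sketched in standard C++. This is not Qt's implementation: <code>SharedPayload</code> and <code>SharedText</code> are hypothetical names, and the sketch uses <code>std::atomic</code> <code>fetch_add</code>/<code>fetch_sub</code> rather than explicit compare-and-swap. Only the reference counter is atomic; a single <code>SharedText</code> object must still not be modified from two threads at once.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical shared block: payload plus an atomic reference counter.
struct SharedPayload {
    std::atomic<int> ref{1};
    std::string text;
    explicit SharedPayload(std::string t) : text(std::move(t)) {}
};

class SharedText {
public:
    explicit SharedText(std::string t)
        : d_(new SharedPayload(std::move(t))) {}

    // Copying shares the payload and bumps the counter atomically.
    SharedText(const SharedText& other) : d_(other.d_) {
        d_->ref.fetch_add(1, std::memory_order_relaxed);
    }

    SharedText& operator=(const SharedText& other) {
        if (d_ != other.d_) {
            other.d_->ref.fetch_add(1, std::memory_order_relaxed);
            release();
            d_ = other.d_;
        }
        return *this;
    }

    ~SharedText() { release(); }

    const std::string& get() const { return d_->text; }

    void set(std::string t) {
        detach();  // copy only if someone else still shares the payload
        d_->text = std::move(t);
    }

    int ref_count() const { return d_->ref.load(std::memory_order_acquire); }

private:
    void detach() {
        if (d_->ref.load(std::memory_order_acquire) != 1) {
            SharedPayload* copy = new SharedPayload(d_->text);
            release();
            d_ = copy;
        }
    }

    // Drop one reference; the last owner frees the payload.
    void release() {
        if (d_->ref.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete d_;
        }
    }

    SharedPayload* d_;
};
```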
 
== In computer storage ==
 
COW may also be used as the underlying mechanism for [[Snapshot (computer storage)|snapshots]], such as those provided by [[logical volume management]], file systems such as [[Btrfs]] and [[ZFS]],<ref>{{cite web|url=http://sakisk.me/files/copy-on-write-based-file-systems.pdf|title=Copy On Write Based File Systems Performance Analysis And Implementation|last=Kasampalis|first=Sakis|year=2010|page=19|access-date=11 January 2013}}</ref> and database servers such as [[Microsoft SQL Server#Replication Services|Microsoft SQL Server]]. Typically, the snapshots store only the modified data, and are stored close to the main array, so they are only a weak form of [[incremental backup]] and cannot substitute for a [[full backup]].<ref>{{cite web|last1=Chien|first1=Tim|title=Snapshots Are NOT Backups|url=http://www.oracle.com/technetwork/documentation/rman-fra-snapshot-322251.html|website=www.oracle.com|publisher=Oracle|access-date=4 August 2016}}</ref> Some systems also use a COW technique to avoid [[fuzzy backup]]s, which otherwise occur when any file in the set of files being backed up changes during that backup.
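
Why such a snapshot stores only the modified data, and why it cannot stand in for a full backup, can be seen in a small in-memory model. This is one possible scheme, sketched under simplifying assumptions: the class <code>Volume</code>, its single snapshot, and string-valued blocks are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical block volume supporting one copy-on-write snapshot.
class Volume {
public:
    explicit Volume(std::vector<std::string> blocks)
        : blocks_(std::move(blocks)) {}

    // Taking a snapshot copies no data; it only starts tracking changes.
    void take_snapshot() {
        preserved_.clear();
        has_snapshot_ = true;
    }

    void write(std::size_t index, std::string data) {
        std::string& block = blocks_.at(index);
        if (has_snapshot_ && preserved_.count(index) == 0) {
            // First overwrite since the snapshot: save the old contents.
            preserved_.emplace(index, block);
        }
        block = std::move(data);
    }

    const std::string& read(std::size_t index) const {
        return blocks_.at(index);
    }

    // Snapshot view: preserved copy if the block changed, otherwise
    // the live block, which the snapshot still depends on.
    const std::string& read_snapshot(std::size_t index) const {
        assert(has_snapshot_);
        auto it = preserved_.find(index);
        return it != preserved_.end() ? it->second : blocks_.at(index);
    }

    std::size_t snapshot_blocks_stored() const { return preserved_.size(); }

private:
    std::vector<std::string> blocks_;
    std::map<std::size_t, std::string> preserved_;
    bool has_snapshot_ = false;
};
```

Unchanged blocks are read from the live volume in this model, so losing the volume also loses most of the snapshot.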
 
When implementing snapshots, there are two techniques: