Copy-on-write

{{Short description|Programming technique for efficiently duplicating data}}
{{More citations needed|date=August 2020}}
 
 
==In virtual memory management==
Copy-on-write finds its main use in sharing the [[virtual memory]] of [[operating system]] [[computer process|process]]es, in the implementation of the [[Fork (system call)|fork system call]]. Typically, the child process does not modify any memory and immediately executes a new program, replacing its address space entirely. Thus, it would be wasteful to copy all of the parent process's memory during a fork, and instead the copy-on-write technique is used.
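
The isolation that fork provides can be observed with a short C++ sketch, assuming a POSIX system. The function name <code>child_write_is_private</code> is hypothetical. Whether the kernel actually uses copy-on-write is not visible to the program; the sketch only shows that a write in the child does not reach the parent.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true if a write made by the child after fork() is invisible to the
// parent. Hypothetical helper for illustration; assumes a POSIX system.
bool child_write_is_private() {
    volatile int value = 1;         // lives in a page that both processes map after fork()
    pid_t pid = fork();
    if (pid < 0) return false;      // fork failed
    if (pid == 0) {
        // Child: on a copy-on-write kernel, this first write is what
        // triggers the copy of the affected page.
        value = 2;
        // A typical program would call one of the exec*() functions here
        // instead, discarding the inherited address space without writing to it.
        _exit(value == 2 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    bool child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return child_ok && value == 1;  // the parent still sees its original value
}
```

The child calls <code>_exit</code> rather than returning so that it does not run the parent's cleanup code a second time.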
 
Copy-on-write can be implemented efficiently using the [[page table]] by marking certain pages of [[computer storage|memory]] as read-only and keeping a count of the number of references to the page. When data is written to these pages, the [[kernel (computing)|kernel]] intercepts the write attempt and allocates a new physical page, initialized with the copy-on-write data, although the allocation can be skipped if there is only one reference. The kernel then updates the page table with the new (writable) page, decrements the number of references, and performs the write. The new allocation ensures that a change in the memory of one process is not visible in another's.
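
The fault-handling steps described above can be sketched as a toy simulation in C++. This is not kernel code: <code>ToyKernel</code>, <code>Frame</code>, <code>PageTableEntry</code> and the 16-byte page size are assumptions made for illustration, and frames are never freed.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t kPageSize = 16;  // tiny pages keep the example readable

struct Frame {
    std::array<unsigned char, kPageSize> bytes{};
    int refs = 0;  // number of page-table entries pointing at this frame
};

struct PageTableEntry {
    Frame* frame = nullptr;
    bool writable = false;
};

class ToyKernel {
public:
    // Allocate a fresh, privately owned, writable frame.
    PageTableEntry map_new() {
        Frame* f = allocate();
        return {f, true};
    }

    // Share a frame between two address spaces, as fork would:
    // both entries become read-only and the reference count goes up.
    PageTableEntry share(PageTableEntry& parent) {
        parent.writable = false;
        ++parent.frame->refs;
        return {parent.frame, false};
    }

    // Write one byte, taking the "page fault" path if the entry is read-only.
    void write(PageTableEntry& pte, std::size_t offset, unsigned char value) {
        if (!pte.writable) {
            if (pte.frame->refs > 1) {
                // Still shared: copy the contents into a new frame.
                Frame* copy = allocate();
                copy->bytes = pte.frame->bytes;
                --pte.frame->refs;
                pte.frame = copy;
            }
            // Sole owner (or fresh copy): the allocation is skipped and
            // the entry is simply made writable.
            pte.writable = true;
        }
        pte.frame->bytes.at(offset) = value;
    }

    std::size_t frames_allocated() const { return frames_.size(); }

private:
    Frame* allocate() {
        frames_.push_back(std::make_unique<Frame>());
        frames_.back()->refs = 1;
        return frames_.back().get();
    }
    std::vector<std::unique_ptr<Frame>> frames_;  // never freed in this sketch
};
```

In this model, the first write through a shared entry allocates a second frame, while a later write through the other entry finds a reference count of one and only flips the writable flag.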
 
The copy-on-write technique can be extended to support efficient [[memory allocation]] by having a page of [[physical memory]] filled with zeros. When the memory is allocated, all the pages returned refer to the page of zeros and are all marked copy-on-write. This way, physical memory is not allocated for the process until data is written, allowing processes to reserve more virtual memory than physical memory and use memory sparsely, at the risk of running out of virtual address space. The combined algorithm is similar to [[demand paging]].<ref name="Linux" />
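
A minimal C++ model of the shared zero page follows. The class <code>SparseMemory</code>, its member names and the 4096-byte page size are hypothetical; real kernels implement this inside the page-fault handler rather than in a container like this one.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t kSparsePageSize = 4096;
using SparsePage = std::array<unsigned char, kSparsePageSize>;

class SparseMemory {
public:
    // Every entry initially refers to the single shared page of zeros.
    explicit SparseMemory(std::size_t pages) : table_(pages, &zero_page()) {}

    unsigned char read(std::size_t addr) const {
        return (*table_.at(addr / kSparsePageSize))[addr % kSparsePageSize];
    }

    void write(std::size_t addr, unsigned char value) {
        SparsePage*& entry = table_.at(addr / kSparsePageSize);
        if (entry == &zero_page()) {
            // First write to this page: only now is a private page allocated.
            // make_unique value-initializes the array, so it starts as zeros.
            owned_.push_back(std::make_unique<SparsePage>());
            entry = owned_.back().get();
        }
        (*entry)[addr % kSparsePageSize] = value;
    }

    // Number of pages that are actually backed by private storage.
    std::size_t resident_pages() const { return owned_.size(); }

private:
    static SparsePage& zero_page() {
        static SparsePage zeros{};  // shared by all instances, never written
        return zeros;
    }
    std::vector<SparsePage*> table_;
    std::vector<std::unique_ptr<SparsePage>> owned_;
};
```

Reading anywhere in a freshly created <code>SparseMemory</code> returns zero without allocating anything; only pages that have been written count toward <code>resident_pages()</code>.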
==In software==
{{expand section|date=October 2017}}
COW is also used in [[Library (computer science)|library]], [[Application software|application]] and [[System software|system]] code.
 
In [[Multithreading (computer architecture)|multithreaded]] systems, COW can be implemented without traditional [[Lock (software engineering)|locking]], using [[compare-and-swap]] instead to increment or decrement the internal reference counter. Since the original resource is never altered, it can safely be copied by multiple threads (after the reference count has been increased) without performance-expensive locking such as [[Lock (computer science)|mutexes]]. If the reference counter reaches 0, then by definition only one thread was holding a reference, so the resource can safely be deallocated from memory, again without performance-expensive locking mechanisms. The benefit of not having to copy the resource (and the resulting performance gain over traditional deep copying) therefore applies in both single- and multithreaded systems.
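
One way such a lock-free reference count might look in C++ is sketched below. The class <code>CowVector</code> and its members are hypothetical and not taken from any particular library. The sketch assumes that each handle object is used by only one thread at a time, while different handles to the same storage may live in different threads.

```cpp
#include <atomic>
#include <cstddef>
#include <utility>
#include <vector>

class CowVector {
    struct Rep {
        std::atomic<long> refs{1};
        std::vector<int> data;
        explicit Rep(std::vector<int> d) : data(std::move(d)) {}
    };
    Rep* rep_;

    static void acquire(Rep* r) {
        long seen = r->refs.load();
        // Compare-and-swap loop: on failure, seen is refreshed and we retry.
        while (!r->refs.compare_exchange_weak(seen, seen + 1)) {}
    }
    static void release(Rep* r) {
        long seen = r->refs.load();
        while (!r->refs.compare_exchange_weak(seen, seen - 1)) {}
        if (seen == 1) delete r;  // the counter reached 0: we held the last reference
    }

public:
    explicit CowVector(std::vector<int> d) : rep_(new Rep(std::move(d))) {}
    CowVector(const CowVector& other) : rep_(other.rep_) { acquire(rep_); }
    CowVector& operator=(const CowVector& other) {
        if (rep_ != other.rep_) {
            acquire(other.rep_);
            release(rep_);
            rep_ = other.rep_;
        }
        return *this;
    }
    ~CowVector() { release(rep_); }

    int get(std::size_t i) const { return rep_->data.at(i); }
    bool shares_storage_with(const CowVector& o) const { return rep_ == o.rep_; }

    void set(std::size_t i, int value) {
        if (rep_->refs.load() > 1) {
            // Shared: detach by copying, leaving the original untouched.
            Rep* fresh = new Rep(rep_->data);
            release(rep_);
            rep_ = fresh;
        }
        rep_->data.at(i) = value;
    }
};
```

Copying a <code>CowVector</code> only bumps the counter; the underlying vector is duplicated the first time one of the sharing handles calls <code>set</code>. In practice <code>fetch_add</code> and <code>fetch_sub</code> would serve the same purpose as the explicit compare-and-swap loops, which are written out here to match the description above.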
 
===Examples===
 
When implementing snapshots, there are two techniques:
* Redirect-on-write or ROW: the original storage is never modified. When a write request is made, it is redirected away from the original data into a new storage area.
* Copy-on-write or COW: when a write request is made, the data are copied into a new storage area, and then the original data are modified.
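
The difference between the two techniques can be illustrated with a toy in-memory model in C++. The types <code>CowSnapshotVolume</code> and <code>RowSnapshotVolume</code> and the <code>device_writes</code> counter are hypothetical, named after the definitions in the list above; real storage systems track blocks on disk rather than in maps.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Copy-on-write snapshot as defined above: the old block is first copied to
// a new storage area, and then the original is modified in place.
struct CowSnapshotVolume {
    std::vector<std::string> live;             // original storage
    std::map<std::size_t, std::string> saved;  // snapshot area (old contents)
    int device_writes = 0;

    void write(std::size_t block, const std::string& data) {
        if (saved.count(block) == 0) {         // first write since the snapshot
            const std::string old = live.at(block);
            saved.emplace(block, old);
            ++device_writes;                   // write 1: preserve the old data
        }
        live.at(block) = data;
        ++device_writes;                       // write 2: update the original
    }
    std::string read_live(std::size_t b) const { return live.at(b); }
    std::string read_snapshot(std::size_t b) const {
        auto it = saved.find(b);
        return it != saved.end() ? it->second : live.at(b);
    }
};

// Redirect-on-write snapshot: the original storage is never modified.
struct RowSnapshotVolume {
    std::vector<std::string> original;              // frozen at snapshot time
    std::map<std::size_t, std::string> redirected;  // new storage area
    int device_writes = 0;

    void write(std::size_t block, const std::string& data) {
        (void)original.at(block);                   // bounds check only
        redirected[block] = data;
        ++device_writes;                            // the only write
    }
    std::string read_live(std::size_t b) const {
        auto it = redirected.find(b);
        return it != redirected.end() ? it->second : original.at(b);
    }
    std::string read_snapshot(std::size_t b) const { return original.at(b); }
};
```

In this model, the first update of a block costs two writes under copy-on-write and one under redirect-on-write, while both keep the pre-snapshot contents readable.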
 
Despite their names, "copy-on-write" usually refers to the first technique. COW does two data writes compared to ROW's one; it is difficult to implement efficiently and is thus used infrequently.

[[Phantom OS]] uses COW at all levels, not just in a database or file system. A computer running this system can fail at any time; when it starts again, the software and operating system resume operation, and only a small amount of work can be lost.
 
The basic approach is that all program data are kept in virtual memory. On some schedule, a summary of all software data is written to virtual memory, forming a log that tracks the current value and location of each value.
 
When the computer fails, a recent copy of the log and other data remain safe on disk. When operation resumes, operating system software reads the log to restore consistent copies of all the programs and data.
 
{{File systems}}
 
[[Category:Articles with example C++ code]]
[[Category:Computer data storage]]