Persistent data structure: Difference between revisions

m Partial versus full persistence: fix capitalization, improve wording
Line 9:
 
==Partial versus full persistence==
In the partial persistence model, a programmer may query any previous version of a data structure, but may only update the latest version. This implies a [[Total order|linear ordering]] among the versions of the data structure.<ref>{{Citation|last1=Conchon|first1=Sylvain|chapter=Semi-persistent Data Structures|pages=322–336|publisher=Springer Berlin Heidelberg|isbn=9783540787389|last2=Filliâtre|first2=Jean-Christophe|doi=10.1007/978-3-540-78739-6_25|title=Programming Languages and Systems|volume=4960|series=Lecture Notes in Computer Science|year=2008|doi-access=free}}</ref> In the fully persistent model, both updates and queries are allowed on any version of the data structure. In some cases the [[Computer performance|performance characteristics]] of querying or updating older versions of a data structure may be allowed to degrade, as is true with the [[Rope (data structure)|rope data structure]].<ref>{{Cite book|title=RRB-Trees: Efficient Immutable Vectors|last1=Bagwell|first1=Philip|last2=Rompf|first2=Tiark|date=2011|oclc=820379112}}</ref> A data structure is called confluently persistent if, in addition to being fully persistent, two versions of the same data structure can be combined to form a new version that is still fully persistent.<ref>{{Citation|last1=Brodal|first1=Gerth Stølting|title=Purely Functional Worst Case Constant Time Catenable Sorted Lists|date=2006|work=Lecture Notes in Computer Science|pages=172–183|publisher=Springer Berlin Heidelberg|isbn=9783540388753|last2=Makris|first2=Christos|last3=Tsichlas|first3=Kostas|doi=10.1007/11841036_18|citeseerx=10.1.1.70.1493}}</ref>
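
The difference between the two models can be illustrated with a minimal sketch. The immutable stack below is an assumption chosen for brevity and is not taken from the cited sources; the names <code>push</code> and <code>to_list</code> are hypothetical. Every update returns a new version and leaves the old one intact, so any version can be both queried and updated, which corresponds to full persistence. A partially persistent structure would permit only the update that starts from the latest version.

```python
def push(stack, value):
    """Return a new version of the stack; the old version is left untouched."""
    return (value, stack)


def to_list(stack):
    """Query a version: collect its elements from top to bottom."""
    items = []
    while stack is not None:
        value, stack = stack
        items.append(value)
    return items


v0 = None          # empty stack, version 0
v1 = push(v0, 1)   # update the latest version
v2 = push(v1, 2)   # update the latest version again (allowed in both models)
v3 = push(v1, 3)   # update an OLDER version: needs full persistence

# Versions v2 and v3 both descend from v1, so the versions form a tree
# rather than a linear order.
```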
 
==Techniques for preserving previous versions==
Line 35:
Whenever a node is accessed, the modification box is checked, and its timestamp is compared against the access time. (The access time specifies the version of the data structure being considered.) If the modification box is empty, or the access time is before the modification time, then the modification box is ignored and only the normal part of the node is considered. On the other hand, if the access time is after the modification time, then the value in the modification box is used, overriding that value in the node.
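
The access rule can be sketched as follows. This is an illustrative sketch, not a reference implementation: the <code>Node</code> class, its field layout and the <code>read_field</code> function are hypothetical names. It also assumes that an access at exactly the modification time sees the modified value, a case the description above leaves open.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class Node:
    # The "normal part" of the node: field name -> value.
    fields: Dict[str, Any]
    # The modification box: (timestamp, field name, new value), or None if empty.
    mod: Optional[Tuple[int, str, Any]] = None


def read_field(node: Node, name: str, access_time: int) -> Any:
    """Read one field of a node as it was at version access_time."""
    if node.mod is not None:
        mod_time, mod_name, mod_value = node.mod
        # Assumption: an access at exactly mod_time sees the modification.
        if mod_name == name and access_time >= mod_time:
            return mod_value
    # Empty box, a different field, or an access before the modification.
    return node.fields[name]


example = Node({"key": 5, "left": None})
example.mod = (3, "key", 7)   # at time 3, "key" was changed to 7
```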
 
Modifying a node works as follows. (It is assumed that each modification touches one pointer or similar field.) If the node's modification box is empty, then it is filled with the modification. Otherwise, the modification box is full. A copy of the node is made, but using only the latest values. The modification is performed directly on the new node, without using the modification box. (One of the new node's fields is overwritten and its modification box stays empty.) Finally, this change is cascaded to the node's parent, just like path copying. (This may involve filling the parent's modification box, or making a copy of the parent recursively. If the node has no parent because it is the root, the new root is added to a [[sorted array]] of roots.)
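
The update procedure can be sketched in the same spirit. The sketch rests on several assumptions that are not part of the description above: nodes have no parent pointers, so the caller supplies the path from the root to the target in the latest version; timestamps are increasing integers; and an access at exactly the modification time sees the modified value. All class and method names are hypothetical.

```python
import bisect


class Node:
    def __init__(self, fields):
        self.fields = dict(fields)   # the normal part of the node
        self.mod = None              # modification box: (time, name, value)

    def get(self, name, time):
        """Read a field as of the given version time."""
        if self.mod is not None and self.mod[1] == name and time >= self.mod[0]:
            return self.mod[2]
        return self.fields[name]

    def latest_fields(self):
        """Field values with the modification box (if any) applied."""
        latest = dict(self.fields)
        if self.mod is not None:
            latest[self.mod[1]] = self.mod[2]
        return latest


class PersistentTree:
    def __init__(self, root):
        self.root_times = [0]        # sorted array of version times
        self.roots = [root]          # root that is valid from each time onward

    def root_at(self, time):
        """Binary-search the sorted array for the root valid at the given time."""
        return self.roots[bisect.bisect_right(self.root_times, time) - 1]

    def set_field(self, path, name, value, time):
        """Set one field of the last node on path at the given time.

        path is a list of (node, field_in_parent) pairs from the root to
        the target in the latest version; the root's entry uses None.
        """
        node, link = path[-1]
        if node.mod is None:
            node.mod = (time, name, value)        # box empty: fill it
            return
        copy = Node(node.latest_fields())         # box full: copy latest values
        copy.fields[name] = value                 # modify the copy directly
        if link is None:                          # the node was the root
            self.root_times.append(time)
            self.roots.append(copy)
        else:                                     # cascade to the parent
            self.set_field(path[:-1], link, copy, time)
```

In this sketch a single update may fill one modification box, or copy a chain of full nodes up to the first ancestor whose box is empty, or copy every node up to the root and record a new root.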
 
With this [[algorithm]], given any time t, at most one modification box exists in the data structure with time t. Thus, a modification at time t splits the tree into three parts: one part contains the data from before time t, one part contains the data from after time t, and one part was unaffected by the modification.