Persistent data structure: Difference between revisions

Content deleted Content added
m Fixing archives for YouTube videos (WP:Link_Rot, WP:CEFC#Pre-emptive_archiving, phab:T294880)
Line 80:
====Persistent Data Structure Method====
We can notice that what really takes time in the data structure used in the naïve method is that whenever we move from a strip to the next, we need to take a snap shot of whatever data structure we are using to keep things in sorted order. We can notice that once we get the segments that intersect <math>s_{i}</math>, when we move to <math>s_{i+1}</math> either one thing leaves or one thing enters. If the difference between what is in <math>s_{i}</math> and what is in <math>s_{i+1}</math> is only one insertion or deletion then it is not a good idea to copy everything from <math>s_{i}</math> to <math>s_{i+1}</math>. The trick is that since each copy differs from the previous one by only one insertion or deletion, then we need to copy only the parts that change. Let us assume that we have a tree rooted at <math>T</math>. When we insert a key <math>k</math> into the tree, we create a new leaf containing <math>k</math>. Performing rotations to rebalance the tree will only modify the nodes of the path from <math>k</math> to <math>T</math>. Before inserting the key <math>k</math> into the tree, we copy all the nodes on the path from <math>k</math> to <math>T</math>. Now we have 2 versions of the tree, the original one which doesn't contain <math>k</math> and the new tree that contains <math>k</math> and whose root is a copy of the root of <math>T</math>. Since copying the path from <math>k</math> to <math>T</math> doesn't increase the insertion time by more than a constant factor then the insertion in the persistent data structure takes <math>O(Log(n))</math> time. For the deletion, we need to find which nodes will be affected by the deletion. For each node <math>v</math> affected by the deletion, we copy the path from the root to <math>v</math>. This will provide a new tree whose root is a copy of the root of the original tree. Then we perform the deletion on the new tree. We will end up with 2 versions of the tree. The original one which contains <math>k</math> and the new one which doesn't contain <math>k</math>. Since any deletion only modifies the path from the root to <math>v</math> and any appropriate deletion algorithm runs in <math>O(Log(n))</math>, thus the deletion in the persistent data structure takes <math>O(Log(n))</math>. Every sequence of insertion and deletion will cause the creation of a sequence of dictionaries or versions or trees <math>S_{1}, S_{2}, \dots S_{i}</math> where each <math>S_{i}</math> is the result of operations <math>S_{1}, S_{2}, \dots S_{i}</math>. If each <math>S_{i}</math> contains <math>m</math> elements, then the search in each <math>S_{i}</math> takes <math>O(Log(m))</math>. Using this persistent data structure we can solve the next element search problem in <math>O(Log(n))</math> query time and <math>O(n\cdot Log(n))</math> space instead of <math>O(n^{2})</math>. Please find below the source code for an example related to the next search problem.
 
====Source Code Example of Next Search Problem====
Below is a source code sample of how to find the segment above a given point q. This programs takes as input a list of segments and a query point and returns the starting point and the ending point of the segment above it.
<syntaxhighlight lang="C++" line>
#include <iostream>
using namespace std;
class Point
{
public:
int x;
int y;
 
Point();
Point(int,int);
};
 
Point::Point()
{
x=0;
y=0;
}
 
Point::Point(int xValue, int yValue)
{
x=xValue;
y=yValue;
}
 
std::ostream& operator<<(std::ostream &strm, const Point &p) {
return strm << "(" << p.x << ","<<p.y<<")";
}
/*____________________________________________________________________________________*/
 
class Segment
{
public:
Point startPoint;
Point endPoint;
 
Segment();
Segment(Point,Point);
};
 
Segment::Segment()
{
startPoint= Point();
}
 
Segment::Segment(Point startPointValue,Point endPointValue)
{
startPoint=Point(startPointValue.x,startPointValue.y);
endPoint=Point(endPointValue.x,endPointValue.y);
}
 
std::ostream& operator<<(std::ostream &strm, const Segment &s) {
return strm << "(" << s.startPoint << ","<<s.endPoint<<")";
}
/*____________________________________________________________________________________*/
 
class BST
{
public:
Point point;
BST *left, *right;
 
BST();
BST(Point);
BST* insert(BST*, Point);
BST* insertToCopy(BST*,BST*, Point);
BST* minValueNode(BST*);
BST* deleteNode(BST*, Point);
BST* deleteFromCopy(BST*,BST*, Point);
Point findSmallestLargerOrEqualYCoordinatePoint(BST*, int);
bool searchByYCoordinate(BST*, int);
void Traverse(BST*);
};
 
BST::BST()
{
point=Point();
left=NULL;
right=NULL;
}
BST::BST(Point p)
{
point = Point(p.x,p.y);
left = right = NULL;
}
 
bool BST::searchByYCoordinate(BST* node, int yCoordinate)
{
if (node == NULL)
return false;
if (node->point.y == yCoordinate)
return true;
bool res1 = searchByYCoordinate(node->left, yCoordinate);
 
if(res1) return true;
bool res2 = searchByYCoordinate(node->right, yCoordinate);
return res2;
}
BST* BST::insert(BST* root, Point p)
{
if (!root)
{
return new BST(p);
}
if (p.y > root->point.y)
{
root->right = insert(root->right, p);
}
else
{
root->left = insert(root->left, p);
}
return root;
}
 
BST* BST::insertToCopy(BST* source, BST* destination,Point p)
{
if(source==NULL)
{
destination = insert(destination, p);
}
else if (p.y > source->point.y)
{
destination= insert(destination, source->point);
destination->left=source->left;
destination->right=insertToCopy(source->right,destination->right,p);
}
else if (p.y < source->point.y)
{
destination= insert(destination, source->point);
destination->right=source->right;
destination->left =insertToCopy(source->left,destination->left,p);
}
 
return destination;
}
 
BST* BST::minValueNode(BST* node)
{
BST* current = node;
while (current && current->left != NULL)
current = current->left;
return current;
}
 
BST* BST::deleteNode(BST* root, Point p)
{
if (root == NULL)
return root;
if (p.y < root->point.y)
root->left = deleteNode(root->left, p);
else if (p.y > root->point.y)
root->right = deleteNode(root->right, p);
else {
if (root->left==NULL and root->right==NULL)
return NULL;
else if (root->left == NULL) {
BST* temp = root->right;
free(root);
return temp;
}
else if (root->right == NULL) {
BST* temp = root->left;
free(root);
return temp;
}
BST* temp = minValueNode(root->right);
root->point = temp->point;
root->right = deleteNode(root->right, temp->point);
}
return root;
}
 
BST* BST::deleteFromCopy(BST* source, BST* destination,Point p)
{
if (p.y > source->point.y)
{
destination= insert(destination, source->point);
destination->left=source->left;
destination->right=deleteFromCopy(source->right,destination->right,p);
}
else if (p.y < source->point.y)
{
destination= insert(destination, source->point);
destination->right=source->right;
destination->left =deleteFromCopy(source->left,destination->left,p);
}
else
{
if(source->right==NULL && source->left==NULL)
{
destination==NULL;
}
else if(source->right==NULL)
{
destination= source->left;
}
else if(source->left==NULL)
{
destination= source->right;
}
else
{
destination=insert(destination,minValueNode(source->right)->point);
destination->left=source->left;
deleteFromCopy(source->right,destination->right,minValueNode(source->right)->point);
}
}
 
return destination;
}
 
void BST::Traverse(BST* root)
{
if (!root) {
return;
}
Traverse(root->left);
cout << root->point << endl;
Traverse(root->right);
}
 
Point BST::findSmallestLargerOrEqualYCoordinatePoint(BST* root, int key)
{
if (root->left == NULL && root->right == NULL&& root->point.y < key)
return Point(-1,-1);
if ((root->point.y >= key && root->left == NULL)|| (root->point.y >= key && root->left->point.y < key))
return root->point;
if (root->point.y <= key)
return findSmallestLargerOrEqualYCoordinatePoint(root->right, key);
else
return findSmallestLargerOrEqualYCoordinatePoint(root->left, key);
}
/*____________________________________________________________________________________*/
 
class Strip
{
public:
Point point;
BST *copy;
 
Strip();
Strip(Point);
Strip(Point,BST*);
void sort(Strip*,int);
};
 
Strip::Strip()
{
point= Point();
copy=NULL;
}
 
Strip::Strip(Point p)
{
point= Point(p.x,p.y);
copy=NULL;
}
 
Strip::Strip(Point p,BST *t)
{
point= Point(p.x,p.y);
copy=t;
}
 
void swap(int* a, int* b)
{
int t = *a;
*a = *b;
*b = t;
}
 
int getIndexOfStrip( Strip *strips, int key, int stripSize )
{
int top = 0;
int bot = stripSize;
int mid=0;
while ( top <= bot )
{
mid = (top+bot)/2;
if ( key < strips[mid].point.x)
{
if ( mid == 0 )
return -1;
else
bot = mid-1;
}
else
{
if ( mid == stripSize )
break;
else if ( key >= strips[mid+1].point.x )
top = mid+1;
else
break;
}
}
while(strips[mid].point.x==key && strips[mid+1].point.x==key)
{
mid=mid+1;
}
 
return mid;
}
 
/*___________________________________________________________________________________*/
 
int main()
{
Strip temp;
BST b, *root = NULL;
Point points[] = { Point(1,1), Point(3,1), Point(2,2), Point(4,2), Point(2,4),Point(4,4),Point(2,5),Point(4,5),Point(3,7),Point(5,7) };
Point q(3,3);
Segment segments[]={Segment(points[0],points[1]),Segment(points[2],points[3]),Segment(points[4],points[5]),Segment(points[6],points[7])};
int stripSize=(sizeof points / sizeof points[0]);
int segmentSize=(sizeof segments / sizeof segments[0]);
Strip *strips=new Strip [stripSize];
 
//Initialize the array of strips
for(int i=0;i<stripSize;i++)
{
strips[i]=Strip(points[i]);
}
//Sort the array of strips by x coordinate
for (int i = 0; i < stripSize; i++)
for (int j = stripSize - 1; j > i; j--)
if (strips[j].point.x <strips[j - 1].point.x )
{
temp = strips[j];
strips[j] = strips[j - 1];
strips[j - 1] = temp;
}
 
strips[0].copy=(strips[0].copy)->insert(strips[0].copy,strips[0].point);
 
for (int i = 1; i < stripSize; i++)
{
/*If the y coordinate of the point is in the tree then delete it
otherwise insert the y coordinate in the appropriate strip*/
if(!(strips[i-1].copy->searchByYCoordinate(strips[i-1].copy,strips[i].point.y)))
{
strips[i].copy=strips[i].copy->insertToCopy(strips[i-1].copy,strips[i].copy,strips[i].point);
}
else
{
strips[i].copy=strips[i].copy->deleteFromCopy(strips[i-1].copy,strips[i].copy,strips[i].point);
}
}
for (int i = 1; i < stripSize; i++)
{
cout<<"For strip "<<i<<" the items are:"<< endl;
strips[i].copy->Traverse(strips[i].copy);
cout<<";";
}
 
//search for the appropriate strip based on x coordinate, then search for the smallest larger or equal y coordinate in that strip
int indexOfStrip= getIndexOfStrip(strips,q.x,stripSize);
Point target=strips[indexOfStrip].copy->findSmallestLargerOrEqualYCoordinatePoint(strips[indexOfStrip].copy,q.y);
if((target.x==-1 && target.y==-1))
{
cout<<"There is no segment above the query point q";
}
else
{
cout<<"Segment above the query point q has starting point:";
cout<<target;
for(int i=0;i<segmentSize;i++)
{
if(segments[i].startPoint.x==target.x && segments[i].startPoint.y==target.y)
{
cout<<"and end point:";
cout<<segments[i].endPoint;
break;
}
}
}
delete[] strips;
return 0;
}
</syntaxhighlight>
 
==Examples of persistent data structures==