Skip list (SkipList) [data structure]: principle and implementation

Why skip lists

The balanced data structures in common use today include B-trees, red-black trees, AVL trees, splay trees, and treaps.

Imagine being handed a piece of scratch paper, a pen, and an editor. Could you implement a red-black tree or an AVL tree right away? Probably not. It takes time, there are many details to get right, and you end up consulting a pile of algorithms and data structures books as well as code on the Internet, which is quite troublesome.

Use a skip list instead. A skip list is a randomized data structure, currently used by the open source projects Redis and LevelDB. Its efficiency is comparable to that of red-black trees and AVL trees, but its principle is quite simple: if you can work with linked lists proficiently, you can implement a SkipList.

Consider an ordered linked list:
[Figure: an ordered linked list]
Searching this list for the elements <23, 43, 59> takes <2, 4, 6> comparisons respectively, for a total of 2 + 4 + 6 = 12 comparisons. Can we do better? The list is ordered, but binary search cannot be applied to a linked list. Borrowing the idea of a binary search tree, we extract some of the nodes and use them as an index, which gives the following structure:

[Figure: the ordered list with a first-level index]

The extracted nodes form a first-level index, which reduces the number of comparisons during a search.
We can in turn extract some elements from the first-level index to form a second-level index, a third-level index, and so on.
[Figure: the ordered list with several index levels]
With so few elements the benefit is not visible. With enough elements, this index structure shows its advantage.

Skip list

The following structure is a skip list.
In it, -1 stands for INT_MIN, the minimum value of each linked list, and 1 stands for INT_MAX, the maximum value of each linked list.

[Figure: a skip list]

A skip list has the following properties:
(1) It consists of several levels.
(2) Each level is an ordered linked list.
(3) The linked list at the lowest level (Level 1) contains all elements.
(4) If an element appears in the linked list of Level i, it also appears in the linked lists of all levels below Level i.
(5) Each node contains two pointers: one to the next element in the same linked list, and one to the same element in the level below.
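
The properties above suggest a node layout along the following lines. This is only a minimal sketch in C: the names `Node`, `node_new`, and `level_new` are illustrative assumptions, not taken from the original figures, and each level is assumed to be bracketed by an INT_MIN head and an INT_MAX tail.

```c
#include <limits.h>
#include <stdlib.h>

/* One node on one level. An element that occupies K levels is stored as a
 * tower of K such nodes linked through `down`. */
typedef struct Node {
    int key;
    struct Node *next;  /* next element in the same level */
    struct Node *down;  /* same element one level lower; NULL on Level 1 */
} Node;

/* Allocate and fill a node; returns NULL if malloc fails. */
static Node *node_new(int key, Node *next, Node *down) {
    Node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->key = key;
        n->next = next;
        n->down = down;
    }
    return n;
}

/* Create an empty level (INT_MIN head -> INT_MAX tail) stacked on `below`,
 * the head of the level underneath (NULL for Level 1). Returns the new head.
 * The tail's `down` pointer is left NULL because a search never descends
 * from the tail. */
static Node *level_new(Node *below) {
    Node *tail = node_new(INT_MAX, NULL, NULL);
    if (tail == NULL) return NULL;
    Node *head = node_new(INT_MIN, tail, below);
    if (head == NULL) free(tail);
    return head;
}
```

Calling `level_new(NULL)` gives an empty Level 1, and `level_new(level1)` stacks an empty Level 2 on top of it.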

[Figure: searching the skip list]

Example: find element 117
(1) Compare with 21: 117 is greater, so move right.
(2) Compare with 37: 117 is greater than 37 but smaller than the maximum value of the linked list that follows, so continue from 37 on the level below.
(3) Compare with 71: 117 is greater than 71 but smaller than the maximum value of the linked list that follows, so continue from 71 on the level below.
(4) Compare with 85: 117 is greater, so move right.
(5) Compare with 117: equal, so the node is found.

The specific search algorithm is as follows:

    /* If x is present, return the node that holds x;
     * otherwise return the successor node of x. */
    find(x) {
        p = top;
        while (1) {
            while (p->next->key < x)
                p = p->next;
            if (p->down == NULL)
                return p->next;
            p = p->down;
        }
    }
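
The pseudocode above can be turned into compilable C roughly as follows. This is a sketch under a few assumptions: `top` is passed as a parameter instead of being a global, x lies strictly between INT_MIN and INT_MAX, and `demo_list` is a hypothetical hand-built fixture (not the list from the original figure) used only to exercise `find`.

```c
#include <limits.h>
#include <stddef.h>

typedef struct Node {
    int key;
    struct Node *next;  /* next element in the same level */
    struct Node *down;  /* same element one level lower; NULL on Level 1 */
} Node;

/* Return the Level 1 node holding x if present, otherwise the Level 1 node
 * holding the smallest key greater than x (the successor of x). */
static Node *find(Node *top, int x) {
    Node *p = top;
    for (;;) {
        while (p->next->key < x)  /* move right while the next key is smaller */
            p = p->next;
        if (p->down == NULL)      /* Level 1: p->next is x or its successor */
            return p->next;
        p = p->down;              /* otherwise drop one level */
    }
}

/* Hypothetical fixture: Level 2 = {37, 85}, Level 1 = {21, 37, 71, 85, 117}. */
static Node *demo_list(void) {
    static Node l1[7], l2[4];
    const int k1[7] = {INT_MIN, 21, 37, 71, 85, 117, INT_MAX};
    const int k2[4] = {INT_MIN, 37, 85, INT_MAX};
    const int d2[4] = {0, 2, 4, 6};  /* index in l1 below each l2 node */
    for (int i = 0; i < 7; i++) {
        l1[i].key = k1[i];
        l1[i].next = (i < 6) ? &l1[i + 1] : NULL;
        l1[i].down = NULL;
    }
    for (int i = 0; i < 4; i++) {
        l2[i].key = k2[i];
        l2[i].next = (i < 3) ? &l2[i + 1] : NULL;
        l2[i].down = &l1[d2[i]];
    }
    return &l2[0];  /* head of the top level */
}
```

With this fixture, `find(demo_list(), 117)` returns the node holding 117, while `find(demo_list(), 50)` returns the node holding 71, the successor of 50.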

Insertion into a skip list

First determine the number of levels K that the element will occupy (by tossing a coin, so K is completely random).
Then insert the element into the linked list of each level from Level 1 to Level K.
Example: insert 119, K = 2
[Figure: inserting 119 with K = 2]

If K is greater than the current number of levels in the skip list, new levels must be added.
Example: insert 119, K = 4

[Figure: inserting 119 with K = 4]
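
The insertion steps can be sketched in C as follows. To keep the example deterministic, K is passed in as the parameter `k` rather than drawn at random. The `SkipList` wrapper, `MAX_LEVEL`, `add_level`, and `skiplist_init` are assumptions made for this sketch, and it assumes x is not already present.

```c
#include <limits.h>
#include <stdlib.h>

#define MAX_LEVEL 16  /* assumed cap on the number of levels */

typedef struct Node {
    int key;
    struct Node *next, *down;
} Node;

typedef struct SkipList {
    Node *top;   /* head sentinel of the highest level */
    int height;  /* current number of levels */
} SkipList;

static Node *node_new(int key, Node *next, Node *down) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) abort();  /* keep the sketch short: give up on OOM */
    n->key = key;
    n->next = next;
    n->down = down;
    return n;
}

/* Put an empty level (INT_MIN -> INT_MAX) on top of the current top level. */
static void add_level(SkipList *s) {
    Node *tail = node_new(INT_MAX, NULL, NULL);
    s->top = node_new(INT_MIN, tail, s->top);
    s->height++;
}

static void skiplist_init(SkipList *s) {
    s->top = NULL;
    s->height = 0;
    add_level(s);  /* Level 1 always exists */
}

/* Insert x so that it occupies Level 1 .. Level k.
 * Assumes 1 <= k <= MAX_LEVEL and INT_MIN < x < INT_MAX. */
static void insert(SkipList *s, int x, int k) {
    Node *pred[MAX_LEVEL + 1];
    while (s->height < k)  /* K exceeds the height: add new levels */
        add_level(s);
    /* Walk down from the top, recording the predecessor of x on each level. */
    Node *p = s->top;
    for (int lvl = s->height; lvl >= 1; lvl--) {
        while (p->next->key < x)
            p = p->next;
        pred[lvl] = p;
        p = p->down;
    }
    /* Splice bottom-up so each new node can point down to the one below. */
    Node *below = NULL;
    for (int lvl = 1; lvl <= k; lvl++) {
        below = node_new(x, pred[lvl]->next, below);
        pred[lvl]->next = below;
    }
}
```

For instance, after `skiplist_init(&s)`, calling `insert(&s, 117, 1)` and then `insert(&s, 119, 2)` grows the list to two levels, with 119 present on both.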

Determining K by tossing a coin
When an element is inserted, the number of levels it occupies is completely random and is generated by the following algorithm:

    /* random(0, 1) returns 0 or 1 with equal probability */
    int random_level() {
        int K = 1;
        while (random(0, 1))
            K++;
        return K;
    }

This is equivalent to a coin-toss experiment: keep tossing as long as the coin lands heads, and stop at the first tails.
The number of tosses K in the experiment is the number of levels the element occupies. The random variable K follows a geometric distribution with parameter p = 1/2, so its expected value is E[K] = 1/p = 2. That is, each element occupies 2 levels in expectation.
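
A compilable version of the coin-toss procedure might look like the sketch below. Two things here are assumptions rather than part of the original pseudocode: the standard `rand()` stands in for `random(0, 1)`, and a `max_level` cap is added so that the result stays bounded.

```c
#include <stdlib.h>

/* Toss a fair coin until it lands tails; the number of tosses is the level.
 * Returns a value in [1, max_level]. Call srand() once beforehand to vary
 * the sequence between runs. */
static int random_level(int max_level) {
    int k = 1;
    /* rand() > RAND_MAX / 2 is true for half of the possible values when
     * RAND_MAX is odd (as it is on common platforms), which plays the role
     * of "heads". */
    while (k < max_level && rand() > RAND_MAX / 2)
        k++;
    return k;
}
```

Comparing against `RAND_MAX / 2` rather than testing the lowest bit avoids relying on the quality of the low-order bits, which is weak in some older `rand()` implementations.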

Height of the skip list
For a skip list of n elements, one experiment is run per inserted element to determine the number of levels K that the element occupies.
The height of the skip list equals the maximum K produced in these n experiments (to be continued).
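
As a hedged sketch of where this analysis usually goes, the standard union-bound argument is shown below. The symbols are notation introduced here, not in the original: K_i is the level count of the i-th element, H = max K_i is the height, and c > 1 is an arbitrary constant.

```latex
\begin{align*}
\Pr[K_i > k] &= 2^{-k}
  && \text{(the first $k$ tosses are all heads)} \\
\Pr[H > k] &\le \sum_{i=1}^{n} \Pr[K_i > k] = n \, 2^{-k}
  && \text{(union bound)} \\
\Pr[H > c \log_2 n] &\le n \cdot n^{-c} = n^{1-c}
\end{align*}
```

Under these assumptions the height exceeds c log2 n only with probability at most n^(1-c), so the height is O(log n) with high probability.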

Space complexity of the skip list
By the analysis above, each element occupies 2 levels in expectation, so a skip list holding n elements has 2n nodes in expectation.
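
The 2n figure follows from linearity of expectation. In the short derivation below, K_i (the level count of the i-th element) and N (the total number of element nodes, sentinels not counted) are notation introduced here for illustration.

```latex
\begin{align*}
\Pr[K_i = k] &= 2^{-k}, \qquad k = 1, 2, 3, \dots \\
E[K_i] &= \sum_{k \ge 1} k \, 2^{-k} = 2 \\
E[N] &= E\Big[\sum_{i=1}^{n} K_i\Big] = \sum_{i=1}^{n} E[K_i] = 2n
\end{align*}
```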

Deletion from a skip list

Find the node containing x in each level and remove it with the standard linked-list deletion.
Example: delete 71

[Figure: deleting 71]
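
Deletion can be sketched in C as below. The names `delete_key`, `mk`, and `demo_list` are assumptions for this example, the fixture is a hypothetical two-level list rather than the one in the figure, and levels that become empty are deliberately left in place to keep the sketch short.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *next, *down;
} Node;

/* Remove x from every level that contains it; returns true if x was found.
 * Assumes INT_MIN < x < INT_MAX and that nodes were allocated with malloc. */
static bool delete_key(Node *top, int x) {
    bool found = false;
    for (Node *p = top; p != NULL; p = p->down) {
        while (p->next->key < x)  /* find the predecessor of x on this level */
            p = p->next;
        if (p->next->key == x) {  /* standard singly linked list unlink */
            Node *victim = p->next;
            p->next = victim->next;
            free(victim);
            found = true;
        }
    }
    return found;
}

static Node *mk(int key, Node *next, Node *down) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->key = key;
    n->next = next;
    n->down = down;
    return n;
}

/* Hypothetical fixture: Level 2 = {71}, Level 1 = {37, 71, 85}. */
static Node *demo_list(void) {
    Node *t1 = mk(INT_MAX, NULL, NULL);
    Node *c = mk(85, t1, NULL);
    Node *b = mk(71, c, NULL);
    Node *a = mk(37, b, NULL);
    Node *h1 = mk(INT_MIN, a, NULL);
    Node *t2 = mk(INT_MAX, NULL, NULL);
    Node *b2 = mk(71, t2, b);
    return mk(INT_MIN, b2, h1);  /* head of Level 2 */
}
```

After `delete_key(top, 71)` on this fixture, Level 2 holds only its sentinels and Level 1 holds 37 and 85. A fuller implementation would probably also drop empty top levels.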

