[Data Structure] Analysis of B Tree

1. The concept of B-tree

B-tree, in general, is a balanced multi-branch tree in which a node can have more than 2 child nodes.

Features:

1> The root node has at least two child nodes

2>Each non-follow node node has (M/2)-1 to M -1 key

3>Each non-root node has [M/2 ,M] children

4>between key[i] and key[i+1] The value of the child node is between key[i] and key[i+1]

5> All leaf nodes are in the same layer

Second, B-tree application

Because the efficiency of adding, deleting, checking and modifying is very high (time complexity log M-1 is the logarithm of the base N), B-trees are commonly used in databases and file systems. Generally, M is defined to be very large, so that the height of the B-tree is very low, and binary search can be used in each node, so the B-tree is highly efficient. The disadvantage is that it consumes more memory.

Three, implement a B-tree

1. Node content

According to the characteristics of the B-tree, a node should contain a key array of size M-1 , But considering that the B-tree needs to insert nodes first and then split, so the key array gives M large space. Second, it should contain an array of node pointer type, storing pointers to children, and the size should be M+1. Also, a pointer to the parent node, and a size representing the actual number of key values ​​of the current node.

2. Node insertion

1>When the tree is empty, directly call the node’s constructor, new a root.

2>When it is not empty, search the tree first. If the key code to be inserted already exists, the insertion fails. If it does not exist, let the Find() function return the pointer of the inserted position. Then, Find not only returns whether the key value is in the tree (bool), but also returns the position where the node is to be inserted (Node*). Therefore, we use a type pair that exists in the library, which can bring back two return values.

3>Next, insert the key value into the position returned by Find.

4>Check whether the number of key values ​​after insertion exceeds M-1, if not, the insertion is successful, if it exceeds, you need to split. According to the characteristics of the B-tree, it can be known that the insertion position can only be a leaf node.

What is the split? See the picture below

Splitting is the difficulty of insertion. After understanding splitting, there is no problem with insertion.

3. Code implementation

#include using namespace std;templatestruct BTreeNode{ typedef BTreeNode Node; K _keys[M]; //An extra location is given to facilitate splitting. Node* _sub[M + 1]; Node* _parent; size_t _size; //Record the number of actual keywords BTreeNode() :_parent(NULL), _size(0) {for (size_t i = 0; i class BTree{public: typedef BTreeNode Node; BTree() :_root(NULL) {} void InOrder() {_InOrder(_root);} ~BTree() {_Destory(_root);} bool Insert(const K& key) {if (NULL == _root) { _root = new Node(); _root->_keys[0] = key; _root->_size++; return true;} pair tmp = _Find(key); if (tmp.second != -1) //If the key value is already existing, it cannot be inserted {return false;} Node* cur = tmp.first; Node* sub = NULL; K newkey = key; while (1) {_Insertkey(cur,newkey,sub); //First put the key into the node to be inserted //Judging whether the key number of the node meets the standard. if (cur->_size _size >= M) {size_t mid = cur->_size / 2; //1. Split a new node Node* NewNode = new Node; for (size_t i = mid+1; i _size; i++)//The key value after mid is given to the new node { int j = 0; NewNode->_keys[j] = cur->_keys[i]; NewNode->_size++; //cur->_size--; //Be careful not to change the size of cur, otherwise it will Affect the next cycle cur->_keys[i] = K(); //The key corresponding to cur after the new node is assigned should be set to the initial value j++;} int j = 0; for (size_t i = mid+1; i _size+1; i++) {NewNode->_sub[j] = cur->_sub[i]; if (NewNode->_sub[j]) NewNode->_sub[j]->_parent = NewNode ; j++; cur->_sub[i] = NULL;} if (cur == _root) //Create a new root node {Node* tmp = new Node(); tmp->_keys[0] = cur-> _keys[mid]; cur->_keys[mid] = K(); cur->_size=mid; tmp->_size++; tmp->_sub[0] = cur; cur->_parent = tmp; tmp->_sub [1] = NewNode; NewNode->_parent = tmp; _root = tmp; return true;} newkey = cur->_keys[mid]; cur-> _keys[mid] = K(); cur->_size = mid; sub = NewNode;} cur = cur->_parent;} }protected: void _Destory(Node* root) {if (NULL == root) return; size_t i = 0; for (; i _size; i++) {_Destory(root->_sub[i]); delete root->_sub[i];} _Destory(root->_sub[i]); delete root->_sub[i];} void _InOrder(Node* root) {if (NULL == root) return; size_t i = 0; for (; i _size; i++) {_InOrder(root->_sub [i]); cout << root->_keys[i]<<" ";} _InOrder(root->_sub[i]);} pair _Find(const K& key) {Node* cur = _root; Node* parent = NULL; while (cur) {size_t i = 0; while (i< cur->_size) //Find the key of the current node {if (cur->_keys[i] _keys[i]>key) //cur = cur->_sub[i]; break; else // A keyword equal to the passed key is found return pair( cur,i);} parent = cur; cur = cur->_sub[i];} return pair(parent, -1); //Not found} void _Insertkey(Node* cur,const K& key,Node*sub) {int i = cur- >_size-1; while (i >=0) {if (cur->_keys[i]> key) {//Move the position of the key cur->_keys[i + 1] = cur->_keys[i] ; //Move the position of the subtree cur->_sub[i + 2] = cur->_sub[i + 1]; i--;} else break;} //i records the previous position to be inserted cur- >_keys[i + 1] = key; cur->_sub[i + 2] = sub; if (sub) sub->_parent = cur; cur->_size++; }private: Node* _root;};void TestBTree( ){ BTree bt; int a[] = {53, 75, 139, 49, 145, 36, 101 }; for (int i = 0; i 

Leave a Comment

Your email address will not be published.