Stability detection of consistency hash algorithm and its problems and solvement strategies

Join a machine node:

     assumes p2p network To add a new machine node N(new), then N(new) must first establish a connection with the N(x) node in the p2p network, and use N(x) to find the corresponding N(new) according to the routing algorithm of “route query” The hash value H(N(new))=new, and then find its successor node, assuming that its successor node is N(s), and the predecessor node of N(s) is N(j), then N (new) If the nodes are between them, the p2p network architecture relationship between them must be re-established.

  In a non-concurrent environment, you can do two steps like this:

     1. Change the records of the predecessor and successor nodes of N(j), N(new), and N(s) to form a new network architecture.

    2. After the N(new) node is inserted into the N(j) and N(s) nodes, then N(new) is N(s) The predecessor node of ), N(s) is the successor node of N(new), and the key value of N(s) that originally fell in the hash value space of the predecessor node N(new) is migrated to the N(new) node .

   In a concurrent environment, there may be multiple nodes between N(j) and N(s) at the same time. Join and do this:

    1. The successor node of N(new) is referred to as N(s), and the predecessor of N(new) Node refers to null.

    2. Stability detection (This step is not just for adding a new node, but Each node will be automatically completed periodically, through It can complete predecessor and subsequent updates and data migration. In the p2p network architecture, each node will execute it regularly)

Stability detection algorithm flow:

   1. Assume that N(s) is N(c) N(c) asks N(s) for the predecessor node N(p),

   2. If N(p) Between N(c) and N(s), N(c) records N(p) as its successor node.

  3. Let N(x) be the successor of N(c), which may be N(s) or N(p) , It depends on the previous step, if the predecessor node of N(x) is null or N(c) is located between N(x) and its predecessor node, then N(c) will send a message to tell N(x) , It is the predecessor node of N(x), and N(x) sets its predecessor node to N(c).

  4. N(x) transfers part of the data (hash value less than c) on it that belongs to N(c) to N( c) On,

Don’t say anything about the picture above:

        Share a picture—-Before adding N(c) to the p2p network– ——————“After —>Share pictures

      The red line represents the successor node , The blue line represents the predecessor node.


     Suppose that after a certain period of time N(p) starts to perform stability testing, at this time,
< /p>

                                        images/images/strong_images/images/strong_images/images. /p>

After adding a new node, the routing table of the original node may be incorrect. The routing item that should point to the new node points to the old node. Therefore, each node needs to periodically check the routing table. For each item in the routing table k=2 to the power of i (0<=i<=m-1 ) In addition, the local node executes the routing algorithm. After the query is obtained, if the retained content of the routing item is different, it will be updated to the new content to complete the update of the routing table.

  There are two ways for nodes to leave the p2p network: normal leave and abnormal leave.

  Normal leaving will generally notify the predecessor and successor nodes to migrate data, and the routing table failure can be updated by “adding a new node”. The abnormal departure is caused by a machine failure. At this time, the data may be lost. To avoid this, the data can be backed up on multiple machines.

Problems with consistent hashing:

em>

       1. The mapping of machine nodes to the ring structure position is random and easy Leading to unbalanced machine load,

       2. Large-scale data center, There are both high-performance and high-configuration machines, as well as low-performance and low-configuration machines. Consistent hashing does not consider this situation, and treats all machines as equals, which is likely to cause low-configuration and high-load situations.

In response to its problems, a “virtual node” is introduced:

< p>    A physical node is virtualized into several virtual nodes, which are distributed to different positions in the ring structure of consistent hashing. (It can lead to better load balancing and reduce the problem of machine heterogeneity)

Leave a Comment

Your email address will not be published.