ZooKeeper: A Simple Introduction

1. What is ZooKeeper

  ZooKeeper is an open-source coordination service for distributed applications. It is mainly used to solve consistency problems among the applications in a distributed cluster. It stores data in a tree of directory nodes, similar to a file system, but ZooKeeper is not meant to be a general-purpose data store; its main role is to maintain the stored data and monitor its state changes. Distributed applications can build synchronization services, configuration maintenance, and naming services on top of it.

  The data model of ZooKeeper is a tree of nodes. After the service starts, all data is loaded into memory to increase server throughput and reduce latency.

2. How to use:

  As mentioned earlier, the role of ZooKeeper is to maintain and monitor the state of the data in its data model (the directory node tree), so working with it resembles working with directories: adding and deleting directory nodes, adding and deleting child nodes, setting permissions, setting data, and watching a node for changes.

  • zkCli command line use:
    • Create: create a node and set its data. Two flags control the node type, persistent vs. ephemeral and sequential vs. non-sequential, giving four combinations in total: PERSISTENT, PERSISTENT_SEQUENTIAL, EPHEMERAL, and EPHEMERAL_SEQUENTIAL.
      • create [-s] [-e] PATH DATA acl # Path and data; the data can be empty. -s creates a sequential node (a serial number is automatically appended to the name). -e creates an ephemeral (temporary) node; without -e the node is persistent.

        $ create /zktest/test1 test1

    • Delete:
      • delete PATH [version] # Delete the node at the given path. The node must be empty; delete cannot remove a node that has child nodes.
        rmr PATH # Recursive deletion (newer ZooKeeper releases deprecate rmr in favor of deleteall)

    • Quota:
      • setquota -n|-b val path # -n limits the number of child nodes, -b limits the data length of the node; val is the quota and path is the node
        delquota [-n|-b] PATH # Delete quota settings
        listquota PATH # Display node quota information
    • ACL permissions:
      • Sets the access permissions of a directory node. Note that directory node permissions in ZooKeeper are not inherited: the permissions of a parent directory node do not carry over to its child nodes. A directory node ACL consists of two parts: perms and id.
        perms can be ALL, READ, WRITE, CREATE, DELETE, or ADMIN.
        id identifies who may access the directory node. Two ids are predefined:
        ANYONE_ID_UNSAFE = new Id("world", "anyone") and AUTH_IDS = new Id("auth", ""), which mean respectively that anyone can access the node and that the creator has access rights.
    • Set data:
      • set PATH DATA [version] # Path and data; the version number is optional
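The semantics of the commands above (sequence suffixes, version checks, and refusing to delete a non-empty node) can be illustrated without a running server. The following is only a toy in-memory model: `ToyZk` and all of its methods are hypothetical names made up for this sketch, and the per-parent counter is a simplification of how a real server numbers sequential nodes.

```python
class ToyZk:
    """Toy in-memory znode tree for illustration only (not ZooKeeper itself)."""

    def __init__(self):
        # path -> node record; "seq" is the counter used for this node's sequential children
        self.nodes = {"/": {"data": b"", "version": 0, "ephemeral": False, "seq": 0}}

    @staticmethod
    def _parent(path):
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def _children(self, path):
        return [p for p in self.nodes if p != "/" and self._parent(p) == path]

    def create(self, path, data=b"", ephemeral=False, sequence=False):
        parent = self._parent(path)
        if parent not in self.nodes:
            raise KeyError("parent does not exist: " + parent)
        if self.nodes[parent]["ephemeral"]:
            raise ValueError("ephemeral nodes cannot have children")
        if sequence:
            # Like "create -s": append a zero-padded serial number to the name.
            path = "%s%010d" % (path, self.nodes[parent]["seq"])
            self.nodes[parent]["seq"] += 1
        if path in self.nodes:
            raise KeyError("node already exists: " + path)
        self.nodes[path] = {"data": data, "version": 0, "ephemeral": ephemeral, "seq": 0}
        return path

    def set(self, path, data, version=-1):
        node = self.nodes[path]
        # Like "set PATH DATA [version]": -1 means "do not check the version".
        if version != -1 and version != node["version"]:
            raise ValueError("bad version")
        node["data"] = data
        node["version"] += 1
        return node["version"]

    def delete(self, path, version=-1):
        # Like "delete": refuses to remove a node that still has children.
        if self._children(path):
            raise ValueError("node not empty: " + path)
        if version != -1 and version != self.nodes[path]["version"]:
            raise ValueError("bad version")
        del self.nodes[path]

    def close_session(self):
        # Ephemeral nodes disappear when the session that created them ends.
        for path in [p for p, n in self.nodes.items() if n["ephemeral"]]:
            del self.nodes[path]


zk = ToyZk()
zk.create("/zktest", b"root")
print(zk.create("/zktest/job-", b"a", sequence=True))  # /zktest/job-0000000000
print(zk.create("/zktest/job-", b"b", sequence=True))  # /zktest/job-0000000001
```

The model keeps a single implicit session, which is enough to show why an ephemeral node vanishes on `close_session()` while persistent nodes stay.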

  • Python usage (this example uses Python 3 with kazoo; zkpython is an alternative client):

      # -*- coding: utf-8 -*-
      import time

      from kazoo.client import KazooClient
      from kazoo.recipe.watchers import ChildrenWatch, DataWatch


      class ZkWatcher(object):
          def __init__(self, node, hosts="127.0.0.1:2181", timeout=15.0):
              self._hosts = hosts
              self._timeout = timeout
              self._node = node
              self._zk = KazooClient(hosts=self._hosts, timeout=self._timeout)
              self._node_children_list = []

          def connect(self, timeout=None):
              # Re-create the client if close() dropped it.
              if self._zk is None:
                  self._zk = KazooClient(hosts=self._hosts, timeout=timeout or self._timeout)
              self._zk.start()

          def close(self):
              if self._zk:
                  self._zk.stop()
                  self._zk.close()
                  self._zk = None

          def watcher_start(self):
              self.connect()
              self._node_children_list = self._zk.get_children(self._node)
              try:
                  # Each watcher calls its function once on registration and again on every change.
                  ChildrenWatch(self._zk, self._node, func=self._node_children_change)
                  DataWatch(self._zk, path=self._node, func=self._node_data_change)

                  while True:
                      time.sleep(60)
              except Exception as e:
                  print(e.args)

          def _node_children_change(self, children):
              # Compare against the previous child list instead of only comparing lengths.
              added = set(children) - set(self._node_children_list)
              removed = set(self._node_children_list) - set(children)
              if added:
                  print("add:", added)
              if removed:
                  print("remove:", removed)
              self._node_children_list = children

          def _node_data_change(self, data, stat):
              if stat is None:
                  # The node does not exist (or was deleted); there is nothing to print.
                  return
              print("Update length: {}, content: {}".format(stat.data_length, (data or b"").decode("utf-8")))
              print("Version Number:", stat.version)
              print("cversion:", stat.cversion)
              print("Number of child nodes:", stat.numChildren)

          def register(self):
              """
              register a node for this worker
              """
              if not self._zk.exists(self._node):
                  # bytes(str) without an encoding raises TypeError on Python 3, so encode explicitly.
                  # With sequence=True the created path is self._node plus a serial-number suffix.
                  self._zk.create(self._node, str(time.time()).encode("utf-8"), ephemeral=False, sequence=True)


      if __name__ == "__main__":
          zw = ZkWatcher("/zktest", hosts="192.168.0.189:2181")
          zw.watcher_start()

  • Introduction:
    • Time and version number in ZK:

      • ZXID: Every state change of a ZK node stamps the node with a zxid. The zxid is a unique, incrementing transaction ID that is globally ordered, and every update generates a new one: if zxid1 is less than zxid2, the change with zxid2 happened after the change with zxid1. Creating or updating a znode produces a new zxid value. Each node records three of them: cZxid (the zxid of the node's creation), mZxid (the zxid of the node's last modification, which has nothing to do with child nodes), and pZxid (the zxid of the last creation or deletion of a child of the node, regardless of grandchild nodes).
      • version: Each modification of a node increases one of the node's version numbers. There are three version numbers: dataVersion (data version number), cversion (child node version number), aclVersion (version number of the ACL owned by the node).
      • Node status fields:

        Field         Meaning
        cZxid         The transaction ID when the node was created
        ctime         The time when the node was created
        mZxid         The transaction ID of the last modification of the node
        mtime         The time when the node was last modified
        pZxid         The transaction ID of the last modification of the node's child list. Adding or deleting a child node changes the child list, but modifying the data of a child node does not affect this ID.
        cversion      The child node version number; it increases by 1 each time the child nodes change
        dataVersion   The data version number; it increases by 1 each time the data is modified
        aclVersion    The permission version number; it increases by 1 each time the permissions are modified
        dataLength    The data length of the node
        numChildren   The number of child nodes that this node has
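To make the relationship between these fields concrete, here is a small sketch that tracks them by hand. It is a toy model under stated assumptions: `Stat` and `StatTracker` are hypothetical names for this example, the zxid is a plain counter, and only the fields discussed above are modeled.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Stat:
    czxid: int              # zxid of the change that created the node
    mzxid: int              # zxid of the last data modification
    pzxid: int              # zxid of the last change to the child list
    data_version: int = 0   # bumped on every data modification
    cversion: int = 0       # bumped on every child-list change
    num_children: int = 0


class StatTracker:
    """Toy bookkeeping of stat fields; every operation consumes one new zxid."""

    def __init__(self):
        self._next_zxid = count(1)
        self.stats = {}

    def create(self, path, parent=None):
        zxid = next(self._next_zxid)
        self.stats[path] = Stat(czxid=zxid, mzxid=zxid, pzxid=zxid)
        if parent is not None:
            self._child_list_changed(parent, zxid, +1)
        return zxid

    def set_data(self, path):
        zxid = next(self._next_zxid)
        stat = self.stats[path]
        stat.mzxid = zxid
        stat.data_version += 1
        return zxid

    def delete(self, path, parent):
        zxid = next(self._next_zxid)
        del self.stats[path]
        self._child_list_changed(parent, zxid, -1)
        return zxid

    def _child_list_changed(self, parent, zxid, delta):
        stat = self.stats[parent]
        stat.pzxid = zxid
        stat.cversion += 1
        stat.num_children += delta


tracker = StatTracker()
tracker.create("/zktest")                       # zxid 1
tracker.create("/zktest/a", parent="/zktest")   # zxid 2: changes the parent's child list
tracker.set_data("/zktest/a")                   # zxid 3: touches only the child itself
print(tracker.stats["/zktest"])
print(tracker.stats["/zktest/a"])
```

After these three operations the parent still has pzxid 2 and cversion 1: changing the child's data moved the child's mzxid and data version but left the parent untouched.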

3. Why it is used this way: application scenarios

  From a design-pattern point of view, ZooKeeper is a distributed service management framework built on the observer pattern. It stores and manages the data that everyone cares about and accepts registrations from observers. Once the state of that data changes, ZooKeeper notifies the observers registered with it so that they can react accordingly, which gives the cluster a management mode similar to Master/Slave.
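The observer pattern described above can be sketched in a few lines of plain Python. This is a minimal illustration, not ZooKeeper's implementation: `ConfigStore`, `watch`, and `set` are hypothetical names, everything runs in one process, and observers stay registered permanently, which is a simplification.

```python
from collections import defaultdict


class ConfigStore:
    """Holds shared data and notifies registered observers when it changes."""

    def __init__(self):
        self._data = {}
        self._observers = defaultdict(list)

    def watch(self, key, callback):
        # An observer registers interest in one key.
        self._observers[key].append(callback)

    def set(self, key, value):
        old = self._data.get(key)
        self._data[key] = value
        if old != value:
            # The store, not the observer, is responsible for the notification.
            for callback in list(self._observers[key]):
                callback(key, old, value)


events = []
store = ConfigStore()
store.watch("/config/feature_x", lambda key, old, new: events.append((key, old, new)))
store.set("/config/feature_x", "on")
store.set("/config/feature_x", "on")    # unchanged value: no notification
store.set("/config/feature_x", "off")
print(events)
```

The observers never poll; they only react when the store calls them, which is the property that makes this pattern a good fit for propagating configuration and membership changes.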

  ZooKeeper is mainly used for shared configuration, such as machine lists and on/off switches for certain parameters. Such configuration has the following characteristics:

  • The volume of information transmitted is small
  • The services provided are adjusted dynamically at run time
  • A group of applications uses the same configuration

  It is also used for decentralized management of distributed clusters: when multiple servers form a service cluster, a "manager" must know the service status of every machine in the cluster. Once a machine can no longer provide service, the other machines in the cluster must learn of it so that the service distribution strategy can be adjusted. Similarly, when the service capacity of the cluster is increased by adding one or more servers, the "manager" must also be notified so that it can adjust the allocation. This involves two functions:

  • Maintain the status of the servers in the cluster
  • Automatically elect the master of the cluster
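The text does not say how the master is chosen among the application's servers. One common recipe, assumed here purely for illustration, is that each server creates an ephemeral sequential node under a shared parent and the server holding the lowest serial number acts as master. The sketch below shows only the selection step on a list of child names; `sequence_of` and `elect_master` are hypothetical helper names.

```python
def sequence_of(name):
    """Return the 10-digit serial number that a sequential node carries at the end of its name."""
    return int(name[-10:])


def elect_master(children):
    """Pick the child with the lowest serial number, or None if there are no candidates."""
    if not children:
        return None
    return min(children, key=sequence_of)


children = ["worker-0000000003", "worker-0000000001", "worker-0000000002"]
print(elect_master(children))   # worker-0000000001

# When the master's session ends, its ephemeral node disappears and the next one takes over.
children.remove("worker-0000000001")
print(elect_master(children))   # worker-0000000002
```

Because the candidate nodes are ephemeral, the same child list also serves as the record of which servers are currently alive.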

  Leader election is based on ZooKeeper's own ZAB algorithm, an atomic message broadcast protocol. The ZAB protocol has two basic modes, crash recovery and message broadcast. The general process:

  1. => Recovery mode: when the cluster is starting up, or when the leader is disconnected, crashes, or restarts, ZAB enters recovery mode and elects a new leader. Once a new leader has been elected and more than half of the servers in the cluster have synchronized their state (data) with it, ZAB leaves recovery mode and enters message broadcast mode.
  2. => Broadcast mode: ZooKeeper uses a single main process, the Leader server, to receive client transaction requests; it generates a transaction proposal for each request and broadcasts it. A server that is not the Leader forwards any client transaction request it receives to the Leader server.
  3. Broadcast => Recovery: when a new server joins the running cluster, the new server automatically enters recovery mode, and after it has synchronized with the cluster leader it enters message broadcast mode.

  

4. Principle:

  1. Atomic Message Broadcasting Protocol

    • Message broadcast:
      Before broadcasting a transaction, the Leader server assigns it a globally unique, monotonically increasing ID, the transaction ID (ZXID), and transactions must be processed in ZXID order. The Leader keeps a separate queue for each Follower and puts the transactions to be broadcast into these queues. When a Follower receives a transaction, it first writes it to local disk as a transaction log entry and, once the write succeeds, returns an ACK to the Leader. When the Leader has received ACKs from more than half of the servers, it broadcasts a Commit message telling all Followers to commit the transaction, and it commits the transaction itself as well.

    • Crash recovery:
      Its purpose is to ensure that a new leader is elected as quickly as possible and announced to the other followers, while keeping the data state consistent across the whole cluster.
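The message broadcast steps described above (assign a ZXID, collect ACKs, commit on a majority) can be sketched as a small in-process simulation. This is a toy under stated assumptions, not the ZAB implementation: `Leader` and `Follower` are hypothetical classes, there is no network or disk, a follower that is down simply never ACKs, and the leader's own vote is assumed to count toward the majority.

```python
class Follower:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.log = []        # stands in for the on-disk transaction log
        self.committed = []  # zxids this follower has committed

    def propose(self, zxid, txn):
        """Log the proposal and ACK it; a follower that is down never answers."""
        if not self.alive:
            return False
        self.log.append((zxid, txn))
        return True

    def commit(self, zxid):
        if self.alive:
            self.committed.append(zxid)


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.last_zxid = 0
        self.committed = []

    def broadcast(self, txn):
        # 1. Assign the next monotonically increasing transaction ID.
        self.last_zxid += 1
        zxid = self.last_zxid
        # 2. Send the proposal to every follower and count the ACKs
        #    (assumption: the leader's own vote is included).
        acks = 1 + sum(1 for f in self.followers if f.propose(zxid, txn))
        cluster_size = len(self.followers) + 1
        # 3. Commit only when more than half of the servers acknowledged.
        if acks * 2 > cluster_size:
            self.committed.append(zxid)
            for f in self.followers:
                f.commit(zxid)
            return zxid
        return None


followers = [Follower("f1"), Follower("f2"), Follower("f3", alive=False), Follower("f4", alive=False)]
leader = Leader(followers)
print(leader.broadcast("set /zktest v1"))   # 3 of 5 servers ACK, so the transaction commits

followers[1].alive = False
print(leader.broadcast("set /zktest v2"))   # only 2 of 5 servers ACK, so nothing commits
```

In the second call the proposal still reaches the one live follower's log, but without a majority it is never committed, which is the inconsistency that crash recovery has to resolve.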

  2. Node data management: data is kept in a tree-structured store that resembles a file system. The client maintains its session with ZK over a long-lived TCP connection. Over this connection the client sends heartbeats to keep the session with the server alive, sends requests and receives the server's responses, and receives WATCH events.

