ZooKeeper is an open-source distributed coordination service for distributed applications: an open-source counterpart of Google's Chubby and an important component of Hadoop and HBase. It provides consistent services for distributed applications, including configuration maintenance, naming, distributed synchronization, and cluster management.
Because the Kafka cluster stores its state information in ZooKeeper, and Kafka's dynamic scaling is achieved through ZooKeeper, a ZooKeeper cluster must be built first to provide distributed state management. Prepare the environment and build the cluster:
ZooKeeper runs on the Java platform, so Java must be installed first. The ZooKeeper package used here is zookeeper-3.4.14, and the Kafka package is kafka_2.11-2.2.0.
AMQP (Advanced Message Queuing Protocol) is a standard, open application-layer protocol for message middleware. AMQP defines the format of the byte stream sent over the network, so compatibility is very good: any program that implements AMQP can interact with any other AMQP-compatible program, easily crossing languages and platforms.
server1: 192.168.42.128
server2: 192.168.42.129
server3: 192.168.42.130
Before installing, check whether the system already has a built-in OpenJDK.
Commands:
rpm -qa | grep java
rpm -qa | grep jdk
If these commands print nothing, Java is not installed.
List the available 1.8 packages:
yum list java-1.8*
Install all 1.8.0 packages:
yum install java-1.8.0-openjdk* -y
Verify the installation:
java -version
cat /etc/hosts
192.168.42.128 kafka01
192.168.42.129 kafka02
192.168.42.130 kafka03
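The three mappings above can be assembled in one step. A minimal sketch, writing to a scratch file first so it can be reviewed before being appended to the real /etc/hosts:

```shell
# Sketch: build the host-name block once, review it, then append it to /etc/hosts
# on every node (e.g. cat hosts.cluster >> /etc/hosts).
HOSTS_FILE=hosts.cluster
cat > "$HOSTS_FILE" <<'EOF'
192.168.42.128 kafka01
192.168.42.129 kafka02
192.168.42.130 kafka03
EOF
grep -c kafka "$HOSTS_FILE"   # prints 3
```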
1. Turn off SELinux and the firewall:
setenforce 0
systemctl stop firewalld && systemctl disable firewalld
2. Create a storage directory for zookeeper and Kafka:
cd /usr/local/
wget http://mirrors.cnnic.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.2.0/kafka_2.11-2.2.0.tgz
tar xzvf zookeeper-3.4.14.tar.gz
tar xzvf kafka_2.11-2.2.0.tgz
mv zookeeper-3.4.14 zookeeper #rename so the /usr/local/zookeeper paths used below resolve
mv kafka_2.11-2.2.0 kafka
mkdir -pv zookeeper/{zkdata,zkdatalog} #zkdata stores snapshot logs, zkdatalog stores transaction logs
mkdir -pv kafka/kfdatalogs #kfdatalogs stores message logs
4. Generate and change the configuration file of zookeeper (you need to set it on all three servers):
# zoo_sample.cfg can be understood as the configuration template that ships with zookeeper; copy it to a file ending in .cfg:
cp -av /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
tickTime=2000 #zookeeper server heartbeat interval, in milliseconds.
initLimit=10 #maximum number of ticks a follower may take to connect and sync to the leader
syncLimit=5 #maximum number of ticks allowed between a request and an acknowledgement
dataDir=/usr/local/zookeeper/zkdata #absolute path where zookeeper stores snapshot logs
dataLogDir=/usr/local/zookeeper/zkdatalog #absolute path where zookeeper stores transaction logs
clientPort=2181 #port for client connections to zookeeper
server.1=192.168.42.128:2888:3888 #server number, server IP address, communication port, election port
server.2=192.168.42.129:2888:3888 #server number, server IP address, communication port, election port
server.3=192.168.42.130:2888:3888 #server number, server IP address, communication port, election port
#The ports above are zookeeper's defaults and can be changed as required.
5. Create a myid file:
On server1
echo "1" > /usr/local/zookeeper/zkdata/myid
# write each server's number into the myid file under zkdata on that server (echo "2" on server2, echo "3" on server3).
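Since the election breaks if a myid does not match any server.N line in zoo.cfg, a quick consistency check helps. A sketch using a hypothetical helper, check_myid, demonstrated against a throwaway directory layout mirroring the paths configured above:

```shell
# Sketch: verify this node's myid has a matching server.N entry in zoo.cfg.
# check_myid ZK_HOME -- ZK_HOME is assumed to hold zkdata/myid and conf/zoo.cfg.
check_myid() {
  local id
  id=$(cat "$1/zkdata/myid")
  grep -q "^server\.$id=" "$1/conf/zoo.cfg" && echo "myid $id ok"
}

# demo against a throwaway layout (use /usr/local/zookeeper on a real node)
mkdir -p demo/zkdata demo/conf
echo "2" > demo/zkdata/myid
printf 'server.1=192.168.42.128:2888:3888\nserver.2=192.168.42.129:2888:3888\nserver.3=192.168.42.130:2888:3888\n' > demo/conf/zoo.cfg
check_myid demo   # prints "myid 2 ok"
```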
6. Start zookeeper cluster
cd /usr/local/zookeeper/bin
./zkServer.sh start
./zkServer.sh status
#Mode: leader marks the master node; Mode: follower marks a slave node. A zk cluster generally has one leader and multiple followers. The leader responds to client read and write requests, the followers synchronize data from it, and when the leader goes down the followers elect a new leader from among themselves.
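When scripting against the cluster, the role can be pulled out of that status output. A sketch with a hypothetical helper, zk_role; the sample line below mirrors what the 3.4 branch prints, but treat the exact wording as an assumption:

```shell
# Sketch: extract this node's role from `zkServer.sh status` output.
zk_role() { grep '^Mode:' | awk '{print $2}'; }

# on a real node: ./zkServer.sh status 2>/dev/null | zk_role
echo "Mode: leader" | zk_role   # prints "leader"
```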
zookeeper election process/working principle
https://blog.csdn.net/wyqwilliam/article/details/83537139
In a zookeeper cluster, each node has one of the following 3 roles and 4 states:
Roles: leader, follower, observer
Status: leading, following, observing, looking
A Server has 4 states during its work:
LOOKING: The current Server does not know who the leader is and is searching.
LEADING: The current Server is the elected leader.
FOLLOWING: The leader has been elected, and the current server is synchronized with it.
OBSERVING: in most respects observers behave exactly like followers, but they do not participate in elections and voting; they only accept (observe) the results.
At this point the zookeeper cluster is built; the next step is to build the Kafka cluster on top of zookeeper:
Basic concepts of Kafka:
Topic: the category into which Kafka classifies a feed of messages.
Partition: the physical grouping of a topic. A topic can be divided into multiple partitions, and each partition is an ordered queue. Each message in a partition is assigned an ordered id (the offset).
Message: the basic unit of communication. Each producer can publish messages to a topic.
Producers: the data producers of messages; processes that publish messages to a Kafka topic are called producers.
Consumers: the data consumers of messages; processes that subscribe to topics and process the published messages are called consumers.
Broker: cache proxy; the one or more servers in a Kafka cluster are collectively called brokers.
cd /usr/local/kafka/config/
1. Modify the server.properties file:
broker.id=1 #unique identifier for this broker, analogous to zookeeper's myid (use 2 and 3 on the other nodes)
port=9092 #port the Kafka broker listens on; this key is not in the configuration file, but the default is 9092 and it can be changed as required, so we add it here
log.dirs=/usr/local/kafka/kfdatalogs #absolute path for storing Kafka message logs
advertised.listeners=PLAINTEXT://kafka01:9092
log.retention.hours=168 #default maximum retention time for messages: 168 hours, i.e. 7 days
message.max.bytes=5242880 #maximum size of a stored message: 5 MB
default.replication.factor=2 #number of copies Kafka keeps of each message; if one copy fails, the other can continue to serve
replica.fetch.max.bytes=5242880 #maximum number of bytes fetched per replica request
zookeeper.connect=192.168.42.128:2181,192.168.42.129:2181,192.168.42.130:2181 #IP address and zookeeper port of each cluster node; use whatever client port was set in the zookeeper cluster.
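Since broker.id (and the advertised host) is the main value that differs per node, a small sed edit saves hand-editing each copy. A sketch with a hypothetical helper, set_broker_id, demonstrated on a scratch file rather than the real server.properties:

```shell
# Sketch: stamp this node's broker.id into a server.properties file.
# set_broker_id FILE ID
set_broker_id() {
  sed -i "s/^broker\.id=.*/broker.id=$2/" "$1"
}

# demo on a scratch copy (use /usr/local/kafka/config/server.properties for real)
echo "broker.id=1" > server.demo
set_broker_id server.demo 3
grep '^broker.id=' server.demo   # prints "broker.id=3"
```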
Unmodified configuration file information:
num.network.threads=3 #number of threads the broker uses for network processing
num.io.threads=8 #number of threads the broker uses for I/O processing
num.partitions=1 #default number of partitions; a topic gets 1 partition by default
log.segment.bytes=1073741824 #because Kafka appends messages to files, a new segment file is created when this size (1 GB) is exceeded
log.retention.check.interval.ms=300000 #every 300000 ms (5 minutes), check the expiration setting above (log.retention.hours=168) and delete any expired messages found in the directory
log.cleaner.enable=false #whether to enable the log cleaner (log compaction); usually not needed unless compacted topics are used
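The raw size and time values above are easier to sanity-check in human units. A quick arithmetic sketch:

```shell
# Sketch: translate the raw broker config values into familiar units.
echo "$((5242880 / 1024 / 1024)) MB"        # max message size -> 5 MB
echo "$((168 / 24)) days"                   # log retention -> 7 days
echo "$((1073741824 / 1024 / 1024)) MB"     # segment size -> 1024 MB (1 GB)
echo "$((300000 / 1000 / 60)) minutes"      # retention check interval -> 5 minutes
```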
2. Start Kafka cluster:
cd /usr/local/kafka/bin
bash kafka-server-start.sh -daemon ../config/server.properties
3. Start testing:
3.1 Create topic
./kafka-topics.sh --create --zookeeper 192.168.42.128:2181 --replication-factor 2 --partitions 1 --topic wg01
#--replication-factor 2: keep two copies
#--partitions 1: create one partition
#--topic wg01: the topic name is wg01
3.2 Create a producer:
./kafka-console-producer.sh --broker-list 192.168.42.128:9092 --topic wg01
3.3 Create a consumer:
./kafka-console-consumer.sh --bootstrap-server 192.168.42.130:9092 --topic wg01 --from-beginning
3.4 View topic:
./kafka-topics.sh --list --zookeeper 192.168.42.130:2181

Copyright notice: this is an original article by the CSDN blogger "Abandoned?", licensed under CC 4.0 BY-SA. Include the original source link and this statement when reposting. Original link: https://blog.csdn.net/TH_lsq/article/details/102626967