ZooKeeper is an open-source distributed coordination service for distributed applications: an open-source counterpart of Google's Chubby and an important component of Hadoop and HBase. It provides consistent services for distributed applications, including configuration maintenance, naming, distributed synchronization, and cluster management.
Because the Kafka cluster stores its state information in ZooKeeper, and Kafka's dynamic scaling is achieved through ZooKeeper, a ZooKeeper cluster must be built first to provide distributed state management. Prepare the environment and build the cluster:
ZooKeeper runs on the Java platform, so Java must be installed first. The ZooKeeper package used here is zookeeper-3.4.14, and the Kafka package is kafka_2.11-2.2.0.
AMQP (Advanced Message Queuing Protocol) is a standard, open application-layer protocol for message middleware. AMQP defines the format of the byte stream sent over the network, so compatibility is very good: any program that implements AMQP can interact with any other AMQP-compatible program, easily crossing languages and platforms.
server1: 192.168.42.128
server2: 192.168.42.129
server3: 192.168.42.130
Before installing, check whether the system already has a built-in OpenJDK.
Commands:
rpm -qa | grep java
rpm -qa | grep jdk
If these commands print nothing, Java is not installed.
List the available 1.8 packages:
yum list java-1.8*
Install all 1.8.0 packages:
yum install java-1.8.0-openjdk* -y
Verify the installation:
java -version
cat /etc/hosts
192.168.42.128 kafka01
192.168.42.129 kafka02
192.168.42.130 kafka03
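The three mappings above can be assembled in one step. A minimal sketch, writing to a scratch file first so it can be reviewed before being appended to the real /etc/hosts:

```shell
# Sketch: build the host-name block once, review it, then append it to /etc/hosts
# on every node (e.g. cat hosts.cluster >> /etc/hosts).
HOSTS_FILE=hosts.cluster
cat > "$HOSTS_FILE" <<'EOF'
192.168.42.128 kafka01
192.168.42.129 kafka02
192.168.42.130 kafka03
EOF
grep -c kafka "$HOSTS_FILE"   # prints 3
```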
1. Turn off SELinux and the firewall:
setenforce 0
systemctl stop firewalld && systemctl disable firewalld
2. Create a storage directory for zookeeper and Kafka:
cd /usr/local/
wget http://mirrors.cnnic.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.2.0/kafka_2.11-2.2.0.tgz
tar xzvf zookeeper-3.4.14.tar.gz
tar xzvf kafka_2.11-2.2.0.tgz
mv zookeeper-3.4.14 zookeeper #rename so the /usr/local/zookeeper paths used below resolve
mv kafka_2.11-2.2.0 kafka
mkdir -pv zookeeper/{zkdata,zkdatalog} #zkdata stores snapshot logs, zkdatalog stores transaction logs
mkdir -pv kafka/kfdatalogs #kfdatalogs stores message logs
4. Generate and change the configuration file of zookeeper (you need to set it on all three servers):
# zoo_sample.cfg can be understood as the configuration template that ships with zookeeper; copy it to a file ending in .cfg:
cp -av /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
tickTime=2000 #zookeeper server heartbeat interval, in milliseconds.
initLimit=10 #maximum number of ticks a follower may take to connect and sync to the leader
syncLimit=5 #maximum number of ticks allowed between a request and an acknowledgement
dataDir=/usr/local/zookeeper/zkdata #absolute path where zookeeper stores snapshot logs
dataLogDir=/usr/local/zookeeper/zkdatalog #absolute path where zookeeper stores transaction logs
clientPort=2181 #port for client connections to zookeeper
server.1=192.168.42.128:2888:3888 #server number, server IP address, communication port, election port
server.2=192.168.42.129:2888:3888 #server number, server IP address, communication port, election port
server.3=192.168.42.130:2888:3888 #server number, server IP address, communication port, election port
#The ports above are zookeeper's defaults and can be changed as required.
5. Create a myid file:
On server1
echo "1" > /usr/local/zookeeper/zkdata/myid
# write each server's number into the myid file under zkdata on that server (echo "2" on server2, echo "3" on server3).
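Since the election breaks if a myid does not match any server.N line in zoo.cfg, a quick consistency check helps. A sketch using a hypothetical helper, check_myid, demonstrated against a throwaway directory layout mirroring the paths configured above:

```shell
# Sketch: verify this node's myid has a matching server.N entry in zoo.cfg.
# check_myid ZK_HOME -- ZK_HOME is assumed to hold zkdata/myid and conf/zoo.cfg.
check_myid() {
  local id
  id=$(cat "$1/zkdata/myid")
  grep -q "^server\.$id=" "$1/conf/zoo.cfg" && echo "myid $id ok"
}

# demo against a throwaway layout (use /usr/local/zookeeper on a real node)
mkdir -p demo/zkdata demo/conf
echo "2" > demo/zkdata/myid
printf 'server.1=192.168.42.128:2888:3888\nserver.2=192.168.42.129:2888:3888\nserver.3=192.168.42.130:2888:3888\n' > demo/conf/zoo.cfg
check_myid demo   # prints "myid 2 ok"
```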
6. Start zookeeper cluster
cd /usr/local/zookeeper/bin
./zkServer.sh start
./zkServer.sh status
#Mode: leader marks the master node; Mode: follower marks a slave node. A zk cluster generally has one leader and multiple followers. The leader responds to client read and write requests, the followers synchronize data from it, and when the leader goes down the followers elect a new leader from among themselves.
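When scripting against the cluster, the role can be pulled out of that status output. A sketch with a hypothetical helper, zk_role; the sample line below mirrors what the 3.4 branch prints, but treat the exact wording as an assumption:

```shell
# Sketch: extract this node's role from `zkServer.sh status` output.
zk_role() { grep '^Mode:' | awk '{print $2}'; }

# on a real node: ./zkServer.sh status 2>/dev/null | zk_role
echo "Mode: leader" | zk_role   # prints "leader"
```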
zookeeper election process/working principle
https://blog.csdn.net/wyqwilliam/article/details/83537139
In a zookeeper cluster, each node has one of the following 3 roles and 4 states:
Roles: leader, follower, observer
Status: leading, following, observing, looking
A Server has 4 states during its work:
LOOKING: The current Server does not know who the leader is and is searching.
LEADING: The current Server is the elected leader.
FOLLOWING: The leader has been elected, and the current server is synchronized with it.
OBSERVING: in most respects observers behave exactly like followers, but they do not participate in elections and voting; they only accept (observe) the results.
At this point the zookeeper cluster is built; the next step is to build the Kafka cluster on top of zookeeper:
Basic concepts of Kafka:
Topic: the category into which Kafka classifies a feed of messages.
Partition: the physical grouping of a topic. A topic can be divided into multiple partitions, and each partition is an ordered queue. Each message in a partition is assigned an ordered id (the offset).
Message: the basic unit of communication. Each producer can publish messages to a topic.
Producers: the data producers of messages; processes that publish messages to a Kafka topic are called producers.
Consumers: the data consumers of messages; processes that subscribe to topics and process the published messages are called consumers.
Broker: cache proxy; the one or more servers in a Kafka cluster are collectively called brokers.
cd /usr/local/kafka/config/
1. Modify the server.properties file:
broker.id=1 #unique identifier for this broker, analogous to zookeeper's myid (use 2 and 3 on the other nodes)
port=9092 #port the Kafka broker listens on; this key is not in the configuration file, but the default is 9092 and it can be changed as required, so we add it here
log.dirs=/usr/local/kafka/kfdatalogs #absolute path for storing Kafka message logs
advertised.listeners=PLAINTEXT://kafka01:9092
log.retention.hours=168 #default maximum retention time for messages: 168 hours, i.e. 7 days
message.max.bytes=5242880 #maximum size of a stored message: 5 MB
default.replication.factor=2 #number of copies Kafka keeps of each message; if one copy fails, the other can continue to serve
replica.fetch.max.bytes=5242880 #maximum number of bytes fetched per replica request
zookeeper.connect=192.168.42.128:2181,192.168.42.129:2181,192.168.42.130:2181 #IP address and zookeeper port of each cluster node; use whatever client port was set in the zookeeper cluster.
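Since broker.id (and the advertised host) is the main value that differs per node, a small sed edit saves hand-editing each copy. A sketch with a hypothetical helper, set_broker_id, demonstrated on a scratch file rather than the real server.properties:

```shell
# Sketch: stamp this node's broker.id into a server.properties file.
# set_broker_id FILE ID
set_broker_id() {
  sed -i "s/^broker\.id=.*/broker.id=$2/" "$1"
}

# demo on a scratch copy (use /usr/local/kafka/config/server.properties for real)
echo "broker.id=1" > server.demo
set_broker_id server.demo 3
grep '^broker.id=' server.demo   # prints "broker.id=3"
```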
Unmodified configuration file information:
num.network.threads=3 #number of threads the broker uses for network processing
num.io.threads=8 #number of threads the broker uses for I/O processing
num.partitions=1 #default number of partitions; a topic gets 1 partition by default
log.segment.bytes=1073741824 #because Kafka appends messages to files, a new segment file is created when this size (1 GB) is exceeded
log.retention.check.interval.ms=300000 #every 300000 ms (5 minutes), check the expiration setting above (log.retention.hours=168) and delete any expired messages found in the directory
log.cleaner.enable=false #whether to enable the log cleaner (log compaction); usually not needed unless compacted topics are used
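The raw size and time values above are easier to sanity-check in human units. A quick arithmetic sketch:

```shell
# Sketch: translate the raw broker config values into familiar units.
echo "$((5242880 / 1024 / 1024)) MB"        # max message size -> 5 MB
echo "$((168 / 24)) days"                   # log retention -> 7 days
echo "$((1073741824 / 1024 / 1024)) MB"     # segment size -> 1024 MB (1 GB)
echo "$((300000 / 1000 / 60)) minutes"      # retention check interval -> 5 minutes
```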
2. Start Kafka cluster:
cd /usr/local/kafka/bin
bash kafka-server-start.sh -daemon ../config/server.properties
3. Start testing:
3.1 Create topic
./kafka-topics.sh --create --zookeeper 192.168.42.128:2181 --replication-factor 2 --partitions 1 --topic wg01
#--replication-factor 2: keep two copies
#--partitions 1: create one partition
#--topic wg01: the topic name is wg01
3.2 Create a producer:
./kafka-console-producer.sh --broker-list 192.168.42.128:9092 --topic wg01
3.3 Create a consumer:
./kafka-console-consumer.sh --bootstrap-server 192.168.42.130:9092 --topic wg01 --from-beginning
3.4 View topic:
./kafka-topics.sh --list --zookeeper 192.168.42.130:2181

Copyright notice: this is an original article by the CSDN blogger "Abandoned?", licensed under CC 4.0 BY-SA. Include the original source link and this statement when reposting. Original link: https://blog.csdn.net/TH_lsq/article/details/102626967