MGR Architecture ~ Overall MGR Performance Tuning

I. Introduction

This article walks through the tuning of an MGR cluster architecture from several angles: hardware, network, MGR itself, the related optimization parameters, the overall transaction flow, and a summary.

II. Hardware

1. Choose servers with identical configurations; for disk, memory, and CPU, the higher the performance, the better.

III. Network

1. Zero packet loss, and ideally a 10 Gigabit network card.

IV. MGR itself

MGR itself requires very good network conditions, mainly because of the following points:
1. Heartbeat detection inside the cluster.
2. Broadcasting and certification of the write set.
3. Internal transmission of the binlog.

V. Related optimization parameters

1. report_host: bind it directly to the real IP instead of the machine's domain name. This reduces DNS resolution problems and avoids cluster issues caused by resolution errors.

2. group_replication_compression_threshold: the transaction compression threshold; we set 131072 (128 KB).
Parameter analysis:
1. This parameter controls compression of the transaction's binlog payload inside the group, using the LZ4 compression algorithm (LZ4 handles multi-threaded environments well and offers high compression and decompression speed).
2. Transactions whose payload is larger than the threshold are compressed; the compressed binlog events are decompressed after being transmitted to the other nodes.
3. This parameter effectively reduces the pressure on network bandwidth and improves overall performance.
4. The default is about 1 MB, so compression is already enabled by default.

3. group_replication_transaction_size_limit: the transaction size limit; we set 20971520 (20 MB).
Parameter analysis:
1. When the total amount of binlog events generated by a transaction exceeds the threshold, the transaction is not allowed to execute.
2. Large transactions can have a very serious impact on the whole cluster, in both PXC and MGR environments.
3. Once a transaction exceeds the threshold, the client session gets an error and the error log records it as well.
4. The upper limit of this parameter is about 2 GB; it is recommended to choose a value for the online environment together with the development team.

4. group_replication_unreachable_majority_timeout: we set 5 (seconds).
Parameter analysis:
1. When a node is in the UNREACHABLE state, wait 5 seconds and then stop waiting so that the cluster service is not affected; if the node is still UNREACHABLE, it is put into the ERROR state. By default the wait is unlimited.
2. Do not let a problematic node stay in the cluster forever; remove it promptly so that network or other problems on one node do not drag down the performance of the whole cluster.
3. Once a node is kicked out, it no longer appears in the cluster's view.

5. Parallel replication parameters.
Parameter analysis:
1. Parallel replication in 5.7 greatly speeds up consumption of the applier queue; it is strongly recommended to enable it (see the configuration sketch after this list).

6. Flow control parameters: group_replication_flow_control_mode (whether flow control is enabled), group_replication_flow_control_certifier_threshold (certification queue threshold, default 25000), and group_replication_flow_control_applier_threshold (applier queue threshold, default 25000).
Parameter analysis:
1. When the pending transactions in the certification or applier queue exceed the threshold (default 25000), flow control (Flow Control) kicks in and throughput drops by 10% of the current value each cycle, until the cluster becomes effectively unusable (even though the node states are still ONLINE); a single slow node makes the whole cluster slow.
2. Flow control is enabled by default; the valid values are QUOTA (enabled, the default) and DISABLED (off).
3. In single-primary mode certification conflicts will not occur, so in practice it is mainly the applier-related flow-control threshold that needs adjusting; tune it according to your own environment (I have no tuning experience with it yet).
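To pull the parameters above together, here is a minimal sketch of how they could be applied at runtime, assuming a MySQL 5.7 member running the group replication plugin. The numeric values are the ones discussed in this article; the worker count is only illustrative, and the parallel-replication lines use the standard 5.7 settings, which the article recommends enabling but does not spell out.

    -- Minimal sketch of the values discussed above (assumes MySQL 5.7 MGR).
    -- report_host is not dynamic, so it stays in my.cnf: report_host = <real ip>

    -- Compress transaction payloads larger than 128 KB (LZ4)
    SET GLOBAL group_replication_compression_threshold = 131072;

    -- Refuse transactions whose binlog payload exceeds 20 MB
    SET GLOBAL group_replication_transaction_size_limit = 20971520;

    -- Stop waiting after 5 seconds when a member cannot reach the majority
    SET GLOBAL group_replication_unreachable_majority_timeout = 5;

    -- 5.7 parallel replication, to drain the applier queue faster
    -- (changing these usually requires stopping the applier / group replication first)
    SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
    SET GLOBAL slave_parallel_workers = 8;          -- illustrative; size to the workload
    SET GLOBAL slave_preserve_commit_order = ON;

    -- Flow control: QUOTA (the default) enables it, DISABLED turns it off
    SET GLOBAL group_replication_flow_control_mode = 'QUOTA';
    SET GLOBAL group_replication_flow_control_certifier_threshold = 25000;
    SET GLOBAL group_replication_flow_control_applier_threshold = 25000;

The same values can also be written into my.cnf (commonly with a loose- prefix so the server still starts before the plugin is installed) so that they survive a restart.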
VI. Overall flow (single-primary mode)

Let's look at the overall flow in single-primary mode:
1. The write node receives the transaction, executes it, and generates binlog events in the binlog cache; from the primary keys, the binlog cache, and other related information it builds the write set (WS).
2. The write node broadcasts through the group communication protocol and pushes the WS to the other members in order for certification.
3. After certification succeeds, the write node flushes its binlog, returns "commit OK" to the client, and the transaction is committed successfully.
4. After certification succeeds on the other members, they write the transaction to their relay logs.
5. The other members read the relay log and put the transaction into the applier queue; once it is applied they write it to their local binlogs, and the whole cluster reaches a consistent state.

VII. Summary

Walking through the whole flow, we find that the network matters most to the MGR cluster as a whole, followed by how fast the read nodes consume their applier queues, and finally the cluster stalls caused by large transactions (a small monitoring sketch follows at the end of this post).

VIII. Postscript

This only represents my personal opinion. If you have any questions, leave a comment and I will fix the article as soon as possible.
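Since the summary points at the read nodes' queue consumption and at stalls caused by large transactions, here is a small monitoring sketch, assuming MySQL 5.7 or later. Note that in 5.7 the stats table reports only the local member and only the certification queue, while 8.0 shows every member and adds a separate applier-queue counter (COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE); the column aliases are my own.

    -- Watch cluster membership and the certification queue backlog.
    SELECT m.MEMBER_HOST,
           m.MEMBER_STATE,
           s.COUNT_TRANSACTIONS_IN_QUEUE AS waiting_certification,
           s.COUNT_TRANSACTIONS_CHECKED  AS certified_total,
           s.COUNT_CONFLICTS_DETECTED    AS conflicts
    FROM   performance_schema.replication_group_member_stats AS s
    JOIN   performance_schema.replication_group_members      AS m
           USING (MEMBER_ID)
    ORDER  BY m.MEMBER_HOST;

If waiting_certification (or, on 8.0, the applier-queue counter) keeps growing on a read node, that node is the one dragging flow control in for the whole cluster.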
