Reference: http://www.voidcn.com/article/p-raquwiob-th.html
Cluster
Concept of a cluster
A computer cluster is a group of loosely coupled computers that work closely together through integrated software and/or hardware, so that in many respects they can be viewed as a single machine. An individual computer in a cluster is usually called a node; nodes are typically connected by a local area network, though other arrangements are possible. Clusters are commonly used to improve the computing speed and/or reliability available from a single computer, and they generally offer a much better price/performance ratio than a single machine of comparable capability, such as a workstation or supercomputer.
For example, a single heavy workload can be split across multiple nodes for parallel processing; after each node finishes its part, the results are aggregated and returned to the user, greatly increasing the processing capacity of the system. Clusters generally fall into several types:
- High-availability cluster: when a node in the cluster fails, its tasks are automatically transferred to other healthy nodes. A node can also be taken offline for maintenance and brought back online later, without affecting the operation of the cluster as a whole.
- Load-balancing cluster: one or more front-end load balancers distribute the workload across a group of back-end servers, achieving high performance and high availability for the system as a whole.
- High-performance computing cluster: computing tasks are distributed across the cluster's compute nodes to increase aggregate computing power; this type is mainly used for scientific computing.
Distributed
Cluster: the same business is deployed on multiple servers. Distributed: a business is split into multiple sub-businesses, or different businesses are deployed on different servers.
Simply put, a distributed system improves efficiency by shortening the execution time of a single task, while a cluster improves efficiency by increasing the number of tasks executed per unit of time. Take a site like Sina.com: as traffic grows, it can form a cluster with a load balancer in front and a set of servers behind it, each performing the same business. When a request arrives, the balancer sends it to whichever server is lightly loaded, and if one server crashes, the others take over its work. In a distributed system, by contrast, each node performs a different part of the business, so if one node fails, that part of the business may fail with it.
Load Balancing
Concept
As business volume grows, the traffic and data flow through each core part of the network increase rapidly, and the required processing power grows with them, until a single server can no longer bear the load. Discarding the existing equipment for a large hardware upgrade wastes existing resources, and the next jump in business volume forces yet another expensive upgrade; even the best-performing single device may still fail to keep up with growing demand.
Load balancing virtualizes multiple real back-end servers into one high-performance application server behind a virtual server IP (VIP). The load balancer forwards each user request to a back-end (intranet) server according to a load balancing algorithm; the server returns its response to the load balancer, which sends it on to the user. This hides the internal network structure from Internet users and prevents them from reaching back-end servers directly, which makes the servers more secure and blocks attacks on the core network stack and on services running on other ports. A load balancing device (software or hardware) also continuously checks the application status on each server and automatically isolates servers whose applications have failed. The result is a simple, scalable, and highly reliable application solution to the insufficient performance, poor scalability, and low reliability of a single server.
System expansion can be vertical (scale-up) or horizontal (scale-out). Vertical expansion increases the processing capacity of a single server by upgrading its hardware, such as CPU power, memory capacity, and disks; but a single machine's capacity has hard limits and cannot handle the traffic and data volume faced by a large-scale website. Horizontal expansion instead adds machines to meet the processing needs of large-scale services: if one machine is not enough, two or more are added to share the access load together.
One of the most important applications of load balancing is using multiple servers to provide a single service, a setup sometimes called a server farm. Load balancing is mainly applied to web sites, large Internet Relay Chat networks, high-traffic file download sites, NNTP (Network News Transfer Protocol) services, and DNS services. Load balancers are now beginning to support database services as well, where they are called database load balancers.
Server load balancing has three basic features: a load balancing algorithm, health checking, and session persistence. These three are the essential elements for load balancing to work correctly; most other functions are refinements built on top of them. Below we look at the function and principle of each in detail.
Before a load balancing device is deployed, users access the server address directly (the address may be mapped to another address on a firewall, but it is still essentially one-to-one access). When a single server cannot handle the traffic, multiple servers must provide the service together, and load balancing is how this is achieved. A load balancing device works by mapping the addresses of multiple servers to a single external service IP, usually called the VIP. The mapping may be from server IP directly to the VIP, or from server IP:port to VIP:port; in the latter case the server port and the VIP port can differ, and each mapping method uses a corresponding health check. The process is transparent to clients: users do not even know the servers are load balanced, because they still access a single destination IP. Once a user's request reaches the load balancing device, distributing it to an appropriate server is the device's job, carried out through the three features mentioned above.
Let's analyze the access process in detail:
The user (IP 207.17.117.20) visits the domain name www.a10networks.com. First a DNS query resolves the domain to its public address, 199.237.202.124; the user at 207.17.117.20 then accesses 199.237.202.124, so the packet reaches the load balancing device, which distributes it to an appropriate server, see the following picture:
When the load balancing device forwards the packet to the server, the packet is changed. As shown in the figure above, before the packet reaches the load balancing device its source address is 207.17.117.20 and its destination address is 199.237.202.124; when the device forwards it to the selected server, the source address is still 207.17.117.20, but the destination address becomes 172.16.20.1. This is called destination address NAT (DNAT, destination address translation). Generally speaking, server load balancing must perform DNAT (there is another mode, direct server return (DSR), which does not; we discuss it separately), while translation of the source address depends on the deployment mode: sometimes it must be converted to another address, which is called source address NAT (SNAT). Typically, bypass (one-arm) mode requires SNAT while inline (serial) mode does not; this diagram shows inline mode, so the source address is not translated.
Now look at the server's return packet. As shown in the figure below, it also goes through an address translation, but the source and destination addresses are exactly reversed relative to the request: the packet returned by the server has source address 172.16.20.1 and destination address 207.17.117.20. When it reaches the load balancing device, the device changes the source address to 199.237.202.124 and forwards it to the user, keeping the access consistent.
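As a rough illustration of this address rewriting, here is a minimal Python sketch using the example addresses above. It is only a model: a real load balancer rewrites these fields at the packet level, not on dictionaries.

```python
# Minimal sketch of serial-mode DNAT: the request's destination is rewritten
# from the VIP to the chosen back end, and the response's source is rewritten
# back to the VIP. Addresses are the ones from the example above.

VIP = "199.237.202.124"      # virtual service address
BACKEND = "172.16.20.1"      # selected back-end server

def dnat_request(packet):
    """Rewrite a client request's destination from the VIP to a back end."""
    assert packet["dst"] == VIP
    packet["dst"] = BACKEND  # destination NAT; source left untouched (serial mode)
    return packet

def reverse_nat_response(packet):
    """Rewrite a server response's source back to the VIP."""
    assert packet["src"] == BACKEND
    packet["src"] = VIP      # so the client always sees a consistent peer address
    return packet

request = {"src": "207.17.117.20", "dst": VIP}
print(dnat_request(request))            # dst becomes 172.16.20.1
response = {"src": BACKEND, "dst": "207.17.117.20"}
print(reverse_nat_response(response))   # src becomes the VIP again
```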
Load balancing algorithm
Generally speaking, load balancing devices support multiple distribution strategies by default, for example (a sketch of a few of these strategies follows the list):
- Round robin (RoundRobin): requests are sent to the servers one after another, cyclically. When a server fails, the load balancer (called AX in this article's examples) removes it from the rotation, and it does not participate in polling until it recovers.
- Ratio: each server is assigned a weight, and user requests are allocated to the servers in proportion to these weights. When a server fails, AX removes it from the queue, and it does not receive requests until it recovers.
- Priority: servers are grouped and each group is given a priority; user requests are assigned to the group with the highest priority (within a group, requests are distributed with a preset round robin or ratio algorithm). When all servers, or a specified number of servers, in the highest-priority group fail, AX sends requests to the group with the next priority. This effectively provides a hot-standby arrangement.
- Least connections (LeastConnection): AX records the current number of connections on each server or service port, and new connections go to the server with the fewest. When a server fails, AX removes it from the queue until it recovers.
- Fast response time (FastResponse): new connections go to the servers that respond fastest. When a server fails, AX removes it from the queue until it recovers.
- Hash: the client's source address and port are hashed, and the result determines which server handles the request. When a server fails, it is removed from the queue until it recovers.
- Content-based distribution: the device inspects the packet content, for example the HTTP URL; if the URL ends in .jpg, the packet is forwarded to a designated image server.
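To make a few of these strategies concrete, here is a minimal Python sketch of round robin, ratio, least connections, and source hash. The server names, weights, and client address used below are illustrative assumptions, not taken from the article.

```python
import hashlib
import itertools
import random

class Backends:
    """Minimal sketches of a few of the distribution strategies above."""

    def __init__(self, servers):
        self.servers = servers                      # e.g. ["s1", "s2", "s3"]
        self._rr = itertools.cycle(servers)         # round-robin rotation
        self.connections = {s: 0 for s in servers}  # live connection counts

    def round_robin(self):
        return next(self._rr)

    def ratio(self, weights):
        # Weighted pick, e.g. weights = {"s1": 3, "s2": 1, "s3": 1}
        return random.choices(self.servers,
                              [weights[s] for s in self.servers])[0]

    def least_connection(self):
        return min(self.servers, key=lambda s: self.connections[s])

    def source_hash(self, client_ip, client_port):
        key = f"{client_ip}:{client_port}".encode()
        digest = int(hashlib.md5(key).hexdigest(), 16)
        return self.servers[digest % len(self.servers)]

pool = Backends(["s1", "s2", "s3"])
print(pool.round_robin(), pool.least_connection(),
      pool.source_hash("207.17.117.20", 40000))
```

A failed server would simply be removed from `servers` (and from the rotation) until its health check succeeds again, which is the behavior described for each strategy above.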
Health check
Health checks determine the availability of the services the servers expose. Load balancing devices generally support various health check methods, such as Ping, TCP, UDP, HTTP, FTP, and DNS. Ping is a layer-3 check that only tests IP connectivity to the server, while TCP/UDP checks operate at layer 4 and test whether a service port is up or down. For more accurate checking, a layer-7 health check is needed: for example, an HTTP health check can GET a page and verify that it contains a specified string; if it does, the service is UP, and if it does not, or the page cannot be retrieved, the server's web service is considered unavailable (DOWN). For example, if the device detects that port 80 on server 172.16.20.3 is DOWN, it stops forwarding connections to that server and distributes packets among the others according to the algorithm. When creating a health check, you can set the check interval and the number of retries. With an interval of 5 seconds and 3 retries, the device initiates a health check every 5 seconds; after 3 consecutive failures it marks the service DOWN, but it keeps checking the DOWN server every 5 seconds, and when a check succeeds again it re-marks the server UP. The interval and retry count should be set according to the overall situation: they should neither let failures affect the business for long nor put a heavy burden on the load balancing device.
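As an illustration, here is a minimal sketch of the layer-7 HTTP health check loop just described, with a 5-second interval and 3 retries. The health URL and the expected string are assumptions for the example.

```python
import time
import urllib.request

def check_once(url, expected):
    """One layer-7 probe: GET the page and look for the expected string."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return expected in resp.read().decode(errors="replace")
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...

def health_loop(url="http://172.16.20.3/health", expected="OK",
                interval=5, retries=3):
    failures, status = 0, "UP"
    while True:
        if check_once(url, expected):
            failures, status = 0, "UP"   # a success re-marks the server UP
        else:
            failures += 1
            if failures >= retries:
                status = "DOWN"          # stop sending new connections here
        print(status)
        time.sleep(interval)             # keep probing even while DOWN

# health_loop()  # probes forever; interrupt to stop
```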
Session persistence
How can we ensure that two HTTP requests from the same user are forwarded to the same server? This requires configuring session persistence on the load balancing device.
Session persistence maintains the continuity and consistency of a session. Because it is difficult to synchronize session state between servers in real time, a user's successive requests should be kept on one server for processing. For example, a user visits an e-commerce site and logs in on the first server; if the purchase request is then handled by a second server, that server knows nothing about the user, and the purchase fails. In this case session persistence is needed so that all of the user's operations are handled by the first server and succeed. Of course, not all traffic needs session persistence: for static pages, such as a news channel where every server has the same content, it is unnecessary.
Most load balancing products support two basic types of session persistence: source/destination address persistence and cookie persistence. Hash, URL persistence, and similar methods are also common, but not all devices support them. Different applications need different persistence configurations; otherwise the load becomes unbalanced or access fails outright. Here we mainly analyze session persistence for B/S (browser/server) applications.
Applications based on the B/S structure:
For ordinary B/S content such as static website pages, no session persistence needs to be configured. But for business systems built on the B/S structure, especially middleware platforms, session persistence must be configured. Normally, source address persistence is enough, but because clients may sit in environments (such as NAT or web proxies, discussed below) that make source address persistence unreliable, cookie persistence is the better approach. With cookie persistence, the load balancing device stores the selected server in a cookie sent to the client; when the client returns carrying that cookie, the device parses it and keeps the session on the previously selected server. Cookies may be file cookies or memory cookies. A file cookie is stored on the client's hard disk, and as long as it has not expired, the client stays on the same server even across browser restarts. A memory cookie keeps the cookie information in memory; its lifetime begins when the browser is opened and ends when it is closed. Because current browsers apply default security policies to cookies, some clients may forbid file cookies, so application development today mostly uses memory cookies.
However, memory cookies are not a cure-all either: a browser may disable cookies entirely for security, making cookie persistence useless. In that case session persistence can be achieved with a session ID, carried as a URL parameter or in a hidden form field; the device parses the session ID and distributes requests accordingly.
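A minimal sketch of this session-ID approach follows: extract a session ID from a URL parameter and hash it to pin the session to one server. The parameter name `sessionid`, the URLs, and the server pool are assumptions for illustration.

```python
import hashlib
from urllib.parse import parse_qs, urlparse

SERVERS = ["172.16.20.1", "172.16.20.2", "172.16.20.3"]  # illustrative pool

def pick_by_session_id(url):
    """Route all requests carrying the same session ID to the same server."""
    query = parse_qs(urlparse(url).query)
    sid = query.get("sessionid", [""])[0]
    digest = int(hashlib.sha1(sid.encode()).hexdigest(), 16)
    return SERVERS[digest % len(SERVERS)]

# Both requests of the same session land on the same back-end server:
print(pick_by_session_id("http://shop.example/cart?sessionid=abc123"))
print(pick_by_session_id("http://shop.example/pay?sessionid=abc123"))
```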
Another solution is to store every session's state in a database. Because this adds load to the database, the approach does not help performance; it is best used for session data that must live relatively long. To avoid a single point of failure and improve scalability, the database is usually replicated to multiple servers, with a load balancer distributing requests among them.
Session persistence based on source/destination address is actually of limited use, because a client may reach the Internet through DHCP, NAT, or a web proxy, so its IP address can change frequently, and the quality of service of this scheme cannot be guaranteed.
NAT (Network Address Translation): when hosts inside a private network have been assigned local IP addresses (private addresses usable only within that network) but need to communicate with hosts on the Internet (without encryption), NAT can be used. The method requires NAT software on the router that connects the private network to the Internet; such a router is called a NAT router, and it holds at least one valid external global IP address. When hosts using local addresses communicate with the outside world, the NAT router translates their local addresses into global IP addresses so they can connect to the Internet.
Other benefits of load balancing
High scalability
By adding or removing servers, the system can respond better to high volumes of concurrent requests.
(server) health check
The load balancer can check the health of back-end servers at the application layer and remove failed servers from the pool, improving reliability.
TCP connection reuse (TCP Connection Reuse)
TCP connection reuse multiplexes the HTTP requests of many front-end clients onto TCP connections the load balancer has established with the back-end servers. This greatly reduces the servers' performance load, removes the latency of setting up new TCP connections with the server, minimizes the number of concurrent connections from clients to the back-end servers, and reduces the servers' resource usage.
Normally, before sending an HTTP request, a client must complete a TCP three-way handshake with the server to establish a connection, and only then send the request; the server processes it and returns the result, after which client and server exchange FIN and FIN-ACK packets to close the connection. A single simple HTTP request thus costs more than a dozen TCP packets.
With TCP connection reuse, a client (say, Client A) performs the three-way handshake with the load balancing device and sends its HTTP request. On receiving the request, the device checks whether it has an idle persistent connection to the server; if not, it establishes a new one. When the HTTP response completes, the client closes its connection with the load balancing device, but the device keeps its connection to the server open. When another client (say, Client B) sends an HTTP request, the device forwards it over that idle server connection, avoiding the latency and server resource cost of a new TCP connection.
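The core of this technique is a pool of idle server-side connections. A minimal sketch follows; the back-end host and port are illustrative.

```python
import socket
from collections import deque

class BackendPool:
    """Sketch of TCP connection reuse: idle connections to the back end are
    kept open and handed to the next client request instead of paying the
    three-way handshake again."""

    def __init__(self, host="172.16.20.1", port=80):
        self.addr = (host, port)
        self.idle = deque()  # idle, already-established sockets

    def acquire(self):
        if self.idle:
            return self.idle.popleft()               # reuse an open connection
        return socket.create_connection(self.addr)   # else a new handshake

    def release(self, conn):
        self.idle.append(conn)  # keep it open for the next client's request
```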
In HTTP 1.1, a client can send multiple HTTP requests over one TCP connection, a technique called HTTP multiplexing. The fundamental difference from TCP connection reuse is direction: TCP connection reuse multiplexes requests from multiple clients onto one server-side TCP connection, while HTTP multiplexing carries multiple requests from a single client over one TCP connection. The former is a feature specific to load balancing devices; the latter is a feature of the HTTP 1.1 protocol and is supported by most current browsers.
HTTP cache
The load balancer can store static content and serve it directly when users request it, without contacting a back-end server.
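A minimal sketch of this idea: cache responses for URLs that look static and answer repeat requests without touching a back end. The extension-based "static" heuristic and the `fetch_from_backend` callable are assumptions for the example.

```python
# Sketch of an HTTP cache on the load balancer: static responses are stored
# and served directly on repeat requests.

cache = {}

def handle(url, fetch_from_backend):
    if url in cache:
        return cache[url]              # answered without touching a server
    body = fetch_from_backend(url)     # first request goes to the back end
    if url.endswith((".jpg", ".css", ".js")):  # assumed static-content rule
        cache[url] = body
    return body
```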
TCP buffer
TCP buffering addresses the server resources wasted by the mismatch between the back-end network speed and the client's front-end network speed. The link between client and load balancer has higher latency and lower bandwidth, while the load balancer and the servers are connected over a low-latency, high-bandwidth local network. The load balancer can buffer the back-end server's response and then feed it to clients with slow networks and long response times, so the back-end web server can free its threads to handle other tasks.
SSL acceleration
Normally HTTP is transmitted over the network in clear text and can be illegally eavesdropped upon, which is especially dangerous for authentication credentials such as passwords. To avoid such security problems, the SSL protocol (i.e., HTTPS) is generally used to encrypt HTTP and secure the whole transmission. In SSL communication, asymmetric cryptography is first used to exchange authentication information and to negotiate the session key that will encrypt data between server and browser; that key then encrypts and decrypts the information exchanged during the session.
SSL is computationally expensive and consumes a lot of CPU resources. Most load balancing devices today use dedicated SSL acceleration chips (hardware load balancers) to process SSL traffic. Compared with doing SSL encryption on the servers themselves, this offers higher SSL processing performance, saves considerable server resources, and lets the servers concentrate on business requests. Centralizing SSL processing also simplifies certificate management and reduces day-to-day administration.
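In software terms, SSL offload amounts to terminating TLS at the balancer and forwarding plain HTTP to the back end. A minimal sketch with Python's standard ssl module; the certificate file names are hypothetical, and binding port 443 requires privileges.

```python
import socket
import ssl

# Sketch of SSL termination on the load balancer: TLS is decrypted here,
# and plain HTTP would be forwarded to the back end. Hardware devices do
# the same work with dedicated SSL acceleration chips.

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain("server.crt", "server.key")  # hypothetical cert files

listener = socket.create_server(("0.0.0.0", 443))
with ctx.wrap_socket(listener, server_side=True) as tls:
    conn, addr = tls.accept()      # TLS handshake happens on accept
    request = conn.recv(4096)      # already-decrypted HTTP bytes
    # ...forward `request` over plain TCP to a back-end server...
```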
Content Filter
Some load balancers can modify the data passing through them as required.
Intrusion prevention function
Building on the network-layer/transport-layer security provided by a firewall, this adds protection at the application layer.
Classification
The following discusses load balancing implementations at different layers:
DNS Load Balancing
DNS provides domain name resolution: when accessing a site, the client first obtains the IP address the domain points to from the site's DNS server, which performs the mapping from domain name to IP address. That mapping can also be one-to-many, in which case the DNS server acts as a load balancing scheduler, spreading user requests across multiple servers. Use the dig command to look at the DNS records for "baidu":
It can be seen that baidu has three A records.
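The same observation can be made programmatically. A minimal sketch that resolves the hostname and prints the distinct A records the resolver returns; depending on your resolver and location you may see one or several addresses.

```python
import socket

# One hostname may resolve to several A records; DNS round robin spreads
# clients across them. The hostname is the one from the dig example above.
addrs = {info[4][0] for info in
         socket.getaddrinfo("www.baidu.com", 80,
                            socket.AF_INET, socket.SOCK_STREAM)}
print(addrs)  # typically more than one address for a load-balanced site
```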
The advantages of this technique are that it is simple and cheap to implement, suitable for most TCP/IP applications, and the DNS server can pick, among the available A records, the server closest to the user. Its shortcomings, however, are just as obvious. First, it is not load balancing in the true sense: the DNS server distributes requests evenly across the back-end web servers (or by geographic location) without regard to each server's current load; if the back-end servers differ in configuration and processing power, the slowest one becomes the system's bottleneck, and the powerful ones cannot be fully used. Second, it is not fault tolerant: if a back-end web server fails, the DNS server will still hand out that server's record, and those clients get no response. This last point can be fatal, since a considerable number of users are cut off from the service, and because of DNS caching the effect lasts a long time (a typical DNS refresh cycle is about 24 hours). For these reasons, recently built web sites rarely rely on this scheme alone.
Link layer (OSI layer 2) load balancing
Load balancing is performed at the data link layer by modifying the MAC address.
When distributing traffic, the IP address is not modified; only the destination MAC address is changed, and the virtual IP configured on every back-end server matches the load balancer's IP address. Packets are thus distributed without altering their source or destination IP addresses.
Because the real server's IP is the same as the destination IP of the request, response packets need not pass back through the load balancing server for address translation: they can be returned directly to the user's browser, which keeps the load balancer's network card bandwidth from becoming a bottleneck. This is also known as direct routing mode (DR mode). As shown in the figure below:
Its performance is very good, but the configuration is complex; it is currently in fairly wide use.
Transport layer (OSI layer 4) load balancing
The transport layer is OSI layer 4 and includes TCP and UDP. Popular transport layer load balancers include HAProxy (which is also used for application layer load balancing) and IPVS.
It mainly uses the destination address and port in the message, plus the server selection method configured on the load balancing device, to determine which internal server finally handles the request.
Taking common TCP as an example: when the load balancing device receives the first SYN from the client, it selects the best server by the means above, rewrites the destination IP in the packet (to the chosen back-end server's IP), and forwards it directly to that server. The TCP connection, that is, the three-way handshake, is established directly between the client and the server; the load balancing device merely performs router-like forwarding. In some deployments, to guarantee that the server's response packets return through the load balancing device, the packet's original source address may also be modified during forwarding.
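A kernel-level layer-4 balancer such as IPVS rewrites packet addresses as described; a user-space approximation that shows the same per-connection server selection is sketched below, with illustrative back-end addresses. It relays bytes instead of rewriting packets, but the selection happens at the same moment: when the client's connection arrives.

```python
import socket
import threading

BACKENDS = [("172.16.20.1", 80), ("172.16.20.2", 80)]  # illustrative pool
counter = 0

def relay(src, dst):
    # Copy bytes one way until the peer closes the connection.
    while chunk := src.recv(4096):
        dst.sendall(chunk)

def serve(listen_port=8080):
    global counter
    listener = socket.create_server(("0.0.0.0", listen_port))
    while True:
        client, _ = listener.accept()  # server chosen per connection (L4)
        backend = socket.create_connection(BACKENDS[counter % len(BACKENDS)])
        counter += 1                   # simple round robin across back ends
        threading.Thread(target=relay, args=(client, backend)).start()
        threading.Thread(target=relay, args=(backend, client)).start()
```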
Application layer (OSI layer 7) load balancing
The application layer is OSI layer 7 and includes HTTP, HTTPS, and WebSockets. A very popular and well-tested application layer load balancer is Nginx (pronounced "Engine X").
So-called seven-layer load balancing, also known as "content switching", uses the truly meaningful application layer content in the message, plus the server selection method configured on the device, to decide which internal server is finally chosen. Note that at this layer the device can see the complete URL of each HTTP request, so the distribution shown in the following figure can be realized:
Again taking common TCP as an example: if the load balancing device is to choose a server based on real application layer content, it must first act as a proxy, completing the three-way handshake with the client in place of the final server, before it can see the application layer content the client sends. Then, based on specific fields in that content plus its configured server selection method, it determines the final internal server. In this case the load balancing device behaves much like a proxy server, establishing separate TCP connections with the front-end client and with the back-end server. From this technical principle it is clear that seven-layer load balancing places higher demands on the device, and its seven-layer processing capacity is necessarily lower than that of a four-layer deployment. So why do we need seven-layer load balancing at all?
The benefit of seven-layer load balancing is that it makes the whole network more "intelligent"; most of the benefits of load balancing listed above, for instance, rest on it. Traffic visiting a website can be handled so that requests for images are forwarded to a dedicated image server where caching can be applied, while requests for text are forwarded to a dedicated text server where compression can be applied. This is only a small example of seven-layer applications: in principle, this approach can modify the client's request and the server's response in any way, greatly improving the flexibility of the application system at the network layer.
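A minimal sketch of this kind of content switching: inspect the HTTP request line and route image URLs to an image server, everything else to a text server. The server addresses and the extension list are illustrative assumptions.

```python
# Sketch of layer-7 content switching: the balancer reads the request line
# (only possible after proxying the TCP handshake) and routes by URL.

IMAGE_SERVER = ("10.0.0.10", 80)  # illustrative back ends
TEXT_SERVER = ("10.0.0.20", 80)

def choose_backend(request_bytes):
    request_line = request_bytes.split(b"\r\n", 1)[0]  # b"GET /a.jpg HTTP/1.1"
    path = request_line.split()[1].decode()
    if path.endswith((".jpg", ".png", ".gif")):
        return IMAGE_SERVER   # image traffic, where caching can be applied
    return TEXT_SERVER        # everything else, where compression can help

print(choose_backend(b"GET /logo.jpg HTTP/1.1\r\nHost: example.com\r\n\r\n"))
```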
Another frequently mentioned aspect is security. The most common attack on the network, the SYN flood, occurs when hackers control many source clients and use forged IP addresses to send SYN packets at the same target, usually in huge numbers, exhausting the server's resources to achieve denial of service (DoS). The technical principle also shows that in four-layer mode these SYN attacks are forwarded on to the back-end servers, whereas in seven-layer mode they naturally terminate at the load balancing device and do not affect the normal operation of the back-end servers. In addition, at layer 7 the device can apply a variety of policies to filter specific traffic, such as SQL injection and other application-level attack patterns, further improving the overall security of the system from the application layer.
Today's seven-layer load balancing mainly targets the widely used HTTP protocol, so it is applied chiefly to the many B/S systems such as websites and internal information platforms. Four-layer load balancing covers other TCP applications, such as ERP and other systems built on the C/S model.