Gateway Protocol Learning: CGI, FastCGI, WSGI

CGI

CGI stands for Common Gateway Interface. It is an interface standard between web servers and external applications (CGI programs), and it defines how information is passed between the two. The CGI specification allows a web server to execute an external program and send that program's output to the web browser, which turns a set of simple static hypermedia documents into an interactive medium. In layman's terms, CGI is a bridge between the web page and a program running on the web server: it passes the data submitted from the HTML page to the program, and then returns the program's result to the browser as a page. CGI is highly portable and can be implemented on almost any operating system.

In CGI mode, every connection request (user request) makes the server create a new CGI child process, which handles the request and then exits. This is the fork-and-execute model. A server in CGI mode therefore runs as many CGI child processes as there are concurrent requests, and repeatedly starting those processes is the main reason for CGI's low performance. When the number of user requests is very large, the child processes consume large amounts of system resources such as memory and CPU time.
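The fork-and-execute model can be sketched in a few lines. This is only an illustration, not how any particular web server is implemented: `CGI_PROGRAM` and `handle_request` are hypothetical names, and the "CGI program" is a tiny inline Python script. The point is that every request starts a brand-new interpreter process.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a CGI program: it reads the request from
# environment variables and writes headers plus a body to standard output.
CGI_PROGRAM = """
import os, sys
sys.stdout.write("Content-Type: text/plain\\n\\n")
sys.stdout.write("pid=%d query=%s" % (os.getpid(), os.environ["QUERY_STRING"]))
"""


def handle_request(query_string):
    """Start a fresh interpreter for one request and return its raw output."""
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query_string)
    result = subprocess.run(
        [sys.executable, "-c", CGI_PROGRAM],
        env=env,
        capture_output=True,
        check=True,
    )
    return result.stdout  # bytes: header lines, blank line, body
```

Calling `handle_request` twice normally reports two different process IDs, because the interpreter is loaded from scratch each time. That startup cost is paid on every request.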

CGI script workflow:

  1. The browser requests the URL of a CGI application, through an HTML form or a hyperlink.
  2. The server receives the request.
  3. The server executes the specified CGI application.
  4. The CGI application performs the required operations, usually based on the content entered by the user.
  5. The CGI application formats the result into a document (usually an HTML page) that the web server and browser can understand.
  6. The web server returns the result to the browser.
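As a rough illustration of steps 4 and 5 above, here is what a minimal CGI program might look like in Python. The function name `build_response` and the `name` query parameter are assumptions made for this sketch. The server passes the request in environment variables such as `QUERY_STRING`, and the program writes header lines, a blank line, and the document to standard output.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(environ):
    """Turn CGI environment variables into a complete CGI response."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Escape user input before putting it into the HTML document.
    body = "<html><body><h1>Hello, %s!</h1></body></html>" % html.escape(name)
    # A CGI response is header lines, a blank line, then the document.
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a web server as a CGI script, os.environ holds the request.
    sys.stdout.write(build_response(os.environ))
```

The web server reads this output, adds the HTTP status line, and returns the result to the browser (step 6).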

FastCGI

FastCGI is a scalable, high-speed communication interface between an HTTP server and a dynamic scripting language. Most popular HTTP servers support FastCGI, including Apache, Nginx, and lighttpd. FastCGI is also supported by many scripting languages, including PHP.

FastCGI was developed as an improvement on CGI. The main disadvantage of the traditional CGI interface is poor performance: every time the HTTP server encounters a dynamic program, it has to start the script interpreter again to run it and then return the result to the HTTP server. This makes CGI almost unusable under highly concurrent access. FastCGI is like a long-lived CGI: once started, the process keeps running, so no time is spent forking for every request (CGI's most criticized fork-and-execute model). A CGI program is a short-lived application, and a FastCGI program is a long-lived one. Because a FastCGI program does not need to keep creating new processes, it greatly reduces the load on the server and makes the application more efficient. It is often claimed to be at least 5 times faster than CGI, although the actual gain depends on the workload. FastCGI also supports distributed deployment: FastCGI programs can run on hosts other than the web server and accept requests from other web servers.

FastCGI is a language-independent, scalable, open extension of CGI. Its main idea is to keep the CGI interpreter process in memory and thereby achieve higher performance. As noted above, repeatedly loading the CGI interpreter is the main reason for CGI's low performance. If the interpreter stays in memory and is scheduled by a FastCGI process manager, it can offer good performance, scalability, fail-over, and so on. The FastCGI interface uses a client/server structure, which makes it possible to separate the HTTP server from the script-parsing server and to run one or more script-parsing daemons on the latter. Whenever the HTTP server encounters a dynamic program, it hands it directly to a FastCGI process for execution and then returns the result to the browser. The HTTP server can then concentrate on serving static requests and relaying the results from the dynamic script server to the client, which greatly improves the performance of the whole application system.

FastCGI workflow:

  1. When the web server starts, it loads the FastCGI process manager (PHP-CGI, PHP-FPM, or spawn-fcgi).
  2. The FastCGI process manager initializes itself, starts multiple CGI interpreter processes (visible as multiple php-cgi processes), and waits for connections from the web server.
  3. When a client request arrives at the web server, the FastCGI process manager selects a CGI interpreter and connects to it. The web server sends the CGI environment variables and standard input to the FastCGI child process php-cgi.
  4. After it finishes processing, the FastCGI child process returns standard output and error information to the web server over the same connection. When the FastCGI child process closes the connection, the request is complete. The child process then waits for and handles the next connection from the FastCGI process manager (running in the web server). In CGI mode, php-cgi would exit at this point.
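Step 3 above, where the web server sends the CGI environment variables and standard input, can be sketched at the byte level. The record layout (an 8-byte header followed by content) and the name-value pair encoding follow the FastCGI specification. The function names are hypothetical, and the sketch leaves out the begin-request record that precedes these records, padding, and content longer than 65535 bytes.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_PARAMS = 4  # record type carrying the CGI environment variables
FCGI_STDIN = 5   # record type carrying the request body


def encode_length(n):
    """Lengths below 128 take one byte; longer ones take four with the high bit set."""
    if n < 128:
        return bytes([n])
    return struct.pack("!I", n | 0x80000000)


def encode_pair(name, value):
    """Encode one environment variable as a FastCGI name-value pair."""
    name_b, value_b = name.encode(), value.encode()
    return encode_length(len(name_b)) + encode_length(len(value_b)) + name_b + value_b


def make_record(record_type, request_id, content):
    """Header fields: version, type, request id, content length, padding length, reserved."""
    header = struct.pack(
        "!BBHHBx", FCGI_VERSION_1, record_type, request_id, len(content), 0
    )
    return header + content


def encode_request_input(request_id, environ, body=b""):
    """Build the PARAMS and STDIN streams; an empty record ends each stream."""
    params = b"".join(encode_pair(k, v) for k, v in environ.items())
    records = [make_record(FCGI_PARAMS, request_id, params),
               make_record(FCGI_PARAMS, request_id, b"")]
    if body:
        records.append(make_record(FCGI_STDIN, request_id, body))
    records.append(make_record(FCGI_STDIN, request_id, b""))
    return b"".join(records)
```

Because every record carries a request ID, the FastCGI process can stay alive and keep serving further requests, instead of exiting as a CGI program would.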

Features of FastCGI

  1. It breaks with traditional page-processing technology. Traditionally, the program had to run on the same machine as the web server or application server. FastCGI removed that restriction years ago: a FastCGI application can be installed on any server in the cluster and communicate with the web server over TCP/IP, which suits both the development of large distributed web clusters and efficient database access.
  2. Clear request roles. CGI does not assign a program any explicit role. A FastCGI program is assigned an explicit role: responder, authorizer, or filter.
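The role is stated explicitly at the start of every request. As a minimal sketch, assuming the record layout from the FastCGI specification (which names the three roles Responder, Authorizer, and Filter), the web server opens a request with a begin-request record like the one below. The helper name `begin_request` is hypothetical.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1  # record type that opens a request

# Role values defined by the FastCGI specification.
FCGI_RESPONDER = 1
FCGI_AUTHORIZER = 2
FCGI_FILTER = 3

FCGI_KEEP_CONN = 1  # flag: keep the connection open after this request


def begin_request(request_id, role, keep_conn=True):
    """Build the record that tells the application which role it plays."""
    # Body: 2-byte role, 1-byte flags, 5 reserved bytes.
    body = struct.pack("!HB5x", role, FCGI_KEEP_CONN if keep_conn else 0)
    # Header: version, type, request id, content length, padding length, reserved.
    header = struct.pack(
        "!BBHHBx", FCGI_VERSION_1, FCGI_BEGIN_REQUEST, request_id, len(body), 0
    )
    return header + body
```

A responder produces a complete response, much like a CGI program. An authorizer only decides whether the request is allowed, and a filter transforms a data stream supplied by the web server.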

ISAPI

ISAPI (Internet Server Application Program Interface) is a set of web-service APIs provided by Microsoft. It can do everything CGI can and extends it with additional features, such as a filter application interface. ISAPI applications are usually built as DLLs. A DLL is loaded when a user requests it and is not unloaded immediately after handling that request; it stays resident in memory and waits for further requests. In addition, an ISAPI DLL runs in the same process as the web server, so it is significantly more efficient than CGI. (Because ISAPI is Microsoft-specific, it runs only on Windows.)

ISAPI server extensions are an alternative to Common Gateway Interface (CGI) applications on an Internet server. Unlike a CGI application, an ISA (Internet Server Application) runs in the same address space as the HTTP server and can access all the resources available to it. ISAs have lower overhead than CGI applications because they do not require additional processes to be created and do not perform time-consuming communication across process boundaries. Both extension and filter DLLs may be unloaded if other processes need the memory. ISAPI allows multiple commands in one DLL, implemented as member functions of the CHttpServer object in the DLL. CGI, by contrast, requires each task to have its own name and a URL mapped to a separate executable file. Every new CGI request starts a new process, and each kind of request lives in its own executable; these files are loaded and unloaded for every request, so the overhead is higher than with ISAs.

PHP-CGI

PHP-CGI is the FastCGI manager that comes with PHP. Its disadvantages:

  1. After php.ini is changed, php-cgi must be restarted for the new php.ini to take effect, and it cannot be restarted gracefully.
  2. If the php-cgi process is killed directly, PHP stops running. (PHP-FPM and Spawn-FCGI do not have this problem: their daemon gracefully spawns new child processes.)

Spawn-FCGI

Spawn-FCGI is a general-purpose FastCGI management server that is part of the lighttpd project. Many people use lighttpd's Spawn-FCGI to manage processes in FastCGI mode, but it has many shortcomings. The emergence of PHP-FPM has alleviated some of these problems, but PHP-FPM has the drawback that PHP must be recompiled, which can be risky for environments that are already in production. From PHP 5.3.3 onward, PHP-FPM can be used directly. Spawn-FCGI has very little code, only about 630 lines in total, written in C, and at the time of writing its last commit was 5 years ago. Code homepage: https://github.com/lighttpd/spawn-fcgi

Spawn-FCGI code analysis is as follows:

  1. spawn-fcgi first creates a server socket in three steps: socket, bind, listen. (Call this socket fcgi_fd.)
  2. It uses dup2 to duplicate fcgi_fd onto FCGI_LISTENSOCK_FILENO (numerically equal to 0, the file descriptor that the FastCGI protocol designates for the listening socket).
  3. It calls execl, which replaces the current process image with a new process image. (The process image is the code segment of the process in its running address space.)
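The three steps above translate almost directly into Python's `os` and `socket` modules. The following is a simplified sketch of the technique, not a port of spawn-fcgi: the function name `spawn_fastcgi` is made up for illustration, only one child is started, and only TCP sockets are handled.

```python
import os
import socket

# The FastCGI application expects its listening socket on file descriptor 0.
FCGI_LISTENSOCK_FILENO = 0


def spawn_fastcgi(argv, host="127.0.0.1", port=0):
    """Create a listening socket, fork, and exec argv with the socket on fd 0."""
    # Step 1: socket, bind, listen.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(128)
    address = listener.getsockname()

    pid = os.fork()
    if pid == 0:
        try:
            # Step 2: duplicate the listening socket onto fd 0.
            os.dup2(listener.fileno(), FCGI_LISTENSOCK_FILENO)
            # Step 3: replace this child process with the FastCGI program.
            os.execl(argv[0], *argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: the child owns the socket now.
    listener.close()
    return pid, address
```

After `execl` succeeds, the new program simply calls accept on file descriptor 0. The spawner takes no part in the network I/O that follows.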

Obviously, Spawn-FCGI also uses the pre-fork model; it is just written in old-style C, full of obscure Unix programming tricks.

What Spawn-FCGI does is very simple:

  1. It only forks the processes. If a child process dies, the main process merely logs it once and does not fork a replacement. For a while in 2009 I deployed php-cgi with spawn-fcgi, and after running for some time all the processes would die; I could only use crontab to restart spawn-fcgi periodically.
  2. It is not responsible for network I/O in the child process. It just puts the socket at the designated file descriptor, and everything after that is handled by the spawned program.

Spawn-FCGI is a very early program, so a quick look is enough. There is also a piece of code from 1996, http://www.fastcgi.com/om_archive/kit/cgi-fcgi/cgi-fcgi.c, written in the same style as spawn-fcgi.

PHP-FPM

PHP-FPM is a FastCGI manager for PHP only; it can be downloaded from http://php-fpm.org/download. PHP-FPM started as a patch to the PHP source code, designed to integrate FastCGI process management into the PHP package. For PHP versions before 5.3.3, the patch must be applied to your PHP source code, and PHP-FPM can be used after compiling and installing PHP. FPM (FastCGI Process Manager) replaces most of the additional functions of PHP-CGI and is very useful for high-load websites. Its features include:

  1. Advanced process management with graceful stop/start;
  2. The ability to run workers in different uid/gid/chroot environments, listen on different ports, and use different php.ini configuration files (which can replace safe_mode settings);
  3. stdout and stderr logging;
  4. Emergency restart if the opcode cache is accidentally corrupted;
  5. Optimized file upload support;
  6. “Slow log”: logging of scripts that run abnormally slowly (not only the file name but also the PHP backtrace, obtained with ptrace or similar tools that read and analyze the running data of another process);
  7. fastcgi_finish_request(): a special function that finishes the request and flushes all data while time-consuming work (video conversion, statistics processing, etc.) continues in the background;
  8. Dynamic/static child process spawning;
  9. Basic SAPI status information (similar to Apache’s mod_status);
  10. A configuration file based on php.ini.

WSGI

The Web Server Gateway Interface (Python Web Server Gateway Interface, abbreviated as WSGI) is a simple and universal interface between web servers and web applications or frameworks, defined for the Python language. Since WSGI was developed, similar interfaces have appeared for many other languages. WSGI serves as a low-level interface between a web server and a web application or framework, to promote the common ground needed for portable web application development. WSGI was designed on the basis of the existing CGI standard.

WSGI has two sides: the “server” or “gateway” side, and the “application” or “application framework” side. When handling a WSGI request, the server provides the application with environment information and a callback function. The application uses that callback to send the response status and headers back to the server, and returns the response body as its result. So-called WSGI middleware implements both sides of the API, so it can sit between a WSGI server and a WSGI application: from the server's point of view the middleware is an application, and from the application's point of view it is a server. A “middleware” component can perform functions such as:

  1. Routing a request to different application objects according to the target URL, after rewriting the environment variables.
  2. Allowing multiple applications or application frameworks to run side by side in one process.
  3. Load balancing and remote processing, by forwarding requests and responses over the network.
  4. Post-processing content, such as applying XSLT style sheets.
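As a small illustration of the last item, content post-processing, the sketch below wraps an application and transforms each chunk of the response body. Upper-casing stands in for a real transformation such as XSLT. The names `uppercase_middleware` and `hello_app` are hypothetical, and the sketch assumes the wrapped application does not set a Content-Length that the transformation would invalidate.

```python
def uppercase_middleware(app):
    """Wrap a WSGI app and upper-case every chunk of its response body."""
    def wrapped(environ, start_response):
        # Toward the wrapped app, the middleware behaves like a server.
        result = app(environ, start_response)
        try:
            for chunk in result:
                yield chunk.upper()
        finally:
            # Honor the optional close() method of the app's iterable.
            if hasattr(result, "close"):
                result.close()
    return wrapped


def hello_app(environ, start_response):
    """A trivial WSGI application used to demonstrate the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]
```

To the server, `uppercase_middleware(hello_app)` is just another WSGI application, so it can be passed to any WSGI server unchanged.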

In the past, choosing the right web application framework was a problem for Python beginners, because the choice of framework generally limited the choice of usable web servers, and vice versa. Python applications at that time were usually written for one of CGI, FastCGI, or mod_python, or even for the custom API of a specific web server. WSGI has no official implementation, because it is more like a protocol: as long as both sides follow it, a WSGI application can run on any WSGI server, and vice versa. Loosely speaking, WSGI is to Python what FastCGI is to PHP: a wrapper around the ideas of CGI.

WSGI divides web components into three categories: web server, web middleware, and web application. The basic WSGI processing model is: WSGI Server -> (WSGI Middleware)* -> WSGI Application.


1. WSGI Server/Gateway

A WSGI server can be understood as a web server that conforms to the WSGI specification: it receives the request, packages a set of environment variables, calls the registered WSGI app as the specification requires, and finally returns the response to the client. It is hard to explain in words exactly what a WSGI server is and does; the most intuitive way is to read the implementation of one. Take wsgiref, which ships with Python, as an example: it is a simple WSGI server implemented according to the WSGI specification, and its code is not complicated.

The wsgiref server handles a request as follows:

  1. The server creates a socket, listens on a port, and waits for clients to connect.
  2. When a request arrives, the server parses the client information, puts it in the environ dictionary, and calls the bound handler to process the request.
  3. The handler parses the HTTP request and puts request information such as the method and path into environ.
  4. The WSGI handler adds some server-side information to environ, so that the server information, client information, and request information are all stored in environ.
  5. The WSGI handler calls the registered WSGI app and passes it environ and a callback function.
  6. The WSGI app returns the response status, headers, and body to the WSGI handler.
  7. Finally, the handler sends the response back to the client through the socket.
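The steps above can be condensed into a toy "server" that skips the socket and handles a single in-memory request. This is a sketch of the calling convention only, not wsgiref's actual code: `run_once` and `demo_app` are hypothetical names, and the environ dictionary holds just a plausible minimum of keys.

```python
import io
import sys


def run_once(app, method="GET", path="/"):
    """Build environ, call the WSGI app, and return the raw HTTP response bytes."""
    # Steps 2-4: collect server, client, and request details in environ.
    environ = {
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": "",
        "PATH_INFO": path,
        "QUERY_STRING": "",
        "SERVER_NAME": "localhost",
        "SERVER_PORT": "8000",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(b""),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": True,
    }
    response = {}

    def start_response(status, headers, exc_info=None):
        # The callback passed to the app: it records status and headers.
        response["status"] = status
        response["headers"] = headers
        return lambda data: None  # legacy write() callable, unused in this sketch

    # Steps 5-6: call the registered app and collect the body it returns.
    result = app(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        if hasattr(result, "close"):
            result.close()

    # Step 7: serialize the response as it would be written to the socket.
    head = "HTTP/1.1 %s\r\n" % response["status"]
    head += "".join("%s: %s\r\n" % header for header in response["headers"])
    return head.encode("latin-1") + b"\r\n" + body


def demo_app(environ, start_response):
    """A trivial WSGI app that echoes the requested path."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["PATH_INFO"].encode()]
```

A real server such as wsgiref does the same work with actual sockets and fills environ from the parsed HTTP request.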

2. WSGI Application

A WSGI application is an ordinary callable object. When a request arrives, the WSGI server calls this WSGI app. The callable takes two parameters, conventionally named environ and start_response. As introduced earlier, environ can be understood as a set of environment variables: all information related to a request is stored in it, including server information, client information, and request information. start_response is a callback function; the WSGI application returns the response status and headers to the WSGI server by calling start_response. In addition, the WSGI app returns an iterable object, and this iterable is the response body. Described in words alone this sounds abstract, and it is much easier to understand from a simple example.
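A minimal WSGI application might look like the sketch below. The name `application` and the response text are arbitrary choices; only the calling convention (environ, start_response, an iterable of bytes as the return value) comes from the WSGI specification.

```python
def application(environ, start_response):
    """A minimal WSGI app: report the requested path as plain text."""
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from %s\n" % path).encode("utf-8")

    # Hand the status and headers to the server through the callback.
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])

    # The return value is an iterable of byte strings: the response body.
    return [body]
```

To try it with the server that ships with Python, pass it to `wsgiref.simple_server.make_server("127.0.0.1", 8000, application)` and call `serve_forever()` on the result.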

3. WSGI Middleware

Some functionality belongs between the server program and the application. For example, once the server has the URL requested by the client, different URLs need to be handled by different functions. This is called URL routing, and it can be implemented in a layer between the two: the middleware. Middleware is transparent to both the server program and the application: the server program thinks it is the application, and the application thinks it is the server. This means middleware has to act as a server, accepting an application and calling it, and at the same time act as an application that is handed to the server program.
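URL routing as middleware can be sketched as follows. `URLRouter`, `not_found`, and `blog` are hypothetical names, and prefix matching is only one possible routing rule. The router is itself a WSGI application (it takes environ and start_response), and it calls the selected application the way a server would.

```python
def not_found(environ, start_response):
    """Fallback WSGI app used when no route matches."""
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


class URLRouter:
    """WSGI middleware that dispatches on a URL prefix."""

    def __init__(self, routes, default=not_found):
        self.routes = routes  # list of (prefix, wsgi_app) pairs
        self.default = default

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.routes:
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME,
                # so the app sees paths relative to its mount point.
                child = dict(environ)
                child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                child["PATH_INFO"] = path[len(prefix):]
                return app(child, start_response)
        return self.default(environ, start_response)


def blog(environ, start_response):
    """Example application mounted under a prefix."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"blog:" + environ["PATH_INFO"].encode()]
```

For example, `URLRouter([("/blog", blog)])` can be handed to a WSGI server like any other application; a request for /blog/post/1 reaches `blog` with PATH_INFO set to /post/1.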

In fact, the server program, the middleware, and the application all run on the server side and serve the client. They are abstracted into separate layers to control complexity, so that no single layer becomes too complicated and each does its own job.


uWSGI

The uWSGI project aims to be a complete solution for deploying distributed, clustered network applications. It is mainly oriented toward the web and its standard services, and has been used successfully with many different languages. Thanks to its extensible architecture, uWSGI can be extended without limit to support more platforms and languages; currently, plugins can be written in C, C++, and Objective-C. The “WSGI” in the project name is a tribute to the Python web standard of the same name, because WSGI was the first plugin developed for the project. uWSGI is a web server that implements the WSGI, uwsgi, HTTP, and other protocols. Note the two similar names: uwsgi (lowercase) is the uWSGI server's own wire protocol, and it is distinct from both WSGI and FastCGI. It defines the type of information being transmitted: the first 4 bytes of every uwsgi packet describe the type of the data that follows. The uwsgi protocol is reportedly about 10 times faster than the FastCGI protocol.
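The 4-byte packet header can be illustrated with a short sketch. The layout used here (a one-byte modifier1, a two-byte little-endian data size, a one-byte modifier2, with modifier1 0 meaning a WSGI request whose body is a list of length-prefixed key/value pairs) follows the uwsgi protocol documentation. The function names are hypothetical.

```python
import struct


def uwsgi_header(modifier1, datasize, modifier2):
    """Pack the 4-byte uwsgi header: modifier1, 16-bit little-endian size, modifier2."""
    return struct.pack("<BHB", modifier1, datasize, modifier2)


def encode_vars(variables):
    """Encode request variables as 16-bit-length-prefixed keys and values."""
    out = b""
    for key, value in variables.items():
        key_b, value_b = key.encode(), value.encode()
        out += struct.pack("<H", len(key_b)) + key_b
        out += struct.pack("<H", len(value_b)) + value_b
    return out


def wsgi_request_packet(variables):
    """Build a uwsgi packet carrying WSGI request variables (modifier1 = 0)."""
    body = encode_vars(variables)
    return uwsgi_header(0, len(body), 0) + body
```

Compared with FastCGI, there are no per-record stream types or padding, which keeps the parser on the receiving side very small.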

The main features of uWSGI are as follows:

  1. Very fast performance.
  2. Low memory footprint (measured at about half that of Apache 2's mod_wsgi).
  3. Multi-app management.
  4. Detailed logging (can be used to analyze app performance and bottlenecks).
  5. Highly customizable (memory size limits, restart after serving a certain number of requests, etc.).

Related topics for further study: Java Servlet, Sinatra, Rack
