Day41 – Asynchronous IO, Coroutines

Contents

– 1. Preface
– 2. Five models of IO
– 3. Coroutine
– 3.1 The concept of coroutine
– 4. Gevent module
– 4.1 Basic use of gevent
– 4.2 gevent Application 1: Crawler
– 4.3 gevent Application 2: Network programming

1. Preface

The CPU is much faster than disk, network, and other IO devices. Within a thread, the CPU executes code extremely quickly, but once it reaches an IO operation, such as reading or writing a file or sending data over the network, it must wait for that operation to complete before moving on. This is called synchronous IO. During the IO operation the current thread is suspended, and no other code can run on that thread.

Because an IO operation blocks the current thread, we must use multiple threads or processes to run code concurrently and serve multiple users. Each user is assigned a thread; if one thread is suspended on IO, the other users' threads are unaffected. Although the multi-thread and multi-process model solves the concurrency problem, the system cannot add threads without limit. Thread switching is itself expensive, so once there are too many threads, CPU time goes to switching rather than to running code, and performance degrades sharply.

The underlying problem is the severe mismatch between the speed of the CPU and the slowness of IO devices, and multi-threading and multi-processing are only one way to address it. Another way is asynchronous IO: when code needs to perform a slow IO operation, it only issues the IO request, does not wait for the result, and goes on to execute other code. Later, when the IO result is ready, the program is notified and handles it.
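The cost of synchronous IO, and the multi-thread workaround described above, can be sketched with the standard library alone. This is a minimal illustration, not a benchmark: `handle_user` is a hypothetical function, and `time.sleep(0.2)` is an assumed stand-in for a slow disk or network operation.

```python
import threading
import time


def handle_user(user_id, results):
    # time.sleep stands in for a slow IO operation (assumption for the demo).
    time.sleep(0.2)
    results[user_id] = 'done'


def run_sequential(n):
    # One thread: each simulated IO wait blocks everything behind it.
    results = {}
    start = time.perf_counter()
    for i in range(n):
        handle_user(i, results)
    return time.perf_counter() - start, results


def run_threaded(n):
    # One thread per user: the waits overlap instead of adding up.
    results = {}
    threads = [threading.Thread(target=handle_user, args=(i, results))
               for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results


seq_time, seq_results = run_sequential(3)
thr_time, thr_results = run_threaded(3)
print('sequential: %.2fs, threaded: %.2fs' % (seq_time, thr_time))
```

The sequential run takes roughly the sum of the waits, while the threaded run takes roughly the longest single wait. The price is one operating system thread per user, which is the overhead the rest of this article tries to avoid.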

2. Five models of IO

  (1) blocking IO

  (2) non-blocking IO

  (3) IO multiplexing

  (4) signal-driven IO (not commonly used)

  (5) asynchronous IO

Before looking at these five IO models, you need to understand the following four concepts:

  synchronous, asynchronous, blocking, non-blocking

2.1 Synchronous and asynchronous

  Synchronous and asynchronous describe the message communication mechanism, that is, how the caller learns the result.

  Synchronous: when a call is made, it does not return until the result is available. Once the call returns, the caller has the return value. In other words, the caller actively waits for the result of the call.

  Asynchronous: when a call is made, it returns immediately, without the result. The callee later informs the caller through a status or notification, or handles the result through a callback function.
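The difference between the two call styles can be sketched as follows. This is only an illustration under assumptions: `boil_water` is a hypothetical function, and the standard library `concurrent.futures` thread pool is used here merely as a convenient way to get a call that returns before its result exists and reports back through a callback.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def boil_water(seconds):
    # Hypothetical slow operation.
    time.sleep(seconds)
    return 'water boiled'


# Synchronous: the call returns only once the result is available.
sync_result = boil_water(0.1)

# Asynchronous: submit() returns at once with a Future and no result yet;
# the callback is invoked later, when the result is ready.
notifications = []
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(boil_water, 0.1)
    returned_before_done = not future.done()
    future.add_done_callback(lambda f: notifications.append(f.result()))
    # The caller is free to do other work here.
# Leaving the with block waits for the worker thread to finish.

print(sync_result, returned_before_done, notifications)
```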

2.2 Blocking and non-blocking

  Blocking and non-blocking describe the state of the program while it waits for the result of a call.

  Blocking: until the result is returned, the current thread is suspended. The calling thread resumes only after it has the result.

  Non-blocking: the call does not suspend the current thread, even if the result cannot be obtained immediately.
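A non-blocking call can be observed directly on a socket. The sketch below uses a connected pair from `socket.socketpair()` purely for convenience (an assumption of the demo, not something from the text above): with nothing to read, a non-blocking `recv` raises `BlockingIOError` instead of suspending the thread.

```python
import select
import socket

# A connected pair of sockets inside one process.
a, b = socket.socketpair()

# Non-blocking mode: recv no longer suspends the thread when no data exists.
b.setblocking(False)
try:
    b.recv(1024)
    outcome = 'got data'
except BlockingIOError:
    outcome = 'would block'

# Once the peer has sent something, the same call returns the data.
a.sendall(b'ping')
ready, _, _ = select.select([b], [], [], 1.0)  # wait until b is readable
data = b.recv(1024) if ready else None

a.close()
b.close()
print(outcome, data)
```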

   A good example illustrates the relationship between these four:

    Lao Zhang loves tea, so without further ado, let's boil water. Cast: Lao Zhang and two kettles (an ordinary kettle and a whistling kettle).
   1 Lao Zhang puts the ordinary kettle on the fire and stands there waiting for the water to boil. (Synchronous blocking) Lao Zhang feels a little silly.
   2 Lao Zhang puts the ordinary kettle on the fire, goes to the living room to watch TV, and walks to the kitchen from time to time to see whether the water is boiling. (Synchronous non-blocking) Lao Zhang still feels a little silly, so he upgrades and buys a whistling kettle, which makes a loud noise once the water boils.
   3 Lao Zhang puts the whistling kettle on the fire and stands there waiting for the water to boil. (Asynchronous blocking) Lao Zhang feels that standing and waiting like this makes little sense.
   4 Lao Zhang puts the whistling kettle on the fire, goes to the living room to watch TV, does not check on it until it whistles, and then goes to fetch the kettle. (Asynchronous non-blocking) Lao Zhang thinks he is smart.

   Synchronous versus asynchronous applies only to the kettle. The ordinary kettle is synchronous; the whistling kettle is asynchronous. Both get the job done, but the whistling kettle can tell Lao Zhang that the water has boiled once its work is finished, which the ordinary kettle cannot do. With a synchronous kettle the caller has to poll (as in case 2), which makes Lao Zhang inefficient.
   Blocking versus non-blocking applies only to Lao Zhang. Lao Zhang standing and waiting is blocked; Lao Zhang watching TV is not blocked. In cases 1 and 3 Lao Zhang is blocked, and he would not even notice his wife calling him. Although the whistling kettle in case 3 is asynchronous, that is of little use to a Lao Zhang who stands there waiting. Therefore asynchronous is generally used together with non-blocking, so that the asynchrony actually pays off.
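IO multiplexing, the third model in the list at the start of this section, lets a single thread wait on several IO sources at once and handle whichever becomes ready. A minimal sketch with the standard library `selectors` module follows; the three socket pairs and the messages are assumptions made up for the demonstration.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

# Three connected pairs; each reader is registered with the selector.
pairs = [socket.socketpair() for _ in range(3)]
for index, (_writer, reader) in enumerate(pairs):
    reader.setblocking(False)
    sel.register(reader, selectors.EVENT_READ, data=index)

# Only two of the three writers send anything.
pairs[0][0].sendall(b'tea')
pairs[2][0].sendall(b'water')

# One thread waits on all readers and handles those that are ready.
received = {}
rounds_left = 10  # safety bound so the loop cannot spin forever
while len(received) < 2 and rounds_left > 0:
    rounds_left -= 1
    for key, _events in sel.select(timeout=1.0):
        received[key.data] = key.fileobj.recv(1024)

# Clean up.
for writer, reader in pairs:
    sel.unregister(reader)
    writer.close()
    reader.close()
sel.close()

print(received)
```

The silent reader (index 1) never shows up as ready, so the thread spends no time on it.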

3. Coroutine

3.1 The concept of coroutine

  The process is the smallest unit of resource allocation, and the thread is the basic unit of CPU scheduling. In CPython, because of the GIL, generally only one thread executes Python code at any given moment. To improve the efficiency of a single thread, the concept of the coroutine is introduced here.

   Coroutine: concurrency within a single thread, also known as a microthread or fiber. In one sentence: a coroutine is a lightweight user-mode thread, that is, it is controlled and scheduled by the user program itself.

   Two points need to be emphasized:

     1. Python threads are kernel-level, that is, the operating system controls their scheduling (for example, when a thread encounters IO or has run for too long, it is forced to give up the CPU so that another thread can run).

     2. When coroutines are started within a single thread, switching on IO is controlled at the application level (not by the operating system), which improves efficiency. (Note: switching on non-IO operations does nothing for efficiency.)

   Compared with thread switching controlled by the operating system, coroutine switching is controlled by the user within a single thread.

   The advantages are as follows:

     1. Switching between coroutines has lower overhead. It is a program-level switch that the operating system does not perceive at all, so it is more lightweight.

     2. Concurrency can be achieved within a single thread, making the most of the CPU.

   The disadvantages are as follows:

     1. A coroutine is essentially single-threaded, so it cannot use multiple cores by itself. A program can, however, start multiple processes, each process multiple threads, and each thread coroutines.

     2. Coroutines run within a single thread, so once one coroutine blocks, the entire thread is blocked.

  Summary of the characteristics of coroutines:

     1. Concurrency must be implemented within a single thread.

     2. Shared data can be modified without locks.

     3. The user program itself saves the context stacks of multiple control flows.

     4. When a coroutine encounters an IO operation, it automatically switches to another coroutine.
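The idea of user-level switching inside one thread can be sketched with plain Python generators. This is a toy round-robin scheduler written for illustration, not how gevent is implemented: `worker` and `run` are hypothetical names, and each `yield` stands in for the point where a real coroutine would hit IO and hand control back.

```python
from collections import deque

log = []


def worker(name, steps):
    for step in range(steps):
        log.append('%s step %d' % (name, step))
        # Give control back to the scheduler (stands in for "encountered IO").
        yield


def run(tasks):
    # Round-robin scheduler: everything happens in the calling thread.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished, drop it
        queue.append(task)      # not finished, schedule it again


run([worker('eat', 2), worker('play', 3)])
print(log)
# ['eat step 0', 'play step 0', 'eat step 1', 'play step 1', 'play step 2']
```

The two tasks interleave without any threads or locks, because the program itself decides when to switch. What this sketch lacks is the fourth point above, switching automatically on IO, which is what gevent provides.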

4. Gevent module

4.1 Basic use of gevent

  Gevent is a third-party library that makes it easy to write concurrent code, whether synchronous or asynchronous.

  g1 = gevent.spawn(func, 1, 2, 3, x=4, y=5)
     Creates a coroutine object g1. The first argument to spawn is the function (for example eat). It can be followed by any number of positional or keyword arguments, all of which are passed to that function.
  g2 = gevent.spawn(func2)
  g1.join()
     Waits for g1 to finish
  g2.join()
     Waits for g2 to finish
  Or combine the two steps above into one:
  gevent.joinall([g1, g2])
  g1.value
     Gets the return value of func
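The calls listed above can be put together in a short sketch. It assumes gevent is installed; `add` and its `scale` parameter are hypothetical names chosen only to show how positional and keyword arguments reach the function and how the return value is read back.

```python
import gevent


def add(a, b, scale=1):
    # gevent.sleep yields to other greenlets, like IO that gevent recognizes.
    gevent.sleep(0.01)
    return (a + b) * scale


# Positional and keyword arguments after the function are passed through.
g1 = gevent.spawn(add, 1, 2, scale=10)
g2 = gevent.spawn(add, 3, 4)

# Wait for both greenlets in one step.
gevent.joinall([g1, g2])

# .value holds each function's return value once the greenlet has finished.
results = [g1.value, g2.value]
print(results)  # [30, 7]
```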

   An example of gevent switching when it encounters IO:

import gevent


def eat():
    print('eat start...')
    gevent.sleep(2)  # cooperative sleep: other greenlets run meanwhile
    print('eat end.')


def play():
    print('play start...')
    gevent.sleep(2)
    print('play end.')


if __name__ == '__main__':
    g1 = gevent.spawn(eat)
    g2 = gevent.spawn(play)
    g1.join()
    g2.join()

    print('----Main-----')

   In the example above, gevent.sleep(2) simulates IO blocking that gevent can recognize. time.sleep(2) and other blocking calls cannot be recognized by gevent directly; they need to be patched with the following line of code:

  from gevent import monkey; monkey.patch_all() must be placed before the modules being patched, such as time and socket, are imported. Or simply remember: to use gevent, put from gevent import monkey; monkey.patch_all() at the beginning of the file.

from gevent import monkey; monkey.patch_all()

import gevent
import time


def eat():
    print('eat start...')
    time.sleep(2)  # patched by monkey.patch_all(), so it yields like gevent.sleep
    print('eat end.')


def play():
    print('play start...')
    time.sleep(2)
    print('play end.')


if __name__ == '__main__':
    g1 = gevent.spawn(eat)
    g2 = gevent.spawn(play)
    g1.join()
    g2.join()

    print('----Main-----')

   We can use threading.current_thread().name to inspect g1 and g2. The result is DummyThread-n, that is, a fake thread.

from gevent import monkey; monkey.patch_all()

import threading
import gevent
import time


def eat():
    print(threading.current_thread().name)
    print('eat start...')
    time.sleep(2)
    print('eat end.')


def play():
    print(threading.current_thread().name)
    print('play start...')
    time.sleep(2)
    print('play end.')


if __name__ == '__main__':
    g1 = gevent.spawn(eat)
    g2 = gevent.spawn(play)
    g1.join()
    g2.join()

    print('----Main-----')

Execution result:
DummyThread-1
eat start...
DummyThread-2
play start...
(blocked for 2 seconds)
eat end.
play end.
----Main-----

4.2 gevent Application 1: Crawler

from gevent import monkey; monkey.patch_all()

import gevent
import requests


def get(url):
    print('GET:', url)
    response = requests.get(url)
    if response.status_code == 200:
        print('%d bytes received from %s' % (len(response.text), url))


if __name__ == '__main__':
    # Spawn one greenlet per URL and wait for all of them to finish.
    gevent.joinall([
        gevent.spawn(get, 'https://www.baidu.com'),
        gevent.spawn(get, 'https://www.taobao.com'),
        gevent.spawn(get, 'https://www.jd.com'),
    ])

gevent-crawler

4.3 gevent Application 2: Network programming

   Implement single-threaded socket concurrency with gevent.
  Note: from gevent import monkey; monkey.patch_all() must be placed before the socket module is imported; otherwise gevent cannot recognize socket blocking.

from gevent import spawn, monkey; monkey.patch_all()

import socket


def server(ip_port):
    sk_server = socket.socket()
    sk_server.bind(ip_port)
    sk_server.listen(5)
    while True:
        conn, addr = sk_server.accept()
        # Handle each connection in its own greenlet.
        spawn(walk, conn)


def walk(conn):
    conn.send(b'welcome!')
    try:
        while True:
            res = conn.recv(1024)
            if not res:  # empty bytes: the client closed the connection
                break
            print(res)
            conn.send(res.upper())
    except Exception as e:
        print(e)
    finally:
        conn.close()


if __name__ == '__main__':
    server(('localhost', 8080))

server.py

import socket


sk_client = socket.socket()
sk_client.connect(('localhost', 8080))
res = sk_client.recv(1024)
print(res)
while True:
    inp = input('>>>').strip()
    if not inp:
        continue
    sk_client.send(inp.encode())
    print(sk_client.recv(1024))

client.py
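To see the single-threaded concurrency without opening several terminals, the server and a few clients can be run as greenlets in one process. The sketch below is a variation under assumptions rather than the listing above: it uses the cooperative `gevent.socket` module instead of monkey patching, binds to port 0 so the operating system picks a free port, omits the welcome message, and the helper names `handle`, `serve` and `client` are hypothetical.

```python
import gevent
from gevent import socket  # cooperative sockets, no monkey patching needed


def handle(conn):
    # Echo each message back in upper case until the client disconnects.
    try:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data.upper())
    finally:
        conn.close()


def serve(listener):
    while True:
        conn, _addr = listener.accept()
        gevent.spawn(handle, conn)  # one greenlet per connection


def client(port, message):
    sk = socket.create_connection(('127.0.0.1', port))
    try:
        sk.sendall(message)
        return sk.recv(1024)
    finally:
        sk.close()


listener = socket.socket()
listener.bind(('127.0.0.1', 0))  # port 0: let the OS choose a free port
listener.listen(5)
port = listener.getsockname()[1]

server_g = gevent.spawn(serve, listener)
clients = [gevent.spawn(client, port, m) for m in (b'hello', b'gevent')]
gevent.joinall(clients)
replies = [c.value for c in clients]

server_g.kill()   # stop the accept loop
listener.close()
print(replies)
```

Both clients are served by one thread: whenever a greenlet waits on accept, recv or connect, gevent switches to another greenlet that is ready to run.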
