Python crawler techniques

Basics include

headers: a dictionary of request headers to pass in with each request. The one below can serve as a general-purpose header; site-specific headers should be obtained by capturing the browser's request.

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate',
    'Connection': 'keep-alive',
}
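A minimal sketch of how the general headers might be combined with site-specific ones taken from a captured request. The `merge_headers` helper and the `captured` values are hypothetical and only illustrate the idea.

```python
# General-purpose defaults (a shortened copy of the headers dictionary).
DEFAULT_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0',
    'Accept-Language': 'en-US,en;q=0.5',
    'Connection': 'keep-alive',
}


def merge_headers(defaults, captured):
    """Return a new dict in which captured headers override the defaults."""
    merged = dict(defaults)
    merged.update(captured)
    return merged


# Hypothetical values copied from a captured browser request.
captured = {'Referer': 'https://www.vcg.com/', 'Accept-Language': 'zh-CN,zh;q=0.9'}
headers = merge_headers(DEFAULT_HEADERS, captured)
print(headers['Accept-Language'])  # the captured value wins over the default
```

Copying the defaults first keeps the general dictionary reusable for other sites.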

Simulated login

This example simulates a login to Visual China (vcg.com).

First submit a wrong username and password on the Visual China login page to capture the form data that is sent; it can be found with the browser's Inspect tool (developer tools).

The captured value is {'username': "*****", 'password': "*******", 'captcha': "", 'lgt': "0", 'token': ""}

Name this dictionary date.

Call post() with the login URL, the form data containing the real username and password, and the headers.
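To preview what post() would send: when `data` is a dictionary, requests form-encodes it, and the standard library's `urlencode` produces the same kind of body without touching the network. This is only a sketch; `build_login_payload` is a hypothetical helper, the credentials are placeholders, and the field names are the ones from the captured value.

```python
from urllib.parse import urlencode


def build_login_payload(username, password):
    """Fill the captured form fields with the real credentials."""
    return {
        'username': username,
        'password': password,
        'captcha': '',
        'lgt': '0',
        'token': '',
    }


date = build_login_payload('user@example.com', 'secret')  # placeholder credentials
body = urlencode(date)  # form-encoded string, comparable to the captured request body
print(body)
```

Comparing this string with the body seen in the browser's network panel is a quick way to spot a missing or misnamed field.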

Write a function to check whether cookies are returned. If none come back, capture the request again to find the values that are actually sent and use those.
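One possible shape for that check, as a sketch. The cookies on a requests response are held in a jar based on the standard library's `http.cookiejar.CookieJar`, so an empty `CookieJar` and a plain dict stand in for real responses here; `cookies_returned` is a hypothetical name.

```python
from http.cookiejar import CookieJar


def cookies_returned(cookies):
    """Return True if the cookie container holds at least one cookie."""
    return cookies is not None and len(cookies) > 0


empty_result = cookies_returned(CookieJar())            # jar with no cookies
filled_result = cookies_returned({'sessionid': 'abc'})  # dict stand-in for a filled jar
print(empty_result, filled_result)
```

If the check fails after a real login attempt, the form data or headers probably differ from what the browser sends.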

For details, see https://blog.csdn.net/churximi/article/details/50917322, which is where I learned this.

import requests

def login():
    s = requests.session()
    loginURL = "https://www.vcg.com/ajax/login/submit"  # the URL the form is POSTed to
    # Send the login information; the response carries the cookies
    login = s.post(loginURL, data=date, headers=headers)
    cookies = login.cookies
    return cookies

Get webpage

Use get() to fetch the page: pass in the URL (url or urls), headers, and a timeout. The resulting html is the page source.

table holds the matching tag found in html. If nothing matches, find() returns None, and calling find_all() on that result raises an error.
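A small guard against that None case might look like the following sketch. `safe_find_all` is a hypothetical helper, the 'option' tag name is an assumption, and `FakeTag` only stands in for a BeautifulSoup tag so the example runs without any page.

```python
def safe_find_all(tag, name):
    """Call tag.find_all(name), or return [] when find() found nothing."""
    if tag is None:
        return []
    return tag.find_all(name)


class FakeTag:
    """Stand-in for a BeautifulSoup tag, used only for this demonstration."""

    def find_all(self, name):
        return ['<%s>' % name]


print(safe_find_all(None, 'option'))       # []
print(safe_find_all(FakeTag(), 'option'))  # ['<option>']
```

Returning an empty list lets the calling loop run zero times instead of crashing when the page layout changes.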

import requests
from bs4 import BeautifulSoup

html = requests.get('https://18moe.com/category/game', headers=headers, timeout=5).text
table = BeautifulSoup(html, 'lxml').find('select', {'class': 'poi-pager__item_middle_select poi-form__control'})

Use proxy

Not used yet; to be added later.
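Until then, here is only a rough sketch of the mapping that requests accepts through its `proxies` argument. It is not something the post has tested; `build_proxies` is a hypothetical helper and the proxy address is a placeholder.

```python
def build_proxies(host, port):
    """Build a proxies mapping that routes HTTP and HTTPS through one proxy."""
    address = 'http://%s:%d' % (host, port)
    return {'http': address, 'https': address}


proxies = build_proxies('127.0.0.1', 8080)  # hypothetical local proxy
print(proxies)
# With requests installed, the mapping could then be passed along as:
# requests.get(url, headers=headers, timeout=5, proxies=proxies)
```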
