Fluid Archives - Simon Technology Blog

Graduation project crawler (1)

The objects processed by the crawler are links, titles, paragraphs, and pictures.
baidu
xxxx
xxxx
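
As a rough illustration of pulling those four kinds of objects out of a page, here is a minimal sketch that uses only Python's standard-library html.parser. The class name PageExtractor, the sample HTML, and the choice to treat h1 through h6 headings as "titles" are assumptions made for this example, not details from the original post.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect links, headings, paragraphs, and image sources from HTML."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.links, self.titles, self.paragraphs, self.images = [], [], [], []
        self._target = None  # list that receives the text being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag in self.HEADINGS:
            self._target, self._buffer = self.titles, []
        elif tag == "p":
            self._target, self._buffer = self.paragraphs, []

    def handle_data(self, data):
        # Only keep text while inside a heading or paragraph
        if self._target is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._target is not None and (tag == "p" or tag in self.HEADINGS):
            text = "".join(self._buffer).strip()
            if text:
                self._target.append(text)
            self._target = None


# Hypothetical sample page, used only to exercise the parser
sample = """
<h1>Example page</h1>
<p>Some text with a <a href="https://www.baidu.com">baidu</a> link.</p>
<img src="/logo.png" alt="logo">
"""

extractor = PageExtractor()
extractor.feed(sample)
print(extractor.links)
print(extractor.titles)
print(extractor.paragraphs)
print(extractor.images)
```

A dedicated HTML library would likely handle malformed markup more gracefully; the standard-library parser is used here only to keep the sketch self-contained.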

Two types of links must be excluded:
1. Internal jump links (links that only jump to an anchor within the same page)
xxxx
2. The link
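
The first exclusion rule could be sketched as follows, assuming an internal jump link is one whose target, once resolved against the current page, differs from the page URL only by its fragment. The helper name is_internal_jump and the sample hrefs are hypothetical; the second excluded type is not spelled out in the text above, so it is not handled here.

```python
from urllib.parse import urldefrag, urljoin


def is_internal_jump(href, page_url):
    """Return True if href only jumps to an anchor inside page_url."""
    # Resolve relative hrefs, then split off the #fragment part
    target, fragment = urldefrag(urljoin(page_url, href))
    base, _ = urldefrag(page_url)
    return bool(fragment) and target == base


page_url = "https://en.wikipedia.org/wiki/Steve_Jobs"

# Hypothetical hrefs, as they might appear in the page's <a> tags
hrefs = [
    "#Early_life",
    "/wiki/Apple_Inc.",
    "https://en.wikipedia.org/wiki/Steve_Jobs#Death",
    "/wiki/NeXT#History",
]

kept = [h for h in hrefs if not is_internal_jump(h, page_url)]
print(kept)
```

Links that point to an anchor on a different page are kept, since they still lead to new content.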

September 28, 2021 | By Simo | Web Crawler First, Fluid, one, crawler

Graduation project crawler (3)

A snippet that downloads a page and saves it to a file:

import requests

# Fetch the page and print the HTTP status code
url = "https://en.wikipedia.org/wiki/Steve_Jobs"
res = requests.get(url)
print(res.status_code)

# Save the returned HTML to disk as UTF-8
with open("a.html", "w", encoding="utf-8") as f:
    f.write(res.text)

September 28, 2021 | By Simo | Web Crawler First, Fluid, crawler, three