How to collect the QQ numbers of QQ group members: collecting QQ numbers in batches
As we all know, QQ group members' QQ numbers cannot be exported; even group members themselves cannot export them. So how, then, can they be collected?
Web crawlers (also known as web spiders or web robots, and in the FOAF community more often called web chasers) are programs or scripts that automatically crawl information on the World Wide Web according to certain rules. Other, less commonly used names include ants, automatic indexers, emulators, and worms.
The basic workflow of a web crawler is as follows:
1. Select the seed URL;
2. Put these URLs into the URL queue to be crawled;
3. Take out a URL to be crawled from the queue, download the corresponding page, and extract new URLs from it to continue crawling (a minimal sketch follows this list).
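Concretely, this loop can be sketched in a few lines of Python; requests and beautifulsoup4 are assumed to be installed, and the seed URL and the 50-page cap are only placeholders:

from collections import deque
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

to_crawl = deque(['https://example.com'])   # steps 1-2: seed URLs go into the queue
crawled = set()

while to_crawl and len(crawled) < 50:       # cap at 50 pages for the demo
    url = to_crawl.popleft()                # step 3: take out a URL to be crawled
    if url in crawled:
        continue
    try:
        page = requests.get(url, timeout=10)
    except requests.RequestException:
        continue
    crawled.add(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    for a in soup.find_all('a', href=True): # extract new URLs and queue them
        to_crawl.append(urljoin(url, a['href']))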
Locating elements through XPath
There are several ways to locate elements using XPath; for example:
#!/usr/bin/env python
# coding: utf-8
# First import webdriver from selenium, then use webdriver to open a browser and locate elements
from selenium import webdriver
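Continuing from the import above, here is a sketch of a few common XPath strategies, assuming the Selenium 3 style find_element_by_xpath API used elsewhere in this post; the driver path, page, and selectors are illustrative and may need adjusting:

driver = webdriver.Chrome(executable_path=r'D:\Crawler Storage\chromedriver.exe')
driver.get('https://www.baidu.com')

# 1. absolute path from the document root (brittle; breaks when the layout changes)
driver.find_element_by_xpath('/html/body/div[1]/form/span/input')
# 2. relative path by tag and attribute
driver.find_element_by_xpath("//input[@id='kw']")
# 3. partial attribute match with contains()
driver.find_element_by_xpath("//input[contains(@name, 'wd')]")
# 4. by visible link text
driver.find_element_by_xpath("//a[text()='hao123']")

driver.quit()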
Everyone knows that Istio can help us implement grayscale (canary) releases, traffic monitoring, traffic management, and other functions. Each function helps us realize different business needs in different scenarios.
Open the browser from the front end with Selenium:

from selenium import webdriver
from time import sleep

# path to the local ChromeDriver binary (Selenium 3 style)
bro = webdriver.Chrome(executable_path=r'D:\Crawler Storage\chromedriver.exe')
# the URL was truncated in the original; Baidu here is only a placeholder
bro.get('https://www.baidu.com')
Today I tried using PyCharm + BeautifulSoup for a crawler test. As I understand it, there are mainly two cases: HTML written by myself, and live web pages such as Baidu's. The first case is reading the webpage content from a local string or file; a sketch of both cases follows.
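A minimal sketch of both cases, assuming requests and beautifulsoup4 are installed; the hand-written HTML and the Baidu URL are only examples:

import requests
from bs4 import BeautifulSoup

# Case 1: HTML written by myself, parsed from a string
html = "<html><body><h1>Hello</h1><a href='/a'>link</a></body></html>"
soup = BeautifulSoup(html, 'html.parser')
print(soup.h1.text)          # -> Hello

# Case 2: a live page on the web, e.g. Baidu's home page
resp = requests.get('https://www.baidu.com')
resp.encoding = 'utf-8'      # force a known encoding before parsing
soup = BeautifulSoup(resp.text, 'html.parser')
print(soup.title.string)     # prints the page title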
Boruo Big Data Computing Service Platform (BR-ODP) is a convenient, efficient, and easy-to-manage TB/PB-level data storage and computing solution. BR-ODP is based on a big data computing service platform.
Program requirements
The storage medium of a storage tank is generally a liquid or gas, which is essential and important for the petroleum, chemical, grain and oil, food, fire protection, and transportation industries.
Collect the titles and links of the Baidu search results list:

$data = QueryList::get('https://www.baidu.com/s?wd=QueryList')
    // Set collection rules
    ->rules([
        'Title' => array('h3', 'text'),
        'Link'  => array('h3>a', 'href'),
    ])
    ->query()->getData();
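Each rule takes the form name => [CSS selector, attribute], where 'text' extracts an element's inner text and 'href' extracts a link target; the h3 selectors above are illustrative and may need adjusting to Baidu's current result markup.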
The previous article roughly analyzed the fingerprint-recognition part of Spaghetti; this article takes a rough look at its crawler part.
Let's first look at urlextract.py in the extractor.
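The Spaghetti source itself is not reproduced here; as a rough sketch of what a urlextract.py-style extractor typically does, the following pulls every href/src target out of raw HTML (the regex and base URL are illustrative):

import re
from urllib.parse import urljoin

def extract_urls(html, base='https://example.com'):
    """Return absolute URLs for every href/src target found in raw HTML."""
    raw = re.findall(r'''(?:href|src)=["']([^"']+)["']''', html, re.I)
    return [urljoin(base, u) for u in raw]

print(extract_urls('<a href="/login">login</a> <img src="logo.png">'))
# -> ['https://example.com/login', 'https://example.com/logo.png']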