1. Scrapy (Python crawler) 2. PySpider (Python crawler) 3. Crawler4j (Java stand-alone crawler) 4. WebMagic (Java stand-alone crawler) 5. WebCollector (Java stand-alone crawler) 6. Heritrix (Java crawler)
Category: Industry
Enterprise application software is more than just software: it is the concrete, logical, and behavioral embodiment of enterprise-management theory and experience. Designing and developing enterprise application software means studying the most advanced management models and processes, many of which are management practices already proven effective by most companies. This management experience is embedded in the software's design ideas, workflows, report content, statistical-analysis items, management levels, and information-based decision-making.
A simple example of using phpspider to collect this blog's article content
Collection process
Fetch the page content for each link (with curl) -> extract the content to be collected (filtered by regular expressions, XPath, CSS selectors, etc.)
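The two-step flow above (fetch, then extract) can be sketched in Python with only the standard library. The markup, class names, and regular expression below are assumptions for illustration, so a canned HTML string stands in for the body a live curl/urllib fetch would return.

```python
import re

# Canned page standing in for the body returned by a curl/urllib fetch;
# the markup and tag names are assumptions for illustration.
html = """
<div class="post"><h2>First article</h2></div>
<div class="post"><h2>Second article</h2></div>
"""

# Step 2 of the flow: extract the target content, here with a regex
# (XPath or CSS selectors would do the same job with lxml installed).
titles = re.findall(r"<h2>(.*?)</h2>", html)
print(titles)  # ['First article', 'Second article']
```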
Crawler: Scrapy request parameter passing, POST requests, and middleware
To make a POST request in a Scrapy spider, override start_requests:
def start_requests(self):
Pass in the form parameters and return: yield scrapy.FormRequest(url=url, formdata=data, callback=self.parse)
Make a
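Under the hood, scrapy.FormRequest url-encodes the formdata dict into an application/x-www-form-urlencoded POST body. A minimal stdlib sketch of that encoding step (the field names here are invented for illustration; this is not Scrapy itself):

```python
from urllib.parse import urlencode

# Hypothetical form fields; scrapy.FormRequest(url, formdata=data, ...)
# encodes them like this for the POST body.
data = {"page": "2", "keyword": "python"}
body = urlencode(data)
print(body)  # page=2&keyword=python
```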
Crawler tips
First of all, which Python crawler modules have you used? I believe most people will answer requests or scrapy. But for simple crawlers, we habitually use
Remote data acquisition and PLC program monitoring for integrated domestic-waste processing
Project background
Information technology is needed to establish an intelligent monitoring system for domestic-waste transportation and processing, to realize remote centralized monitoring
Crawler: a solution for lazily loaded images
Handling dynamically loaded data
I. Lazy loading of images
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests
from lxml import etree
if __name__ == "__main__":
url =
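The snippet above is cut off, so as a self-contained illustration of the usual fix: lazy-loading pages put the real image URL in a placeholder attribute (commonly data-src, or src2 on some Chinese sites) and fill src via JavaScript, so the crawler must read the placeholder when it is present. The attribute name and sample markup below are assumptions, and stdlib html.parser stands in for lxml so the sketch runs anywhere.

```python
from html.parser import HTMLParser

# Collect image URLs, preferring the lazy-load placeholder attribute
# (data-src is an assumed name; inspect the target page to confirm).
class ImgCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            d = dict(attrs)
            # Prefer the placeholder, fall back to the plain src.
            self.urls.append(d.get("data-src") or d.get("src"))

html = '<img src="loading.gif" data-src="http://example.com/real.jpg">'
p = ImgCollector()
p.feed(html)
print(p.urls)  # ['http://example.com/real.jpg']
```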
Crawler performance analysis and optimization
Two days ago we wrote a single-task version of the crawler to fetch user information from Zhenai.com. How does it perform?
We can take a look at the network utilization. We can see t
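The usual optimization for a single-task crawler is to overlap the network waits by fetching concurrently. A minimal Python sketch with a thread pool; the sleep stands in for network latency and the URLs are invented:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder for a real network request; the delay stands in
    # for round-trip latency.
    time.sleep(0.1)
    return f"body of {url}"

urls = [f"http://example.com/user/{i}" for i in range(10)]

start = time.time()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch, urls))
elapsed = time.time() - start

# Ten 0.1 s "requests" overlap, so wall time stays far below the
# 1.0 s a sequential loop would need.
print(len(results), round(elapsed, 2))
```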
Crawler: requests usage
Chinese API documentation: http://requests.kennethreitz.org/zh_CN/latest/
Installation
pip install requests
Fetch a web page
# coding=utf-8
import requests
response = requests.get('ht
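The snippet above is truncated, so here is a complete, self-contained version of the same fetch. To keep it runnable without external network access it starts a tiny local HTTP server first; the handler and its "hello" body are assumptions for the demo.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests  # third-party: pip install requests

# Tiny local server so the example works offline; in real use you
# would point requests.get at the page you want to crawl.
class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Hello)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

response = requests.get(f"http://127.0.0.1:{server.server_port}/")
print(response.status_code, response.text)  # 200 hello
server.shutdown()
```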
Understand how crawlers work
If we compare the Internet to a big spider web, then the data is stored in the web's various nodes, and the crawler is a little spider,
Crawling its o
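That "spider walking the web" picture is just a graph traversal: keep a frontier of URLs to visit and a set of already-visited ones. A minimal sketch, assuming the link graph is given as a dict instead of being fetched from live pages:

```python
from collections import deque

# Toy link graph standing in for real pages; in a live crawler each
# lookup would be "fetch the page and extract its links".
links = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}

def crawl(start):
    visited = set()
    frontier = deque([start])
    order = []
    while frontier:
        url = frontier.popleft()
        if url in visited:
            continue  # never fetch the same node twice
        visited.add(url)
        order.append(url)
        # Enqueue newly discovered links (breadth-first).
        frontier.extend(links.get(url, []))
    return order

print(crawl("/"))  # ['/', '/a', '/b', '/c']
```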
Amazon Web Services: how to view or calculate the request rate of an AWS S3 bucket?
I am trying to determine the current request rate of an existing AWS bucket to see how close I am to the standard request limit of 100 QPS on an S3 bucket. Ideally, I hope to see
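One way to measure this yourself is to enable S3 server access logging and count requests per second from the log timestamps. A sketch, assuming the usual bracketed `[06/Feb/2019:00:00:38 +0000]` timestamp field of S3 access logs and using canned lines in place of real log objects:

```python
import re
from collections import Counter

# Canned S3 server-access-log fragments; only the bracketed timestamp
# field matters here, the rest of each real line is elided.
log_lines = [
    "owner bucket [06/Feb/2019:00:00:38 +0000] 1.2.3.4 ...",
    "owner bucket [06/Feb/2019:00:00:38 +0000] 1.2.3.4 ...",
    "owner bucket [06/Feb/2019:00:00:39 +0000] 1.2.3.4 ...",
]

# Count requests per whole second, then take the peak as observed QPS.
stamp = re.compile(r"\[([^\]]+)\]")
per_second = Counter(stamp.search(line).group(1) for line in log_lines)
peak_qps = max(per_second.values())
print(peak_qps)  # 2
```

CloudWatch request metrics for S3 (once enabled on the bucket) give the same numbers without log parsing.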