XPATH
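(1) /  Select step by step, one level at a time (direct children).
(2) text()  Extract the text inside a tag.
(3) //tag name  Extract all tags with that name, at any depth.
(4) //tag name[num]  Extract the num-th sibling node with that tag name (num >= 1; numbering starts at 1).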
xpath('/tr[@class="h"]/td[1]/text()') # Job name
xpath('/tr[@class="h"]/td[2]/text()') # Job category
xpath('/tr[@class="h"]/td[3]/text()') # Number of people
xpath('/tr[@class="h"]/td[4]/text()') # Location
(5) //tag name[@attribute='attribute value']  Extract tags whose attribute has the given value.
//a[@class='noactive']
//a[@class='noactive' and @id='next']
(6) @attribute name  Extract the value of that attribute.
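The rules above can be tried out with a small sketch. It uses Python's standard-library xml.etree.ElementTree, which supports only a subset of XPath: it has no text() step, no @attribute step and no "and" inside a predicate, so the sketch reads .text, calls .get(), and chains two predicates instead. The table markup, job names and href values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical job-listing table, invented for this example.
HTML = """
<table>
  <tr class="h"><td>Engineer</td><td>Tech</td><td>2</td><td>Shenzhen</td></tr>
  <tr class="h"><td>Designer</td><td>Design</td><td>1</td><td>Beijing</td></tr>
  <tr class="f"><td><a class="noactive" id="prev" href="#">prev</a>
      <a class="noactive" id="next" href="page2.html">next</a></td></tr>
</table>
"""

root = ET.fromstring(HTML)

# Rules (3), (4), (5): //tr[@class="h"]/td[1] picks the first td of each matching row.
# Rule (2): ElementTree has no text() step, so read the .text attribute instead.
job_names = [td.text for td in root.findall(".//tr[@class='h']/td[1]")]
locations = [td.text for td in root.findall(".//tr[@class='h']/td[4]")]

# Rule (5) with two conditions: chained predicates stand in for "and".
next_links = root.findall(".//a[@class='noactive'][@id='next']")

# Rule (6): read the attribute value with .get() instead of an @href step.
next_hrefs = [a.get("href") for a in next_links]

print(job_names, locations, next_hrefs)
```

With a full XPath 1.0 engine (for example the xpath() method used in the lines above), the text() and @attribute steps can be written directly in the expression.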
============================================================
RE
re.compile(pattern, flags=0)
flags: bit flags; combine several with bitwise OR (|)
re.I(re.IGNORECASE)
Make matching case-insensitive
re.L(re.LOCALE)
Make matching locale-aware (dependent on the current locale)
re.M(re.MULTILINE)
Multi-line mode: ^ and $ also match at the start and end of each line
re.S(re.DOTALL)
Make . match any character, including newlines
re.U(re.UNICODE)
Interpret characters according to the Unicode character set. This flag affects \w, \W, \b, \B.
re.X(re.VERBOSE)
This flag allows a more flexible layout (whitespace and comments inside the pattern), so regular expressions are easier to read.
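A short sketch of the most commonly used flags above, using only the standard re module. The sample strings and the date pattern are made up for illustration; re.L and re.U are left out because their effect depends on the locale and on the pattern type.

```python
import re

text = "Hello\nhello world\nHELLO"

# re.I: case-insensitive matching
all_hello = re.compile(r"hello", re.I).findall(text)

# re.M: ^ matches at the start of every line, not only at the start of the string
with_m = re.compile(r"^hello", re.I | re.M).findall(text)
without_m = re.compile(r"^hello", re.I).findall(text)

# re.S: . also matches the newline between the first two lines
with_s = re.compile(r"Hello.hello", re.S).search(text)
without_s = re.compile(r"Hello.hello").search(text)

# re.X: whitespace and # comments inside the pattern are ignored
date_re = re.compile(
    r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
    """,
    re.X,
)
date_parts = date_re.search("released 2024-01-15").groups()

print(all_hello, with_m, without_m, date_parts)
```

Flags are combined with |, as in re.I | re.M above.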
============================================================