[Crawler] Scraping the Douban Books TOP250

Locating elements with XPath

XPath offers several ways to locate elements.

// selects the nodes below the current node that match the selection, no matter where in the document they are.
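To make the behaviour of // concrete, here is a minimal sketch that runs without a browser. It uses Python's standard xml.etree.ElementTree module instead of Selenium, and the PAGE markup is a made-up, simplified stand-in for the real page, so treat it only as an illustration. ElementTree supports a subset of XPath and expects relative paths to start with a dot.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page structure (not the real Douban markup).
PAGE = """
<html>
  <body>
    <div id="content">
      <h1>Douban Books Top 250</h1>
      <div class="article">
        <table><tr class="item"><td>Book A</td></tr></table>
        <table><tr class="item"><td>Book B</td></tr></table>
      </div>
    </div>
    <div id="footer"><h1>Footer heading</h1></div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# "//" matches at any depth: the h1 is found below div#content,
# while the h1 inside div#footer is not selected.
headings = [h.text for h in root.findall(".//div[@id='content']//h1")]

# The tr elements sit several levels down, yet no intermediate tags are spelled out.
rows = [td.text for td in root.findall(".//tr[@class='item']/td")]

print(headings)
print(rows)
```

The same expressions, without the leading dot, are what the Selenium calls in this post pass to the browser.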


#!/usr/bin/env python
# coding: utf-8
# First import webdriver from selenium, then use it to launch the installed Chrome browser.
# Note: the find_element_by_* helpers were removed in later Selenium 4 releases;
# there, use browser.find_element("xpath", ...) and browser.find_elements("xpath", ...).
from selenium import webdriver

# Open the Chrome browser
browser = webdriver.Chrome()
# Visit Douban
browser.get('https://book.douban.com/top250?icn=index-book250-all')

# Get the page title
title = browser.find_element_by_xpath("//div[@id='content']//h1").text
# Print the title
print(title)
# Get the list of element objects holding the book information on the current page (25 rows in total)
book_list = browser.find_elements_by_xpath("//tr[@class='item']")
for ele in book_list:
    print(ele.text + " ")

Because the page holds many book entries, note that the plural find_elements_by_xpath is used here: it returns a list of all matching elements, not just the first one.
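The difference between the singular and plural lookups can be sketched as follows. This is only an illustration under stated assumptions: StubDriver and StubElement are hypothetical stand-ins so the example runs without a browser, and the calls use the Selenium 4 style find_element(by, value) / find_elements(by, value), where "xpath" is the string value of Selenium's By.XPATH constant.

```python
# Hypothetical stand-ins so the sketch runs without a browser.
# With Selenium installed, a real webdriver.Chrome() instance would be passed instead.
class StubElement:
    def __init__(self, text):
        self.text = text


class StubDriver:
    def __init__(self, texts):
        self._elements = [StubElement(t) for t in texts]

    def find_element(self, by, value):
        # Selenium raises NoSuchElementException here; LookupError stands in for it.
        if not self._elements:
            raise LookupError(value)
        return self._elements[0]

    def find_elements(self, by, value):
        # No match is not an error for the plural form: it yields an empty list.
        return list(self._elements)


ROW_XPATH = "//tr[@class='item']"


def first_row_text(driver):
    # Singular lookup: exactly one element (the first match).
    return driver.find_element("xpath", ROW_XPATH).text


def all_row_texts(driver):
    # Plural lookup: a list of every match, possibly empty.
    return [el.text for el in driver.find_elements("xpath", ROW_XPATH)]


driver = StubDriver(["Book A", "Book B", "Book C"])
print(first_row_text(driver))
print(all_row_texts(driver))
print(all_row_texts(StubDriver([])))
```

Looping over the plural result is what lets the script print every book row on the page.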

Turning the page

Locate the "next page" element.

Use find_element_by_class_name to locate it.

#!/usr/bin/env python
# coding: utf-8
# First import webdriver from selenium, then use it to launch the installed Chrome browser.
from selenium import webdriver
import time

# Open the Chrome browser
browser = webdriver.Chrome()
# Visit Douban
browser.get('https://book.douban.com/top250?icn=index-book250-all')

for i in range(10):
    # Get the page title
    title = browser.find_element_by_xpath("//div[@id='content']//h1").text
    # Print the title
    print(title)
    # Get the list of element objects holding the book information on the current page (25 rows in total)
    book_list = browser.find_elements_by_xpath("//tr[@class='item']")
    for ele in book_list:
        print(ele.text + " ")
    # Output the current page number
    print("------------Page %s------------" % (i + 1))

    # Go to the next page (click() returns None, so there is nothing to assign)
    browser.find_element_by_class_name("next").click()
    # Give the next page time to load
    time.sleep(5)
    print(" ")

The time library is part of Python's standard library.

The time.sleep() function suspends the calling thread. The 5 in the code means execution is delayed for 5 seconds.

Page loading takes time. If you start locating elements immediately after clicking the next-page link, the elements may not have loaded yet, and the program is likely to raise an error.
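A fixed 5-second sleep either wastes time or is too short on a slow connection. A common alternative is to poll until the elements actually appear. Selenium ships WebDriverWait for this purpose; the wait_until helper below is a hypothetical, standard-library-only sketch of the same polling idea, with a simulated "page" in place of a real browser.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


# Simulated page load: the rows become available 0.3 s from now.
ready_at = time.monotonic() + 0.3


def rows_loaded():
    # Stand-in for a lookup such as "find all tr elements with class 'item'".
    return ["row"] * 25 if time.monotonic() >= ready_at else []


rows = wait_until(rows_loaded, timeout=5, interval=0.1)
print(len(rows))
```

With a real driver, the condition would be a function that performs the element lookup and returns the list of rows, so the loop continues as soon as the page is ready and does not always wait the full delay.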
