[Crawler] Scraping the Douban Books Top 250

Locating elements with XPath

XPath offers several ways to locate elements.

// selects every node in the document that matches the selection, starting from the current node, regardless of where the node sits in the tree.
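To make the behaviour of // concrete, here is a minimal sketch that runs without a browser. It uses Python's built-in xml.etree.ElementTree, which supports only a subset of XPath and requires paths to start with a dot. The markup in PAGE is a made-up stand-in for illustration, not Douban's real page source.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup used only for this illustration.
PAGE = """
<html><body>
  <div id="header"><h1>Site banner</h1></div>
  <div id="content">
    <div class="wrapper"><h1>Books Top 250</h1></div>
    <table>
      <tr class="item"><td>Book A</td></tr>
      <tr class="item"><td>Book B</td></tr>
      <tr class="footer"><td>ignored</td></tr>
    </table>
  </div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# "//" matches at any depth: the h1 is nested inside another div,
# yet it is still found below div#content.
nested = [h.text for h in root.findall(".//div[@id='content']//h1")]

# A single "/" only looks at direct children, so nothing matches here.
direct = [h.text for h in root.findall(".//div[@id='content']/h1")]

# "//tr[@class='item']" collects every matching row, wherever it sits.
rows = [tr.find("td").text for tr in root.findall(".//tr[@class='item']")]

print(nested)
print(direct)
print(rows)
```

The same expressions, without the leading dot, are what the Selenium calls in this post pass to the browser.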

#!/usr/bin/env python
# coding: utf-8

# Import webdriver from selenium, then use it to drive the locally installed Chrome browser.
# Note: the find_element_by_* / find_elements_by_* helpers used below were deprecated in
# Selenium 4 and later removed; on newer versions use find_element(By.XPATH, ...) instead.
from selenium import webdriver

# Open the Chrome browser
browser = webdriver.Chrome()

# Visit Douban
browser.get('https://book.douban.com/top250?icn=index-book250-all')

# Get the page title
title = browser.find_element_by_xpath("//div[@id='content']//h1").text

# Print the title
print(title)

# Get the list of element objects holding the book information on the current page (25 rows in total)
book_list = browser.find_elements_by_xpath("//tr[@class='item']")

for ele in book_list:
    print(ele.text + "\n")

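Because the page lists many books, be sure to use find_elements_by_xpath (plural), which returns a list of every matching element. find_element_by_xpath (singular) returns only the first match.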
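The singular/plural distinction can be sketched as follows, written against the Selenium 4 style of call, find_element(by, value) and find_elements(by, value). The read_page helper and the stub classes are hypothetical names introduced for this illustration; the stub only stands in for a real browser so that the sketch runs offline.

```python
# Locator strategy string; it has the same value as By.XPATH in
# selenium.webdriver.common.by, so a real Selenium 4 driver accepts it.
XPATH = "xpath"


def read_page(driver):
    """Return the page heading and the text of every book row."""
    # Singular: one element (the first match) comes back.
    title = driver.find_element(XPATH, "//div[@id='content']//h1").text
    # Plural: a list comes back, one entry per matching row.
    rows = driver.find_elements(XPATH, "//tr[@class='item']")
    return title, [row.text for row in rows]


class _StubElement:
    """Hypothetical stand-in for a Selenium WebElement."""

    def __init__(self, text):
        self.text = text


class _StubDriver:
    """Hypothetical stand-in for a browser; returns canned elements."""

    def find_element(self, by, value):
        return _StubElement("Books Top 250")

    def find_elements(self, by, value):
        return [_StubElement("Book A"), _StubElement("Book B")]


title, rows = read_page(_StubDriver())
print(title)
print(rows)
```

With a real browser, passing the object returned by webdriver.Chrome() in place of the stub should give the heading and the 25 rows of the current page.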

Turning the page

To move to the next page, first locate the "next page" element.

Use find_element_by_class_name to locate it; its class name is next.

#!/usr/bin/env python
# coding: utf-8

# Import webdriver from selenium, then use it to drive the locally installed Chrome browser.
from selenium import webdriver
import time

# Open the Chrome browser
browser = webdriver.Chrome()

# Visit Douban
browser.get('https://book.douban.com/top250?icn=index-book250-all')

for i in range(10):
    # Get the page title
    title = browser.find_element_by_xpath("//div[@id='content']//h1").text

    # Print the title
    print(title)

    # Get the list of element objects holding the book information on the current page (25 rows in total)
    book_list = browser.find_elements_by_xpath("//tr[@class='item']")

    for ele in book_list:
        print(ele.text + "\n")

    # Output the current page number
    print("------------Page %s------------" % (i + 1))

    # Next page (click() returns None, so there is nothing to assign)
    browser.find_element_by_class_name("next").click()

    # Give the next page time to load
    time.sleep(5)

    print("\n")

The time module is part of Python's standard library.

The time.sleep() function suspends the calling thread. The argument 5 in the code above delays execution for 5 seconds.

Page loading takes time. If you start locating elements immediately after clicking through to the next page, they may not have loaded yet, and the program is likely to raise an error.
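A fixed delay either wastes time or is too short on a slow connection. One common alternative is to poll until the content appears. The sketch below shows that idea in plain Python; wait_until and page_ready are hypothetical names for this illustration, and the "page" is simulated so the code runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


# Simulated stand-in for "the page has finished loading":
# the condition becomes true on the third check.
attempts = []


def page_ready():
    attempts.append(1)
    return len(attempts) >= 3


wait_until(page_ready, timeout=2.0, interval=0.01)
print(len(attempts))
```

With a real browser, the condition could be a lambda that calls find_elements_by_xpath("//tr[@class='item']"), since an empty list is falsy. Selenium also ships its own polling helper, WebDriverWait in selenium.webdriver.support.ui, which serves the same purpose.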


