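While following the course "Python超强爬虫8天速成(完整版)爬取各种网站数据实战案例", Day 7 - 06 (headless browser + avoiding detection), I ran into some problems with the code the instructor demonstrated. Below are the problems and how I worked through them, shared for reference and for any corrections. The instructor's code: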
- from selenium import webdriver
- from time import sleep
- from selenium.webdriver.chrome.options import Options
- from selenium.webdriver import ChromeOptions
-
- # non visual interface
- chrome_options = Options()
- chrome_options.add_argument('--headless')
- chrome_options.add_argument('--disable-gpu')
-
- # avoid detection risks
- option = ChromeOptions()
- option.add_experimental_option('excludeSwitches', ['enable-automation'])
-
-
- driver = webdriver.Chrome(executable_path='./chromedriver.exe', chrome_options=chrome_options, options=option)
-
- driver.get('https://www.baidu.com')
- # get page source
- print(driver.page_source)
- sleep(2)
- driver.quit()
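I started out on Selenium 3.7, where this code fails with `TypeError: __init__() got an unexpected keyword argument 'options'`, meaning that version's `webdriver.Chrome` constructor does not accept an `options` keyword. As a beginner I found this confusing and could not find a suitable fix online, so I tried upgrading Selenium to version 4.1.0. The error went away, but two warnings appeared instead.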
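Before upgrading, it can help to confirm which Selenium version is actually installed. The following is a minimal sketch using only the standard library. The helper names (`parse_major_minor`, `supports_options_keyword`, `installed_selenium_version`) are hypothetical, and the 3.8 threshold is an assumption based on the error only showing up on 3.7 here; check the Selenium changelog for the exact release.

```python
import re
from importlib import metadata

# Assumption: the `options` keyword became available around Selenium 3.8.
# Verify this against the Selenium changelog before relying on it.
OPTIONS_KEYWORD_SINCE = (3, 8)


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '4.1.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_options_keyword(version: str) -> bool:
    """Return True if this version is expected to accept `options=`."""
    return parse_major_minor(version) >= OPTIONS_KEYWORD_SINCE


def installed_selenium_version():
    """Return the installed Selenium version, or None if it is not installed."""
    try:
        return metadata.version("selenium")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_selenium_version()
    if version is None:
        print("selenium is not installed")
    else:
        print(version, "accepts options=:", supports_options_keyword(version))
```

Comparing tuples rather than strings keeps versions such as 3.141 ordered correctly after 3.8.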
01: `DeprecationWarning: executable_path has been deprecated, please pass in a Service object`, raised by `driver = webdriver.Chrome(executable_path='./chromedriver.exe')`
Fix:
- from selenium import webdriver
- from selenium.webdriver.chrome.service import Service
-
- # Create a Service object that points at the ChromeDriver binary
- service = Service('./chromedriver.exe')
-
- # Initialize the Chrome WebDriver through the Service object
- driver = webdriver.Chrome(service=service)
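One thing to watch with `Service('./chromedriver.exe')` is that the relative path is resolved against the current working directory, not the script's location. The sketch below shows one way to resolve the path explicitly and fail early if the binary is missing. `resolve_driver_path` and `build_driver` are hypothetical helper names, and placing the driver next to the script is an assumption about the project layout.

```python
from pathlib import Path


def resolve_driver_path(filename: str, base_dir: Path) -> str:
    """Return an absolute path to the driver binary, or raise if it is missing."""
    candidate = (base_dir / filename).resolve()
    if not candidate.is_file():
        raise FileNotFoundError(f"ChromeDriver not found at {candidate}")
    return str(candidate)


def build_driver(filename: str = "chromedriver.exe"):
    """Create a Chrome WebDriver from a driver binary stored next to this script."""
    # Imported lazily so the path helper can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Assumption: the driver binary lives in the same directory as this script.
    base_dir = Path(__file__).resolve().parent
    service = Service(resolve_driver_path(filename, base_dir))
    return webdriver.Chrome(service=service)
```

With this in place, a missing or misplaced `chromedriver.exe` shows up as a clear `FileNotFoundError` before Selenium tries to start the driver.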
02: `DeprecationWarning: use options instead of chrome_options`, raised by `driver = webdriver.Chrome(service=service, chrome_options=chrome_options, options=option)`
Both `chrome_options` and `option` would have to be passed as `options`, and at first I did not know how to handle that. In the end I tried putting both the headless settings and the anti-detection settings into a single `Options` object. This works because `ChromeOptions` is the same class as `Options`, just imported under another name. In my testing, both running in the background and avoiding detection took effect.
Final code
- from selenium import webdriver
- from time import sleep
- from selenium.webdriver.chrome.options import Options
- from selenium.webdriver.chrome.service import Service
-
- chrome_options = Options()
- # non visual interface
- chrome_options.add_argument('--headless')
- chrome_options.add_argument('--disable-gpu')
- # avoid detection risks
- chrome_options.add_experimental_option('excludeSwitches', ['enable-automation'])
-
-
- # Create a Service object that points at the ChromeDriver binary
- service = Service('./chromedriver.exe')
- # Initialize the Chrome WebDriver through the Service object
- driver = webdriver.Chrome(service=service, options=chrome_options)
-
- driver.get('https://www.baidu.com')
- print(driver.page_source)
- sleep(2)
- driver.quit()
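To check whether the anti-detection setting really has an effect, one option is to read the values a page can inspect from JavaScript. This is a rough sketch, not a complete detection test. `detection_report` and `looks_automated` are hypothetical helper names; `execute_script` is the standard WebDriver method. Treating `navigator.webdriver` and a "HeadlessChrome" marker in the user agent as the signals is an assumption, and whether `excludeSwitches` alone changes them may depend on the Chrome version.

```python
def detection_report(driver) -> dict:
    """Read two values from the page that a site could use to spot automation."""
    return {
        "navigator_webdriver": driver.execute_script("return navigator.webdriver"),
        "user_agent": driver.execute_script("return navigator.userAgent"),
    }


def looks_automated(report: dict) -> bool:
    """Flag the session if either signal suggests an automated browser."""
    # Assumption: headless Chrome advertises itself with "HeadlessChrome"
    # in the user agent string; other browsers or modes may differ.
    return bool(report["navigator_webdriver"]) or "HeadlessChrome" in report["user_agent"]
```

A possible use is to call `print(detection_report(driver))` right after `driver.get(...)` in the final code, once with and once without the `excludeSwitches` line, and compare the two results.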
Looking forward to your feedback...