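This crawler fetches the list of libraries from the official package repository (PyPI). To fetch the libraries for a different keyword, just change the corresponding parameters.
The code is as follows: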
    from urllib.parse import urljoin

    import requests
    from lxml import etree

    SEARCH_URL = "https://pypi.org/search/"


    def get_data(page_num, key, file_name):
        """
        Crawl PyPI search results and append them to a file.

        page_num: number of result pages to crawl
        key: search keyword
        file_name: file the results are appended to
        """
        # Browser-like headers. The session cookie and the garbled sec-ch-ua
        # values from the original browser capture are omitted.
        headers = {
            "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "accept-language": "zh-CN,zh;q=0.9",
            "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
        }
        # Open the output file once, in append mode; it is closed automatically.
        with open(file_name, "a", encoding="utf-8") as fp:
            # Pages are 1-based, so include page_num itself.
            for i in range(1, page_num + 1):
                params = {"q": key, "page": str(i)}
                response = requests.get(SEARCH_URL, headers=headers, params=params, timeout=10)
                # Stop once the site no longer returns a result page.
                if response.status_code != 200:
                    break
                tree = etree.HTML(response.text)
                for li in tree.xpath('//ul[@class="unstyled"]/li'):
                    url = urljoin(SEARCH_URL, li.xpath("./a/@href")[0])
                    # Name and version are the first two <span>s inside the <h3>.
                    name = li.xpath("./a/h3/span[1]/text()")[0] + " " + li.xpath("./a/h3/span[2]/text()")[0]
                    try:
                        description = li.xpath("./a/p/text()")[0]
                    except IndexError:
                        description = "简介无内容"  # "no summary available"
                    print(name)
                    # Labels: 库名 = name, 库链接 = link, 库简介 = summary.
                    fp.write(f"库名:{name}\n库链接:{url}\n库简介:{description}\n")


    if __name__ == "__main__":
        get_data(80, "selenium", "a.txt")
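To make "change the corresponding parameters" concrete, here is a minimal sketch of how the keyword and page number end up in the request URL, and how a relative link from a result page is resolved. The helper names `build_search_url` and `absolute_link` are hypothetical and not part of the script above, and the `/project/selenium/` href is an assumed form inferred from the links in the output.

```python
from urllib.parse import urlencode, urljoin

SEARCH_URL = "https://pypi.org/search/"


def build_search_url(key: str, page: int) -> str:
    """Return the search URL for the given keyword and page number."""
    return SEARCH_URL + "?" + urlencode({"q": key, "page": page})


def absolute_link(href: str) -> str:
    """Resolve an href taken from a result page against the search URL."""
    return urljoin(SEARCH_URL, href)


if __name__ == "__main__":
    print(build_search_url("selenium", 1))
    print(absolute_link("/project/selenium/"))
```

Changing the keyword argument is all that is needed to crawl a different set of libraries; the rest of the request stays the same.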
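Sample output written to a.txt (库名 = library name, 库链接 = library link, 库简介 = library summary; 简介无内容 means no summary was found):
库名:selenium-oxide 1.0.0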
库链接:https://pypi.org/project/selenium-oxide/
库简介:A Selenium boilerplate for automating web exploits. Use responsibly and ethically.
库名:play-selenium 0.0.3
库链接:https://pypi.org/project/play-selenium/
库简介:pytest plugin that let you drive a browser with Selenium
库名:comun-selenium 0.3
库链接:https://pypi.org/project/comun-selenium/
库简介:简介无内容
库名:selenium-cmd 0.0.2
库链接:https://pypi.org/project/selenium-cmd/
库简介:Tool to control Selenium from command line
库名:selenium-chromedriver 1.3
库链接:https://pypi.org/project/selenium-chromedriver/
库简介:Install Stable version Chromedriver for Selenium on Windows, MacOS, M1 MacOS and Linux
库名:selenium-ai 0.1.2
库链接:https://pypi.org/project/selenium-ai/
库简介:AI utility functions for webscraping and selenium automation scripts
库名:selenium-stealth 1.0.6
库链接:https://pypi.org/project/selenium-stealth/
库简介:Trying to make python selenium more stealthy.
库名:edc-selenium 0.1.3
库链接:https://pypi.org/project/edc-selenium/
库简介:TestCaseMixins for selenium tests on the clinicedc/edc
库名:ctreport-selenium 1.1.4
库链接:https://pypi.org/project/ctreport-selenium/
库简介:ctreport-selenium is a simple, creative and a customizable report for automation testing using Python. Best suitable for pytest, unit test and nose framework.
库名:webtest-selenium 0.1
库链接:https://pypi.org/project/webtest-selenium/
库简介:Selenium testing with WebTest
库名:antiblock-selenium 0.0.3
库链接:https://pypi.org/project/antiblock-selenium/
库简介:Selenium Firefox e Chrome webdrivers com alguns mecanismos antibloqueios
库名:masonite-selenium 0.0.3
库链接:https://pypi.org/project/masonite-selenium/
库简介:Selenium Testing Package
库名:selenium-extension 1.9.1
库链接:https://pypi.org/project/selenium-extension/
库简介:Provides additional methods for selenium automation
库名:selenium-youtube 2.0.29
库链接:https://pypi.org/project/selenium-youtube/
库简介:selenium_youtube
库名:selenium-yaml 1.0.96
库链接:https://pypi.org/project/selenium-yaml/
库简介:Selenium bots using YAML
库名:buster-selenium 0.1
库链接:https://pypi.org/project/buster-selenium/
库简介:Manage buster.js slave browsers using selenium.
库名:bonobo-selenium 0.1.1
库链接:https://pypi.org/project/bonobo-selenium/
库简介:Bonobo Selenium Extension
库名:Easy2Selenium 0.0.1
库链接:https://pypi.org/project/easy2selenium/
库简介:Library for easy use of Selenium.
库名:selenium-account 0.2.1
库链接:https://pypi.org/project/selenium-account/
库简介:selenium_account
库名:dd-selenium 0.0.4
库链接:https://pypi.org/project/dd-selenium/
库简介:Providing drag and drop functionality in selenium python.
库名:selenium 4.3.0
库链接:https://pypi.org/project/selenium/
库简介:简介无内容
库名:amazon-selenium 0.1.2
库链接:https://pypi.org/project/amazon-selenium/
库简介:amazon_selenium
库名:selenium-wire 4.6.5
库链接:https://pypi.org/project/selenium-wire/
库简介:Extends Selenium to give you the ability to inspect requests made by the browser.
库名:selenium-configurator 0.1.1
库链接:https://pypi.org/project/selenium-configurator/
库简介:Helper API to define multiple webdrivers in config files.
库名:selenium-firefox 2.0.7
库链接:https://pypi.org/project/selenium-firefox/
库简介:selenium_firefox
库名:auto-selenium 1.0.1
库链接:https://pypi.org/project/auto-selenium/
库简介:Python tool to automate the download of Selenium Web Drivers for all browsers
库名:selenium-components 0.2
库链接:https://pypi.org/project/selenium-components/
库简介:Page objects for common components
库名:selenium_wrapper 0.1
库链接:https://pypi.org/project/selenium-wrapper/
库简介:Selenium driver wrapper and screenshots nosetests plugin
库名:fasttest-selenium 0.1.6
库链接:https://pypi.org/project/fasttest-selenium/
库简介:WEB自动化快速编写工具
库名:selenium-pinterest 0.0.84
库链接:https://pypi.org/project/selenium-pinterest/
库简介:Selenium Pinterest helps you follow / unfollow / pin / post to Pinterest
库名:selenium-generator 0.3
库链接:https://pypi.org/project/selenium-generator/
库简介:A framework for automated generating of Selenim WebDriver tests from yaml based on unittest framework.
库名:Selenium-Screenshot 1.7.0
库链接:https://pypi.org/project/selenium-screenshot/
库简介:This package is used to Clipped Images of Html Elements of Selenium Webdriver
库名:django-selenium 0.9.8
库链接:https://pypi.org/project/django-selenium/
库简介:Django Selenium Integration
库名:selenium-robot 0.0.6
库链接:https://pypi.org/project/selenium-robot/
库简介:This is a robot description base selenium.
库名:datetime-selenium 0.0.0
库链接:https://pypi.org/project/datetime-selenium/
库简介:Send and receive datetime objects from web forms.
库名:selenium-findtext 0.1.5
库链接:https://pypi.org/project/selenium-findtext/
库简介:Text finding helpers for Selenium tests
库名:selenium-elements 0.0.2
库链接:https://pypi.org/project/selenium-elements/
库简介:Page object model made easy.
库名:nose-selenium 0.07
库链接:https://pypi.org/project/nose-selenium/
库简介:Control the WebDriver instance in your scripts with command-line options
库名:selenium-toolkit 0.0.2
库链接:https://pypi.org/project/selenium-toolkit/
库简介:this is not a awesome description
库名:lemoncheesecake-selenium 0.1.0
库链接:https://pypi.org/project/lemoncheesecake-selenium/
库简介:Test Storytelling for Selenium
库名:selenium-probes 0.1.0
库链接:https://pypi.org/project/selenium-probes/
库简介:A framework for building Selenium-based probes.
库名:snapshot-selenium 0.0.2
库链接:https://pypi.org/project/snapshot-selenium/
库简介:Render echarts using selenium
库名:types-selenium 3.141.9
库链接:https://pypi.org/project/types-selenium/
库简介:Typing stubs for selenium
库名:eyes-selenium 5.8.0
库链接:https://pypi.org/project/eyes-selenium/
库简介:Applitools Python SDK. Selenium package
库名:selenium-update 0.0.0
库链接:https://pypi.org/project/selenium-update/
库简介:Automate Selenium webdriver dependency set up
库名:selenium-screenshots 0.1.3
库链接:https://pypi.org/project/selenium-screenshots/
库简介:python package which helps to create many screenshots for selenium.webdriver
库名:selenium-recaptcha 0.0.1
库链接:https://pypi.org/project/selenium-recaptcha/
库简介:reCaptcha v2 solver for selenium
库名:gocept.selenium 7.1
库链接:https://pypi.org/project/gocept-selenium/
库简介:Test-friendly Python API for Selenium and integration with web application frameworks.
库名:selenium-docker 0.5.0
库链接:https://pypi.org/project/selenium-docker/
库简介:Additional selenium drivers that utilize docker containers for their UI.
库名:selenium-requests 2.0.0
库链接:https://pypi.org/project/selenium-requests/
库简介:Extends Selenium WebDriver classes to include the request function from the Requests library, while doing all the needed cookie and request headers handling.
库名:zc.selenium 1.2.1
库链接:https://pypi.org/project/zc-selenium/
库简介:Selenium integration for Zope 3
库名:scrapy-selenium 0.0.7
库链接:https://pypi.org/project/scrapy-selenium/
库简介:Scrapy with selenium
库名:selenium-chrome 0.0.29
库链接:https://pypi.org/project/selenium-chrome/
库简介:selenium_chrome
库名:selenium-actions 1.0.5
库链接:https://pypi.org/project/selenium-actions/
库简介:Selenium Actions - Action Based Selenium library - selenium in more accessible way
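
Because the script opens the output file in append mode, running it more than once can append the same records again. The following is a minimal sketch, not part of the original script, that reads the three-line records back and keeps only the first occurrence of each link. The function name `parse_records` and the English field names are hypothetical; it also assumes every record ends with its 库简介 line, as the script writes it.

```python
from typing import Dict, List

# Labels written by the script, mapped to English field names.
LABELS = {"库名:": "name", "库链接:": "url", "库简介:": "description"}


def parse_records(text: str) -> List[Dict[str, str]]:
    """Parse three-line records and drop repeats, keeping first occurrences."""
    records: List[Dict[str, str]] = []
    seen_urls = set()
    current: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        for label, field in LABELS.items():
            if line.startswith(label):
                current[field] = line[len(label):]
                break
        # A record is complete once its summary line has been read.
        if "description" in current:
            url = current.get("url")
            if url not in seen_urls:
                seen_urls.add(url)
                records.append(current)
            current = {}
    return records
```

A possible use is to read the file with `open("a.txt", encoding="utf-8").read()`, pass the text to `parse_records`, and write the resulting list back out in whatever format is convenient.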