
For Python Automation I Chose DrissionPage and Dropped Selenium

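DrissionPage is a Python-based web automation tool.

It can control a browser, send and receive data packets, and combine the two in a single workflow.

This gives it the convenience of browser automation together with the efficiency of requests.

It is feature-rich, with many user-friendly design touches and convenience functions built in.

Its syntax is concise and elegant, it needs little code, and it is friendly to beginners.

Below is the code I wrote with DrissionPage to extract watermark-free Douyin videos: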

douyin.py:

# ---encoding:utf-8---
# @Time : 2024/1/13 16:43
# @Author : stzz Wang
# @Email :1050100468@qq.com
# @Site :
# @File : douyin.py
# @Project : douyi_analysis
# @Software: PyCharm
import os
import sys

# Add the project root (four dirname() steps up from this file) to sys.path
# so that the CODES package can be imported.
BASE_DIR = os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
sys.path.append(BASE_DIR)

from DrissionPage import ChromiumOptions, SessionOptions, WebPage
from CODES.config.CONFIG import *  # provides Config


class DouYin:
    def __init__(self):
        # Browser and session options are both read from the same ini file.
        co = ChromiumOptions(ini_path=Config.drission_page_init_file_path)
        so = SessionOptions(ini_path=Config.drission_page_init_file_path)
        self.page = WebPage(chromium_options=co, session_or_options=so)

    def start_listen(self):
        # Begin capturing network packets.
        self.page.listen.start()

    def end_listen(self):
        # Pause the listener first (True asks it to discard captured packets), then stop it.
        self.page.listen.pause(True)
        self.page.listen.stop()

    def load_page(self, url):
        self.page.get(url)

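The start_listen / end_listen pair above has to be balanced by whoever calls it. One way to guarantee that, offered here as a sketch and not as something taken from the original project, is a small context manager. The name `listening` is hypothetical; it relies only on the two methods that DouYin already defines, and `FakeClient` is a stand-in so the call order can be shown without opening a browser.

```python
from contextlib import contextmanager


@contextmanager
def listening(client):
    """Start packet listening on entry and always stop it on exit.

    `client` is any object exposing start_listen() and end_listen(),
    such as the DouYin class above. The helper name is hypothetical.
    """
    client.start_listen()
    try:
        yield client
    finally:
        # Runs even if the body raises, so the listener is never left on.
        client.end_listen()


class FakeClient:
    """Stand-in that records calls instead of driving a real browser."""

    def __init__(self):
        self.calls = []

    def start_listen(self):
        self.calls.append("start")

    def end_listen(self):
        self.calls.append("end")


if __name__ == "__main__":
    fake = FakeClient()
    with listening(fake):
        fake.calls.append("work")
    print(fake.calls)
```

With the real class, the same pattern would read `with listening(douyin_instance): ...`, which removes the need for a separate finally block in the caller.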
douyin_without_watermarker_analysis.py:

# ---encoding:utf-8---
# @Time : 2024/1/13 16:53
# @Author : stzz Wang
# @Email :1050100468@qq.com
# @Site :
# @File : douyin_without_watermarker_analysis.py
# @Project : douyi_analysis
# @Software: PyCharm
import json
import os
import sys
import time

BASE_DIR = os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
sys.path.append(BASE_DIR)

from fastapi import APIRouter
from pydantic import BaseModel
from CODES.controllers.model.douyin import *  # provides DouYin
from CODES.config.CONFIG import *

douyin_wwa = APIRouter()
douyin_instance = DouYin()


class DouYinWithoutWatermarker(BaseModel):
    url: str


@douyin_wwa.post("/douyin_without_watermarker_analysis")
async def douyin_without_watermarker_analysis(accept: DouYinWithoutWatermarker):
    # Start listening before loading the page so that early requests are not missed.
    douyin_instance.start_listen()
    douyin_instance.load_page(accept.url)
    page = douyin_instance.page
    start_time = time.time()
    try:
        while True:
            res = page.listen.wait()  # wait for and fetch one data packet
            if "https://www.douyin.com/aweme/v1/web/aweme/post/" in res.url:
                # _raw_body is a private attribute holding the raw response text.
                payload = json.loads(res._raw_body)
                videos = []
                for item in payload["aweme_list"]:
                    videos.append({
                        "title": item["desc"],
                        "urls": item["video"]["play_addr"]["url_list"],
                    })
                break
        use_time = time.time() - start_time
        data = {
            "data": videos,
            "use_time": use_time,
        }
    except Exception as e:
        # Return the message as a string; an exception object is not JSON-serializable.
        data = {
            "data": str(e),
            "error_code": 500,
        }
    finally:
        douyin_instance.end_listen()
    return data
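The dictionary-building loop in the handler can be pulled out into a pure function, which makes it easy to check without a browser. The sketch below assumes only the response shape that the handler already relies on (`aweme_list`, `desc`, `video.play_addr.url_list`). The name `extract_videos` and the sample payload are illustrative and not part of the original project. Unlike the handler, it skips items that lack a URL list where the handler would raise a KeyError.

```python
def extract_videos(payload):
    """Return [{"title": ..., "urls": [...]}, ...] from a post-list payload.

    Items without a usable url_list are skipped instead of raising.
    """
    videos = []
    for item in payload.get("aweme_list") or []:
        video = item.get("video") or {}
        play_addr = video.get("play_addr") or {}
        url_list = play_addr.get("url_list") or []
        if not url_list:
            continue
        videos.append({"title": item.get("desc", ""), "urls": list(url_list)})
    return videos


# Made-up sample data; real responses carry many more fields.
sample = {
    "aweme_list": [
        {
            "desc": "first clip",
            "video": {"play_addr": {"url_list": ["https://example.com/a.mp4"]}},
        },
        {"desc": "no video here"},
    ]
}

if __name__ == "__main__":
    print(extract_videos(sample))
```

Inside the handler, the loop could then shrink to something like `videos = extract_videos(json.loads(res._raw_body))`.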

The complete code is on GitHub:

STZZ-1992/douyin_analysis, a watermark-free parsing service for Douyin short videos: https://github.com/STZZ-1992/douyin_analysis
