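Playwright is a fun browser automation and testing tool that supports Node.js, Python, C#, and Java. This post introduces Playwright and walks through some basic usage.
Playwright is a powerful browser automation library developed by Microsoft, used here through its Python package. A single API drives Chromium, Firefox, and WebKit, and scripts can run in either headless or headed mode.
Tip: if you are in mainland China, installing from a domestic PyPI mirror is faster.
Install the playwright package: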
pip install playwright
Install the browser binaries (this step is somewhat slow):
python -m playwright install
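As a quick sanity check after the two install steps, the sketch below tests whether the playwright package can be imported by the current interpreter. The helper name is_package_installed is made up for this example, and the check only covers the Python package, not the downloaded browser binaries.

```python
import importlib.util


def is_package_installed(name: str) -> bool:
    """Return True if `name` can be imported in the current interpreter."""
    # find_spec returns None when no top-level module with this name exists.
    return importlib.util.find_spec(name) is not None


if is_package_installed("playwright"):
    print("playwright package found")
else:
    print("playwright package missing; run: pip install playwright")
```

If the package is reported missing even though pip succeeded, pip and python probably belong to different environments.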
The codegen command records browser actions and turns them into code. To see all of its options:
python -m playwright codegen --help
Start recording with:
python -m playwright codegen
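A browser window then opens. Playwright records each action you perform in it and generates the corresponding code. Closing the browser ends the recording. To save the generated script to a .py file, pass an output path with the -o option.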
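To make the save step concrete, here is a minimal sketch that assembles a codegen command line with an explicit output file. The helper build_codegen_command and the file name recorded_baidu.py are made up for this example; --target and -o are codegen options, and the full list is shown by the --help command above.

```python
import sys


def build_codegen_command(url: str, output: str) -> list[str]:
    """Build the argument list for a codegen run that saves to `output`."""
    return [
        sys.executable, "-m", "playwright", "codegen",
        "--target", "python",  # generate synchronous Python code
        "-o", output,          # write the recorded script to this file
        url,                   # page to open when recording starts
    ]


command = build_codegen_command("https://www.baidu.com/", "recorded_baidu.py")
print(" ".join(command))
# To start recording, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than one string avoids shell quoting problems if the URL contains characters such as & or ?.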
from playwright.sync_api import Playwright, sync_playwright, expect


def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://www.baidu.com/")
    page.locator("#kw").click()
    page.locator("#kw").fill("playwright")
    page.get_by_role("button", name="百度一下").click()

    # ---------------------
    context.close()
    browser.close()


with sync_playwright() as playwright:
    run(playwright)
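The recorded script launches Chromium in headed mode. The sketch below shows how the same API could select another browser or run headless. The helpers launch_browser and fetch_title are names invented for this example, and running fetch_title assumes the browser binaries are already installed.

```python
def launch_browser(playwright, browser_name: str = "chromium", headless: bool = True):
    """Launch one of Playwright's bundled browsers by name."""
    if browser_name not in ("chromium", "firefox", "webkit"):
        raise ValueError(f"unsupported browser: {browser_name}")
    # playwright.chromium, playwright.firefox and playwright.webkit
    # all expose the same launch() method.
    browser_type = getattr(playwright, browser_name)
    return browser_type.launch(headless=headless)


def fetch_title(url: str, browser_name: str = "chromium") -> str:
    """Open `url` in a headless browser and return the page title."""
    # Imported here so launch_browser can be read and tested on its own.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = launch_browser(playwright, browser_name, headless=True)
        page = browser.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

For example, print(fetch_title("https://www.baidu.com/", "firefox")) would load the page in headless Firefox. Headless mode suits scripts that run unattended, while headed mode is easier to debug.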
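The following code scrapes the top ten films from Douban's weekly word-of-mouth chart (一周口碑榜) and uses xlwt to write them to an Excel file.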
import xlwt
from playwright.sync_api import sync_playwright


def run(playwright):
    # headless=False runs the browser in headed mode (with a visible window)
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://movie.douban.com/")

    # Select the title links in the weekly chart with an XPath expression
    names = page.query_selector_all("//div[@class='billboard-bd']//td[@class='title']/a")
    listMovie = []
    for name in names:
        content = name.text_content()
        link = name.get_attribute("href")
        listMovie.append((content, link))

    # Write the results to an Excel file with xlwt
    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet('sheet1')
    worksheet.write(0, 0, label="名字")  # header: name
    worksheet.write(0, 1, label="网址")  # header: URL
    for i, items in enumerate(listMovie):
        worksheet.write(i + 1, 0, items[0])
        worksheet.write(i + 1, 1, items[1])
    workbook.save('DouBanMovie.xls')

    page.close()
    context.close()
    browser.close()


with sync_playwright() as playwright:
    run(playwright)
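If xlwt is not available, the same (title, link) pairs could be written with the standard library instead. This is a sketch of that alternative, not part of the original script: it swaps the .xls output for CSV, the helper save_movies_csv is a made-up name, and the sample row is a placeholder rather than scraped data.

```python
import csv


def save_movies_csv(movies, path):
    """Write (title, link) pairs to a CSV file with a header row."""
    # utf-8-sig adds a BOM so Excel detects the encoding of Chinese titles.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["名字", "网址"])  # header: name, URL
        writer.writerows(movies)


# Placeholder row standing in for the scraped listMovie entries.
sample = [("Example Movie", "https://example.com/movie/1")]
save_movies_csv(sample, "DouBanMovie.csv")
```

In the scraping script, calling save_movies_csv(listMovie, "DouBanMovie.csv") would take the place of the xlwt block.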
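Running the script produces DouBanMovie.xls, with one row per film holding its title and link.
Playwright is a Python library with a lot of room for experimentation, and it is well worth exploring further.
Official documentation: link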