
Web scraping in Python (HTML to DOM) with the BeautifulSoup class from bs4

Building a DOM from a scraped web page
from urllib import request

from bs4 import BeautifulSoup

# Send a browser-like User-Agent; some sites reject urllib's default one.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3894.0 Safari/537.36"
}
req = request.Request("https://blog.csdn.net/wtl1992", headers=headers)

# Fetch the page and decode the response body as UTF-8.
with request.urlopen(req) as connection:
    html = connection.read().decode("utf-8")

# Parse the HTML into a navigable tree (the "lxml" parser is installed separately).
document = BeautifulSoup(html, features="lxml")

selector = "div.avatar-box.d-flex.justify-content-center.flex-column"
print(document.select_one(selector))  # first matching element, or None
print(document.select(selector))      # list of all matching elements

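document.select: returns every element that matches the CSS selector, as a list (empty if nothing matches).

document.select_one: returns only the first element that matches the CSS selector, or None if nothing matches.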
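
The difference between the two methods is easier to see on a small inline document than on a live page. The sketch below is only an illustration: the markup in SAMPLE_HTML, the shortened selector, and the helper name demo are all made up for this example and are not taken from the CSDN page. It also assumes the built-in "html.parser" backend instead of "lxml", so nothing beyond bs4 needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; it is not the real CSDN page.
SAMPLE_HTML = """
<div class="avatar-box d-flex"><a href="/u/1">first</a></div>
<div class="avatar-box d-flex"><a href="/u/2">second</a></div>
<div class="avatar-box"><a href="/u/3">third</a></div>
"""


def demo(html):
    # html.parser ships with Python, so no lxml install is needed here.
    document = BeautifulSoup(html, features="html.parser")
    # Chained classes: the div must carry both avatar-box and d-flex.
    selector = "div.avatar-box.d-flex"
    all_matches = document.select(selector)      # every matching Tag
    first_match = document.select_one(selector)  # first matching Tag only
    missing = document.select_one("div.no-such-class")  # no match at all
    return all_matches, first_match, missing


all_matches, first_match, missing = demo(SAMPLE_HTML)
print(len(all_matches))        # 2 (the third div lacks the d-flex class)
print(first_match.a["href"])   # /u/1
print(missing)                 # None
```

Because select_one can return None, it is worth checking the result before reading attributes from it when scraping a page whose layout may change.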
