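Recently, many readers have asked me about an error they ran into while writing a web scraper: AttributeError: 'NoneType' object has no attribute 'children'. Searching online did not turn up a good fix, so this post offers one solution.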
Website: 产业信息网 (www.chyxx.com)
The original script that triggers the error:
"""
Created on Fri Jan 20 11:30:31 2023
@author: 北辰远_code
I love python. Happy every day!
"""
import requests
from bs4 import BeautifulSoup
import bs4
import csv

def getHTMLText(url):  # download the page
    try:
        r = requests.get(url, timeout = 30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except:
        return '爬取失败'  # "scraping failed"

def fillUnivlist(ulist,html):  # parse the page
    soup = BeautifulSoup(html,"html.parser")
    for tr in soup.find('tbody').children:
        if isinstance(tr,bs4.element.Tag):
            tds = tr('td')
            ulist.append([tds[0].text,tds[1].text,tds[2].text,tds[3].text,
                          tds[4].text,tds[5].text,tds[6].text,tds[7].text])

def writeUlistfile(ulist,dataname):  # save the rows to a csv file
    with open(dataname,'w',encoding = 'utf-8',newline='') as fout:
        writer = csv.writer(fout)
        for row in ulist:
            writer.writerow(row)

url1 = 'https://www.chyxx.com/industry/202105/953391.html'
html1 = getHTMLText(url1)
uinfo1 =[]
fillUnivlist(uinfo1,html1)
writeUlistfile(uinfo1,'各种油产量初.csv')
Traceback (most recent call last):
  File "D:\Users\Qi520503\Desktop\shiyou\未命名3.py", line 45, in <module>
    fillUnivlist(uinfo1,html1)
  File "D:\Users\Qi520503\Desktop\shiyou\未命名3.py", line 29, in fillUnivlist
    for tr in soup.find('tbody').children:
AttributeError: 'NoneType' object has no attribute 'children'
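The traceback means that soup.find('tbody') returned None: BeautifulSoup's find() gives back None whenever the document contains no matching tag, and None has no .children attribute. A minimal sketch of a parser that checks for this case first is shown here; the name parse_table_rows is hypothetical and not part of the original script, and it collects every <td> in a row rather than exactly eight.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag


def parse_table_rows(html):
    """Return the cell texts of each row in the first <tbody>, or [] if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.find("tbody")
    if tbody is None:
        # find() returns None when no matching tag exists, for example when
        # the download failed and html is only an error message.
        return []
    rows = []
    for tr in tbody.children:
        if isinstance(tr, Tag):  # skip whitespace text nodes between rows
            rows.append([td.get_text(strip=True) for td in tr("td")])
    return rows
```

With this guard, a failed download produces an empty result instead of an AttributeError, which makes it easier to see that the problem lies in the request and not in the parsing.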
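Solution:
Pass verify=False to requests.get, which turns off HTTPS certificate verification, and import urllib3 and call urllib3.disable_warnings() to silence the InsecureRequestWarning that would otherwise be printed. In the original script the request most likely failed, the bare except returned the string '爬取失败', and because that string contains no <tbody>, soup.find('tbody') returned None.
The code is as follows: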
import requests
from bs4 import BeautifulSoup
import bs4
import csv
import urllib3

urllib3.disable_warnings()  # suppress the InsecureRequestWarning caused by verify=False

def getHTMLText(url):  # download the page
    try:
        r = requests.get(url, timeout = 30,verify=False)  # skip certificate verification
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except:
        return '爬取失败'  # "scraping failed"
After this change the script runs without errors, and the scraped data is saved to the csv file as expected.
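One further variation may help when diagnosing this kind of failure: the bare except in getHTMLText hides the reason the request failed. The sketch below is an assumption about how this could be handled, not the author's method; get_html_text is a hypothetical name. It catches requests.exceptions.RequestException (the base class for SSL errors, timeouts, and HTTP status errors in requests), prints it, and returns None so the caller can stop before parsing.

```python
import requests


def get_html_text(url, timeout=30):
    """Return the page text, or None if the request fails for any reason."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.exceptions.RequestException as exc:
        # Print the real cause (SSL error, timeout, HTTP 4xx/5xx, ...)
        # instead of hiding it behind a placeholder string.
        print(f"Request for {url!r} failed: {exc!r}")
        return None
```

A caller would check whether the result is None before handing it to BeautifulSoup. If the printed message names a certificate problem, that confirms the verify=False change above is what the site needs.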