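This project uses web scraping to collect car data from a car portal website, then uses a Flask + Echarts back-end/front-end stack
to visualize and analyze that data. The statistics cover the ratings of different car brands, vehicle class, body structure, engine, transmission, and guide price.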
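As a rough sketch of how the Flask side could hand data to Echarts, the function below turns a mapping of category counts into an option dictionary for a bar chart. The function name `build_bar_option` and the sample counts are hypothetical and not taken from the project; only the `title`, `xAxis`, `yAxis`, and `series` keys follow the standard Echarts option layout.

```python
import json


def build_bar_option(title, counts):
    """Build an Echarts bar-chart option from a {category: count} mapping."""
    # Sort by count, largest first, so the tallest bar comes first
    items = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return {
        'title': {'text': title},
        'tooltip': {},
        'xAxis': {'type': 'category', 'data': [name for name, _ in items]},
        'yAxis': {'type': 'value'},
        'series': [{'type': 'bar', 'data': [count for _, count in items]}],
    }


# Made-up sample data: number of car series per body structure
sample_counts = {'SUV': 12, 'Sedan': 20, 'Hatchback': 5}
option = build_bar_option('Car series by body structure', sample_counts)
print(json.dumps(option, ensure_ascii=False, indent=2))
```

In a Flask view, a dictionary like this could be returned with `jsonify(option)` and passed to `chart.setOption(...)` in the page's JavaScript.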
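The main features of this Python-based car information scraping and visual analysis system are:
Using Python's requests + BeautifulSoup to collect car data from a car portal platform.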
import requests
from bs4 import BeautifulSoup

# Left-menu endpoint that lists every brand (the domain is masked in the original post)
url = 'https://car.xxxx.com.cn/AsLeftMenu/As_LeftListNew.ashx?typeId=1%20&brandId=0%20&fctId=0%20&seriesId=0'
headers = {
    'accept': '*/*',
    # Note: requests can only decode 'br' responses if a brotli package is installed
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8',
    'cookie': 'Your cookies',
    'referer': 'https://car.xxxx.com.cn/',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-origin',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36'
}
response = requests.get(url, headers=headers)
response.encoding = 'gbk'
# Drop the 18-character prefix and 3-character suffix that wrap the HTML fragment
soup = BeautifulSoup(response.text[18:-3], 'lxml')
brands = soup.find_all('a')
brand_urls = {}
for brand in brands:
    brand_name = brand.text
    brand_href = 'https://car.xxxx.com.cn' + brand['href']
    # Use only the text before '(' as the brand name
    brand_urls[brand_name.split('(')[0]] = brand_href
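The fixed slice `response.text[18:-3]` breaks if the wrapper around the HTML ever changes length. A more defensive variant is sketched below, assuming the endpoint returns the fragment wrapped as `document.writeln("...");`. That assumption matches an 18-character prefix and a 3-character suffix, but the source does not confirm it. The helper names `unwrap_writeln` and `clean_brand_name` and the sample link are hypothetical.

```python
import re

# Assumed wrapper: document.writeln("<html fragment>");
_WRAPPER = re.compile(r'^\s*document\.writeln\("(.*)"\);?\s*$', re.DOTALL)


def unwrap_writeln(text):
    """Return the HTML inside the assumed wrapper, or the text unchanged."""
    match = _WRAPPER.match(text)
    return match.group(1) if match else text


def clean_brand_name(link_text):
    """Drop a trailing count such as '(35)' from a brand link's text."""
    return link_text.split('(')[0].strip()


# Made-up response body for illustration
sample = 'document.writeln("<a href=\'/price/brand-1.html\'>BrandA(35)</a>");'
html = unwrap_writeln(sample)
print(html)
print(clean_brand_name('BrandA(35)'))
```

Falling back to the unchanged text means a response that is already plain HTML still parses, instead of silently losing its first and last characters.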
def fetch_brand_cars_info(brand, url):
    """Scrape every car series listed on one brand page."""
    headers = {
        'accept': '*/*',
        'accept-encoding': 'gzip, deflate, br',
        'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'cookie': 'Your cookies',
        'referer': url,
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-origin',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36'
    }
    response = requests.get(url, headers=headers)
    response.encoding = 'gbk'
    soup = BeautifulSoup(response.text, 'lxml')
    cars = soup.select('div.list-cont')
    brand_cars = []
    for car in cars:
        car_info = {'品牌': brand}  # brand
        name = car.select('a.font-bold')[0].text
        score = car.select('span.score-number')
        if len(score) == 0:
            score = '暂无'  # no rating yet
        else:
            score = score[0].text
        car_info['车系'] = name  # car series
        car_info['评分'] = score  # rating
        # Each <li> holds one "label:value" attribute (class, body structure, engine, ...)
        ul = car.select('ul.lever-ul')[0]
        for li in ul.select('li'):
            text = li.text.replace('\xa0', '').replace(' ', '').strip()
            # Also accept a full-width colon as the separator
            data = text.replace(':', ':').split(':')
            if '颜色' in data[0]:  # skip the colour entry
                continue
            if len(data) < 2:
                continue
            car_info[data[0]] = data[1]
        # The guide price is either a single value or a "low-high万" range
        price = car.select('span.font-arial')[0].text
        price = price.split('-')
        if len(price) == 1:
            car_info['最低指导价'] = price[0]  # lowest guide price
            car_info['最高指导价'] = price[0]  # highest guide price
        else:
            car_info['最低指导价'] = price[0] + '万'
            car_info['最高指导价'] = price[1]
        car_info['链接'] = url  # link
        brand_cars.append(car_info)
    return brand_cars
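The guide prices are stored as text, so they need converting before any numeric analysis. The sketch below assumes the values look like `'10.58万'` (with `万` meaning ten thousand yuan) and that any value without a number, such as a "no quote" placeholder, should become `None`. The helper name `price_to_wan` and the sample record are hypothetical.

```python
import re

_NUMBER = re.compile(r'\d+(?:\.\d+)?')


def price_to_wan(text):
    """Convert a price string such as '10.58万' to a float in units of 万 (10,000 yuan)."""
    match = _NUMBER.search(text)
    return float(match.group()) if match else None


# Made-up record in the shape produced by fetch_brand_cars_info
record = {'品牌': 'BrandA', '最低指导价': '10.58万', '最高指导价': '15.98万'}
low = price_to_wan(record['最低指导价'])
high = price_to_wan(record['最高指导价'])
print(low, high)
print(price_to_wan('暂无报价'))  # a placeholder with no digits
```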
Analysis of how a car's brand, series, rating, class, body structure, engine, and transmission affect its price:
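One simple way to examine such an effect is to group the records by a single factor and compare average prices. The sketch below does this with the standard library only. The function name `average_price_by`, the English field names, and the sample records are made up for illustration and are not the project's data.

```python
from collections import defaultdict
from statistics import mean


def average_price_by(records, factor, price_key='min_price'):
    """Average the price of the records grouped by the given factor."""
    groups = defaultdict(list)
    for record in records:
        price = record.get(price_key)
        if price is None:  # skip cars without a usable price
            continue
        groups[record[factor]].append(price)
    return {key: round(mean(values), 2) for key, values in groups.items()}


# Made-up sample records; prices are in units of 10,000 yuan
records = [
    {'brand': 'BrandA', 'level': 'SUV', 'min_price': 12.0},
    {'brand': 'BrandA', 'level': 'Sedan', 'min_price': 9.0},
    {'brand': 'BrandB', 'level': 'SUV', 'min_price': 20.0},
    {'brand': 'BrandB', 'level': 'Sedan', 'min_price': None},
]
by_level = average_price_by(records, 'level')
by_brand = average_price_by(records, 'brand')
print(by_level)
print(by_brand)
```

Averages per group only describe each factor in isolation. They do not separate, for example, the effect of brand from the effect of vehicle class.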
Project share:
https://gitee.com/asoonis/feed-neo