
Scraping nationwide tourist attraction data from Ctrip with Python (2024.4.13)


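1. Overview

Ctrip is a leading online travel services company in China, offering hotel booking, flight booking, vacation packages, and corporate travel management. The site holds a large amount of attraction and hotel information that is valuable to both travelers and people working in the tourism industry. With a web scraper we can collect this information from Ctrip and then clean, analyze, and visualize it to draw useful insights and recommendations.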

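2. Install the requests library

Before you start, make sure the following Python library is installed:

requests: sends HTTP requests and retrieves the response content.

You can install it with pip: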

pip install requests 

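3. Scrape data from the Ctrip travel site

First we need to decide which page to scrape. Suppose we want the travel information for one destination on Ctrip, for example Beijing.

The API URL and POST parameters at the time of writing: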

url = 'https://m.ctrip.com/restapi/soa2/18109/json/getAttractionList?_fxpcqlniredt=09031015313388236487&x-traceID=09031015313388236487-1712974794650-8267936'
# JSON true is written as True in Python
data = {"index":1,"count":10,"sortType":1,"isShowAggregation":True,"districtId":1,"scene":"DISTRICT","pageId":"214062","traceId":"14f9745c-92ad-f5c5-07bb-171293c80647","extension":[{"name":"osVersion","value":"10"},{"name":"deviceType","value":"windows"}],"filter":{"filterItems":[]},"crnVersion":"2020-09-01 22:00:45","isInitialState":True,"head":{"cid":"09031015313388236487","ctok":"","cver":"1.0","lang":"01","sid":"8888","syscode":"09","auth":"","xsid":"","extension":[]}}
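To request other destinations or later pages, the payload above can be turned into a small template. The sketch below is only an illustration: the helper name build_payload is hypothetical, and it assumes that "districtId" selects the destination and that "index" is the page number, which the article implies but does not state. The cid and traceId values are copied from the captured request and may need to be replaced with values from your own session.

```python
import copy

# Template copied from the captured request above. The cid and traceId values
# belong to the original session and may need to be replaced with your own.
BASE_PAYLOAD = {
    "index": 1,
    "count": 10,
    "sortType": 1,
    "isShowAggregation": True,
    "districtId": 1,
    "scene": "DISTRICT",
    "pageId": "214062",
    "traceId": "14f9745c-92ad-f5c5-07bb-171293c80647",
    "extension": [
        {"name": "osVersion", "value": "10"},
        {"name": "deviceType", "value": "windows"},
    ],
    "filter": {"filterItems": []},
    "crnVersion": "2020-09-01 22:00:45",
    "isInitialState": True,
    "head": {
        "cid": "09031015313388236487",
        "ctok": "",
        "cver": "1.0",
        "lang": "01",
        "sid": "8888",
        "syscode": "09",
        "auth": "",
        "xsid": "",
        "extension": [],
    },
}


def build_payload(district_id: int, index: int = 1, count: int = 10) -> dict:
    """Return a fresh payload for one destination and one page.

    Assumption: "districtId" picks the destination and "index" is the page number.
    """
    payload = copy.deepcopy(BASE_PAYLOAD)  # never mutate the shared template
    payload["districtId"] = district_id
    payload["index"] = index
    payload["count"] = count
    return payload
```

The deep copy keeps nested values such as "head" independent, so changing one request's payload does not leak into the next.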

4. The main code

        

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36 Edg/123.0.0.0',
    'Cookie': 'your cookies',  # paste the Cookie header from your own browser session
}
# url and data are the endpoint and POST body defined in section 3
html = requests.post(url, headers=headers, json=data, timeout=10).json()
attractionList = html['attractionList']
for attraction in attractionList:
    # named card rather than data so the request payload is not overwritten
    card = attraction['card']
    commentCount = card['commentCount']
    commentScore = card['commentScore']
    coordinate = [card['coordinate']['latitude'], card['coordinate']['longitude']]
    coverImageUrl = card.get('coverImageUrl', '')
    # distance
    distanceStr = card.get('distanceStr', '')
    # location
    displayField = card.get('displayField', None)
    heatScore = card.get('heatScore', '')
    # attraction name
    poiName = card['poiName']
    isFree = card['isFree']
    if isFree:
        price = 0
        # original price
        marketPrice = 0
    else:
        price = card.get('price', 0)
        # original price
        marketPrice = card.get('marketPrice', 0)
    # category info
    sightCategoryInfo = card.get('sightCategoryInfo', '')
    # tags
    tagNameList = card.get('tagNameList', '')
    # rating level, e.g. 5A
    sightLevelStr = card.get('sightLevelStr', None)
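The field extraction in the loop above can also be collected into one function that returns a dictionary per attraction, which makes it easier to test without touching the network. This is a minimal sketch: the name parse_card is hypothetical, the field names are the ones used in the article's code, and the choice to tolerate missing keys everywhere (including coordinate) is an assumption about how incomplete cards should be handled.

```python
def parse_card(card: dict) -> dict:
    """Flatten one attraction 'card' into a plain dict of the fields used above.

    Missing keys fall back to defaults instead of raising KeyError.
    """
    coordinate = card.get('coordinate') or {}
    is_free = bool(card.get('isFree', False))
    return {
        'poiName': card.get('poiName', ''),
        'displayField': card.get('displayField'),
        'distanceStr': card.get('distanceStr', ''),
        'coordinate': [coordinate.get('latitude'), coordinate.get('longitude')],
        'commentCount': card.get('commentCount', 0),
        'commentScore': card.get('commentScore', 0),
        'heatScore': card.get('heatScore', ''),
        'coverImageUrl': card.get('coverImageUrl', ''),
        'isFree': is_free,
        # free attractions are recorded with a price of 0, as in the loop above
        'price': 0 if is_free else card.get('price', 0),
        'marketPrice': 0 if is_free else card.get('marketPrice', 0),
        'sightCategoryInfo': card.get('sightCategoryInfo', ''),
        'tagNameList': card.get('tagNameList', ''),
        'sightLevelStr': card.get('sightLevelStr'),
    }


# Made-up sample card for illustration only, not real Ctrip data.
sample = {
    'poiName': 'Example Park',
    'isFree': True,
    'price': 50,
    'coordinate': {'latitude': 39.9, 'longitude': 116.4},
}
row = parse_card(sample)
```

Inside the loop, this would be called as parse_card(attraction['card']).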

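5. Save to CSV

import csv

# The csv/ directory must already exist. city is the name of the city being scraped;
# it is not defined in the snippets above, so set it yourself (for example 'Beijing').
f = open('csv/全国各景点全.csv', 'w', encoding="utf-8", newline='')
csvwrite = csv.writer(f)
csvwrite.writerow(['City', 'Attraction name', 'Location', 'Distance', 'Coordinates', 'Comment count', 'Comment score', 'Heat score', 'Cover image', 'Is free', 'Price', 'Original price', 'Category info', 'Tags', 'Is 5A'])
# Write one row per attraction, i.e. call this inside the loop from section 4
csvwrite.writerow([city,poiName,displayField,distanceStr,coordinate,commentCount,commentScore,heatScore,coverImageUrl,isFree,price,marketPrice,sightCategoryInfo,tagNameList,sightLevelStr])
f.close()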
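As an alternative to writing positional lists, the rows can be written with csv.DictWriter inside a with block, so the file is closed even if an error occurs partway through. This is a sketch rather than the author's method: the write_rows helper and the FIELDNAMES list are hypothetical names, and the column names simply reuse the variable names from the article.

```python
import csv
from typing import Iterable

# Column order for the output file; the names mirror the variables used above.
FIELDNAMES = [
    'city', 'poiName', 'displayField', 'distanceStr', 'coordinate',
    'commentCount', 'commentScore', 'heatScore', 'coverImageUrl', 'isFree',
    'price', 'marketPrice', 'sightCategoryInfo', 'tagNameList', 'sightLevelStr',
]


def write_rows(path: str, rows: Iterable[dict]) -> int:
    """Write dict rows to a CSV file and return the number of rows written.

    Keys missing from a row are written as empty cells; keys that are not in
    FIELDNAMES are ignored.
    """
    written = 0
    with open(path, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES, extrasaction='ignore')
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
            written += 1
    return written
```

Because rows can be any iterable, a generator that yields attractions page by page can be passed in directly without holding the whole dataset in memory.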

6. With the city IDs for the whole country, you can request and save attraction data nationwide

Nationwide attraction data CSV: https://download.csdn.net/download/britlee/89115745
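The nationwide crawl described in this section comes down to two nested loops: one over city IDs and one over result pages. The sketch below shows only that control flow. The names iter_attractions and fetch_page are hypothetical, the network call is left to a function you supply, and it assumes that a page past the last one comes back with an empty attraction list, which the article does not confirm. The max_pages cap is a safety limit in case that assumption does not hold.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def iter_attractions(
    district_ids: Iterable[int],
    fetch_page: Callable[[int, int], List[dict]],
    max_pages: int = 100,
) -> Iterator[Tuple[int, dict]]:
    """Yield (district_id, card) pairs across all districts and pages.

    fetch_page(district_id, index) is a caller-supplied function that returns
    the 'attractionList' of one page. Assumption: it returns an empty list
    once there are no more results for that district.
    """
    for district_id in district_ids:
        for index in range(1, max_pages + 1):
            attractions = fetch_page(district_id, index)
            if not attractions:
                break  # no more pages for this district
            for attraction in attractions:
                yield district_id, attraction['card']
```

In practice fetch_page would wrap the requests.post call from section 4, setting districtId and index in the POST body, and it is worth pausing between requests to avoid putting load on the site.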
