This article walks through the following project layout.
mysite2
- manage.py
- mysite2
- app01
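1. Open Toutiao (今日头条), analyze the page, and scrape it
Get the request URL
Once you have identified which request supplies the page's data, build the request headers, fetch that URL, and parse the response as JSON.
In each item of the response, Url is the address of the news story and Title is its headline.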
- {"data":[
- {
- "ClusterId":7072942452532842023,
- "Title":"沙特和阿联酋领导人拒接拜登电话",
- "LabelUrl":"https://p26.toutiaoimg.com/img/mosaic-legacy/2b29200041b9c651e8148~cs_noop.png",
- "Label":"hot",
- "Url":"https://www.toutiao.com/amos_land_page/?category_name=topic_innerflow\u0026event_type=hot_board\u0026log_pb=%7B%22category_name%22%3A%22topic_innerflow%22%2C%22cluster_type%22%3A%2210%22%2C%22enter_from%22%3A%22click_category%22%2C%22entrance_hotspot%22%3A%22outside%22%2C%22event_type%22%3A%22hot_board%22%2C%22hot_board_cluster_id%22%3A%227072942452532842023%22%2C%22hot_board_impr_id%22%3A%222022030918321201021216216025C743EE%22%2C%22jump_page%22%3A%22hot_board_page%22%2C%22location%22%3A%22news_hot_card%22%2C%22page_location%22%3A%22hot_board_page%22%2C%22rank%22%3A%221%22%2C%22source%22%3A%22trending_tab%22%2C%22style_id%22%3A%2240132%22%2C%22title%22%3A%22%E6%B2%99%E7%89%B9%E5%92%8C%E9%98%BF%E8%81%94%E9%85%8B%E9%A2%86%E5%AF%BC%E4%BA%BA%E6%8B%92%E6%8E%A5%E6%8B%9C%E7%99%BB%E7%94%B5%E8%AF%9D%22%7D\u0026rank=1\u0026style_id=40132\u0026topic_id=7072942452532842023",
- "HotValue":"6753999",
- "Schema":"",
- "LabelUri":{
- "uri":"mosaic-legacy/2b29200041b9c651e8148",
- "url":"https://p26.toutiaoimg.com/img/mosaic-legacy/2b29200041b9c651e8148~cs_noop.png",
- "width":200,
- "height":200,
- "url_list":[
- {"url":"https://p26.toutiaoimg.com/img/mosaic-legacy/2b29200041b9c651e8148~cs_noop.png"},
- {"url":"https://p3.toutiaoimg.com/img/mosaic-legacy/2b29200041b9c651e8148~cs_noop.png"},
- {"url":"https://p9.toutiaoimg.com/img/mosaic-legacy/2b29200041b9c651e8148~cs_noop.png"}
- ],
- "image_type":1
- },
- "ClusterIdStr":"7072942452532842023",
- "ClusterType":10,
- "QueryWord":"沙特和阿联酋领导人拒接拜登电话",
- "InterestCategory":["international"],
- "Image":{
- "uri":"tos-cn-i-qvj2lq49k0/a7e3f7e3e8c04c37bc7f88b2340ab999",
- "url":"https://p6.toutiaoimg.com/img/tos-cn-i-qvj2lq49k0/a7e3f7e3e8c04c37bc7f88b2340ab999~cs_noop.png",
- "width":0,
- "height":0,
- "url_list":[
- {"url":"https://p6.toutiaoimg.com/img/tos-cn-i-qvj2lq49k0/a7e3f7e3e8c04c37bc7f88b2340ab999~cs_noop.png"},
- {"url":"https://p9.toutiaoimg.com/img/tos-cn-i-qvj2lq49k0/a7e3f7e3e8c04c37bc7f88b2340ab999~cs_noop.png"},
- {"url":"https://p3.toutiaoimg.com/img/tos-cn-i-qvj2lq49k0/a7e3f7e3e8c04c37bc7f88b2340ab999~cs_noop.png"}
- ],
- "image_type":1
- },
- "LabelDesc":"热门事件"
- },
- {
-
- },
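To make the field mapping concrete, here is a minimal sketch that pulls Title and Url out of a payload shaped like the response above. The sample string is a hand-trimmed stand-in (its Url keeps only two of the original query parameters), and the helper name `extract_headlines` is hypothetical, not part of the original project.

```python
import json

# Hand-trimmed stand-in for the hot-board response; the real payload has many more fields.
SAMPLE = """
{"data": [
  {"ClusterIdStr": "7072942452532842023",
   "Title": "沙特和阿联酋领导人拒接拜登电话",
   "Url": "https://www.toutiao.com/amos_land_page/?rank=1&topic_id=7072942452532842023",
   "HotValue": "6753999"},
  {}
]}
"""


def extract_headlines(payload):
    """Return (title, url) pairs, skipping entries that lack either field."""
    items = payload.get("data") or []
    return [
        (item["Title"], item["Url"])
        for item in items
        if item.get("Title") and item.get("Url")
    ]


headlines = extract_headlines(json.loads(SAMPLE))
for title, url in headlines:
    print(title, "->", url)
```

The empty object in the sample has neither field, so it is skipped and only one pair is printed.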
2. Add a function to app01/views.py that scrapes the news and displays it
- import requests
- from django.shortcuts import render
- # Scrape Toutiao's hot board and display each headline with its link
- def news(req):
-     url = 'https://www.toutiao.com/hot-event/hot-board/?origin=toutiao_pc&_signature=_02B4Z6wo00f01yG9tdQAAIDCQrd1vxaJp9chmbFAAKpR4Dqk0c56dkhdlvNsoD3I03ygIjgUcxkM0VcFYKfO0a9iJRjnl1M9yxZvlq-pgzUXDOrpi1wKoYlCVC9.llzChJ7GmTYXIDMvE.c1a6'
-     headers = {
-         "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36",
-         "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
-     }
-     # Set a timeout so a slow upstream cannot block the view indefinitely
-     res = requests.get(url=url, headers=headers, timeout=10)
-     # Raise on HTTP error statuses instead of trying to parse an error page
-     res.raise_for_status()
-     # The response is a JSON object whose "data" key holds the list of news items
-     data_lists = res.json()['data']
-     return render(req, 'news.html', {"news_dicts": data_lists})
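If the upstream request fails or returns something other than JSON, the view above raises an exception. One possible way to degrade gracefully is sketched below. It is not the original author's method: it swaps `requests` for the standard library's `urllib` so that it runs without third-party packages, and the names `fetch_hot_board`, `opener`, `stub_opener`, and `failing_opener` are hypothetical.

```python
import io
import json
import urllib.request


def fetch_hot_board(url, headers, timeout=5, opener=urllib.request.urlopen):
    """Return the list under the "data" key, or [] if the fetch or parse fails."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener(request, timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and timeouts;
        # ValueError covers invalid JSON and undecodable bytes.
        return []
    data = payload.get("data") if isinstance(payload, dict) else None
    return data if isinstance(data, list) else []


# Stand-ins for the network so the sketch runs offline (assumptions for illustration).
def stub_opener(request, timeout):
    return io.BytesIO(b'{"data": [{"Title": "demo", "Url": "https://example.com/"}]}')


def failing_opener(request, timeout):
    raise OSError("network unreachable")


print(fetch_hot_board("https://example.com/hot-board", {}, opener=stub_opener))
print(fetch_hot_board("https://example.com/hot-board", {}, opener=failing_opener))
```

With a helper like this, the view could pass an empty list to the template when the hot board is unreachable, so the page would still render with its heading.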
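3. Create a news.html file in the app01/templates folder
The inline style='text-decoration:none;color:black' removes the underline from each link and makes the link text black. Django's template language then loops over the list of news dictionaries and outputs one list item per entry.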
- <html lang="en">
- <head>
- <meta charset="UTF-8">
- <title>Title</title>
- </head>
- <body>
- <h1>今日头条</h1>
- <ul>
- {% for news in news_dicts %}
- <li>
- <a style="text-decoration:none;color:black" href="{{ news.Url }}" target="_blank">{{ news.Title }}</a><br>
- </li>
- {% endfor %}
- </ul>
- </body>
- </html>
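As a rough illustration of what the template loop produces, the sketch below builds the same list in plain Python. It is not Django's renderer: `html.escape` stands in for the template engine's auto-escaping, and the function name `render_news_list` and the sample entry are hypothetical.

```python
from html import escape


def render_news_list(news_dicts):
    """Build a <ul> with one escaped, quoted link per news item."""
    items = []
    for news in news_dicts:
        # Escape both values so characters such as & and < cannot break the markup.
        url = escape(news["Url"], quote=True)
        title = escape(news["Title"])
        items.append(
            f'<li><a style="text-decoration:none;color:black" '
            f'href="{url}" target="_blank">{title}</a></li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


sample = [{"Url": "https://example.com/?a=1&b=2", "Title": "A <b> title"}]
print(render_news_list(sample))
```

Quoting the href value matters here: the escaped URL still contains characters like `=` and `;`, and a quoted attribute keeps the whole value together regardless of what the URL contains.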
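4. Map the URL to the view function in mysite2/urls.py
Add the entry below to the urlpatterns list. It assumes that path is imported from django.urls and that views is imported from app01.
path('news/', views.news)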
5. Start the server with python manage.py runserver 0.0.0.0:8000, then open http://127.0.0.1:8000/news/ in a browser to check that the page loads.