赞
踩
Hello 大家好,我是一名新来的金融领域打工人,日常分享一些python知识,都是自己在学习生活中遇到的一些问题,分享给大家,希望对大家有一定的帮助!
上一篇文章讲了讲如何通过爬虫获取页面源代码,我们可以很方便地使用postman工具来进行页面源代码地获取:
- ## postman工具的使用
- import requests
-
- url = "https://travel.qunar.com/p-cs299782-xiamen-jingdian"
-
- payload={}
- headers = {
- 'authority': 'travel.qunar.com',
- 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
- 'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5',
- 'cache-control': 'max-age=0',
- 'cookie': 'QN1=0000918034fc4118d820961d; QN269=65706FF0C82711EC859AFA163E515513; _i=ueHd8LkXXXV0bDSA-9fQKGvqE11X; fid=e0ca98b5-69ba-49ae-af45-eb75ae47171f; viewdist=299782-6; uld=1-299782-6-1652167178; JSESSIONID=07447CB2149341056CEBB815F1EDF0F6; qunar-assist={%22version%22:%2220211215173359.925%22%2C%22show%22:false%2C%22audio%22:false%2C%22speed%22:%22middle%22%2C%22zomm%22:1%2C%22cursor%22:false%2C%22pointer%22:false%2C%22bigtext%22:false%2C%22overead%22:false%2C%22readscreen%22:false%2C%22theme%22:%22default%22}; QN205=organic; QN277=organic; QN267=08897278013e594d4; csrfToken=pG8P5YxlawgK4xLy5gqboMfjzc3PL8f6; ariaDefaultTheme=undefined; _vi=ZVM5OVJRff5-WqKRSR8z-1-5wsxUZFKe3HjjzY36FjM2dAB9Kid_TTlErMLyxiV_LRIKgmGxb1f112lFh2V3k5KmcOWUWaXPhZABjEAJYERJXu6lED-BVDqGdxMi6Cpadvxt5kTHWmL-GrSJVgDkNAHwEu1STc_ZoDyrwh6qiywq; Hm_lvt_c56a2b5278263aa647778d304009eafc=1651283208,1651290050,1651291263,1652167180; Hm_lpvt_c56a2b5278263aa647778d304009eafc=1652167180; QN271=749e150b-d9b2-49a3-960a-7fa27373fbfb; SECKEY_ABVK=LG1DqJApvTrEf9k99/qQFt4OsSw6VpB+noTf6BSInqQ%3D; BMAP_SECKEY=H2dLlEk7yFbg2TroK6omHBgP0C5Z8rMsdadN13glWW_rmOYweLnZ20x1TWwCuwF_fS_aLBiPAVFI2Eh4KJKMatp-gktEUhpMzj_VFo_15mVV-TTyqV2tV6Q-rw6Fe0Y4fTbjUCcMrevzr_y8nlhxtFjgLVgD9kStuYoAs3HtEVcZwevbYDQHNfSSiMcsyq-D; JSESSIONID=A5E2B3B84C33240FFD867ABCE81BB2AA; uld=1-299782-7-1652167254; viewdist=299782-7',
- 'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="101", "Microsoft Edge";v="101"',
- 'sec-ch-ua-mobile': '?0',
- 'sec-ch-ua-platform': '"Windows"',
- 'sec-fetch-dest': 'document',
- 'sec-fetch-mode': 'navigate',
- 'sec-fetch-site': 'none',
- 'sec-fetch-user': '?1',
- 'upgrade-insecure-requests': '1',
- 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36 Edg/101.0.1210.39'
- }
-
- response = requests.request("GET", url, headers=headers, data=payload)
- result = response.text
- print(result)
这一篇文章给大家推荐一个非常简单的获取页面源代码的方法,可以实现在Jupyter Notebook使用一行代码就可以获取网页的页面源代码,话不多说我们直接上代码:
- %load URL
- #URL:URL为指定网站的地址
这里我们选取一个网址,然后将它的URL输入,如下图所示:
然后我们运行代码,让我们来看看获得的结果:
续上图:
我们可以和网页本身的页面源代码对比一下:
可以看出通过%load所获得的内容和页面源代码是对的上的~
好啦,今天的文章就分享到这里啦!
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。