1: urlopen returns 403
- #!/usr/bin/env python3
- # -*- coding: utf-8 -*-
- import urllib.request
-
- url = "http://www.google.com/translate_a/t?client=t&sl=zh-CN&tl=en&q=%E7%94%B7%E5%AD%A9"
- # Browser-like request header
- headers = {'User-Agent':'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
- req = urllib.request.Request(url=url, headers=headers)
- data = urllib.request.urlopen(req).read()
- print(data)
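2: urlretrieve returns 403 (reposted from: https://www.213.name/archives/1087/comment-page-1)
This error occurs because the server has anti-crawler protection enabled. In most cases it is enough to set a header that imitates a browser, but urlretrieve does not provide a header parameter.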
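To see how a server can tell a script from a browser, it helps to look at the User-Agent that urllib sends when none is set. The following is a minimal sketch, not part of the original post: `default_user_agent` and `browser_request` are hypothetical helper names, and `http://example.com/` is a placeholder URL. Neither helper touches the network.

```python
import urllib.request

BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36"
)


def default_user_agent():
    """Return the User-Agent a freshly built opener sends by default."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs added to every request.
    return dict(opener.addheaders).get("User-agent")


def browser_request(url):
    """Build a Request carrying a browser-like User-Agent (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})


if __name__ == "__main__":
    print(default_user_agent())  # a string starting with "Python-urllib/"
    req = browser_request("http://example.com/")  # placeholder URL
    # Request stores header names capitalized, hence "User-agent".
    print(req.get_header("User-agent"))
```

A header set on the Request takes precedence over the opener's default, which is why passing `headers=` is usually enough to get past a simple User-Agent check.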
urlopen can also be used to download a file directly, for example:
- from urllib import request
-
- headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36"}
-
- def down_pic(url, path):
-     """Download url to path, sending a browser-like User-Agent."""
-     try:
-         req = request.Request(url, headers=headers)
-         data = request.urlopen(req).read()
-         with open(path, 'wb') as f:
-             f.write(data)  # the with block closes the file, so no f.close() is needed
-     except Exception as e:
-         print(str(e))
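The version above reads the whole response into memory before writing it out. For larger files, a streaming variant may be preferable. The sketch below is one possible way to do it, under stated assumptions: `download`, its `timeout` default, and the `True`/`False` return convention are illustrative choices, not something the original post specifies.

```python
import shutil
import urllib.error
import urllib.request

HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36"
    )
}


def download(url, path, headers=HEADERS, timeout=30):
    """Stream url to path; return True on success, False on an HTTP or URL error."""
    req = urllib.request.Request(url, headers=headers)
    try:
        # urlopen runs first, so no file is created if the request fails.
        with urllib.request.urlopen(req, timeout=timeout) as resp, open(path, "wb") as f:
            shutil.copyfileobj(resp, f)  # copies in chunks instead of one big read()
        return True
    except urllib.error.HTTPError as e:
        # HTTPError is a subclass of URLError, so it must be caught first.
        print(f"HTTP {e.code} for {url}")
        return False
    except urllib.error.URLError as e:
        print(f"Failed to reach {url}: {e.reason}")
        return False
```

Catching `HTTPError` separately makes a 403 visible by its status code, which helps distinguish a blocked request from a network failure.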
Another solution:
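- import urllib.request
-
- opener = urllib.request.build_opener()
- opener.addheaders = [('User-Agent','Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1941.0 Safari/537.36')]
- # install_opener makes this opener the global default, which urlretrieve then uses
- urllib.request.install_opener(opener)
- urllib.request.urlretrieve(url, path)  # url and path are supplied by the caller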
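Note that install_opener changes the default opener for the whole process, so every later urlopen and urlretrieve call sends the same header. If that side effect is unwanted, the opener can be used directly instead. This is a sketch of that alternative, not the original author's method; `make_opener` and `retrieve_with` are hypothetical names.

```python
import shutil
import urllib.request


def make_opener(user_agent):
    """Build an opener with a custom User-Agent without installing it globally."""
    opener = urllib.request.build_opener()
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def retrieve_with(opener, url, path):
    """Fetch url through the given opener and save the body to path."""
    with opener.open(url) as resp, open(path, "wb") as f:
        shutil.copyfileobj(resp, f)
    return path
```

Typical use would be `retrieve_with(make_opener(ua), url, path)`, where `ua`, `url`, and `path` are supplied by the caller.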