Set the response encoding (when fetching with the requests library):
response.encoding = response.apparent_encoding
In Scrapy, add a process_response method to the downloader middleware:
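import random

from scrapy.http import HtmlResponse

# USER_AGENT_LIST is assumed to be defined elsewhere (e.g. imported from settings).


class RandomUserAgentMiddleware(object):
    def process_request(self, request, spider):
        # Pick a random User-Agent; setdefault keeps one that is already set.
        ua = random.choice(USER_AGENT_LIST)
        request.headers.setdefault('User-Agent', ua)

    def process_response(self, request, response, spider):
        # Rebuild the response with an explicit encoding so the body decodes correctly.
        # request and status are passed through so response.meta and status checks still work.
        response = HtmlResponse(
            url=response.url,
            status=response.status,
            body=response.body,
            encoding='GB2312',
            request=request
        )
        return response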
If GB2312 does not work, try another encoding such as utf-8.
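Rather than changing the encoding by hand, the "try another encoding" step can be automated. The following is only a minimal sketch using the standard library. The function name decode_with_fallback and the candidate order are assumptions, not part of the original post. UTF-8 is tried first because it is strict and rarely decodes GB-encoded bytes by accident.

```python
def decode_with_fallback(body, encodings=("utf-8", "gb2312", "gbk", "gb18030")):
    """Return (text, encoding) using the first candidate that decodes cleanly.

    Hypothetical helper: the name and the candidate list are illustrative.
    """
    for enc in encodings:
        try:
            return body.decode(enc), enc
        except UnicodeDecodeError:
            # This candidate cannot decode the bytes; try the next one.
            continue
    # Nothing decoded cleanly: fall back to UTF-8 with replacement characters.
    return body.decode("utf-8", errors="replace"), "utf-8"


gb_bytes = "中文".encode("gb2312")
text, used = decode_with_fallback(gb_bytes)
```

The encoding it returns could then be passed as the encoding argument when building the HtmlResponse in process_response, instead of a hard-coded 'GB2312'.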