Scrapy Image Scraping Example

This one is very simple, so here is the spider code directly:

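# -*- coding: utf-8 -*-
import logging
import urllib.request  # Python 3; on Python 2 call urllib.urlretrieve instead

import scrapy


class TopitComSpider(scrapy.Spider):
    name = "topit.com"
    # The start URL lives on topit.me, so that is the domain to allow.
    allowed_domains = ["topit.me"]
    start_urls = [
        'http://www.topit.me',
    ]

    def parse(self, response):
        # The first 8 thumbnails expose the image URL in @src; the later ones
        # keep it in @data-original (they appear to be lazy-loaded).
        image_urls1 = response.xpath("//div[@class='catalog']/div[@class='e m'][position()<=8]/a/img/@src").extract()
        image_urls2 = response.xpath("//div[@class='catalog']/div[@class='e m'][position()>8]/a/img/@data-original").extract()
        image_urls = image_urls1 + image_urls2
        # Save each image as /root/pic/<index>.jpg; the directory must already exist.
        for counter, url in enumerate(image_urls):
            urllib.request.urlretrieve(url, "/root/pic/" + str(counter) + '.jpg')
            logging.debug(url)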

Open question:

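When matching with XPath, joining the two expressions with "or" matched nothing, so I had to run them separately and then merge the results. I don't know why; if anyone does, please let me know. Thanks.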
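A possible explanation, offered as an assumption since the failing expression is not shown here: in XPath 1.0, "or" is a boolean operator, so the combined expression evaluates to true/false rather than to a list of nodes. The operator that merges two node-sets is "|" (union). The sketch below illustrates the difference with lxml on made-up markup. The SAMPLE HTML, the example.com URLs, and the cut-off of 2 (instead of 8) are all hypothetical and only imitate the structure the spider's expressions expect.

```python
from lxml import html

# Hypothetical markup imitating the structure the spider's XPath expects;
# the real page may well differ.
SAMPLE = """
<div class="catalog">
  <div class="e m"><a><img src="http://example.com/1.jpg"></a></div>
  <div class="e m"><a><img src="http://example.com/2.jpg"></a></div>
  <div class="e m"><a><img src="placeholder.gif" data-original="http://example.com/3.jpg"></a></div>
</div>
"""

tree = html.document_fromstring(SAMPLE)
items = "//div[@class='catalog']/div[@class='e m']"

# "|" is the XPath union operator: it merges both node-sets and returns
# them in document order. The cut-off is 2 here (8 in the spider) only to
# keep the sample short.
union = tree.xpath(
    items + "[position()<=2]/a/img/@src"
    " | " + items + "[position()>2]/a/img/@data-original"
)

# "or" is a boolean operator: the whole expression evaluates to a single
# truth value, not to a list of URLs.
boolean = tree.xpath(
    items + "[position()<=2]/a/img/@src"
    " or " + items + "[position()>2]/a/img/@data-original"
)

print(union)
print(boolean)
```

If this is indeed the cause, the same union expression (with the cut-off set back to 8) could be passed to response.xpath(...) in the spider, replacing the two separate queries and the list concatenation.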

 

Reposted from: https://www.cnblogs.com/gordon0918/p/6531861.html
