
Java + Selenium


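Preface

This article is for learning purposes only!
Do not use it for anything illegal or improper; you alone bear the responsibility.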

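Introduction

Selenium is one of the most widely used open-source automated testing suites for Web UIs (user interfaces).
Integrating it with Java essentially means that Java code calls a browser driver to simulate what a person would do by hand.
Selenium supports several browsers; this article uses Google Chrome as the example.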

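1. Install the driver

There are two download sources for the Chrome driver; pick either one.
① First confirm the browser version: enter chrome://settings/ in the address bar and open "About Chrome".
② Open either of the URLs below and download the matching version. If no exactly matching version is listed, choose the highest available one with the same major version. In this article the browser is 104.0.5112.102, so the candidates are the versions starting with 104, and the best choice is the highest 104.x release.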

http://chromedriver.storage.googleapis.com/index.html
http://npm.taobao.org/mirrors/chromedriver/
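The version-matching rule from step ② can be sketched in plain Java. This is only an illustration of the rule: the class name DriverVersionPicker and the list of available versions are made up for the example, and in practice you would read the list from one of the download pages above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class DriverVersionPicker {

    // Parse "104.0.5112.102" into {104, 0, 5112, 102}.
    static int[] parse(String version) {
        return Arrays.stream(version.trim().split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Compare two dotted versions component by component.
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < Math.min(x.length, y.length); i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return Integer.compare(x.length, y.length);
    }

    // Exact match if listed; otherwise the highest listed version with the same major number.
    static Optional<String> pick(String browserVersion, List<String> available) {
        if (available.contains(browserVersion)) {
            return Optional.of(browserVersion);
        }
        int major = parse(browserVersion)[0];
        return available.stream()
                .filter(v -> parse(v)[0] == major)
                .max(DriverVersionPicker::compare);
    }

    public static void main(String[] args) {
        // Hypothetical listing of downloadable driver versions.
        List<String> available = Arrays.asList(
                "103.0.5060.134", "104.0.5112.29", "104.0.5112.79", "105.0.5195.19");
        System.out.println(pick("104.0.5112.102", available).orElse("no match"));
    }
}
```

With the sample list, the browser version 104.0.5112.102 has no exact match, so the highest 104.x entry is chosen.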


2. A simple example to get started with crawling

package com.mengkeng.selenium_demo.test;

import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

import java.util.concurrent.TimeUnit;

public class BaiduDemo {

    public static void main(String[] args) throws Exception {
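        // D://chromedriver.exe: replace with the actual path of your driver
        System.setProperty("webdriver.chrome.driver", "D://chromedriver.exe");
        ChromeOptions chromeOptions = new ChromeOptions();
        ChromeDriver driver = new ChromeDriver(chromeOptions);
        try {
            // Maximize the window
            driver.manage().window().maximize();
            // Implicit wait: element lookups keep retrying for up to 10 seconds
            driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
            Thread.sleep(1000);
            // Open the Baidu home page
            driver.get("https://www.baidu.com/");
            // Find the search box
            WebElement text = driver.findElement(By.id("kw"));
            // Find the "百度一下" (search) button
            WebElement button = driver.findElement(By.id("su"));
            text.sendKeys("123");
            button.click();
        } finally {
            // Keep the browser open for 10 seconds so the result can be seen
            sleep(10000);
            driver.quit();
        }
    }

    // Sleep for the given number of milliseconds
    public static void sleep(int time) {
        try {
            Thread.sleep(time);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still see the interruption
            Thread.currentThread().interrupt();
        }
    }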
}

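With just a few lines of code we opened a web page and searched for '123'. Next, let's look at the commonly used APIs. It is enough to understand them; look them up as you need them.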

3. Selenium API

3-1 Create a controllable browser object

// Remember to change this to where your driver is actually stored
System.setProperty("webdriver.chrome.driver", "D://chromedriver.exe");
WebDriver driver = new ChromeDriver();

3-2 Open a specific page

driver.get("https://www.baidu.com/");

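3-3 Locate elements

Note: if several elements on the page share the same attribute value, findElement returns only the first match; use an XPath locator to pick out the specific one you want.

Locate by id
driver.findElement(By.id("pnum"));
Locate by name
driver.findElement(By.name("name"));
Locate by class
driver.findElement(By.className("pgo"));
Locate by link text
driver.findElement(By.linkText("link"));
Locate by XPath
driver.findElement(By.xpath("//div[@id='1']/div/div/h3/a[1]"));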

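3-4 Common element methods

Method            Description
sendKeys()        Simulates typing the given text
clear()           Clears the entered text
getText()         Gets the element's text
getAttribute()    Gets the value of the given attribute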

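OK, once you have this part down you can write a simple crawler. If you are interested, try the following exercise:

Example 1: Log in to QQ Mail

Requirement:

Log in to QQ Mail and open the inbox page.

The implementation is as follows: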

package com.mengkeng.selenium_demo.test;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import java.util.Objects;

public class QQEmaIlLoginDemo {
    public static void main(String[] args) throws InterruptedException {
        // Specify which driver to use; replace the path with your own
        System.setProperty("webdriver.chrome.driver", "D://chromedriver.exe");
        ChromeDriver driver = new ChromeDriver();
        driver.manage().window().maximize();
        try {
            Thread.sleep(1000);
            driver.get("https://mail.qq.com/");
            driver.switchTo().frame("login_frame");
            WebElement username = driver.findElement(By.id("u"));
            WebElement password = driver.findElement(By.id("p"));
            username.sendKeys("xxxxxx@qq.com");
            password.sendKeys("xxxxxx");
            WebElement submit = driver.findElement(By.id("login_button"));
            submit.click();
            Thread.sleep(1000);
            driver.switchTo().defaultContent();
            WebElement element = validElement("//a[@id='folder_1']", driver);
            if (Objects.nonNull(element)) {
                // The inbox link was found, so click it directly
                element.click();
            } else {
                System.out.println("Failed to open the inbox");
            }
        } finally {
            Thread.sleep(10000);
            driver.close();
            driver.quit();
        }
    }
    public static WebElement validElement(String str, WebDriver driver) {
        try {
            WebElement element = driver.findElement(By.xpath(str));
            return element;
        } catch (Exception e) {
            System.out.println("Element not found: " + str);
        }
        return null;
    }
}


The examples above are simple ones. What about mouse actions and navigating across multiple pages? Don't worry, that comes next.

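3-5 Advanced Selenium

Mouse

Note: in Java, mouse actions are built with the Actions class (org.openqa.selenium.interactions.Actions), and the chain must end with perform(); without it the actions do not take effect.

Method            Description
click()           Left click
contextClick()    Right click
doubleClick()     Double click
dragAndDrop()     Drag and drop
moveToElement()   Hover the mouse over an element
perform()         Executes all actions stored in the Actions chain

Switching windows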

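When clicking a page element makes the browser create a new window, you need to switch to that newest window.

driver.switchTo().window(frontHandle); // frontHandle is a window handle (a String); get it with driver.getWindowHandle() and keep it for later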
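One common way to find the handle of the newly opened window is to compare the set of handles before and after the click. The sketch below shows only that set logic in plain Java; the class name NewWindowFinder and the handle strings are invented for the example. In real code the two sets would come from driver.getWindowHandles(), and the result would be passed to driver.switchTo().window(...).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

final class NewWindowFinder {

    // Return a handle that is present after the click but was not present before it.
    static Optional<String> newHandle(Set<String> before, Set<String> after) {
        Set<String> diff = new LinkedHashSet<>(after);
        diff.removeAll(before);
        return diff.stream().findFirst();
    }

    public static void main(String[] args) {
        // Made-up handle values standing in for driver.getWindowHandles().
        Set<String> before = new LinkedHashSet<>(Arrays.asList("list-page"));
        Set<String> after = new LinkedHashSet<>(Arrays.asList("list-page", "detail-page"));
        System.out.println(newHandle(before, after).orElse("no new window"));
    }
}
```

Returning an Optional makes the "no new window appeared yet" case explicit, so the caller can wait and retry instead of switching to a stale handle.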

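Calling JavaScript

Simulate scrolling the page
driver.executeScript("window.scrollTo(0,300)");

When a page element cannot be clicked normally (for example, when anti-crawler measures intercept the click)
driver.executeScript("arguments[0].click();", element); // element is the button or element to click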

ChromeOptions: parameters for creating the browser
        ChromeOptions chromeOptions = new ChromeOptions();
        chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);  // Eager loading: return once the DOM is ready
        chromeOptions.addArguments("--incognito"); // Incognito (private) window
        chromeOptions.addArguments("--blink-settings=imagesEnabled=false"); // Do not load images
        chromeOptions.addArguments("--headless"); // Headless mode
        chromeOptions.addArguments("--no-sandbox"); // Disable the sandbox
        chromeOptions.addArguments("--disable-gpu"); // Disable GPU acceleration
        chromeOptions.addArguments("--proxy-server=" + proxy); // Use a proxy; proxy is an address string you supply
        ChromeDriver driver = new ChromeDriver(chromeOptions);
Browser-related settings
// Set the global (implicit) wait time
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
// Maximize the window
driver.manage().window().maximize();
// Hide the Selenium marker (navigator.webdriver)
String js1 = "Object.defineProperties(navigator, {webdriver:{get:()=>undefined}});";
((ChromeDriver) driver).executeScript(js1);
// Set a random User-Agent; add it to chromeOptions before the ChromeDriver is created
String[] arr = {"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15",
                "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36 Edg/103.0.1264.37",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 9.50",
                "Mozilla/5.0 (Windows NT 5.1; U; en; rv:1.8.1) Gecko/20061208 Firefox/2.0.0 Opera 9.50",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 732; .NET4.0C; .NET4.0E)"};
Random random = new Random();
chromeOptions.addArguments("--user-agent=" + arr[random.nextInt(arr.length)]);
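Picking the random User-Agent can be isolated into a small helper, sketched below without any Selenium dependency. The class and method names are hypothetical. The helper indexes by the array length instead of a hard-coded 7, so the list can grow or shrink safely, and it builds the argument with Chrome's --user-agent command-line switch; the returned string is what you would hand to chromeOptions.addArguments(...).

```java
import java.util.concurrent.ThreadLocalRandom;

final class UserAgentPicker {

    // A shortened sample of the User-Agent strings listed in the article.
    static final String[] USER_AGENTS = {
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15",
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
    };

    // Build a Chrome command-line argument carrying one randomly chosen User-Agent.
    static String randomUserAgentArgument(String[] agents) {
        // nextInt(bound) is exclusive of the bound, so every index is valid.
        int index = ThreadLocalRandom.current().nextInt(agents.length);
        return "--user-agent=" + agents[index];
    }

    public static void main(String[] args) {
        System.out.println(randomUserAgentArgument(USER_AGENTS));
    }
}
```

ThreadLocalRandom is used here because the crawler later runs in several threads; a shared Random would also work but adds contention.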
Multithreading example

While parsing the list pages, each task creates its own browser object to do the parsing.

  // Note: with an unbounded LinkedBlockingQueue the pool never grows beyond its core size (2 here)
  private final ThreadPoolExecutor pagepoolExecutor = new ThreadPoolExecutor(2,
            8, 30L,
            TimeUnit.SECONDS, new LinkedBlockingQueue<>());

  private void parsePagePre(SetOperations ops) {
        HashOperations<String, Object, Object> opsForHash = redisTemplate.opsForHash();
        List<BuildAreaUrlLj> buildAreaUrlLjs = buildAreaUrlLjMapper.selectList(null);
        for (BuildAreaUrlLj buildAreaUrlLj : buildAreaUrlLjs) {
            pagepoolExecutor.execute(() -> parsePage(ops, opsForHash, buildAreaUrlLj));
        }
    }

  private void parsePage(SetOperations ops, HashOperations<String, Object, Object> opsForHash, BuildAreaUrlLj buildAreaUrlLj) {
        ChromeDriver driver = getChromeDriver();
        try {
            driver.get(buildAreaUrlLj.getAreaUrl());
            // Business logic
        } finally {
            // Always release the browser, even if parsing fails
            driver.quit();
        }
    }

  private ChromeDriver getChromeDriver() {
        String nextProxy = proxyService.getNextProxy();
        String[] arr = {"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15",
                "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36 Edg/103.0.1264.37",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 9.50",
                "Mozilla/5.0 (Windows NT 5.1; U; en; rv:1.8.1) Gecko/20061208 Firefox/2.0.0 Opera 9.50",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 732; .NET4.0C; .NET4.0E)"};

        ChromeOptions chromeOptions = new ChromeOptions();
        chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);
        chromeOptions.addArguments("--incognito");
        chromeOptions.addArguments("--blink-settings=imagesEnabled=false");
        chromeOptions.addArguments("--headless");
        chromeOptions.addArguments("--no-sandbox");
        chromeOptions.addArguments("--disable-gpu");
        // Only use the proxy when one is configured
        if (StringUtils.isNotBlank(nextProxy) && !nextProxy.equals("local")) {
            chromeOptions.addArguments("--proxy-server=" + nextProxy);
        }
        HashMap<String, Object> map = new HashMap<>();
        map.put("webrtc.ip_handling_policy", "disable_non_proxied_udp");
        map.put("webrtc.multiple_routes_enabled", false);
        map.put("webrtc.nonproxied_udp_enabled", false);
        chromeOptions.setExperimentalOption("prefs", map);
        Random random = new Random();
        chromeOptions.addArguments("--user-agent=" + arr[random.nextInt(arr.length)]);
        ChromeDriver driver = new ChromeDriver(chromeOptions);
        driver.manage().window().maximize();
        return driver;
    }
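The fan-out pattern in this example (one task per list URL, each task owning its own browser) can be tried out without Selenium. The sketch below is an assumption-laden stand-in: the class PooledCrawl and its runAll helper are invented, and the "work" is just a function applied to each input. It uses a fixed-size pool because a ThreadPoolExecutor with an unbounded LinkedBlockingQueue only ever runs its core number of threads; the maximum size takes effect only once the queue is full.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class PooledCrawl {

    // Run the task for every input on a fixed pool and collect results in input order.
    static <T, R> List<R> runAll(List<T> inputs, Function<T, R> task, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (T input : inputs) {
                Callable<R> job = () -> task.apply(input);
                futures.add(pool.submit(job));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                // get() blocks until the task is done and surfaces any failure.
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        } finally {
            // Stop accepting work so the worker threads can exit.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for "open the URL and parse it": here we only measure the string length.
        List<String> urls = Arrays.asList("a", "bb", "ccc");
        System.out.println(runAll(urls, String::length, 2));
    }
}
```

Collecting Future objects instead of calling execute() means an exception inside a task is reported to the caller rather than being lost in the worker thread.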

Hands-on example: crawling the Fangtianxia price trend chart

package com.mengkeng.selenium_demo.test;

import com.alibaba.fastjson.JSON;
import com.mengkeng.selenium_demo.config.RestTemplateConfig;
import com.mengkeng.selenium_demo.entity.TkBuildingsPriceAjk;
import lombok.extern.slf4j.Slf4j;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.SetOperations;
import org.springframework.http.*;
import org.springframework.util.CollectionUtils;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

import java.math.BigDecimal;
import java.util.*;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 *
 * Date: 2022-07-10 13:50
 * Description:
 */
@RestController
@RequestMapping("fang")
@Slf4j
public class FangtianxiaDemo {
    @Autowired
    private RedisTemplate redisTemplate;

    private static LinkedList<String> pages = new LinkedList<>();
    /**
     * Base URL of the price-trend endpoint
     */
    public static final String PRICE_URL = "https://pinggun.fang.com/RunChartNew/MakeChartData/";
    /**
     * Redis key that records the pages already processed
     */
    public static final String SKIP_URLS = "SKIP_URLS";
    /**
     * Success flag
     */
    public static String TEMP_FLAG = "fail";


    @RequestMapping("sync")
    public String sync() {
        while (!TEMP_FLAG.equals("success")) {
            System.setProperty("webdriver.chrome.driver", "D://chromedriver.exe");
            ChromeOptions chromeOptions = new ChromeOptions();
            chromeOptions.addArguments("--headless");
            chromeOptions.addArguments("--no-sandbox");
            chromeOptions.addArguments("--disable-gpu");
            chromeOptions.addArguments("--disable-dev-shm-usage");
            WebDriver driver = new ChromeDriver(chromeOptions);
            driver.manage().window().maximize();
            driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
            driver.get("https://esf.fang.com/housing/");
            sleep(2000);
            try {
                parseFTX(driver);
            } catch (Exception e) {
                try {
                    Thread.sleep(10000);
                } catch (InterruptedException interruptedException) {
                    interruptedException.printStackTrace();
                }
            } finally {
                sleep(10000);
                driver.quit();
            }
        }
        return "ok";
    }

    /**
     * Parse Fangtianxia
     */
    private void parseFTX(WebDriver driver) {
        SetOperations ops = redisTemplate.opsForSet();
        List<WebElement> elements = driver.findElements(By.xpath("//div[@class='qxName']/a"));
        // Districts
        for (int i = 2; i <= elements.size() - 3; i++) {

            WebElement element = driver.findElement(By.xpath("//div[@class='qxName']/a[" + i + "]"));
            element.click();
            sleep(800);
            // Business areas
            List<WebElement> elementsShangquan = driver.findElements(By.xpath("//p[@id='shangQuancontain']/a"));
            for (int sq = 2; sq <= elementsShangquan.size(); sq++) {

                WebElement elementsq = driver.findElement(By.xpath("//p[@id='shangQuancontain']/a[" + sq + "]"));
                String tempHref = elementsq.getAttribute("href");

//                if (ops.isMember(SKIP_URLS, tempHref)) {
//                    System.out.println("Skipped the current link " + tempHref);
//                    continue;
//                }

                elementsq.click();
                parsePage(driver);
                ops.add(SKIP_URLS, tempHref);
                sleep(800);
            }
        }
        TEMP_FLAG = "success"; // One full pass completed normally; finish
    }

    /**
     * Parse the paginated list
     *
     * @param driver
     */
    private void parsePage(WebDriver driver) {
        // Pagination
        try {
            driver.findElement(By.className("txt")).getText();
        } catch (Exception e) {
            log.info("No data under this category; URL is " + driver.getCurrentUrl());
            return;
        }
        String pageTotal = driver.findElement(By.className("txt")).getText().replaceAll("共", "").replaceAll("页", "");
        for (int page = 0; page < Integer.parseInt(pageTotal); page++) {

            List<WebElement> houseList = driver.findElements(By.xpath("//div[@class='houseList']/div"));

            for (int i = 1; i < houseList.size(); i++) {
                String communityName = driver.findElement(By.xpath("//div[@class='houseList']/div[" + i + "]/dl/dd/p[1]/a[1]")).getText();
                String communityCode = driver.findElement(By.xpath("//div[@class='houseList']/div[" + i + "]/dl/dd/p[1]/a[2]")).getAttribute("projcode");
                String areaName = driver.findElement(By.xpath("//div[@class='houseList']/div[" + i + "]/dl/dd/p[2]/a[1]")).getText();


                // Jump to the detail page
                pages.addAll(driver.getWindowHandles());
                driver.findElement(By.xpath("//div[@class='houseList']/div[" + i + "]/dl/dd/p[1]/a[1]")).click();
                sleepAndCutoverNewPage(800, driver);

                parseDetail(communityCode, communityName, areaName);

                driver.close();
                driver.switchTo().window(pages.getLast());
                sleep(1000);
            }
            if (page + 1 == Integer.parseInt(pageTotal)) {
                break;
            }
            String pageNow = driver.findElement(By.xpath("//div[@id='houselist_B14_01']/a[last()-1]")).getAttribute("href");
            System.out.println("Next page is ------------" + pageNow + "----" + pageTotal);
            driver.findElement(By.xpath("//div[@id='houselist_B14_01']/a[last()-1]")).click();
            sleep(600);

        }
    }

    /**
     * Parse the detail data
     *
     * @param communityCode
     * @param communityName
     * @param areaName
     */
    public void parseDetail(String communityCode, String communityName, String areaName) {
        HashMap<String, Object> map = new HashMap<>();
        map.put("newcode", communityCode);
        map.put("city", cnToUnicode("北京"));
        map.put("district", cnToUnicode(areaName));

        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON_UTF8);
        HttpEntity<String> entity = new HttpEntity<>(JSON.toJSONString(map), headers);
        RestTemplate restTemplate;
        try {
            restTemplate = new RestTemplate(RestTemplateConfig.generateHttpRequestFactory());
        } catch (Exception e) {
            // Without an HTTP client there is nothing to request, so skip this community
            log.error("Failed to create RestTemplate", e);
            return;
        }
        ResponseEntity<String> stringResponseEntity = restTemplate.exchange(PRICE_URL, HttpMethod.POST, entity, String.class);
        Pattern compile = Pattern.compile(",(\\w+)]");
        Matcher matcher = compile.matcher(stringResponseEntity.getBody());

        Pattern compileMonth = Pattern.compile("年(\\w+)月");
        Matcher matcherMonth = compileMonth.matcher(stringResponseEntity.getBody());
        ArrayList<String> list = new ArrayList<>();
        while (matcherMonth.find()) {
            list.add(matcherMonth.group(1));
        }

        Pattern compileYear = Pattern.compile("&(\\w+)年");
        Matcher matcherYear = compileYear.matcher(stringResponseEntity.getBody());
        int year = 2020;
        while (matcherYear.find()) {
            year = Integer.parseInt(matcherYear.group(1));
        }
        ArrayList months = null;
        if (!CollectionUtils.isEmpty(list)) {
            months = getMonths(year, Integer.parseInt(list.get(0)), Integer.parseInt(list.get(1)));
        }

        while (matcher.find()) {
            TkBuildingsPriceAjk ajk = new TkBuildingsPriceAjk();
            ajk.setDataOrigin("fangtianxia");
            ajk.setCommunityCode(communityCode);
            ajk.setCommunity(communityName);
            ajk.setAvgPrice(new BigDecimal(matcher.group(1)));
            System.out.println("Persist=======================================" + ajk);
        }

    }

    private static void sleep(int millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    /**
     * Switch to the newly opened page
     *
     * @param millis
     * @param driver
     * @return
     */
    private static String sleepAndCutoverNewPage(int millis, WebDriver driver) {
        try {
            Thread.sleep(millis);
            for (String handle : driver.getWindowHandles()) {
                if (!pages.contains(handle)) {
                    driver.switchTo().window(handle);
                }
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Convert a string into Unicode escape sequences
     *
     * @param cn
     * @return
     */
    private static String cnToUnicode(String cn) {
        char[] chars = cn.toCharArray();
        StringBuilder returnStr = new StringBuilder();
        for (int i = 0; i < chars.length; i++) {
            returnStr.append("\\u").append(Integer.toString(chars[i], 16));
        }
        return returnStr.toString();
    }
    /**
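     * Build the list of year-month values (yyyyMM); only supports the start year through the following year
     *
     * @param year  start year
     * @param start start month
     * @param end   end month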
     * @return
     */
    private static ArrayList getMonths(int year, int start, int end) {
        ArrayList res = new ArrayList();
        for (int i = start; i <= (end == 12 ? 12 : end + 12); i++) {
            if (i > 12) {
                res.add((year + 1) + String.format("%02d", i - 12));
            } else {
                res.add(year + String.format("%02d", i));
            }
        }
        return res;
    }
}
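The cnToUnicode helper in the listing relies on Integer.toString(c, 16), which does not pad: any character below U+1000 would produce fewer than four hex digits. The CJK characters used in this example all have four-digit code units, so the original works for them. If you need a fixed-width form for arbitrary text, a padded variant might look like the sketch below. Whether the endpoint actually requires four digits is an assumption, and the class name UnicodeEscaper is made up.

```java
final class UnicodeEscaper {

    // Turn every UTF-16 code unit into a backslash, the letter u, and four lowercase hex digits.
    static String toUnicodeEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            out.append(String.format("\\u%04x", (int) c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The two characters of the city name used in the listing, written as escapes
        // so this file compiles regardless of source encoding.
        System.out.println(toUnicodeEscapes("\u5317\u4eac"));
        // A character below U+1000 is zero-padded to four digits.
        System.out.println(toUnicodeEscapes("A"));
    }
}
```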

Hands-on example: crawling Lianjia community prices

package com.mengkeng.selenium_demo.test;

import com.alibaba.fastjson.JSON;
import com.mengkeng.selenium_demo.entity.BuildAreaUrlLj;
import com.mengkeng.selenium_demo.entity.IdAndNamePO;
import com.mengkeng.selenium_demo.entity.TkBuildingsAreaInfolj;
import com.mengkeng.selenium_demo.entity.TkBuildingsMonthPriceLj;
import com.mengkeng.selenium_demo.mapper.BuildAreaUrlLjMapper;
import com.mengkeng.selenium_demo.service.ProxyService;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.apache.commons.lang3.time.DateFormatUtils;
import org.openqa.selenium.By;
import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.HashOperations;
import org.springframework.data.redis.core.SetOperations;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.*;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 *
 * Date: 2022-09-05 13:58
 * Description: residential communities
 */
@RestController
@RequestMapping("areaInfo")
@Slf4j
public class LianjiaAreaInfoDemo {
    @Autowired
    private StringRedisTemplate redisTemplate;
    @Autowired
    private BuildAreaUrlLjMapper buildAreaUrlLjMapper;
    @Autowired
    private ProxyService proxyService;

    public static final String SKIP_URLS = "SKIP_URLS_AREAINFO_LIANJIA";
    public static final String URLS = "URLS_AREAINFO_LIANJIA";
    public static final String AREA_INFO_COMMUNITY_CODE_LJ = "AREA_INFO_COMMUNITY_CODE_LJ";

    private static LinkedList<String> pages = new LinkedList<>();
    ThreadPoolExecutor pagepoolExecutor = new ThreadPoolExecutor(2,
            10, 30L,
            TimeUnit.SECONDS, new LinkedBlockingQueue<>());

    @RequestMapping("sync")
    public void sync() throws InterruptedException {
        System.setProperty("webdriver.chrome.driver", "D://chromedriver.exe");
        boolean flag = false;
        while (!flag) {
            try {
                ChromeDriver driver = getChromeDriver();
                SetOperations ops = redisTemplate.opsForSet();
                try {
                    getUrls(driver, ops);

                    parsePagePre(ops);
                 } finally {
                    sleep(1000);
                    driver.quit();
                }

            } catch (Exception e) {
                Thread.sleep(10000);
                continue;
            }
            flag = true;
        }
        System.out.println("Done");
    }

    /**
     * Create the browser object
     * @return
     */
    private ChromeDriver getChromeDriver() {
        String nextProxy = proxyService.getNextProxy();
        System.out.println("Current proxy IP is " + nextProxy);
        String[] arr = {"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15",
                "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36 Edg/103.0.1264.37",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 9.50",
                "Mozilla/5.0 (Windows NT 5.1; U; en; rv:1.8.1) Gecko/20061208 Firefox/2.0.0 Opera 9.50",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 732; .NET4.0C; .NET4.0E)"};

        ChromeOptions chromeOptions = new ChromeOptions();
        chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);
        chromeOptions.addArguments("--incognito");
        chromeOptions.addArguments("--blink-settings=imagesEnabled=false");
        chromeOptions.addArguments("--headless");
        chromeOptions.addArguments("--no-sandbox");
        chromeOptions.addArguments("--disable-gpu");
        if (StringUtils.isNotBlank(nextProxy) && !nextProxy.equals("local")) {
            chromeOptions.addArguments("--proxy-server=" + nextProxy);
        }
        HashMap<String, Object> map = new HashMap<>();
        map.put("webrtc.ip_handling_policy", "disable_non_proxied_udp");
        map.put("webrtc.multiple_routes_enabled", false);
        map.put("webrtc.nonproxied_udp_enabled", false);
        chromeOptions.setExperimentalOption("prefs", map);
        Random random = new Random();
        chromeOptions.addArguments("--user-agent=" + arr[random.nextInt(arr.length)]);
        ChromeDriver driver = new ChromeDriver(chromeOptions);
        driver.manage().window().maximize();
        return driver;
    }

    private void parsePagePre(SetOperations ops) {
        HashOperations<String, Object, Object> opsForHash = redisTemplate.opsForHash();

        List<BuildAreaUrlLj> buildAreaUrlLjs = buildAreaUrlLjMapper.selectList(null);
        List<BuildAreaUrlLj> buildAreaUrlLjs1 = buildAreaUrlLjs.subList(1,3500);
        for (BuildAreaUrlLj buildAreaUrlLj : buildAreaUrlLjs1) {
            if (ops.isMember(SKIP_URLS, buildAreaUrlLj.getAreaUrl())) {
                System.out.println("Skipping the current area " + buildAreaUrlLj.getCityName() + "-" + buildAreaUrlLj.getCountyName());
                continue;
            }
            pagepoolExecutor.execute(() -> parsePage(ops, opsForHash, buildAreaUrlLj));
        }
    }

    /**
     * Parse the list pages
     * @param ops
     * @param opsForHash
     * @param buildAreaUrlLj
     */
    private void parsePage(SetOperations ops, HashOperations<String, Object, Object> opsForHash, BuildAreaUrlLj buildAreaUrlLj) {
        ChromeDriver driver = getChromeDriver();
        try {
            driver.get(buildAreaUrlLj.getAreaUrl());
            String windowHandlePage = driver.getWindowHandle();
            WebElement totalNumStr = validElement("//h2[@class='total fl']/span", driver);
            if (null != totalNumStr) {
                Integer total = Integer.valueOf(totalNumStr.getText());
                // There is data
                if (total > 1) {
                    String pageData = driver.findElement(By.xpath("//div[@class='page-box house-lst-page-box']")).getAttribute("page-data");
                    Integer pageNumStr = Integer.valueOf(JSON.parseObject(pageData).getString("totalPage"));
                    System.out.println("Pages in the current area: " + pageNumStr + "---" + buildAreaUrlLj.getAreaUrl());
                    for (int x = 1; x <= pageNumStr; x++) {
                        List<WebElement> elements = driver.findElements(By.xpath("//ul[@class='listContent']/li/div[1]/div[1]/a"));
                        for (int i = 0; i < elements.size(); i++) {
                            WebElement item = elements.get(i);
                            String code = "";
                            Pattern compile1 = Pattern.compile("xiaoqu/(\\w+)/");
                            Matcher matcher1 = compile1.matcher(item.getAttribute("href"));
                            while (matcher1.find()) {
                                code = matcher1.group(1);
                            }
                            driver.executeScript("arguments[0].click();", item);
                            sleepAndCutoverNewPage(300, driver);

                            // If the code is already in Redis, do not parse the detail page
                            if (!opsForHash.hasKey(AREA_INFO_COMMUNITY_CODE_LJ, code)) {
                                parseDetail(driver, code, buildAreaUrlLj, opsForHash);
                            } else {
                                System.out.println("Current code already exists in Redis: " + code);
                                // Update
                                //                                new  TkBuildingsMonthPriceLj();
                            }


                            driver.close();
                            driver.switchTo().window(windowHandlePage);
                            sleep(200);
                            elements = driver.findElements(By.xpath("//ul[@class='listContent']/li/div[1]/div[1]/a"));
                        }
                        if (x != pageNumStr) {
                            String nextPage = buildAreaUrlLj.getAreaUrl() + "pg" + (x + 1) + "/";
                            driver.get(nextPage);
                            System.out.println("Next page is " + nextPage);
                            sleep(200);
                        }
                    }
                }
            }
            ops.add(SKIP_URLS, buildAreaUrlLj.getAreaUrl());
        } catch (NumberFormatException e) {
            throw new RuntimeException("Exception in worker thread: " + e.getMessage());
        }finally {
            driver.quit();
        }

    }

    /**
     * Parse the detail page
     * @param driver
     * @param communityCode
     * @param buildAreaUrlLj
     * @param opsForHash
     */
    private void parseDetail(ChromeDriver driver, String communityCode, BuildAreaUrlLj buildAreaUrlLj, HashOperations<String, Object, Object> opsForHash) {
        if (null != validElement("//span[@class='xiaoquUnitPrice']", driver)) {
            TkBuildingsMonthPriceLj lj = new TkBuildingsMonthPriceLj();
            lj.setCommunityCode(communityCode);
            String year = String.valueOf(LocalDate.now().getYear());
            // Read the price label once instead of querying the page twice
            String priceDesc = driver.findElement(By.className("xiaoquUnitPriceDesc")).getText();
            if (priceDesc.equals("挂牌均价")) {
                // "挂牌均价" (average listing price) refers to the current month
                lj.setYearmonth(DateFormatUtils.format(new Date(), "yyyyMM"));
            } else {
                // Otherwise the label reads "N月参考均价" (reference average price for month N)
                String monthStr = priceDesc.replace("月参考均价", "");
                String month = String.format("%02d", Integer.parseInt(monthStr));
                lj.setYearmonth(year + month);
            }
            lj.setAvgPrice(Integer.valueOf(driver.findElement(By.className("xiaoquUnitPrice")).getText()));
            lj.setGenerateType("0");
            lj.setCreateBy("1");
            lj.setCreateDate(new Date());
            lj.setUpdateBy("1");
            lj.setUpdateDate(new Date());
            lj.setDelFlag("0");
            System.out.println("Persisting price: " + lj);
        }

        TkBuildingsAreaInfolj infolj = new TkBuildingsAreaInfolj();
        infolj.setDataOrigin("lianjia");
        infolj.setGenerateType("0");
        infolj.setProvince(buildAreaUrlLj.getProvinceId());
        infolj.setCity(buildAreaUrlLj.getCityId());
        infolj.setArea(buildAreaUrlLj.getCountyId());
        infolj.setCommunity(validElement("//h1[@class='detailTitle']", driver) == null ?
                "" : driver.findElement(By.xpath("//h1[@class='detailTitle']")).getText());
        infolj.setCommunityCode(communityCode);
        infolj.setBuildingYear(validElement("//span[text()='建筑年代']", driver) == null ?
                "" : driver.findElement(By.xpath("//span[text()='建筑年代']/parent::div/span[2]")).getText());
        infolj.setBuildingType(validElement("//span[text()='建筑类型']", driver) == null ?
                "" : driver.findElement(By.xpath("//span[text()='建筑类型']/parent::div/span[2]")).getText());
        infolj.setManageCost(validElement("//span[text()='物业费用']", driver) == null ?
                "" : driver.findElement(By.xpath("//span[text()='物业费用']/parent::div/span[2]")).getText());
        infolj.setManageCompany(validElement("//span[text()='物业公司']", driver) == null ?
                "" : driver.findElement(By.xpath("//span[text()='物业公司']/parent::div/span[2]")).getText());
        infolj.setManageDevlop(validElement("//span[text()='开发商']", driver) == null ?
                "" : driver.findElement(By.xpath("//span[text()='开发商']/parent::div/span[2]")).getText());
        infolj.setBuildingCount(validElement("//span[text()='楼栋总数']", driver) == null ?
                "" : driver.findElement(By.xpath("//span[text()='楼栋总数']/parent::div/span[2]")).getText());
        infolj.setRoomCount(validElement("//span[text()='房屋总数']", driver) == null ?
                "" : driver.findElement(By.xpath("//span[text()='房屋总数']/parent::div/span[2]")).getText());
        infolj.setCreateBy("1");
        infolj.setCreateDate(new Date());
        infolj.setUpdateBy("1");
        infolj.setUpdateDate(new Date());
        infolj.setDelFlag("0");
        System.out.println("Persisting community: " + infolj);
    }

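    /**
     * Crawls the area URLs, starting from the city index page.
     * @param driver browser driver
     * @param ops Redis set operations used to record the areas that are already done
     */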
    private void getUrls(ChromeDriver driver, SetOperations ops) {
        driver.get("https://www.lianjia.com/city/");

        int count = 0;
        List<WebElement> elements = driver.findElements(By.xpath("//ul[@class='city_list_ul']/li/div[2]/div/ul/li/a"));
        for (int i = 0; i < elements.size(); i++) {
            WebElement element = elements.get(i);
            String provinceName = element.findElement(By.xpath("./parent::li/parent::ul/parent::div/div")).getText();
            String areaName = element.getText();
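            // isMember returns a Boolean that may be null, so avoid auto-unboxing it
            Boolean memberFlag = ops.isMember(URLS, areaName);
            if (Boolean.TRUE.equals(memberFlag)) {
                System.out.println("Area already crawled, skipping: " + areaName);
                continue;
            }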

            driver.executeScript("arguments[0].click();", element);
            String frontPage = driver.getWindowHandle();
            WebElement ershoufang = null;
            try {
                ershoufang = driver.findElement(By.linkText("小区"));
            } catch (Exception e) {
                ops.add(URLS, areaName);

                sleep(200);
                System.out.println(areaName + ": no \"小区\" (community) link found");
                driver.get("https://www.lianjia.com/city/");
                elements = driver.findElements(By.xpath("//ul[@class='city_list_ul']/li/div[2]/div/ul/li/a"));
                continue;
            }
            driver.executeScript("arguments[0].click();", ershoufang);
            sleepAndCutoverNewPage(500, driver);
            List<WebElement> citys = driver.findElements(By.xpath("//div[@data-role='ershoufang']/div[1]/a"));
            citys.forEach(e -> System.out.println("City-level link: " + e.getText() + " == " + e.getAttribute("href")));

            for (int j = 0; j < citys.size(); j++) {
                String countyName = citys.get(j).getText();
                driver.executeScript("arguments[0].click();", citys.get(j));
                sleep(200);
                if (validElement("//h2[@class='total fl']/span", driver) != null) {
                    String text = driver.findElement(By.xpath("//h2[@class='total fl']/span")).getText();
                    count += Integer.parseInt(text);
                    System.out.println(countyName + ": " + text + " communities");
                    System.out.println("Running total: " + count);
                }

                // findElements returns an empty list when nothing matches (it does not throw),
                // so a single emptiness check covers the "no sub-areas" case
                List<WebElement> areas = driver.findElements(By.xpath("//div[@data-role='ershoufang']/div[2]/a"));
                if (areas.isEmpty()) {
                    citys = driver.findElements(By.xpath("//div[@data-role='ershoufang']/div[1]/a"));
                    saveDataCity(countyName, areaName, provinceName, citys);
                    break;
                }
                saveDataCounty(countyName, areaName, provinceName, areas);

                sleep(100);
                citys = driver.findElements(By.xpath("//div[@data-role='ershoufang']/div[1]/a"));
            }

            ops.add(URLS, areaName);
            driver.close();
            driver.switchTo().window(frontPage);
            driver.get("https://www.lianjia.com/city/");
            sleep(200);
            elements = driver.findElements(By.xpath("//ul[@class='city_list_ul']/li/div[2]/div/ul/li/a"));
        }
        System.out.println("Total: " + count);
    }

    private void saveDataCounty(String countyName, String areaName, String provinceName, List<WebElement> list) {
        for (WebElement element : list) {
            String url = element.getAttribute("href");
            BuildAreaUrlLj buildAreaUrlLj = new BuildAreaUrlLj();
            IdAndNamePO provincepo = queryProvinceCityArea(1, provinceName, null);
            buildAreaUrlLj.setProvinceName(provincepo.getBusinessName());
            buildAreaUrlLj.setProvinceId(provincepo.getBusinessId());
            IdAndNamePO areapo = queryProvinceCityArea(2, areaName, provincepo.getBusinessId());
            buildAreaUrlLj.setCityName(areapo.getBusinessName());
            buildAreaUrlLj.setCityId(areapo.getBusinessId());
            IdAndNamePO countypo = queryProvinceCityArea(3, countyName, areapo.getBusinessId());
            buildAreaUrlLj.setCountyName(countypo.getBusinessName());
            buildAreaUrlLj.setCountyId(countypo.getBusinessId());
            buildAreaUrlLj.setAreaUrl(url);
            buildAreaUrlLj.setCreateTime(new Date());
            buildAreaUrlLj.setUpdateTime(new Date());
            System.out.println("Persisting URL: " + buildAreaUrlLj);
        }
    }

    private void saveDataCity(String countyName, String areaName, String provinceName, List<WebElement> list) {
        for (WebElement element : list) {
            String url = element.getAttribute("href");
            BuildAreaUrlLj buildAreaUrlLj = new BuildAreaUrlLj();
            IdAndNamePO provincepo = queryProvinceCityArea(1, provinceName, null);
            buildAreaUrlLj.setProvinceName(provinceName);
            buildAreaUrlLj.setProvinceId(provincepo.getBusinessId());
            buildAreaUrlLj.setCityName(areaName);
            IdAndNamePO areapo = queryProvinceCityArea(2, areaName, provincepo.getBusinessId());
            buildAreaUrlLj.setCityId(areapo.getBusinessId());
            IdAndNamePO countypo = queryProvinceCityArea(3, countyName, areapo.getBusinessId());
            buildAreaUrlLj.setCountyName(countypo.getBusinessName());
            buildAreaUrlLj.setCountyId(countypo.getBusinessId());

            buildAreaUrlLj.setAreaUrl(url);
            buildAreaUrlLj.setCreateTime(new Date());
            buildAreaUrlLj.setUpdateTime(new Date());
            System.out.println("Persisting URL: " + buildAreaUrlLj);
        }
    }

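    /**
     * Looks up a province, city or county record by name.
     * @param type 1 = province, 2 = city, 3 = county/district
     * @param businessName name to look up
     * @param parentId id of the parent region
     * @return the matching record, or a placeholder with id "-1" when nothing is found
     */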
    private IdAndNamePO queryProvinceCityArea(Integer type, String businessName, String parentId) {
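        if (StringUtils.isNotBlank(parentId)) {
            // These appear to be the province-level codes of the four municipalities
            // (50 Chongqing, 11 Beijing, 31 Shanghai, 12 Tianjin), whose only
            // city-level entry is "市辖区" (municipal districts)
            ArrayList<String> citys = new ArrayList<>(8);
            citys.add("50");
            citys.add("11");
            citys.add("31");
            citys.add("12");
            if (citys.contains(parentId)) {
                businessName = "市辖区";
            }
        }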
        IdAndNamePO po = null;
        try {
            if (type == 1) {
//                po = buildingsAvgMapper.queryProvinceIdByName(businessName);
            } else if (type == 2) {
//                po = buildingsAvgMapper.queryCityIdByName(businessName, parentId);
            } else if (type == 3) {
//                po = buildingsAvgMapper.querycountyIdByName(businessName, parentId);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        if (null == po) {
            po = new IdAndNamePO();
            po.setBusinessId("-1");
            po.setBusinessName(businessName);
        }
        return po;
    }


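    /**
     * Sleeps for the given time, then switches the driver to any window handle
     * that is not recorded in {@code pages}, i.e. the newly opened tab.
     * The method always returns null.
     */
    private static String sleepAndCutoverNewPage(int millis, WebDriver driver) {
        try {
            Thread.sleep(millis);
            for (String handle : driver.getWindowHandles()) {
                if (!pages.contains(handle)) {
                    driver.switchTo().window(handle);
                }
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag instead of silently swallowing the interruption
            Thread.currentThread().interrupt();
        }
        return null;
    }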

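    private static void sleep(int millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag instead of silently swallowing the interruption
            Thread.currentThread().interrupt();
        }
    }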

    /** Returns the first element matching the XPath, or null if the lookup fails. */
    public static WebElement validElement(String str, WebDriver driver) {
        try {
            return driver.findElement(By.xpath(str));
        } catch (Exception e) {
            System.out.println("Element not found: " + str);
        }
        return null;
    }
}
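
The two pure string steps in the crawler above, pulling the community code out of a listing URL and turning the price label into a yyyyMM value, can be tried out without a browser. The following is only a sketch under stated assumptions: the class name DetailParsingSketch and both method names are made up for illustration, the sample URL in the comment is hypothetical, and the year rollback for a reference month later than the current month is an added assumption (the original code always uses the current year).

```java
import java.time.LocalDate;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original project.
final class DetailParsingSketch {

    // Same pattern as the crawler: captures the code between "xiaoqu/" and the next "/".
    private static final Pattern COMMUNITY_CODE = Pattern.compile("xiaoqu/(\\w+)/");

    /**
     * Returns the first community code found in the href, or null if there is none.
     * Example input (made up): https://bj.lianjia.com/xiaoqu/1111027382589/
     */
    static String extractCommunityCode(String href) {
        if (href == null) {
            return null;
        }
        Matcher matcher = COMMUNITY_CODE.matcher(href);
        return matcher.find() ? matcher.group(1) : null;
    }

    /**
     * Converts the price label into yyyyMM.
     * "挂牌均价" maps to the month of today; "N月参考均价" maps to month N.
     * Assumption (not in the original): a reference month later than the
     * current month is taken to belong to the previous year.
     */
    static String toYearMonth(String label, LocalDate today) {
        if ("挂牌均价".equals(label)) {
            return String.format(Locale.ROOT, "%d%02d", today.getYear(), today.getMonthValue());
        }
        int month = Integer.parseInt(label.replace("月参考均价", "").trim());
        int year = month > today.getMonthValue() ? today.getYear() - 1 : today.getYear();
        return String.format(Locale.ROOT, "%d%02d", year, month);
    }
}
```

Passing the date in as a parameter keeps the method deterministic, which makes the January case (a "12月参考均价" label read in January) easy to check.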


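Notes

1. driver.close() closes the current page (window or tab), while driver.quit() exits the browser process. When looping over a list, a crawler that never quits the process will let the browser use up all available memory.
2. After navigating to a page, prefer an explicit wait so that a lookup does not fail because the element has not loaded yet.
3. Do not send requests too frequently. If a task really needs a higher rate, use proxies.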
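
The idea behind the explicit wait in note 2 is simply to poll a condition until it holds or a timeout expires, instead of sleeping for a fixed time. Below is a library-free sketch of that loop; the class name PollingWait and the method waitUntil are hypothetical names chosen for illustration, and treating any runtime exception as "not ready yet" is an assumption of this sketch.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper, shown only to illustrate the polling loop.
final class PollingWait {

    /**
     * Polls the condition until it returns a non-null value or the timeout elapses.
     * Returns null on timeout or if the thread is interrupted.
     * Runtime exceptions thrown by the condition count as "not ready yet".
     */
    static <T> T waitUntil(Supplier<T> condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                T value = condition.get();
                if (value != null) {
                    return value;
                }
            } catch (RuntimeException notReadyYet) {
                // For example: the element is not in the page yet. Retry until the deadline.
            }
            if (System.nanoTime() - deadline >= 0) {
                return null;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }
}
```

In the crawler, the condition would wrap a lookup such as driver.findElement(By.xpath(...)). Selenium itself ships this pattern as WebDriverWait together with ExpectedConditions, which is the better choice in real code; the helper above only shows what such a wait does internally.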

Afterword

Source code for the examples above:

https://download.csdn.net/download/DoAsOnePleases/86772623
