
Data Collection: Extracting Cookies with selenium for Automatic Login


Preface


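  • I needed this for work, so here is a brief write-up
  • This post covers automatic login with selenium
  • If I have misunderstood anything, corrections are welcome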

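Each person has only one true duty: to find oneself, and then to hold fast to it in one's heart for a lifetime, wholeheartedly and without rest. All other paths are incomplete; they are a means of escape, a cowardly return to the ideals of the masses, drifting with the current, a fear of one's own inner self. (Hermann Hesse, "Demian")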


Users who are not logged in yet

Saving the cookie

Assume the login username is: chinaz_7356287

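We need to collect some CDN data, so the script below logs in to the 站长之家 (Chinaz) CDN site, https://cdn.chinaz.com/. The code is simple: it opens the login page, waits until you have logged in manually, and then saves the session cookies to cookie.txt.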

import json
import time

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get("https://cdn.chinaz.com/")
time.sleep(4)
# Site-specific: click the login link in the user bar
browser.find_element(By.CSS_SELECTOR, ".userbar").find_element(By.TAG_NAME, "a").click()
print("Waiting for login...")
while True:
    time.sleep(10)
    try:
        # Site-specific: check whether the username is shown on the page
        username = browser.find_element(By.CSS_SELECTOR, ".username").text
    except NoSuchElementException:
        print("Logged-in page has not appeared yet, retrying...")
        continue
    if username == "chinaz_7356287":
        print("Logged in, saving cookie...")
        with open('cookie.txt', 'w', encoding='utf-8') as f:
            json.dump(browser.get_cookies(), f)
        browser.quit()
        print("Cookie saved, browser closed automatically.")
        break  # stop polling once the cookie has been written
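The loop above polls forever if the login never happens. As a minimal sketch of one way to bound the wait, the helper below retries a check until it succeeds or a timeout passes. The name `wait_until` and its default values are hypothetical and are not part of selenium.

```python
import time


def wait_until(predicate, timeout=120.0, interval=3.0):
    """Call predicate() every `interval` seconds until it returns True.

    Returns True on success, or False if `timeout` seconds pass first.
    Exceptions raised by the predicate are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if predicate():
                return True
        except Exception:
            pass  # e.g. the element is not on the page yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the script above, the predicate could be a lambda that compares the text of the ".username" element with the expected username; the cookie would then be saved only when `wait_until` returns True.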

The JSON data obtained

[
    {
        "domain": ".chinaz.com",
        "expiry": 1693423590,
        "httpOnly": false,
        "name": "chinaz_topuser",
        "path": "/",
        "sameSite": "Lax",
        "secure": false,
        "value": "92da5ff9-69ac-c0b7-73ab-040cb089d48f"
    },
    {
        "domain": ".chinaz.com",
        "expiry": 1693884390,
        "httpOnly": true,
        "name": "ucvalidate",
        "path": "/",
        "sameSite": "None",
        "secure": true,
        "value": "9aba1eb9-8b70-019b-8352-43ff4719eb84"
    },
    {
        "domain": ".chinaz.com",
        "httpOnly": false,
        "name": "Hm_lpvt_ca96c3507ee04e182fb6d097cb2a1a4c",
        "path": "/",
        "sameSite": "Lax",
        "secure": false,
        "value": "1692588387"
    },
    {
        "domain": ".chinaz.com",
        "expiry": 1724124387,
        "httpOnly": false,
        "name": "Hm_lvt_ca96c3507ee04e182fb6d097cb2a1a4c",
        "path": "/",
        "sameSite": "Lax",
        "secure": false,
        "value": "1692588387"
    }
]
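The "expiry" field in the data above is a Unix timestamp in seconds, and an entry without it (such as Hm_lpvt_...) is a session cookie. Before reusing a saved file it may be worth checking whether the cookies are still valid. The helper below is a small sketch of that check; the name `drop_expired` is hypothetical.

```python
import time


def drop_expired(cookies, now=None):
    """Return only the cookies that have not expired yet.

    Cookies without an "expiry" key are session cookies and are kept.
    """
    if now is None:
        now = time.time()
    return [c for c in cookies if "expiry" not in c or c["expiry"] > now]
```

If the login cookie is missing from the result, the saved file is probably stale and the login step needs to be repeated.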

Logging in automatically with the cookie

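from seleniumwire import webdriver
import json
import time

# Automatic login
browser = webdriver.Chrome()
# Raw string so the backslashes in the Windows path are taken literally
with open(r'C:\Users\山河已无恙\Documents\GitHub\reptile_demo\demo\cookie.txt', 'r', encoding='utf-8') as f:
    cookies = json.load(f)

# Open the site first: a cookie can only be added for the domain of the current page
browser.get('https://cdn.chinaz.com/')
for cookie in cookies:
    browser.add_cookie(cookie)

# Load the page again so the request is sent with the cookies
browser.get('https://cdn.chinaz.com/')

# Keep the browser window open
time.sleep(10000)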

Users who are already logged in

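For a user who is already logged in, we can read the cookie information directly from the current browser session and then replace the corresponding values in the JSON below.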

[
    {
        "domain": ".chinaz.com",
        "expiry": 1693423590,
        "httpOnly": false,
        "name": "chinaz_topuser",
        "path": "/",
        "sameSite": "Lax",
        "secure": false,
        "value": "92da5ff9-69ac-c0b7-73ab-040cb089d48f"
    },
    {
        "domain": ".chinaz.com",
        "expiry": 1693884390,
        "httpOnly": true,
        "name": "ucvalidate",
        "path": "/",
        "sameSite": "None",
        "secure": true,
        "value": "9aba1eb9-8b70-019b-8352-43ff4719eb84"
    },
    {
        "domain": ".chinaz.com",
        "httpOnly": false,
        "name": "Hm_lpvt_ca96c3507ee04e182fb6d097cb2a1a4c",
        "path": "/",
        "sameSite": "Lax",
        "secure": false,
        "value": "1692588387"
    },
    {
        "domain": ".chinaz.com",
        "expiry": 1724124387,
        "httpOnly": false,
        "name": "Hm_lvt_ca96c3507ee04e182fb6d097cb2a1a4c",
        "path": "/",
        "sameSite": "Lax",
        "secure": false,
        "value": "1692588387"
    }
]

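In the browser console, run console.log(document.cookie); to print the cookies, then copy the corresponding values into the JSON above. Note that document.cookie does not expose HttpOnly cookies (such as ucvalidate above), so those values have to be copied from the browser's developer tools instead.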

console.log(document.cookie);

VM64:1 toolUserGrade=DA558BECA59696EB6D6F7073658259093B6A1006BF1EE9768104ED4EF435DFFE7A7CCE826E9718B7BF5917ABBB8378EB9F2A2DF83F2D261B6ABB5FF77D3EB74948E7E207D35739840897873E9CED6A06188A7269E8D6621D2A3EB35366EE2939BD52587A8E5FD9CFD5B7FADCEA248B51B971062D27AB402FF41885786B87AD00; bbsmax_user=096a40c7-f2ba-8f87-f56a-bb8c65838157; chinaz_zxuser=c55d2eaa-e630-99a5-3d19-82c6cbadc2e3; Hm_lvt_ca96c3507ee04e182fb6d097cb2a1a4c=1692590966; Hm_lpvt_ca96c3507ee04e182fb6d097cb2a1a4c=1692590966; toolbox_urls=1.180.204.161|jiuzhoufangyuan.cn|daoxinwuliu.com|www.lzfjyl.com|lzfjyl.com|herunnongye.com|danyu.com.cn|www.danyu.com.cn|encrypt-k-vod.xet.tech; chinaz_topuser=f38f3b0f-4c0d-57d8-8f2d-35180d6e13a5

After that, you can log in the same way as before.
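Copying values by hand is error-prone, so the replacement step can also be scripted. The following is a minimal sketch, assuming the console output has been pasted into a string without the "VM64:1" prefix. The function names `parse_cookie_string` and `update_cookie_values` are hypothetical.

```python
def parse_cookie_string(cookie_string):
    """Split a `document.cookie` string into a {name: value} dict."""
    pairs = {}
    for part in cookie_string.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:  # skip empty or malformed fragments
            pairs[name] = value
    return pairs


def update_cookie_values(cookies, cookie_string):
    """Return a copy of `cookies` with values replaced from `cookie_string`.

    Entries whose name does not appear in the string are left unchanged.
    """
    fresh = parse_cookie_string(cookie_string)
    return [{**c, "value": fresh.get(c["name"], c["value"])} for c in cookies]
```

The updated list can be written back to cookie.txt with json.dump and then loaded by the login script as before.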

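References for parts of this post

© Copyright of the content behind any reference links in this post belongs to the original authors. Please let me know of any infringement.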



© 2018-2023 liruilonger@gmail.com, All rights reserved. Attribution-NonCommercial-ShareAlike (CC BY-NC-SA 4.0)
