
JS Reverse Engineering pigai.org: Auto-Submitting Essays Generated by the ERNIE Bot (Wenxin Yiyan) API

The night before last I stayed up way too late and didn't get out of bed until one in the afternoon. The moment my roommate mentioned we had another of those tedious pigai.org (批改网) essay assignments, I was annoyed enough to sit down and write a script that knocks out every unfinished essay automatically. Without further ado, let's get into the analysis.

1. Cookie handling before login

Clear the site's local storage, session storage, and cookies, then refresh the page.

You can see two new cookies appear. Searching the captured traffic for set-cookie

confirms that both cookies are returned by the server.
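As an aside, server-set cookies like these arrive in Set-Cookie response headers. The stdlib SimpleCookie class can parse one; the header value below is made up for illustration, in the same shape as what the site returns:

```python
from http.cookies import SimpleCookie

# A fabricated Set-Cookie header value shaped like the site's
# (a PHPSESSID session cookie plus its path attribute).
raw = "PHPSESSID=8oj8mlkeavc583i7l2kutgen77; path=/"

jar = SimpleCookie()
jar.load(raw)  # parse the header value into name -> Morsel entries

print(jar["PHPSESSID"].value)  # prints the bare session id
```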

2. Sending the login request

Enter the username and password, click Log In, and start capturing packets.

The login request shows up quickly, and the password field in it is encrypted.

I clicked Initiator to inspect the call stack, and...

nothing; it can't be viewed, and only the chain of requests that initiated this one is visible.

Searching for a few keywords quickly turns up the encryption function.

Copy the encryption function into a local JS file and wrap it so we can call it later:

    const JSEncrypt = require('jsencrypt');

    function encrypt(str) {
        var rsa = new JSEncrypt();
        rsa.setPublicKey('MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCYlII16dSwQErDgjIl8BzU4NEL2IzvyWiLNxie3mkpw6eseF/iUVb3bisAFH+lzgnrv/mBOKUMkbqtW2+8en/6r0hj6ctvGT+UOtg4P5LF/jxkbE+cA2fVJK2RaBzeEEbrKOvauVnGkEOvPVl1/NK4NgeN6aSPIK9ECfXcjlEOHwIDAQAB');
        return rsa.encrypt(str);
    }

    function main123(password) {
        return encrypt(password);
    }

With the analysis above done, we can send a request straight at the login endpoint:

    import execjs
    import requests

    # username and password are the pigai.org account credentials

    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
        'Pragma': 'no-cache',
        'Referer': 'https://www.pigai.org/?a=logout',
        'Sec-Fetch-Dest': 'document',
        'Sec-Fetch-Mode': 'navigate',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-User': '?1',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0',
        'sec-ch-ua': '"Microsoft Edge";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
    }
    # first hit the home page to pick up the two server-set cookies
    response = requests.get('https://www.pigai.org/', headers=headers)
    PHPSESSID = response.cookies['PHPSESSID']
    old = response.cookies['old']
    cookies = {
        'PHPSESSID': PHPSESSID,
        'old': old,
    }
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
        'Content-Type': 'application/x-www-form-urlencoded',
        # 'Cookie': 'old=2012; PHPSESSID=8oj8mlkeavc583i7l2kutgen77',
        'Origin': 'https://www.pigai.org',
        'Pragma': 'no-cache',
        'Referer': 'https://www.pigai.org/',
        'Sec-Fetch-Dest': 'document',
        'Sec-Fetch-Mode': 'navigate',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-User': '?1',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0',
        'sec-ch-ua': '"Microsoft Edge";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
    }
    params = {
        'a': 'login',
    }
    data = {
        'username': username,
        'password': '',
        'checkhash': '',
        # run the ported JS encryption against the plaintext password
        'password_encrypt': execjs.compile(open('密码加密.js', 'r', encoding='gbk').read()).call('main123', password),
    }
    response = requests.post('https://www.pigai.org/index.php', params=params, cookies=cookies, headers=headers, data=data, allow_redirects=False)

With that, we are logged in. Notice the last statement of the code above:

response = requests.post('https://www.pigai.org/index.php', params=params, cookies=cookies, headers=headers, data=data,allow_redirects=False)

Here I pass allow_redirects=False, which disables redirect handling. Why is that necessary?

Click the login request we just captured, then open its Cookies tab.

The server returns quite a few cookies. These are what identify our session to the server, so we need to save them.

If I had not disabled redirects, though, the cookies captured from the final request would look like this:

What happened here? Click Preview to find out.

It turns out that once the credentials are verified, the server redirects us straight to the logged-in home page. The identity cookies are set on the redirect response itself, so if we let requests follow the redirect, we never get to capture them.

3. Handling the Baidu Analytics third-party cookies

Open the Storage panel and look at the cookies stored locally.

The three cookies above are actually third-party cookies from Baidu Tongji (Baidu Analytics), which pigai.org embeds. Baidu Tongji is a third-party service that tracks visitor information and behavior, and its official documentation describes these cookies.

Hm_lpvt_ and Hm_lvt_ can both simply be set to the current Unix timestamp (in effect pretending this is our first visit).
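A minimal sketch of generating those two values. Judging from the captured cookie values (10 digits, e.g. 1718356284), Baidu's Hm_ cookies hold second-resolution Unix timestamps:

```python
import time

def hm_timestamp() -> str:
    # Hm_lvt_* / Hm_lpvt_* hold a Unix timestamp in seconds,
    # going by the 10-digit values seen in captured traffic.
    return str(int(time.time()))

print(hm_timestamp())
```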

HMACCOUNT_BFESS is a visitor ID assigned by the Baidu Tongji endpoint. We just need to find a JS file like this one in the site's network traffic

and inspect the server's response:

it assigns us a visitor ID directly. Because the response comes back as JS code, we need a regular expression to extract the field:

    import re
    import requests

    headers = {
        'Accept': '*/*',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
        # 'Cookie': 'HMACCOUNT_BFESS=7C04F9FF37E951F6',
        'Pragma': 'no-cache',
        'Referer': 'https://www.pigai.org/',
        'Sec-Fetch-Dest': 'script',
        'Sec-Fetch-Mode': 'no-cors',
        'Sec-Fetch-Site': 'cross-site',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0',
        'sec-ch-ua': '"Microsoft Edge";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
    }
    # hm.js is the Baidu Tongji loader; its body embeds our visitor id
    response = requests.get('https://hm.baidu.com/hm.js?3f46f9c09663bf0ac2abdeeb95c7e516', headers=headers)
    match = re.search(r"hca:'([0-9A-Fa-f]+)'", response.text)
    BFESS = match.group(1)
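The hca regex can be sanity-checked offline against a snippet shaped like the hm.js response. The fragment below is fabricated for illustration, not real Baidu code:

```python
import re

# Fabricated fragment mimicking the shape of the hm.js response,
# which embeds the visitor id as hca:'<hex>'.
sample_js = "var _hmt=_hmt||[];(function(){var cfg={hca:'7C04F9FF37E951F6',dm:''};})();"

match = re.search(r"hca:'([0-9A-Fa-f]+)'", sample_js)
visitor_id = match.group(1) if match else None
print(visitor_id)  # prints 7C04F9FF37E951F6
```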

4. Batch-fetching the unfinished essay assignments

Next, click Unfinished.

We capture the request. It should return the info for every essay I haven't finished, but since this script already cleared all of mine, nothing shows up here. I use XPath to extract the assignment number of every unfinished essay and collect them in a list:

    import time
    from lxml import etree

    cookies = {
        'PHPSESSID': PHPSESSID,
        'old': old,
        '_JUKU_USER': response.cookies['_JUKU_USER'],
        'isPrize': response.cookies['isPrize'],
        # captured Hm_ cookie values are 10-digit second timestamps
        'Hm_lpvt_3f46f9c09663bf0ac2abdeeb95c7e516': str(int(time.time())),
        'Hm_lvt_3f46f9c09663bf0ac2abdeeb95c7e516': str(int(time.time())),
        'HMACCOUNT_BFESS': BFESS,
        'JK_GCNT': '0',
    }
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
        # 'Cookie': 'old=2012; PHPSESSID=eqnfap3mhnh5i894d1e0dorr44; _JUKU_USER=%7B%22i%22%3A%2230209707%22%2C%22u%22%3A%22N633f8e15a0a8a%22%2C%22u2%22%3A%22%5Cu5434%5Cu6bd3%5Cu535a%22%2C%22k%22%3A%22432d6aaf111ccba5cddcce2f653223b1%22%2C%22img%22%3A%22%22%2C%22ts%22%3A2%2C%22s%22%3A%22%5Cu56db%5Cu5ddd%5Cu5927%5Cu5b66%22%2C%22iv%22%3A0%2C%22st%22%3A%220%22%2C%22no%22%3A%222022141530099%22%2C%22cl%22%3A%22105%22%2C%22it%22%3A%221%22%7D; isPrize=0; JK_GCNT=0; Hm_lvt_3f46f9c09663bf0ac2abdeeb95c7e516=1718356284; Hm_lpvt_3f46f9c09663bf0ac2abdeeb95c7e516=1718356284',
        'Pragma': 'no-cache',
        'Referer': 'https://www.pigai.org/index.php?c=write&f2=login',
        'Sec-Fetch-Dest': 'document',
        'Sec-Fetch-Mode': 'navigate',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-User': '?1',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0',
        'sec-ch-ua': '"Microsoft Edge";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
    }
    params = {
        'c': 'write',
        'a': 'waitComplete',
        'bf': '4',
    }
    response = requests.get('https://www.pigai.org/index.php', params=params, cookies=cookies, headers=headers)
    tree = etree.HTML(response.text)
    paragraphs = tree.xpath('//*[@id="essayList"]//ul')
    essayLisy = []
    # the first <ul> is the table header row, so skip it
    for ul in paragraphs[1:]:
        essayLisy.append(ul.xpath('.//li[1]/text()')[0])
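The header-skip logic can be checked on a minimal fragment with the stdlib XML parser (the markup below is a guess at the page's shape, using xml.etree in place of lxml):

```python
import xml.etree.ElementTree as ET

# Fabricated fragment mimicking the essay list: a header <ul>
# followed by one <ul> per assignment, whose first <li> is the rid.
html = """<div id="essayList">
  <ul><li>rid</li><li>title</li></ul>
  <ul><li>1234567</li><li>Essay one</li></ul>
  <ul><li>7654321</li><li>Essay two</li></ul>
</div>"""

tree = ET.fromstring(html)
uls = tree.findall(".//ul")
rids = [ul.find("li").text for ul in uls[1:]]  # skip the header row
print(rids)  # prints ['1234567', '7654321']
```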

With the assignment numbers in hand, we can request each essay's writing page directly.

Note that the server returns a _fromCode cookie on this request; be sure to save it.

XPath pulls the essay prompt straight out of the page:

    for rid in essayLisy:
        headers = {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
            'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
            'Cache-Control': 'no-cache',
            'Connection': 'keep-alive',
            # 'Cookie': 'old=2012; PHPSESSID=02k4hi1jt2glddbfhvd381ek47; _JUKU_USER=%7B%22i%22%3A%2230209707%22%2C%22u%22%3A%22N633f8e15a0a8a%22%2C%22u2%22%3A%22%5Cu5434%5Cu6bd3%5Cu535a%22%2C%22k%22%3A%22432d6aaf111ccba5cddcce2f653223b1%22%2C%22img%22%3A%22%22%2C%22ts%22%3A2%2C%22s%22%3A%22%5Cu56db%5Cu5ddd%5Cu5927%5Cu5b66%22%2C%22iv%22%3A0%2C%22st%22%3A%220%22%2C%22no%22%3A%222022141530099%22%2C%22cl%22%3A%22105%22%2C%22it%22%3A%221%22%7D; isPrize=0; JK_GCNT=0; Hm_lvt_3f46f9c09663bf0ac2abdeeb95c7e516=1718358190; Hm_lpvt_3f46f9c09663bf0ac2abdeeb95c7e516=1718358199',
            'Pragma': 'no-cache',
            'Referer': 'https://www.pigai.org/index.php?c=write&a=waitComplete&bf=4',
            'Sec-Fetch-Dest': 'document',
            'Sec-Fetch-Mode': 'navigate',
            'Sec-Fetch-Site': 'same-origin',
            'Sec-Fetch-User': '?1',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0',
            'sec-ch-ua': '"Microsoft Edge";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
            'sec-ch-ua-mobile': '?0',
            'sec-ch-ua-platform': '"Windows"',
        }
        params = {
            'c': 'v2',
            'a': 'write',
            'rid': rid,
        }
        response = requests.get('https://www.pigai.org/index.php', params=params, cookies=cookies, headers=headers)
        # the writing page sets a _fromCode cookie the submit call will need
        cookies.update({'_fromCode': response.cookies['_fromCode']})
        tree = etree.HTML(response.text)
        title = tree.xpath('string(//*[@id="request_y"])')

5. Calling the ERNIE Bot (Wenxin Yiyan) API

Big companies are great: the model API is just there for anyone to use.

Open the Baidu AI Cloud console, search for the Qianfan large model platform, and create an application.

Read the official API documentation.

There is an interactive API debugging console there that is very handy.

Here is my calling code:

    import json
    import requests

    # API_KEY and SECRET_KEY come from the application created in the console

    def main(title):
        url = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/ernie-3.5-8k-0205?access_token=" + get_access_token()
        payload = json.dumps({
            "messages": [
                {
                    "role": "user",
                    "content": title
                }
            ],
            "temperature": 0.8,
            "top_p": 0.8,
            "penalty_score": 1,
            "disable_search": False,
            "enable_citation": False,
            "response_format": "text"
        })
        headers = {
            'Content-Type': 'application/json'
        }
        response = requests.post(url, headers=headers, data=payload)
        return response.json().get('result')

    def get_access_token():
        """Generate an auth token (Access Token) from the AK/SK.
        :return: access_token, or None on error
        """
        url = "https://aip.baidubce.com/oauth/2.0/token"
        params = {"grant_type": "client_credentials", "client_id": API_KEY, "client_secret": SECRET_KEY}
        return str(requests.post(url, params=params).json().get("access_token"))
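On failure, Baidu's AIP endpoints conventionally return a JSON body carrying error_code and error_msg instead of result, so a small defensive wrapper around the response may be worthwhile (a sketch under that assumption):

```python
def parse_result(body: dict) -> str:
    # Success responses carry the generated text under "result";
    # failures carry "error_code" and "error_msg" instead.
    if "error_code" in body:
        raise RuntimeError(f"API error {body['error_code']}: {body.get('error_msg')}")
    return body["result"]

# usage: essay = parse_result(response.json())
```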

Call main with the essay prompt and it returns a finished essay.

6. Submitting the essay

Just send a request at this endpoint. It's very simple: the essay only goes through plain URL encoding.

    from urllib.parse import quote

    essay = main(title)
    headers = {
        'Accept': '*/*',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        # 'Cookie': 'old=2012; PHPSESSID=javkrsajhm585dur0nt1a6llf1; _JUKU_USER=%7B%22i%22%3A%2230209707%22%2C%22u%22%3A%22N633f8e15a0a8a%22%2C%22u2%22%3A%22%5Cu5434%5Cu6bd3%5Cu535a%22%2C%22k%22%3A%22432d6aaf111ccba5cddcce2f653223b1%22%2C%22img%22%3A%22%22%2C%22ts%22%3A2%2C%22s%22%3A%22%5Cu56db%5Cu5ddd%5Cu5927%5Cu5b66%22%2C%22iv%22%3A0%2C%22st%22%3A%220%22%2C%22no%22%3A%222022141530099%22%2C%22cl%22%3A%22105%22%2C%22it%22%3A%221%22%7D; isPrize=0; JK_GCNT=0; Hm_lvt_3f46f9c09663bf0ac2abdeeb95c7e516=1718385312; _fromCode=692374; Hm_lpvt_3f46f9c09663bf0ac2abdeeb95c7e516=1718385738',
        'Origin': 'https://www.pigai.org',
        'Pragma': 'no-cache',
        'Referer': 'https://www.pigai.org/index.php?c=v2&a=write&rid=11111&eid=',
        'Sec-Fetch-Dest': 'empty',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Site': 'same-origin',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua': '"Microsoft Edge";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
    }
    params = {
        'c': 'ajax',
        'a': 'postSave',
    }
    data = {
        'utContent': quote(essay),
        'utTitle': quote('第' + rid + '号 '),
        'bzold': '',
        'bz': '',
        'fileName': '',
        'filePath': '',
        'rid': rid,
        'eid': '',
        'type': '0',
        'utype': '',
        'gao': '1',
        'uncheck': '',
        'tiku_id': '0',
        'engine': '',
        'fromCode': cookies['_fromCode'],
        'autoDel': '',
        'stu_class': '',
    }
    response = requests.post('https://www.pigai.org/index.php', params=params, cookies=cookies, headers=headers, data=data)
    print(response.text)
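For reference, this is all quote does to the essay text before it goes into the form body (the sample string is made up):

```python
from urllib.parse import quote

# quote percent-encodes everything except letters, digits, and _.-~/
sample = "Hello World! It's a test."
encoded = quote(sample)
print(encoded)  # prints Hello%20World%21%20It%27s%20a%20test.
```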

And that's the whole thought process behind the script. Mom no longer has to worry about me not finishing my essays.
