
I Downloaded Bilibili Anime Videos Together with Their Danmaku...


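As everyone knows, Bilibili is a magical website.

"The people in there are all so talented, and they talk so nicely. I really love it there."

Bilibili is home to many hidden-gem uploaders (UP主) whose videos are of very high quality.

Sometimes we want to download a video, but a Bilibili video without danmaku (the comments that scroll across the screen) has no soul, so the danmaku cannot be left out either.

So I scraped and downloaded Bilibili videos together with their danmaku.

The complete code: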

# -*- coding: utf-8 -*-
import json
import math
import os
import re
import time
from contextlib import closing

import requests
from bs4 import BeautifulSoup
from win32com.client import Dispatch  # pywin32, Windows only

import xml2ass  # provides Danmaku2ASS, used below to convert XML danmaku to ASS


def addTasktoXunlei(down_url):
    """Queue a download task in Xunlei (Thunder) through its COM interface."""
    flag = False
    o = Dispatch('ThunderAgent.Agent64.1')
    try:
        o.AddTask(down_url, "", "", "", "", -1, 0, 5)
        o.CommitTasks()
        flag = True
    except Exception as e:
        print(e)
        print("AddTask failed!")
    return flag


def get_download_url(arcurl):
    """Resolve a Bilibili video page URL to a direct download URL."""
    jiexi_url = 'xxx'  # placeholder: URL of the video-parsing service
    payload = {'url': arcurl}
    jiexi_req = requests.get(jiexi_url, params=payload)
    jiexi_bf = BeautifulSoup(jiexi_req.text, 'html.parser')
    jiexi_dn_url = jiexi_bf.iframe.get('src')
    dn_req = requests.get(jiexi_dn_url)
    dn_bf = BeautifulSoup(dn_req.text, 'html.parser')
    # The first inline <script> (one without a src attribute) carries the video URL.
    video_script = dn_bf.find('script', src=None)
    DPlayer = str(video_script.string)
    download_url = re.findall(r"'(http[s]?:(?:[a-zA-Z]|[0-9]|[$-_@.&~+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+)'", DPlayer)[0]
    download_url = download_url.replace('\\', '')
    return download_url


space_url = 'https://space.bilibili.com/280793434'
search_url = 'https://api.bilibili.com/x/space/arc/search'
mid = space_url.split('/')[-1]
sess = requests.Session()
search_headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36',
                  'Accept-Language': 'zh-CN,zh;q=0.9',
                  'Accept-Encoding': 'gzip, deflate, br',
                  'Accept': 'application/json, text/plain, */*'}

# Get the total number of videos (one item per page is enough to read the count).
# Note: verify=False disables TLS certificate verification.
ps = 1
pn = 1
search_params = {'mid': mid,
                 'ps': ps,
                 'tid': 0,
                 'pn': pn}
req = sess.get(url=search_url, headers=search_headers, params=search_params, verify=False)
info = json.loads(req.text)
video_count = info['data']['page']['count']

# Walk through the listing, 10 videos per page.
ps = 10
page = math.ceil(video_count / ps)
videos_list = []
for pn in range(1, page + 1):
    search_params = {'mid': mid,
                     'ps': ps,
                     'tid': 0,
                     'pn': pn}
    req = sess.get(url=search_url, headers=search_headers, params=search_params, verify=False)
    info = json.loads(req.text)
    vlist = info['data']['list']['vlist']
    for video in vlist:
        title = video['title']
        bvid = video['bvid']
        vurl = 'https://www.bilibili.com/video/' + bvid
        videos_list.append([title, vurl])
print('Found %d videos' % len(videos_list))

all_video = {}
# Download the first 10 videos
for video in videos_list[:10]:
    download_url = get_download_url(video[1])
    print(video[0] + ':' + download_url)
    # Record the file name Xunlei will use and the name we want in the end
    xunlei_video_name = download_url.split('?')[0].split('/')[-1]
    filename = video[0]
    for c in '´☆❤◦\\/:*?"<>| ':
        filename = filename.replace(c, '')
    save_video_name = filename + '.mp4'
    all_video[xunlei_video_name] = save_video_name
    addTasktoXunlei(download_url)

    # Download the danmaku
    danmu_name = filename + '.xml'
    danmu_ass = filename + '.ass'
    oid = download_url.split('/')[6]
    danmu_url = 'https://api.bilibili.com/x/v1/dm/list.so?oid={}'.format(oid)
    danmu_header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36',
                    'Accept': '*/*',
                    'Accept-Encoding': 'gzip, deflate, br',
                    'Accept-Language': 'zh-CN,zh;q=0.9'}
    with closing(sess.get(danmu_url, headers=danmu_header, stream=True, verify=False)) as response:
        if response.status_code == 200:
            with open(danmu_name, 'wb') as file:
                for data in response.iter_content(chunk_size=8192):
                    file.write(data)
        else:
            print('Bad response for the danmaku request')
    time.sleep(0.5)
    # Convert the XML danmaku into a 1280x720 ASS subtitle file
    xml2ass.Danmaku2ASS(danmu_name, danmu_ass, 1280, 720)

# Rename each video once Xunlei has finished downloading it
for key, item in all_video.items():
    while key not in os.listdir('./'):
        time.sleep(1)
    os.rename(key, item)
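
The script above removes characters such as `\ / : * ? " < > |` and spaces from each video title before using it as a file name. A minimal sketch of the same idea as a reusable helper might look like the following. The function name `sanitize_filename` and the `untitled` fallback are assumptions made for this example and do not appear in the script itself.

```python
# The same set of characters the script strips from a title.
FORBIDDEN = '´☆❤◦\\/:*?"<>| '


def sanitize_filename(title, fallback='untitled'):
    """Return the title with forbidden characters removed (hypothetical helper)."""
    cleaned = ''.join(c for c in title if c not in FORBIDDEN)
    # A title made only of forbidden characters would leave an empty name,
    # so fall back to a fixed placeholder in that case.
    return cleaned or fallback


if __name__ == '__main__':
    print(sanitize_filename('My Video: Part 1/2?'))
```

Keeping this in one function means the `.mp4`, `.xml`, and `.ass` names are all built from the same cleaned title.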

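Before converting the downloaded danmaku file to ASS, it can help to look at what it contains. The sketch below is only an illustration: it assumes that each comment is a `<d>` element whose `p` attribute is a comma-separated list starting with the appearance time in seconds. That layout is an assumption of this example, not something the script above states, and the name `parse_danmaku` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_danmaku(xml_text):
    """Return (seconds, text) pairs sorted by appearance time.

    Assumption: every comment is a <d> element whose `p` attribute begins
    with the appearance time in seconds.
    """
    root = ET.fromstring(xml_text)
    comments = []
    for d in root.iter('d'):
        fields = d.get('p', '').split(',')
        try:
            seconds = float(fields[0])
        except ValueError:
            continue  # skip entries without a usable timestamp
        comments.append((seconds, d.text or ''))
    return sorted(comments)


# Hypothetical sample data in the assumed layout.
SAMPLE = (
    '<i>'
    '<d p="12.5,1,25,16777215,0,0,0,0">second</d>'
    '<d p="3.0,1,25,16777215,0,0,0,0">first</d>'
    '</i>'
)

if __name__ == '__main__':
    for seconds, text in parse_danmaku(SAMPLE):
        print('%7.2f  %s' % (seconds, text))
```

For a real file, the bytes read with `open(danmu_name, 'rb').read()` can be passed to `parse_danmaku` directly, since `ET.fromstring` accepts bytes as well as text.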
