Downloading Very Large Files and Batch-Downloading Files with Python requests

(1) Downloading very large files:

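Downloading a very large file with Python in a single call reads the whole response body into memory, which can exhaust memory when the file is large enough. In that case, use the stream mode of requests (stream=True), which defers reading the body until you iterate over it.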

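The main code is as follows.

iter_content: iterates over the content being downloaded chunk by chunk

iter_lines: iterates over the content being downloaded line by line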
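
The difference between the two iterators can be seen without a network connection. The following is a minimal sketch, assuming a hand-built requests.Response whose raw stream is an in-memory buffer; this is a testing trick rather than the way responses are normally created, and fake_response is a hypothetical helper introduced only for illustration.

```python
import io

import requests


def fake_response(body: bytes) -> requests.Response:
    """Build a Response backed by an in-memory buffer (illustration only)."""
    resp = requests.Response()
    resp.status_code = 200
    resp.raw = io.BytesIO(body)
    return resp


body = b"alpha\nbeta\ngamma\n"

# iter_content yields fixed-size byte chunks and ignores line boundaries.
chunks = list(fake_response(body).iter_content(chunk_size=4))

# iter_lines yields one line at a time with the line ending stripped.
lines = list(fake_response(body).iter_lines())

print(chunks)
print(lines)
```

For binary files, iter_content is the appropriate choice, since iter_lines splits on newline bytes and drops them, so writing its output back to disk would not reproduce the original file.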

    import requests


    def download_file(url, file_pname, chunk_size=1024 * 4):
        """
        url: file URL
        file_pname: path to save the file to
        chunk_size: number of bytes to read per chunk
        """
        # Variant 1: plain streamed request, closed explicitly afterwards
        response_data_file = requests.get(url, stream=True)
        with open(file_pname, 'wb') as f:
            for chunk in response_data_file.iter_content(chunk_size=chunk_size):
                if chunk:
                    f.write(chunk)
        response_data_file.close()


    def download_file_v2(url, file_pname, chunk_size=1024 * 4):
        # Variant 2: use the response as a context manager so the
        # connection is released automatically
        with requests.get(url, stream=True) as req:
            with open(file_pname, 'wb') as f:
                for chunk in req.iter_content(chunk_size=chunk_size):
                    if chunk:
                        f.write(chunk)
    # Large-file download, applied example:
    from tqdm import tqdm


    def Big_Download(session, url_inquire, headers, form_data):
        # A single streamed POST supplies both the headers and the body.
        # verify=False disables TLS certificate verification; use it only for hosts you trust.
        with session.post(url=url_inquire, data=form_data, headers=headers, verify=False, stream=True) as r:
            # Get the file size (None if the server sends no Content-Length)
            file_size = int(r.headers.get('content-length', 0)) or None
            with tqdm(total=file_size, unit='B', unit_scale=True, unit_divisor=1024, ascii=True, desc='Expense.json') as bar:
                with open('Expense.json', 'wb') as fp:
                    for chunk in r.iter_content(chunk_size=512):
                        if chunk:
                            fp.write(chunk)
                            bar.update(len(chunk))
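
Since the example above already reads Content-Length, the same value can be used to detect a truncated download. The following is a minimal sketch, not part of the original code: save_chunks and its expected_size parameter are hypothetical names, and the function accepts any iterable of byte chunks, so response.iter_content(...) could be passed in directly.

```python
import os
import tempfile


def save_chunks(chunks, path, expected_size=None):
    """Write an iterable of byte chunks to path and return the byte count.

    expected_size is the Content-Length value, if known. Content-Length
    counts the bytes on the wire, so the check is only meaningful when
    the body is not compressed in transit.
    """
    written = 0
    with open(path, "wb") as fp:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fp.write(chunk)
                written += len(chunk)
    if expected_size is not None and written != expected_size:
        os.remove(path)  # do not leave a truncated file behind
        raise IOError(f"expected {expected_size} bytes, got {written}")
    return written


# Offline demonstration with in-memory chunks instead of a real response.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
print(save_chunks([b"abc", b"", b"de"], demo_path, expected_size=5))  # prints 5
```

With requests, a call might look like save_chunks(r.iter_content(chunk_size=8192), 'Expense.json', expected_size), where expected_size is taken from r.headers.get('content-length') when that header is present.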

(2) Batch-downloading files:

    # Batch file download
    import requests
    from bs4 import BeautifulSoup

    archive_url = "http://www-personal.umich.edu/~csev/books/py4inf/media/"


    def get_links():
        r = requests.get(archive_url)
        # The 'html5lib' parser must be installed separately (pip install html5lib)
        soup = BeautifulSoup(r.content, 'html5lib')
        # href=True skips <a> tags that have no href attribute
        links = soup.find_all('a', href=True)
        video_links = [archive_url + link['href'] for link in links if link['href'].endswith('mp4')]
        return video_links


    def download_series(video_links):
        for link in video_links:
            file_name = link.split('/')[-1]
            print("Downloading file:%s" % file_name)
            # Stream the response so the whole file is never held in memory
            with requests.get(link, stream=True) as r:
                with open(file_name, 'wb') as f:
                    for chunk in r.iter_content(chunk_size=1024 * 1024):
                        if chunk:
                            f.write(chunk)
            print("%s downloaded!\n" % file_name)
        print("All videos downloaded!")


    if __name__ == "__main__":
        video_links = get_links()
        download_series(video_links)
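
The script above builds each link by concatenating archive_url and the href, which only works when every href is a bare file name. As a hedged alternative, the sketch below resolves links with urljoin from the standard library; build_links and file_name_from are hypothetical helpers, and the example.com URLs are placeholders rather than real download locations.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def build_links(base_url, hrefs, suffix=".mp4"):
    """Resolve hrefs against base_url and keep those whose path ends with suffix."""
    links = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        # Compare against the path only, so a query string does not hide the extension
        if urlsplit(absolute).path.lower().endswith(suffix):
            links.append(absolute)
    return links


def file_name_from(link):
    """Return the last path segment of the URL, without query string or fragment."""
    return posixpath.basename(urlsplit(link).path)


base = "http://example.com/media/"
hrefs = ["a.mp4", "notes.txt", "/other/b.mp4", "http://cdn.example.com/c.MP4?x=1"]
for link in build_links(base, hrefs):
    print(link, "->", file_name_from(link))
```

In get_links, the hrefs would come from [link['href'] for link in links], and file_name_from could replace link.split('/')[-1] in download_series so that query strings do not end up in file names.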
