
Benchmarking whisper and fast_whisper (faster-whisper) transcription time on data downloaded from Hugging Face


Shorter audio clips: https://huggingface.co/datasets/PolyAI/minds14/viewer/en-US

Longer audio clips: https://huggingface.co/datasets/librispeech_asr?row=8

This round of testing uses only the shorter clips for now.

Testing with fast_whisper

For download and installation, just follow the official documentation.

Error message:

Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
Please make sure libcudnn_ops_infer.so.8 is in your library path!

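Fix:

Find the directory that contains libcudnn_ops_infer.so.8. On my machine it is the directory used in the command below. Export it in the terminal before starting Python:

export LD_LIBRARY_PATH=/opt/audio/venv/lib/python3.10/site-packages/nvidia/cudnn/lib:$LD_LIBRARY_PATH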
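If you are not sure where the library lives, a small search can locate it. The following is a minimal sketch, not part of the original test: the helper name `find_library_dirs` is hypothetical, and it assumes the cuDNN wheel was installed somewhere under the active interpreter's import paths.

```python
import os
import sys


def find_library_dirs(filename, search_roots):
    """Return the sorted directories under search_roots that contain filename."""
    found = set()
    for root in search_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if filename in filenames:
                found.add(dirpath)
    return sorted(found)


if __name__ == "__main__":
    # Search the interpreter's import paths, which include site-packages.
    roots = [p for p in sys.path if os.path.isdir(p)]
    for lib_dir in find_library_dirs("libcudnn_ops_infer.so.8", roots):
        # Print a ready-to-paste export line for each match.
        print(f"export LD_LIBRARY_PATH={lib_dir}:$LD_LIBRARY_PATH")
```

Running it inside the virtualenv prints one export line per directory that holds the file; if nothing is printed, the library is probably installed outside the Python environment.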

test_fast_whisper.py

    import os
    import time
    import unittest

    import openpyxl
    from datasets import load_dataset
    from faster_whisper import WhisperModel
    from pydub import AudioSegment

    PROXY = "http://10.10.10.178:7890"


    class TestFastWhisper(unittest.TestCase):
        def test_fastwhisper(self):
            # Set the HTTP and HTTPS proxy
            for name in ("http_proxy", "HTTP_PROXY", "https_proxy", "HTTPS_PROXY"):
                os.environ[name] = PROXY
            # LD_LIBRARY_PATH must be exported in the terminal before Python starts:
            # the dynamic loader reads it at process startup, and Python does not
            # expand "$LD_LIBRARY_PATH" inside a string, so setting it here has no effect.
            print("load whisper")
            # Use fast_whisper; run on GPU with FP16
            model_size = "large-v2"
            fast_whisper_model = WhisperModel(model_size, device="cuda", compute_type="float16")
            minds_14 = load_dataset("PolyAI/minds14", "en-US", split="train")  # for en-US

            # Create a workbook with one worksheet and write the header row
            workbook = openpyxl.Workbook()
            worksheet = workbook.active
            worksheet.append([
                "Audio Path",
                "Audio Duration (seconds)",
                "Audio Size (MB)",
                "Correct Text",
                "Transcribed Text",
                "Cost Time (seconds)",
            ])

            for each in minds_14:
                audio_path = each["path"]
                # pydub reports length in milliseconds
                audio_duration = len(AudioSegment.from_file(audio_path)) / 1000
                audio_size = os.path.getsize(audio_path) / (1024 * 1024)
                correct_text = each["transcription"]

                tran_start_time = time.time()
                segments, info = fast_whisper_model.transcribe(audio_path, beam_size=5)
                # segments is a generator; the transcription actually runs here
                text = "".join(segment.text for segment in segments)
                cost_time = time.time() - tran_start_time

                print("Detected language '%s' with probability %f" % (info.language, info.language_probability))
                print("Audio Path:", audio_path)
                print("Audio Duration (seconds):", audio_duration)
                print("Audio Size (MB):", audio_size)
                print("Correct Text:", correct_text)
                print("Transcription Time (seconds):", cost_time)
                print("Transcribed Text:", text)
                worksheet.append([audio_path, audio_duration, audio_size, correct_text, text, cost_time])

            workbook.save("fast_whisper_output_data.xlsx")
            print("Data saved to fast_whisper_output_data.xlsx")


    if __name__ == '__main__':
        unittest.main()

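Testing with whisper

For download and installation, just follow the official documentation. The code is similar to the code above.

Visualizing the test results

I am not very familiar with Numbers, so the chart is rough, but it is good enough for a quick look.

fast_whisper is clearly faster.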
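The whisper version of the script is not shown in the article. As a rough sketch of how the transcription step might differ, assuming the openai-whisper package and the same large-v2 model and beam size as the fast_whisper run: `whisper.load_model` and `model.transcribe` are the package's documented calls, while the helper names `timed_transcribe` and `make_openai_whisper_transcriber` are hypothetical.

```python
import time


def timed_transcribe(transcribe, audio_path):
    """Call transcribe(audio_path) and return (text, elapsed_seconds)."""
    start = time.perf_counter()
    text = transcribe(audio_path)
    return text, time.perf_counter() - start


def make_openai_whisper_transcriber(model_size="large-v2", device="cuda"):
    """Build a path -> text callable backed by the openai-whisper package."""
    import whisper  # imported lazily so the timing helper works without it

    model = whisper.load_model(model_size, device=device)

    def transcribe(audio_path):
        # transcribe() returns a dict; the full text is stored under "text".
        # beam_size=5 matches the setting used for the fast_whisper run.
        return model.transcribe(audio_path, beam_size=5)["text"]

    return transcribe
```

In the test loop, `text, cost_time = timed_transcribe(transcribe, audio_path)` would replace the fast_whisper call, and the rest of the spreadsheet code can stay the same.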
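To put a number on the comparison instead of reading it off a chart, the two spreadsheets can be summarized directly. The sketch below is an illustration only: it assumes both runs wrote the column layout used in test_fast_whisper.py (duration in column B, cost time in column F), and the function names are hypothetical.

```python
def load_timings(xlsx_path):
    """Read (durations, cost_times) from a result workbook, skipping the header."""
    import openpyxl  # imported lazily; only needed when reading real files

    worksheet = openpyxl.load_workbook(xlsx_path, read_only=True).active
    durations, cost_times = [], []
    for row in worksheet.iter_rows(min_row=2, values_only=True):
        durations.append(row[1])   # column B: audio duration in seconds
        cost_times.append(row[5])  # column F: transcription time in seconds
    return durations, cost_times


def real_time_factor(durations, cost_times):
    """Total processing time divided by total audio duration (lower is faster)."""
    total_audio = sum(durations)
    if total_audio <= 0:
        raise ValueError("total audio duration must be positive")
    return sum(cost_times) / total_audio


def speedup(baseline_cost_times, candidate_cost_times):
    """How many times faster the candidate is than the baseline overall."""
    candidate_total = sum(candidate_cost_times)
    if candidate_total <= 0:
        raise ValueError("candidate total time must be positive")
    return sum(baseline_cost_times) / candidate_total
```

Calling `load_timings` on each output file and passing the cost times to `speedup` gives a single ratio for the whole dataset, with whisper as the baseline and fast_whisper as the candidate.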
