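I want to build my own listening vocabulary bank: strip the words I don't need out of the listening material, then cut out the audio of individual words according to a word list that I supply (and that changes over time), so I can practice with them.
Environment: Windows, PyCharm.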
pip install SpeechRecognition
pip install pydub
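ffmpeg:
Official site: http://www.ffmpeg.org/
There are no longer as many download options as there used to be; a single package is enough: https://blog.csdn.net/qq_39382753/article/details/115939665
Note that after installing you must restart cmd or PyCharm for the configuration to take effect.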
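Since pydub looks for the ffmpeg executable on PATH, a quick check after restarting the terminal can confirm that the configuration took effect. This is a minimal sketch using only the standard library; the function name `ffmpeg_on_path` is made up for illustration.

```python
import shutil


def ffmpeg_on_path(executable="ffmpeg"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    location = ffmpeg_on_path()
    if location is None:
        print("ffmpeg not found on PATH; restart cmd/PyCharm or check the PATH setting")
    else:
        print("ffmpeg found at", location)
```

If this prints "not found" from inside PyCharm but works in a fresh cmd window, PyCharm was probably started before PATH was changed.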
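pocketsphinx:
https://www.lfd.uci.edu/~gohlke/pythonlibs/#pocketsphinx
Search the page for pocketsphinx.
Download the build that matches your Python version (for Python 3.7, get the cp37 one).
Then simply pip install the .whl file you downloaded.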
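To see which wheel to pick, you can print the cpXY tag of the interpreter that PyCharm is using. This is a small sketch assuming a CPython interpreter; the helper name `cpython_wheel_tag` is hypothetical.

```python
import sys


def cpython_wheel_tag(version_info=None):
    """Build the 'cpXY' tag used in wheel file names, e.g. cp37 for Python 3.7."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return f"cp{major}{minor}"


if __name__ == "__main__":
    # Prints the tag for the interpreter running this script
    print(cpython_wheel_tag())
```

The wheel file name also encodes the platform (32-bit or 64-bit Windows), which this sketch does not check.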
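def convert_mp3_to_wav(filename, savename):
    '''
    Convert an MP3 file to WAV format.
    :param filename: input x.mp3 file
    :param savename: output x.wav file
    :return: sound (the loaded AudioSegment)
    '''
    from pydub import AudioSegment
    # Decoding MP3 requires ffmpeg to be installed and on PATH
    sound = AudioSegment.from_mp3(filename)
    sound.export(savename, format="wav")
    return sound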
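If there are many MP3 files, the same conversion can be applied to a whole folder. The sketch below is one possible way to do it, not part of the original workflow: the helper names `wav_path_for` and `convert_folder` and the folder layout are assumptions, and it needs pydub and ffmpeg just like the function above.

```python
from pathlib import Path


def wav_path_for(mp3_path, out_dir):
    """Map 'lesson1.mp3' to '<out_dir>/lesson1.wav' without touching the disk."""
    return Path(out_dir) / (Path(mp3_path).stem + ".wav")


def convert_folder(mp3_dir, out_dir):
    """Convert every .mp3 file in mp3_dir to .wav in out_dir (needs pydub and ffmpeg)."""
    from pydub import AudioSegment  # imported lazily, like in convert_mp3_to_wav

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    converted = []
    for mp3_path in sorted(Path(mp3_dir).glob("*.mp3")):
        target = wav_path_for(mp3_path, out_dir)
        AudioSegment.from_mp3(str(mp3_path)).export(str(target), format="wav")
        converted.append(target)
    return converted
```

Calling `convert_folder("mp3", "wav")` would return the list of WAV paths that were written; both folder names here are placeholders.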
Reference: https://www.cnblogs.com/zhe-hello/p/13273523.html
I compared recognize_google() and recognize_sphinx(); Google's recognition is more accurate.
def recognize_word(filename):
    '''
    Print the transcription of a WAV file.
    :param filename: WAV file
    :return: transcription (str)
    '''
    import speech_recognition as sr
    r = sr.Recognizer()
    with sr.AudioFile(filename) as source:
        audio = r.record(source)  # read the entire audio file
    res = r.recognize_google(audio)  # needs access to Google; use a proxy/VPN if it is blocked
    res1 = res.split(" ")
    result = " ".join(res1)
    print(result, type(result))
    return result
Google result: draft version version rewrite rewrite revise revise
Google got practically everything right.
sphinx result: draft version we write we write you guys you guys
The sphinx result left me completely baffled.
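To put a rough number on the difference, the two transcripts can be compared word by word. This is only an illustrative sketch: it uses `difflib.SequenceMatcher` from the standard library and treats the Google output as the reference, which is an assumption since the true word list is not given here.

```python
from difflib import SequenceMatcher


def token_similarity(reference, hypothesis):
    """Similarity in [0, 1] between two transcripts, compared word by word."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    return SequenceMatcher(None, ref_words, hyp_words).ratio()


# Transcripts quoted above; the Google output is treated as the reference (an assumption)
google = "draft version version rewrite rewrite revise revise"
sphinx = "draft version we write we write you guys you guys"

if __name__ == "__main__":
    print(round(token_similarity(google, google), 2))
    print(round(token_similarity(google, sphinx), 2))
```

A score close to 1 means the two word sequences are nearly identical; only the first two words of the sphinx output line up with the reference.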
Reference: https://blog.csdn.net/wangqianqianya/article/details/89605298
def split_to_chunks_by_slience(sound, savedir='word_wav'):
    from pydub.silence import split_on_silence
    import os
    # Create the output directory if it does not exist yet
    if not os.path.exists(savedir):
        os.mkdir(savedir)
    chunks = split_on_silence(sound,
                              # a pause must last at least 430 ms (just under half a second) to count as a split point
                              min_silence_len=430,
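The role of min_silence_len can be illustrated without pydub. The sketch below is a simplified stand-in, not pydub's actual implementation: it assumes the audio has already been reduced to one loudness value per millisecond (a hypothetical input format), and the function name `split_on_quiet_runs` is made up.

```python
def split_on_quiet_runs(levels, silence_thresh, min_silence_len):
    """Split a sequence of loudness values into chunks separated by long quiet runs.

    levels: one loudness value per millisecond (hypothetical input format)
    silence_thresh: values below this count as silent
    min_silence_len: a quiet run must last at least this many values to split
    Returns a list of (start, end) index pairs for the non-silent chunks.
    """
    chunks = []
    chunk_start = None  # start index of the chunk being built
    quiet_start = None  # start index of the current quiet run
    for i, level in enumerate(levels):
        if level < silence_thresh:
            if quiet_start is None:
                quiet_start = i
            # A long enough quiet run closes the current chunk
            if chunk_start is not None and i - quiet_start + 1 >= min_silence_len:
                chunks.append((chunk_start, quiet_start))
                chunk_start = None
        else:
            if chunk_start is None:
                chunk_start = i
            quiet_start = None  # any sound ends the quiet run
    if chunk_start is not None:
        chunks.append((chunk_start, len(levels)))
    return chunks
```

With a larger min_silence_len, short pauses inside a word stay in the same chunk; with a smaller one, a single word may be cut into several pieces, so the value is worth tuning against the recording's speaking pace.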