赞
踩
在3分钟内使用OpenAI GPT5,OpenAI Whisper和Coqui TTS构建您自己的个人语音AI助手,例如Alexa
将AI工具(OpenAI GPT3,OpenAI Whisper,Coqui TTS)放在一起以创建基本个人语音AI助手的教程。
在 5 分钟内引导您了解如何构建自己的简单 Alexa。这是一个非常简单的语音AI助手,您可以使用Python脚本从头开始创建。
步骤#0:设置
pip install -U openai-whisper
pip install sounddevice
pip install scipy
pip install openai
pip install python-dotenv
pip install TTS
步骤#1:提出问题并录制您的声音。
import sounddevice as sd
from scipy.io.wavfile import write
# 采样频率
# 无论原始音频文件中使用的采样率是多少,音频信号都会被重新采样为 16kHz
# 通过 ffmpeg任何高于 16kHz 的频率都能正常工作。
# https://github.com/openai/whisper/discussions/870.
freq = 44100
# 以秒为单位的录音时长
duration = int(input("select duration of the audio: "))
# 给定的持续时间和采样频率
recording = sd.rec(int(duration * freq),
samplerate=freq, channels=2)
# 录制指定秒数的音频
sd.wait()
write("question1.wav", freq, recording)
步骤#2:将音频转换为文本。
import whisper
model = whisper.load_model("base")
result = model.transcribe(audio_path)
'''
result looks like this:
{'text': ' How do you follow Will Smith in the snow?',
'segments': [{'id': 0,
'seek': 0,
'start': 0.0,
'end': 3.24,
'text': ' How do you follow Will Smith in the snow?',
'tokens': [50364,
1012,
360,
291,
1524,
3099,
8538,
294,
264,
5756,
30,
50526],
'temperature': 0.0,
'avg_logprob': -0.2659930999462421,
'compression_ratio': 0.8723404255319149,
'no_speech_prob': 0.02715839259326458}],
'language': 'en'}
步骤#3:使用ChatGPT回答问题。
import os
import openai
from dotenv import load_dotenv
load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
prompt = result['text'].strip()
# API reference: https://platform.openai.com/docs/api-reference/completions/create
response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt,
max_tokens=1000,
temperature=0.6
)
'''
response =
<OpenAIObject text_completion id=cmpl-xxx at xxx> JSON: {
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"text": "\n\nIt's not possible to physically follow Will Smith in the snow unless he invites you to join him. However, you can stay up to date with what he is doing in the snow by following him on social media or watching him in any interviews or videos he posts online."
}
],
"created": 1679637674,
"id": "cmpl-xxx",
"model": "text-davinci-003",
"object": "text_completion",
"usage": {
"completion_tokens": 56,
"prompt_tokens": 10,
"total_tokens": 66
}
}
步骤#4:将文本转换为音频。 text = response["choices"][0]["text"] tts.tts_to_file(text=text, speaker=tts.speakers[0], language=tts.languages[0], file_path=f"{questions_path}/output.wav") 您可以在我的GitHub上的文件或源代码中看到结果。.wav
就是这样!现在,我将开始研究如何构建以将其投入生产。如果你喜欢这个故事,请关注我,看看我的AI之旅!
永不停止学习,
gTTS:
#https://www.geeksforgeeks.org/convert-text-speech-python/
# Import the required module for text
# to speech conversion
from gtts import gTTS
# This module is imported so that we can
# play the converted audio
import os
# The text that you want to convert to audio
mytext = 'Welcome to geeksforgeeks,life is short, I love python!'
# Language in which you want to convert
language = 'en'
# Passing the text and language to the engine,
# here we have marked slow=False. Which tells
# the module that the converted audio should
# have a high speed
myobj = gTTS(text=mytext, lang=language, slow=False)
# Saving the converted audio in a mp3 file named
# welcome
myobj.save("welcome.mp3")
# Playing the converted file
os.system("welcome.mp3")
mp3提示打开播放器
如果你只想要源代码,我的Github链接就在这里。
对于用Python实现文本转语音,追求自然效果的话,我推荐使用以下几个语音合成库:
本文由 mdnice 多平台发布
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。