Azure OpenAI Speech-to-Speech Chat (RESTful)

Overview:

One file — create a Python file, openai-speech.py, and paste in the code below.

Two dependencies — install azure-cognitiveservices-speech and openai with pip.

Three modifiable variables:

  • To change the speech recognition language, replace en-US with another supported language, e.g. es-ES for Spanish (Spain). If no language is specified, the default is en-US. For details on recognizing speech from several candidate languages, see language identification.
  • To change the voice you hear, replace en-US-JennyMultilingualNeural with another supported voice. If the voice does not speak the language of the text returned from Azure OpenAI, the Speech service does not output synthesized audio.
  • To use a different model, replace text-davinci-002 with the ID of another deployment. Keep in mind that a deployment ID is not necessarily the same as the model name; it is the name you gave the deployment when you created it in Azure OpenAI Studio.
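The three values above can be grouped at the top of the script so they are easy to adjust. A minimal sketch (the constant names here are illustrative, not part of the original sample):

```python
# The three configuration knobs described above.
# Replace each value as needed for your language, voice, and deployment.
RECOGNITION_LANGUAGE = "en-US"  # speech-to-text locale, e.g. "es-ES"
SYNTHESIS_VOICE = "en-US-JennyMultilingualNeural"  # text-to-speech voice name
DEPLOYMENT_ID = "text-davinci-002"  # your Azure OpenAI deployment ID

print(f"Recognizing {RECOGNITION_LANGUAGE}, replying with voice {SYNTHESIS_VOICE}, "
      f"via deployment {DEPLOYMENT_ID}")
```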

Four environment variables (set on Windows with setx):

  1. setx OPEN_AI_KEY your-openai-key
  2. setx OPEN_AI_ENDPOINT your-openai-endpoint
  3. setx SPEECH_KEY your-speech-key
  4. setx SPEECH_REGION your-speech-region
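The setx commands above are Windows-specific. On Linux or macOS, the equivalent would be export statements in bash or zsh (placeholder values shown; substitute your own keys, endpoint, and region):

```shell
# Equivalent environment setup for bash/zsh.
# Replace each placeholder with your own value.
export OPEN_AI_KEY="your-openai-key"
export OPEN_AI_ENDPOINT="your-openai-endpoint"
export SPEECH_KEY="your-speech-key"
export SPEECH_REGION="your-speech-region"
```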

The full code:

import os
import azure.cognitiveservices.speech as speechsdk
import openai

# This example requires environment variables named "OPEN_AI_KEY" and "OPEN_AI_ENDPOINT"
# Your endpoint should look like the following: https://YOUR_OPEN_AI_RESOURCE_NAME.openai.azure.com/
openai.api_key = os.environ.get('OPEN_AI_KEY')
openai.api_base = os.environ.get('OPEN_AI_ENDPOINT')
openai.api_type = 'azure'
openai.api_version = '2022-12-01'

# This will correspond to the custom name you chose for your deployment when you deployed a model.
deployment_id = 'text-davinci-002'

# This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))
audio_output_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

# Should be the locale for the speaker's language.
speech_config.speech_recognition_language = "en-US"
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

# The language of the voice that responds on behalf of Azure OpenAI.
speech_config.speech_synthesis_voice_name = 'en-US-JennyMultilingualNeural'
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_output_config)

# Prompts Azure OpenAI with a request and synthesizes the response.
def ask_openai(prompt):
    # Ask Azure OpenAI
    response = openai.Completion.create(engine=deployment_id, prompt=prompt, max_tokens=100)
    text = response['choices'][0]['text'].replace('\n', ' ').replace(' .', '.').strip()
    print('Azure OpenAI response: ' + text)

    # Azure text-to-speech output
    speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()

    # Check result
    if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        print("Speech synthesized to speaker for text [{}]".format(text))
    elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = speech_synthesis_result.cancellation_details
        print("Speech synthesis canceled: {}".format(cancellation_details.reason))
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            print("Error details: {}".format(cancellation_details.error_details))

# Continuously listens for speech input to recognize and send as text to Azure OpenAI
def chat_with_open_ai():
    while True:
        print("Azure OpenAI is listening. Say 'Stop' or press Ctrl-Z to end the conversation.")
        try:
            # Get audio from the microphone and then send it to the speech recognizer.
            speech_recognition_result = speech_recognizer.recognize_once_async().get()

            # If speech is recognized, send it to Azure OpenAI and listen for the response.
            if speech_recognition_result.reason == speechsdk.ResultReason.RecognizedSpeech:
                if speech_recognition_result.text == "Stop.":
                    print("Conversation ended.")
                    break
                print("Recognized speech: {}".format(speech_recognition_result.text))
                ask_openai(speech_recognition_result.text)
            elif speech_recognition_result.reason == speechsdk.ResultReason.NoMatch:
                print("No speech could be recognized: {}".format(speech_recognition_result.no_match_details))
                break
            elif speech_recognition_result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = speech_recognition_result.cancellation_details
                print("Speech Recognition canceled: {}".format(cancellation_details.reason))
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
                    print("Error details: {}".format(cancellation_details.error_details))
        except EOFError:
            break

# Main
try:
    chat_with_open_ai()
except Exception as err:
    print("Encountered exception. {}".format(err))
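Because the script reads all four environment variables with os.environ.get (which silently returns None when a variable is unset), a quick sanity check before running can save debugging time. A small sketch, separate from the original sample (the helper name is illustrative):

```python
import os

# Names of the environment variables the sample reads.
REQUIRED_VARS = ["OPEN_AI_KEY", "OPEN_AI_ENDPOINT", "SPEECH_KEY", "SPEECH_REGION"]

def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```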

See also:

Azure OpenAI speech-to-speech chat - Speech service - Azure Cognitive Services | Microsoft Learn
