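Following the official tutorial, this article shows how to build an automated meeting minutes generator with OpenAI's Whisper and GPT-4 models. The application transcribes the meeting audio and then uses GPT-4 to summarize and analyze the transcript.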
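This tutorial assumes basic Python knowledge and an OpenAI API key. You can use the audio file provided with the tutorial or one of your own.
You also need to install the python-docx and OpenAI libraries. You can create a new Python environment and install the required packages with the following commands: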
- # Create a new Python environment (optional)
- python -m venv myenv
- source myenv/bin/activate # on Windows, use myenv\Scripts\activate
-
- # Install the required packages
- pip install python-docx openai
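Once the install finishes, a quick check like the one below can confirm that both packages are importable. This is a minimal sketch rather than part of the original tutorial, and the helper name `missing_modules` is hypothetical. Note that the python-docx package is imported under the module name `docx`.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # python-docx installs a module named "docx"; openai installs "openai".
    missing = missing_modules(["docx", "openai"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```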
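The first step in transcribing the meeting audio is to pass the meeting's audio file to OpenAI's /v1/audio API. Whisper, the model that powers the audio API, converts spoken language into written text. To start, we will not pass a prompt or a temperature (optional parameters that control the model's output) and will rely on the defaults instead.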
- from docx import Document
- from openai import OpenAI
-
- # Create the OpenAI client
- client = OpenAI(
-     # defaults to os.environ.get("OPENAI_API_KEY")
-     # api_key="My API Key",
- )
-
- # Path to the audio file
- audio_file_path = 'path/to/your/audio/file.mp3'
-
- # Open the audio file and pass it to the API
- def transcribe_audio(audio_file_path):
-     with open(audio_file_path, 'rb') as audio_file:
-         # model and file must be passed as keyword arguments
-         transcription = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
-     # The response is an object; the transcript is in its .text attribute
-     return transcription.text
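In the function above, audio_file_path is the path to the audio file you want to transcribe. The function opens this file and passes it to the Whisper ASR model (whisper-1) for transcription. The result is returned as raw text. Note that client.audio.transcriptions.create needs the actual audio file, not just a path to a file on a local or remote server. This means that if you run this code on a server that may not store the audio file, you first need a preprocessing step that downloads the audio file onto that machine.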
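The preprocessing step described above could look something like the following. It is only a sketch using the standard library: the helper name `download_audio` is hypothetical, and a real deployment would likely add timeouts, authentication, and cleanup of the temporary file.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download_audio(url, suffix=".mp3"):
    """Download `url` to a local temporary file and return its path."""
    with urllib.request.urlopen(url) as response:
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
            # Stream the body to disk instead of loading it all into memory.
            shutil.copyfileobj(response, tmp)
            return Path(tmp.name)
```

The returned path can then be handed to transcribe_audio. The caller is responsible for deleting the temporary file once the transcription is done.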
———————————————————————————————————————————
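Summarizing and analyzing the transcript with GPT-4
Once we have the transcript, we pass it to GPT-4 through the Chat Completions API. GPT-4 is one of OpenAI's most capable large language models, and we use it to generate a summary, extract key points and action items, and perform sentiment analysis.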
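This tutorial uses a separate function for each task. This is not the most efficient approach (you could put all of these instructions into a single function), but splitting them up may produce higher-quality summaries.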
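For comparison, the single-function alternative could look roughly like this. It is a sketch, not the tutorial's approach: the function name `all_in_one_minutes`, the prompt wording, and the JSON keys are assumptions, and `client` is passed in explicitly. The model is not guaranteed to return valid JSON, so the code falls back to the raw text.

```python
import json

# Illustrative combined prompt (an assumption, not taken from the tutorial).
COMBINED_PROMPT = (
    "You are an assistant that writes meeting minutes. Read the transcript and "
    "reply with a JSON object containing exactly these keys: "
    "abstract_summary, key_points, action_items, sentiment. "
    "Each value must be a string."
)


def all_in_one_minutes(client, transcription, model="gpt-4"):
    """Ask for all four sections in one Chat Completions call (hypothetical helper)."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": COMBINED_PROMPT},
            {"role": "user", "content": transcription},
        ],
    )
    content = response.choices[0].message.content
    try:
        return json.loads(content)
    except (json.JSONDecodeError, TypeError):
        # The reply was not valid JSON; hand back the raw text instead.
        return {"raw": content}
```

One call costs fewer requests, but the four tasks then share a single prompt, which is the quality trade-off this paragraph points to.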
To split up these tasks, we define the meeting_minutes function, which serves as the main function of this application:
- def abstract_summary_extraction(transcription):
-     # Logic for generating the summary
-     pass
-
- def key_points_extraction(transcription):
-     # Logic for extracting the key points
-     pass
-
- def action_item_extraction(transcription):
-     # Logic for identifying the action items
-     pass
-
- def sentiment_analysis(transcription):
-     # Logic for performing sentiment analysis
-     pass
-
- def meeting_minutes(transcription):
-     abstract_summary = abstract_summary_extraction(transcription)
-     key_points = key_points_extraction(transcription)
-     action_items = action_item_extraction(transcription)
-     sentiment = sentiment_analysis(transcription)
-     return {
-         'abstract_summary': abstract_summary,
-         'key_points': key_points,
-         'action_items': action_items,
-         'sentiment': sentiment
-     }
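In the function above, transcription is the text we obtained from Whisper. The transcript is passed to the four other functions, each of which performs one specific task: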
abstract_summary_extraction generates a summary of the meeting:
- def abstract_summary_extraction(transcription):
-     response = client.chat.completions.create(
-         model="gpt-4",
-         temperature=0,
-         messages=[
-             {
-                 "role": "system",
-                 "content": "You are a highly skilled AI trained in language comprehension and summarization. I would like you to read the following text and summarize it into a concise abstract paragraph. Aim to retain the most important points, providing a coherent and readable summary that could help a person understand the main points of the discussion without needing to read the entire text. Please avoid unnecessary details or tangential points."
-             },
-             {
-                 "role": "user",
-                 "content": transcription
-             }
-         ]
-     )
-     return response.choices[0].message.content
key_points_extraction extracts the main points:
- def key_points_extraction(transcription):
-     response = client.chat.completions.create(
-         model="gpt-4",
-         temperature=0,
-         messages=[
-             {
-                 "role": "system",
-                 "content": "You are a proficient AI with a specialty in distilling information into key points. Based on the following text, identify and list the main points that were discussed or brought up. These should be the most important ideas, findings, or topics that are crucial to the essence of the discussion. Your goal is to provide a list that someone could read to quickly understand what was talked about."
-             },
-             {
-                 "role": "user",
-                 "content": transcription
-             }
-         ]
-     )
-     return response.choices[0].message.content
action_item_extraction identifies the action items:
- def action_item_extraction(transcription):
-     response = client.chat.completions.create(
-         model="gpt-4",
-         temperature=0,
-         messages=[
-             {
-                 "role": "system",
-                 "content": "You are an AI expert in analyzing conversations and extracting action items. Please review the text and identify any tasks, assignments, or actions that were agreed upon or mentioned as needing to be done. These could be tasks assigned to specific individuals, or general actions that the group has decided to take. Please list these action items clearly and concisely."
-             },
-             {
-                 "role": "user",
-                 "content": transcription
-             }
-         ]
-     )
-     return response.choices[0].message.content
sentiment_analysis performs sentiment analysis:
- def sentiment_analysis(transcription):
-     response = client.chat.completions.create(
-         model="gpt-4",
-         temperature=0,
-         messages=[
-             {
-                 "role": "system",
-                 "content": "As an AI with expertise in language and emotion analysis, your task is to analyze the sentiment of the following text. Please consider the overall tone of the discussion, the emotion conveyed by the language used, and the context in which words and phrases are used. Indicate whether the sentiment is generally positive, negative, or neutral, and provide brief explanations for your analysis where possible."
-             },
-             {
-                 "role": "user",
-                 "content": transcription
-             }
-         ]
-     )
-     return response.choices[0].message.content
If you need other features, you can add them by following the same pattern.
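As one way to do that, the repeated request code can be factored into a helper so that a new task only needs a new system prompt. This is a sketch under assumptions: `run_extraction` and `decisions_extraction` are hypothetical names that do not appear in the tutorial, the prompt text is illustrative, and `client` is passed in explicitly rather than read from a global.

```python
def run_extraction(client, system_prompt, transcription, model="gpt-4"):
    """Send one system prompt plus the transcript and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": transcription},
        ],
    )
    return response.choices[0].message.content


# Illustrative prompt for an extra task (an assumption, not from the tutorial).
DECISIONS_PROMPT = (
    "You are an AI expert in analyzing conversations. Please review the text "
    "and list the decisions that were made, clearly and concisely."
)


def decisions_extraction(client, transcription):
    """Hypothetical extra task: list the decisions reached in the meeting."""
    return run_extraction(client, DECISIONS_PROMPT, transcription)
```

To use it, you would add an entry such as 'decisions' to the dictionary returned by meeting_minutes, so the new section travels with the other four.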
———————————————————————————————————————————
Exporting the meeting minutes
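Once the meeting minutes are generated, it helps to save them in a readable format that is easy to distribute. One common format is Microsoft Word, and python-docx is a popular open-source library for creating Word documents. If you want to build an end-to-end meeting minutes application, you could consider dropping the export step and embedding the summary in a follow-up email instead.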
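If you go the email route, the standard library is enough to build the message. The sketch below is an assumption about how that might look: `minutes_to_email`, the addresses, and the subject line are placeholders, and actually sending the message (for example with smtplib) is left out.

```python
from email.message import EmailMessage


def minutes_to_email(minutes, sender, recipient, subject="Meeting minutes"):
    """Build a plain-text email whose body lists each section of `minutes`."""
    sections = []
    for key, value in minutes.items():
        # Turn a key such as "key_points" into the heading "Key Points".
        heading = " ".join(word.capitalize() for word in key.split("_"))
        sections.append(f"{heading}\n{value}")

    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content("\n\n".join(sections))
    return message
```

The function expects the same dictionary that meeting_minutes returns, so it could replace the Word export without changing the earlier steps.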
To handle the export, define a save_as_docx function that converts the raw text into a Word document:
- from docx import Document
-
- def save_as_docx(minutes, filename):
-     doc = Document()
-     for key, value in minutes.items():
-         # Replace underscores with spaces and capitalize each word to form the heading
-         heading = ' '.join(word.capitalize() for word in key.split('_'))
-         doc.add_heading(heading, level=1)
-         doc.add_paragraph(value)
-         # Add an empty paragraph between sections
-         doc.add_paragraph()
-     doc.save(filename)
Finally, you can put all the pieces together and generate the meeting minutes from an audio file:
- audio_file_path = "Earningscall.wav"
- transcription = transcribe_audio(audio_file_path)
- minutes = meeting_minutes(transcription)
- print(minutes)
-
- save_as_docx(minutes, 'meeting_minutes.docx')
This code transcribes the audio file Earningscall.wav, generates the meeting minutes, prints them, and then saves them as a Word document named meeting_minutes.docx.
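Summary:
The key to this example is really the prompts. Used well, prompts get you more precise, higher-quality responses from GPT. I recommend reading Prompt engineering - OpenAI API. Thanks, everyone, and see you next time!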
Copyright © 2003-2013 www.wpsshop.cn. All rights reserved.