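This post uses the iFlytek (科大讯飞) API for speech recognition and wraps it in a Java SWT interface.
The speech-recognition API comes with a 5-hour free trial. Many vendors, such as Baidu and Alibaba, have opened up speech-recognition APIs; this post uses iFlytek's. You could also train a model on your own data to do speech recognition, although the recognition rate would probably not be as high. The links below describe how that works, and I may look into it when I have time.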
https://blog.ailemon.me/2018/08/29/asrt-a-chinese-speech-recognition-system/
https://github.com/nl8590687/ASRT_SpeechRecognition
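The acoustic model uses a convolutional neural network (CNN) with connectionist temporal classification (CTC), trained on a large Chinese speech dataset, to transcribe audio into Chinese pinyin. A language model then converts the pinyin sequence into Chinese text.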
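To make the CTC step a little more concrete: a CTC-trained model emits one label per audio frame, including a special blank label, and the simplest (greedy) decoder merges consecutive repeats and then drops the blanks. The sketch below shows only that collapse rule. The blank symbol `_` and the pinyin labels are made up for illustration and are not ASRT's actual output format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CtcCollapse {
    static final String BLANK = "_"; // assumed blank symbol, for illustration only

    /** Greedy CTC collapse: merge consecutive duplicates, then remove blanks. */
    static List<String> collapse(List<String> frames) {
        List<String> out = new ArrayList<>();
        String prev = null;
        for (String label : frames) {
            // Keep a label only when it differs from the previous frame and is not blank.
            if (!label.equals(prev) && !label.equals(BLANK)) {
                out.add(label);
            }
            prev = label;
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up per-frame labels; the blank between the two "hao3" runs keeps them separate.
        List<String> frames = Arrays.asList("ni3", "ni3", "_", "hao3", "hao3", "_", "_", "hao3");
        System.out.println(collapse(frames)); // prints [ni3, hao3, hao3]
    }
}
```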
Log in to the iFlytek site: https://www.xfyun.cn/services/lfasr
Download the Java SDK.
Create a new application and obtain its appId and secret.
Configure the appId and secret in the SDK:
- # APP ID
- app_id=
- # secret key
- secret_key=
- # both the http and https protocols are supported
- lfasr_host=http://raasr.xfyun.cn/api
- # file piece size
- file_piece_size=10485760
- # store path: this is not the store path for the result json file, but the path for the file piece during upload
- store_path=F://
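Since `app_id` and `secret_key` ship empty, it can help to check the configuration before starting a long upload. The following is a minimal sketch using only `java.util.Properties`; it is not part of the SDK, and the choice of "required" keys is my own assumption based on the listing above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class LfasrConfigCheck {
    // Keys assumed to be required, taken from the configuration listing above.
    private static final String[] REQUIRED = {"app_id", "secret_key", "lfasr_host", "store_path"};

    /** Returns the names of required keys that are missing or blank in the given properties text. */
    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String sample = "app_id=\nsecret_key=\nlfasr_host=http://raasr.xfyun.cn/api\nstore_path=F://\n";
        System.out.println(missingKeys(sample)); // prints [app_id, secret_key]
    }
}
```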
The demo includes a test case.
The workflow is as follows:
Initialize the LFASRClient instance:
- // Initialize the LFASRClient instance
- LfasrClientImp lc = null;
- try {
- lc = LfasrClientImp.initLfasrClient();
- } catch (LfasrException e) {
- // Initialization failed; parse the error description
- Message initMsg = JSON.parseObject(e.getMessage(), Message.class);
- System.out.println("ecode=" + initMsg.getErr_no());
- System.out.println("failed=" + initMsg.getFailed());
- }
Upload the audio file:
-
- // Upload the audio file
- Message uploadMsg = lc.lfasrUpload(local_file, type, params);
-
- // Check the return value
- int ok = uploadMsg.getOk();
- if (ok == 0) {
- // Task created successfully
- task_id = uploadMsg.getData();
- }
Poll until the task has been processed:
- // Poll for the audio processing result
- while (true) {
- try {
- // Wait 20 s before checking the task progress
- Thread.sleep(sleepSecond * 1000);
- System.out.println("waiting ...");
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- try {
- // Get the processing progress
- Message progressMsg = lc.lfasrGetProgress(task_id);
-
- // A non-zero status means the task failed
- if (progressMsg.getOk() != 0) {
- System.out.println("task was fail. task_id:" + task_id);
- System.out.println("ecode=" + progressMsg.getErr_no());
- System.out.println("failed=" + progressMsg.getFailed());
-
- return;
- } else {
- ProgressStatus progressStatus = JSON.parseObject(progressMsg.getData(), ProgressStatus.class);
- if (progressStatus.getStatus() == 9) {
- // Processing finished
- System.out.println("task was completed. task_id:" + task_id);
- break;
- } else {
- // Still processing
- System.out.println("task is incomplete. task_id:" + task_id + ", status:" + progressStatus.getDesc());
- continue;
- }
- }
- } catch (LfasrException e) {
- // Progress query failed; investigate using the returned error info, then query again
- Message progressMsg = JSON.parseObject(e.getMessage(), Message.class);
- System.out.println("ecode=" + progressMsg.getErr_no());
- System.out.println("failed=" + progressMsg.getFailed());
- }
- }
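One thing to note about the loop above: it never gives up, so a task that never reaches status 9, or a progress call that keeps throwing, will keep it spinning. Below is a minimal sketch of a bounded variant. The `IntSupplier` is a hypothetical stand-in for "call `lfasrGetProgress` and parse the status", so the example runs without the SDK. Unlike the demo, it checks first and sleeps afterwards.

```java
import java.util.function.IntSupplier;

class ProgressPoller {
    static final int STATUS_DONE = 9; // completion status used in the demo above

    /**
     * Polls statusSource until it reports STATUS_DONE or maxAttempts is reached.
     * Returns true on completion, false if attempts run out or the thread is interrupted.
     */
    static boolean waitUntilDone(IntSupplier statusSource, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (statusSource.getAsInt() == STATUS_DONE) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] fakeStatuses = {2, 5, 9}; // made-up status sequence for the demo
        int[] index = {0};
        boolean done = waitUntilDone(() -> fakeStatuses[index[0]++], 10, 10);
        System.out.println("done=" + done); // prints done=true
    }
}
```

With the real SDK, the supplier would wrap the progress call and its exception handling, and `sleepMillis` would be the 20 s interval used in the demo.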
Get the final result:
- // Get the task result
- try {
- Message resultMsg = lc.lfasrGetResult(task_id);
- // A status of 0 means the result was retrieved successfully
- if (resultMsg.getOk() == 0) {
- // Print the transcription result
- System.out.println(resultMsg.getData());
- System.out.println(Test.getFinalResult(resultMsg.getData()));
- } else {
- // Failed to get the task result
- System.out.println("ecode=" + resultMsg.getErr_no());
- System.out.println("failed=" + resultMsg.getFailed());
- }
- } catch (LfasrException e) {
- // Result query failed; parse the error description
- Message resultMsg = JSON.parseObject(e.getMessage(), Message.class);
- System.out.println("ecode=" + resultMsg.getErr_no());
- System.out.println("failed=" + resultMsg.getFailed());
- }
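resultMsg.getData() returns a JSON array with multiple elements. Here the "onebest" field of each element is extracted, and the values are concatenated to form the final output text.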
- String str = "[{\"bg\":\"0\",\"ed\":\"2180\",\"onebest\":\"科大讯飞是中国最大!\",\"si\":\"0\",\"speaker\":\"0\","
- + "\"wordsResultList\":[{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"6\",\"wordEd\":\"114\",\"wordsName\":"
- + "\"科大讯飞\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"118\",\"wordEd\":\"147\",\"wordsName\""
- + ":\"是\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"148\",\"wordEd\":\"193\",\"wordsName\":\"中国\","
- + "\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"194\",\"wordEd\":\"213\",\"wordsName\":\"最\","
- + "\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"214\",\"wordEd\":\"218\",\"wordsName\":\"大\","
- + "\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"218\",\"wordEd\":\"218\",\"wordsName\":\"!\","
- + "\"wp\":\"p\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"218\",\"wordEd\":\"218\",\"wordsName\":\"\","
- + "\"wp\":\"g\"}]},{\"bg\":\"2190\",\"ed\":\"3080\",\"onebest\":\"的智能。\",\"si\":\"1\",\"speaker\":\"0\","
- + "\"wordsResultList\":[{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"15\",\"wordEd\":\"42\","
- + "\"wordsName\":\"的\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"47\",\"wordEd\":\"89\","
- + "\"wordsName\":\"智能\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"89\",\"wordEd\":\"89\","
- + "\"wordsName\":\"。\",\"wp\":\"p\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"89\",\"wordEd\":\"89\","
- + "\"wordsName\":\"\",\"wp\":\"g\"}]},{\"bg\":\"3090\",\"ed\":\"4950\",\"onebest\":\"语音技术提供商,\",\"si\":\"2\","
- + "\"speaker\":\"0\",\"wordsResultList\":[{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"4\",\"wordEd\":\"46\","
- + "\"wordsName\":\"语音\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"47\",\"wordEd\":\"92\","
- + "\"wordsName\":\"技术\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"93\",\"wordEd\":\"164\","
- + "\"wordsName\":\"提供商\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"164\",\"wordEd\":\"164\","
- + "\"wordsName\":\",\",\"wp\":\"p\"}]}]";
- public static String getFinalResult(String data){
-
- // Parse the result array and concatenate the "onebest" field of each element
- JSONArray ja = JSONArray.parseArray(data);
- StringBuilder sb = new StringBuilder();
- for(int i=0; i<ja.size(); i++){
- sb.append(ja.getJSONObject(i).getString("onebest"));
- }
- return sb.toString();
- }
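For a quick experiment without the fastjson dependency, the same concatenation can be approximated with a regular expression. This is a different technique from the method above and only a rough sketch. It assumes `"onebest"` appears only as a key in the top-level result objects, as in the sample string, and it does not unescape JSON escape sequences. The sample values in `main` are ASCII placeholders, not real API output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OnebestExtractor {
    // Matches "onebest":"<value>", allowing backslash-escaped characters inside the value.
    private static final Pattern ONEBEST =
            Pattern.compile("\"onebest\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");

    /** Concatenates every "onebest" value found in the given JSON text, in order. */
    static String join(String json) {
        StringBuilder sb = new StringBuilder();
        Matcher m = ONEBEST.matcher(json);
        while (m.find()) {
            sb.append(m.group(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Shortened placeholder data with the same shape as the sample above.
        String sample = "[{\"bg\":\"0\",\"ed\":\"2180\",\"onebest\":\"hello \",\"si\":\"0\"},"
                + "{\"bg\":\"2190\",\"ed\":\"3080\",\"onebest\":\"world.\",\"si\":\"1\"}]";
        System.out.println(join(sample)); // prints hello world.
    }
}
```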
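Using the SDK directly is a bit cumbersome, so I wanted to wrap it in an SWT client. Development is done in Eclipse; the first step is to set up the SWT environment.
Reference: https://www.cnblogs.com/xinyan123/p/6225194.html
Download the SWT plugin: https://www.eclipse.org/windowbuilder/download.php
Copy the features and plugins folders from the download into the corresponding folders of the Eclipse installation directory, then restart Eclipse.
Create a new SWT project.
Create a new ApplicationWindow.
The UI can then be laid out with the graphical designer.
Core SWT implementation: the logic behind the "start conversion" button
- // "Start conversion" button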
- Button startThansfer = new Button(container, SWT.NONE);
- startThansfer.addSelectionListener(new SelectionAdapter() {
- @Override
- public void widgetSelected(SelectionEvent e) {
- logDetailText.append(datePrefix + "开始转换........" + "\n");
- startThansfer.setEnabled(false);
- voicePath = voicePathText.getText();
- textPath = textPathText.getText();
- int status = 0;
- Callable<Integer> f = new TransferThread(logDetailText, countDownLatch, datePrefix, voicePath, textPath);
- //Callable<Integer> f = new TransferThreadAsyc(parent, logDetailText, countDownLatch, datePrefix, voicePath, textPath);
- try {
- status = f.call();
- } catch (Exception e2) {
- // TODO Auto-generated catch block
- e2.printStackTrace();
- }
- try {
- countDownLatch.await();
- } catch (InterruptedException e1) {
- // TODO Auto-generated catch block
- e1.printStackTrace();
- }
- if(status == 1){
- logDetailText.append(datePrefix + "转换完成" + "\n");
- }else{
- logDetailText.append(datePrefix + "转换失败" + "\n");
- }
- startThansfer.setEnabled(true);
- }
- });
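Note that the handler above calls `f.call()` directly, so the whole conversion runs on the thread that handles the button click. A common way to avoid blocking that thread is to submit the work to an executor and post the result back to the UI. The sketch below shows the pattern with plain JDK types so it runs without SWT. The `uiPoster` parameter is my own abstraction; in an SWT application it would typically be `display::asyncExec`, and widget updates made inside the worker, such as `logDetailText.append`, would need to go through it as well.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class BackgroundTransfer {
    /**
     * Runs work on a worker thread and hands its status to onDone through uiPoster.
     * uiPoster is a stand-in for "run this on the UI thread".
     */
    static Future<?> start(ExecutorService worker, Callable<Integer> work,
                           Consumer<Runnable> uiPoster, Consumer<Integer> onDone) {
        return worker.submit(() -> {
            int status;
            try {
                status = work.call();
            } catch (Exception e) {
                status = -1; // treat any exception as a failed conversion
            }
            final int result = status;
            uiPoster.accept(() -> onDone.accept(result));
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Consumer<Runnable> fakeUi = Runnable::run; // demo only: runs the callback immediately
        start(worker, () -> 1, fakeUi,
                status -> System.out.println(status == 1 ? "done" : "failed")).get();
        worker.shutdown();
    }
}
```

In the button handler, `onDone` would append the "转换完成" or "转换失败" line and re-enable the button, and the `CountDownLatch` would no longer be needed.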
Worker class that performs the conversion:
- package com.voice.text;
-
- import java.io.FileOutputStream;
- import java.util.HashMap;
- import java.util.concurrent.Callable;
- import java.util.concurrent.CountDownLatch;
-
- import org.eclipse.swt.widgets.Text;
-
- import com.alibaba.fastjson.JSON;
- import com.iflytek.msp.cpdb.lfasr.client.LfasrClientImp;
- import com.iflytek.msp.cpdb.lfasr.exception.LfasrException;
- import com.iflytek.msp.cpdb.lfasr.model.LfasrType;
- import com.iflytek.msp.cpdb.lfasr.model.Message;
- import com.iflytek.msp.cpdb.lfasr.model.ProgressStatus;
- import com.iflytek.voicecloud.lfasr.demo.Test;
-
- public class TransferThread implements Callable<Integer> {
-
- private Text logDetailText;
- private CountDownLatch countDownLatch;
- private LfasrType type = LfasrType.LFASR_STANDARD_RECORDED_AUDIO;
- private int sleepSecond = 20;
- private String datePrefix;
- private String voicePath;
- private String textPath;
-
- public TransferThread(Text logDetailText, CountDownLatch countDownLatch, String datePrefix, String voicePath, String textPath) {
- this.logDetailText = logDetailText;
- this.countDownLatch = countDownLatch;
- this.datePrefix = datePrefix;
- this.voicePath = voicePath;
- this.textPath = textPath;
- }
- @Override
- public Integer call() throws Exception {
- // Initialize the LFASRClient instance
- LfasrClientImp lc = null;
- try {
- lc = LfasrClientImp.initLfasrClient();
- } catch (LfasrException e) {
- // Initialization failed; parse the error description
- Message initMsg = JSON.parseObject(e.getMessage(), Message.class);
- logDetailText.append(datePrefix + "ecode=" + initMsg.getErr_no() + "\n");
- System.out.println("ecode=" + initMsg.getErr_no());
- logDetailText.append(datePrefix + "failed=" + initMsg.getFailed() + "\n");
- //System.out.println(datePrefix + "failed=" + initMsg.getFailed());
- countDownLatch.countDown();
- return -1;
- }
-
- // Task ID returned by the upload
- String task_id = "";
- HashMap<String, String> params = new HashMap<String, String>();
- params.put("has_participle", "true");
- // After the merge, the standard edition can enable the telephone-edition feature
- //params.put("has_seperate", "true");
- try {
- // Upload the audio file
- Message uploadMsg = lc.lfasrUpload(voicePath, type, params);
-
- // Check the return value
- int ok = uploadMsg.getOk();
- if (ok == 0) {
- // Task created successfully
- task_id = uploadMsg.getData();
- //System.out.println("创建任务成功 task_id=" + task_id);
- logDetailText.append(datePrefix + "创建任务成功 task_id=" + task_id + "\n");
- } else {
- // Task creation failed: server-side error
- //System.out.println(datePrefix + "ecode=" + uploadMsg.getErr_no());
- logDetailText.append(datePrefix + "ecode=" + uploadMsg.getErr_no() + "\n");
- //System.out.println(datePrefix + "failed=" + uploadMsg.getFailed());
- logDetailText.append(datePrefix + "failed=" + uploadMsg.getFailed() + "\n");
- countDownLatch.countDown();
- return -1;
- }
- } catch (LfasrException e) {
- // Upload failed; parse the error description
- Message uploadMsg = JSON.parseObject(e.getMessage(), Message.class);
- //System.out.println(datePrefix + "ecode=" + uploadMsg.getErr_no());
- logDetailText.append(datePrefix + "ecode=" + uploadMsg.getErr_no() + "\n");
- //System.out.println(datePrefix + "failed=" + uploadMsg.getFailed());
- logDetailText.append(datePrefix + "failed=" + uploadMsg.getFailed() + "\n");
- countDownLatch.countDown();
- return -1;
- }
-
- // Poll for the audio processing result
- while (true) {
- try {
- // Wait 20 s before checking the task progress
- Thread.sleep(sleepSecond * 1000);
- //System.out.println("waiting ...");
- logDetailText.append(datePrefix + "waiting ..." + "\n");
- } catch (InterruptedException e) {
- e.printStackTrace();
- }
- try {
- // Get the processing progress
- Message progressMsg = lc.lfasrGetProgress(task_id);
-
- // A non-zero status means the task failed
- if (progressMsg.getOk() != 0) {
- //System.out.println("task was fail. task_id:" + task_id);
- //System.out.println("ecode=" + progressMsg.getErr_no());
- //System.out.println("failed=" + progressMsg.getFailed());
- logDetailText.append(datePrefix + "task was fail. task_id:" + task_id + "\n");
- logDetailText.append(datePrefix + "ecode=" + progressMsg.getErr_no() + "\n");
- logDetailText.append(datePrefix + "failed=" + progressMsg.getFailed() + "\n");
- countDownLatch.countDown();
- return -1;
- } else {
- ProgressStatus progressStatus = JSON.parseObject(progressMsg.getData(), ProgressStatus.class);
- if (progressStatus.getStatus() == 9) {
- // Processing finished
- //System.out.println(datePrefix + "task was completed. task_id:" + task_id + "\n");
- logDetailText.append(datePrefix + "task was completed. task_id:" + task_id + "\n");
- break;
- } else {
- // Still processing
- //System.out.println(datePrefix + "task is incomplete. task_id:" + task_id + ", status:" + progressStatus.getDesc() + "\n");
- logDetailText.append(datePrefix + "task is incomplete. task_id:" + task_id + ", status:" + progressStatus.getDesc() + "\n");
- continue;
- }
- }
- } catch (LfasrException e) {
- // Progress query failed; investigate using the returned error info, then query again
- Message progressMsg = JSON.parseObject(e.getMessage(), Message.class);
- //System.out.println(datePrefix + "ecode=" + progressMsg.getErr_no() + "\n");
- //System.out.println(datePrefix + "failed=" + progressMsg.getFailed() + "\n");
- logDetailText.append(datePrefix + "ecode=" + progressMsg.getErr_no() + "\n");
- logDetailText.append(datePrefix + "failed=" + progressMsg.getFailed() + "\n");
- }
- }
-
- // Get the task result
- try {
- Message resultMsg = lc.lfasrGetResult(task_id);
- // A status of 0 means the result was retrieved successfully
- if (resultMsg.getOk() == 0) {
- // Write the transcription result to a file and show it in the log
- String result = Test.getFinalResult(resultMsg.getData());
- String output = textPath + "\\" + System.currentTimeMillis() + ".txt";
- // try-with-resources closes the stream; an explicit UTF-8 charset avoids depending on the platform default
- try (FileOutputStream f = new FileOutputStream(output)) {
- f.write(result.getBytes(java.nio.charset.StandardCharsets.UTF_8));
- }
- //System.out.println(result);
- logDetailText.append(datePrefix + "结果存放路径: " + output + "\n");
- logDetailText.append(datePrefix + "最终转换结果: " + "\n");
- logDetailText.append(datePrefix + result + "\n");
- } else {
- // Failed to get the task result
- //System.out.println(datePrefix + "ecode=" + resultMsg.getErr_no() + "\n");
- //System.out.println(datePrefix + "failed=" + resultMsg.getFailed() + "\n");
- logDetailText.append(datePrefix + "ecode=" + resultMsg.getErr_no() + "\n");
- logDetailText.append(datePrefix + "failed=" + resultMsg.getFailed() + "\n");
- countDownLatch.countDown();
- return -1;
- }
- } catch (LfasrException e) {
- // Result query failed; parse the error description
- Message resultMsg = JSON.parseObject(e.getMessage(), Message.class);
- //System.out.println(datePrefix + "ecode=" + resultMsg.getErr_no() + "\n");
- //System.out.println(datePrefix + "failed=" + resultMsg.getFailed() + "\n");
- logDetailText.append(datePrefix + "ecode=" + resultMsg.getErr_no() + "\n");
- logDetailText.append(datePrefix + "failed=" + resultMsg.getFailed() + "\n");
- countDownLatch.countDown();
- return -1;
- }
- countDownLatch.countDown();
- return 1;
- }
- }
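The class above builds the output path with a hard-coded `"\\"` separator, which only works on Windows. As a small, hedged alternative, the sketch below writes the text with `java.nio.file` so the separator is chosen by the platform. `ResultWriter` is a name I made up for illustration and is not part of the project.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ResultWriter {
    /** Writes text as UTF-8 to <outputDir>/<currentTimeMillis>.txt and returns the file path. */
    static Path write(String outputDir, String text) {
        Path target = Paths.get(outputDir, System.currentTimeMillis() + ".txt");
        try {
            Path parent = target.getParent();
            if (parent != null) {
                Files.createDirectories(parent); // create the output folder if it is missing
            }
            return Files.write(target, text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path out = write(System.getProperty("java.io.tmpdir"), "hello");
        System.out.println("written to " + out);
    }
}
```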
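Putting the code together, the final result looks like this:
Code: https://github.com/ChenWenKaiVN/VoiceToText
Next steps
1. The main thread appears to freeze during conversion. The interaction between the SWT UI thread and non-UI threads needs a closer look.
https://blog.csdn.net/dollyn/article/details/38582743/
2. Look into adding a progress bar that shows the conversion progress.
3. The settings UI should be tied more closely to the SDK configuration file; many values are still hard-coded there.
4. Look into packaging an executable jar that bundles the JRE.
5. Study the technical principles behind speech recognition.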