This article shows how to provide a speech recognition service for SpeechRecognizer to the system, how third-party apps use it, and how the system ties the two together.



First, we need an implementation of the recognition service. In short, extend RecognitionService and implement its most important abstract methods:

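  1. Define an abstract interface for the recognition engine, IRecognitionEngine;

  2. When RecognitionService starts, obtain the engine implementation supplied by the engine provider;

  3. In onStartListening(), parse the parameters of the recognition request Intent, such as the language and the maximum number of results, pack them into a JSON string, and pass it to the engine's start method. The engine has to adjust its recognition to these parameters and report states and results as recognition proceeds, for example speech start via beginningOfSpeech(), speech end via endOfSpeech(), and intermediate results via partialResults();

  4. In onStopListening(), call the engine's stop method. The engine again has to report back, for example the final result via results();

  5. In onCancel(), call the engine's release() to unbind the engine and free its resources.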

    import android.content.Intent
    import android.speech.RecognitionService
    import android.speech.RecognitionService.Callback

    // Abstraction over the recognition engine supplied by a provider.
    interface IRecognitionEngine {
        fun init()
        fun startASR(parameter: String, callback: Callback?)
        fun stopASR(callback: Callback?)
        fun release(callback: Callback?)
    }

    class CommonRecognitionService : RecognitionService() {
        // Resolved lazily on first use; RecognitionProvider is the app's own factory.
        private val recognitionEngine: IRecognitionEngine by lazy {
            RecognitionProvider.provideRecognition()
        }

        override fun onCreate() {
            super.onCreate()
            recognitionEngine.init()
        }

        override fun onStartListening(intent: Intent?, callback: Callback?) {
            val params: String = "" // TODO: parse parameters from the intent
            recognitionEngine.startASR(params, callback)
        }

        override fun onStopListening(callback: Callback?) {
            recognitionEngine.stopASR(callback)
        }

        override fun onCancel(callback: Callback?) {
            recognitionEngine.release(callback)
        }
    }
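The `params` value in onStartListening() is left as a TODO above. As one possible shape for it, here is a minimal plain-Java sketch that packs a few request options into a JSON string. The class `EngineParams`, its method and the JSON keys are hypothetical and not part of any Android API. In a real service the values would be read from the request Intent extras (for example RecognizerIntent.EXTRA_LANGUAGE and EXTRA_MAX_RESULTS), and a JSON library would normally replace the manual string building.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: packs recognition options into a JSON string for the engine. */
final class EngineParams {
    private EngineParams() {}

    /** Builds {"language":"...","maxResults":N,"partialResults":true|false}. */
    static String toJson(String languageTag, int maxResults, boolean partialResults) {
        // LinkedHashMap keeps the insertion order, so the output is deterministic.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("language", quote(languageTag));
        fields.put("maxResults", Integer.toString(maxResults));
        fields.put("partialResults", Boolean.toString(partialResults));

        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            json.add(quote(e.getKey()) + ":" + e.getValue());
        }
        return json.toString();
    }

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // prints {"language":"en-US","maxResults":3,"partialResults":true}
        System.out.println(toJson("en-US", 3, true));
    }
}
```

Passing a single string keeps the IRecognitionEngine interface independent of Android types, at the cost of the engine having to parse the string again.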

And of course, don't forget to declare the service in the manifest:

    <service
        android:name=".recognition.service.CommonRecognitionService"
        android:exported="true">
        <intent-filter>
            <action android:name="android.speech.RecognitionService"/>
        </intent-filter>
    </service>



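To request recognition, an app first has to declare the permission for capturing audio. Because RECORD_AUDIO is a runtime permission, the app also needs the code to request it at runtime.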

    <manifest ... >
        <uses-permission android:name="android.permission.RECORD_AUDIO"/>
    </manifest>

In addition, on Android 11 and above, the app must add a <queries> declaration so that the recognition service is visible to it.

    <manifest ... >
        ...
        <queries>
            <intent>
                <action
                    android:name="android.speech.RecognitionService" />
            </intent>
        </queries>
    </manifest>

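Once the permission is granted, it is best to check first whether any recognition service is available on the system, and to stop right there if none is.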

    class RecognitionHelper(val context: Context) {
        fun prepareRecognition(): Boolean {
            if (!SpeechRecognizer.isRecognitionAvailable(context)) {
                Log.e("RecognitionHelper", "System has no recognition service yet.")
                return false
            }
            ...
        }
    }

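If a service is available, create the entry point for recognition through the static factory method of SpeechRecognizer. This method must be called on the main thread.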

    class RecognitionHelper(val context: Context) : RecognitionListener {
        private lateinit var recognizer: SpeechRecognizer

        fun prepareRecognition(): Boolean {
            ...
            recognizer = SpeechRecognizer.createSpeechRecognizer(context)
            ...
        }
    }


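To target a specific recognition service instead of the system default, use the overload that also takes the service's ComponentName:

    public static SpeechRecognizer createSpeechRecognizer (Context context,
            ComponentName serviceComponent)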

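Next, set the recognition listener, whose callbacks correspond to the states of the recognition process, for example:

  • onPartialResults() delivers intermediate results. Read the recognized strings from the Bundle with getStringArrayList(String), using the key SpeechRecognizer#RESULTS_RECOGNITION;

  • onResults() delivers the final results, parsed in the same way;

  • onBeginningOfSpeech(): the start of speech was detected;

  • onEndOfSpeech(): the end of speech was detected;

  • onError() reports errors whose values correspond to the SpeechRecognizer#ERROR_XXX constants. For example, ERROR_INSUFFICIENT_PERMISSIONS is returned when the microphone permission is missing;

  • and so on.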

    class RecognitionHelper(val context: Context) : RecognitionListener {
        ...
        fun prepareRecognition(): Boolean {
            ...
            recognizer.setRecognitionListener(this)
            return true
        }

        override fun onReadyForSpeech(p0: Bundle?) {
        }

        override fun onBeginningOfSpeech() {
        }

        override fun onRmsChanged(p0: Float) {
        }

        override fun onBufferReceived(p0: ByteArray?) {
        }

        override fun onEndOfSpeech() {
        }

        override fun onError(p0: Int) {
        }

        override fun onResults(p0: Bundle?) {
        }

        override fun onPartialResults(p0: Bundle?) {
        }

        override fun onEvent(p0: Int, p1: Bundle?) {
        }
    }
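Since onError() only delivers an integer, a small lookup makes the logs easier to read. The following is a minimal sketch in plain Java. The class `RecognitionErrors` is hypothetical, and the numeric values are copied from the documented SpeechRecognizer error constants so that the sketch compiles without the Android SDK. In an Android project you would reference the SpeechRecognizer.ERROR_XXX constants directly.

```java
/** Hypothetical helper: maps SpeechRecognizer error codes to readable names for logging. */
final class RecognitionErrors {
    private RecognitionErrors() {}

    // Values copied from the documented android.speech.SpeechRecognizer constants.
    static final int ERROR_NETWORK_TIMEOUT = 1;
    static final int ERROR_NETWORK = 2;
    static final int ERROR_AUDIO = 3;
    static final int ERROR_SERVER = 4;
    static final int ERROR_CLIENT = 5;
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_RECOGNIZER_BUSY = 8;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    /** Returns the constant name for a code, or UNKNOWN(code) for anything not listed here. */
    static String describe(int errorCode) {
        switch (errorCode) {
            case ERROR_NETWORK_TIMEOUT: return "ERROR_NETWORK_TIMEOUT";
            case ERROR_NETWORK: return "ERROR_NETWORK";
            case ERROR_AUDIO: return "ERROR_AUDIO";
            case ERROR_SERVER: return "ERROR_SERVER";
            case ERROR_CLIENT: return "ERROR_CLIENT";
            case ERROR_SPEECH_TIMEOUT: return "ERROR_SPEECH_TIMEOUT";
            case ERROR_NO_MATCH: return "ERROR_NO_MATCH";
            case ERROR_RECOGNIZER_BUSY: return "ERROR_RECOGNIZER_BUSY";
            case ERROR_INSUFFICIENT_PERMISSIONS: return "ERROR_INSUFFICIENT_PERMISSIONS";
            default: return "UNKNOWN(" + errorCode + ")";
        }
    }
}
```

Codes added in newer API levels are not listed here and fall through to the UNKNOWN branch.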

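Then build the Intent that carries the recognition request and start listening. Its extras include:

  • EXTRA_LANGUAGE_MODEL: required, tells the recognizer which speech model to prefer, either LANGUAGE_MODEL_FREE_FORM or LANGUAGE_MODEL_WEB_SEARCH;

  • EXTRA_PARTIAL_RESULTS: optional, whether the service should report intermediate results during recognition, false by default;

  • EXTRA_MAX_RESULTS: optional, the maximum number of results the service may return, an int;

  • EXTRA_LANGUAGE: optional, the recognition language, which defaults to the language of Locale.getDefault() (the author uses the recognition service provided by Google Assistant, which does not support Chinese for now, so the Locale configured here is ENGLISH);

  • and so on.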

Two further points to note: 1. startListening() must be called after the listener above has been set; 2. it must be called on the main thread:

    class RecognitionHelper(val context: Context) : RecognitionListener {
        ...
        fun startRecognition() {
            val intent = createRecognitionIntent()
            recognizer.startListening(intent)
        }
        ...
    }

    fun createRecognitionIntent() = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true)
        putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3)
        // EXTRA_LANGUAGE is documented as an IETF BCP 47 language tag string, not a Locale object.
        putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.ENGLISH.toLanguageTag())
    }

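Next, we add a layout that uses the RecognitionHelper above to initialize and start recognition, and displays the results.

We also add an interface for passing the intermediate and final recognition results to the UI, so that the data from RecognitionListener reaches it.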

    interface ASRResultListener {
        fun onPartialResult(result: String)
        fun onFinalResult(result: String)
    }

    class RecognitionHelper(private val context: Context) : RecognitionListener {
        ...
        private lateinit var mResultListener: ASRResultListener

        fun prepareRecognition(resultListener: ASRResultListener): Boolean {
            ...
            mResultListener = resultListener
            ...
        }
        ...
        override fun onPartialResults(bundle: Bundle?) {
            bundle?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)?.let {
                Log.d(
                    "RecognitionHelper", "onPartialResults() with:$bundle" +
                        " results:$it"
                )
                // Forward only the top hypothesis; skip the callback if the list is empty.
                it.firstOrNull()?.let { top -> mResultListener.onPartialResult(top) }
            }
        }

        override fun onResults(bundle: Bundle?) {
            bundle?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)?.let {
                Log.d(
                    "RecognitionHelper", "onResults() with:$bundle" +
                        " results:$it"
                )
                // Forward only the top hypothesis; skip the callback if the list is empty.
                it.firstOrNull()?.let { top -> mResultListener.onFinalResult(top) }
            }
        }
    }

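Then the Activity implements this interface and shows the data in a TextView. To make the intermediate results visible to the naked eye, each TextView update is scheduled 300 ms after the previous one.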

    class RecognitionActivity : AppCompatActivity(), ASRResultListener {
        private lateinit var binding: RecognitionLayoutBinding
        private val recognitionHelper: RecognitionHelper by lazy {
            RecognitionHelper(this)
        }
        private var updatingTextTimeDelayed = 0L
        private val mainHandler = Handler(Looper.getMainLooper())

        override fun onCreate(savedInstanceState: Bundle?) {
            ...
            if (!recognitionHelper.prepareRecognition(this)) {
                Toast.makeText(this, "Recognition not available", Toast.LENGTH_SHORT).show()
                return
            }
            binding.start.setOnClickListener {
                Log.d("RecognitionHelper", "startRecognition()")
                recognitionHelper.startRecognition()
            }
            binding.stop.setOnClickListener {
                Log.d("RecognitionHelper", "stopRecognition()")
                recognitionHelper.stopRecognition()
            }
        }

        override fun onStop() {
            super.onStop()
            Log.d("RecognitionHelper", "onStop()")
            recognitionHelper.releaseRecognition()
        }

        override fun onPartialResult(result: String) {
            Log.d("RecognitionHelper", "onPartialResult() with result:$result")
            updatingTextTimeDelayed += 300L
            mainHandler.postDelayed(
                {
                    Log.d("RecognitionHelper", "onPartialResult() updating")
                    binding.recoAsr.text = result
                }, updatingTextTimeDelayed
            )
        }

        override fun onFinalResult(result: String) {
            Log.d("RecognitionHelper", "onFinalResult() with result:$result")
            updatingTextTimeDelayed += 300L
            mainHandler.postDelayed(
                {
                    Log.d("RecognitionHelper", "onFinalResult() updating")
                    binding.recoAsr.text = result
                }, updatingTextTimeDelayed
            )
        }
    }

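When we tap the "START RECOGNITION" button, the mic-recording indicator appears in the top-right corner of the phone. After we say "Can you introduce yourself", the text appears in the TextView step by step, like a typewriter.

The log below also reflects the recognition process: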

    // Initialization
    08-15 22:43:13.963 6879 6879 D RecognitionHelper: onCreate()
    08-15 22:43:14.037 6879 6879 E RecognitionHelper: audio recording permission granted
    08-15 22:43:14.050 6879 6879 D RecognitionHelper: onStart()
    // Recognition starts
    08-15 22:43:41.491 6879 6879 D RecognitionHelper: startRecognition()
    08-15 22:43:41.577 6879 6879 D RecognitionHelper: onReadyForSpeech()
    08-15 22:43:41.776 6879 6879 D RecognitionHelper: onRmsChanged() with:-2.0
    ...
    08-15 22:43:46.532 6879 6879 D RecognitionHelper: onRmsChanged() with:-0.31999993
    // Start of speech detected
    08-15 22:43:46.540 6879 6879 D RecognitionHelper: onBeginningOfSpeech()
    // 1st partial result: Can
    08-15 22:43:46.541 6879 6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can]
    08-15 22:43:46.541 6879 6879 D RecognitionHelper: onPartialResult() with result:Can
    // 2nd partial result: Can you
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you]
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResult() with result:Can you
    // 3rd partial result: Can you in
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you in], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you in]
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResult() with result:Can you in
    // 4th partial result: Can you intro
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you intro], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you intro]
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResult() with result:Can you intro
    // nth partial result: Can you introduce yourself
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResults() with:Bundle[{results_recognition=[Can you introduce yourself], android.speech.extra.UNSTABLE_TEXT=[]}] results:[Can you introduce yourself]
    08-15 22:43:46.542 6879 6879 D RecognitionHelper: onPartialResult() with result:Can you introduce yourself
    // End of speech detected
    08-15 22:43:46.543 6879 6879 D RecognitionHelper: onEndOfSpeech()
    08-15 22:43:46.543 6879 6879 D RecognitionHelper: onEndOfSpeech()
    08-15 22:43:46.545 6879 6879 D RecognitionHelper: onResults() with:Bundle[{results_recognition=[Can you introduce yourself], confidence_scores=[0.0]}] results:[Can you introduce yourself]
    // Final result recognized: Can you introduce yourself
    08-15 22:43:46.545 6879 6879 D RecognitionHelper: onFinalResult() with result:Can you introduce yourself



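Unlike Text-to-speech, SpeechRecognizer has no dedicated entry in Settings. Its default app is set together with the VoiceInteraction app.

The following command, however, dumps the system's default recognition service:

    adb shell settings get secure voice_recognition_service

On the emulator, the dump shows that Google's recognition service is installed by default.

On a Samsung device, it is the recognition service provided by Samsung.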


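Let's start from the APIs mentioned when requesting recognition and explore how the recognition service works.

Checking whether a service is available is simple: query PackageManager with the Recognition-specific action (*"android.speech.RecognitionService"*). If at least one service can handle it, the system is considered to have a recognition service.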

    public static boolean isRecognitionAvailable(final Context context) {
        final List<ResolveInfo> list = context.getPackageManager().queryIntentServices(
                new Intent(RecognitionService.SERVICE_INTERFACE), 0);
        return list != null && list.size() != 0;
    }


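As described earlier for requesting recognition, the static method createSpeechRecognizer() performs the initialization. Internally it checks that the Context is not null and that the call comes from the main thread, and records the target service component if the caller specified one.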

    public static SpeechRecognizer createSpeechRecognizer(final Context context) {
        return createSpeechRecognizer(context, null);
    }

    public static SpeechRecognizer createSpeechRecognizer(final Context context,
            final ComponentName serviceComponent) {
        if (context == null) {
            throw new IllegalArgumentException("Context cannot be null");
        }
        checkIsCalledFromMainThread();
        return new SpeechRecognizer(context, serviceComponent);
    }

    private SpeechRecognizer(final Context context, final ComponentName serviceComponent) {
        mContext = context;
        mServiceComponent = serviceComponent;
        mOnDevice = false;
    }

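With the SpeechRecognizer in hand, calling setRecognitionListener() is slightly more involved:

  1. Check that the caller is on the main thread;

  2. Create the dedicated Message MSG_CHANGE_LISTENER;

  3. If the connection to SpeechRecognitionManagerService, the system service that handles recognition requests, has not been established yet, put the Message into the pending queue. Once the connection is created when recognition is started later, the Message is sent to the Handler;

  4. Otherwise, send it to the Handler directly to await dispatch.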

    public void setRecognitionListener(RecognitionListener listener) {
        checkIsCalledFromMainThread();
        putMessage(Message.obtain(mHandler, MSG_CHANGE_LISTENER, listener));
    }

    private void putMessage(Message msg) {
        if (mService == null) {
            mPendingTasks.offer(msg);
        } else {
            mHandler.sendMessage(msg);
        }
    }

The Handler then updates the Listener instance through handleChangeListener().

    private Handler mHandler = new Handler(Looper.getMainLooper()) {
        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                ...
                case MSG_CHANGE_LISTENER:
                    handleChangeListener((RecognitionListener) msg.obj);
                    break;
                ...
            }
        }
    };

    private void handleChangeListener(RecognitionListener listener) {
        if (DBG) Log.d(TAG, "handleChangeListener, listener=" + listener);
        mListener.mInternalListener = listener;
    }
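The queue-then-flush behaviour of putMessage() can be modelled without Android classes. The sketch below is a simplified illustration, not the framework code: `PendingTaskModel` is a hypothetical class, messages are plain strings instead of android.os.Message, and a list stands in for the Handler. The flush step follows the description above, where queued messages are sent to the Handler once the connection exists.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

/** Simplified, Android-free model of SpeechRecognizer's pending-message queue. */
final class PendingTaskModel {
    private final Queue<String> pendingTasks = new LinkedList<>();
    private final List<String> delivered = new ArrayList<>(); // stands in for the Handler
    private boolean connected; // stands in for "mService != null"

    /** Mirrors putMessage(): queue while disconnected, deliver directly otherwise. */
    void putMessage(String msg) {
        if (!connected) {
            pendingTasks.offer(msg);
        } else {
            delivered.add(msg);
        }
    }

    /** Models the moment the connection is established: flush the queue in FIFO order. */
    void onConnected() {
        connected = true;
        while (!pendingTasks.isEmpty()) {
            delivered.add(pendingTasks.poll());
        }
    }

    /** Messages that have reached the (modelled) Handler so far, in delivery order. */
    List<String> delivered() {
        return delivered;
    }
}
```

Because the queue is FIFO, a listener change queued before the start request is still delivered first, which is why the listener has to be set before startListening() is called.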


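startListening() first makes sure the recognition request Intent is not null, otherwise it throws an IllegalArgumentException with the message "intent must not be null". It then checks that the caller is on the main thread, otherwise it throws an exception with the message "SpeechRecognizer should be used only from the application's main thread".

After that it makes sure the service is ready. If it is not, it calls connectToSystemService() to establish the connection to the recognition service.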

    public void startListening(final Intent recognizerIntent) {
        if (recognizerIntent == null) {
            throw new IllegalArgumentException("intent must not be null");
        }
        checkIsCalledFromMainThread();
        if (mService == null) {
            // First time connection: first establish a connection, then dispatch #startListening.
            connectToSystemService();
        }
        putMessage(Message.obtain(mHandler, MSG_START, recognizerIntent));
    }

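connectToSystemService() begins by calling getSpeechRecognizerComponentName() to obtain the component name of the recognition service. The name either comes from the requesting app, if it specified one, or from VOICE_RECOGNITION_SERVICE, the current recognition service stored in SettingsProvider, which in practice matches the VoiceInteraction app. If no component name is found, the method reports an error and returns.

If the component name does exist, a request to create a session is sent through IRecognitionServiceManager.aidl to SpeechRecognitionManagerService, the system service in SystemServer that manages speech recognition.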

    /** Establishes a connection to system server proxy and initializes the session. */
    private void connectToSystemService() {
        if (!maybeInitializeManagerService()) {
            return;
        }
        ComponentName componentName = getSpeechRecognizerComponentName();
        if (!mOnDevice && componentName == null) {
            mListener.onError(ERROR_CLIENT);
            return;
        }
        try {
            mManagerService.createSession(
                    componentName,
                    mClientToken,
                    mOnDevice,
                    new IRecognitionServiceManagerCallback.Stub(){
                        @Override
                        public void onSuccess(IRecognitionService service) throws RemoteException {
                            mService = service;
                            while (!mPendingTasks.isEmpty()) {
                                mHandler.sendMessage(mPendingTasks.poll());
                            }
                        }
                        @Override
                        public void onError(int errorCode) throws RemoteException {
                            mListener.onError(errorCode);
                        }
                    });
        } catch (RemoteException e) {
            e.rethrowFromSystemServer();
        }
    }
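The lookup order described for getSpeechRecognizerComponentName() (an explicitly requested component first, then the stored setting) can be sketched in plain Java. This is an illustrative model under stated assumptions, not the framework code: `RecognizerComponentResolver` and its nested `Component` class are hypothetical, the settings value is assumed to be a flattened "package/class" string, and the parsing follows the convention that a class name starting with "." is relative to the package.

```java
import java.util.Optional;

/** Hypothetical model of how the target recognition component is chosen. */
final class RecognizerComponentResolver {
    /** Minimal stand-in for android.content.ComponentName. */
    static final class Component {
        final String packageName;
        final String className;

        Component(String packageName, String className) {
            this.packageName = packageName;
            this.className = className;
        }
    }

    private RecognizerComponentResolver() {}

    /** Prefers the component requested by the app, else falls back to the stored setting. */
    static Optional<Component> resolve(Component requested, String settingsValue) {
        if (requested != null) {
            return Optional.of(requested);
        }
        if (settingsValue == null || settingsValue.isEmpty()) {
            return Optional.empty(); // no default service configured
        }
        return parseFlattened(settingsValue);
    }

    /** Parses "pkg/cls"; a class starting with '.' is expanded to pkg + cls. */
    static Optional<Component> parseFlattened(String flat) {
        int sep = flat.indexOf('/');
        if (sep < 0 || sep + 1 >= flat.length()) {
            return Optional.empty();
        }
        String pkg = flat.substring(0, sep);
        String cls = flat.substring(sep + 1);
        if (cls.startsWith(".")) {
            cls = pkg + cls;
        }
        return Optional.of(new Component(pkg, cls));
    }
}
```

For instance, `resolve(null, "com.example.asr/.MyService")` yields the class name `com.example.asr.MyService`, while an empty setting yields an empty Optional, the case in which the excerpt above reports ERROR_CLIENT. The package and class names here are made up for the example.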

SpeechRecognitionManagerService handles the request by delegating to SpeechRecognitionManagerServiceImpl.

    // SpeechRecognitionManagerService.java
    final class SpeechRecognitionManagerServiceStub extends IRecognitionServiceManager.Stub {
        @Override
        public void createSession(
                ComponentName componentName,
                IBinder clientToken,
                boolean onDevice,
                IRecognitionServiceManagerCallback callback) {
            int userId = UserHandle.getCallingUserId();
            synchronized (mLock) {
                SpeechRecognitionManagerServiceImpl service = getServiceForUserLocked(userId);
                service.createSessionLocked(componentName, clientToken, onDevice, callback);
            }
        }
        ...
    }

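SpeechRecognitionManagerServiceImpl in turn leaves the binding to the app's recognition service to the RemoteSpeechRecognitionService class. As the code shows, RemoteSpeechRecognitionService is responsible for communicating with the recognition service.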

    // SpeechRecognitionManagerServiceImpl.java
    void createSessionLocked( ... ) {
        ...
        RemoteSpeechRecognitionService service = createService(creatorCallingUid, serviceComponent);
        ...
        service.connect().thenAccept(binderService -> {
            if (binderService != null) {
                try {
                    callback.onSuccess(new IRecognitionService.Stub() {
                        @Override
                        public void startListening( ... )
                                throws RemoteException {
                            ...
                            service.startListening(recognizerIntent, listener, attributionSource);
                        }
                        ...
                    });
                } catch (RemoteException e) {
                    tryRespondWithError(callback, SpeechRecognizer.ERROR_CLIENT);
                }
            } else {
                tryRespondWithError(callback, SpeechRecognizer.ERROR_CLIENT);
            }
        });
    }

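Once the connection to the recognition service app has been established, or if it already exists, the MSG_START Message is sent and the main Handler continues in handleStartListening(). That method first checks again that mService exists, to avoid an NPE.

It then sends the start-listening request to the AIDL proxy object.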

    private Handler mHandler = new Handler(Looper.getMainLooper()) {
        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                case MSG_START:
                    handleStartListening((Intent) msg.obj);
                    break;
                ...
            }
        }
    };

    private void handleStartListening(Intent recognizerIntent) {
        if (!checkOpenConnection()) {
            return;
        }
        try {
            mService.startListening(recognizerIntent, mListener, mContext.getAttributionSource());
        }
        ...
    }

The AIDL interface is defined in the following file:

    // android/speech/IRecognitionService.aidl
    oneway interface IRecognitionService {
        void startListening(in Intent recognizerIntent, in IRecognitionListener listener,
                in AttributionSource attributionSource);
        void stopListening(in IRecognitionListener listener);
        void cancel(in IRecognitionListener listener, boolean isShutdown);
        ...
    }

On the system side, this AIDL interface is implemented in the recognition management class SpeechRecognitionManagerServiceImpl:

    // com/android/server/speech/SpeechRecognitionManagerServiceImpl.java
    void createSessionLocked( ... ) {
        ...
        service.connect().thenAccept(binderService -> {
            if (binderService != null) {
                try {
                    callback.onSuccess(new IRecognitionService.Stub() {
                        @Override
                        public void startListening( ...) {
                            attributionSource.enforceCallingUid();
                            if (!attributionSource.isTrusted(mMaster.getContext())) {
                                attributionSource = mMaster.getContext()
                                        .getSystemService(PermissionManager.class)
                                        .registerAttributionSource(attributionSource);
                            }
                            service.startListening(recognizerIntent, listener, attributionSource);
                        }
                        ...
                    });
                } ...
            } else {
                tryRespondWithError(callback, SpeechRecognizer.ERROR_CLIENT);
            }
        });
    }

After that, the request is relayed through one more layer, RemoteSpeechRecognitionService:

    // com/android/server/speech/RemoteSpeechRecognitionService.java
    void startListening(Intent recognizerIntent, IRecognitionListener listener,
            @NonNull AttributionSource attributionSource) {
        ...
        synchronized (mLock) {
            if (mSessionInProgress) {
                tryRespondWithError(listener, SpeechRecognizer.ERROR_RECOGNIZER_BUSY);
                return;
            }
            mSessionInProgress = true;
            mRecordingInProgress = true;
            mListener = listener;
            mDelegatingListener = new DelegatingListener(listener, () -> {
                synchronized (mLock) {
                    resetStateLocked();
                }
            });
            final DelegatingListener listenerToStart = this.mDelegatingListener;
            run(service ->
                    service.startListening(
                            recognizerIntent,
                            listenerToStart,
                            attributionSource));
        }
    }
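The busy check at the top of this method is a plain one-session-at-a-time guard. The following sketch models it in isolation. `SessionGuardModel` is a hypothetical class, `OK` is a made-up marker rather than an Android constant, and `reset()` is assumed to correspond to what resetStateLocked() does when the delegating listener signals that the session has finished.

```java
/** Hypothetical, Android-free model of the one-session-at-a-time guard. */
final class SessionGuardModel {
    // Same value as the documented SpeechRecognizer.ERROR_RECOGNIZER_BUSY constant.
    static final int ERROR_RECOGNIZER_BUSY = 8;
    // Made-up marker for "request accepted"; not an Android constant.
    static final int OK = 0;

    private final Object lock = new Object();
    private boolean sessionInProgress;

    /** Accepts the request only if no session is running, otherwise reports busy. */
    int startListening() {
        synchronized (lock) {
            if (sessionInProgress) {
                return ERROR_RECOGNIZER_BUSY;
            }
            sessionInProgress = true;
            return OK;
        }
    }

    /** Clears the flag so that the next request is accepted again. */
    void reset() {
        synchronized (lock) {
            sessionInProgress = false;
        }
    }
}
```

Calling startListening() twice in a row returns `OK` and then `ERROR_RECOGNIZER_BUSY`; after reset() the next call is accepted again. For a client this means a second startListening() while a session is still running is answered with an error rather than queued.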

Finally the concrete service implementation is called, which naturally lives in RecognitionService. Its Binder thread sends a MSG_START_LISTENING Message to the main thread:

    /** Binder of the recognition service */
    private static final class RecognitionServiceBinder extends IRecognitionService.Stub {
        ...
        @Override
        public void startListening(Intent recognizerIntent, IRecognitionListener listener,
                @NonNull AttributionSource attributionSource) {
            final RecognitionService service = mServiceRef.get();
            if (service != null) {
                service.mHandler.sendMessage(Message.obtain(service.mHandler,
                        MSG_START_LISTENING, service.new StartListeningArgs(
                                recognizerIntent, listener, attributionSource)));
            }
        }
        ...
    }

    private final Handler mHandler = new Handler() {
        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                case MSG_START_LISTENING:
                    StartListeningArgs args = (StartListeningArgs) msg.obj;
                    dispatchStartListening(args.mIntent, args.mListener, args.mAttributionSource);
                    break;
                ...
            }
        }
    };

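After receiving the Message, the Handler likewise hands the work to dispatchStartListening(). Its most important job is to check whether the Intent that started recognition supplies an audio source through EXTRA_AUDIO_SOURCE, or whether the requesting app holds the RECORD_AUDIO permission.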

    private void dispatchStartListening(Intent intent, final IRecognitionListener listener,
            @NonNull AttributionSource attributionSource) {
        try {
            if (mCurrentCallback == null) {
                boolean preflightPermissionCheckPassed =
                        intent.hasExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE)
                        || checkPermissionForPreflightNotHardDenied(attributionSource);
                if (preflightPermissionCheckPassed) {
                    mCurrentCallback = new Callback(listener, attributionSource);
                    RecognitionService.this.onStartListening(intent, mCurrentCallback);
                }
                if (!preflightPermissionCheckPassed || !checkPermissionAndStartDataDelivery()) {
                    listener.onError(SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS);
                    if (preflightPermissionCheckPassed) {
                        // If we attempted to start listening, cancel the callback
                        RecognitionService.this.onCancel(mCurrentCallback);
                        dispatchClearCallback();
                    }
                }
                ...
            }
        } catch (RemoteException e) {
            Log.d(TAG, "onError call from startListening failed");
        }
    }
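The branching in this excerpt is easy to misread, so here is a small decision-table sketch of it in plain Java. It is based only on the lines shown above, with the elided part ignored. `StartListeningDecision`, its `Outcome` enum and the three boolean parameters are hypothetical names that stand in for the Intent extra check and the two permission checks.

```java
/** Hypothetical model of the permission branching in dispatchStartListening(). */
final class StartListeningDecision {
    enum Outcome {
        /** onStartListening() was called and recognition keeps running. */
        STARTED,
        /** The preflight check failed: an error is reported, the service is never started. */
        REJECTED_BEFORE_START,
        /** onStartListening() was called, then the data-delivery check failed and onCancel() ran. */
        STARTED_THEN_CANCELLED
    }

    private StartListeningDecision() {}

    static Outcome decide(boolean hasAudioSourceExtra,
                          boolean preflightNotHardDenied,
                          boolean dataDeliveryGranted) {
        // Either condition is enough to pass the preflight check.
        boolean preflightPassed = hasAudioSourceExtra || preflightNotHardDenied;
        if (!preflightPassed) {
            // The data-delivery check is short-circuited and never evaluated here.
            return Outcome.REJECTED_BEFORE_START;
        }
        return dataDeliveryGranted ? Outcome.STARTED : Outcome.STARTED_THEN_CANCELLED;
    }
}
```

Both failure outcomes reach the client as ERROR_INSUFFICIENT_PERMISSIONS. The difference is only visible to the service, which in the second case receives onStartListening() followed by onCancel().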

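If either condition holds, the service's onStartListening() implementation is called to start recognition. The actual logic is up to each service, which eventually reports the recognition state and results through the Callback. These reports correspond to the RecognitionListener callbacks described earlier for requesting recognition.

    protected abstract void onStartListening(Intent recognizerIntent, Callback listener);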

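Stopping recognition & cancelling the service

The later calls to stop recognition, stopListening(), and to cancel the service, cancel(), follow essentially the same chain as starting recognition, and end up in the onStopListening() and onCancel() callbacks of RecognitionService respectively.

The only difference is that stop merely halts recognition for the time being while the connection to the recognition app stays alive, whereas cancel resets the related state and, when isShutdown is set, also unbinds from the service, as the cancel() of RemoteSpeechRecognitionService shows: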

    void cancel(IRecognitionListener listener, boolean isShutdown) {
        ...
        synchronized (mLock) {
            ...
            mRecordingInProgress = false;
            mSessionInProgress = false;
            mDelegatingListener = null;
            mListener = null;
            // Schedule to unbind after cancel is delivered.
            if (isShutdown) {
                run(service -> unbind());
            }
        }
    }




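Finally, let's look at the SpeechRecognizer chain as a whole:

  • An app that needs speech recognition sends a request through SpeechRecognizer;

  • When recognition is started, SpeechRecognizer reads the default recognition service component from SettingsProvider and notifies the SpeechRecognitionManagerService system service in SystemServer through IRecognitionServiceManager.aidl;

  • SpeechRecognitionManagerService does not do the binding itself but leaves the scheduling to SpeechRecognitionManagerServiceImpl;

  • SpeechRecognitionManagerServiceImpl in turn leaves binding and management to RemoteSpeechRecognitionService;

  • RemoteSpeechRecognitionService talks to the concrete recognition service, RecognitionService, through IRecognitionService.aidl;

  • RecognitionService switches to the main thread through its Handler, calls the recognition engine to process the request, and reports the recognition state and results through its Callback inner class;

  • After that, RecognitionService passes the results through IRecognitionListener.aidl to SystemServer, and from there on to the app that issued the request.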






