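This post walks through the main steps AMS (ActivityManagerService) takes when an app crashes.
I. Setting the exception handler
On Android, after an application process is forked, a default uncaught exception handler is installed for its VM.
If any thread in the running process throws an exception that nothing catches,
the exception is ultimately delivered to this handler.
Let's first look at the Android N code that installs it.
In runSelectLoop in ZygoteInit.java: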
- private static void runSelectLoop(String abiList) throws MethodAndArgsCaller {
- ...........
- while (true) {
- ..........
- for (int i = pollFds.length - 1; i >= 0; --i) {
- ..........
- if (i == 0) {
- // Zygote's server socket received a message; set up a ZygoteConnection for it
- ZygoteConnection newPeer = acceptCommandPeer(abiList);
- peers.add(newPeer);
- fds.add(newPeer.getFileDesciptor());
- } else {
- // An established ZygoteConnection received a message; call its runOnce function
- boolean done = peers.get(i).runOnce();
- .............
- }
- }
- }
- }
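After zygote starts, it creates a server socket in its own process, dedicated to receiving process-creation requests.
As the code above shows, when such a request arrives zygote creates a ZygoteConnection and calls its runOnce function: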
- boolean runOnce() throws ZygoteInit.MethodAndArgsCaller {
- ...............
- try {
- ...........
- // Fork the child process
- pid = Zygote.forkAndSpecialize(.......);
- } catch (ErrnoException ex) {
- ............
- } catch (IllegalArgumentException ex) {
- ...........
- } catch (ZygoteSecurityException ex) {
- ..........
- }
-
- try {
- if (pid == 0) {
- ........
- // The fork succeeded and we are in the child; continue its setup
- handleChildProc(parsedArgs, descriptors, childPipeFd, newStderr);
- ........
- } else {
- ...........
- }
- } finally {
- ..........
- }
- }
Next, let's follow handleChildProc:
- private void handleChildProc(......) {
- ............
- if (parsedArgs.invokeWith != null) {
- ..........
- } else {
- // Enter zygoteInit in RuntimeInit
- RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion,
- parsedArgs.remainingArgs, null /* classLoader */);
- }
- }
Following the flow into zygoteInit in RuntimeInit:
- public static final void zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
- throws ZygoteInit.MethodAndArgsCaller {
- ............
- // Continue into commonInit
- commonInit();
- ............
- }
-
- private static final void commonInit() {
- ...........
- /* set default handler; this applies to all threads in the VM */
- // This is the call we were looking for
- Thread.setDefaultUncaughtExceptionHandler(new UncaughtHandler());
- ...........
- }
As the code shows, once the process has been forked, UncaughtHandler is installed as the default exception handler during the process's commonInit stage.
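The effect of a process-wide default handler can be seen in plain Java, outside Android. The following is only a minimal sketch: the class name DefaultHandlerDemo and the method crashAndCapture are made up for illustration, and the handler here merely records what it saw instead of reporting to AMS.

```java
import java.util.concurrent.atomic.AtomicReference;

final class DefaultHandlerDemo {
    /** Starts a thread that throws, and returns what the default handler observed. */
    static String crashAndCapture(String message) {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();

        // Applies to every thread in the VM that has no handler of its own.
        Thread.setDefaultUncaughtExceptionHandler(
                (thread, error) -> seen.set(thread.getName() + ": " + error.getMessage()));
        try {
            Thread worker = new Thread(() -> {
                throw new IllegalStateException(message); // nothing catches this
            }, "worker");
            worker.start();
            worker.join(); // the handler runs on the dying thread before it terminates
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(previous); // restore for other code
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(crashAndCapture("boom")); // prints "worker: boom"
    }
}
```

The worker thread never installs a handler itself, yet its exception still reaches the one set for the whole VM, which is the property commonInit relies on.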
II. The exception handling flow
1. Exception handling in UncaughtHandler
Next, let's see how UncaughtHandler handles an uncaught exception.
- private static class UncaughtHandler implements Thread.UncaughtExceptionHandler {
- public void uncaughtException(Thread t, Throwable e) {
- try {
- // Don't re-enter -- avoid infinite loops if crash-reporting crashes.
- if (mCrashing) return;
- mCrashing = true;
-
- if (mApplicationObject == null) {
- Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
- } else {
- // Log the crash information for the app process
- .............
- }
- .............
- // Bring up crash dialog, wait for it to be dismissed
- // Call into AMS to handle the crash
- ActivityManagerNative.getDefault().handleApplicationCrash(
- mApplicationObject, new ApplicationErrorReport.CrashInfo(e));
- } catch (Throwable t2) {
- if (t2 instanceof DeadObjectException) {
- // System process is dead; ignore
- } else {
- try {
- Clog_e(TAG, "Error reporting crash", t2);
- } catch (Throwable t3) {
- // Even Clog_e() fails! Oh well.
- }
- }
- } finally {
- // Try everything to make sure this process goes away.
- // At the end of crash handling, kill the process
- Process.killProcess(Process.myPid());
- // and exit
- System.exit(10);
- }
- }
- }
From the code, UncaughtHandler's handling of the exception is straightforward:
1. Log the crash;
2. Call into AMS for further handling;
3. Kill the crashed process.
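The three steps, together with the re-entry guard, can be sketched in plain Java. This is an illustration only: CrashReporter and the terminator Runnable are hypothetical stand-ins for the call into AMS and for Process.killProcess plus System.exit, so the sketch can run without ending the VM.

```java
import java.util.ArrayList;
import java.util.List;

final class GuardedCrashHandler implements Thread.UncaughtExceptionHandler {
    /** Hypothetical stand-in for the call into AMS. */
    interface CrashReporter {
        void report(Throwable error) throws Exception;
    }

    private final CrashReporter reporter;
    private final Runnable terminator; // stands in for killProcess + System.exit
    private volatile boolean crashing;
    final List<String> log = new ArrayList<>();

    GuardedCrashHandler(CrashReporter reporter, Runnable terminator) {
        this.reporter = reporter;
        this.terminator = terminator;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            if (crashing) return; // do not re-enter if crash reporting itself crashes
            crashing = true;
            log.add("FATAL EXCEPTION: " + t.getName()); // step 1: log
            reporter.report(e);                          // step 2: report
        } catch (Throwable t2) {
            log.add("Error reporting crash: " + t2.getMessage());
        } finally {
            terminator.run(); // step 3: runs on every path, including the early return
        }
    }

    public static void main(String[] args) {
        GuardedCrashHandler handler = new GuardedCrashHandler(
                error -> System.out.println("reported: " + error.getMessage()),
                () -> System.out.println("process would be killed here"));
        handler.uncaughtException(Thread.currentThread(), new RuntimeException("boom"));
    }
}
```

Because the termination step sits in the finally block, it still runs when the guard returns early or when reporting fails, which matches the structure of the original handler.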
The AMS side is the most important part, so let's follow that code next.
2. Exception handling in AMS
- public void handleApplicationCrash(IBinder app, ApplicationErrorReport.CrashInfo crashInfo) {
- // Look up the ProcessRecord of the crashed app
- ProcessRecord r = findAppProcess(app, "Crash");
- final String processName = app == null ? "system_server"
- : (r == null ? "unknown" : r.processName);
-
- // Hand off to handleApplicationCrashInner for further handling
- handleApplicationCrashInner("crash", r, processName, crashInfo);
- }
-
- void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
- ApplicationErrorReport.CrashInfo crashInfo) {
- ...............
- //Write a description of an error (crash, WTF, ANR) to the drop box.
- // Record the crash information in the drop box
- addErrorToDropBox(eventType, r, processName, null, null, null, null, null, crashInfo);
-
- // Call crashApplication on mAppErrors, the AppErrors object held by AMS
- mAppErrors.crashApplication(r, crashInfo);
- }
Now let's follow crashApplication in the AppErrors class:
- /**
- * Bring up the "unexpected error" dialog box for a crashing app.
- * Deal with edge cases (intercepts from instrumented applications,
- * ActivityController, error intent receivers, that sort of thing).
- */
- void crashApplication(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
- final long origId = Binder.clearCallingIdentity();
- try {
- // The real work happens in crashApplicationInner
- crashApplicationInner(r, crashInfo);
- } finally {
- Binder.restoreCallingIdentity(origId);
- }
- }
The actual handling is done in crashApplicationInner.
- void crashApplicationInner(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
- long timeMillis = System.currentTimeMillis();
-
- // Extract the details from the crashInfo passed over from the app process
- String shortMsg = crashInfo.exceptionClassName;
- String longMsg = crashInfo.exceptionMessage;
- String stackTrace = crashInfo.stackTrace;
- ................
-
- AppErrorResult result = new AppErrorResult();
- TaskRecord task;
- synchronized (mService) {
- /**
- * If crash is handled by instance of {@link android.app.IActivityController},
- * finish now and don't show the app error dialog.
- */
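- // Notify the observer (the activity controller) of the crash.
- // If an observer exists and handles the crash, no error dialog is shown.
- // For example, a Monkey test can be configured to stop as soon as a crash is detected.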
- if (handleAppCrashInActivityController(r, crashInfo, shortMsg, longMsg, stackTrace,
- timeMillis)) {
- return;
- }
-
- /**
- * If this process was running instrumentation, finish now - it will be handled in
- * {@link ActivityManagerService#handleAppDiedLocked}.
- */
- if (r != null && r.instrumentationClass != null) {
- return;
- }
-
- // Log crash in battery stats.
- if (r != null) {
- mService.mBatteryStatsService.noteProcessCrash(r.processName, r.uid);
- }
-
- AppErrorDialog.Data data = new AppErrorDialog.Data();
- data.result = result;
- data.proc = r;
-
- // If we can't identify the process or it's already exceeded its crash quota,
- // quit right away without showing a crash dialog.
- // Call makeAppCrashingLocked; if it returns false, no further handling is needed
- if (r == null || !makeAppCrashingLocked(r, shortMsg, longMsg, stackTrace, data)) {
- return;
- }
-
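- // Send SHOW_ERROR_UI_MSG to mUiHandler, which pops up a dialog telling the user that a process crashed.
- // The user can pick options such as "close" or "close and report".
- // Most vendors have probably customized this UI.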
- Message msg = Message.obtain();
- msg.what = ActivityManagerService.SHOW_ERROR_UI_MSG;
-
- task = data.task;
- msg.obj = data;
- mService.mUiHandler.sendMessage(msg);
- }
-
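- // AppErrorResult.get() blocks until the user has dealt with the dialog.
- // Note that two threads are involved here:
- // crashApplicationInner runs on the thread servicing the Binder call,
- // while the dialog runs on AMS's UI thread.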
- int res = result.get();
-
- Intent appErrorIntent = null;
-
- // From here on, act on the user's choice in the dialog
- ...................
- // If the dialog times out or is cancelled, treat it as a force quit of the crashed process
- if (res == AppErrorDialog.TIMEOUT || res == AppErrorDialog.CANCEL) {
- res = AppErrorDialog.FORCE_QUIT;
- }
-
- // Act on the value of res
- synchronized (mService) {
- // The user chose not to be shown further errors
- if (res == AppErrorDialog.MUTE) {
- // Add the process name to AMS's mAppsNotReportingCrashes set
- stopReportingCrashesLocked(r);
- }
-
- // The user chose to restart
- if (res == AppErrorDialog.RESTART) {
- mService.removeProcessLocked(r, false, true, "crash");
- if (task != null) {
- try {
- // Try to restart the process
- mService.startActivityFromRecents(task.taskId,
- ActivityOptions.makeBasic().toBundle());
- } catch (IllegalArgumentException e) {
- // Hmm, that didn't work, app might have crashed before creating a
- // recents entry. Let's see if we have a safe-to-restart intent.
- if (task.intent.getCategories().contains(
- Intent.CATEGORY_LAUNCHER)) {
- // Restart it another way
- mService.startActivityInPackage(............);
- }
- }
- }
- }
-
- // The user chose force quit
- if (res == AppErrorDialog.FORCE_QUIT) {
- long orig = Binder.clearCallingIdentity();
- try {
- // Kill it with fire!
- // handleAppCrashLocked mainly finishes the activities and updates oom_adj
- mService.mStackSupervisor.handleAppCrashLocked(r);
-
- if (!r.persistent) {
- // A non-persistent app is killed here
- mService.removeProcessLocked(r, false, false, "crash");
- mService.mStackSupervisor.resumeFocusedStackTopActivityLocked();
- }
- } finally {
- Binder.restoreCallingIdentity(orig);
- }
- }
-
- // The user chose force quit and report
- if (res == AppErrorDialog.FORCE_QUIT_AND_REPORT) {
- // This function generates the error information and builds an Intent that launches the report UI
- appErrorIntent = createAppErrorIntentLocked(r, timeMillis, crashInfo);
- }
-
- if (r != null && !r.isolated && res != AppErrorDialog.RESTART) {
- // XXX Can't keep track of crash time for isolated processes,
- // since they don't have a persistent identity.
- // Record the crash time
- mProcessCrashTimes.put(r.info.processName, r.uid,
- SystemClock.uptimeMillis());
- }
- }
-
- if (appErrorIntent != null) {
- try {
- // If the user chose force quit and report, the report UI is launched here
- mContext.startActivityAsUser(appErrorIntent, new UserHandle(r.userId));
- } catch (ActivityNotFoundException e) {
- ..............
- }
- }
- }
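Overall, the AMS crash-handling flow is quite clear:
1. Record the crash information in the drop box;
2. If an ActivityController that can handle app crashes exists, hand the crash to it;
otherwise, show the crash dialog and let the user choose what to do next.
3. Depending on the user's choice, AMS restarts the app, force-stops it, or launches the report UI.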
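The hand-off between the Binder thread (which blocks in result.get()) and the UI thread (which supplies the user's choice) can be sketched with a small wait/notify holder. This is a sketch under assumptions, not the actual AppErrorResult source: BlockingResult, askOnUiThread and the constant value are hypothetical names chosen for illustration.

```java
final class BlockingResultDemo {
    /** Hypothetical analogue of AppErrorResult: one thread waits, another supplies the value. */
    static final class BlockingResult {
        private boolean hasResult;
        private int result;

        synchronized void set(int value) {
            hasResult = true;
            result = value;
            notifyAll(); // wake the thread blocked in get()
        }

        synchronized int get() {
            boolean interrupted = false;
            while (!hasResult) { // loop guards against spurious wakeups
                try {
                    wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, restore the flag afterwards
                }
            }
            if (interrupted) Thread.currentThread().interrupt();
            return result;
        }
    }

    static final int FORCE_QUIT = 1; // illustrative value only

    /** The caller plays the Binder thread; the spawned thread plays the UI thread. */
    static int askOnUiThread() {
        BlockingResult result = new BlockingResult();
        Thread ui = new Thread(() -> result.set(FORCE_QUIT), "ui");
        ui.start();
        return result.get(); // blocks until the "dialog" answers
    }

    public static void main(String[] args) {
        System.out.println("user chose: " + askOnUiThread());
    }
}
```

Checking the flag inside the synchronized method also covers the case where the UI thread answers before the caller starts waiting, so the answer cannot be missed.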
Note, however, that makeAppCrashingLocked is called before the dialog is shown.
If it returns false, the rest of the flow is skipped.
Let's look at what this function actually does.
- private boolean makeAppCrashingLocked(ProcessRecord app,
- String shortMsg, String longMsg, String stackTrace, AppErrorDialog.Data data) {
- app.crashing = true;
-
- // Build an object that holds all the error information
- app.crashingReport = generateProcessError(app,
- ActivityManager.ProcessErrorStateInfo.CRASHED, null, shortMsg, longMsg, stackTrace);
-
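- // As mentioned earlier, the system can launch a crash report UI through an Intent.
- // startAppProblemLocked looks up the ComponentName of that report UI in the system.
- // In addition, if the crashed app happens to be handling an ordered broadcast,
- // startAppProblemLocked stops the app's handling of it so that later receivers are not held up,
- // i.e. the remaining receivers skip the crashed app and carry on with the ordered broadcast.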
- startAppProblemLocked(app);
-
- // Stop "freezing" the screen
- app.stopFreezingAllLocked();
-
- // Do some follow-up handling.
- // From the code, this returns true unless the app has crashed repeatedly within 1 minute.
- return handleAppCrashLocked(app, "force-crash" /*reason*/, shortMsg, longMsg, stackTrace,
- data);
- }
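From the code above, makeAppCrashingLocked has two main jobs:
1. Find the ComponentName of the crash report UI;
2. Keep a process that crashes repeatedly within a short time from popping up the dialog over and over.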
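The second job amounts to a per-process rate limit. Below is a simplified sketch of that idea, not the AMS implementation: CrashThrottle and shouldShowDialog are hypothetical names, the 60-second window is taken from the "1 minute" mentioned in the code comment above, and the crash time is recorded on every call, whereas AMS records it later in crashApplicationInner.

```java
import java.util.HashMap;
import java.util.Map;

final class CrashThrottle {
    // Assumption: the "1 minute" window described in the text.
    static final long MIN_CRASH_INTERVAL_MS = 60_000;

    // Last crash time per (process name, uid), in milliseconds.
    private final Map<String, Long> lastCrashTimes = new HashMap<>();

    /** Returns true if a dialog may be shown, false if the process crashed again too soon. */
    boolean shouldShowDialog(String processName, int uid, long nowMs) {
        String key = processName + "/" + uid;
        Long last = lastCrashTimes.get(key);
        lastCrashTimes.put(key, nowMs); // simplification: always record this crash
        return last == null || nowMs >= last + MIN_CRASH_INTERVAL_MS;
    }

    public static void main(String[] args) {
        CrashThrottle throttle = new CrashThrottle();
        System.out.println(throttle.shouldShowDialog("com.example.app", 10001, 0));       // first crash
        System.out.println(throttle.shouldShowDialog("com.example.app", 10001, 30_000));  // 30 s later
        System.out.println(throttle.shouldShowDialog("com.example.app", 10001, 100_000)); // 70 s after that
    }
}
```

Keying on both process name and uid mirrors the way mProcessCrashTimes is indexed in the code shown earlier.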
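III. Follow-up cleanup
From the flow above, we know that a crashed process is eventually killed,
and AMS then has to clean up after it.
First, recall part of how a process registers with AMS after it starts:
- // After the process starts, its ActivityThread attaches to AMS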
- private final boolean attachApplicationLocked(IApplicationThread thread,
- int pid) {
- ............
- ProcessRecord app;
- if (pid != MY_PID && pid >= 0) {
- synchronized (mPidsSelfLocked) {
- app = mPidsSelfLocked.get(pid);
- }
- } else {
- app = null;
- }
- ............
- final String processName = app.processName;
- try {
- // Create a death recipient (an "obituary" receiver)
- AppDeathRecipient adr = new AppDeathRecipient(
- app, pid, thread);
- thread.asBinder().linkToDeath(adr, 0);
- app.deathRecipient = adr;
- } catch (RemoteException e) {
- app.resetPackageList(mProcessStats);
- startProcessLocked(app, "link fail", processName);
- return false;
- }
- ................
- }
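As the code shows, when a process attaches to AMS, AMS links a death recipient (an "obituary" receiver) to the binder of the process's IApplicationThread.
So when the crashed process is killed, binderDied in AppDeathRecipient is called back: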
- @Override
- public void binderDied() {
- ..........
- synchronized(ActivityManagerService.this) {
- appDiedLocked(mApp, mPid, mAppThread, true);
- }
- }
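The link-then-notify pattern can be illustrated without Binder. The sketch below is a plain-Java simulation under assumptions: FakeRemote is a made-up stand-in for the remote binder, and it throws IllegalStateException where a real binder would report a dead remote differently.

```java
import java.util.ArrayList;
import java.util.List;

final class DeathLinkDemo {
    /** Same shape as the callback used above: invoked once the remote side is gone. */
    interface DeathRecipient {
        void binderDied();
    }

    /** Hypothetical stand-in for a handle to a remote process. */
    static final class FakeRemote {
        private final List<DeathRecipient> recipients = new ArrayList<>();
        private boolean alive = true;

        void linkToDeath(DeathRecipient recipient) {
            if (!alive) throw new IllegalStateException("remote already dead");
            recipients.add(recipient);
        }

        void kill() {
            if (!alive) return; // already dead: recipients are not notified twice
            alive = false;
            for (DeathRecipient recipient : recipients) {
                recipient.binderDied();
            }
        }
    }

    /** Links a recipient, kills the remote twice, and returns the recorded events. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        FakeRemote app = new FakeRemote();
        app.linkToDeath(() -> events.add("appDied pid=1234")); // plays the role of appDiedLocked
        app.kill();
        app.kill();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The recipient is registered while the process is alive and does nothing until the death notification arrives, which is why the cleanup does not depend on how the process was killed.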
As the code shows, once the death notification arrives, AMS ends up handling it in appDiedLocked:
- final void appDiedLocked(ProcessRecord app, int pid, IApplicationThread thread,
- boolean fromBinderDied) {
- // First check if this ProcessRecord is actually active for the pid.
- synchronized (mPidsSelfLocked) {
- ProcessRecord curProc = mPidsSelfLocked.get(pid);
- if (curProc != app) {
- ...........
- return;
- }
- }
- .............
- if (!app.killed) {
- if (!fromBinderDied) {
- Process.killProcessQuiet(pid);
- }
- killProcessGroup(app.uid, pid);
- app.killed = true;
- }
- // Everything above is defensive code for robustness
-
- if (app.pid == pid && app.thread != null &&
- app.thread.asBinder() == thread.asBinder()) {
- // The process was started normally (not for instrumentation), so memory adjustment is needed
- boolean doLowMem = app.instrumentationClass == null;
- boolean doOomAdj = doLowMem;
-
- if (!app.killedByAm) {
- ............
- mAllowLowerMemLevel = true;
- } else {
- mAllowLowerMemLevel = false;
- doLowMem = false;
- }
- ..............
- // handleAppDiedLocked does the actual work
- handleAppDiedLocked(app, false, true);
-
- if (doOomAdj) {
- // Recompute the processes' oom_adj
- updateOomAdjLocked();
- }
- if (doLowMem) {
- // If necessary, trigger memory reclamation in the system's processes
- doLowMemReportIfNeededLocked(app);
- }
- }.........
- ..........
- }
The important part of appDiedLocked is handleAppDiedLocked:
- private final void handleAppDiedLocked(ProcessRecord app,
- boolean restarting, boolean allowRestart) {
- int pid = app.pid;
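- // Wind down the process's services, ContentProviders, BroadcastReceivers and so on.
- // The function is long but its purpose is clear, so it is not expanded here.
- // The key points: 1. For bound services in the crashed process, the links between the service and its clients are cleaned up;
- // in addition, a client whose importance is too low is killed outright.
- // 2. When cleaning up ContentProviders, removeDyingProviderLocked may kill client processes (for stable ContentProviders).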
- boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1);
- if (!kept && !restarting) {
- // Neither kept nor restarting: remove the process from the LRU list
- removeLruProcessLocked(app);
- if (pid > 0) {
- ProcessList.remove(pid);
- }
- }
-
- ..................
-
- // Remove this application's activities from active lists.
- // Clean up the Activity-related state
- boolean hasVisibleActivities = mStackSupervisor.handleAppDiedLocked(app);
-
- app.activities.clear();
-
- if (app.instrumentationClass != null) {
- ..............
- }
-
- if (!restarting && hasVisibleActivities
- && !mStackSupervisor.resumeFocusedStackTopActivityLocked()) {
- // If there was nothing to resume, and we are not already restarting this process, but
- // there is a visible activity that is hosted by the process... then make sure all
- // visible activities are running, taking care of restarting this process.
- // Per the comment above, if the only visible activities belong to the crashed process, AMS still tries to restart it
- mStackSupervisor.ensureActivitiesVisibleLocked(null, 0, !PRESERVE_WINDOWS);
- }
- }
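cleanUpApplicationRecordLocked is not analyzed in depth here.
The only really troublesome parts are the cleanup of bound services and ContentProviders,
because both have to take their client processes into account.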
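IV. Summary
Overall, the handling of a process crash in Android follows the flow shown in the figure above.
The flow is relatively simple; the only tricky part is probably the cleanup
that AMS performs after the process ends.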