赞
踩
ANR系列一共打算分为7篇文章,主要分为三块来讲:
第一块,讲解ANR发生后,是如何显示的,以及对应的日志文件如何生成的:
ANR系列之一:ANR显示和日志生成原理讲解_失落夏天的博客-CSDN博客
第二块,讲四种类型的ANR是如何产生的:
ANR系列之四:ContentProvider类型ANR产生原理讲解
第三块,讲如何ANR问题如何进行分析,以及如何建立有效的机制来检测上报ANR类型问题。(未完成)
ANR系列之六:ANR文件和日志如何分析
ANR系列之七:如何建立有效的机制来监控分析ANR问题
本篇是该系列的第一篇,主要讲解以下几点内容:
1.ANR的一些基本概念;
2.ANR弹框是如何显示出来的;
3.ANR的logcat日志以及ANR文件是如何产生的;
4.发生ANR时,会有哪些日志,以及如何解读;
5.一些扩展性的知识点。
ANR,指的是安卓中一种长时间应用无响应的一种场景。发生ANR时,会弹出下面这样的提示框,用户点击按钮可以强行杀死应用。
简单来说,ANR就是应用的主线程长时间处于阻塞状态,导致其任务无法正常执行完成的一种场景。所以,主线程阻塞就一定会造成ANR吗?当然不会,因为还有另外一个必要条件:待执行的任务。
本章主要讲解ANR发生后的完整流程,当然更具体的内容,比如ANR弹框是怎么显示的,以及日志是如何记录的,我们放到了后面章节来讲。
首先,我们通过一张流程图来了解整个ANR的流程:
无论是哪种类型的ANR,哪怕是native层的ANR,最终也会通知到AnrHelper类的appNotResponding方法。所以,我们从这个方法开始了解整个ANR的流程。我们用来区分ANR的四种不同类型,其实也就是appNotResponding这个方法中的annotation不同而已,而ANR本身是不去分类型的。
1.appNotResponding方法中,主要是生成AnrRecord对象,加入到mAnrRecords集合中。然后调用startAnrConsumerIfNeeded方法。
2.startAnrConsumerIfNeeded方法主要是通过cas进行判断,避免两个ANR线程同时执行。如果没有冲突的话,则开启AnrConsumerThread线程,对mAnrRecords中的对象进行消费。
3.AnrConsumerThread的run方法中,就是从mAnrRecords中取出对象,这些对象第一条中添加的。通过AnrRecord自身的appNotResponding方法进行消费。
4.appNotResponding方法就是整个ANR流程的核心执行逻辑了。总结一下,其实主要分为两大块:
第一,生成ANR的相关log以及日志并打印或保存。这个我们第四章来讲
第二,弹出应用无响应的框。这个我们第三章来讲。
logcat的日志和ANR文件的生成的核心逻辑仍然在appNotResponding方法中。
发生ANR的时候,会有3种形式的记录,分别记录不同的信息。
1.logcat中记录的日志。这里主要起到的是打印的作用,打印一些ANR的信息,以及会把一些CPU状态打印出来
2.data/anr/目录下生成对应的ANR日志文件。这里面主要记录的是dump进程的信息。
3.dropbox中记录的信息。
ANR发生的原因,有时候并不是APP自身导致的,和其他进程也有关系。比如其他进程耗尽CPU资源,导致发生ANR的APP的任务无法正常执行。所以,发生ANR后,收集原因时,并不能单纯的只收集发生ANR的那个APP进程。
appNotResponding方法中,有两个集合,分别是firstPids和lastPids,分别收集
- ArrayList<Integer> firstPids = new ArrayList<>(5);
- SparseArray<Boolean> lastPids = new SparseArray<>(20);
firstPids中会加入以下类型的进程ID:
1.发生ANR的进程PID;
2.发生ANR的进程的父进程PID;
3.系统进程的PID;
4.最近使用到并且还存活的进程PID;
5.存在被处理过的Activity的进程PID,这个说实在的,我没太理解,不过不影响主流程,就先忽略。
发生ANR的时候,会有两种形式的记录:
接下来,appNotResponding方法中会收集一些必要的ANR信息,打印显示在logcat中,主要代码如下:
- // Log the ANR to the main log.
- StringBuilder info = new StringBuilder();
- info.setLength(0);
- info.append("ANR in ").append(mApp.processName);//打印发生ANR的进程名
- if (activityShortComponentName != null) {
- info.append(" (").append(activityShortComponentName).append(")");
- }
- info.append("\n");
- info.append("PID: ").append(pid).append("\n");//打印发生ANR的进程PID
- if (annotation != null) {
- info.append("Reason: ").append(annotation).append("\n");//打印ANR类型和原因
- }
- if (parentShortComponentName != null
- && parentShortComponentName.equals(activityShortComponentName)) {
- info.append("Parent: ").append(parentShortComponentName).append("\n");
- }
- if (errorId != null) {
- info.append("ErrorId: ").append(errorId.toString()).append("\n");
- }
- info.append("Frozen: ").append(mApp.mOptRecord.isFrozen()).append("\n");
-
- ...
-
- Slog.e(TAG, info.toString());
具体记录的内容和如何分析,我们下面再讲,这里只要知道,logcat中的信息是这里记录的就好。这些必要的信息都存储在info中,最终会通过logcat打印出来。
这个主要是通过appNotResponding中的ActivityManagerService.dumpStackTraces方法来实现的,我们先看下这个方法的参数:
- static File dumpStackTraces(ArrayList<Integer> firstPids,
- ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids,
- ArrayList<Integer> nativePids, StringWriter logExceptionCreatingFile,
- long[] firstPidOffsets, String subject) {
tracesFile = createAnrDumpFile(tracesDir);
1.firstPids:第一等级要收集的进程PID,firstPids集合3.1有介绍。
2.processCpuTracker:CPU分析记录对象,这个方法中会用到
3.lastPids:第二等级要收集的进程PID集合。
4.nativePids:要收集的native进程的PID集合。
5.tracesFileException:传递进去一个输出流对象,供方法内写入,最终这些信息会写入到dropbox中
6.offsets:用来记录采集的第一个非系统进程的日志文件的偏移量。两个参数,分别记录写入之前和写入之后的文件大小。
7.annotation:ANR类型和原因
dumpStackTraces方法中,首先初始化processCpuTracker,然后休眠200毫秒,这200毫秒主要是让CPU去采集和分析CPU状态。然后遍历lastPids,获取其中CPU占用率最高的5个进程加入到extraPids集合中,等于说对lastPids集合进行了一遍处理。所以,我们现在有三个PID集合了:firstPids,nativePids和extraPids,这三个集合就是我们要dump进程的。
生成trace文件,并且把ARN原因写入
- File tracesFile;
- try {
- tracesFile = createAnrDumpFile(tracesDir);
- } catch (IOException e) {
- Slog.w(TAG, "Exception creating ANR dump file:", e);
- if (logExceptionCreatingFile != null) {
- logExceptionCreatingFile.append("----- Exception creating ANR dump file -----\n");
- e.printStackTrace(new PrintWriter(logExceptionCreatingFile));
- }
- return null;
- }
-
- if (subject != null) {
- try (FileOutputStream fos = new FileOutputStream(tracesFile, true)) {
- String header = "Subject: " + subject + "\n";
- fos.write(header.getBytes(StandardCharsets.UTF_8));
- } catch (IOException e) {
- Slog.w(TAG, "Exception writing subject to ANR dump file:", e);
- }
- }
最后通过重写的dumpStackTraces方法,分别回去采集firstPids,nativePids,extraPids集合中对应进程的堆栈状态。
这里系统会设置上限20秒的时间来允许对集合中的进程PID进程捕获,如果超过20秒,则认为捕获失败。
我们先看遍历firstPids集合的操作,简化的代码如下:
- for (int i = 0; i < num; i++) {
- final int pid = firstPids.get(i);
- ...
-
- Slog.i(TAG, "Collecting stacks for pid " + pid);
- final long timeTaken = dumpJavaTracesTombstoned(pid, tracesFile,
- remainingTime);
-
- remainingTime -= timeTaken;
- if (remainingTime <= 0) {
- Slog.e(TAG, "Aborting stack trace dump (current firstPid=" + pid
- + "); deadline exceeded.");
- return firstPidStart >= 0 ? new Pair<>(firstPidStart, firstPidEnd) : null;
- }
-
- if (firstPid) {
- firstPidEnd = tf.length();
- }
- if (DEBUG_ANR) {
- Slog.d(TAG, "Done with pid " + firstPids.get(i) + " in " + timeTaken + "ms");
- }
- }
dumpJavaTracesTombstoned中,会调用Debug类,最终会通知native层去获取对应PID的进程堆栈。现在常用的使用信号量监听ANR也是在此时发送的。
遍历nativePids集合的操作其实和第一步操作类似,只不过最终调用的就是dumpNativeBacktraceToFileTimeout方法了。
而最后遍历extraPids集合,操作和第一步是一样的了。
具体日志和代码的对应,我们第五章再讲。
这一章其实是很早之前写的,比较简单,就直接复用之前的内容,不再改了。对于新手排查系统问题,是一个很好的参考。
想知道弹框是如何显示的,最简单的方式就是先造一个程序无响应的场景,然后打开uiautomatorviewer工具查看布局,这样我们就可以找到对应的XMl然后在源码中搜索,就能找到我们的目标类了。
首先我们看到的是下面这样一个场景,先基于android-29的源码全局文字:没有响应,关闭应用,等待,没有搜到,正常,这些文字应该是配置在xml中的。换个思路搜id,关闭应用对应的是aerr_close,全局搜果然搜到了,在AppNotRespondingDialog类中。对应的类名正好也符合:APP无响应。所以说我们的入口找到了。
另外我们看一下AppNotRespondingDialog的包名:com.android.server.am,说明是跑在Serve端的。
发现有是AppErrors的handleShowAnrUi方法调用的。我们先看整个调用流程:
1.InputManagerService的notifyANR方法
- // Native callback.
- private long notifyANR(IBinder token, String reason) {
- return mWindowManagerCallbacks.notifyANR(
- token, reason);
- }
好吧,调用触发点是naive。
2.InputManagerCallback的notifyANR方法
- @Override
- public long notifyANR(IBinder token, String reason) {
- AppWindowToken appWindowToken = null;
- WindowState windowState = null;
- boolean aboveSystem = false;
- synchronized (mService.mGlobalLock) {
- ...
- //存储日志的
- mService.saveANRStateLocked(appWindowToken, windowState, reason);
- }
-
- // All the calls below need to happen without the WM lock held since they call into AM.
- mService.mAtmInternal.saveANRState(reason);
-
- if (appWindowToken != null && appWindowToken.appToken != null) {
- ...
- } else if (windowState != null) {
- //通知弹出ANR的框
- long timeout = mService.mAmInternal.inputDispatchingTimedOut(
- windowState.mSession.mPid, aboveSystem, reason);
- if (timeout >= 0) {
- // The activity manager declined to abort dispatching.
- // Wait a bit longer and timeout again later.
- return timeout * 1000000L; // nanoseconds
- }
- }
- return 0; // abort dispatching
- }
3.ActivityManagerService.LocalService的inputDispatchingTimedOut方法
4.ActivityManagerService的inputDispatchingTimedOut方法
5.ProcessRecord的appNotResponding方法
- if (mService.mUiHandler != null) {
- // Bring up the infamous App Not Responding dialog
- Message msg = Message.obtain();
- msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
- msg.obj = new AppNotRespondingDialog.Data(this, aInfo, aboveSystem);
-
- mService.mUiHandler.sendMessage(msg);
- }
6.ProcessRecord的appNotResponding方法
- case SHOW_NOT_RESPONDING_UI_MSG: {
- mAppErrors.handleShowAnrUi(msg);
- ensureBootCompleted();
- } break;
7.ProcessRecord的appNotResponding方法
- void handleShowAnrUi(Message msg) {
- Dialog dialogToShow = null;
- List<VersionedPackage> packageList = null;
- synchronized (mService) {
- AppNotRespondingDialog.Data data = (AppNotRespondingDialog.Data) msg.obj;
- final ProcessRecord proc = data.proc;
- if (proc == null) {
- Slog.e(TAG, "handleShowAnrUi: proc is null");
- return;
- }
- ...
- if (mService.mAtmInternal.canShowErrorDialogs() || showBackground) {
- dialogToShow = new AppNotRespondingDialog(mService, mContext, data);
- proc.anrDialog = dialogToShow;
- } else {
- MetricsLogger.action(mContext, MetricsProto.MetricsEvent.ACTION_APP_ANR,
- AppNotRespondingDialog.CANT_SHOW);
- // Just kill the app if there is no dialog to be shown.
- mService.killAppAtUsersRequest(proc, null);
- }
- }
- // If we've created a crash dialog, show it without the lock held
- if (dialogToShow != null) {
- dialogToShow.show();//显示程序无响应的框
- }
- ...
- }
总结一下,流程图如下:
上面有讲到,ANR发生时,会有三种信息来记录, 分别是logcat以及ANR的日志。其实还有一些logcat中的信息,可以辅助我们判断ANR流程何时开始,合适采集,合适结束等等,所以,我们一一来讲解。
首先,我们先梳理一下整个logcat中ANR日志的完整流程:
- 2022-10-08 15:03:32.023 24194-24330/? W/InputDispatcher: Window ... is not responding. Waited 5007ms for MotionEvent //InputDispatcher::onAnrLocked中,识别到输入事件响应超时
- 2022-10-08 15:03:32.023 24194-24330/? W/InputDispatcher: Canceling events ... it is unresponsive //InputDispatcher::cancelEventsForAnrLocked中,InputDispatcher进行ANR流程分法
- 2022-10-08 15:03:32.023 24194-24330/? I/WindowManager: ANR in Window{} //AnrController.java的notifyWindowUnresponsive中,window中超时,准备通知AMS
- 2022-10-08 15:03:32.146 24194-31707/? I/ActivityManager: dumpStackTraces pids={7705=true, 7740=true, ...} //ActivityManagerService.java的dumpStackTraces中,准备收集的进程集合
- 2022-10-08 15:03:32.471 24194-31707/? I/ActivityManager: Dumping to /data/anr/anr_2022-10-08-15-03-32-471 //ActivityManagerService.java的dumpStackTraces中,准备通知APP收集ANR日志
- 2022-10-08 15:03:32.471 24194-31707/? I/ActivityManager: Collecting stacks for pid 31655 //ActivityManagerService.java的dumpStackTraces中,待收集的进程PID
- 2022-10-08 15:03:32.472 24194-31707/? I/system_server: libdebuggerd_client: started dumping process 31655 //debuggerd_client.cpp的debuggerd_trigger_dump中,native中准备通知待收集的进程
- 2022-10-08 15:03:32.472 31655-31660/com.xt.client I/com.xt.client: Thread[6,tid=31660,...]: reacting to signal 3 //APP进程收到信号,准备开始写入
- 2022-10-08 15:03:32.568 31655-31660/com.xt.client I/com.xt.client: Wrote stack traces to tombstoned //APP进程写入完成,通知系统侧
- 2022-10-08 15:03:32.568 24194-31707/? I/system_server: libdebuggerd_client: done dumping process 31655 //系统进程收到通知,进行保存
- 2022-10-08 15:03:32.589 24194-31707/? I/ActivityManager: Collecting stacks for pid 24194 //ActivityManagerService.java dumpStackTraces中,准备开始收集下一个进程的PID,这个进程为系统进程system_server
- 2022-10-08 15:03:32.601 24194-31707/? I/system_server: libdebuggerd_client: started dumping process 24194 //debuggerd_client.cpp的debuggerd_trigger_dump中,native中准备通知待收集的进程
- 2022-10-08 15:03:32.603 24194-24202/? I/system_server: Thread[2,...]: reacting to signal 3 //system_server进程收到信号,准备开始写入
- 2022-10-08 15:03:32.858 24194-24202/? I/system_server: Wrote stack traces to tombstoned //system_server进程写入完成,通知系统侧
- ...继续收集其它java进程的信息
- 2022-10-08 15:03:34.972 24194-31707/? I/ActivityManager: Collecting stacks for native pid 660 //准备收集native进程660的信息
- 2022-10-08 15:03:34.973 24194-31707/? I/system_server: libdebuggerd_client: started dumping process 660 //debuggerd_client.cpp的debuggerd_trigger_dump中,native中准备通知待收集的进程
- 2022-10-08 15:03:34.974 660-660/? I/libc: Requested dump for pid 660 (binder:660_2) //debuggerd_handler.cpp的log_signal_summary中,native中准备通知待收集的进程
- ...继续收集其它native进程的信息
- 2022-10-08 15:03:36.273 24194-31707/? I/ActivityManager: Done dumping //全部收集完成
- 2022-10-08 15:03:36.274 24194-31707/? E/ActivityManager: ANR in com.xt.client (com.xt.client/.MainActivity) //打印发生ANR时收集到的日志
- ...
这里我们以输入类型为例,
1.InputDispatcher中识别到发生ANR了,通过回调java方法,通知到InputManagerService。如InputDispatcher开头的两行日志所记录。
2.InputManagerService中,根据WindowState找到对应的归属,先交给ActivityRecord处理,最终转交到AMS中。如WindowManager开头的一行日志所记录。
3.AMS中委托给AnrHelper.java进行处理,最终会通知到ProcessErrorStateRecord.java的appNotResponding方法,该方法中,会进行三个操作:
首先,收集需要dump的进程列表,包含java进程,native进程,其它进程,采集规则是发生ANR的进程+最近活动的进程。
其次,进行采集操作。这个会交给ActivityManagerService.java的dumpStackTraces方法执行。
最后,显示无响应的弹框。
4.采集日志会进入到ActivityManagerService.java的dumpStackTraces方法中,此时先打印ActivityManager开头的两行日志,分别代表着开始采集和生成日志文件。下面这两个日志分别记录将要采集的进程ID以及日志文件路径:
- 2022-10-08 15:03:32.146 24194-31707/? I/ActivityManager: dumpStackTraces pids={7705=true, 7740=true, ...} //ActivityManagerService.java的dumpStackTraces中,准备收集的进程集合
- 2022-10-08 15:03:32.471 24194-31707/? I/ActivityManager: Dumping to /data/anr/anr_2022-10-08-15-03-32-471 //ActivityManagerService.java的dumpStackTraces中,准备通知APP收集ANR日志
5.因为需要dump多个进程的,所以开始采集以及采集完成的日志会有很多,如下面所示,分别代表开始采集,发送信号通知,收到信号通知,采集完成以及系统收到采集完成的通知。这种日志会经历很多轮。
- 2022-10-08 15:03:32.471 24194-31707/? I/ActivityManager: Collecting stacks for pid 31655 //ActivityManagerService.java的dumpStackTraces中,待收集的进程PID
- 2022-10-08 15:03:32.472 24194-31707/? I/system_server: libdebuggerd_client: started dumping process 31655 //debuggerd_client.cpp的debuggerd_trigger_dump中,native中准备通知待收集的进程
- 2022-10-08 15:03:32.472 31655-31660/com.xt.client I/com.xt.client: Thread[6,tid=31660,...]: reacting to signal 3 //APP进程收到信号,准备开始写入
- 2022-10-08 15:03:32.568 31655-31660/com.xt.client I/com.xt.client: Wrote stack traces to tombstoned //APP进程写入完成,通知系统侧
- 2022-10-08 15:03:32.568 24194-31707/? I/system_server: libdebuggerd_client: done dumping process 31655 //系统进程收到通知,进行保存
6.采集native的信息
- 2022-10-08 15:03:34.972 24194-31707/? I/ActivityManager: Collecting stacks for native pid 660 //准备收集native进程660的信息
- 2022-10-08 15:03:34.973 24194-31707/? I/system_server: libdebuggerd_client: started dumping process 660 //debuggerd_client.cpp的debuggerd_trigger_dump中,native中准备通知待收集的进程
- 2022-10-08 15:03:34.974 660-660/? I/libc: Requested dump for pid 660 (binder:660_2) //debuggerd_handler.cpp的log_signal_summary中,native中准备通知待收集的进程
7.所有进程dump完成。
2022-10-08 15:03:36.273 24194-31707/? I/ActivityManager: Done dumping //全部收集完成
上面的流程采集完成后,会一次性打印采集到的ANR的信息,如下面所示:
我们这一小节就重点介绍下,这份logcat中的ANR信息,如何解读。
- 2022-10-08 15:03:36.274 24194-31707/? E/ActivityManager: ANR in com.xt.client (com.xt.client/.MainActivity)
- PID: 31655 //进程ID
- Reason: Input dispatching timed out (dff7ad com.xt.client/com.xt.client.MainActivity (server) is not responding. Waited 5007ms for MotionEvent) //原因
- Parent: com.xt.client/.MainActivity //触发点
- ErrorId: bcef99f2-c232-4bee-bdd8-233b4ff598f0
- Frozen: false
- Load: 0.54 / 0.15 / 0.12 //ANR发生之前的1分钟,5分钟,15分钟CPU占用率。0.54代表0.54%
- ----- Output from /proc/pressure/memory -----
- some avg10=0.00 avg60=0.00 avg300=0.00 total=18377466
- full avg10=0.00 avg60=0.00 avg300=0.00 total=10458714
- ----- End output from /proc/pressure/memory -----
- //ANR发生之前的129秒内CPU占用率
- CPU usage from 129271ms to 0ms ago (2022-10-08 15:01:22.846 to 2022-10-08 15:03:32.116):
- //user代表用户态占用,kernel代表内核态占用
- 1.7% 24194/system_server: 1.2% user + 0.5% kernel / faults: 3611 minor
- 1.5% 689/surfaceflinger: 1.1% user + 0.4% kernel / faults: 441 minor
- 1.4% 24426/com.android.systemui: 1% user + 0.3% kernel / faults: 2682 minor
- 1% 28207/kworker/u16:6: 0% user + 1% kernel / faults: 18 minor
- 0.8% 31505/kworker/u16:3: 0% user + 0.8% kernel
- 0.7% 27789/kworker/u16:1: 0% user + 0.7% kernel / faults: 2 minor
- 0.6% 29978/com.tencent.mm: 0.1% user + 0.4% kernel / faults: 66 minor
- 0.6% 1122/vendor.google.wifi_ext@1.0-service-vendor: 0.3% user + 0.2% kernel / faults: 1190 minor
- 0.3% 24523/com.breel.wallpapers20a: 0.2% user + 0% kernel / faults: 1291 minor
- 0.4% 24828/com.google.android.apps.nexuslauncher: 0.3% user + 0% kernel / faults: 1984 minor
- 0.1% 691/android.hardware.graphics.composer@2.4-service-sm8150: 0% user + 0% kernel / faults: 133 minor
- 0.3% 18902/com.google.android.gms.persistent: 0.2% user + 0.1% kernel / faults: 2221 minor
- 0.3% 30388/com.tencent.mm:tools: 0.2% user + 0% kernel / faults: 93 minor
- ...
具体解释如下:
行数 | 内容 | 解释 |
1 | ANR in com.test.sceneengineservice | 发生ANR应用的包名 |
2 | PID: 2468 | 发生ANR的进程ID,请注意, 此日志是system_server进程打印的 |
3 | Reason: executing service com.test.sceneengineservice/.SceneEngineService | 发生ANR的的原因 |
4 | Load: 0.0 / 0.0 / 0.0 | 发生ANR之前CPU的负载,分别代表发生前的1分钟/5分钟/15分钟 |
5 | CPU usage from 14781ms to 0ms ago (2022-09-21 07:32:34.092 to 2022-09-21 07:32:48.873): | 这期间的CPU占用。从发生之前的14秒到发生时 |
6 | E ActivityManager: 5.6% 1000/system_server: 1.8% user + 3.7% kernel / faults: 3555 minor | 这段时间内具体的CPU占用率。user代表应用侧的占用,kernel代表内核侧的占用。 |
文件内容信息如下:
- Subject: Input dispatching timed out (4dff4f9 com.xt.client/com.xt.client.MainActivity (server) is not responding. Waited 5001ms for MotionEvent)
-
- --- CriticalEventLog ---
- capacity: 20
- events <
- timestamp_ms: 1665198570560
- anr <
- pid: 20001
- process: "com.xt.client"
- process_class: 1
- subject: "Input dispatching timed out (b9701e com.xt.client/com.xt.client.MainActivity (server) is not responding. Waited 5007ms for MotionEvent)"
- uid: 10244
- >
- >
- events <
- timestamp_ms: 1665198580245
- java_crash <
- exception_class: "java.lang.SecurityException"
- pid: 20001
- process: "com.xt.client"
- process_class: 1
- uid: 10244
- >
- >
- timestamp_ms: 1665198615793
- window_ms: 300000
-
-
- ----- pid 20223 at 2022-10-08 11:10:16.144897760+0800 -----
- Cmd line: com.xt.client
- Build fingerprint: 'google/sunfish/sunfish:13/TP1A.220905.004/8927612:user/release-keys'
- ABI: 'arm64'
- Build type: optimized
- /** 类加载信息-开始 class_linker.cc中写入,记录类加载信息 */
- Zygote loaded classes=21523 post zygote classes=573
- Dumping registered class loaders
- #0 dalvik.system.PathClassLoader: [], parent #1
- #1 java.lang.BootClassLoader: [], no parent
- #2 dalvik.system.PathClassLoader: [/data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/base.apk!classes12.dex:/data/data/com.xt.client/code_cache/.overlay/base.apk/classes14.dex:/data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/base.apk:/data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/base.apk!classes11.dex:/data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/base.apk!classes3.dex:/data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/base.apk!classes8.dex:/data/data/com.xt.client/code_cache/.overlay/base.apk/classes9.dex:/data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/base.apk!classes5.dex:/data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/base.apk!classes6.dex], parent #1
- Done dumping class loaders
- Classes initialized: 0 in 0
- /** 类加载信息-结束 */
-
- /** 引用表-开始 intern_table.cc中写入,记录强弱引用数量 */
- Intern table: 30098 strong; 1210 weak
- /** 引用表-结束 */
-
- /** JNI加载信息-开始 java_vm_ext.cc中写入,记录加载的so */
- JNI: CheckJNI is on; globals=400 (plus 68 weak)
- Libraries: libandroid.so libaudioeffect_jni.so libcompiler_rt.so libframework-connectivity-jni.so libframework-connectivity-tiramisu-jni.so libicu_jni.so libjavacore.so libjavacrypto.so libjnigraphics.so libmedia_jni.so libopenjdk.so librs_jni.so librtp_jni.so libsoundpool.so libstats_jni.so libwebviewchromium_loader.so (16)
- /** JNI加载信息-结束 */
-
- /** java虚拟机堆状态-开始 heap.cc中写入,记录当前的内存状态 */
- Heap: 74% free, 4180KB/16MB; 183984 objects
- Dumping cumulative Gc timings
- Average major GC reclaim bytes ratio inf over 0 GC cycles
- Average major GC copied live bytes ratio 0.732525 over 4 major GCs
- Cumulative bytes moved 29923016
- Cumulative objects moved 547043
- Peak regions allocated 60 (15MB) / 1024 (256MB)
- Total madvise time 7.031ms
- Average minor GC reclaim bytes ratio nan over 0 GC cycles
- Average minor GC copied live bytes ratio nan over 0 minor GCs
- Cumulative bytes moved 0
- Cumulative objects moved 0
- Peak regions allocated 60 (15MB) / 1024 (256MB)
- Total number of allocations 183984
- Total bytes allocated 4180KB
- Total bytes freed 0B
- Free memory 11MB
- Free memory until GC 11MB
- Free memory until OOME 251MB
- Total memory 16MB
- Max memory 256MB
- Zygote space size 7760KB
- Total mutator paused time: 0
- Total time waiting for GC to complete: 2.448us
- Total GC count: 0
- Total GC time: 0
- Total blocking GC count: 0
- Total blocking GC time: 0
- Total pre-OOME GC count: 0
- Native bytes total: 14386550 registered: 2142838
- Total native bytes at last GC: 4357592
- /** java虚拟机堆状态-结束 */
-
- /** OTA文件记录-开始 oat_file_manager.cc中写入,记录当前使用到的OTA文件 */
- /data/app/~~9kDnfQhhAsXX5wlTVjeRlA==/com.xt.client-DT6oy9kys_nok1ibzgGmFQ==/oat/arm64/base.vdex: verify
- /system/framework/oat/arm64/android.hidl.manager-V1.0-java.odex: verify
- /system/framework/oat/arm64/android.test.base.odex: verify
- /system/framework/oat/arm64/android.hidl.base-V1.0-java.odex: verify
- /** OTA文件记录-结束 */
-
- /** JIT信息记录-开始 jit.cc中写入,记录JIT的信息 */
- Current JIT code cache size (used / resident): 40KB / 60KB
- Current JIT data cache size (used / resident): 28KB / 60KB
- Zygote JIT code cache size (at point of fork): 22KB / 32KB
- Zygote JIT data cache size (at point of fork): 16KB / 32KB
- Current JIT mini-debug-info size: 19KB
- Current JIT capacity: 128KB
- Current number of JIT JNI stub entries: 0
- Current number of JIT code cache entries: 46
- Total number of JIT baseline compilations: 34
- Total number of JIT optimized compilations: 4
- Total number of JIT compilations for on stack replacement: 1
- Total number of JIT code cache collections: 1
- Memory used for stack maps: Avg: 377B Max: 11KB Min: 32B
- Memory used for compiled code: Avg: 1156B Max: 26KB Min: 164B
- Memory used for profiling info: Avg: 187B Max: 4416B Min: 24B
- Start Dumping Averages for 56 iterations for JIT timings
- Compiling optimized: Sum: 49.228ms Avg: 879.071us
- Compiling baseline: Sum: 27.108ms Avg: 484.071us
- TrimMaps: Sum: 2.699ms Avg: 48.196us
- Code cache collection: Sum: 380us Avg: 6.785us
- Compiling OSR: Sum: 240us Avg: 4.285us
- Done Dumping Averages
- Memory used for compilation: Avg: 154KB Max: 4050KB Min: 16KB
- ProfileSaver total_bytes_written=0
- ProfileSaver total_number_of_writes=0
- ProfileSaver total_number_of_code_cache_queries=0
- ProfileSaver total_number_of_skipped_writes=0
- ProfileSaver total_number_of_failed_writes=0
- ProfileSaver total_ms_of_sleep=5000
- ProfileSaver total_ms_of_work=0
- ProfileSaver total_number_of_hot_spikes=0
- ProfileSaver total_number_of_wake_ups=0
- /** JIT信息记录-结束 */
-
- /** ART信息指标-开始 metrics_common.cc中写入,记录ART虚拟机的一些关键指标 */
- *** ART internal metrics ***
- Metadata:
- timestamp_since_start_ms: 19952
- Metrics:
- ClassLoadingTotalTime: count = 17358
- ClassVerificationTotalTime: count = 15303
- ClassVerificationCount: count = 28
- WorldStopTimeDuringGCAvg: count = 0
- YoungGcCount: count = 0
- FullGcCount: count = 0
- TotalBytesAllocated: count = 2349168
- TotalGcCollectionTime: count = 0
- YoungGcThroughputAvg: count = 0
- FullGcThroughputAvg: count = 0
- YoungGcTracingThroughputAvg: count = 0
- FullGcTracingThroughputAvg: count = 0
- JitMethodCompileTotalTime: count = 28492
- JitMethodCompileCount: count = 39
- YoungGcCollectionTime: range = 0...60000, buckets: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
- FullGcCollectionTime: range = 0...60000, buckets: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
- YoungGcThroughput: range = 0...10000, buckets: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
- FullGcThroughput: range = 0...10000, buckets: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
- YoungGcTracingThroughput: range = 0...10000, buckets: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
- FullGcTracingThroughput: range = 0...10000, buckets: 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
- *** Done dumping ART internal metrics ***
- /** ART信息指标-结束 */
-
- /** 线程信息-开始 thread_list.cc中写入,记录当前进程的线程信息 */
- suspend all histogram: Sum: 112us 99% C.I. 0.083us-54us Avg: 4.666us Max: 54us
- DALVIK THREADS (19):
- "Signal Catcher" daemon prio=10 tid=6 Runnable
- | group="system" sCount=0 ucsCount=0 flags=0 obj=0x12c801a0 self=0xb400007193e16a60
- | sysTid=20229 nice=-20 cgrp=system sched=0/0 handle=0x7004dfbcb0
- | state=R schedstat=( 17930731 0 5 ) utm=0 stm=0 core=7 HZ=100
- | stack=0x7004d04000-0x7004d06000 stackSize=991KB
- | held mutexes= "mutator lock"(shared held)
- native: #00 pc 00000000005374cc /apex/com.android.art/lib64/libart.so (art::DumpNativeStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int, BacktraceMap*, char const*, art::ArtMethod*, void*, bool)+128) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #01 pc 00000000006f0d94 /apex/com.android.art/lib64/libart.so (art::Thread::DumpStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool, BacktraceMap*, bool) const+236) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #02 pc 00000000006fe620 /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+208) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #03 pc 00000000003641d4 /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*)+440) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #04 pc 00000000006fcdc0 /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+280) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #05 pc 00000000006fc7b4 /apex/com.android.art/lib64/libart.so (art::ThreadList::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+292) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #06 pc 00000000006d57e4 /apex/com.android.art/lib64/libart.so (art::Runtime::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+184) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #07 pc 00000000006e1928 /apex/com.android.art/lib64/libart.so (art::SignalCatcher::HandleSigQuit()+468) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #08 pc 0000000000573c40 /apex/com.android.art/lib64/libart.so (art::SignalCatcher::Run(void*)+264) (BuildId: 97fdb979efb7d2b596fa4fceabaad95b)
- native: #09 pc 00000000000b62b8 /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208) (BuildId: 058e3ec96fa600fb840a6a6956c6b64e)
- native: #10 pc 0000000000052fb8 /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 058e3ec96fa600fb840a6a6956c6b64e)
- (no managed stack frames)
-
- "main" prio=5 tid=1 Sleeping
- | group="main" sCount=1 ucsCount=0 flags=1 obj=0x72424e88 self=0xb400007193e0a7b0
- | sysTid=20223 nice=-10 cgrp=system sched=0/0 handle=0x72dd5244f8
- | state=S schedstat=( 331627619 74690560 345 ) utm=30 stm=2 core=7 HZ=100
- | stack=0x7fcfb32000-0x7fcfb34000 stackSize=8188KB
- | held mutexes=
- at java.lang.Thread.sleep(Native method)
- - sleeping on <0x0e92c80b> (a java.lang.Object)
- at java.lang.Thread.sleep(Thread.java:450)
- - locked <0x0e92c80b> (a java.lang.Object)
- at java.lang.Thread.sleep(Thread.java:355)
- at com.xt.client.MainActivity.doActionWithoutClass(MainActivity.java:201)
- at com.xt.client.MainActivity.access$000(MainActivity.java:85)
- at com.xt.client.MainActivity$2.onItemClick(MainActivity.java:177)
- at com.xt.client.inter.RecyclerItemClickListener.onInterceptTouchEvent(RecyclerItemClickListener.java:34)
- at androidx.recyclerview.widget.RecyclerView.findInterceptingOnItemTouchListener(RecyclerView.java:3286)
- at androidx.recyclerview.widget.RecyclerView.dispatchToOnItemTouchListeners(RecyclerView.java:3257)
- at androidx.recyclerview.widget.RecyclerView.onTouchEvent(RecyclerView.java:3421)
- at android.view.View.dispatchTouchEvent(View.java:15004)
- at android.view.ViewGroup.dispatchTransformedTouchEvent(ViewGroup.java:3115)
- at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:2788)
- at android.view.ViewGroup.dispatchTransformedTouchEvent(ViewGroup.java:3121)
- at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:2802)
- at android.view.ViewGroup.dispatchTransformedTouchEvent(ViewGroup.java:3121)
- at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:2802)
- at android.view.ViewGroup.dispatchTransformedTouchEvent(ViewGroup.java:3121)
- at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:2802)
- at android.view.ViewGroup.dispatchTransformedTouchEvent(ViewGroup.java:3121)
- at android.view.ViewGroup.dispatchTouchEvent(ViewGroup.java:2802)
- at com.android.internal.policy.DecorView.superDispatchTouchEvent(DecorView.java:498)
- at com.android.internal.policy.PhoneWindow.superDispatchTouchEvent(PhoneWindow.java:1899)
- at android.app.Activity.dispatchTouchEvent(Activity.java:4262)
- at com.android.internal.policy.DecorView.dispatchTouchEvent(DecorView.java:456)
- at android.view.View.dispatchPointerEvent(View.java:15263)
- at android.view.ViewRootImpl$ViewPostImeInputStage.processPointerEvent(ViewRootImpl.java:6548)
- at android.view.ViewRootImpl$ViewPostImeInputStage.onProcess(ViewRootImpl.java:6348)
- at android.view.ViewRootImpl$InputStage.deliver(ViewRootImpl.java:5804)
- at android.view.ViewRootImpl$InputStage.onDeliverToNext(ViewRootImpl.java:5861)
- at android.view.ViewRootImpl$InputStage.forward(ViewRootImpl.java:5827)
- at android.view.ViewRootImpl$AsyncInputStage.forward(ViewRootImpl.java:5992)
- at android.view.ViewRootImpl$InputStage.apply(ViewRootImpl.java:5835)
- at android.view.ViewRootImpl$AsyncInputStage.apply(ViewRootImpl.java:6049)
- at android.view.ViewRootImpl$InputStage.deliver(ViewRootImpl.java:5808)
- at android.view.ViewRootImpl$InputStage.onDeliverToNext(ViewRootImpl.java:5861)
- at android.view.ViewRootImpl$InputStage.forward(ViewRootImpl.java:5827)
- at android.view.ViewRootImpl$InputStage.apply(ViewRootImpl.java:5835)
- at android.view.ViewRootImpl$InputStage.deliver(ViewRootImpl.java:5808)
- at android.view.ViewRootImpl.deliverInputEvent(ViewRootImpl.java:8857)
- at android.view.ViewRootImpl.doProcessInputEvents(ViewRootImpl.java:8808)
- at android.view.ViewRootImpl.enqueueInputEvent(ViewRootImpl.java:8777)
- at android.view.ViewRootImpl$WindowInputEventReceiver.onInputEvent(ViewRootImpl.java:8980)
- at android.view.InputEventReceiver.dispatchInputEvent(InputEventReceiver.java:267)
- at android.os.MessageQueue.nativePollOnce(Native method)
- at android.os.MessageQueue.next(MessageQueue.java:335)
- at android.os.Looper.loopOnce(Looper.java:161)
- at android.os.Looper.loop(Looper.java:288)
- at android.app.ActivityThread.main(ActivityThread.java:7898)
- at java.lang.reflect.Method.invoke(Native method)
- at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
- at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:936)
-
- ...
- /** 线程信息-结束/
-
- ----- end 20223 -----
-
- ...
具体解释如下:
所在行 | 内容 | 解释 |
1 | ”main“ | 线程名 |
1 | prio=5 | 线程优先级 |
1 | tid=1 | 线程标识符 |
1 | Native | 线程状态,native是native中的线程状态,对应java中runnable |
2 | group="main" | 线程所属的线程组 |
2 | sCount=1 | 程被正常挂起的次数 |
2 | dsCount=0 | 线程因调试而挂起次数 |
2 | flags=1 | |
2 | obj=0x75c4db98 | 线程的Java对象的地址 |
2 | self=0x7b08e14c00 | 线程本身的地址 |
3 | sysTid=2468 | Linux下的内核线程id |
3 | nice=0 | 线程的调度优先级 |
3 | cgrp=default | 调度属组 |
3 | sched=0/0 | 线程的调度策略和优先级 |
3 | handle=0x7b8e933548 | 线程的处理函数地址 |
4 | state=S | 调度状态 |
4 | schedstat=( 562066147 3061697256 542 ) | 执行时间记录: 三个值分别表示线程在cpu上执行的时间、线程的等待时间和线程执行的时间片长度 |
4 | utm=31 | 用户态CPU占用次数,由于频率是100,则占用31次就等于31*10ms=310ms |
4 | stm=24 | 内核态CPU占用次数 |
4 | core=2 | 最后执行这个线程的CPU核序号 |
4 | HZ=100 | 运行频率,一秒100次 |
5 | stack=0x7fe841d000-0x7fe841f000 | 虚拟机栈地址 |
5 | stackSize=8MB | 虚拟机栈大小 |
6 | at android.os.BinderProxy.transactNative(Native method) at android.os.BinderProxy.transact(Binder.java:1127) | 此时采集到的APP对应线程的对应状态 |
dropBox中的日志文件更像是一个综合体,包括了logcat日志中的部分以及ANR文件中的部分。
ProcessErrorStateRecord的appNotResponding方法中,会通过以下的方法,把report信息和traceFile中的信息进行记录。report信息基本上就是上面logcat中的部分。
- mService.addErrorToDropBox("anr", mApp, mApp.processName, activityShortComponentName,
- parentShortComponentName, parentPr, null, report.toString(), tracesFile,
- null, new Float(loadingProgress), incrementalMetrics, errorId);
文件位置是:/data/system/dropbox/data_app_anr@1670396615724.txt.gz
是一个压缩文件,解析后就是我们想要的内容了,大体的格式如下。
- Process: com.beantechs.apm
- PID: 10589
- Flags: 0x20e8bf46
- Package: com.xxx.apm v1 (1.0)
- Foreground: Yes
- Activity: com.xxx.apm/.TestActivity
- Subject: Input dispatching timed out (Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 4. Wait queue head age: 5571.8ms.)
- Build: qti/gwm/gwm:9/PQ1A.190105.004/bean08231515:userdebug/test-keys
-
- CPU usage from 141069ms to 0ms ago (2022-12-07 15:01:10.951 to 2022-12-07 15:03:32.020):
- 20% 6354/adbd: 1.7% user + 18% kernel / faults: 4493599 minor
- 5.8% 322/logd: 1.7% user + 4% kernel / faults: 26 minor
- 4.3% 514/android.hardware.sensors@1.0-service: 0.6% user + 3.7% kernel
- 3.9% 385/vendor.ts.systemlog@1.0-service: 1.7% user + 2.1% kernel / faults: 281 minor
- 3.4% 497/android.hardware.audio@2.0-service.rbgwm: 0.9% user + 2.4% kernel
- 3.2% 1020/system_server: 1.3% user + 1.9% kernel / faults: 21547 minor 1 major
- 3.2% 543/audioserver: 1.2% user + 1.9% kernel / faults: 4052 minor
- 3.2% 26582/kworker/u16:1: 0% user + 3.2% kernel
- 2.2% 4415/com.beantechs.datapipeline: 1.5% user + 0.7% kernel / faults: 41778 minor
- 2.1% 4442/com.lx.usbaudiohid: 1% user + 1% kernel / faults: 2 minor
- 2% 3376/com.beantechs.intelligentvehiclecontrol: 1.3% user + 0.7% kernel / faults: 5915 minor
- 1.8% 9937/kworker/u17:1: 0% user + 1.8% kernel
- ...
- 10% TOTAL: 3.2% user + 6.4% kernel + 0% iowait + 0.7% irq + 0.1% softirq
-
- ----- pid 10589 at 2022-12-07 15:03:32 -----
- Cmd line: com.xxxx.apm
问题1:ANR发生后会立马弹出ANR框吗?
答:不会,2.3中讲到,触发了ANR并且采集完相关线程日志后,会延时5S在弹出ANR的框。
问题2:主线程卡顿一定会导致ANR吗?
答:不会。首先,卡顿一定要达到某个阈值才会触发。其次,一定要满足某个场景才可以。比如ActivityA跳转ActivityB之后,延时1S后主线程休眠20S,就不会ANR,因为输入事件已经得到响应,并不满足任何触发ANR的场景。
问题3:为什么ANR的日志traces.txt中,有时候会打印nativePollOnce的堆栈?
答:理论上,系统侧在发现达到超市事件后,会立马出发ANR检查。如果检查发生真的超时,会走ANR流程,捕获APP的堆栈状态并保存。但是实际执行中,创建文件,dump指定进程堆栈信息都需要时间,并且ANR流程的开始还会sleep200毫秒。
我们看3.1中日志,从15:03:32.023开始识别到ANR,到15:03:32.568结束dump完成APP侧信息。
- 2022-10-08 15:03:32.023 24194-24330/? W/InputDispatcher: Window ... is not responding. Waited 5007ms for MotionEvent //InputDispatcher::onAnrLocked中,识别到输入事件响应超时
- 2022-10-08 15:03:32.568 31655-31660/com.xt.client I/com.xt.client: Wrote stack traces to tombstoned //APP进程写入完成,通知系统侧
整个流程大约要500多毫秒,其中包括:
1.InputDispatcher中识别到ANR,传递到AMS准备开始采集要100多毫秒
2.识别到ANR的时候,系统进程会休眠200毫秒采集CPU信息。
3.系统进程采集一些必要信息,会耗费一些时间,大约100毫秒左右。
4.客户端收到信号去采集ANR的时候,也需要时间,大约要100毫秒左右。
所以,漂移的是有上限时间的,正常漂移上限一般为500ms左右,即使极端的场景,应该也不会很长。
参考文章:
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。