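All data nodes in the cluster kept crashing with a StackOverflowError, and they crashed again shortly after every restart. The StackOverflowError stack trace is as follows: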
[2023-12-22T16:03:44,057][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [xr-data-hdp-dn-rtyarn0725] fatal error in thread [elasticsearch[xr-data-hdp-dn-rtyarn0725][write][T#6]], exiting
java.lang.StackOverflowError: null
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:283) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
    ...
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parse(ObjectMapper.java:210) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:319) ~[elasticsearch-7.9.1.jar:7.9.1]
    at org.elasticsearch.index.mapper.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:237) ~[elasticsearch-7.9.1.jar:7.9.1]
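The stack trace shows that the StackOverflowError occurred in a thread of the [write] thread pool, most likely while a mapping was being parsed. The ObjectMapper class suggests it was triggered by writing Object-type data. We therefore pulled the mappings of every index in the cluster and looked for an index whose mapping had an Object-type field, but found none.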
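In the end, because this cluster has only a few indices, we fell back on a simple brute-force approach: we stopped the write jobs by bisection and watched the cluster state after each step until we had isolated the problem index.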
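The bisection can be sketched roughly as follows. This is only an illustration, assuming exactly one offending job. The `crashesWhenRunning` probe is hypothetical and stands for the manual step of leaving only a subset of jobs running and checking whether the data nodes still crash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class JobBisect {

    /**
     * Narrows a non-empty list of jobs down to the single culprit.
     * Assumption: exactly one job causes the crash, and the probe returns true
     * when the cluster still crashes with only the given jobs running.
     */
    static <T> T findCulprit(List<T> jobs, Predicate<List<T>> crashesWhenRunning) {
        List<T> suspects = new ArrayList<>(jobs);
        while (suspects.size() > 1) {
            int mid = suspects.size() / 2;
            List<T> firstHalf = new ArrayList<>(suspects.subList(0, mid));
            List<T> secondHalf = new ArrayList<>(suspects.subList(mid, suspects.size()));
            // Keep whichever half still reproduces the crash.
            suspects = crashesWhenRunning.test(firstHalf) ? firstHalf : secondHalf;
        }
        return suspects.get(0);
    }

    public static void main(String[] args) {
        // Hypothetical job names; "job-c" plays the role of the problem job.
        List<String> jobs = Arrays.asList("job-a", "job-b", "job-c", "job-d", "job-e");
        String culprit = findCulprit(jobs, running -> running.contains("job-c"));
        System.out.println(culprit);
    }
}
```

With n jobs this takes about log2(n) rounds of stopping and observing, which is why it was practical on a cluster with few indices.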
Why did the StackOverflowError occur?
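The overflow happens while the ES server is handling a client write request. With dynamic mapping enabled, if the incoming data contains new fields, ES has to parse their field definitions, and it does so by recursively parsing the JSON that corresponds to those definitions. When a field has a nested format (Object/nested), the recursion depth depends on how many levels the user data is nested. The data in the problem index was nested too deeply, so the recursion went too deep and overflowed the stack.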
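The relationship between nesting depth and recursion depth can be illustrated with a small standalone sketch. It is not the Elasticsearch code: `objectLevels`, `nested` and `overflows` are hypothetical names, and the walk over a `Map` only stands in for the recursive `parse` / `parseProperties` calls seen in the stack trace.

```java
import java.util.Collections;
import java.util.Map;

public class RecursiveParseDemo {

    /**
     * Walks a nested map recursively and returns the number of object levels,
     * counting the root. Every nested object costs one more stack frame.
     */
    static int objectLevels(Map<String, Object> node) {
        int deepestChild = 0;
        for (Object value : node.values()) {
            if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                deepestChild = Math.max(deepestChild, objectLevels(child));
            }
        }
        return deepestChild + 1;
    }

    /** Builds a chain of single-key objects with the given number of levels (iteratively). */
    static Map<String, Object> nested(int levels) {
        Map<String, Object> current = Collections.<String, Object>singletonMap("f", "leaf");
        for (int i = 1; i < levels; i++) {
            current = Collections.<String, Object>singletonMap("f", current);
        }
        return current;
    }

    /** Returns true if the recursive walk exhausts the thread stack. */
    static boolean overflows(int levels) {
        Map<String, Object> doc = nested(levels);
        try {
            objectLevels(doc);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("object levels: " + objectLevels(nested(10)));
        System.out.println("overflow at 10 levels: " + overflows(10));
        System.out.println("overflow at 500000 levels: " + overflows(500_000));
    }
}
```

The depth at which the overflow appears depends on the thread stack size (for example the JVM `-Xss` setting) and on the size of each frame, so the 500000 here is just a value expected to be well past the limit on typical default settings.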
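Verification:
We test-wrote a single document with many levels of nesting. The resulting code stack was the same as the stack of the StackOverflowError described above, with the same recursive calls repeated many times.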
{ "o1":{ "a":{ "b":{ "c":{ "d":{ "e":{ "f":{ "g":{ "h":{ "j":"ddd" } } } } } } } } } }
Code stack: (not preserved in this copy)
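Checking the problem index confirmed that dynamic mapping was indeed enabled, and that the raw logs did contain data with heavily nested structures.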
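To repeat this verification with deeper nesting than the hand-written test document, the payload can be generated instead of typed. The helper below is a hypothetical sketch (the field names `f0`, `f1`, ... are made up); it only builds the JSON string and leaves sending it to a test cluster to whatever client is already in use.

```java
public class NestedDocGenerator {

    /**
     * Builds a JSON document with the given number of nested object levels,
     * e.g. levels = 3 gives {"f0":{"f1":{"f2":"ddd"}}}.
     */
    static String nestedJson(int levels) {
        if (levels < 1) {
            throw new IllegalArgumentException("levels must be >= 1");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < levels; i++) {
            sb.append("{\"f").append(i).append("\":");
        }
        sb.append("\"ddd\"");
        for (int i = 0; i < levels; i++) {
            sb.append('}');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(nestedJson(3));
    }
}
```

Since a deep enough document crashes the node on the affected version, such a payload should only ever be written to a disposable test cluster.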
Why does the problem index's mapping contain no Object-type fields?
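The exception is triggered while the mapping is being parsed during a write, before the new mapping has been applied to the index. Because the StackOverflowError during mapping parsing crashed the ES process, the index mapping was never updated, so naturally the problem index's mapping contains no Object-type fields.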
ES has a depth limit for nested fields (index.mapping.depth.limit). Why didn't it block this document?
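That check runs only after the field definitions have been parsed, and the stack overflow happened during the parsing itself. See the code below for details.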
private synchronized Map<String, DocumentMapper> internalMerge(Map<String, CompressedXContent> mappings, MergeReason reason) {
    //... unrelated code omitted ...
        try {
            // parse the mapping derived from the data (the recursion overflows here)
            documentMapper =
                documentParser.parse(type, entry.getValue(), applyDefault ? defaultMappingSourceOrLastStored : null);
        } catch (Exception e) {
            throw new MapperParsingException("Failed to parse mapping [{}]: {}", e, entry.getKey(), e.getMessage());
        }
    } // closes a block opened in the omitted code

    return internalMerge(defaultMapper, defaultMappingSource, documentMapper, reason); // the mapping is checked in here
}

private synchronized Map<String, DocumentMapper> internalMerge(@Nullable DocumentMapper defaultMapper,
                                                               @Nullable String defaultMappingSource,
                                                               DocumentMapper mapper, MergeReason reason) {
    //... unrelated code omitted ...
    boolean hasNested = this.hasNested;
    Map<String, ObjectMapper> fullPathObjectMappers = this.fullPathObjectMappers;

    Map<String, DocumentMapper> results = new LinkedHashMap<>(2);

    if (defaultMapper != null) {
        if (indexSettings.getIndexVersionCreated().onOrAfter(Version.V_7_0_0)) {
            throw new IllegalArgumentException(DEFAULT_MAPPING_ERROR_MESSAGE);
        } else if (reason == MergeReason.MAPPING_UPDATE) { // only log in case of explicit mapping updates
            deprecationLogger.deprecatedAndMaybeLog("default_mapping_not_allowed", DEFAULT_MAPPING_ERROR_MESSAGE);
        }
        assert defaultMapper.type().equals(DEFAULT_MAPPING);
        results.put(DEFAULT_MAPPING, defaultMapper);
    }

    for (ObjectMapper objectMapper : objectMappers) {
        if (reason != MergeReason.MAPPING_RECOVERY) {
            checkTotalFieldsLimit(objectMappers.size() + fieldMappers.size() - metadataMappers.length
                + fieldAliasMappers.size());
            checkFieldNameSoftLimit(objectMappers, fieldMappers, fieldAliasMappers);
            checkNestedFieldsLimit(fullPathObjectMappers);
            // check whether the mapping's maximum depth exceeds the threshold;
            // if it does, throw IllegalArgumentException
            checkDepthLimit(fullPathObjectMappers.keySet());
        }
        results.put(newMapper.type(), newMapper);
    }
    return results;
}
The upstream community fixed this issue in v8.6 (https://github.com/elastic/elasticsearch/issues/52098). We are running ES 7, so we need to either upgrade or apply a patch to resolve it.
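Until an upgrade or patch is in place, one possible stopgap is to reject overly deep documents on the client side before they reach the cluster. The sketch below is an assumption of ours, not the upstream fix: `maxObjectDepth` and `isSafeToIndex` are hypothetical helpers. They scan the raw JSON text iteratively, so the check itself cannot overflow the stack, and they assume the input is already well-formed JSON. The limit of 20 mirrors the documented default of index.mapping.depth.limit.

```java
public class JsonDepthGuard {

    /**
     * Returns the maximum nesting depth of JSON objects in the text, counting
     * the root object as 1. Braces inside string values are ignored.
     * Assumes well-formed JSON; arrays do not add to the depth.
     */
    static int maxObjectDepth(String json) {
        int depth = 0;
        int max = 0;
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the character that follows the backslash
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
                max = Math.max(max, depth);
            } else if (c == '}') {
                depth--;
            }
        }
        return max;
    }

    /** True if the document's object nesting stays within the given limit. */
    static boolean isSafeToIndex(String json, int depthLimit) {
        return maxObjectDepth(json) <= depthLimit;
    }

    public static void main(String[] args) {
        // The test document from the verification step above.
        String doc = "{ \"o1\":{ \"a\":{ \"b\":{ \"c\":{ \"d\":{ \"e\":{ \"f\":{ \"g\":{ \"h\":"
            + "{ \"j\":\"ddd\" } } } } } } } } } }";
        System.out.println("depth = " + maxObjectDepth(doc));
        System.out.println("safe with limit 20: " + isSafeToIndex(doc, 20));
    }
}
```

A guard like this only protects the writers that adopt it, so it complements the upgrade or patch and does not replace it.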