当前位置:   article > 正文

处理Springboot项目启动时streamBridge.send导致的卡住卡死问题_springboot项目启动一半,卡住了

springboot项目启动一半,卡住了

现象

  • 我们的Spring Boot 项目启动时,偶现卡死的现象,启动到一半卡主不动了
  • 也没有啥特别明显的报错信息(一开始没仔细看没发现)
2023-01-16 10:23:10.338  INFO 1 --- [  restartedMain] com.hazelcast.core.LifecycleService      : [172.18.0.14]:5701 [dev] [4.2.4] [172.18.0.14]:5701 is STARTED
2023-01-16 10:23:12.391  INFO 1 --- [  restartedMain] c.h.h.HazelcastCacheRegionFactory        : Starting up HazelcastCacheRegionFactory
2023-01-16 10:23:12.396  INFO 1 --- [  restartedMain] c.h.h.i.IHazelcastInstanceFactory        : Using existing HazelcastInstance [unit].
2023-01-16 10:23:16.685  INFO 1 --- [ntLoopGroup-2-1] com.newatc.socketio.SocketIOServer       : SocketIO server started at port: 7654
2023-01-16 10:23:16.702  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerAgentServer bean listeners added
2023-01-16 10:23:16.972  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : channelParamHandler bean listeners added
2023-01-16 10:23:16.979  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerAlarmHandler bean listeners added
2023-01-16 10:23:16.988  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerFaultHandler bean listeners added
2023-01-16 10:23:16.995  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerModelHandler bean listeners added
2023-01-16 10:23:17.000  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : controllerTimeHandler bean listeners added
2023-01-16 10:23:17.006  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : dayPlanParamHandler bean listeners added
2023-01-16 10:23:17.011  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : detectorEventHandler bean listeners added
2023-01-16 10:23:17.016  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : detectorParamHandler bean listeners added
2023-01-16 10:23:17.022  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : lampOffHandler bean listeners added
2023-01-16 10:23:17.027  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : patternParamHandler bean listeners added
2023-01-16 10:23:17.033  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : phaseFlowHandler bean listeners added
2023-01-16 10:23:17.038  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : phaseParamHandler bean listeners added
2023-01-16 10:23:17.044  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : phaseStatusHandler bean listeners added
2023-01-16 10:23:17.050  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : scheduleParamHandler bean listeners added
2023-01-16 10:23:17.057  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : signalIPParamHandler bean listeners added
2023-01-16 10:23:17.062  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : softwareVersionHandler bean listeners added
2023-01-16 10:23:17.068  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : stageParamHandler bean listeners added
2023-01-16 10:23:17.073  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : trafficParamsHandler bean listeners added
2023-01-16 10:23:17.079  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : vehGPSTrajectoryHandler bean listeners added
2023-01-16 10:23:17.084  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessOpLogHandler bean listeners added
2023-01-16 10:23:17.091  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessSecurityHandler bean listeners added
2023-01-16 10:23:18.018  INFO 1 --- [ntLoopGroup-3-1] c.n.socketio.handler.AuthorizeHandler    : channel active cb27f0b9-236a-404c-8cb1-369366729ce2
2023-01-16 10:23:18.466  INFO 1 --- [ntLoopGroup-3-1] c.h.i.p.impl.PartitionStateManager       : [172.18.0.14]:5701 [dev] [4.2.4] Initializing cluster partition table arrangement...
2023-01-16 10:23:18.776  INFO 1 --- [ntLoopGroup-3-1] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.101 Port=38167 unitId:101
2023-01-16 10:23:18.981  INFO 1 --- [ntLoopGroup-3-2] c.n.socketio.handler.AuthorizeHandler    : channel active 539fb94d-a5de-4fef-b81d-82803885973f
2023-01-16 10:23:19.008  INFO 1 --- [ntLoopGroup-3-2] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.93 Port=39568 unitId:93
2023-01-16 10:23:22.177  INFO 1 --- [        Timer-0] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.101
2023-01-16 10:23:22.244  INFO 1 --- [        Timer-2] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.93
2023-01-16 10:23:25.953  INFO 1 --- [ntLoopGroup-3-3] c.n.socketio.handler.AuthorizeHandler    : channel active 874e3919-489d-4970-8a53-e51e8a880057
2023-01-16 10:23:25.978  INFO 1 --- [ntLoopGroup-3-3] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45166 unitId:107
2023-01-16 10:23:26.821  WARN 1 --- [  restartedMain] onConfiguration$FunctionBindingRegistrar : You have defined function definition that does not exist: streamBridge
2023-01-16 10:23:29.213  INFO 1 --- [        Timer-4] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:24:00.174  INFO 1 --- [        Timer-1] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.101
2023-01-16 10:24:00.248  INFO 1 --- [        Timer-3] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.93
2023-01-16 10:24:07.212  INFO 1 --- [        Timer-5] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.107
2023-01-16 10:24:37.351  INFO 1 --- [ntLoopGroup-3-4] c.n.socketio.handler.AuthorizeHandler    : channel active bd580b9b-8050-4315-bf0d-07e8acc797b2
2023-01-16 10:24:37.368  INFO 1 --- [ntLoopGroup-3-4] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45167 unitId:107
2023-01-16 10:24:40.595  INFO 1 --- [        Timer-6] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:24:43.373  INFO 1 --- [ntLoopGroup-3-5] c.n.socketio.handler.AuthorizeHandler    : channel active a7ffe375-462d-4c18-bd67-1811933b8084
2023-01-16 10:24:43.388  INFO 1 --- [ntLoopGroup-3-5] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.93 Port=39570 unitId:93
2023-01-16 10:24:46.611  INFO 1 --- [        Timer-8] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.93
2023-01-16 10:25:18.595  INFO 1 --- [        Timer-7] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.107
2023-01-16 10:25:24.612  INFO 1 --- [        Timer-9] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.93
2023-01-16 10:25:51.726  INFO 1 --- [ntLoopGroup-3-6] c.n.socketio.handler.AuthorizeHandler    : channel active fb525bf4-fdd7-4f88-9a0d-88c923ef24df
2023-01-16 10:25:51.743  INFO 1 --- [ntLoopGroup-3-6] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45168 unitId:107
2023-01-16 10:25:54.964  INFO 1 --- [       Timer-10] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:26:32.965  INFO 1 --- [       Timer-11] c.n.unit.agent.ControllerAgentServer     : Get signal time ipAddress:172.16.1.107
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 36
  • 37
  • 38
  • 39
  • 40
  • 41
  • 42
  • 43
  • 44
  • 45
  • 46
  • 47
  • 48
  • 49
  • 50
  • 51
  • 52
  • 这个启动卡死不是必现的,综合了出现卡死时的情况,发现,当平台连接了大量信号机设备时,容易出现卡死问题
  • 设备和程序使用socket通信,接收到socket消息时,会使用kafka转发处理过的消息,此时程序还未启动完成,就发生了偶尔的卡死
  • 描述下发生和不发生的链各个场景
    • 正常的:平台未连接信号机设备,或只连接了少量设备(公司测试环境,没这么多信号机)
    • 异常的:客户现场,连接了很多信号机设备
  • 这让我想起之前处理的一个启动报错 Spring Cloud Stream kafka项目启动时报错
  • 项目使用的是spring-cloud-starter-stream-kafka,函数式编程。代码业务逻辑是,程序接收到硬件的socket.io消息推送,就会使用kafka转发消息
  • 之前报错原因是项目启动时,在还未与kafka建立连接时,程序就接收到socket消息开始按照程序逻辑发送kafka消息,导致空指针异常,之后加了判断与初始化处理,问题解决
  • 本次现象类似,但初始化部署,程序没有连接硬件时,启动都是成功的。但程序已经使用一段时间,连接上硬件,接收硬件上报数据,就出现了偶发的更新启动卡死现象

排查和解决

  • 先是对比了启动失败和启动成功的日志,没有发现什么异常报错,启动成功日志如下
2023-01-16 10:13:32.705  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : vehGPSTrajectoryHandler bean listeners added
2023-01-16 10:13:32.709  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessOpLogHandler bean listeners added
2023-01-16 10:13:32.715  INFO 1 --- [  restartedMain] c.n.s.a.SpringAnnotationScanner          : wirelessSecurityHandler bean listeners added
2023-01-16 10:13:33.786  INFO 1 --- [ntLoopGroup-3-1] c.n.socketio.handler.AuthorizeHandler    : channel active 65c277be-ed56-496a-9a3f-7243b1e7adff
2023-01-16 10:13:33.786  INFO 1 --- [ntLoopGroup-3-2] c.n.socketio.handler.AuthorizeHandler    : channel active f9f1acf4-ada2-4590-b949-46e0214e879b
2023-01-16 10:13:34.263  INFO 1 --- [ntLoopGroup-3-1] c.h.i.p.impl.PartitionStateManager       : [172.18.0.14]:5701 [dev] [4.2.4] Initializing cluster partition table arrangement...
2023-01-16 10:13:34.583  INFO 1 --- [ntLoopGroup-3-1] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.93 Port=39558 unitId:93
2023-01-16 10:13:34.583  INFO 1 --- [ntLoopGroup-3-2] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.107 Port=45152 unitId:107
2023-01-16 10:13:37.960  INFO 1 --- [        Timer-0] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.93
2023-01-16 10:13:37.960  INFO 1 --- [        Timer-1] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.107
2023-01-16 10:13:38.637  INFO 1 --- [ntLoopGroup-3-3] c.n.socketio.handler.AuthorizeHandler    : channel active 6dce6405-53af-4914-aa6c-dcbf2860babc
2023-01-16 10:13:38.660  INFO 1 --- [ntLoopGroup-3-3] c.n.unit.agent.ControllerAgentServer     : onConnect -- client = 172.16.1.101 Port=38154 unitId:101
2023-01-16 10:13:41.930  INFO 1 --- [        Timer-4] c.n.unit.agent.ControllerAgentServer     : Sync signal time ipAddress:172.16.1.101
2023-01-16 10:13:42.544  WARN 1 --- [  restartedMain] onConfiguration$FunctionBindingRegistrar : You have defined function definition that does not exist: streamBridge
2023-01-16 10:13:43.942  INFO 1 --- [ntLoopGroup-3-1] c.n.socketio.protocol.PacketDecoder      : Rcv Signal Machine IP:172.16.1.93
2023-01-16 10:13:43.942  INFO 1 --- [ntLoopGroup-3-2] c.n.socketio.protocol.PacketDecoder      : Rcv Signal Machine IP:172.16.1.107
2023-01-16 10:13:43.942  INFO 1 --- [ntLoopGroup-3-3] c.n.socketio.protocol.PacketDecoder      : Rcv Signal Machine IP:172.16.1.101
2023-01-16 10:13:43.944  INFO 1 --- [ntLoopGroup-3-3] c.n.u.a.handler.SignalIPParamHandler     : rcv:172.16.1.101
2023-01-16 10:13:43.945  INFO 1 --- [ntLoopGroup-3-2] c.n.u.a.handler.SignalIPParamHandler     : rcv:172.16.1.107
2023-01-16 10:13:43.945  INFO 1 --- [ntLoopGroup-3-1] c.n.u.a.handler.SignalIPParamHandler     : rcv:172.16.1.93
2023-01-16 10:13:47.039  INFO 1 --- [  restartedMain] org.jboss.threads                        : JBoss Threads version 3.1.0.Final
2023-01-16 10:13:47.304  INFO 1 --- [  restartedMain] com.newatc.unit.UnitApp                  : Started UnitApp in 41.426 seconds (JVM running for 44.067)
2023-01-16 10:13:47.319  INFO 1 --- [  restartedMain] com.newatc.unit.UnitApp                  : 
----------------------------------------------------------
	Application 'unit' is running! Access URLs:
	Local: 		http://localhost:8282/
	External: 	http://172.18.0.14:8282/
	当前版本为: 		1.4.2
	代码打包时间为: 		202301060900
	Profile(s): 	[prod, api-docs, no-liquibase]
----------------------------------------------------------
2023-01-16 10:13:47.320  INFO 1 --- [  restartedMain] com.newatc.unit.UnitApp                  : 
----------------------------------------------------------
	Config Server: 	Connected to Consul Server running in Docker
----------------------------------------------------------
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 这个启动卡死现象是偶发的,确定与kafka发送消息有关,但不是必现,无法通过错误日志和日志对比找问题,只好另寻他法
  • 由于启动成功的日志里,也有接收硬件消息的日志打印,想着是不是有线程等待,暂时卡主了,需要使用一些自带的java命令来检查下
  • 在docker里使用 jstack 1,结果提示bash: jstack: command not found,去java文件夹看了下,确实没这些命令
    在这里插入图片描述
  • 为了减少镜像包大小,我们使用的是jre版本的java 11 环境docker镜像eclipse-temurin:11-jre-focal(程序镜像包170M左右),没有自带一些java命令,只好换成jdk版本的eclipse-temurin:11-jdk-focal重新打镜像包(300M左右)部署
  • 部署后重现了启动卡死的问题,使用 jstack 1命令查看此时的线程情况,一下子发现了问题Found 1 deadlock.,发现了一个死锁线程,具体日志如下(省略了部分日志):
I have no name!@315764d2cd4b:/$ jstack 1
2023-01-16 11:39:57
Full thread dump OpenJDK 64-Bit Server VM (11.0.17+8 mixed mode, sharing):


JNI global refs: 29, weak refs: 0


Found one Java-level deadlock:
=============================
"restartedMain":
  waiting to lock monitor 0x00007f730406bb00 (object 0x00000000e34c2160, a org.springframework.cloud.stream.binder.DefaultBinderFactory),
  which is held by "nioEventLoopGroup-3-2"
"nioEventLoopGroup-3-2":
  waiting to lock monitor 0x00007f725c019500 (object 0x00000000e1516e80, a java.util.concurrent.ConcurrentHashMap),
  which is held by "restartedMain"

Java stack information for the threads listed above:
===================================================
"restartedMain":
	at org.springframework.cloud.stream.binder.DefaultBinderFactory.getBinder(DefaultBinderFactory.java:129)
	- waiting to lock <0x00000000e34c2160> (a org.springframework.cloud.stream.binder.DefaultBinderFactory)
	at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.determineTargetProtocol(FunctionConfiguration.java:640)
	at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.createFunctionHandler(FunctionConfiguration.java:579)
	at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.bindFunctionToDestinations(FunctionConfiguration.java:562)
	at org.springframework.cloud.stream.function.FunctionConfiguration$FunctionToDestinationBinder.afterPropertiesSet(FunctionConfiguration.java:421)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1863)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1800)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:620)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)
	at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)
	at org.springframework.beans.factory.support.AbstractBeanFactory$$Lambda$287/0x00000008401f6040.getObject(Unknown Source)
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
	- locked <0x00000000e1516e80> (a java.util.concurrent.ConcurrentHashMap)
	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:953)
	at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918)
	at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583)
	- locked <0x00000000e10d9e48> (a java.lang.Object)
	at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:145)
	at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:740)
	at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:415)
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:303)
	at com.newatc.unit.UnitApp.main(UnitApp.java:72)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.17/Native Method)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.17/NativeMethodAccessorImpl.java:62)
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.17/DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(java.base@11.0.17/Method.java:566)
	at org.springframework.boot.devtools.restart.RestartLauncher.run(RestartLauncher.java:49)
"nioEventLoopGroup-3-2":
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:216)
	- waiting to lock <0x00000000e1516e80> (a java.util.concurrent.ConcurrentHashMap)
	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:213)
	at org.springframework.context.event.AbstractApplicationEventMulticaster.retrieveApplicationListeners(AbstractApplicationEventMulticaster.java:264)
	at org.springframework.context.event.AbstractApplicationEventMulticaster.getApplicationListeners(AbstractApplicationEventMulticaster.java:221)
	at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:138)
	at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:421)
	at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:427)
	at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:378)
	at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:938)
	at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
	- locked <0x00000000e3d686a0> (a java.lang.Object)
	at org.springframework.cloud.stream.binder.DefaultBinderFactory.initializeBinderContextSimple(DefaultBinderFactory.java:471)
	at org.springframework.cloud.stream.binder.DefaultBinderFactory.getBinderInstance(DefaultBinderFactory.java:265)
	at org.springframework.cloud.stream.binder.DefaultBinderFactory.doGetBinder(DefaultBinderFactory.java:223)
	at org.springframework.cloud.stream.binder.DefaultBinderFactory.getBinder(DefaultBinderFactory.java:151)
	- locked <0x00000000e34c2160> (a org.springframework.cloud.stream.binder.DefaultBinderFactory)
	at org.springframework.cloud.stream.binding.BindingService.getBinder(BindingService.java:389)
	at org.springframework.cloud.stream.binding.BindingService.bindProducer(BindingService.java:273)
	at org.springframework.cloud.stream.function.StreamBridge.resolveDestination(StreamBridge.java:282)
	- locked <0x00000000e3453128> (a org.springframework.cloud.stream.function.StreamBridge)
	at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:210)
	at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:164)
	at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:144)
	at com.newatc.unit.agent.ControllerAgentServer.onConnect(ControllerAgentServer.java:184)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.17/Native Method)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.17/NativeMethodAccessorImpl.java:62)
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.17/DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(java.base@11.0.17/Method.java:566)
	at com.newatc.socketio.annotation.OnConnectScanner$1.onConnect(OnConnectScanner.java:41)
	at com.newatc.socketio.namespace.Namespace.onConnect(Namespace.java:186)
	at com.newatc.socketio.handler.AuthorizeHandler.connect(AuthorizeHandler.java:151)
	at com.newatc.socketio.handler.AuthorizeHandler.channelActive(AuthorizeHandler.java:90)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:230)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:216)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:209)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelActive(DefaultChannelPipeline.java:1398)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:230)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:216)
	at io.netty.channel.DefaultChannelPipeline.fireChannelActive(DefaultChannelPipeline.java:895)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:522)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)

Found 1 deadlock.
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 36
  • 37
  • 38
  • 39
  • 40
  • 41
  • 42
  • 43
  • 44
  • 45
  • 46
  • 47
  • 48
  • 49
  • 50
  • 51
  • 52
  • 53
  • 54
  • 55
  • 56
  • 57
  • 58
  • 59
  • 60
  • 61
  • 62
  • 63
  • 64
  • 65
  • 66
  • 67
  • 68
  • 69
  • 70
  • 71
  • 72
  • 73
  • 74
  • 75
  • 76
  • 77
  • 78
  • 79
  • 80
  • 81
  • 82
  • 83
  • 84
  • 85
  • 86
  • 87
  • 88
  • 89
  • 90
  • 91
  • 92
  • 93
  • 94
  • 95
  • 96
  • 97
  • 98
  • 99
  • 100
  • 101
  • 102
  • 103
  • 104
  • 看了stack日志,看到了死锁互相等待的线程,就是这个问题了。具体触发死锁的代码为at org.springframework.cloud.stream.function.StreamBridge.send(StreamBridge.java:210),还是kafka问题
  • 发现问题,想着等启动完成后,再去执行发送kafka消息的逻辑。再启动类里增加一个公共变量,启动前为false,启动完成后为true,在kafka发送时,使用此变量判断

// 启动类UnitApp
	/**
     * 是否启动完成,默认false,启动完成后置true
     */
    public static Boolean START_COMPLETED = false;
    
	public static void main(String[] args) {
        SpringApplication app = new SpringApplication(UnitApp.class);
        DefaultProfileUtil.addDefaultProfile(app);
        Environment env = app.run(args).getEnvironment();
        logApplicationStartup(env);
        START_COMPLETED = true;
    }

// 使用时
			try {
                if (streamBridge != null && UnitApp.START_COMPLETED) {
                    streamBridge.afterSingletonsInstantiated();
                    streamBridge.send(OUTPUT_BINDING_NAME, mapper.writeValueAsString(offlineEventDTO));
                } else {
                    log.error("streamBridge is null or UnitApp.START_COMPLETED is {}--------!", UnitApp.START_COMPLETED);
                }
            } catch (JsonProcessingException e) {
                log.error(e.getMessage());
            }
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 思路简单,程序启动完成,再使用kafka,问题解决
声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/我家小花儿/article/detail/485745
推荐阅读
相关标签
  

闽ICP备14008679号