Troubleshooting High CPU Usage in a Java Application
2023-07-27 10:22:13
Author: 烟雨楼台笑江湖
This article describes how to troubleshoot high CPU usage in a Java application and is intended as a practical reference. Corrections and suggestions for anything wrong or overlooked are welcome.
Troubleshooting high CPU usage in a Java application
1. Use the top command to find the process with high CPU usage
top - 14:16:06 up 27 min, 3 users, load average: 0.00, 0.01, 0.02
Tasks: 88 total, 2 running, 86 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 999692 total, 454684 free, 206396 used, 338612 buff/cache
KiB Swap: 204796 total, 204796 free, 0 used. 614532 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1113 root 20 0 2583028 131816 15412 S 0.3 13.2 0:01.65 java
1 root 20 0 128164 6832 4060 S 0.0 0.7 0:00.66 systemd
2. Use top -H -p [pid] to find the threads with high CPU usage
top -H -p 1113
top - 14:17:34 up 28 min, 3 users, load average: 0.00, 0.01, 0.02
Threads: 19 total, 0 running, 19 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 999692 total, 454800 free, 206272 used, 338620 buff/cache
KiB Swap: 204796 total, 204796 free, 0 used. 614652 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1114 root 20 0 2583028 131816 15412 S 0.0 13.2 0:00.74 java
1115 root 20 0 2583028 131816 15412 S 0.0 13.2 0:00.00 java
1116 root 20 0 2583028 131816 15412 S 0.0 13.2 0:00.02 java
1117 root 20 0 2583028 131816 15412 S 0.0 13.2 0:00.00 java
1118 root 20 0 2583028 131816 15412 S 0.0 13.2 0:00.00 java
3. Use jstack -l [pid] to view the JVM's current thread information
[root@node01 ~]# jstack -l 1113
2023-01-06 14:20:52
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode):
"Attach Listener" #17 daemon prio=9 os_prio=0 tid=0x00007fb904001000 nid=0x4b2 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"DestroyJavaVM" #16 prio=5 os_prio=0 tid=0x00007fb93400c800 nid=0x45a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"FileWatchService" #15 prio=5 os_prio=0 tid=0x00007fb934788800 nid=0x46a waiting on condition [0x00007fb90d3d2000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e392af58> (a org.apache.rocketmq.common.CountDownLatch2$Sync)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at org.apache.rocketmq.common.CountDownLatch2.await(CountDownLatch2.java:114)
at org.apache.rocketmq.common.ServiceThread.waitForRunning(ServiceThread.java:139)
at org.apache.rocketmq.srvutil.FileWatchService.run(FileWatchService.java:70)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
"NettyEventExecutor" #10 prio=5 os_prio=0 tid=0x00007fb934787800 nid=0x469 waiting on condition [0x00007fb90d4d3000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e13b8328> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at org.apache.rocketmq.remoting.netty.NettyRemotingAbstract$NettyEventExecutor.run(NettyRemotingAbstract.java:589)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
"NettyNIOBoss_1" #14 prio=5 os_prio=0 tid=0x00007fb934787000 nid=0x468 runnable [0x00007fb90d5d4000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000e1c8e928> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000000e1c83868> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000e1c82430> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:810)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
"NSScheduledThread1" #12 prio=5 os_prio=0 tid=0x00007fb93473c000 nid=0x467 waiting on condition [0x00007fb90d6d5000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e12f8ad0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
"ServerHouseKeepingService" #11 daemon prio=5 os_prio=0 tid=0x00007fb93457c800 nid=0x466 in Object.wait() [0x00007fb920a25000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:552)
- locked <0x00000000e13b8b70> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:505)
Locked ownable synchronizers:
- None
"AsyncAppender-Worker-RocketmqNamesrvAppender" #9 daemon prio=5 os_prio=0 tid=0x00007fb934549000 nid=0x465 waiting on condition [0x00007fb920b26000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e128ee68> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
at ch.qos.logback.core.AsyncAppenderBase$Worker.run(AsyncAppenderBase.java:289)
Locked ownable synchronizers:
- None
"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007fb9340e3800 nid=0x463 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"C1 CompilerThread1" #7 daemon prio=9 os_prio=0 tid=0x00007fb9340e0800 nid=0x462 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"C2 CompilerThread0" #6 daemon prio=9 os_prio=0 tid=0x00007fb9340de800 nid=0x461 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Signal Dispatcher" #5 daemon prio=9 os_prio=0 tid=0x00007fb9340dc000 nid=0x460 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Surrogate Locker Thread (Concurrent GC)" #4 daemon prio=9 os_prio=0 tid=0x00007fb9340db000 nid=0x45f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fb9340a8000 nid=0x45e in Object.wait() [0x00007fb9218f5000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000e0008ed0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000e0008ed0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
Locked ownable synchronizers:
- None
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fb9340a5000 nid=0x45d in Object.wait() [0x00007fb9219f6000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000e0006bf8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000e0006bf8> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
Locked ownable synchronizers:
- None
"VM Thread" os_prio=0 tid=0x00007fb93409b800 nid=0x45c runnable
"Concurrent Mark-Sweep GC Thread" os_prio=0 tid=0x00007fb93402c000 nid=0x45b runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007fb9340e6800 nid=0x464 waiting on condition
JNI global references: 318
The key thing to note here is the nid value, which is in hexadecimal. Convert the thread id obtained earlier with top -H -p 1113 to hex (for example, 1114 above corresponds to 0x45a) and filter on that value: jstack -l 1113 | grep 45a
[root@node01 ~]# jstack -l 1113 | grep 45a
"DestroyJavaVM" #16 prio=5 os_prio=0 tid=0x00007fb93400c800 nid=0x45a waiting on condition [0x0000000000000000]
This lets you pinpoint the problematic thread.
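The decimal-to-hex conversion can be done on the shell with printf "%x" 1114, or in Java itself; a minimal sketch (1114 is the thread id taken from the top -H output above):

```java
class NidLookup {
    public static void main(String[] args) {
        // Thread id as reported (in decimal) by `top -H -p <pid>`.
        long tid = 1114;
        // jstack prints the same OS thread id as nid=0x... in hex.
        String nid = Long.toHexString(tid); // "45a"
        System.out.println("jstack -l 1113 | grep " + nid);
    }
}
```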
This only records the troubleshooting process. I did not have a suitable reproduction at hand, so the captured command output may not reflect an actual problem; I will update this once I run into one.
If the CPU spike turns out to be caused by full GC, look at the memory side instead, for example a heap that is too small causing frequent full GCs.
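To check whether GC is the culprit, jstat -gcutil [pid] 1000 is the usual tool; as a sketch, the same counters can also be read from inside the JVM via the standard java.lang.management API (collector names vary by JVM and GC algorithm):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    public static void main(String[] args) {
        // Cumulative collection count and time per collector since JVM start.
        // If the old-generation collector's count climbs quickly between two
        // samples, frequent full GCs are a likely cause of the CPU spike.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Sampling this twice a few seconds apart and diffing the counts gives a rough per-interval GC rate.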
Hands-On Analysis of a CPU Usage Spike (Java Application)
There are already plenty of blog posts analyzing extremely high CPU usage in Java applications.
It almost always comes down to some thread executing a CPU-intensive operation.
The main job is to use Linux commands and the Java tooling to find that specific thread, then analyze the code it is running.
This article is fairly simple and serves mainly as a record.
1. Preparation
1.1 Install and start Tomcat with Docker
# 1. Pull the tomcat image
hxw@hxwdeMacBook-Pro ~ % docker pull tomcat
# 2. When the pull completes, start a tomcat container
# Tomcat listens on port 8080 by default; map it to port 8088 on the Mac
hxw@hxwdeMacBook-Pro ~ % docker run -p 8088:8080 -d tomcat
# 3. After startup, check with docker ps
hxw@hxwdeMacBook-Pro ~ % docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
606ca0c7f783 tomcat:8.0.15 "catalina.sh run" 4 hours ago Up 42 minutes 0.0.0.0:8088->8080/tcp gracious_greider
# 4. Verify that Tomcat has started
# Open localhost:8088 in a browser; the Tomcat page should appear
1.2 Prepare the WAR package and copy it into the Tomcat container
One method in this WAR drives CPU usage up; it is explained at the end of the article
# 1. Prepare the WAR, named SpringTest.war
# 2. Copy it into the container
hxw@hxwdeMacBook-Pro ~ % docker cp SpringTest.war 606ca0c7f783:/usr/local/tomcat/webapps
# 3. Restart tomcat
hxw@hxwdeMacBook-Pro ~ % docker restart 606ca0c7f783
2. Analyzing the CPU spike
You first need to invoke the relevant method in the deployed SpringTest.war; which method it is gets explained at the end of the article
Note: all of the following operations are performed inside the Tomcat container
2.1 Enter the Tomcat container and install the required tools
# 1. Enter the container with docker exec
hxw@hxwdeMacBook-Pro ~ % docker exec -it 606ca0c7f783 /bin/bash
root@606ca0c7f783:/usr/local/tomcat#
# 2. Install a JDK (this image ships without one)
root@606ca0c7f783:/usr/local/tomcat# apt-get update
root@606ca0c7f783:/usr/local/tomcat# apt-get install default-jdk
root@606ca0c7f783:/usr/local/tomcat# apt-get install less
The main tools used are top and jstack from the JDK.
The less command is also missing from this Tomcat image and needs to be installed (install any other commands you need in the usual Debian way).
2.2 Use top to analyze process CPU usage
root@606ca0c7f783:/usr/local/tomcat# top
top - 07:42:09 up 8:50, 0 users, load average: 1.06, 0.41, 0.25
Tasks: 3 total, 0 running, 2 sleeping, 0 stopped, 0 zombie
%Cpu(s): 22.0 us, 49.9 sy, 0.0 ni, 28.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 2036420 total, 1895796 used, 140624 free, 140304 buffers
KiB Swap: 1048572 total, 64248 used, 984324 free. 817248 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 3318896 400936 13704 S 101.3 19.7 0:52.43 java
58 root 20 0 169276 10464 3052 S 0.0 0.5 0:00.22 bash
3186 root 20 0 170660 9408 4084 R 0.0 0.5 0:00.00 top
It is easy to see that the process with PID 1 is using about 101% of a CPU; this is the process driving the spike
2.3 Find the thread responsible inside that process
# The 1 here is the PID of the CPU-hogging process found above
root@606ca0c7f783:/usr/local/tomcat# top -H -p 1
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3388 root 20 0 3319700 392824 13732 R 99.8 19.3 1:29.68 java
1 root 20 0 3319700 392824 13732 S 0.0 19.3 0:00.05 java
27 root 20 0 3319700 392824 13732 S 0.0 19.3 0:00.00 java
28 root 20 0 3319700 392824 13732 S 0.0 19.3 0:02.65 java
29 root 20 0 3319700 392824 13732 S 0.0 19.3 0:00.36 java
30 root 20 0 3319700 392824 13732 S 0.0 19.3 0:00.37 java
The thread with PID 3388 is using 99.8% CPU, so that is the thread to analyze
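As an aside, the top -H plus jstack round trip can also be done from inside the JVM with the standard ThreadMXBean API, which reports per-thread CPU time directly; note it uses the JVM's internal thread id, not the OS-level id that top shows. A minimal sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadCpu {
    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // CPU time per thread in nanoseconds; -1 if unsupported or disabled.
        for (long id : bean.getAllThreadIds()) {
            ThreadInfo info = bean.getThreadInfo(id);
            long cpuNs = bean.getThreadCpuTime(id);
            if (info != null) {
                System.out.println(info.getThreadName() + " cpuMs=" + cpuNs / 1_000_000);
            }
        }
    }
}
```

Sorting this output by CPU time highlights the hot thread without leaving the JVM, which can be handy when you cannot attach shell tools to the container.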
2.4 Use jstack to dump the process's stack
# 1. Dump the stack of process 1 into 2.txt
root@606ca0c7f783:/usr/local/tomcat# jstack 1 > 2.txt
# 2. Convert the thread id to hex (3388 is the thread id found in 2.3)
root@606ca0c7f783:/usr/local/tomcat# printf "%x\n" 3388
d3c
# 3. Search 2.txt for thread d3c (-A keeps the stack frames below the match)
root@606ca0c7f783:/usr/local/tomcat# grep -A 20 d3c 2.txt
"http-nio-8080-exec-7" daemon prio=10 tid=0x00000040b4005000 nid=0xd3c runnable [0x00000041100b6000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:345)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
- locked <0x00000000d6d2ff58> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:482)
- locked <0x00000000d6d2ff38> (a java.io.PrintStream)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
- locked <0x00000000d6d30088> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
at java.io.PrintStream.newLine(PrintStream.java:546)
- eliminated <0x00000000d6d2ff38> (a java.io.PrintStream)
at java.io.PrintStream.println(PrintStream.java:737)
- locked <0x00000000d6d2ff38> (a java.io.PrintStream)
at org.apache.tomcat.util.log.SystemLogHandler.println(SystemLogHandler.java:237)
# Here you can see the specific block of application code
at com.example.controller.JvmTest.incre(JvmTest.java:21)
The analysis concludes that the JvmTest.incre() method is causing the CPU spike
2.5 Verify the analysis against the code
Since the analysis points at JvmTest.incre(), let's look at the corresponding code directly
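The article ends before showing the source. Judging from the stack trace in 2.4 (a RUNNABLE thread looping through PrintStream.println), the offending method was most likely a tight loop that prints on every iteration. The class below is a hypothetical reconstruction, not the article's actual code; a loop bound is added so the sketch can terminate:

```java
class JvmTest {
    private long count = 0;

    // Hypothetical reconstruction: in the real scenario this loop would run
    // unbounded, keeping one thread RUNNABLE at ~100% CPU with
    // PrintStream.println -> FileOutputStream.writeBytes on its stack,
    // matching the jstack output above.
    public long incre(long iterations) {
        while (count < iterations) {
            count++;
            System.out.println(count); // shows up as the locked PrintStream frame
        }
        return count;
    }
}
```

An unbounded loop of pure computation would produce the same 99.8% CPU symptom but a different stack, so the println frames in the dump are what point at I/O-in-a-loop code like this.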