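A Kafka consumer suddenly hit 100% CPU even though there was no data in Kafka yet. The cause turned out to be an infinite loop in the code.
How do you track it down?
First, run top to find the process using the most CPU: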
top - 03:11:00 up 52 days, 17:50, 5 users, load average: 0.99, 0.97, 0.99
Tasks: 149 total, 1 running, 148 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.6%us, 15.1%sy, 0.0%ni, 74.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3859888k total, 3362444k used, 497444k free, 186664k buffers
Swap: 524280k total, 412704k used, 111576k free, 403636k cached
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
23485 root      20   0 2665m 425m  13m S   0.0 11.3  48:38.02 java
28605 root      20   0 1654m 410m  16m S   0.0 10.9   0:17.55 java
12862 root      20   0 2635m 393m 3516 S 100.6 10.4  21826:30 java
20337 root      20   0 1681m 174m 5308 S   0.0  4.6  49:29.67 java
19670 root      20   0 1630m 138m 4764 S   0.0  3.7  50:51.80 java

Then check which thread in this process is using the most resources, with top -Hp 12862:
# top -Hp 12862
top - 03:16:07 up 52 days, 17:55, 5 users, load average: 0.99, 0.97, 0.99
Tasks: 47 total, 1 running, 46 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.4%us, 16.3%sy, 0.0%ni, 74.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3859888k total, 3356104k used, 503784k free, 187052k buffers
Swap: 524280k total, 412620k used, 111660k free, 403724k cached
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
12907 root      20   0 2635m 393m 3576 R 100.2 10.4  21520:32 java
13878 root      20   0 2635m 393m 3576 S   0.3 10.4  12:27.99 java
12862 root      20   0 2635m 393m 3576 S   0.0 10.4   0:00.00 java
12863 root      20   0 2635m 393m 3576 S   0.0 10.4   0:01.96 java

The culprit is thread 12907. Converted to hexadecimal, 12907 is 326b.
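The conversion can be double-checked in Java itself. This is only a small sketch; the class and method names (NidHex, toNid) are made up for illustration.

```java
// Hypothetical helper: turns the decimal thread ID shown by top -H
// into the hex "nid" form that jstack prints.
class NidHex {
    static String toNid(int tid) {
        // Integer.toHexString produces lowercase hex without a prefix
        return "0x" + Integer.toHexString(tid);
    }

    public static void main(String[] args) {
        System.out.println(toNid(12907)); // prints 0x326b
    }
}
```

On the shell, printf '%x\n' 12907 gives the same digits.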
Run jstack -l 12862 > jstack.log to write the thread stacks to a file.
Open jstack.log and search for 0x326b; jstack prints each thread's native ID in hex as nid.
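If the dump is large, the search can be scripted. The sketch below is not part of the original workflow: JstackSearch and findThread are hypothetical names, and it assumes every thread entry in the dump starts with a line beginning with a double quote, as in the entry for this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for pulling one thread entry out of a jstack dump.
class JstackSearch {
    // Returns the lines of the entry whose header contains "nid=<nid> ".
    // Assumption: each thread entry begins with a line starting with '"'.
    static List<String> findThread(List<String> dumpLines, String nid) {
        List<String> entry = new ArrayList<>();
        boolean inside = false;
        for (String line : dumpLines) {
            if (line.startsWith("\"")) {
                if (inside) {
                    break; // the next thread header ends the entry we wanted
                }
                inside = line.contains("nid=" + nid + " ");
            }
            if (inside) {
                entry.add(line);
            }
        }
        return entry;
    }

    public static void main(String[] args) {
        // Tiny in-memory sample; a real run would load jstack.log instead,
        // e.g. with java.nio.file.Files.readAllLines.
        List<String> dump = Arrays.asList(
            "\"pool-3-thread-2\" prio=10 tid=0x00007fb780235000 nid=0x326b runnable [0x00007fb7c89c2000]",
            "   java.lang.Thread.State: RUNNABLE",
            "\"some-other-thread\" prio=10 tid=0x0000000000000001 nid=0x326c runnable [0x0000000000000000]");
        for (String line : findThread(dump, "0x326b")) {
            System.out.println(line);
        }
    }
}
```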


"pool-3-thread-2" prio=10 tid=0x00007fb780235000 nid=0x326b runnable [0x00007fb7c89c2000]
   java.lang.Thread.State: RUNNABLE
        at com.elasticsearch.river.kafka.KafkaRiver$UpLoadFileWorker.run(KafkaRiver.java:303)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

   Locked ownable synchronizers:
        - <0x00000000f04077f8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
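The stack trace points to the infinite loop at KafkaRiver.java:303, in UpLoadFileWorker.run. The cause was that the loop never slept, so it kept spinning at full speed even when there was nothing to process.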
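The real UpLoadFileWorker code is not shown here, so the following is only a sketch of the general fix under assumed names: a hypothetical PollingWorker that drains an in-memory queue and sleeps briefly whenever the queue is empty instead of looping straight back. The queue, the idle interval and the counter are stand-ins for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical worker, not the actual KafkaRiver code.
class PollingWorker implements Runnable {
    private final BlockingQueue<String> queue;
    private final long idleSleepMs;
    private final AtomicInteger processed = new AtomicInteger();
    private volatile boolean running = true;

    PollingWorker(BlockingQueue<String> queue, long idleSleepMs) {
        this.queue = queue;
        this.idleSleepMs = idleSleepMs;
    }

    @Override
    public void run() {
        while (running) {
            String message = queue.poll(); // non-blocking; null when empty
            if (message == null) {
                try {
                    Thread.sleep(idleSleepMs); // give up the CPU while idle
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep interrupt status
                    return;
                }
                continue;
            }
            processed.incrementAndGet(); // stand-in for real message handling
        }
    }

    void stop() {
        running = false;
    }

    int processedCount() {
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        PollingWorker worker = new PollingWorker(queue, 100L);
        Thread thread = new Thread(worker);
        thread.start();
        Thread.sleep(500); // let the worker drain the queue and go idle
        worker.stop();
        thread.join();
        System.out.println("processed=" + worker.processedCount());
    }
}
```

Without the Thread.sleep call, the loop would call poll() continuously on an empty queue and pin one core, which matches the symptom seen in top. A blocking call such as BlockingQueue.poll(timeout, unit) is another way to get the same effect.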