
TOP does not show CPU usage for any process (0.0)


Environment

  • Red Hat Enterprise Linux 7.6

Issue

  • top shows 0.0 CPU usage for every process.

Resolution

  • Restart the machine and check whether the TSC error message disappears.
  • A BIOS update is suggested.
  • If the error persists, update the kernel to kernel-3.10.0-957.21.3.el7, which includes updated TSC clock drivers.

Root Cause

  • Because TSC synchronization fails, the top utility does not show CPU usage correctly on RHEL 7.
  • The issue was identified in some Dell servers.

Diagnostic Steps

After top had been running for about one minute while CPU-intensive operations were being executed, every process showed 0.0 CPU usage.

You can verify this by running top in batch mode and then checking whether any process shows CPU usage other than 0.0.


#top -b -n10 > /tmp/top.txt
#awk '{if($1 ~ /^[0-9]/ && $9 != "0.0") print}' /tmp/top.txt
      

In a normal scenario, some output should appear, as in the sample below.

#awk '{if($1 ~ /^[0-9]/ && $9 != "0.0") print}' /tmp/top.txt
19534 root       0 -20       0      0      0 I   0.3   0.0   0:07.00 kworker/u17:3-kcryptd/253:0
19880 root       0 -20       0      0      0 I   0.3   0.0   0:06.66 kworker/u17:7-kcryptd/253:0
20337 root       0 -20       0      0      0 I   0.3   0.0   0:04.18 kworker/u17:6-kcryptd/253:0
21208 root       20   0 2936108 282856 170584 S   0.3   1.8   0:09.72 Web Content
21299 root      20   0       0      0      0 I   0.3   0.0   0:00.05 kworker/7:2-mm_percpu_wq
21305 root       20   0 2874484 272912 136212 S   0.3   1.7   0:08.12 Web Content
21704 root       0 -20       0      0      0 I   0.3   0.0   0:00.45 kworker/u17:8-kcryptd/253:0
      

If no process appears while the system is under load, top is probably affected by this problem.
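The awk filter can also be wrapped in a small function that prints a count instead of the matching lines. This is a minimal sketch rather than part of the original procedure: the name `count_busy_procs` is hypothetical, and it assumes top's default field layout, in which %CPU is the ninth column.

```shell
#!/bin/bash
# count_busy_procs FILE
# Hypothetical helper: prints how many process lines in a saved
# `top -b` capture report a %CPU value other than 0.0.
# Assumes top's default layout, where %CPU is field 9.
count_busy_procs() {
    awk '$1 ~ /^[0-9]+$/ && $9 != "0.0" { n++ } END { print n + 0 }' "$1"
}

# Example usage (while a CPU-bound workload is running):
#   top -b -n 10 > /tmp/top.txt
#   count_busy_procs /tmp/top.txt
```

A result of 0 on a busy system corresponds to the empty awk output described above.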

In addition to that, check whether the TSC clocksource error appeared in dmesg logs:

#dmesg | grep -i "tsc\|clocksource"
[    0.000000] tsc: Detected 2600.000 MHz processor
[    0.195594] TSC deadline timer enabled
[    0.201405] TSC synchronization [CPU#0 -> CPU#1]:
[    0.201407] Measured 249252531012 cycles TSC warp between CPUs, turning off TSC clock.
[    0.201409] tsc: Marking TSC unstable due to check_tsc_sync_source failed
[    0.456994] Switched to clocksource hpet
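To test for the failure message in a script rather than by eye, the check can be reduced to a single grep. This is only a sketch: the function name `tsc_marked_unstable` is hypothetical, and it assumes the kernel logs the same "Marking TSC unstable" text shown in the sample output.

```shell
#!/bin/bash
# tsc_marked_unstable
# Hypothetical helper: reads kernel log text on stdin and exits with
# status 0 if it contains the "Marking TSC unstable" message.
tsc_marked_unstable() {
    grep -q "Marking TSC unstable"
}

# Example usage:
#   dmesg | tsc_marked_unstable && echo "TSC was marked unstable"
```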
      

The tsc clocksource should also be missing from the available_clocksource file.

#cat /sys/devices/system/clocksource/clocksource0/available_clocksource
hpet acpi_pm 
      

As a consequence, the OS switches to an alternative clocksource, in this case hpet, and top may display incorrect information.

#cat /sys/devices/system/clocksource/clocksource0/current_clocksource
hpet      
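The clocksource check can be scripted in the same way. The following is a minimal sketch: the function name `has_tsc_clocksource` and its optional directory argument are hypothetical, the argument being there only so the check can be pointed at a copy of the file.

```shell
#!/bin/bash
# has_tsc_clocksource [DIR]
# Hypothetical helper: exits with status 0 if "tsc" is listed in the
# available_clocksource file under DIR. DIR defaults to the standard
# sysfs location used in the commands above.
has_tsc_clocksource() {
    local dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    grep -qw tsc "$dir/available_clocksource"
}

# Example usage:
#   if has_tsc_clocksource; then
#       echo "tsc is available"
#   else
#       echo "tsc is not available"
#   fi
```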
