When you run top inside a container, the PIDs you see belong to the container, but the %CPU statistics are those of the host.
As shown in the figure.
Principle
How is the CPU usage of a process calculated?
The state of each process is kept in files under the /proc directory, where each process has its own folder named after its PID.
For example, for my current Jenkins Docker container, I can look up its PID, 30134, with docker inspect.
Let's continue by looking at the files under /proc/30134.
There are a lot of files in it, but for now we only care about the stat file:
[root@pass-m-k8s-node-1 30134]# cat stat
30134 (he) S 30112 30134 30134 0 -1 1077944576 1639 0 0 0 244 934 0 0 20 0 0 1 0 2352840503 2514944 114 18446744073709551615 94288423874560 94288423890205 140731692225440 140731692224320 139648449412143 0 0 3145728 0 18446744072548467014 0 0 17 12 0 0 0 0 0 94288423901968 94288423904395 94288443621376 140731692228014 140731692228057 140731692228057 140731692228586 0 0
The stat file holds the real-time status information of the process, such as its running state (Running or Sleeping), parent process PID, priority, memory usage, and so on, more than 50 fields in total.
The fields are separated by single spaces. Look at field 14, utime, which is the CPU time the process has spent in user mode, counted in clock ticks. Here it is 244.
Field 15 is stime, which is the CPU time the process has spent in kernel mode. Here it is 934.
Note that both utime and stime are cumulative values: they have been growing ever since the process started.
The CPU usage of the process is then calculated based on these two values. I won't talk about the specific calculation formula.
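For readers who do want to see the idea, here is a minimal Python sketch of the usual approach: read utime + stime twice, then divide the tick difference by the clock ticks per second and by the elapsed time. It assumes the standard /proc/<pid>/stat layout described above; the function names parse_cpu_ticks, cpu_percent and sample_pid are hypothetical helpers made up for this illustration, not part of top or any library.

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (fields 14 and 15) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces, so split only the text after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 is rest[11] and field 15 is rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int, elapsed_s: float, hz: int) -> float:
    """CPU usage as a percentage of one core over the sampling interval."""
    return (ticks_after - ticks_before) / hz / elapsed_s * 100.0


def sample_pid(pid: int, interval_s: float = 1.0) -> float:
    """Read /proc/<pid>/stat twice and return the CPU usage in between."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second, commonly 100
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    start = time.monotonic()
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, time.monotonic() - start, hz)
```

Calling sample_pid(30134) on the host would give the usage of the Jenkins process over one second; a value above 100 means the process kept more than one core busy.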
The above is the CPU usage calculation of a single process, so where is the CPU data of the whole system?
It is in the /proc/stat file.
I see
The reason you see the host's CPU in top inside the container is that top reads the /proc/stat file, which reflects the status of the entire host, not of a single container.
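To make this concrete, the sketch below shows roughly how a system-wide CPU percentage can be derived from two readings of the first line of /proc/stat. It is a simplified illustration, not the actual code of top; the helper names read_cpu_line, parse_cpu_line and host_cpu_percent are hypothetical, and it assumes a kernel that reports at least the first eight columns.

```python
from typing import Tuple


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Return the first line of /proc/stat, the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Columns: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already included in user/nice, so only the first eight are summed.
    values = [int(v) for v in fields[1:9]]
    idle = values[3] + values[4]  # idle + iowait
    return idle, sum(values)


def host_cpu_percent(before: str, after: str) -> float:
    """Busy share of all host CPUs between two 'cpu' lines, in percent."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    total_delta = total1 - total0
    if total_delta <= 0:
        return 0.0
    return (1.0 - (idle1 - idle0) / total_delta) * 100.0
```

Because a container by default sees the same /proc/stat as the host, running this inside a container yields the host's figure, which is exactly the behavior described above.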
Is there any way around it?
We know that each container has its own CPU cgroup, and the directory of this cgroup contains many files.
For example, the cgroup directory of my current container is:
/sys/fs/cgroup/cpuacct/system.slice/docker-1a97854ff7856b7327122bea18c3676f05cd2bf74e9502fe24370c8f011ceb1c.scope
In it there is a cpuacct.stat file that contains the CPU information of the entire container.
Note that the user and system values here are also cumulative.
So we can read the file once a second and calculate the real-time CPU usage from the difference between two readings.
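The once-a-second sampling can be sketched in Python as follows. This is a minimal sketch that assumes the cgroup v1 layout shown in the path above, where cpuacct.stat contains a "user" line and a "system" line counted in clock ticks; the names parse_cpuacct_stat, container_cpu_percent and watch are hypothetical helpers, and cgroup_dir must be set to your own container's cgroup directory.

```python
import os
import time


def parse_cpuacct_stat(text: str) -> int:
    """Return user + system ticks from the contents of cpuacct.stat."""
    ticks = 0
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key in ("user", "system"):
            ticks += int(value)
    return ticks


def container_cpu_percent(ticks_before: int, ticks_after: int,
                          elapsed_s: float, hz: int) -> float:
    """Container CPU usage as a percentage of one core."""
    return (ticks_after - ticks_before) / hz / elapsed_s * 100.0


def watch(cgroup_dir: str, interval_s: float = 1.0) -> None:
    """Print the container's CPU usage once per interval until interrupted."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second, commonly 100
    path = os.path.join(cgroup_dir, "cpuacct.stat")
    with open(path) as f:
        previous = parse_cpuacct_stat(f.read())
    last = time.monotonic()
    while True:
        time.sleep(interval_s)
        with open(path) as f:
            current = parse_cpuacct_stat(f.read())
        now = time.monotonic()
        print(f"{container_cpu_percent(previous, current, now - last, hz):.1f}%")
        previous, last = current, now
```

Passing the directory shown above to watch() prints one figure per second. Since the counters cover every process in the cgroup, the result is the usage of the whole container rather than of a single PID.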
Many tools, such as Prometheus, the resource calculations in k8s, and Docker's own statistics, ultimately get their data from these cgroup files.
If a node runs fewer than 1,000 containers and the sampling interval is 10 seconds, the computing resources this consumes are small.
Still a bit troublesome, isn't it? The next article shares a small tool that helps us solve this problem.