Creating and starting the “lxc-example” container with the lxc tools (version 2.0.9) creates several cgroups under /sys/fs/cgroup/<subsystem>/lxc/. I mainly focus on the configuration of the subsystems below:
/sys/fs/cgroup/cpu/lxc/lxc-example/…
/sys/fs/cgroup/cpuset/lxc/lxc-example/…
/sys/fs/cgroup/memory/lxc/lxc-example/…
/sys/fs/cgroup/devices/lxc/lxc-example/…
cpu subsystem:
lxc.cgroup.cpu.shares
#defines the relative weight, i.e. the lower bound, of CPU usage for your container;
#the kernel uses it as the weight for CFS (Completely Fair Scheduler) scheduling of your container;
#from the Linux kernel's perspective, each container (containing the tasks it controls) is a scheduling group;
#because we use the lxc tools (the lxc-start command) to start a container, an "lxc" cgroup is created under the root cgroup by default, and the container you specify is created under "lxc", as shown below:
lxc-start -n lxc-example:
/sys/fs/cgroup/cpu/lxc/lxc-example/…
lxc-start -n lxc-example2:
/sys/fs/cgroup/cpu/lxc/lxc-example2/…
lxc-start -n lxc-example3:
/sys/fs/cgroup/cpu/lxc/lxc-example3/…
"lxc" cgroup controls the cpu share of all its child-cgroup(lxc-example,lxc-example2,lxc-example3):
lxc-example + lxc-example2 + lxc-example3 = "lxc" #default value of "lxc" is 1024(/sys/fs/cgroup/cpu/lxc/cpu.shares=1024)
For example, if the cpu share of "lxc" and its child-cgroup are showed as below:
/sys/fs/cgroup/cpu/lxc/cpu.shares = 4096
/sys/fs/cgroup/cpu/lxc/lxc-example/cpu.shares = 1024
/sys/fs/cgroup/cpu/lxc/lxc-example2/cpu.shares = 2048
/sys/fs/cgroup/cpu/lxc/lxc-example3/cpu.shares = 1024
Then, the cpu usage ratio is:
lxc-example:lxc-example2:lxc-example3=1:2:1
cpu share of "lxc" is 4096, which is used by linux kernel to do schdule together with task in host
for example,
1, If there is a normal task running on the host with weight 1024, and normal tasks are also running in lxc-example, lxc-example2, and lxc-example3 (the host task and container tasks are all pinned to the same single CPU), the CPU usage ratio is:
host:lxc-example:lxc-example2:lxc-example3=1:1:2:1
2, If there is a normal task running on the host with weight 2048, and normal tasks are also running in lxc-example, lxc-example2, and lxc-example3 (the host task and container tasks are all pinned to the same single CPU), the CPU usage ratio is:
host:lxc-example:lxc-example2:lxc-example3=2:1:2:1
3, If there is a normal task running on the host with weight 4096, and normal tasks are also running in lxc-example, lxc-example2, and lxc-example3 (the host task and container tasks are all pinned to the same single CPU), the CPU usage ratio is:
host:lxc-example:lxc-example2:lxc-example3=4:1:2:1
If no task from another container or from the host is running on the corresponding CPU, this container's CPU usage can reach 100%.
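Because the hierarchy nests, a child's slice is its fraction within "lxc" multiplied by the fraction "lxc" wins against the host. A quick shell sketch using the weights from example 3 above (illustrative values only; nothing is read from a real /sys/fs/cgroup tree):

```shell
# Hypothetical cpu.shares weights taken from example 3 above.
host=4096      # weight of the normal task on the host
lxc=4096       # /sys/fs/cgroup/cpu/lxc/cpu.shares
ex1=1024       # lxc-example
ex2=2048       # lxc-example2
ex3=1024       # lxc-example3

# On a single CPU the "lxc" group first competes with the host task...
lxc_pct=$(( lxc * 100 / (host + lxc) ))
# ...then the children split the group's slice in the ratio 1:2:1.
children=$(( ex1 + ex2 + ex3 ))
ex2_pct=$(( lxc_pct * ex2 / children ))
echo "lxc group: ${lxc_pct}%  lxc-example2: ${ex2_pct}%"   # lxc group: 50%  lxc-example2: 25%
```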
lxc.cgroup.cpu.cfs_quota_us
lxc.cgroup.cpu.cfs_period_us
# this applies to normal task scheduling.
#these two parameters define the upper limit of CPU usage for your container (whether you configure one CPU or multiple CPUs for it);
# for example, if lxc.cgroup.cpu.cfs_quota_us=60000 and lxc.cgroup.cpu.cfs_period_us=100000, the upper limit of CPU usage for this container is 60% of each period.
#please note that even if no tasks other than those in your container are running, your container can only reach 60% CPU usage (this is an upper limit, whether you configure one CPU or multiple CPUs for the container).
# please note: if /sys/fs/cgroup/cpu/lxc/cpu.cfs_quota_us = -1, there is no limit on CPU usage, and you can then set lxc.cgroup.cpu.cfs_quota_us of your container to any value (-1 or any positive value);
# the default value of /sys/fs/cgroup/cpu/lxc/cpu.cfs_period_us is 100000;
#configuration examples:
#when /sys/fs/cgroup/cpu/lxc/cpu.cfs_quota_us = -1, /sys/fs/cgroup/cpu/lxc/cpu.cfs_period_us = 100000, and lxc.cgroup.cpu.cfs_period_us = 100000:
# 1, only a single CPU assigned to your container
1> lxc.cgroup.cpu.cfs_quota_us= 50000, the upper limit is 50%
2> lxc.cgroup.cpu.cfs_quota_us= 80000, the upper limit is 80%
3> lxc.cgroup.cpu.cfs_quota_us= 100000, the upper limit is 100%
4> lxc.cgroup.cpu.cfs_quota_us= 200000, the upper limit is 100% (as only one CPU is available)
5> lxc.cgroup.cpu.cfs_quota_us= 4000000, the upper limit is 100% (as only one CPU is available)
# 2, two CPUs assigned to your container
1> lxc.cgroup.cpu.cfs_quota_us= 50000, the upper limit is 50%
2> lxc.cgroup.cpu.cfs_quota_us= 80000, the upper limit is 80%
3> lxc.cgroup.cpu.cfs_quota_us= 100000, the upper limit is 100%
4> lxc.cgroup.cpu.cfs_quota_us= 150000, the upper limit is 150%
5> lxc.cgroup.cpu.cfs_quota_us= 200000, the upper limit is 200%
6> lxc.cgroup.cpu.cfs_quota_us= 4000000, the upper limit is 200% (as only two CPUs are available)
when /sys/fs/cgroup/cpu/lxc/cpu.cfs_quota_us = 80000, /sys/fs/cgroup/cpu/lxc/cpu.cfs_period_us = 100000, and lxc.cgroup.cpu.cfs_period_us = 100000:
#1, only a single CPU assigned to your container
1> lxc.cgroup.cpu.cfs_quota_us= 50000, the upper limit is 50%
2> lxc.cgroup.cpu.cfs_quota_us= 80000, the upper limit is 80%
3> lxc.cgroup.cpu.cfs_quota_us= 100000, your container can't be started because the quota exceeds the parent "lxc" quota
4> lxc.cgroup.cpu.cfs_quota_us= 200000, your container can't be started because the quota exceeds the parent "lxc" quota
5> lxc.cgroup.cpu.cfs_quota_us= 4000000, your container can't be started because the quota exceeds the parent "lxc" quota
#2, two CPUs assigned to your container
1> lxc.cgroup.cpu.cfs_quota_us= 50000, the upper limit is 50%
2> lxc.cgroup.cpu.cfs_quota_us= 80000, the upper limit is 80%
3> lxc.cgroup.cpu.cfs_quota_us= 100000, your container can't be started because the quota exceeds the parent "lxc" quota
4> lxc.cgroup.cpu.cfs_quota_us= 150000, your container can't be started because the quota exceeds the parent "lxc" quota
5> lxc.cgroup.cpu.cfs_quota_us= 200000, your container can't be started because the quota exceeds the parent "lxc" quota
6> lxc.cgroup.cpu.cfs_quota_us= 4000000, your container can't be started because the quota exceeds the parent "lxc" quota
# according to the examples above, there are constraints when configuring these two parameters: the resources of a lower layer (/sys/fs/cgroup/cpu/lxc/lxc-example/cpu.cfs_quota_us) can't exceed those of the upper layer (when /sys/fs/cgroup/cpu/lxc/cpu.cfs_quota_us is a positive value, not "-1"):
# 1, /sys/fs/cgroup/cpu/lxc/cpu.cfs_period_us <= /sys/fs/cgroup/cpu/lxc/lxc-example/cpu.cfs_period_us
# 2, /sys/fs/cgroup/cpu/lxc/cpu.cfs_quota_us >= /sys/fs/cgroup/cpu/lxc/lxc-example/cpu.cfs_quota_us
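The tables above boil down to: upper limit = min(cfs_quota_us / cfs_period_us, number of assigned CPUs) × 100%. A small shell sketch with illustrative values (nothing is read from a live system):

```shell
# Illustrative quota/period/cpuset values, as in the two-CPU examples above.
period=100000
quota=150000
ncpus=2                                  # CPUs assigned to the container

cap=$(( quota * 100 / period ))          # raw quota as a percentage
max=$(( ncpus * 100 ))                   # can never exceed the assigned CPUs
[ "$cap" -gt "$max" ] && cap=$max
echo "upper limit: ${cap}%"              # upper limit: 150%
```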
lxc.cgroup.cpu.rt_runtime_us
lxc.cgroup.cpu.rt_period_us
# similar to the CFS parameters above, but these apply to real-time task scheduling; they define the upper limit of CPU time that RT processes can occupy in one period
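As an illustration (the values below are hypothetical, not defaults), a config fragment capping RT tasks at half of each one-second period might look like:

```
lxc.cgroup.cpu.rt_period_us = 1000000    # period: 1 second
lxc.cgroup.cpu.rt_runtime_us = 500000    # RT tasks may run at most 0.5s per period
```

As with the CFS pair, with RT group scheduling the child's runtime budget must fit inside the upper layer's budget.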
cpuset subsystem:
lxc.cgroup.cpuset.cpus
#it sets the CPUs for the container; its value can be:
# 1, 2 (cpu2)
# 2, 1,3-5 (cpu1,3,4,5)
# by the way, the default cpuset.cpus configuration of the upper layers is:
# 1, /sys/fs/cgroup/cpuset/cpuset.cpus=0-15 (default contains all cpus)
# 2, /sys/fs/cgroup/cpuset/lxc/cpuset.cpus=0-15 (default contains all cpus)
lxc.cgroup.cpuset.cpu_exclusive
#allocates the specified CPU exclusively to this container; other containers can't use this CPU, but the host can
# for example, if “lxc.cgroup.cpuset.cpus=2 and lxc.cgroup.cpuset.cpu_exclusive=1”, CPU2 can be used only by the host and this container.
#If we want CPU2 to be usable only by this container, we need to isolate CPU2 on the host's kernel command line (isolcpus=2)
#please note: before configuring lxc.cgroup.cpuset.cpu_exclusive=1 for this container, we need to make sure the upper layers also set cpu_exclusive=1:
# 1, /sys/fs/cgroup/cpuset/cpuset.cpu_exclusive=1 (default is 1)
# 2, /sys/fs/cgroup/cpuset/lxc/cpuset.cpu_exclusive=1 (default is 0)
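Putting the two options together, a hypothetical fragment that pins the container exclusively to CPU2 (assuming the upper-layer cpu_exclusive flags are set as described above) could look like:

```
lxc.cgroup.cpuset.cpus = 2             # run the container on CPU2 only
lxc.cgroup.cpuset.cpu_exclusive = 1    # no other container may take CPU2
```

Add isolcpus=2 to the host kernel command line as well if the host scheduler should also stay off CPU2.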
memory subsystem:
lxc.cgroup.memory.limit_in_bytes
#this defines the hard limit on the maximum memory the container can use (kernel_space + user_space). For example, if lxc.cgroup.memory.limit_in_bytes=200M, “malloc” can only obtain 200M of memory;
# if = -1, there is no memory limit;
# if usage exceeds the limit and not enough inactive pages can be swapped out or written back for reuse, it triggers an OOM and the corresponding task is killed (lxc.cgroup.memory.oom_control=0).
# even if the task is not killed (lxc.cgroup.memory.oom_control=1), the corresponding task may stay in the “D” state for a long time.
# So memory.limit_in_bytes and memory.memsw.limit_in_bytes below are a safety net that limits memory usage to avoid running out of memory when some task of this container is in an abnormal state.
# So if we estimate the container may use at most 1G of memory, we can set lxc.cgroup.memory.limit_in_bytes=1.4G
lxc.cgroup.memory.memsw.limit_in_bytes
# the limit for maximum memory (above) + swap
# if = -1, there is no limit
lxc.cgroup.memory.oom_control
#controls whether to kill a task when an OOM happens
#default is 0 (OOM killer enabled); if set to 1, the OOM killer is disabled.
lxc.cgroup.memory.force_empty
#the scenario for this parameter: “echo 0 > memory.force_empty” triggers page reclaim just before the cgroup is destroyed (when no task is left in the cgroup).
lxc.cgroup.memory.swappiness
#default is 60; roughly speaking, when memory usage reaches 100-60=40%, the kernel starts to use the swap partition.
lxc.cgroup.memory.kmem.limit_in_bytes
#defines the maximum memory for kernel space; there are three usages:
# 1, memory.kmem.limit_in_bytes < memory.limit_in_bytes #precisely limit the memory size for kernel space
# 2, memory.kmem.limit_in_bytes >= memory.limit_in_bytes #only the total memory for kernel_space + user_space matters
# 3, memory.kmem.limit_in_bytes == unlimited #the kernel memory size doesn't matter
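As a concrete sketch of the advice above (all values are illustrative), a container expected to peak around 1G could be configured as:

```
lxc.cgroup.memory.limit_in_bytes = 1400M        # hard limit with headroom over the 1G estimate
lxc.cgroup.memory.memsw.limit_in_bytes = 1600M  # memory + swap (must be >= limit_in_bytes)
lxc.cgroup.memory.oom_control = 0               # keep the OOM killer enabled
```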
devices subsystem:
lxc.cgroup.devices.deny
lxc.cgroup.devices.allow
1, the container can only use the specified devices:
lxc.cgroup.devices.deny = a #"a" means the rule applies to all devices (all char and block devices), which are denied
# Allow any mknod (but not use of the node, because there is no "rw" permission)
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
# /dev/null and zero device
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
# consoles device
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm
2, the container can't use the specified devices:
lxc.cgroup.devices.allow = a #"a" means the rule applies to all devices (all char and block devices), which are allowed
# /dev/tty63 is denied
lxc.cgroup.devices.deny = c 4:63 rwm
#the meaning of the key parameters:
#c: char device, b: block device, r: read, w: write, m: create (e.g., mknod)
#5:1=major:minor
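To find the major:minor pair for a rule, `stat` can print a device node's numbers (in hexadecimal); for example, /dev/null is char device 1:3 on Linux:

```shell
# Print major:minor (in hex) of a device node; on Linux /dev/null is 1:3.
stat -c '%t:%T' /dev/null    # prints 1:3
```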