Chapter 24. Using cgroups-v2 to control distribution of CPU time for applications
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/using-cgroups-v2-to-control-distribution-of-cpu-time-for-applications_managing-monitoring-and-updating-the-kernel
Prevent resource exhaustion by placing applications into control groups version 2 (cgroups-v2). By configuring CPU limits for these groups, you can regulate CPU consumption and ensure system stability.
You can regulate the distribution of CPU time allocated to a control group in two ways:
Setting CPU bandwidth (editing the cpu.max controller file)
Setting CPU weight (editing the cpu.weight controller file)
24.1. Mounting cgroups-v2
RHEL 8 mounts cgroups-v1 by default. Configure the system manually to use cgroups-v2 for resource limiting. You can use systemd to control the resource usage. In special cases, you must manually configure cgroups, such as when you use cgroups-v1 controllers that have no cgroups-v2 equivalents.
Prerequisites
You have root permissions.
Procedure
Configure the system to mount cgroups-v2 by default during system boot by the systemd system and service manager:
grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=1"
This adds the necessary kernel command-line parameter to the current boot entry.
To add the systemd.unified_cgroup_hierarchy=1 parameter to all kernel boot entries:
grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
Reboot the system for the changes to take effect.
Verification
Verify the cgroups-v2 filesystem is mounted:
mount -l | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate)
The cgroups-v2 filesystem was successfully mounted on the /sys/fs/cgroup/ directory.
Inspect the contents of the /sys/fs/cgroup/ directory:
ll /sys/fs/cgroup/
-r--r--r--.  1 root root 0 Apr 29 12:03 cgroup.controllers
-rw-r--r--.  1 root root 0 Apr 29 12:03 cgroup.max.depth
-rw-r--r--.  1 root root 0 Apr 29 12:03 cgroup.max.descendants
-rw-r--r--.  1 root root 0 Apr 29 12:03 cgroup.procs
-r--r--r--.  1 root root 0 Apr 29 12:03 cgroup.stat
-rw-r--r--.  1 root root 0 Apr 29 12:18 cgroup.subtree_control
-rw-r--r--.  1 root root 0 Apr 29 12:03 cgroup.threads
-rw-r--r--.  1 root root 0 Apr 29 12:03 cpu.pressure
-r--r--r--.  1 root root 0 Apr 29 12:03 cpuset.cpus.effective
-r--r--r--.  1 root root 0 Apr 29 12:03 cpuset.mems.effective
-r--r--r--.  1 root root 0 Apr 29 12:03 cpu.stat
drwxr-xr-x.  2 root root 0 Apr 29 12:03 init.scope
-rw-r--r--.  1 root root 0 Apr 29 12:03 io.pressure
-r--r--r--.  1 root root 0 Apr 29 12:03 io.stat
-rw-r--r--.  1 root root 0 Apr 29 12:03 memory.pressure
-r--r--r--.  1 root root 0 Apr 29 12:03 memory.stat
drwxr-xr-x. 69 root root 0 Apr 29 12:03 system.slice
drwxr-xr-x.  3 root root 0 Apr 29 12:18 user.slice
The /sys/fs/cgroup/ directory, also called the root control group, provides by default the interface files (starting with cgroup) and controller-specific files such as cpuset.cpus.effective. It also contains directories related to systemd, such as /sys/fs/cgroup/init.scope, /sys/fs/cgroup/system.slice, and /sys/fs/cgroup/user.slice.
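The mount check can also be scripted. The following is a minimal sketch, not part of the original procedure. The function names cgroup_fs_version and has_unified_param are hypothetical, and the sketch assumes GNU stat, which reports the file-system type cgroup2fs for a cgroups-v2 mount and tmpfs for the cgroups-v1 layout.

```shell
#!/bin/sh
# Map a file-system type name (as printed by `stat -fc %T`) to a cgroup version label.
cgroup_fs_version() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Return 0 if the given kernel command-line file contains the parameter
# that the procedure adds with grubby.
has_unified_param() {
    grep -qwF 'systemd.unified_cgroup_hierarchy=1' "$1"
}

# Report the state of the running system; falls back to "none" if stat fails.
fs_type=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo none)
echo "cgroup hierarchy: $(cgroup_fs_version "$fs_type")"

if [ -r /proc/cmdline ] && has_unified_param /proc/cmdline; then
    echo "kernel parameter: present"
else
    echo "kernel parameter: not present"
fi
```

If the parameter is present but the hierarchy is still reported as v1, the system has most likely not been rebooted since the grubby change.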
Additional resources
cgroups(7), sysfs(5) manual pages
24.2. Preparing the cgroup for distribution of CPU time
Enable CPU controllers and create dedicated control groups to manage application CPU consumption. For better organisation, establish at least two levels of child control groups within the /sys/fs/cgroup/ directory.
Prerequisites
You have root permissions.
You have identified PIDs of processes that you want to control.
You have mounted the cgroups-v2 file system. For more information, see Mounting cgroups-v2.
Procedure
Identify the process IDs (PIDs) of applications whose CPU consumption you want to constrain:
top
Tasks: 104 total,   3 running, 101 sleeping,   0 stopped,   0 zombie
%Cpu(s): 17.6 us, 81.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.8 hi,  0.0 si,  0.0 st
MiB Mem :   3737.4 total,   3312.7 free,    133.3 used,    291.4 buff/cache
MiB Swap:   4060.0 total,   4060.0 free,      0.0 used.   3376.1 avail Mem
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  34578 root      20   0   18720   1756   1468 R  99.0   0.0   0:31.09 sha1sum
  34579 root      20   0   18720   1772   1480 R  99.0   0.0   0:30.54 sha1sum
      1 root      20   0  186192  13940   9500 S   0.0   0.4   0:01.60 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.01 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
...
The example output shows that PID 34578 and PID 34579 (two illustrative sha1sum processes) each consume about 99% of CPU time. Both are example applications used to demonstrate the cgroups-v2 functionality.
Verify that the cpu and cpuset controllers are available in the /sys/fs/cgroup/cgroup.controllers file:
cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma
Enable CPU-related controllers:
echo "+cpu" >> /sys/fs/cgroup/cgroup.subtree_control
echo "+cpuset" >> /sys/fs/cgroup/cgroup.subtree_control
These commands enable the cpu and cpuset controllers for the immediate child groups of the /sys/fs/cgroup/ root control group. A child group is where you can specify processes and apply control checks to each of the processes based on your criteria.
You can review the cgroup.subtree_control file at any level to identify the controllers that can be enabled in the immediate child group.
Note: By default, the /sys/fs/cgroup/cgroup.subtree_control file in the root control group contains memory and pids controllers.
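Writing a controller name that the parent does not offer to cgroup.subtree_control fails, so it can help to check cgroup.controllers first. The following is a minimal sketch of that check. The helper name enable_controller is hypothetical, and the cgroup directory is passed as an argument so that you can try the function against a scratch directory before you run it as root on /sys/fs/cgroup.

```shell
#!/bin/sh
# enable_controller DIR NAME
# Enable controller NAME for the children of the cgroup at DIR, but only if
# DIR/cgroup.controllers lists NAME as available.
enable_controller() {
    dir=$1
    name=$2
    # cgroup.controllers is a single space-separated line; compare whole words
    # so that "cpu" does not match "cpuset".
    if ! tr ' ' '\n' < "$dir/cgroup.controllers" | grep -qxF "$name"; then
        echo "controller '$name' is not available in $dir" >&2
        return 1
    fi
    echo "+$name" >> "$dir/cgroup.subtree_control"
}

# Example (run as root on a cgroups-v2 system):
#   enable_controller /sys/fs/cgroup cpu
#   enable_controller /sys/fs/cgroup cpuset
```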
Create the /sys/fs/cgroup/Example/ directory:
mkdir /sys/fs/cgroup/Example/
The /sys/fs/cgroup/Example/ directory defines a child group. Also, the previous step enabled the cpu and cpuset controllers for this child group.
When you create the /sys/fs/cgroup/Example/ directory, some cgroups-v2 interface files and cpu and cpuset controller-specific files are automatically created in the directory. The /sys/fs/cgroup/Example/ directory also provides controller-specific files for the memory and pids controllers.
Optional: Inspect the newly created child control group:
ll /sys/fs/cgroup/Example/
-r--r--r--. 1 root root 0 Jun  1 10:33 cgroup.controllers
-r--r--r--. 1 root root 0 Jun  1 10:33 cgroup.events
-rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.freeze
-rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.max.depth
-rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.max.descendants
-rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.procs
-r--r--r--. 1 root root 0 Jun  1 10:33 cgroup.stat
-rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.subtree_control
…
-rw-r--r--. 1 root root 0 Jun  1 10:33 cpuset.cpus
-r--r--r--. 1 root root 0 Jun  1 10:33 cpuset.cpus.effective
-rw-r--r--. 1 root root 0 Jun  1 10:33 cpuset.cpus.partition
-rw-r--r--. 1 root root 0 Jun  1 10:33 cpuset.mems
-r--r--r--. 1 root root 0 Jun  1 10:33 cpuset.mems.effective
-r--r--r--. 1 root root 0 Jun  1 10:33 cpu.stat
-rw-r--r--. 1 root root 0 Jun  1 10:33 cpu.weight
-rw-r--r--. 1 root root 0 Jun  1 10:33 cpu.weight.nice
…
-r--r--r--. 1 root root 0 Jun  1 10:33 memory.events.local
-rw-r--r--. 1 root root 0 Jun  1 10:33 memory.high
-rw-r--r--. 1 root root 0 Jun  1 10:33 memory.low
…
-r--r--r--. 1 root root 0 Jun  1 10:33 pids.current
-r--r--r--. 1 root root 0 Jun  1 10:33 pids.events
-rw-r--r--. 1 root root 0 Jun  1 10:33 pids.max
The example output shows files such as cpuset.cpus and cpu.max. These files are specific to the cpuset and cpu controllers. The cpuset and cpu controllers are manually enabled for the root’s (/sys/fs/cgroup/) direct child control groups using the /sys/fs/cgroup/cgroup.subtree_control file.
The directory also includes general cgroup control interface files such as cgroup.procs or cgroup.controllers, which are common to all control groups, regardless of enabled controllers.
The files such as memory.high and pids.max relate to the memory and pids controllers, which are in the root control group (/sys/fs/cgroup/), and are always enabled by default.
By default, the newly created child group inherits access to all of the system’s CPU and memory resources, without any limits.
Enable the CPU-related controllers in /sys/fs/cgroup/Example/ to obtain controllers that are relevant only to CPU:
echo "+cpu" >> /sys/fs/cgroup/Example/cgroup.subtree_control
echo "+cpuset" >> /sys/fs/cgroup/Example/cgroup.subtree_control
These commands ensure that the immediate child control group has only the controllers relevant to regulating CPU time distribution, not the memory or pids controllers.
Create the /sys/fs/cgroup/Example/tasks/ directory:
mkdir /sys/fs/cgroup/Example/tasks/
The /sys/fs/cgroup/Example/tasks/ directory defines a child group with files that relate purely to cpu and cpuset controllers.
Optional: Inspect another child control group:
ll /sys/fs/cgroup/Example/tasks
-r--r--r--. 1 root root 0 Jun  1 11:45 cgroup.controllers
-r--r--r--. 1 root root 0 Jun  1 11:45 cgroup.events
-rw-r--r--. 1 root root 0 Jun  1 11:45 cgroup.freeze
-rw-r--r--. 1 root root 0 Jun  1 11:45 cgroup.max.depth
-rw-r--r--. 1 root root 0 Jun  1 11:45 cgroup.max.descendants
-rw-r--r--. 1 root root 0 Jun  1 11:45 cgroup.procs
-r--r--r--. 1 root root 0 Jun  1 11:45 cgroup.stat
-rw-r--r--. 1 root root 0 Jun  1 11:45 cgroup.subtree_control
-rw-r--r--. 1 root root 0 Jun  1 11:45 cgroup.threads
-rw-r--r--. 1 root root 0 Jun  1 11:45 cgroup.type
-rw-r--r--. 1 root root 0 Jun  1 11:45 cpu.max
-rw-r--r--. 1 root root 0 Jun  1 11:45 cpu.pressure
-rw-r--r--. 1 root root 0 Jun  1 11:45 cpuset.cpus
-r--r--r--. 1 root root 0 Jun  1 11:45 cpuset.cpus.effective
-rw-r--r--. 1 root root 0 Jun  1 11:45 cpuset.cpus.partition
-rw-r--r--. 1 root root 0 Jun  1 11:45 cpuset.mems
-r--r--r--. 1 root root 0 Jun  1 11:45 cpuset.mems.effective
-r--r--r--. 1 root root 0 Jun  1 11:45 cpu.stat
-rw-r--r--. 1 root root 0 Jun  1 11:45 cpu.weight
-rw-r--r--. 1 root root 0 Jun  1 11:45 cpu.weight.nice
-rw-r--r--. 1 root root 0 Jun  1 11:45 io.pressure
-rw-r--r--. 1 root root 0 Jun  1 11:45 memory.pressure
Ensure the processes that you want to control for CPU time compete on the same CPU:
echo "1" > /sys/fs/cgroup/Example/tasks/cpuset.cpus
This ensures that the processes you place in the Example/tasks child control group compete on the same CPU. This setting is important for the cpu controller to activate.
Important: The cpu controller is only activated if the relevant child control group has at least 2 processes to compete for time on a single CPU.
Verification
Optional: Ensure the CPU-related controllers are enabled for the immediate children cgroups:
cat /sys/fs/cgroup/cgroup.subtree_control /sys/fs/cgroup/Example/cgroup.subtree_control
cpuset cpu memory pids
cpuset cpu
Optional: Ensure the processes that you want to control for CPU time compete on the same CPU:
cat /sys/fs/cgroup/Example/tasks/cpuset.cpus
1
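The two verification checks can be combined into a single test. The following is a minimal sketch under stated assumptions. The function names lists_controller, is_single_cpu, and check_cpu_prep are hypothetical. The sketch treats a cpuset.cpus value as "one CPU" only when it is a plain number, so ranges such as 0-3 and lists such as 0,2 are rejected.

```shell
#!/bin/sh
# Return 0 if FILE (for example cgroup.subtree_control) lists controller NAME.
lists_controller() {
    tr ' ' '\n' < "$1" | grep -qxF "$2"
}

# Return 0 if a cpuset.cpus value names exactly one CPU (for example "1"),
# and 1 for an empty value, a range ("0-3"), or a list ("0,2").
is_single_cpu() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# check_cpu_prep PARENT CHILD
# PARENT must pass cpu and cpuset down to its children, and CHILD must be
# pinned to a single CPU.
check_cpu_prep() {
    parent=$1
    child=$2
    lists_controller "$parent/cgroup.subtree_control" cpu    || return 1
    lists_controller "$parent/cgroup.subtree_control" cpuset || return 1
    is_single_cpu "$(cat "$child/cpuset.cpus")"
}

# Example (paths from this procedure):
#   check_cpu_prep /sys/fs/cgroup/Example /sys/fs/cgroup/Example/tasks && echo "ready"
```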
Additional resources
Introducing control groups
Introducing kernel resource controllers
Mounting cgroups-v2
cgroups(7), sysfs(5) manual pages
24.3. Controlling distribution of CPU time for applications by adjusting CPU bandwidth
To regulate how CPU time is distributed to applications under a specific cgroup tree, assign values to the relevant files of the cpu controller.
Prerequisites
You have root permissions.
You have at least two applications for which you want to control distribution of CPU time.
You ensured the relevant applications compete for CPU time on the same CPU as described in Preparing the cgroup for distribution of CPU time.
You mounted the cgroups-v2 filesystem as described in Mounting cgroups-v2.
You enabled the cpu and cpuset controllers both in the parent control group and in the child control group, as described in Preparing the cgroup for distribution of CPU time.
You created two levels of child control groups inside the /sys/fs/cgroup/ root control group as in the example below:
…
  ├── Example
  │   ├── tasks
…
Procedure
Configure CPU bandwidth to achieve resource restrictions within the control group:
echo "200000 1000000" > /sys/fs/cgroup/Example/tasks/cpu.max
The first value is the allowed time quota in microseconds for which all processes collectively in a child group can run during one period. The second value specifies the length of the period.
During a single period, when processes in a control group collectively exhaust the time specified by this quota, they are throttled for the remainder of the period and not allowed to run until the next period.
This command sets CPU time distribution controls so that all processes collectively in the /sys/fs/cgroup/Example/tasks child group can run on the CPU for only 0.2 seconds of every 1 second. That is, one fifth of each second.
Optional: Verify the time quotas:
cat /sys/fs/cgroup/Example/tasks/cpu.max
200000 1000000
Add the applications’ PIDs to the Example/tasks child group:
echo "34578" > /sys/fs/cgroup/Example/tasks/cgroup.procs
echo "34579" > /sys/fs/cgroup/Example/tasks/cgroup.procs
The example commands ensure that required applications become members of the Example/tasks child group and do not exceed the CPU time distribution configured for this child group.
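The quota arithmetic in this procedure can be checked with a few lines of shell. The following is a minimal sketch. The function name cpu_max_percent is hypothetical, and the sketch assumes the two-field "QUOTA PERIOD" format of cpu.max, where the kernel default quota of max means no limit.

```shell
#!/bin/sh
# cpu_max_percent "QUOTA PERIOD"
# Print the share of one CPU, in percent, that a cpu.max value allows.
cpu_max_percent() {
    quota=${1%% *}    # text before the first space
    period=${1##* }   # text after the last space
    if [ "$quota" = "max" ]; then
        echo "unlimited"
    else
        echo $(( quota * 100 / period ))
    fi
}

# Value written in the procedure: 200000 microseconds of every 1000000.
group_pct=$(cpu_max_percent "200000 1000000")
echo "group limit: ${group_pct}%"

# Two equally busy processes split the group quota evenly.
echo "per process: $(( group_pct / 2 ))%"
```

For the values in this procedure the sketch prints a group limit of 20% and a per-process share of 10%.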
Verification
Verify that the applications run in the specified control group:
cat /proc/34578/cgroup /proc/34579/cgroup
0::/Example/tasks
0::/Example/tasks
The output above shows the processes of the specified applications that run in the Example/tasks child group.
Inspect the current CPU consumption of the throttled applications:
top
top - 11:13:53 up 23:10,  1 user,  load average: 0.26, 1.33, 1.66
Tasks: 104 total,   3 running, 101 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.0 us,  7.0 sy,  0.0 ni, 89.5 id,  0.0 wa,  0.2 hi,  0.2 si,  0.2 st
MiB Mem :   3737.4 total,   3312.6 free,    133.4 used,    291.4 buff/cache
MiB Swap:   4060.0 total,   4060.0 free,      0.0 used.   3376.0 avail Mem
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  34578 root      20   0   18720   1756   1468 R  10.0   0.0  37:36.13 sha1sum
  34579 root      20   0   18720   1772   1480 R  10.0   0.0  37:41.22 sha1sum
      1 root      20   0  186192  13940   9500 S   0.0   0.4   0:01.60 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.01 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
...
Notice that the CPU consumption for the PID 34578 and PID 34579 has decreased to 10%. The Example/tasks child group regulates its processes to 20% of the CPU time collectively. Since the control group contains 2 processes, each can use 10% of the CPU time.
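Besides top, the cpu.stat file of the control group records how often the quota was exhausted. The following is a minimal sketch that reads it. The function name throttled_percent is hypothetical, and the sketch assumes that cpu.stat contains the nr_periods and nr_throttled counters, which the kernel exposes when the cpu controller is enabled for the group.

```shell
#!/bin/sh
# throttled_percent FILE
# Print the percentage of enforcement periods in which the group was throttled,
# or "n/a" if no period has elapsed yet.
throttled_percent() {
    awk '
        $1 == "nr_periods"   { periods = $2 }
        $1 == "nr_throttled" { throttled = $2 }
        END {
            if (periods > 0) printf "%d\n", throttled * 100 / periods
            else print "n/a"
        }
    ' "$1"
}

# Report on the example group if it exists on this system.
stat_file=/sys/fs/cgroup/Example/tasks/cpu.stat
if [ -r "$stat_file" ]; then
    echo "throttled in $(throttled_percent "$stat_file")% of periods"
fi
```

A value close to 100 for the two sha1sum processes would be expected here, because both try to use far more CPU time than the 20% quota allows.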
24.4. Controlling distribution of CPU time for applications by adjusting CPU weight
To regulate how CPU time is distributed to applications under a specific cgroup tree, assign values to the relevant files of the cpu controller.
Prerequisites
You have root permissions.
You have applications for which you want to control distribution of CPU time.
You ensured the relevant applications compete for CPU time on the same CPU as described in Preparing the cgroup for distribution of CPU time.
You mounted the cgroups-v2 filesystem as described in Mounting cgroups-v2.
You created a two-level hierarchy of child control groups inside the /sys/fs/cgroup/ root control group as in the following example:
…
  ├── Example
  │   ├── g1
  │   ├── g2
  │   └── g3
…
You enabled the cpu and cpuset controllers in the parent control group and in the child control groups, as described in Preparing the cgroup for distribution of CPU time.
Procedure
Configure desired CPU weights to achieve resource restrictions within the control groups:
echo "150" > /sys/fs/cgroup/Example/g1/cpu.weight
echo "100" > /sys/fs/cgroup/Example/g2/cpu.weight
echo "50" > /sys/fs/cgroup/Example/g3/cpu.weight
Add the applications’ PIDs to the g1, g2, and g3 child groups:
echo "33373" > /sys/fs/cgroup/Example/g1/cgroup.procs
echo "33374" > /sys/fs/cgroup/Example/g2/cgroup.procs
echo "33377" > /sys/fs/cgroup/Example/g3/cgroup.procs
The example commands ensure that desired applications become members of the Example/g*/ child cgroups and will get their CPU time distributed as per the configuration of those cgroups.
The weights of the children cgroups (g1, g2, g3) that have running processes are summed up at the level of the parent cgroup (Example). The CPU resource is then distributed proportionally based on their weights.
As a result, when all processes run at the same time, the kernel allocates to each of them the proportionate CPU time based on their cgroup’s cpu.weight file:
Child cgroup    cpu.weight file    CPU time allocation
g1              150                ~50% (150/300)
g2              100                ~33% (100/300)
g3              50                 ~16% (50/300)
The value of the cpu.weight controller file is not a percentage.
If one process stopped running, leaving cgroup g2 with no running processes, the calculation would omit cgroup g2 and account only for the weights of cgroups g1 and g3:
Child cgroup    cpu.weight file    CPU time allocation
g1              150                ~75% (150/200)
g3              50                 ~25% (50/200)
Important: If a child cgroup has multiple running processes, the CPU time allocated to the cgroup is distributed equally among its member processes.
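The proportional calculation can be reproduced for any set of weights. The following is a minimal sketch. The function name weight_shares is hypothetical, and the sketch uses integer arithmetic, so the percentages are rounded down in the same way as the approximate values shown in this section.

```shell
#!/bin/sh
# weight_shares WEIGHT...
# For each cpu.weight value of a sibling cgroup that has running processes,
# print "WEIGHT PERCENT", where PERCENT is its share of the summed weights.
weight_shares() {
    total=0
    for w in "$@"; do
        total=$(( total + w ))
    done
    if [ "$total" -eq 0 ]; then
        echo "no weights given" >&2
        return 1
    fi
    for w in "$@"; do
        echo "$w $(( w * 100 / total ))"
    done
}

echo "g1, g2, and g3 all busy:"
weight_shares 150 100 50

echo "g2 idle, only g1 and g3 busy:"
weight_shares 150 50
```

The first call prints 50, 33, and 16 percent. The second call prints 75 and 25 percent.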
Verification
Verify that the applications run in the specified control groups:
cat /proc/33373/cgroup /proc/33374/cgroup /proc/33377/cgroup
0::/Example/g1
0::/Example/g2
0::/Example/g3
The command output shows the processes of the specified applications that run in the Example/g*/ child cgroups.
Inspect the current CPU consumption of the throttled applications:
top
top - 05:17:18 up 1 day, 18:25,  1 user,  load average: 3.03, 3.03, 3.00
Tasks:  95 total,   4 running,  91 sleeping,   0 stopped,   0 zombie
%Cpu(s): 18.1 us, 81.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
MiB Mem :   3737.0 total,   3233.7 free,    132.8 used,    370.5 buff/cache
MiB Swap:   4060.0 total,   4060.0 free,      0.0 used.   3373.1 avail Mem
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  33373 root      20   0   18720   1748   1460 R  49.5   0.0 415:05.87 sha1sum
  33374 root      20   0   18720   1756   1464 R  32.9   0.0 412:58.33 sha1sum
  33377 root      20   0   18720   1860   1568 R  16.3   0.0 411:03.12 sha1sum
    760 root      20   0  416620  28540  15296 S   0.3   0.7   0:10.23 tuned
      1 root      20   0  186328  14108   9484 S   0.0   0.4   0:02.00 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.01 kthread
...
Note: All processes run on a single CPU for clear illustration. The CPU weight applies the same principles when used on multiple CPUs.
Notice that the CPU resource for the PID 33373, PID 33374, and PID 33377 was allocated based on the 150, 100, and 50 weights you assigned to the child cgroups. The weights correspond to around 50%, 33%, and 16% allocation of CPU time for each application.
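The membership check from the verification can be scripted for any number of PIDs. The following is a minimal sketch. The function names cgroup_path and in_cgroup are hypothetical, and the sketch relies on the cgroups-v2 entry in /proc/PID/cgroup being the line that starts with 0::, as in the output shown in this section.

```shell
#!/bin/sh
# cgroup_path FILE
# Print the cgroups-v2 path recorded in a /proc/PID/cgroup style file.
# Lines that belong to cgroups-v1 hierarchies (non-zero IDs) are ignored.
cgroup_path() {
    sed -n 's/^0:://p' "$1"
}

# in_cgroup FILE EXPECTED
# Return 0 if the process described by FILE is a member of cgroup EXPECTED.
in_cgroup() {
    [ "$(cgroup_path "$1")" = "$2" ]
}

# Example: report where this script runs, if the file is readable.
if [ -r /proc/self/cgroup ]; then
    echo "this script runs in: $(cgroup_path /proc/self/cgroup)"
fi

# Example for the PIDs used in this section:
#   in_cgroup /proc/33373/cgroup /Example/g1 && echo "33373 is in g1"
```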