Cgroup2: Confusing Error When `linux.resources.cpu.shares=1`
Introduction
In containerized deployments, CPU is one of the key resources to manage. It can be allocated to containers through the `cpu-shares` parameter in the container configuration. On hosts that use cgroup2, the newer version of the Linux control group framework, an out-of-range value produces a confusing failure. This article looks at the error reported when `linux.resources.cpu.shares=1` is specified in the container configuration.
The Problem
When a container is created with a `cpu-shares` value that is out of range, the runc runtime reports an error. On a cgroup2 host, however, the message does not say that the `cpu-shares` configuration is at fault. Instead, it reports a "numerical result out of range" while writing `cpu.weight`, which is misleading.
Example Error Message
Here is an example of the error produced when running a container with an out-of-range `cpu-shares` value on a cgroup2 host:
$ docker run --rm --cpu-shares 1 hello-world
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: failed to write "70369281052672": write /sys/fs/cgroup/docker/4139b57a20695ff4f62dda799b1c4052791f59bac4611534cb3a213a3e446add/cpu.weight: numerical result out of range: unknown.
The message names the `cpu.weight` file and the value `70369281052672`, but nothing in it points to the `cpu-shares` setting as the cause.
Why This is a runc Bug
The runc runtime is responsible for mapping the `cpu-shares` value in the container configuration to a `cpu.weight` value for cgroups v2. The `ConvertCPUSharesToCgroupV2Value` function that performs this mapping is documented to take inputs only in the range [2, 262144]. Calling it with a value outside that range is therefore a bug in the caller, by definition.
runc calls the conversion function without first validating that its argument is within that range, so the bug lies in runc.
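The following Go sketch illustrates where the huge number in the error message comes from. It assumes the conversion is the linear mapping `1 + ((shares - 2) * 9999) / 262142` on unsigned 64-bit integers, which sends the documented input range [2, 262144] onto the cgroup v2 weight range [1, 10000]. The function name `convertShares` is illustrative; this is not a copy of runc's code.

```go
package main

import "fmt"

// convertShares mirrors the assumed cgroup v1 -> v2 mapping:
// shares in [2, 262144] map linearly onto weights in [1, 10000].
// Like the call described above, it performs no range check.
func convertShares(shares uint64) uint64 {
	// For shares < 2, the unsigned subtraction wraps around.
	return 1 + ((shares-2)*9999)/262142
}

func main() {
	for _, s := range []uint64{2, 1024, 262144, 1} {
		fmt.Printf("shares=%d -> weight=%d\n", s, convertShares(s))
	}
}
```

For the in-range inputs 2, 1024, and 262144 the sketch yields 1, 39, and 10000. For `shares=1`, the subtraction `shares-2` wraps around in unsigned arithmetic and the result is 70369281052672, the same value that appears in the error message. That is far above the maximum `cpu.weight` of 10000, so the kernel rejects the write with "numerical result out of range".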
Expected Behavior
The expected behavior is that runc returns an error that clearly flags the `cpu-shares` configuration as out of range, irrespective of the host system's configuration.
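One way to picture this behavior is a conversion that checks its input before doing any arithmetic. The sketch below is a minimal illustration, not a proposed patch: the name `sharesToWeight`, the error wording, and the linear formula are all assumptions.

```go
package main

import "fmt"

const (
	minShares = 2      // lower bound of the documented input range
	maxShares = 262144 // upper bound of the documented input range
)

// sharesToWeight is a hypothetical checked conversion, not runc's API.
// Assumption: callers skip it entirely when cpu shares is unset (0).
func sharesToWeight(shares uint64) (uint64, error) {
	if shares < minShares || shares > maxShares {
		// Name the configuration field so the user knows what to fix.
		return 0, fmt.Errorf("linux.resources.cpu.shares=%d is out of range [%d, %d]",
			shares, minShares, maxShares)
	}
	// Assumed linear mapping onto the cgroup v2 weight range [1, 10000].
	return 1 + ((shares-minShares)*9999)/(maxShares-minShares), nil
}

func main() {
	for _, s := range []uint64{1, 1024} {
		w, err := sharesToWeight(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("shares=%d -> weight=%d\n", s, w)
	}
}
```

With this shape, `shares=1` fails before any cgroup file is written, and the error names the configuration field rather than `cpu.weight`, so it reads the same on any host.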
Conclusion
In conclusion, the confusing error reported when `linux.resources.cpu.shares=1` is specified in the container configuration is a bug in the runc runtime. runc should validate the input range of the `cpu-shares` value and return an error that clearly identifies the problem.
Recommendations
To avoid this issue, it is recommended to:
- Always validate the input range of the `cpu-shares` value before passing it to the runc runtime.
- Use a value within the valid range [2, 262144] for the `cpu-shares` parameter in the container configuration.
- Update the runc runtime to validate the input range of the `cpu-shares` value and return an error message that clearly indicates the issue.
Future Work
In the future, it would be beneficial to:
- Improve the error message returned by the runc runtime so that it clearly identifies the `cpu-shares` configuration as the problem.
- Update the runc runtime to validate the input range of the `cpu-shares` value and return an error message that clearly indicates the issue.
References
- Runtime-Spec
- runc
- cgroups
- utils.go
Cgroup2: Confusing Error When `linux.resources.cpu.shares=1` (Q&A)
Q: What is the issue with the error message when `linux.resources.cpu.shares=1` is specified in the container configuration?
A: The error message does not say that the `cpu-shares` configuration is at fault. Instead, it reports a "numerical result out of range" while writing `cpu.weight`, which is misleading.
Q: Why is this a bug in the runc runtime?
A: The runc runtime is responsible for mapping the `cpu-shares` value in the container configuration to a `cpu.weight` value for cgroups v2. The `ConvertCPUSharesToCgroupV2Value` function that performs this mapping is documented to take inputs only in the range [2, 262144]. Calling it with a value outside that range is therefore a bug in the caller, by definition, and runc makes that call without validating the input.
Q: What is the expected behavior of the runc runtime?
A: The expected behavior is that runc returns an error that clearly flags the `cpu-shares` configuration as out of range, irrespective of the host system's configuration.
Q: How can I avoid this issue?
A: To avoid this issue, it is recommended to:
- Always validate the input range of the `cpu-shares` value before passing it to the runc runtime.
- Use a value within the valid range [2, 262144] for the `cpu-shares` parameter in the container configuration.
- Update the runc runtime to validate the input range of the `cpu-shares` value and return an error message that clearly indicates the issue.
Q: What are the implications of this bug?
A: The bug makes problems with CPU resource allocation in containers harder to debug, because the error points at `cpu.weight` rather than at the `cpu-shares` setting. It can also lead to incorrect assumptions about how the runc runtime behaves.
Q: How can I report this bug?
A: You can report this bug by filing an issue on the runc GitHub repository. Please provide a clear and concise description of the issue, including any relevant logs or error messages.
Q: What is the current status of this bug?
A: At the time of writing, this is acknowledged as a bug in the runc runtime but has not yet been fixed. We recommend watching the runc GitHub repository for updates.
Q: What are the next steps for resolving this bug?
A: The next steps for resolving this bug are to:
- Update the runc runtime to validate the input range of the `cpu-shares` value and return an error message that clearly indicates the issue.
- Test the updated runc runtime to ensure that it correctly handles out-of-range `cpu-shares` values.
- Release the updated runc runtime to the public.
Q: How can I get involved in resolving this bug?
A: You can get involved in resolving this bug by:
- Filing an issue on the runc GitHub repository to report the bug.
- Contributing to the runc runtime by updating the code to validate the input range of the `cpu-shares` value.
- Testing the updated runc runtime to ensure that it correctly handles out-of-range `cpu-shares` values.
Q: What are the benefits of resolving this bug?
A: The benefits of resolving this bug are that it will:
- Improve the accuracy and clarity of error messages returned by the runc runtime.
- Reduce the complexity and difficulty of debugging issues related to CPU resource allocation in containers.
- Enhance the overall reliability and stability of the runc runtime.