Your problem is likely caused by the interplay of three things: the number of threads, the CPU limit, and the number of vCPUs (allocatable CPUs) on the K8s node.

K8s CPU limits are enforced by the Linux kernel's CFS scheduler through cgroups, specifically by CFS CPU bandwidth control (also known as the CFS quota).

The CPU time consumed by a cgroup (i.e. a container in K8s) is accounted within each CFS quota period, a wall-clock window of 100 ms (milliseconds) by default. The K8s limit (i.e. the CFS quota) is likewise enforced within each such period.
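
To make the accounting concrete, here is a minimal Python sketch (the function name and defaults are illustrative, not kubelet code) of how a CPU limit expressed in millicores translates into a CFS quota for the default 100 ms period:

```python
# Minimal sketch (not the kubelet's code): how a K8s CPU limit in millicores
# maps onto a CFS quota for the default 100 ms period.

CFS_PERIOD_US = 100_000  # default cpu.cfs_period_us: 100 ms

def cfs_quota_us(limit_millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CFS quota in microseconds for a given CPU limit (1000m == 1 full CPU)."""
    # 1 full CPU may use one whole period's worth of CPU time per period.
    return limit_millicores * period_us // 1000

if __name__ == "__main__":
    for limit in (500, 1000, 4000):
        quota = cfs_quota_us(limit)
        print(f'limit "{limit}m" -> quota {quota} us per {CFS_PERIOD_US} us period')
    # limit "4000m" -> quota 400000 us per 100000 us period,
    # i.e. 400 ms of CPU time per 100 ms of wall-clock time
```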

So, if your container has a limit of 1 CPU, it may consume at most 100 ms of CPU time during each 100 ms CFS period. Accordingly, if your CPU limit is "4000m" (4000 millicores = 4 vCPUs), it may consume no more than 400 ms of CPU time per 100 ms of wall-clock time (roughly speaking, 4 CPUs' worth of time every second), e.g. 4 threads running in parallel on different CPUs. But what if there are more than 4 threads and more than 4 available CPUs? See the picture:

(Figure: CPU usage of a containerized multi-task app)

During just 20 ms of wall-clock time, 20 threads running in parallel (if there are >= 20 vCPUs on the node) will together consume the entire 400 ms CPU quota (20 parallel threads * 20 ms of CPU time each) that comes from the K8s CPU limit. For the remaining 80 ms of the period the workload (the container's processes and threads) is throttled and gets no CPU time at all, which may explain high latency and unresponsiveness.
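
The same arithmetic can be written out as a minimal Python sketch of the wall-clock split within one period (hypothetical function name, assuming every thread stays runnable and runs on its own vCPU):

```python
# Minimal sketch of the throttling arithmetic above (hypothetical function,
# assuming all threads run fully in parallel on separate vCPUs).

def busy_and_throttled_ms(threads: int, quota_ms: float, period_ms: float = 100.0):
    """Wall-clock split of one CFS period: (time with CPU, time throttled)."""
    # With `threads` running in parallel, the quota is burned `threads` times
    # faster than wall-clock time advances.
    busy_ms = min(period_ms, quota_ms / threads)
    return busy_ms, period_ms - busy_ms

if __name__ == "__main__":
    # K8s limit "4000m" -> 400 ms of CPU time per 100 ms period.
    busy, throttled = busy_and_throttled_ms(threads=20, quota_ms=400.0)
    print(f"busy {busy:.0f} ms, throttled {throttled:.0f} ms per 100 ms period")
    # busy 20 ms, throttled 80 ms per 100 ms period
    # -> the workload has no CPU time for 80% of every period
```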

The picture (adjusted) is taken from this article: https://medium.com/omio-engineering/cpu-limits-and-aggressive-throttling-in-kubernetes-c5b20bd8a718

For more details on CPU limits and multi-tasking (multiple processes or threads within a container), also see: https://aws.amazon.com/blogs/containers/using-prometheus-to-avoid-disasters-with-kubernetes-cpu-limits/
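
If you already scrape cAdvisor metrics with Prometheus (as the AWS post describes), the throttling can be quantified from container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total. A minimal sketch, assuming a Prometheus server at the placeholder URL below and a placeholder pod label:

```python
# Minimal sketch: fraction of CFS periods in which a container hit its quota,
# queried from Prometheus. The server URL and pod name are placeholders.
import requests

PROM_URL = "http://prometheus.example:9090/api/v1/query"  # placeholder
QUERY = (
    'sum(rate(container_cpu_cfs_throttled_periods_total{pod="my-app"}[5m])) / '
    'sum(rate(container_cpu_cfs_periods_total{pod="my-app"}[5m]))'
)

def throttle_ratio() -> float:
    """Share of CFS periods (over the last 5 minutes) that were throttled."""
    resp = requests.get(PROM_URL, params={"query": QUERY}, timeout=10)
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else 0.0

if __name__ == "__main__":
    print(f"throttled period ratio: {throttle_ratio():.1%}")
```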

Posted by: ALZ