Cannot limit CPU in OpenNebula

We are running OpenNebula 5.8 and have a problem with CPU limiting. The VM exceeds its assigned CPU limit, as shown in the attached image. We have already added cgroups, but the behavior did not improve. How can we solve this issue?

image

Hi Mosharaf,

Have you tried this fix? It covers a broken libvirt upgrade that you need to revert:

Broken cpuset.cpus cgroups in libvirt 4.5.0-23 (CentOS 7.7.1908)

It might solve your problem…

Hi all,

I don't think that is the case. The linked fix is for the situation where libvirt fails to start the VMs at all.
IMHO this is a different case.

@mosharaflink do you oversubscribe the CPUs? Could you check the CPU steal time in the VM? Run top in the VM and look at the last column (st) of the %Cpu(s): line.
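If you prefer a number over watching top, here is a minimal Ruby sketch that derives the same steal percentage from two samples of /proc/stat inside a Linux guest. The helper names (`parse_cpu_line`, `steal_percent`, `measure_steal`) are made up for illustration and are not part of OpenNebula. It assumes the usual field order of the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal).

```ruby
# Counter names for the first eight fields of the aggregate "cpu" line
# in /proc/stat (assumed order: user nice system idle iowait irq softirq steal).
FIELDS = %i[user nice system idle iowait irq softirq steal].freeze

# Parse the aggregate "cpu" line from the text of /proc/stat into a hash
# of tick counters. Missing fields are treated as 0.
def parse_cpu_line(stat_text)
  line = stat_text.lines.find { |l| l.start_with?('cpu ') }
  raise ArgumentError, 'no aggregate cpu line found' unless line

  FIELDS.zip(line.split.drop(1)).to_h.transform_values(&:to_i)
end

# Steal time as a percentage of all ticks elapsed between two samples.
def steal_percent(before, after)
  deltas = FIELDS.map { |f| after[f] - before[f] }
  total = deltas.sum
  return 0.0 if total.zero?

  100.0 * (after[:steal] - before[:steal]) / total
end

# Hypothetical convenience wrapper: sample /proc/stat twice, `interval`
# seconds apart, and return the steal percentage for that window.
def measure_steal(interval = 1)
  first = parse_cpu_line(File.read('/proc/stat'))
  sleep interval
  second = parse_cpu_line(File.read('/proc/stat'))
  steal_percent(first, second)
end
```

Calling `puts measure_steal(5)` inside the VM prints the average steal percentage over five seconds. A value that stays well above zero would suggest the host CPUs are oversubscribed.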

Best Regards,
Anton Todorov

@atodorov_storpool are you implying that a positive CPU steal value would make this CPU curve display above the configured limit?

Hmm. On second thought, this monitoring probe does not take the steal time into account. I’ve found VMs with a similar pattern in our cluster, so I’ll take a closer look. At first glance it is the qemu-kvm process running on the host, but I’ll look into it properly tomorrow.

BR,
Anton

It looks like this patch is related to your case.

Could you apply the patch? On the frontend, edit /var/lib/one/remotes/vmm/kvm/poll
and change the function from

    def self.number_of_processors
        %x{nproc}.to_i
    end

to

    def self.number_of_processors
        `grep -c processor /proc/cpuinfo`.to_i
    end

Then sync the change to the hosts:

    su - oneadmin -c 'onehost sync --force'
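To check whether the patch can make any difference on a given host, you can compare the two counts directly. nproc reports the processors available to the calling process, so it can return fewer than /proc/cpuinfo lists when that process is confined by a cpuset or an affinity mask. This is presumably what the patch works around. The Ruby sketch below is only an illustration: the helper names are made up, and it matches `processor : N` lines strictly rather than every line containing "processor" as `grep -c` does.

```ruby
# Count "processor : N" entries in the text of /proc/cpuinfo.
def count_cpuinfo_processors(cpuinfo_text)
  cpuinfo_text.lines.count { |line| line.match?(/\Aprocessor\s*:/) }
end

# Hypothetical helper: collect both counts on the current host.
def processor_counts
  {
    nproc: `nproc`.to_i,
    cpuinfo: count_cpuinfo_processors(File.read('/proc/cpuinfo'))
  }
end

# Hypothetical helper: describe whether the two counts agree.
def describe_counts(counts)
  if counts[:nproc] < counts[:cpuinfo]
    "nproc sees #{counts[:nproc]} of #{counts[:cpuinfo]} processors (restricted CPU set)"
  else
    "nproc and /proc/cpuinfo agree: #{counts[:cpuinfo]} processors"
  end
end
```

Running `puts describe_counts(processor_counts)` on a KVM host as oneadmin shows which case applies. If both numbers agree, the patch should not change the monitored values there.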

Edit: I’ll try to understand the calculations and how they are affected when the cpuset cgroup is used (we are using cgroups in all of our clusters).

I have applied the patch. Let's see what changes it brings; I will update soon.

This also happens with Windows VMs. Is there any way to get the CPU steal time in a Windows VM? Please let me know.

I didn’t see any change after applying the patch. We are still unable to limit the CPU.