How to force a specific CPU type/feature on VMs?


(Kai 'wusel' Siering) #1

Because of the Linux kernel changes for Meltdown/Spectre, the PCID CPU feature needs to be exposed to the VMs. None of my existing VMs has a <cpu> tag in its deployment.xml file; they all list “QEMU Virtual CPU version 2.5+” as their CPU type (which lacks the PCID feature).
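To check from inside a guest whether PCID is actually exposed, the CPU flags can be read from /proc/cpuinfo. The following is only a minimal sketch, assuming an x86 Linux guest where the line is labelled "flags"; the helper names cpu_flags and has_pcid are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the CPU flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (e.g. non-x86 layout): report nothing.
    return set()


def has_pcid(cpuinfo_text):
    """True if the 'pcid' flag is listed in the given cpuinfo text."""
    return "pcid" in cpu_flags(cpuinfo_text)


# Illustrative sample only, not output captured from a real VM.
sample = "model name\t: QEMU Virtual CPU version 2.5+\nflags\t\t: fpu vme de pse\n"
print("pcid exposed:", has_pcid(sample))
```

On a real guest, the text would come from open("/proc/cpuinfo").read() instead of the sample string.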

If I’m not mistaken, one/src/vmm/LibVirtDriverKVM.cc is responsible for creating the deployment.xml files, and there seems to be no way (up to release 5.4) to push any CPU-specific data into the generated file.

Basically I need a way to add this to my provisioning templates:

<cpu mode='custom' match='exact'>
    <model fallback='allow'>kvm64</model>
    <feature policy='require' name='pcid'/>
</cpu>

Is there a way to achieve this without hacking …/LibVirtDriverKVM.cc and rebuilding ONe?


(Tino Vazquez) #2

Please refer to the following blog post; you can set the needed parameters in the RAW default configuration file for KVM:
https://opennebula.org/mitigating-meltdown-performance-penalty/


(Kai 'wusel' Siering) #3

Played around with that option already, came up with:

<cpu mode='custom' match='minimum'><model>Nehalem</model><feature policy='optional' name='pcid'/><feature policy='optional' name='invpcid'/></cpu>

Tried my first live migration just now and ended up with this error message:

Mon Jan 15 16:36:23 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/migrate 'one-53' 'host08' 'host07' 53 host07
Mon Jan 15 16:36:23 2018 [Z0][VMM][E]: migrate: Command "virsh --connect qemu:///system migrate --live  one-53 qemu+ssh://host08/system" failed: error: internal error: unable to execute QEMU command 'migrate': State blocked by non-migratable device 'cpu'
Mon Jan 15 16:36:23 2018 [Z0][VMM][E]: Could not migrate one-53 to host08
Mon Jan 15 16:36:23 2018 [Z0][VMM][I]: ExitCode: 1
Mon Jan 15 16:36:23 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_failmigrate.
Mon Jan 15 16:36:23 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: migrate.
Mon Jan 15 16:36:23 2018 [Z0][VMM][E]: Error live migrating VM: Could not migrate one-53 to host08

The hosts in question are identical, so there should be no issue live-migrating.

Offline migration doesn’t work either:

Mon Jan 15 16:45:26 2018 [Z0][VM][I]: New LCM state is SAVE_MIGRATE
Mon Jan 15 16:45:27 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/save 'one-53' '/var/lib/one//datastores/0/53/checkpoint' 'host07' 53 host07
Mon Jan 15 16:45:27 2018 [Z0][VMM][E]: save: Command "virsh --connect qemu:///system save one-53 /var/lib/one//datastores/0/53/checkpoint" failed: error: Failed to save domain one-53 to /var/lib/one//datastores/0/53/checkpoint
Mon Jan 15 16:45:27 2018 [Z0][VMM][I]: error: internal error: unable to execute QEMU command 'migrate': State blocked by non-migratable device 'cpu'
Mon Jan 15 16:45:27 2018 [Z0][VMM][E]: Could not save one-53 to /var/lib/one//datastores/0/53/checkpoint
Mon Jan 15 16:45:27 2018 [Z0][VMM][I]: ExitCode: 1
Mon Jan 15 16:45:27 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: save.
Mon Jan 15 16:45:27 2018 [Z0][VMM][E]: Error saving VM state: Could not save one-53 to /var/lib/one//datastores/0/53/checkpoint
Mon Jan 15 16:45:28 2018 [Z0][VM][I]: New LCM state is RUNNING
Mon Jan 15 16:45:28 2018 [Z0][LCM][I]: Fail to save VM state while migrating. Assuming that the VM is still RUNNING (will poll VM).

OS: Ubuntu 16.04.3 LTS, virsh: 1.3.1, ONe: 5.2.1

EDIT: It turned out that my cpu line led to -cpu Broadwell,+invtsc,+abm,[…], and invtsc prohibits (live) migration. So it is better to disable invtsc on the guest:

<cpu mode='custom' match='minimum'><model>Nehalem</model><feature policy='optional' name='pcid'/><feature policy='optional' name='invpcid'/><feature policy='disable' name='invtsc'/></cpu>

I stopped the VM, changed the option, restarted the VM and live-migrated it, which was only partially successful: the migrated VM was listed as paused on the new hypervisor and could not be resumed. I had to hard-reboot the VM.
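The invtsc finding above can be checked mechanically by looking at the -cpu value QEMU was started with (libvirt normally records the full command line in its per-domain log on the hypervisor). This is a rough sketch only: the function names are hypothetical, the input string is the visible part of the value quoted above without its elided tail, and invtsc is the only blocker this thread confirms.

```python
# Features assumed to block migration; only invtsc is confirmed by this thread.
MIGRATION_BLOCKERS = {"invtsc"}


def parse_cpu_arg(arg):
    """Split a QEMU -cpu value like 'Model,+feat,feat=on' into (model, enabled features)."""
    model, *props = arg.split(",")
    enabled = set()
    for prop in props:
        prop = prop.strip()
        if prop.startswith("+"):
            # Legacy '+feature' syntax.
            enabled.add(prop[1:])
        elif prop.endswith("=on"):
            # Property-style 'feature=on' syntax.
            enabled.add(prop[:-3])
    return model, enabled


def migration_blockers(arg):
    """Return the enabled features that are in MIGRATION_BLOCKERS, sorted."""
    _, enabled = parse_cpu_arg(arg)
    return sorted(enabled & MIGRATION_BLOCKERS)


print(migration_blockers("Broadwell,+invtsc,+abm"))
```

An empty result does not prove that migration will work; it only means none of the listed features is enabled.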


(Kai 'wusel' Siering) #4

FTR: This …

<cpu mode='custom' match='exact'><model>Broadwell</model><feature policy='optional' name='pcid'/><feature policy='optional' name='invpcid'/></cpu>

… now works for me, including VM migration. match='minimum' seems to add too many features that conflict with VM migration.
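For anyone maintaining several templates, the <cpu> snippet can be generated instead of hand-edited. Below is a small sketch using Python's standard xml.etree.ElementTree; build_cpu_xml and its parameters are hypothetical names, and the result still has to be placed into the RAW data by hand or by your own tooling.

```python
import xml.etree.ElementTree as ET


def build_cpu_xml(model, optional=(), disabled=(), match="exact"):
    """Build a libvirt <cpu mode='custom'> element and return it as a string."""
    cpu = ET.Element("cpu", {"mode": "custom", "match": match})
    ET.SubElement(cpu, "model").text = model
    # Features the guest may use if the host provides them.
    for name in optional:
        ET.SubElement(cpu, "feature", {"policy": "optional", "name": name})
    # Features that must be hidden from the guest (e.g. invtsc).
    for name in disabled:
        ET.SubElement(cpu, "feature", {"policy": "disable", "name": name})
    return ET.tostring(cpu, encoding="unicode")


print(build_cpu_xml("Broadwell", optional=["pcid", "invpcid"]))
```

ElementTree writes attributes with double quotes, so the output needs its quotes escaped or swapped if it is embedded in a double-quoted RAW string.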