Hi all,
We have a question about PCI passthrough (PCI PT). The feature works great in OpenNebula, and you get the right device just by requesting the VENDOR/CLASS/DEVICE
values, but we found a problem with how the scheduler handles this in some use cases.
In our case we want to use InfiniBand PCI devices in an HA setup. These cards expose several virtual functions; this is how the Mellanox IB device looks in the output of the onehost command:
5e:00.1 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
5e:00.2 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
5e:00.3 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
5e:00.4 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
5e:00.5 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
5e:00.6 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
5e:00.7 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
5e:01.0 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
For this use case we have added a second InfiniBand card (same vendor and class), so the onehost command and lspci also list its functions,
but under a different address:
d8:00.0 15b3:1013:0207 MT27700 Family [ConnectX-4]
d8:00.1 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
d8:00.2 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
d8:00.3 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
d8:00.4 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
d8:00.5 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
d8:00.6 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
d8:00.7 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
d8:01.0 15b3:1014:0207 MT27700 Family [ConnectX-4 Virtual Function]
We want to add two PCI IB devices to our VM for an HA setup with two different cards (the cards are connected to different switches in case one network goes down). Adding them is fine, because we can put several PCI sections in our VM template, but the problem is that it is not possible to choose two different cards if they have the same VENDOR/CLASS/DEVICE
values (https://docs.opennebula.org/5.6/deployment/open_cloud_host_setup/pci_passthrough.html).
From the OpenNebula code, it looks like the plugin just executes the lspci
command, and the scheduler then picks the first available address from the list (that is, one not in use by any VM).
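To illustrate the behaviour we think we are seeing, here is a minimal Python sketch. It is only a toy model based on our reading, not OpenNebula's actual code: the function name `pick_first_available` and the `HOST_PCI` list are hypothetical, and the addresses are a subset of the listings above.

```python
# Toy model of "first available address" selection.
# Hypothetical names; this is NOT OpenNebula's scheduler code.
from typing import Optional

# (address, vendor, device, class), a subset of the host listing
HOST_PCI = [
    ("5e:00.1", "15b3", "1014", "0207"),
    ("5e:00.2", "15b3", "1014", "0207"),
    ("d8:00.1", "15b3", "1014", "0207"),
    ("d8:00.2", "15b3", "1014", "0207"),
]

def pick_first_available(devices, used, vendor, device, pci_class) -> Optional[str]:
    """Return the first free address matching vendor/device/class, or None."""
    for addr, v, d, c in devices:
        if (v, d, c) == (vendor, device, pci_class) and addr not in used:
            return addr
    return None

used = set()
assigned = []
for _ in range(2):  # two identical PCI sections in the VM template
    addr = pick_first_available(HOST_PCI, used, "15b3", "1014", "0207")
    used.add(addr)
    assigned.append(addr)

print(assigned)  # ['5e:00.1', '5e:00.2']
```

Both requests are served from the 5e card, so the VM never gets a function from the d8 card while the first card still has free virtual functions.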
Would it be possible to use a round-robin mechanism in the PCI PT scheduler?
Or to force the use of a specific card by requesting the ADDRESS
value directly in the PCI section as well (instead of, or in addition to, the vendor/class… values)?
This would help a lot in these HA cases.
I know this is not a regular use case, and maybe only oneadmin should be able to request it, but it could help in cases like this one, where you have several cards in your hypervisor and want to use them from your VMs in a round-robin fashion (as the OpenNebula scheduler does when it deploys VMs to different hypervisors).
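As a rough sketch of the round-robin idea, the free addresses could be interleaved per card before picking. This is only an illustration in Python, not a proposed patch: `round_robin_by_card` is a hypothetical helper, and it assumes that the PCI bus number (the part before the colon) identifies the physical card, which holds for the two cards listed above.

```python
# Sketch of round-robin ordering across cards (hypothetical helper).
# Assumption: the bus number before ':' identifies the physical card.
from collections import OrderedDict
from itertools import zip_longest

def round_robin_by_card(addresses, used):
    """Order free addresses so consecutive picks alternate between cards."""
    by_card = OrderedDict()
    for addr in addresses:
        if addr not in used:
            by_card.setdefault(addr.split(":")[0], []).append(addr)
    # Take one address from each card in turn; shorter lists are padded with None.
    interleaved = zip_longest(*by_card.values())
    return [a for group in interleaved for a in group if a is not None]

vfs = ["5e:00.1", "5e:00.2", "5e:00.3", "d8:00.1", "d8:00.2", "d8:00.3"]
order = round_robin_by_card(vfs, used={"5e:00.1"})
print(order[:2])  # ['5e:00.2', 'd8:00.1']
```

With this ordering, a VM that requests two devices with the same vendor/class/device values would get one function from each card.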
Thanks a lot in advance!
Álvaro