CentOS 7 network restart and Open vSwitch

Hello!
When I set up OVS with the network scripts pasted below, everything works just fine (OpenNebula adds ports to the switch, removes them, etc.). However, if the network service is restarted for any reason, all the created ports are removed and we are back to the “bridge with only the eth0 port” OVS config.
It looks like the network scripts destroy the OVS bridge and recreate it, so all the ports created by OpenNebula are lost. The only way to get them back is to run “ovs-vsctl add-port” with the old port name.
Have you had this issue? How did you solve it?
Is adding the OVS config to the network scripts the right approach?

Thanks!

=========> ifcfg-br0 <=========
DEVICE=br0
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=xx.xx.xx.xx
NETMASK=255.255.255.0
ONBOOT=yes


=========> ifcfg-eth0 <=========
# added by KwikStart
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
PEERDNS=no
NETWORKING_IPV6=no
NOZEROCONF=yes
HWADDR=xx:xx:xx:xx
OVS_BRIDGE=br0
TYPE=OVSPort
DEVICETYPE=ovs
HOTPLUG=no

==================
Open vSwitch Version: 2.3.2
CentOS Linux release 7.2.1511 (Core)

Hi @GermanG

We found the same issue running CentOS 7.3 hyps with OVS 2.5.2, and it is quite easy to reproduce. If you restart the network service, OVS also removes all the ports connected to the current bridge and creates a new one from scratch by running this command:

ovs-vsctl -t 10 -- --if-exists del-port br100 em2 -- add-port br100 em2

So before the network restart you have something like this:

$ ovs-vsctl list-ports br100
em2
one-405-0
one-405-1
one-527-0
one-534-0

but after a service network restart your bridge ports are gone:

$ ovs-vsctl list-ports br100
em2

So you lose the VM network connection: the VMs are still running, but their ports are no longer attached to the bridge…
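The before/after listings above can be compared mechanically. The following is a minimal sketch, assuming the two `ovs-vsctl list-ports` outputs were saved to files beforehand; the function name `lost_ports` and the file names are hypothetical and not part of OVS or OpenNebula.

```shell
#!/bin/sh
# Hypothetical helper: print every port name that appears in the "before"
# snapshot ($1) but not in the "after" snapshot ($2), one per line.
# Both files are expected to hold plain `ovs-vsctl list-ports <bridge>` output.
lost_ports() {
    # -F: fixed strings, -x: whole-line match, -v: keep non-matching lines.
    # grep exits non-zero when nothing was lost, so that status is ignored.
    grep -Fxv -f "$2" "$1" || true
}

# Example usage (assumed file names):
#   ovs-vsctl list-ports br100 > /tmp/ports.before
#   service network restart
#   ovs-vsctl list-ports br100 > /tmp/ports.after
#   lost_ports /tmp/ports.before /tmp/ports.after
```

With the listings shown above, this would print the four one-* ports and omit em2, since em2 is present in both snapshots.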

The only workaround I found in that case is to reboot the VMs or migrate them to another hypervisor. The VNM network driver then attaches the ports and VLANs again and everything goes back to normal.

I was also wondering how to solve this issue automatically from OpenNebula without this workaround (sometimes you don’t want to reboot the VM or you cannot migrate the VM to another hyp).

Maybe the OpenNebula developers could give us a clue on how to fix this for OVS networks… It should be done by the OpenNebula OVS driver. I think the Post hook is executed after VM instantiation, so in this case we would need to force the Post hook to run again; this should execute ovs-vsctl set Port one-<id>-0 tag=<vlan> again and recover the VMs' network. Is that possible? Has anyone else had the same issue?

Cheers and thanks!
Álvaro

Hi

I also found that the VM SUSPEND action triggers the VNM Post script again, so the ports are recovered after a resume. I’m still wondering how to execute the VNM Post script without changing the VM's RUNNING status.

Cheers
Álvaro

Sounds interesting. We will review this here:
https://dev.opennebula.org/issues/5130

Hi @jmelis

Thanks a lot! For the moment, what we did was include a homemade package called ifup-ifdown-local in our hyps; it is executed during a network restart to keep the OVS ports. The package only includes two scripts, /sbin/ifdown-local and /sbin/ifup-local, which preserve everything across any network restart (VXLAN networks are not affected by this, btw).
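The scripts in that package were not posted, so the following is only a sketch of how such a pair of hooks might work, not the actual implementation. It assumes that the one-* ports are still listed on the bridge when /sbin/ifdown-local runs, and that re-adding a port by name is enough to reconnect the VM (as described in the first post). The function names, the state file path and the bridge name in the usage comments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers that /sbin/ifdown-local and /sbin/ifup-local could source.
# The initscripts call both hooks with the device name as $1.

# Print one "port tag" line for every OpenNebula port (one-*) on bridge $1.
# `ovs-vsctl get Port <name> tag` prints the VLAN tag, or [] when none is set.
snapshot_ports() {
    bridge=$1
    ovs-vsctl list-ports "$bridge" | grep '^one-' | while read -r port; do
        tag=$(ovs-vsctl get Port "$port" tag </dev/null)
        printf '%s %s\n' "$port" "$tag"
    done
}

# Read "port tag" lines on stdin and print the ovs-vsctl commands that would
# re-attach each port to bridge $1 and restore its VLAN tag, if it had one.
restore_cmds() {
    bridge=$1
    while read -r port tag; do
        [ -n "$port" ] || continue
        case $tag in
            ''|'[]')
                echo "ovs-vsctl --may-exist add-port $bridge $port" ;;
            *)
                echo "ovs-vsctl --may-exist add-port $bridge $port -- set Port $port tag=$tag" ;;
        esac
    done
}

# Assumed usage, with a hypothetical state file:
#   in /sbin/ifdown-local:  snapshot_ports br100 > /var/run/ovs-ports.br100
#   in /sbin/ifup-local:    restore_cmds br100 < /var/run/ovs-ports.br100 | sh
```

Printing the commands instead of running them directly makes it possible to review them before piping to sh. Whether the snapshot is taken early enough depends on the order in which the network scripts tear the bridge down, which should be checked on the target system.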

This is just a workaround, but I think it would be a good idea for OpenNebula to check the hyp network and recover the ports. It is also quite generic: it should be valid for VXLAN or any other network plugin, just execute the Post script if the ports are missing. In our case it is enough to look for one-VMID-NICID ports.
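As an illustration of the one-VMID-NICID check, here is a small sketch that turns a list of missing port names into the VM IDs whose Post script would need to run again. It assumes the missing port names arrive on standard input, one per line, however they were obtained; the function name `vms_needing_post` is hypothetical and is not an OpenNebula command.

```shell
#!/bin/sh
# Hypothetical helper: read port names on stdin and print the unique VM IDs
# of those that follow the one-<VMID>-<NICID> naming scheme.
# Names that do not match (e.g. the physical uplink em2) are ignored.
vms_needing_post() {
    sed -n 's/^one-\([0-9][0-9]*\)-[0-9][0-9]*$/\1/p' | sort -un
}

# Example usage:
#   printf 'one-405-0\none-405-1\nem2\none-527-0\n' | vms_needing_post
```

A VM with several NICs shows up only once in the output, which fits the idea of re-running the Post script once per VM rather than once per port.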

Cheers
Álvaro
