HA Failover Scenario

I need to understand the failover scenario in OpenNebula.

Let’s say I have two physical hosts:

host01

  • 4TB Local LUN
  • 4TB Gluster FS ( brick w/ host02, host03 )

host02

  • 4TB Local LUN
  • 4TB Gluster FS ( brick w/ host01, host03 )

host03

  • 4TB Local LUN
  • 4TB Gluster FS ( brick w/ host02, host01 )

Test steps:

  1. Take down the bond0 NIC on the master: ip link set bond0 down
  2. Create a VM on the remaining responding nodes, on the shared or local LUN attached to any of the physical hosts above.
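For reference, the test could be driven from the frontend roughly like this (hostnames, the template name, and the VM name are placeholders; `onehost`, `onetemplate`, and `onevm` are the standard OpenNebula CLI tools):

```shell
# On the master (host01 in this example): drop the bonded NIC
ip link set bond0 down

# From the frontend, confirm the host is now reported in ERROR state
onehost list

# Instantiate a VM on one of the surviving hosts; "my_template" and
# "failover-test" are placeholders for your own template and VM name
onetemplate instantiate my_template --name failover-test
onevm deploy failover-test host02
```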

What can I expect will happen? Is this one of the tested scenarios with OpenNebula and if so, I’m curious what were the results?

For the purpose of this scenario, feel free to substitute any other distributed FS it might have been tested on for GlusterFS.

Reg,

Hi, it is difficult to fully analyze this because some info is missing. Apparently, you are not interested in the orchestrator’s HA or the VMs’ HA (migrating VMs from a failed host to a healthy host); your concern seems to be more on the Gluster side.
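(For context, the VM HA mentioned here is driven by a host hook in oned.conf. A minimal sketch based on the stock ft/host_error.rb hook shipped with OpenNebula; check the hook documentation for the exact flag meanings:)

```
# oned.conf -- react to a host entering ERROR state
HOST_HOOK = [
    NAME      = "error",
    ON        = "ERROR",
    COMMAND   = "ft/host_error.rb",
    ARGUMENTS = "$ID -m -p 5",   # -m: resubmit VMs elsewhere; -p: grace period
    REMOTE    = "no" ]
```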
OpenNebula is agnostic to the underlying storage when file system drivers are used, as in this case. Also, you are mixing two different storage systems: Gluster plus a local LUN on each host. There are several possible combinations of those storages, and the answer varies for each. In the most likely scenario, you are using GlusterFS for the image datastores and perhaps also for a system datastore. The bottom line on the OpenNebula side: if the storage goes down, the datastore goes down.
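To illustrate that agnosticism: a shared-filesystem datastore of this kind is just a mount point visible on every host plus a small template. A minimal sketch (the datastore name and the assumption that the Gluster volume is mounted under /var/lib/one/datastores are mine):

```
# gluster_ds.conf -- image datastore backed by the Gluster mount
NAME    = gluster_images
DS_MAD  = fs        # plain filesystem driver: OpenNebula just sees files
TM_MAD  = shared    # every host sees the same datastore path
```

Registered with `onedatastore create gluster_ds.conf`; OpenNebula never talks to Gluster directly, it only reads and writes files on the mount.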

To answer your question on the Gluster side, it depends on your Gluster volume configuration. If you are using a replicated volume (one brick per server), nothing will happen when a node goes down, since you are using replica 3. If you are using a distributed volume and you lose one node, you will most likely lose data. You cannot configure a distributed-replicated volume with only three bricks, so I will leave that case out.
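To make the replica-3 case concrete, a volume with one brick per server would be created roughly like this (the brick paths and volume name are assumptions):

```shell
# One brick per server, three-way replication: any single node can fail
gluster volume create gv0 replica 3 \
    host01:/data/brick1/gv0 \
    host02:/data/brick1/gv0 \
    host03:/data/brick1/gv0
gluster volume start gv0

# Verify the volume type and brick layout
gluster volume info gv0
```

With replica 3, the volume stays available as long as two of the three bricks are reachable, which is why taking down a single host in your scenario should not affect VMs on the Gluster datastore.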

Thanks Sergio!

You partly answered what I was looking for. I was interested in finding out how the environment behaves under that setup.

Each one of those storages will be visible on every host. The difference is that if I have a VM on the local LUN of any host, I understand there can’t be any failover to another node. On the other hand, if a VM is created on the GlusterFS, I can bring the VM up on another host and I’m good.

In the above post, I was interested in understanding how failure is handled if one of the hosts and its storage go offline.

OpenNebula being agnostic to the underlying storage is a positive. I am interested in the orchestrator’s HA and the VM HA, if you don’t mind going over those in the context of the above scenario.

Edit:
Does OpenNebula also support hot-swapping CPU and memory? I’m aware KVM already supports this, but I’m not sure whether OpenNebula takes advantage of it.

Thx,
TK