Automatically restart VMs after host restart


(pentium100) #1

Hello,

After a host reboots (be it gracefully or after some failure), virtual machines that were running on that host remain shut down. Is there a way to make them automatically start after the host boots up (without opennebula, I’d just do “virsh autostart vm-name”)? Otherwise the VMs remain off until someone manually starts them…


Vm autostart after hardware node reboot
(Daniel Dehennin) #2

pentium100 forum@opennebula.org writes:

> Hello,

Hello,

> After a host reboots (be it gracefully or after some failure), virtual
> machines that were running on that host remain shut down. Is there a
> way to make them automatically start after the host boots up (without
> opennebula, I’d just do “virsh autostart vm-name”)? Otherwise the VMs
> remain off until someone manually starts them…

There is an issue[1] opened for this feature.

Regards.

Footnotes:
[1] http://dev.opennebula.org/issues/1290


#3

Configure a hook in oned.conf:

VM_HOOK = [
    name      = "autostart_hook",
    on        = "CUSTOM",
    state     = "POWEROFF",
    lcm_state = "LCM_INIT",
    command   = "/usr/bin/env onevm resume",
    arguments = "$ID" ]


(Edouard (Madko)) #4

With this hook how do you choose which VM is in autostart mode?


#5

ref: Using Hooks


(Daniel Dehennin) #6

clm_sky forum@opennebula.org writes:

> ref: [Using Hooks][1]

Yes, but this does not let you exclude VMs that a user intentionally powered off.

So, each time a user powers off a VM, your hook resumes it automatically,
right?

Regards.
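A hedged sketch of one way to address this objection: make the hook opt-in per VM by gating the resume on a custom user-template attribute. AUTOSTART here is an example attribute name you would set yourself (e.g. via onevm update), not a built-in OpenNebula attribute, and the grep pattern assumes the plain-text layout of `onevm show` output:

```shell
#!/bin/sh
# Sketch, not a tested solution: only resume VMs whose user template
# carries a custom AUTOSTART="yes" attribute (example name, not built-in).

has_autostart() {
    # Reads `onevm show` output on stdin and matches the line printed
    # for a user-template attribute AUTOSTART="yes".
    grep -q '^AUTOSTART="yes"'
}

# The hook would pass the VM ID as the first argument.
vm_id=$1
if [ -n "$vm_id" ] && onevm show "$vm_id" | has_autostart; then
    onevm resume "$vm_id"
fi
```

VMs without the attribute are simply left alone, so a deliberate user poweroff is not undone.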


(Edouard (Madko)) #7

That’s more my question, in fact. I know where the doc is, but I still don’t understand how you tell OpenNebula to autostart specific VMs, and not ones that are powered off for a good reason.


(Marc) #8

I tried writing a custom hook that is activated when state = ACTIVE and LCM state = UNKNOWN to trigger onevm resume (these are the states that show up in Sunstone after a host reboot). It’s not working. According to http://dev.opennebula.org/issues/1639#note-7 the issue should be solved, but on the other hand issue 1290 is still open.

I tried setting LIVE_RESCHEDS = 1 in sched.conf - no improvement.

One solution could be modifying the oned.conf host hook name=error, which calls ft/host_error.rb. The present config (in 4.6.2 / Ubuntu package on 14.04) recreates the VM from its image/template. I don’t quite understand the point of that in a cluster with shared storage - migrating the VM to some other host would be sufficient and not as destructive.

Is this issue solved in the latest version? I’d really love to migrate from VMware to OpenNebula/KVM :smile:


(Ruben S. Montero) #9

Hi,

There are a couple of scenarios to consider here:

  • The host fails and the VM needs to be restarted on another host. This is achieved with the HA hook, which triggers on the Host ERROR state, not on the VM states. After a host goes to ERROR, the VMs on that host will be in UNKNOWN and can be “restarted”. Note that the original host is down and there is no hope of contacting the hypervisor on that host, so no live migration… cold migration can work using a shared system datastore.

  • The host fails, no recovery action needs to be taken, but once the host reboots the VMs running on it need to be restarted. This can be achieved by configuring the hypervisor (e.g. the on_failure attribute in Xen, and so on). It can also be triggered from OpenNebula, again hooking on the host states. For example, on ON, get the VMs in the host (e.g. onehost show), check if they are in UNKNOWN, and then resume them on the same host.
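The "list the host's VMs, resume the ones in UNKNOWN" step could be sketched roughly as below, independent of how it is triggered. This is untested: the assumption that `onehost show -x` prints the hosted VM IDs inside a <VMS> element, and that `onevm show` prints an "LCM_STATE : <state>" line, should both be verified against your OpenNebula version:

```shell
#!/bin/sh
# Rough sketch (untested): resume the VMs of one host that are in UNKNOWN.

extract_vm_ids() {
    # From host XML on stdin, keep only the <VMS> section, then pull
    # out the numeric <ID> values, one per line.
    sed -n '/<VMS>/,/<\/VMS>/p' | sed -n 's:.*<ID>\([0-9][0-9]*\)</ID>.*:\1:p'
}

resume_unknown_vms() {
    host_id=$1
    for vm in $(onehost show -x "$host_id" | extract_vm_ids); do
        # Assumed "LCM_STATE : <state>" layout in `onevm show` output.
        state=$(onevm show "$vm" | awk '$1 == "LCM_STATE" {print $3}')
        [ "$state" = "UNKNOWN" ] && onevm resume "$vm"
    done
}

# A hook would pass the host ID as the first argument.
if [ $# -ge 1 ]; then
    resume_unknown_vms "$1"
fi
```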

Cheers


(Marc) #10

Hi,

thanks for the hint. I tried the following as a proof of concept:

HOST_HOOK = [
    name      = "hook_host_on",
    on        = "ON",
    command   = "/bin/date >> /var/lib/one/log/host_on.log",
    arguments = "",
    remote    = "yes" ]

but there is no file created in the shared NFS directory /var/lib/one/log. In the documentation for host hooks, only the states CREATE, ERROR and DISABLE are mentioned.

Regards


(Ruben S. Montero) #11

My bad, you are right I mixed the VM and Host triggers :frowning:

So, if the host fails and no recovery action needs to be taken, but once
the host reboots the VMs running on it need to be restarted, you should
resort to the hypervisor capabilities (for example, adding virsh
autostart in the deploy script should be straightforward).


(Marc) #12

I guess I found a solution which is not hypervisor-based. The hook hint led me to the idea of triggering a hook on VM state = UNKNOWN.

oned.conf:

VM_HOOK = [
   name      = "hook_vm_on_unknown",
   on        = "UNKNOWN",
   command   = "ft/hook_vm_on_unknown.sh",
   arguments = "$ID $PREV_STATE $PREV_LCM_STATE" ]

hook_vm_on_unknown.sh:

#!/bin/sh

# $1 = VM ID, $2 = previous state, $3 = previous LCM state
if [ "$2" = "ACTIVE" ] && [ "$3" = "RUNNING" ]; then
    onevm boot "$1"
fi

Hope this helps someone.


(Jeovanevs) #13

Hi everyone. It didn’t work for me. After a failed host reboots, the variables $PREV_STATE and $PREV_LCM_STATE are POWEROFF <8> and NONE <0> respectively (https://docs.opennebula.org/5.4/integration/system_interfaces/api.html). Any other hypervisor-independent solution?


(Bogdan Stoica) #14

Since I was unable to find a solution to this, and even with OpenNebula 5.5.90 I see there is no option to autostart virtual machines after an OpenNebula node/server restart/reboot or power failure, I came up with this… Probably not the best way to achieve it, but as long as it works it is just fine.

So I have OpenNebula 5.5.90 installed on a CentOS 7.x server. I have added this piece of code to the /etc/rc.local file:

# automatically start mandatory KVM vms

IS_MOUNTED=$(df | grep "/var/lib/one")

ON_SERVICE="opennebula"
ON_WEB="opennebula-sunstone"

echo $IS_MOUNTED

ALWAYS_ON_KVMS="2 3 4 5 6 7 8 9 10 11 12"

sleep 60

if [ -z "$IS_MOUNTED" ]
then
    echo "Storage partition is not mounted on the server(s)"
else
    echo "Storage partition is mounted on the server(s)"
    for i in $ALWAYS_ON_KVMS; do onevm resume $i; done
fi

You might not need that 60-second delay; in my case it is necessary since there are 2 OpenNebula servers which share the same partition, which is in fact a DRBD resource mounted by pacemaker/corosync.

The only issue with this solution is that, if you create more VMs, you have to update the file with their IDs.
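That hardcoded ID list could plausibly be replaced by deriving it from `onevm list`. This is a hedged sketch, not a tested drop-in: it assumes the default `onevm list` table layout, where column 1 is the ID and column 5 is STAT ("poff" for powered-off VMs); a VM name containing spaces would break this field-based parsing:

```shell
#!/bin/sh
# Sketch: select the IDs of powered-off VMs from `onevm list` output
# instead of hardcoding ALWAYS_ON_KVMS.

select_poweroff_ids() {
    # Skip the header row; print the ID of every VM whose STAT is "poff".
    awk 'NR > 1 && $5 == "poff" {print $1}'
}

# Intended use on the front-end (shown commented out):
# for id in $(onevm list | select_poweroff_ids); do onevm resume "$id"; done
```

Note this resumes every powered-off VM, including ones a user shut down on purpose, so it trades one problem for another unless combined with some opt-in marker.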

Hope it helps!

Note: Pay attention to quotes etc., since the forum seems to be reformatting the text and some special chars like ` and " are not displayed correctly.