HA. VM stuck in BOOT_POWEROFF state

binbash · February 20, 2017, 11:47am

When one of the hosts goes down, HA for VMs works incorrectly in several specific cases.

For example:

when we stop one of the hosts, ONE waits for several monitoring cycles before using the stonith script;
the host shuts down and is already inaccessible from ONE, but its VMs are still in RUNNING;
at this step we send POWEROFF to the VM;
while the host is down, the VMs stay in the SHUTDOWN state;
when the host starts again, the VMs go to the POWEROFF state;
then, when we try to start those VMs, we get:

Mon Feb 20 12:14:47 2017 [Z0][VM][I]: New LCM state is SHUTDOWN_POWEROFF
Mon Feb 20 12:21:00 2017 [Z0][LCM][I]: VM reported SHUTDOWN by the drivers
Mon Feb 20 12:21:00 2017 [Z0][VM][I]: New state is POWEROFF
Mon Feb 20 12:21:00 2017 [Z0][VM][I]: New LCM state is LCM_INIT
Mon Feb 20 12:22:22 2017 [Z0][VM][I]: New state is ACTIVE
Mon Feb 20 12:22:22 2017 [Z0][VM][I]: New LCM state is BOOT_POWEROFF
Mon Feb 20 12:22:22 2017 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/68/deployment.2
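
The state changes in the log above are easier to follow once they are pulled out into a timeline. The snippet below is only a minimal sketch: it assumes the log format is exactly as pasted in this post, the names `STATE_RE` and `parse_transitions` are made up for illustration and are not part of OpenNebula, and the timestamp parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# Log lines copied from the post above.
LOG = """\
Mon Feb 20 12:14:47 2017 [Z0][VM][I]: New LCM state is SHUTDOWN_POWEROFF
Mon Feb 20 12:21:00 2017 [Z0][LCM][I]: VM reported SHUTDOWN by the drivers
Mon Feb 20 12:21:00 2017 [Z0][VM][I]: New state is POWEROFF
Mon Feb 20 12:21:00 2017 [Z0][VM][I]: New LCM state is LCM_INIT
Mon Feb 20 12:22:22 2017 [Z0][VM][I]: New state is ACTIVE
Mon Feb 20 12:22:22 2017 [Z0][VM][I]: New LCM state is BOOT_POWEROFF
Mon Feb 20 12:22:22 2017 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/68/deployment.2
"""

# Hypothetical pattern: matches only "[VM][I]: New state is X" and
# "[VM][I]: New LCM state is X" entries, as they appear in this post.
STATE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ [\d:]{8} \d{4}) "
    r"\[Z\d+\]\[VM\]\[I\]: New (?P<kind>LCM state|state) is (?P<state>\w+)$"
)


def parse_transitions(log_text):
    """Return a list of (timestamp, 'VM' or 'LCM', state) tuples."""
    transitions = []
    for line in log_text.splitlines():
        match = STATE_RE.match(line.strip())
        if match is None:
            continue  # skip driver messages and other non-state lines
        ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
        kind = "LCM" if match.group("kind") == "LCM state" else "VM"
        transitions.append((ts, kind, match.group("state")))
    return transitions


transitions = parse_transitions(LOG)
for ts, kind, state in transitions:
    print(f"{ts:%H:%M:%S}  {kind:<3}  {state}")
```

Read this way, the VM sits in SHUTDOWN_POWEROFF from 12:14:47 until the drivers report SHUTDOWN at 12:21:00, and the last recorded LCM state after the start attempt is BOOT_POWEROFF.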

feldsam · February 20, 2017, 6:11pm

Hello, you can adjust the monitoring driver settings in /etc/one/oned.conf:

#-------------------------------------------------------------------------------
#  KVM UDP-push Information Driver Manager Configuration
#    -r number of retries when monitoring a host
#    -t number of threads, i.e. number of hosts monitored at the same time
#    -w Timeout in seconds to execute external commands (default unlimited)
#-------------------------------------------------------------------------------
IM_MAD = [
      NAME          = "kvm",
      SUNSTONE_NAME = "KVM",
      EXECUTABLE    = "one_im_ssh",
      ARGUMENTS     = "-r 3 -t 15 kvm" ]

The “stonith” script should automatically migrate the VMs to another host:

https://docs.opennebula.org/5.2/advanced_components/ha/ftguide.html#host-failures

pianziva · December 17, 2018, 8:02am

Hello,

I have already enabled the KVM UDP-push driver, but the VM still ends up in the POWEROFF state when I turn off one of the hosts.

Any suggestions or solutions?

feldsam · December 17, 2018, 8:49am

Hello,

> when we stop one of the hosts, ONE waits for several monitoring cycles before using the stonith script;
> the host shuts down and is already inaccessible from ONE, but its VMs are still in RUNNING;

Why do you do this?

> at this step we send POWEROFF to the VM;
> while the host is down, the VMs stay in the SHUTDOWN state;
> when the host starts again, the VMs go to the POWEROFF state;

Have you implemented a proper fencing mechanism in the host error hook?

pianziva · December 17, 2018, 8:59am

Hi Feldsam,

I was trying to set up high availability for host failures:

https://docs.opennebula.org/5.2/advanced_components/ha/ftguide.html#host-failures

When I test with fencing either enabled or disabled in oned.conf to get HA for host failures, the VM won't migrate to another host and still ends up in the POWEROFF state.

pianziva · December 17, 2018, 9:03am

Hi,

FYI, this is the layout of our cloud:

host1, front end HA
host2, node kvm
host3, node kvm

For shared storage I am using GlusterFS on each host node.
