oneacct shows VMs that do not exist

Hello there,

I am currently writing a script for my accounting and came across some inconsistencies.

I have some VM corpses that no longer exist (see onevm list 0 below).
However, the oneacct command reports a few of these VMs as still active.

Please have a look at VID 55. As you can see in onevm list, it is not running at the moment, but the oneacct command does not show an END_TIME for it.

root@nebula-bross:/var/lib/one# oneacct -u 0 -s '2015/09/22' -e '2015/09/22'
Showing active history records from 2015-09-22 00:00:00 +0200 to 2015-09-23 00:00:00 +0200

# User 0

 VID HOSTNAME        ACTION           REAS     START_TIME       END_TIME MEMORY CPU NET_RX NET_TX
  30 kvm1-bross      undeploy-hard    none 08/13 17:05:51              -     4G   1     0K     0K
  33 kvm2-bross      poweroff         user 09/18 16:37:44 09/22 12:56:12     4G   1 351.8K 235.8K
  33 kvm2-bross      none             none 09/22 15:44:40              -     4G   1     6K   4.9K
  55 kvm2-bross      undeploy         none 08/17 09:12:37              -     2G   2 206.9K     0K
  82 kvm1-bross      delete           user 09/22 12:55:44 09/22 12:58:32   512M   1   0.9K  16.4K
  83 kvm1-bross      none             none 09/22 13:01:14              -   512M   1 680.6K 643.2K

root@nebula-bross:/var/lib/one# onevm list 0
    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME
    33 oneadmin oneadmin test-33         runn    0      4G kvm2-bross  39d 22h41
    83 oneadmin oneadmin Linux Test Syst runn    7  535.9M kvm1-bross   0d 03h49
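
For anyone scripting against this output, here is a rough Python sketch of the cross-check described above: collect the VIDs whose oneacct record has no END_TIME and subtract the IDs that onevm list still shows. The function names are made up for illustration, and splitting the human-readable tables on whitespace is an assumption that only holds while hostnames and actions contain no spaces.

```python
# Sample data copied from the two listings in this post.
ONEACCT_OUTPUT = """\
 VID HOSTNAME        ACTION           REAS     START_TIME       END_TIME MEMORY CPU NET_RX NET_TX
  30 kvm1-bross      undeploy-hard    none 08/13 17:05:51              -     4G   1     0K     0K
  33 kvm2-bross      poweroff         user 09/18 16:37:44 09/22 12:56:12     4G   1 351.8K 235.8K
  33 kvm2-bross      none             none 09/22 15:44:40              -     4G   1     6K   4.9K
  55 kvm2-bross      undeploy         none 08/17 09:12:37              -     2G   2 206.9K     0K
  82 kvm1-bross      delete           user 09/22 12:55:44 09/22 12:58:32   512M   1   0.9K  16.4K
  83 kvm1-bross      none             none 09/22 13:01:14              -   512M   1 680.6K 643.2K
"""

ONEVM_OUTPUT = """\
    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME
    33 oneadmin oneadmin test-33         runn    0      4G kvm2-bross  39d 22h41
    83 oneadmin oneadmin Linux Test Syst runn    7  535.9M kvm1-bross   0d 03h49
"""


def open_record_vids(oneacct_text):
    """Return VIDs that have a history record whose END_TIME is '-'."""
    vids = set()
    for line in oneacct_text.splitlines():
        tokens = line.split()
        # Skip the header and anything that is not a data row.
        if len(tokens) < 7 or not tokens[0].isdigit():
            continue
        # tokens[4:6] are the start date and time; tokens[6] is either
        # '-' (record still open) or the end date.
        if tokens[6] == "-":
            vids.add(int(tokens[0]))
    return vids


def listed_vids(onevm_text):
    """Return the VM IDs that appear in an 'onevm list' table."""
    vids = set()
    for line in onevm_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].isdigit():
            vids.add(int(tokens[0]))
    return vids


def stale_vids(oneacct_text, onevm_text):
    """VIDs with an open accounting record but no VM in the list."""
    return sorted(open_record_vids(oneacct_text) - listed_vids(onevm_text))


print(stale_vids(ONEACCT_OUTPUT, ONEVM_OUTPUT))
```

With the two listings from this post it reports VIDs 30 and 55, i.e. the records that look open although no such VM is listed.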

Hi,

That looks bad… Is there anything out of the ordinary in the VM 55 log file? What OpenNebula version are you using?

Hey,
I’m using version 4.12.1.
VM 55 was an unsuccessful attempt to run Windows 7 as a VM.
I did not even reach the installation part of Windows 7, because Windows reported that the NTFS qcow2 image was empty. Of course.

This is the logfile of VM 55:

Mon Aug 17 09:12:37 2015 [Z0][DiM][I]: New VM state is ACTIVE.
Mon Aug 17 09:12:38 2015 [Z0][LCM][I]: New VM state is PROLOG.
Mon Aug 17 09:12:39 2015 [Z0][LCM][I]: New VM state is BOOT
Mon Aug 17 09:12:39 2015 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/55/deployment.0
Mon Aug 17 09:12:39 2015 [Z0][VMM][I]: ExitCode: 0
Mon Aug 17 09:12:39 2015 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Mon Aug 17 09:12:41 2015 [Z0][VMM][I]: ExitCode: 0
Mon Aug 17 09:12:41 2015 [Z0][VMM][I]: Successfully execute virtualization driver operation: deploy.
Mon Aug 17 09:12:41 2015 [Z0][VMM][I]: ExitCode: 0
Mon Aug 17 09:12:41 2015 [Z0][VMM][I]: Successfully execute network driver operation: post.
Mon Aug 17 09:12:41 2015 [Z0][LCM][I]: New VM state is RUNNING
Mon Aug 17 10:04:59 2015 [Z0][LCM][I]: New VM state is SHUTDOWN_UNDEPLOY
Mon Aug 17 10:05:44 2015 [Z0][LCM][I]: New VM state is CLEANUP.
Mon Aug 17 10:05:45 2015 [Z0][DiM][I]: New VM state is DONE
Mon Aug 17 10:10:08 2015 [Z0][VMM][W]: Ignored: LOG I 55 Command execution fail: /var/tmp/one/vmm/kvm/shutdown ‘one-55’ ‘kvm2-bross’ 55 kvm2-bross

Mon Aug 17 10:10:08 2015 [Z0][VMM][W]: Ignored: LOG E 55 Timed out shutting down one-55

Mon Aug 17 10:10:08 2015 [Z0][VMM][W]: Ignored: LOG I 55 ExitCode: 255

Mon Aug 17 10:10:08 2015 [Z0][VMM][W]: Ignored: LOG I 55 Failed to execute virtualization driver operation: shutdown.

Mon Aug 17 10:10:08 2015 [Z0][VMM][W]: Ignored: SHUTDOWN FAILURE 55 Timed out shutting down one-55
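
One thing that stands out in this log is the ordering: the VM is marked DONE at 10:05:45, and the driver's shutdown failure only arrives at 10:10:08, where it is logged as "Ignored". Whether that is the cause of the open accounting record is not confirmed here. A small Python sketch for spotting this pattern in other VM logs follows; the regular expression is my own guess at the line format based on the lines above, and the function names are hypothetical.

```python
import re

# Assumed line format, derived only from the log lines in this post:
#   <weekday> <month> <day> <time> <year> [<zone>][<module>][<level>]: <message>
LOG_LINE = re.compile(
    r"^(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) "
    r"\[(\w+)\]\[(\w+)\]\[(\w)\]: (.*)$"
)

SAMPLE_LOG = """\
Mon Aug 17 10:04:59 2015 [Z0][LCM][I]: New VM state is SHUTDOWN_UNDEPLOY
Mon Aug 17 10:05:44 2015 [Z0][LCM][I]: New VM state is CLEANUP.
Mon Aug 17 10:05:45 2015 [Z0][DiM][I]: New VM state is DONE
Mon Aug 17 10:10:08 2015 [Z0][VMM][W]: Ignored: LOG E 55 Timed out shutting down one-55
Mon Aug 17 10:10:08 2015 [Z0][VMM][W]: Ignored: SHUTDOWN FAILURE 55 Timed out shutting down one-55
"""


def parse(log_text):
    """Return (timestamp, module, level, message) tuples in file order."""
    entries = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            timestamp, _zone, module, level, message = match.groups()
            entries.append((timestamp, module, level, message))
    return entries


def messages_after_done(log_text):
    """Return every entry logged after the first 'New VM state is DONE'."""
    entries = parse(log_text)
    for index, (_ts, _module, _level, message) in enumerate(entries):
        if message.startswith("New VM state is DONE"):
            return entries[index + 1:]
    return []


for timestamp, module, level, message in messages_after_done(SAMPLE_LOG):
    print(timestamp, module, level, message)
```

Any entry it prints is something the daemon received after it already considered the VM finished.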

Hey,

I could reproduce the problem.
I created a Windows 7 VM.
Then I tried to shut the VM down (not hard).
At this point the VM had no ACPI functionality, and OpenNebula reported “STATUS: SHUTDOWN”.
Then I performed a SHUTDOWN HARD. OpenNebula said that this action was not allowed due to the wrong status. Then I removed the VM, and it disappeared from onevm list.

I logged on to the host and entered

virsh --connect qemu:///system

virsh # list
 Id    Name      State
 14    one-55    running
 29    one-33    running
 30    one-34    running
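
The same comparison can be done from the hypervisor side: take the running one-<id> domains from virsh list and check them against the VM IDs OpenNebula still knows about. Below is a minimal Python sketch under stated assumptions. The helper names are invented, the text is parsed by whitespace, and the set of known IDs {33, 34} is only my assumption for the example (VM 34 does not appear in the listing earlier in this thread).

```python
import re

# Listing copied from the virsh session above.
VIRSH_LIST = """\
 Id Name State
 14 one-55 running
 29 one-33 running
 30 one-34 running
"""


def running_one_ids(virsh_text):
    """Return the numeric IDs of running domains named 'one-<id>'."""
    ids = set()
    for line in virsh_text.splitlines():
        tokens = line.split()
        # Expect at least: <domain id> <name> <state>
        if len(tokens) >= 3 and tokens[2] == "running":
            match = re.fullmatch(r"one-(\d+)", tokens[1])
            if match:
                ids.add(int(match.group(1)))
    return ids


def orphaned_domains(virsh_text, known_vm_ids):
    """Domains still running on the host that OpenNebula no longer lists."""
    return sorted(running_one_ids(virsh_text) - set(known_vm_ids))


# Hypothetical: the IDs OpenNebula still reports for this host.
known = {33, 34}
print(orphaned_domains(VIRSH_LIST, known))
```

With that assumed set, only domain one-55 is flagged as left behind on the host.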

Thanks for reporting it, we’ll try to reproduce it here and fix it:
http://dev.opennebula.org/issues/4000

Hi @FunTec,

I’ve uploaded an updated version of the fsck tool that will try to fix those ETIME values. Could you please try the commit by replacing your /usr/lib/one/ruby/onedb/fsck.rb file with the following one and then running onedb fsck?

Hey @cmartin

In the meantime I updated to version 4.14.0.

root@nebula-bross:~# onedb fsck --sqlite /var/lib/one/one.db
Version mismatch: fsck file is for version
Shared: 4.11.80, Local: 4.11.80
Current database is version
Shared: 4.11.80, Local: 4.13.85

That’s expected: the fsck for 4.12 won’t work with ONE 4.14. For your new installation, the right fsck with the bug fix is this one. Could you please give it a try and see if it fixes the ETIME values?

I restored the one.db that contained the errors, and they could be fixed.

root@nebula-bross:~# onedb fsck --sqlite /var/lib/one/one.db
Sqlite database backup stored in /var/lib/one/one.db_2015-10-12_9:34:57.bck
Use ‘onedb restore’ or copy the file back to restore the DB.

History record for VM 30 seq # 1 is not closed (etime = 0), but the VM is in state DONE
History record for VM 55 seq # 0 is not closed (etime = 0), but the VM is in state DONE
History record for VM 87 seq # 0 is not closed (etime = 0), but the VM is in state DONE

Total errors found: 3
Total errors repaired: 3
Total errors unrepaired: 0
A copy of this output was stored in /var/log/one/onedb-fsck.log
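
The condition reported in this output (a history record with etime = 0 while the VM is in state DONE) can be illustrated with a toy Python sketch. This is not the code of fsck.rb and does not touch the real database. The data structures, the function name, and the sample values (including the non-zero etime and the entries for VMs 33 and 82) are assumptions made up for the example.

```python
def unclosed_records(history, vm_states):
    """Return (vid, seq) pairs that are still open although the VM is DONE.

    history   -- iterable of (vid, seq, etime); etime == 0 means "not closed"
    vm_states -- dict mapping vid to a state name such as "DONE"
    """
    return [
        (vid, seq)
        for vid, seq, etime in history
        if etime == 0 and vm_states.get(vid) == "DONE"
    ]


# Illustrative records only: the three open ones mirror the fsck messages,
# the other two are invented to show cases that must not be flagged.
history = [
    (30, 1, 0),
    (33, 1, 0),           # open, but the VM is still active
    (55, 0, 0),
    (82, 0, 1442919512),  # closed: any non-zero etime
    (87, 0, 0),
]
vm_states = {30: "DONE", 33: "ACTIVE", 55: "DONE", 82: "DONE", 87: "DONE"}

for vid, seq in unclosed_records(history, vm_states):
    print(f"VM {vid} seq {seq}: etime = 0 but VM is DONE")
```

It flags the same three (VM, sequence) pairs that appear in the fsck messages and leaves the still-active VM and the properly closed record alone.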

Perfect. Thank you for testing it!