5.3.85 & vcenter 6.5 deployment from template


(Ciro Iriarte) #1

Hi!

I’m testing OpenNebula in my lab running vSphere 6.5. I’ve successfully imported 3 Linux templates, but at deployment time the VMs are created without disks.

I tested deploying from the template directly in vCenter, and the VM is correctly assigned a disk (copied from the template).

Has anybody seen something like this?


(Jesus Malena) #2

I have the same setup as you and I’m not having this problem. Can you provide logs, screenshots of your templates, or details of how you imported them? I had an issue with expanding disks when instantiating VMs from a linked clone template.


(Ciro Iriarte) #3

Sure, is there anything in particular that would be useful?

I converted the VM to a template in vCenter, then imported the templates into
OpenNebula using the template import feature (it connects to vCenter and
fetches the available templates it does not know yet). I clicked “Import” and nothing else.


(Ciro Iriarte) #4

instance logs:

Thu Jul  6 18:58:03 2017 [Z0][VM][I]: New state is ACTIVE
Thu Jul  6 18:58:03 2017 [Z0][VM][I]: New LCM state is PROLOG
Thu Jul  6 18:58:03 2017 [Z0][VM][I]: New LCM state is BOOT
Thu Jul  6 18:58:03 2017 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/38/deployment.0
Thu Jul  6 18:58:03 2017 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Thu Jul  6 18:58:03 2017 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Thu Jul  6 18:58:46 2017 [Z0][VMM][I]: Successfully execute virtualization driver operation: deploy.
Thu Jul  6 18:58:46 2017 [Z0][VMM][I]: Successfully execute network driver operation: post.
Thu Jul  6 18:58:46 2017 [Z0][VM][I]: New LCM state is RUNNING

Logs from scheduler:

Thu Jul  6 18:58:03 2017 [Z0][SCHED][D]: System DS 0 discarded for VM 38. It does not fulfill SCHED_DS_REQUIREMENTS.
Thu Jul  6 18:58:03 2017 [Z0][SCHED][D]: Match Making statistics:
        Number of VMs:            1
        Total time:               0s
        Total Cluster Match time: 0s
        Total Host Match time:    0.00s
        Total Host Ranking time:  0.00s
        Total DS Match time:      0.00s
        Total DS Ranking time:    0.00s

Thu Jul  6 18:58:03 2017 [Z0][SCHED][D]: Scheduling Results:
Virtual Machine: 38

        PRI     ID - HOSTS
        ------------------------
        -1      0

        PRI     ID - DATASTORES
        ------------------------
        1       106
        1       104
        0.00868814      100
        0.00849336      102


Thu Jul  6 18:58:03 2017 [Z0][SCHED][D]: Dispatching VMs to hosts:
        VMID    Host    System DS
        -------------------------
        38      0       106

Events from vcenter related to the VM:

CreatedTime         UserName                    FullFormattedMessage
-----------         --------                    --------------------
06/07/2017 18:59:20 User                        Warning message on one-38-test04 on bigiron.lan in Dpto: No operating system was found. If you have an operating system installation di...
06/07/2017 18:58:47                             Alarm 'vSphere HA virtual machine failover failed' on one-38-test04 changed from Gray to Green
06/07/2017 18:58:46 VSPHERE.LOCAL\Administrator Reconfigured one-38-test04 on bigiron.lan in Dpto.  ...
06/07/2017 18:58:46 VSPHERE.LOCAL\Administrator Task: Reconfigure virtual machine
06/07/2017 18:58:46 VSPHERE.LOCAL\Administrator one-38-test04 on  bigiron.lan in Dpto is powered on
06/07/2017 18:58:46 VSPHERE.LOCAL\Administrator one-38-test04 on host bigiron.lan in Dpto is starting
06/07/2017 18:58:46 VSPHERE.LOCAL\Administrator Task: Power On virtual machine
06/07/2017 18:58:46 VSPHERE.LOCAL\Administrator Reconfigured one-38-test04 on bigiron.lan in Dpto.  ...
06/07/2017 18:58:46 VSPHERE.LOCAL\Administrator Task: Reconfigure virtual machine
06/07/2017 18:58:45 VSPHERE.LOCAL\Administrator Reconfigured one-38-test04 on bigiron.lan in Dpto.  ...
06/07/2017 18:58:45 VSPHERE.LOCAL\Administrator Task: Reconfigure virtual machine
06/07/2017 18:58:45 VSPHERE.LOCAL\Administrator Template opensuse-leap-42-x86_64 deployed on host bigiron.lan
06/07/2017 18:58:45 VSPHERE.LOCAL\Administrator Reconfigured one-38-test04 on bigiron.lan in Dpto.  ...
06/07/2017 18:58:04 VSPHERE.LOCAL\Administrator Deploying one-38-test04 on host bigiron.lan in Dpto from template opensuse-leap-42-x86_64

(Ciro Iriarte) #5

Looking at the full detail, ONE is deleting the disk after a successful template deployment. The setup is pretty much stock; did you set up anything special to avoid that operation?

06/07/2017 18:59:20 Warning message on one-38-test04 on bigiron.lan in Dpto: No operating system was found. If you have an operating system installation disc, you can insert the disc into the system's CD-ROM drive and restart the virtual machine.

06/07/2017 18:58:47 Alarm 'vSphere HA virtual machine failover failed' on one-38-test04 changed from Gray to Green
06/07/2017 18:58:46 Reconfigured one-38-test04 on bigiron.lan in Dpto.

Modified:

config.tools.toolsVersion: 10272 -> 0;

config.tools.toolsInstallType: "guestToolsTypeOpenVMTools" -> "guestToolsTypeUnknown";

config.extraConfig("vmware.tools.internalversion").value: "10272" -> "0";

config.extraConfig("opennebula.vm.running").value: "no" -> "yes";

 Added:

 Deleted:


06/07/2017 18:58:46 Task: Reconfigure virtual machine
06/07/2017 18:58:46 one-38-test04 on  bigiron.lan in Dpto is powered on
06/07/2017 18:58:46 one-38-test04 on host bigiron.lan in Dpto is starting
06/07/2017 18:58:46 Task: Power On virtual machine
06/07/2017 18:58:46 Reconfigured one-38-test04 on bigiron.lan in Dpto.

Modified:

config.hardware.device(100).device: (500, 12000, 1000, 15000) -> (500, 12000, 1000, 15000, 4000);

 Added:

config.hardware.device(4000): (dynamicProperty = <unset>, key = 4000, deviceInfo = (label = "Network adapter 1", summary = "test"), backing = (deviceName = "test", useAutoDetect = false, network = 'vim.Network:07f559bb-961b-4a59-b54e-5b0b889ebd42:network-291', inPassthroughMode = <unset>), connectable = (startConnected = true, allowGuestControl = true, connected = false, status = "untried"), slotInfo = null, controllerKey = 100, unitNumber = 7, addressType = "manual", macAddress = "02:00:c0:a8:68:64", wakeOnLanEnabled = true, resourceAllocation = (reservation = 0, share = (shares = 50, level = "normal"), limit = -1), externalId = <unset>, uptCompatibilityEnabled = true);

config.extraConfig("remotedisplay.vnc.port"): (key = "remotedisplay.vnc.port", value = "5938");

config.extraConfig("guestinfo.opennebula.context"): (key = "guestinfo.opennebula.context", value = "IyBDb250ZXh0IHZhcmlhYmxlcyBnZW5lcmF0ZWQgYnkgT3Blbk5lYnVsYQpE
SVNLX0lEPScwJwpFVEgwX0NPTlRFWFRfRk9SQ0VfSVBWND0nJwpFVEgwX0RO
Uz0nMTkyLjE2OC4xMDQuMScKRVRIMF9HQVRFV0FZPScxOTIuMTY4LjEwNC4x
JwpFVEgwX0dBVEVXQVk2PScnCkVUSDBfSVA9JzE5Mi4xNjguMTA0LjEwMCcK
RVRIMF9JUDY9JycKRVRIMF9JUDZfUFJFRklYX0xFTkdUSD0nJwpFVEgwX0lQ
Nl9VTEE9JycKRVRIMF9NQUM9JzAyOjAwOmMwOmE4OjY4OjY0JwpFVEgwX01B
U0s9JzI1NS4yNTUuMjU1LjAnCkVUSDBfTVRVPScnCkVUSDBfTkVUV09SSz0n
MTkyLjE2OC4xMDQuMCcKRVRIMF9TRUFSQ0hfRE9NQUlOPScnCkVUSDBfVkxB
Tl9JRD0nMzA0JwpFVEgwX1ZST1VURVJfSVA9JycKRVRIMF9WUk9VVEVSX0lQ
Nj0nJwpFVEgwX1ZST1VURVJfTUFOQUdFTUVOVD0nJwpORVRXT1JLPSdZRVMn
ClNTSF9QVUJMSUNfS0VZPScnClRBUkdFVD0naGRhJwo=
");

config.extraConfig("remotedisplay.vnc.ip"): (key = "remotedisplay.vnc.ip", value = "0.0.0.0");

config.extraConfig("remotedisplay.vnc.enabled"): (key = "remotedisplay.vnc.enabled", value = "TRUE");

 Deleted:


06/07/2017 18:58:46 Task: Reconfigure virtual machine
06/07/2017 18:58:45 Reconfigured one-38-test04 on bigiron.lan in Dpto.

Modified:

config.hardware.device(1000).device: (2000) -> ();

 Added:

 Deleted:

config.hardware.device(2000): (key = 2000, deviceInfo = (label = "Hard disk 1", summary = "31,457,280 KB"), backing = (fileName = "ds:///vmfs/volumes/37826dad-08db7a3d/one-38-test04/one-38-test04.vmdk", datastore = 'vim.Datastore:07f559bb-961b-4a59-b54e-5b0b889ebd42:datastore-201', backingObjectId = "", diskMode = "persistent", split = false, writeThrough = false, thinProvisioned = true, eagerlyScrub = <unset>, uuid = "6000C298-5f08-486c-e9d4-051d8ceafce1", contentId = "daf62cf580302423edbcaff9791d865b", changeId = <unset>, parent = null, deltaDiskFormat = <unset>, digestEnabled = false, deltaGrainSize = <unset>, deltaDiskFormatVariant = <unset>, sharing = "sharingNone", keyId = null), connectable = null, slotInfo = null, controllerKey = 1000, unitNumber = 0, capacityInKB = 31457280, capacityInBytes = 32212254720, shares = (shares = 1000, level = "normal"), storageIOAllocation = (limit = -1, shares = (shares = 1000, level = "normal"), reservation = 0), diskObjectId = "82-2000", vFlashCacheConfigInfo = null, iofilter = <unset>, vDiskId = null);


06/07/2017 18:58:45 Task: Reconfigure virtual machine
06/07/2017 18:58:45 Template opensuse-leap-42-x86_64 deployed on host bigiron.lan
06/07/2017 18:58:45 Reconfigured one-38-test04 on bigiron.lan in Dpto.

Modified:

 Added:

config.extraConfig("opennebula.vm.running"): (key = "opennebula.vm.running", value = "no");

 Deleted:


06/07/2017 18:58:04 Deploying one-38-test04 on host bigiron.lan in Dpto from template opensuse-leap-42-x86_64
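
The guestinfo.opennebula.context value in the reconfigure event above is base64 text, so it can be decoded to check what was handed to the guest. Below is a minimal Python sketch, assuming the decoded payload consists of simple KEY='value' lines (which is how this one starts: its first line decodes to "# Context variables generated by OpenNebula"). The function name decode_context is made up for illustration and is not part of OpenNebula.

```python
import base64


def decode_context(payload: str) -> dict:
    """Decode a base64 context blob into a dict of KEY -> value.

    Assumes simple KEY='value' lines; comments and malformed lines are skipped.
    """
    # Drop line breaks and spaces introduced by log wrapping before decoding.
    compact = "".join(payload.split())
    text = base64.b64decode(compact).decode("utf-8")
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        variables[key] = value.strip("'")
    return variables


if __name__ == "__main__":
    # Sample built here for the demo; paste the value from the event instead.
    sample = base64.b64encode(
        b"# demo\nETH0_IP='192.168.104.100'\nNETWORK='YES'\n"
    ).decode()
    print(decode_context(sample))
```

Pasting the multi-line value from the event into decode_context should work as-is, since whitespace is removed before decoding.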

(Tino Vazquez) #6

Could you please check with the recently published RC?

https://opennebula.org/software/


(Ciro Iriarte) #7

Updated to 5.3.90, now it’s not even deployed:

Tue Jul 11 11:38:18 2017 [Z0][VM][I]: New state is ACTIVE
Tue Jul 11 11:38:18 2017 [Z0][VM][I]: New LCM state is PROLOG
Tue Jul 11 11:38:18 2017 [Z0][VM][I]: New LCM state is BOOT
Tue Jul 11 11:38:18 2017 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/41/deployment.0
Tue Jul 11 11:38:18 2017 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Jul 11 11:38:18 2017 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue Jul 11 11:38:19 2017 [Z0][VMM][I]: Command execution fail: /var/lib/one/remotes/vmm/vcenter/deploy '/var/lib/one/vms/41/deployment.0' 'clu-home_[vcsa.lan-Dpto]_433637b63c70' 41 clu-home_[vcsa.lan-Dpto]_433637b63c70
Tue Jul 11 11:38:19 2017 [Z0][VMM][E]: Deploy of VM 41 on vCenter cluster clu-home_[vcsa.lan-Dpto]_433637b63c70 with /var/lib/one/vms/41/deployment.0 failed due to "Cannot clone VM Template: CannotAccessFile: Unable to access file ds:///vmfs/volumes/3b5071da-65c11ff9/one-41-test90
Tue Jul 11 11:38:19 2017 [Z0][VMM][E]: ["/usr/lib/one/ruby/vendors/rbvmomi/lib/rbvmomi/vim/Task.rb:11:in `wait_for_completion'", "/usr/lib/one/ruby/vcenter_driver/virtual_machine.rb:1123:in `clone_vm'", "/var/lib/one/remotes/vmm/vcenter/deploy:61:in `<main>'"]"
Tue Jul 11 11:38:19 2017 [Z0][VMM][E]: ["/usr/lib/one/ruby/vcenter_driver/virtual_machine.rb:1126:in `rescue in clone_vm'", "/usr/lib/one/ruby/vcenter_driver/virtual_machine.rb:1119:in `clone_vm'", "/var/lib/one/remotes/vmm/vcenter/deploy:61:in `<main>'"]
Tue Jul 11 11:38:19 2017 [Z0][VMM][I]: ExitCode: 255
Tue Jul 11 11:38:19 2017 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Tue Jul 11 11:38:19 2017 [Z0][VMM][E]: Error deploying virtual machine: Deploy of VM 41 on vCenter cluster clu-home_[vcsa.lan-Dpto]_433637b63c70 with /var/lib/one/vms/41/deployment.0 failed due to "Cannot clone VM Template: CannotAccessFile: Unable to access file ds:///vmfs/volumes/3b5071da-65c11ff9/one-41-test90
Tue Jul 11 11:38:19 2017 [Z0][VM][I]: New LCM state is BOOT_FAILURE

(Tino Vazquez) #8

Could you send us the “onevm show -x 41” output?


(Ciro Iriarte) #9

Here you go:

onevm-41.txt (5.2 KB)

Other two tests:

onevm-43.txt (6.4 KB)

onevm-44.txt (6.3 KB)


(Tino Vazquez) #10

Thanks for the files. From the output we see that no new disks are added to the VM Template, so the cloning of the VM disks is performed by vCenter.

The error message indicates that the clone operation cannot create a folder in the specified datastore (we only have the uuid, which is “3b5071da-65c11ff9”).

In order to debug this, we need to find out:

  • If the user that OpenNebula uses to interface with vCenter can write into that datastore. The most comprehensive test would be to instantiate the same VM Template into the same datastore.

  • Is there any error message in the vCenter or ESX logs?


(Ciro Iriarte) #11

This being a home lab and me being lazy, I used Administrator@vsphere.local, which should have access to everything.

According to the logs, I see a clone operation, but there’s not much detail about it:

12/07/2017 12:28:24 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:28:23, number of API invocations: 4, user agent: Ruby)
12/07/2017 12:28:23 Task: Clone virtual machine
12/07/2017 12:28:23 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:28:23 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:28:23, number of API invocations: 2, user agent: Ruby)
12/07/2017 12:28:23 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:27:46 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:27:43, number of API invocations: 10, user agent: Ruby)
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:27:43, number of API invocations: 2, user agent: Ruby)
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:27:43, number of API invocations: 2, user agent: Ruby)
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:27:43, number of API invocations: 2, user agent: Ruby)
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:27:43, number of API invocations: 2, user agent: Ruby)
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:27:43, number of API invocations: 2, user agent: Ruby)
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged in as Ruby
12/07/2017 12:27:43 User VSPHERE.LOCAL\Administrator@10.2.0.205 logged out (login time: Wednesday, 12 July, 2017 16:27:42, number of API invocations: 2, user agent: Ruby)

PowerCLI C:\> $entity = Get-Template -Name ubuntu-16.04-x86_64
PowerCLI C:\> Get-VIEvent -Entity $entity -MaxSamples 1000 | %{Write-Host $_.CreatedTime $_.FullFormattedMessage}
12/07/2017 12:28:23 Task: Clone virtual machine
12/07/2017 12:24:53 Task: Clone virtual machine
12/07/2017 12:23:53 Task: Clone virtual machine
11/07/2017 11:53:50 Task: Clone virtual machine
11/07/2017 11:46:49 Task: Clone virtual machine
11/07/2017 11:38:18 Task: Clone virtual machine

Also, checking the VM list, there’s no new VM after the process ends. The lab has two repositories, one read-only for ISO files and another for VMs (same NFS server, with enough space).

I also tried to deploy a VM from the template using vCenter, into the same “Templates” folder, and it worked without issues.


(Ciro Iriarte) #12

Hmm, the ID you mentioned is the read-only ISO NFS share:

ds:///vmfs/volumes/3b5071da-65c11ff9/

The VM datastore should be:

ds:///vmfs/volumes/37826dad-08db7a3d/

I don’t see an option to choose where to deploy the VM (host or datastore); I can only choose the network, and that seems to determine the host/cluster to use.
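
For anyone debugging the same error, the datastore ID can be pulled out of the ds:/// path in the log line and compared against known datastores. This is a small Python sketch; the labels in KNOWN_DATASTORES are the ones identified in this thread, and the helper name datastore_ids is hypothetical.

```python
import re

# Labels taken from this thread; on another setup they would differ.
KNOWN_DATASTORES = {
    "3b5071da-65c11ff9": "read-only ISO NFS share",
    "37826dad-08db7a3d": "VM datastore",
}

# Matches the ID component of paths like ds:///vmfs/volumes/<id>/...
DS_PATH = re.compile(r"ds:///vmfs/volumes/([0-9a-fA-F-]+)/")


def datastore_ids(log_text: str) -> list:
    """Return the distinct datastore IDs referenced in ds:/// paths, in order."""
    seen = []
    for match in DS_PATH.finditer(log_text):
        ds_id = match.group(1)
        if ds_id not in seen:
            seen.append(ds_id)
    return seen


if __name__ == "__main__":
    line = ("Cannot clone VM Template: CannotAccessFile: Unable to access file "
            "ds:///vmfs/volumes/3b5071da-65c11ff9/one-41-test90")
    for ds_id in datastore_ids(line):
        print(ds_id, "->", KNOWN_DATASTORES.get(ds_id, "unknown"))
```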


(Ciro Iriarte) #13

OK, per the documentation, each DS is imported with IMAGE and SYSTEM instances (two copies for each real DS).

If I disable the SYSTEM instance of the read-only ISO datastore, the VM is successfully deployed on the correct (only other?) datastore:

  • Is this expected?
  • What’s the correct/expected way to proceed/set up?
  • Should I disable/delete the SYSTEM instance for all the RO datastores?
  • If I have N datastores, should I expect the VM DS to be randomly chosen?

(Tino Vazquez) #14

Yes, the read-only SYSTEM DS should be disabled in OpenNebula. We added a note to the documentation and will address this in a more automatic way in future releases:

https://dev.opennebula.org/issues/5243

In the meantime, please disable the SYSTEM DS for all readonly datastores.

The scheduler chooses the optimal DS for a VM following predefined scheduling policies that can be configured:

http://docs.opennebula.org/5.2/operation/host_cluster_management/scheduler.html#pre-defined-storage-policies
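
As a rough aid for the workaround above, here is a Python sketch that builds the `onedatastore disable` commands for the SYSTEM copies of read-only datastores. It only prints the commands (a dry run). The inventory shape, the "vcenter_name" key and all IDs and names are hypothetical; which datastores are read-only has to be supplied by the operator, since nothing here detects it.

```python
def disable_commands(datastores, read_only_names):
    """Build `onedatastore disable` commands for SYSTEM datastores on read-only storage.

    datastores: iterable of dicts with "id", "vcenter_name" and "type" keys
                (a hypothetical shape chosen for this sketch).
    read_only_names: vCenter datastore names the operator knows to be read-only.
    """
    commands = []
    for ds in datastores:
        if ds["type"] == "SYSTEM" and ds["vcenter_name"] in read_only_names:
            commands.append(["onedatastore", "disable", str(ds["id"])])
    return commands


if __name__ == "__main__":
    # Hypothetical inventory; real IDs and names come from `onedatastore list`.
    inventory = [
        {"id": 200, "vcenter_name": "iso-nfs", "type": "IMAGE"},
        {"id": 201, "vcenter_name": "iso-nfs", "type": "SYSTEM"},
        {"id": 202, "vcenter_name": "vm-nfs", "type": "IMAGE"},
        {"id": 203, "vcenter_name": "vm-nfs", "type": "SYSTEM"},
    ]
    for cmd in disable_commands(inventory, {"iso-nfs"}):
        print(" ".join(cmd))  # dry run: print instead of executing
```

Only the SYSTEM copy is targeted, so the IMAGE copy of the ISO datastore stays usable.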


(Ciro Iriarte) #15

Thanks Tino, I’ll keep testing. Can I assume that, given two clusters with
free resources, if I create a VM connected to network “red” and the
network is available in both clusters, the VM can end up in either
of them?

I know it’s not related to the topic, but how can you select arbitrary locations
(EC2/outside, or site A vs. site B)?


(Tino Vazquez) #16

That is correct. The OpenNebula scheduler first filters out all the hosts that don’t meet the requirements of a particular VM, such as the availability of the referenced networks.

In order to select placement you can check the Scheduling tab of the VM Template update dialog, which fills the SCHED_REQUIREMENTS and SCHED_RANK attributes to instruct the OpenNebula match making algorithm. You can construct boolean expressions to filter and prioritize hosts. More info: http://docs.opennebula.org/5.2/operation/references/template.html#template-placement-section
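
The filter-then-rank idea described above can be modelled in a few lines of Python. This is only a conceptual sketch, not the OpenNebula scheduler's code or expression syntax: the host records, attribute names and the place function are all hypothetical, with a predicate standing in for SCHED_REQUIREMENTS and a key function standing in for SCHED_RANK.

```python
def place(hosts, requirement, rank):
    """Filter hosts with `requirement`, then order survivors by `rank`, highest first."""
    candidates = [h for h in hosts if requirement(h)]
    return sorted(candidates, key=rank, reverse=True)


if __name__ == "__main__":
    # Hypothetical hosts; attribute names are illustrative only.
    hosts = [
        {"name": "site-a-1", "site": "A", "free_cpu": 200, "networks": {"red"}},
        {"name": "site-b-1", "site": "B", "free_cpu": 600, "networks": {"red"}},
        {"name": "site-b-2", "site": "B", "free_cpu": 400, "networks": {"blue"}},
    ]
    ordered = place(
        hosts,
        # Stand-in for a requirements expression: network "red" and site B.
        requirement=lambda h: "red" in h["networks"] and h["site"] == "B",
        # Stand-in for a rank expression: prefer the host with more free CPU.
        rank=lambda h: h["free_cpu"],
    )
    print([h["name"] for h in ordered])
```

Dropping the site condition from the predicate leaves both hosts with network "red" as candidates, which mirrors the two-cluster case asked about earlier.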