LXD hypervisor on 'Red Hat Virtualization' virtual machines: container instantiation FAILURE

We are trying to deploy OpenNebula with the LXD hypervisor on VMs running on our corporate Red Hat Virtualization platform, but we cannot instantiate containers from the Marketplace.

- Platform: Red Hat Virtualization
  * Hypervisors: Red Hat Enterprise Linux release 7.4
  * Manager: Red Hat Virtualization Manager 4.1
- Test environment: 1 OpenNebula frontend VM + 1 OpenNebula host VM


Versions of the related components and OS (frontend, hypervisors, VMs):

  • Red Hat Virtualization VMs installed with default parameters (apart from disks, NICs, etc.)
  • Frontend and host OS: Ubuntu 18.04
  • OpenNebula: 5.8.1
  • OpenNebula hypervisor: LXD

Steps to reproduce:

  1. Downloaded ‘ubuntu_xenial - LXD’ from the Marketplace
  2. Instantiated a container based on the ‘ubuntu_xenial - LXD’ template
  3. Status: FAILURE
    Note: the same result applies to any app
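
For reference, the same steps can be reproduced from the CLI (a sketch; app ID 69 and the names below are taken from our environment and may differ elsewhere):

$ onemarketapp list | grep -i xenial                      # find the Marketplace app ID
$ onemarketapp export 69 'ubuntu_xenial - LXD' --datastore default
$ onetemplate instantiate 'ubuntu_xenial - LXD'           # the VM ends in the same FAILURE state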

Current results:

  1. Log copied from the browser ‘Log’ tab (Sunstone):
Tue Jun 11 10:40:32 2019 [Z0][VM][I]: New state is ACTIVE
Tue Jun 11 10:40:32 2019 [Z0][VM][I]: New LCM state is PROLOG
Tue Jun 11 10:40:39 2019 [Z0][VM][I]: New LCM state is BOOT
Tue Jun 11 10:40:39 2019 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/25/deployment.0
Tue Jun 11 10:43:10 2019 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Jun 11 10:43:10 2019 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy '/var/lib/one//datastores/0/25/deployment.0' 'host1' 25 host1
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Using raw filesystem mapper for /var/lib/one/datastores/0/25/disk.0
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Mapping disk at /var/lib/lxd/storage-pools/default/containers/one-25/rootfs using device /dev/loop14
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Mounting /dev/loop14 at /var/lib/lxd/storage-pools/default/containers/one-25/rootfs
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Mapping disk at /var/lib/one/datastores/0/25/mapper/disk.1 using device /dev/loop28
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Mounting /dev/loop28 at /var/lib/one/datastores/0/25/mapper/disk.1
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Name: one-25
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Remote: unix://
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Architecture: x86_64
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Created: 2019/06/11 08:43 UTC
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Status: Stopped
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Type: persistent
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Profiles: default
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]:
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Log:
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]:
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: lxc one-25 20190611084314.867 WARN conf - conf.c:lxc_setup_devpts:1616 - Invalid argument - Failed to unmount old devpts instance
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: lxc one-25 20190611084314.896 ERROR start - start.c:start:2028 - No such file or directory - Failed to exec "/sbin/init"
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: lxc one-25 20190611084314.896 ERROR sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 7)
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: lxc one-25 20190611084314.896 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:842 - Received container state "ABORTING" instead of "RUNNING"
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: lxc one-25 20190611084314.897 ERROR start - start.c:__lxc_start:1939 - Failed to spawn container "one-25"
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: lxc 20190611084314.898 WARN commands - commands.c:lxc_cmd_rsp_recv:132 - Connection reset by peer - Failed to receive response for command "get_state"
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]:
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Using raw filesystem mapper for /var/lib/one/datastores/0/25/disk.0
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Unmapping disk at /var/lib/lxd/storage-pools/default/containers/one-25/rootfs
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Umounting disk mapped at /dev/loop14
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Unmapping disk at /var/lib/one/datastores/0/25/mapper/disk.1
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: deploy: Umounting disk mapped at /dev/loop28
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: /var/tmp/one/vmm/lxd/client.rb:101:in `wait': {"type"=>"sync", "status"=>"Success", "status_code"=>200, "operation"=>"", "error_code"=>0, "error"=>"", "metadata"=>{"id"=>"43589008-a585-4775-9c58-75750c9acf53", "class"=>"task", "description"=>"Starting container", "created_at"=>"2019-06-11T10:43:14.615529499+02:00", "updated_at"=>"2019-06-11T10:43:14.615529499+02:00", "status"=>"Failure", "status_code"=>400, "resources"=>{"containers"=>["/1.0/containers/one-25"]}, "metadata"=>nil, "may_cancel"=>false, "err"=>"Failed to run: /usr/lib/lxd/lxd forkstart one-25 /var/lib/lxd/containers /var/log/lxd/one-25/lxc.conf: "}} (LXDError)
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:429:in `wait?'
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:441:in `change_state'
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:184:in `start'
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/deploy:75:in `<main>'
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: ExitCode: 1
Tue Jun 11 10:43:17 2019 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Tue Jun 11 10:43:17 2019 [Z0][VMM][E]: Error deploying virtual machine
Tue Jun 11 10:43:17 2019 [Z0][VM][I]: New LCM state is BOOT_FAILURE
  2. LXC log (/var/log/lxd/one-25/lxc.log):
lxc one-25 20190611084314.867 WARN     conf - conf.c:lxc_setup_devpts:1616 - Invalid argument - Failed to unmount old devpts instance
lxc one-25 20190611084314.896 ERROR    start - start.c:start:2028 - No such file or directory - Failed to exec "/sbin/init"
lxc one-25 20190611084314.896 ERROR    sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 7)
lxc one-25 20190611084314.896 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:842 - Received container state "ABORTING" instead of "RUNNING"
lxc one-25 20190611084314.897 ERROR    start - start.c:__lxc_start:1939 - Failed to spawn container "one-25"
lxc 20190611084314.898 WARN     commands - commands.c:lxc_cmd_rsp_recv:132 - Connection reset by peer - Failed to receive response for command "get_state"
  3. /var/log/lxd/one-25/{console,forkstart}.log are empty

Expected results:

In my personal environment, based on VirtualBox 6 with the same configuration, all containers are instantiated successfully and reach the ‘RUNNING’ state.

So we would like to know whether Red Hat Virtualization VMs (KVM-based) are compatible, whether they need some tweaking of the VM creation parameters, or something else.

Thanks

LXD virtualization nodes are only supported on Ubuntu distros.

Sorry, I probably haven’t been very clear. Let me clarify.
Red Hat Virtualization is a platform similar to VMware vSphere.
We installed OpenNebula on two virtual machines on this platform, following all the instructions in detail (the information about the Red Hat Virtualization platform was reported only for completeness; I think it can be ignored in this context).
So we have a system of two VMs in which we have installed OpenNebula with the LXD hypervisor.
The errors and logs refer to containers within these VMs.

Furthermore, as I said, I installed a similar system on two VMs based on Oracle VirtualBox 6, and everything worked there without the errors reported above.
I hope I was clearer.
Bye

The issue here, then, is:

lxc one-25 20190611084314.896 ERROR start - start.c:start:2028 - No such file or directory - Failed to exec "/sbin/init"

The import procedure from the marketplace failed somehow; you need to look for the image on the datastore after it is imported and check its filesystem.

You can mount /var/lib/one/datastores/<image_datastore_id_where_the_marketplace_container_image_has_been_imported_into> and look for unusual things with the image file. Maybe paste a listing of the container rootfs and we’ll get more info.
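
For example (a sketch with placeholder values; the actual SOURCE path is printed by oneimage show):

$ oneimage show <image_id> | grep SOURCE     # locate the imported image file
$ sudo mount <source_path> /mnt              # the image should be a raw ext4 filesystem
$ ls -al /mnt                                # an intact rootfs contains /sbin/init, /etc, /usr, ...
$ sudo umount /mnt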

For reference, https://github.com/OpenNebula/one/issues/3049 leads to other marketplace issues where you can get more ideas/solutions/headaches.

Hi, I followed your directions and executed these commands (hopefully correctly):

$ oneimage list

  ID USER       GROUP      NAME            DATASTORE     SIZE TYPE PER STAT RVMS
  12 oneadmin   oneadmin   ubuntu_xenial - default      1024M OS    No used    1

$ oneimage show 12

IMAGE 12 INFORMATION                                                            
ID             : 12                  
NAME           : ubuntu_xenial - LXD 
USER           : oneadmin            
GROUP          : oneadmin            
LOCK           : None                
DATASTORE      : default             
TYPE           : OS                  
REGISTER TIME  : 06/11 10:32:43      
PERSISTENT     : No                  
SOURCE         : /var/lib/one//datastores/1/29df886c8038daf60c1c9c2ccb56c55d
PATH           : "lxd://https://uk.images.linuxcontainers.org//images/ubuntu/xenial/amd64/default/./20190610_07:42/rootfs.tar.xz?size=1024&filesystem=ext4&format=raw"
SIZE           : 1024M               
STATE          : used                
RUNNING_VMS    : 1                   

PERMISSIONS                                                                     
OWNER          : um-                 
GROUP          : ---                 
OTHER          : ---                 

IMAGE TEMPLATE                                                                  
DEV_PREFIX="hd"
DRIVER="raw"
FORMAT="raw"
FROM_APP="69"
FROM_APP_MD5="4997da90d6528e69191fd3dffbd5c600"
FROM_APP_NAME="ubuntu_xenial - LXD"

VIRTUAL MACHINES

    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME
    25 oneadmin oneadmin ubuntu_xenial - fail    0      0K host1        2d 01h38

$ sudo mount /var/lib/one//datastores/1/29df886c8038daf60c1c9c2ccb56c55d /mnt

$ ls -al /mnt/

total 24
drwxr-xr-x  3 oneadmin oneadmin  4096 giu 11 10:37 .
drwxr-xr-x 24 root     root      4096 giu  6 06:47 ..
drwx------  2 root     root     16384 giu 11 10:37 lost+found

So it seems the image is empty.
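
For completeness, the image file itself can also be inspected (a sketch; the path is the SOURCE value from oneimage show above):

$ sudo file /var/lib/one//datastores/1/29df886c8038daf60c1c9c2ccb56c55d    # should report an ext4 filesystem image
$ df -h /mnt    # while still mounted: if almost nothing is used, the rootfs tarball was never unpacked into the image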
I forgot to mention that we are behind a corporate proxy, configured in OpenNebula in this way:

In "/etc/one/oned.conf":
...
MARKET_MAD = [
    EXECUTABLE = "one_market",
    ARGUMENTS  = "-t 15 -m http,s3,one,linuxcontainers --proxy http://10.0.255.2:80"
]
...

In "/etc/one/sunstone-server.conf":
...
:proxy: http://10.0.255.2:80
...
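
(A reminder, assuming the standard service names of an Ubuntu 18.04 / OpenNebula 5.8 install: after changing these files the services must be restarted for the proxy settings to take effect.)

$ sudo systemctl restart opennebula opennebula-sunstone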

I don’t know if this can affect the downloading of images; otherwise the proxy seems to work, since I successfully downloaded the following URL through a browser using the same proxy:

https://uk.images.linuxcontainers.org//images/ubuntu/xenial/amd64/default/./20190610_07:42/rootfs.tar.xz?size=1024&filesystem=ext4&format=raw
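
One way to rule the proxy in or out from the frontend itself (a sketch; it assumes curl is installed and that it honours the standard http_proxy/https_proxy environment variables):

$ sudo -u oneadmin env http_proxy=http://10.0.255.2:80 https_proxy=http://10.0.255.2:80 \
      curl -fSL -o /tmp/rootfs.tar.xz \
      'https://uk.images.linuxcontainers.org//images/ubuntu/xenial/amd64/default/./20190610_07:42/rootfs.tar.xz'
$ ls -lh /tmp/rootfs.tar.xz    # should be a non-empty tarball, not zero bytes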

bye

Mmm, it could be possible. Can you open a GitHub issue if this is the case, i.e. the importer failing when a proxy is active?

Opened: https://github.com/OpenNebula/one/issues/3427
bye