Unable to create VM snapshot using Ceph

Hi everyone,

In my department, we are trying to bring up an OpenNebula environment with Ceph storage to handle our virtualization needs. We have already installed and configured the frontend and attached a couple of nodes.

Most of the features are working as expected, but we have struggled with snapshots and haven’t been able to make them work. We have read a lot about how to integrate Ceph with OpenNebula, but in the end we decided to ask here for help.

So, here is the problem: when I try to create a snapshot of a persistent VM, the following error message comes up:

Tue Mar 19 10:01:23 2019 [Z0][VM][I]: New LCM state is HOTPLUG_SNAPSHOT
Tue Mar 19 10:01:23 2019 [Z0][VMM][I]: Command execution failed (exit code: 255): 'if [ -x "/var/tmp/one/vmm/kvm/snapshot_create" ]; then /var/tmp/one/vmm/kvm/snapshot_create one-13 0 13 nebula-202; else exit 42; fi'
Tue Mar 19 10:01:23 2019 [Z0][VMM][I]: error: unsupported configuration: internal snapshot for disk vda unsupported for storage type raw
Tue Mar 19 10:01:23 2019 [Z0][VMM][E]: Could not create snapshot for domain one-13.
Tue Mar 19 10:01:23 2019 [Z0][VMM][E]: Error creating new VM Snapshot: Could not create snapshot for domain one-13.
Tue Mar 19 10:01:23 2019 [Z0][VM][I]: New LCM state is RUNNING
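
For anyone scanning a longer log for the same failure, here is a minimal sketch that pulls the disk name and storage type out of the libvirt error line above. The function name `parse_snapshot_error` is hypothetical, and the pattern is based only on the one message shown in this log, so other libvirt versions may word it differently.

```python
import re

# Pattern taken from the single error line in the log above; the exact
# wording in other libvirt versions is an assumption.
SNAPSHOT_ERROR = re.compile(
    r"internal snapshot for disk (\S+) unsupported for storage type (\S+)"
)

def parse_snapshot_error(line):
    """Return (disk, storage_type) if the line is the libvirt error, else None."""
    match = SNAPSHOT_ERROR.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Applied to the third log line, this should identify disk vda and storage type raw, which is what libvirt is rejecting.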

I’ve tried changing the disk format, the datastore configuration, and the VM template, but nothing works…

Here are the Image Datastore and System Datastore configurations:

Ceph System Datastore

DATASTORE 102 INFORMATION                                                       
ID             : 102                 
NAME           : ceph_system         
USER           : oneadmin            
GROUP          : oneadmin            
CLUSTERS       : 100                 
TYPE           : SYSTEM              
DS_MAD         : -                   
TM_MAD         : ceph                
BASE PATH      : /var/lib/one//datastores/102
DISK_TYPE      : RBD                 
STATE          : READY               

DATASTORE CAPACITY                                                              
TOTAL:         : 22T                 
FREE:          : 22T                 
USED:          : 11G                 
LIMIT:         : -                   

PERMISSIONS                                                                     
OWNER          : um-                 
GROUP          : u--                 
OTHER          : ---                 

DATASTORE TEMPLATE                                                              
ALLOW_ORPHANS="mixed"
BRIDGE_LIST="..."
CEPH_HOST="..."
CEPH_SECRET="..."
CEPH_USER="oneadmin"
CLUSTER="100"
DISK_TYPE="RBD"
DS_MIGRATE="NO"
POOL_NAME="one"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="ceph"
TYPE="SYSTEM_DS"

IMAGES

Ceph Image Datastore

DATASTORE 103 INFORMATION                                                       
ID             : 103                 
NAME           : ceph_image          
USER           : oneadmin            
GROUP          : oneadmin            
CLUSTERS       : 100                 
TYPE           : IMAGE               
DS_MAD         : ceph                
TM_MAD         : ceph                
BASE PATH      : /var/lib/one//datastores/103
DISK_TYPE      : RBD                 
STATE          : READY               

DATASTORE CAPACITY                                                              
TOTAL:         : 22T                 
FREE:          : 22T                 
USED:          : 11.1G               
LIMIT:         : -                   

PERMISSIONS                                                                     
OWNER          : um-                 
GROUP          : u--                 
OTHER          : ---                 

DATASTORE TEMPLATE                                                              
ALLOW_ORPHANS="mixed"
BRIDGE_LIST="..."
CEPH_HOST="..."
CEPH_SECRET="..."
CEPH_USER="oneadmin"
CLONE_TARGET="SELF"
CLONE_TARGET_SHARED="SELF"
CLONE_TARGET_SSH="SYSTEM"
DISK_TYPE="RBD"
DISK_TYPE_SHARED="RBD"
DISK_TYPE_SSH="FILE"
DRIVER="raw"
DS_MAD="ceph"
LN_TARGET="NONE"
LN_TARGET_SHARED="NONE"
LN_TARGET_SSH="SYSTEM"
POOL_NAME="one"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="ceph"
TM_MAD_SYSTEM="ssh,shared"
TYPE="IMAGE_DS"

IMAGES         
17             
18

Hi @chromamaster,

It's not possible to take a VM snapshot using Ceph. Instead, you can take a snapshot of the VM disk.

Maybe an OpenNebula developer can explain to us why this is not possible…

Hello,

Ceph RBD prefers the raw format, and that seems to be how ONe feeds Ceph.
To confirm this, I downloaded a marketplace image in qcow2 format, exported the corresponding RBD, and inspected the exported file with qemu-img info. The format of the export is actually raw.
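
As a rough cross-check of the same thing without qemu-img, a qcow2 file can be recognised by its header: it starts with the magic bytes "QFI\xfb" followed by a big-endian version number. The sketch below assumes you already have the exported file on disk; the function name `looks_like_qcow2` and the example path are hypothetical.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"  # first four bytes of a qcow/qcow2 image header

def looks_like_qcow2(path):
    """Return True if the file has a qcow2 header (magic plus version >= 2)."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return False
    # Bytes 4..7 hold the format version as a big-endian 32-bit integer.
    (version,) = struct.unpack(">I", header[4:8])
    return version >= 2

# Example (hypothetical path to an exported RBD image):
# looks_like_qcow2("/var/tmp/exported.img")
```

A file without this header is what qemu-img would treat as raw, since raw images have no header of their own.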

From the Ceph manual (QEMU and Block Devices, Ceph Documentation):

The raw data format is really the only sensible format option to use with RBD. Technically, you could use other QEMU-supported formats (such as qcow2 or vmdk), but doing so would add additional overhead, and would also render the volume unsafe for virtual machine live migration when caching (see below) is enabled.

Conjecture:
For a raw disk, qemu stores snapshots as separate qcow2 files; for a qcow2 disk, qemu appends the snapshot to the disk image itself. In ONe + Ceph, the VM disk is an RBD, so Ceph handles the snapshot process, which means there are no qcow2 snapshots of the VM. As a result only the disk gets snapshotted; the active memory and state of the VM do not.

From an end-user usability standpoint, it would be great if the VM snapshot section would at least snapshot all the disks at the same time.
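
Until something like that exists, a workaround could be scripted around the per-disk snapshot command. The sketch below assumes the `onevm disk-snapshot-create <vmid> <diskid> <name>` CLI call is available on the frontend; the helper names and the snapshot name are hypothetical. Note that this runs the snapshots one after another, so it is not an atomic snapshot of all disks, and each call may need the VM to be back in RUNNING before the next one is accepted.

```python
import subprocess

def disk_snapshot_commands(vm_id, disk_ids, name):
    """Build one 'onevm disk-snapshot-create' command per disk."""
    return [
        ["onevm", "disk-snapshot-create", str(vm_id), str(disk_id), name]
        for disk_id in disk_ids
    ]

def snapshot_all_disks(vm_id, disk_ids, name, dry_run=True):
    """Print the commands (dry run) or execute them sequentially."""
    commands = disk_snapshot_commands(vm_id, disk_ids, name)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # Assumption: the caller waits for the VM to return to RUNNING
            # between calls if OpenNebula requires it.
            subprocess.run(command, check=True)
    return commands

# Example: snapshot_all_disks(13, [0, 1], "before-upgrade")
```

The dry-run default only prints the commands, so the list can be reviewed before anything is executed against the VM.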
