Contribution: OpenNebula backup script for QCOW2 datastores


(Kristian Feldsam) #1

Hello, I just released a script for backing up QCOW2 datastores with live snapshot support.


(Kristian Feldsam) #2

Hi all, the addon was updated with new features and fixes.

Current version: 1.2.0

Changes since first version:

  • Live snapshotting: added a fallback in case the FS freeze fails
  • Fixed backup of persistent images that are in use when the VM is not in ACTIVE state
  • New option -k --insecure to use rsync with the weakest but fastest SSH encryption
  • Other rsync option adjustments to improve speed
  • New option -n --netcat to use netcat instead of rsync to transfer the main image (*.snap dirs still use rsync)
  • Extended the -i --image option to support multiple image IDs separated by commas
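
For readers wondering what the FS freeze fallback might look like, here is a minimal sketch. It is not taken from the addon: the function names, the snapshot name `backup-snap` and the injected `run` callback are hypothetical. Only the virsh flags themselves (`--disk-only`, `--atomic`, `--quiesce`, `--no-metadata`, `--diskspec`) are real `virsh snapshot-create-as` options.

```javascript
// Hypothetical sketch: try a quiesced snapshot first and retry without
// --quiesce if the guest filesystem freeze fails.

// Build the argument list for `virsh snapshot-create-as`.
function snapshotArgs(domain, disk, snapFile, quiesce) {
  const args = [
    'snapshot-create-as', domain, 'backup-snap',
    '--disk-only', '--atomic', '--no-metadata',
    '--diskspec', `${disk},file=${snapFile}`,
  ];
  if (quiesce) args.push('--quiesce');
  return args;
}

// `run` is an injected function that executes virsh with the given
// arguments and throws when the command fails.
function createLiveSnapshot(run, domain, disk, snapFile) {
  try {
    run(snapshotArgs(domain, disk, snapFile, true));
    return { quiesced: true };
  } catch (err) {
    // Assumption: any failure of the quiesced attempt triggers the fallback.
    run(snapshotArgs(domain, disk, snapFile, false));
    return { quiesced: false };
  }
}
```

In a real script, `run` could wrap `child_process.execFileSync('virsh', args)`. A more careful version would inspect the error before falling back, since the retry without `--quiesce` only gives a crash-consistent snapshot.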

(Ulrich P.) #3

Hello Kristian,

thanks a lot for this contribution. I believe this is a very important piece for the OpenNebula project.

I did some tests and found a problem with this addon in my setup.
In general I followed the standard installation instructions to install OpenNebula 5.4.10 on CentOS 7. The only difference is that I installed the Enterprise version of qemu-kvm (package qemu-kvm-ev) in order to have access to some features not present in the standard package.
Apart from this, everything follows the standard instructions.
The point is that in this setup libvirt is not enabled for TCP connections (I get a "connection refused" when your script tries to create the snapshot by running virsh). Enabling libvirt for TCP with proper authentication would add some complexity to my setup.

Question:
I understand that the virsh command is executed locally on the remote KVM node anyway. You are using
virsh -c qemu+tcp://localhost/system … for snapshot and blockcommit.
And I found out that
virsh -c qemu:///system … works in my setup without any other changes.
Do you see a possibility to change the virsh command in one-image-backup.js to virsh -c qemu:///system?

Thanks and Best Regards
Uli


(Kristian Feldsam) #4

Hello @uli, thank you for the praise. I forgot to mention this in the docs. I am using a tuned KVM driver. Have a look at the KVM Driver docs - Tuning & Extending

EDIT: Of course, I can add it to config.


(Ulrich P.) #5

Ok, thanks. That link to the Tuning Guide makes it very clear, although I have some security-related doubts about this setup (non-authenticated and non-encrypted). Network isolation would have to be bulletproof. Just my opinion…

Anyway I would appreciate it if you can add it to the config. Keep up the good work!


(Kristian Feldsam) #6

Hi all, I just released version 1.4.1.

Changes since version 1.2.0

New features:

  • new config option libvirtUri to configure a custom hypervisor connection URI
  • new option -S --start-image <image_id> to set the image ID to start the backup from

Other changes:

  • rsync now uses the inplace option because we use custom tmp files
  • images backup info in OpenNebula is no longer updated when dry run is used
  • removed the netcat option from the backup.sh script due to the experimental nature of that feature
  • main images are backed up to a tmp file, which replaces the original one after the copy finishes
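
As an illustration of the libvirtUri option, a configurable connection URI could be wired into the virsh calls roughly like this. This is a sketch, not the addon's code: the shape of the config object, the `virshArgs` helper and the domain name `one-42` are assumptions, and the default is simply the URI quoted earlier in this thread.

```javascript
// Assumed default: the URI quoted earlier in this thread.
const DEFAULT_LIBVIRT_URI = 'qemu+tcp://localhost/system';

// Hypothetical helper: prepend `-c <uri>` to any virsh sub-command.
function virshArgs(config, subcommand) {
  const uri = (config && config.libvirtUri) || DEFAULT_LIBVIRT_URI;
  return ['-c', uri, ...subcommand];
}

// Connect through the local socket instead of TCP.
const args = virshArgs(
  { libvirtUri: 'qemu:///system' },
  ['blockcommit', 'one-42', 'vda', '--active', '--pivot']
);
console.log(['virsh', ...args].join(' '));
// prints "virsh -c qemu:///system blockcommit one-42 vda --active --pivot"
```

With `qemu:///system`, libvirt is reached over its local socket, so no TCP listener has to be opened on the KVM node.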

(Ulrich P.) #7

Hi Kristian,

thanks a lot for the quick update. I have another feature request, if I may ask.
Similar to the -i option, I would love to see a --datastore option that includes all images from the datastores given in a comma-separated list.
This is similar to the way we handle “protected” and “unprotected” datastores in our vSphere environment, because we do not want to back up all images. The placement of a VM image decides whether it is backed up or not.

Thanks
Uli


(Kristian Feldsam) #8

Hello, I pushed an updated version to the develop branch. Please test it. If all goes OK, then I’ll publish a new release.


(Ulrich P.) #9

Yesterday I tested the develop branch with the -a (datastores) option and your addon backed up all images from the specified datastores. Brilliant!

One last idea came to my mind to further enhance the addon:
Back up all datastores and images with a specific label (e.g. ‘protected’). This logic would be simple and similar to your Ansible inventory script, and would give users much more control over the backup in Sunstone.
But for now I am already very happy with the datastores option. Thanks Kristian!


(Kristian Feldsam) #11

New release 1.6.1!

Changes since version 1.4.1:

New features:

  • new option -a --datastore to backup from specific datastore(s)
  • new option -l --label to backup image(s) and/or datastore(s) by specific label(s)

Other changes:

  • code enhancements

(Ulrich P.) #12

I can confirm both options are working in our environment (also with nested labels like “images/protected”).
Great addon.
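
To make the label selection concrete, here is a small sketch of how images could be filtered by label, including nested labels such as “images/protected”. It is an assumption-laden illustration, not the addon's implementation: the record shape `{ id, labels }`, the comma-separated label string and the rule that a parent label also selects its nested labels are all hypothetical.

```javascript
// Hypothetical matcher: an image label matches a wanted label when it is
// equal to it or nested below it ("images/protected" is below "images").
function labelMatches(imageLabel, wanted) {
  const a = imageLabel.trim();
  const b = wanted.trim();
  return a === b || a.startsWith(b + '/');
}

// `images` is an array of { id, labels }, where `labels` is assumed to be
// a comma-separated string; images without labels are never selected.
function selectByLabels(images, wantedLabels) {
  return images.filter((img) =>
    (img.labels || '')
      .split(',')
      .filter(Boolean)
      .some((label) => wantedLabels.some((w) => labelMatches(label, w))));
}
```

For example, `selectByLabels(images, ['images/protected'])` would keep only the images carrying exactly that nested label, while passing `['images']` would keep everything labelled below “images”.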


(Martin) #13

Hi @feldsam ,
I just want to make sure that I understand how your addon works. I wonder how your script will back up my VM images. I have an image of type DATABLOCK with the “Persistent” setting set to “no”; this image is used by VMs as an OS/DATA disk.

VM disk image looks like this :
qemu-img info 0
image: 0
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 9.7G
cluster_size: 65536
backing file: /var/lib/one/datastores/103/cdfd110a37e569165315b96f913293a2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Can you please explain whether the VM will be snapshotted, then backed up and block committed, or not?
Thanks.


(Kristian Feldsam) #14

Hello, in OpenNebula there are two types of images, as you already know:

  • non-persistent
  • persistent

Non-persistent

You can deploy many VMs using the same non-persistent image. In the background, oned creates a new qcow2 image that uses the non-persistent image as its backing image. That new image is deployed to the system datastore.
When you terminate the VM instance, the image deployed to the system datastore is deleted. By the nature of this functionality, VMs deployed from non-persistent images are instances of a base image; they should not persist data and should not be backed up. This is ideal for applications that can run several instances at once for horizontal scaling. These instances usually connect to the same shared database server and/or shared file server (NFS) to persist data. When a VM instance needs to persist data, a persistent datablock should be attached to it, or the VM should be instantiated as persistent.

Persistent

A persistent image can be attached to only one VM, and changes made in the VM instance persist after undeploy/terminate. These images are backed up and, if attached to a running VM, also live snapshotted.
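
The distinction can be summarised as a small decision function. This is only a sketch of the behaviour described in this thread, not the addon's code: the record shape `{ persistent, vmState }` and the field names of the result are hypothetical, and a running VM is assumed to be one in the ACTIVE state.

```javascript
// Hypothetical image record: { persistent: boolean, vmState: string | null }
// vmState is the state of the VM the image is attached to, or null.
function backupPlan(image) {
  const running = image.vmState === 'ACTIVE';
  return {
    // The image file in the image datastore is copied in both cases
    // (for a non-persistent image this is just the base image).
    copyImageFile: true,
    // Only a persistent image receives the VM's writes directly, so only
    // it needs a live snapshot while the VM is running.
    liveSnapshot: image.persistent && running,
    // Changes made by VMs on top of a non-persistent image live in the
    // system datastore and are not part of the backup.
    vmChangesIncluded: image.persistent,
  };
}
```

So a persistent image attached to a running VM gets the snapshot treatment, while for a non-persistent image only the shared base image ends up in the backup.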


(Martin) #15

This is kind of a pity, because I use these non-persistent images for all VMs. In my environment, after the undeploy action the machine goes to the shutdown and then the undeployed state (which frees up resources on the node), but the disks or machine image files stay untouched on the hypervisor datastore, so they could be backed up. (This behaviour is not the same as you described, I don’t know why.) I use GlusterFS (via libgfapi) to access disk images. Terminating deletes the VM and its related files, of course.

So to use your script I have to make all disks of all VMs persistent (right now they are non-persistent images backed by an image from the image datastore). Do you think this is a good approach, or should I rewrite your script so that it also backs up non-persistent images with the same logic as persistent ones?

Thanks for any suggestions in advance.


(Kristian Feldsam) #16

Hmm, the name “non-persistent” also implies a “temporary” disk, where the data doesn’t matter. Base images are of course backed up, but not the non-persistent part of an attached image.

I think that every VM with data that needs to persist has to use persistent images. On the instantiate tab, you can select “instantiate as persistent”, or you can “save as” already instantiated non-persistent VMs and make them persistent. When you need to terminate a VM, then you should also delete the VM template with all disks, so you free up space.

Of course, I can add a new option to also back up non-persistent VMs, but given the nature of non-persistent vs. persistent images, you should consider reworking your setup.


(Martin) #17

I will do some research and let you know, because this is probably related to GlusterFS and the OpenNebula driver for Gluster itself. This is probably why my non-persistent VMs and their images stayed untouched after the undeploy operation.

Anyway thanks for help for now :slight_smile:

EDIT: Also, non-persistent images as I use them save space on storage, because the base system is not copied for every VM. The base image can also be “super cached” in a hot-tier cache, because multiple VMs use the same file.


(Ulrich P.) #18

From the official documentation (https://docs.opennebula.org/5.4/operation/vm_management/vm_instances.html):
“Undeploy -> The VM is shut down. The VM disks are transfered to the system datastore. The VM can be resumed later.”

I understand that undeployed VMs still hold their “non-persistent” data. But as soon as you terminate a VM with non-persistent disks all changed data (to the base image) is lost.


(Anton Todorov) #19

Undeploy and Terminate are two separate actions. The state transitions are as follows:

Undeploy: --> [epilog] --> [undeployed]
Terminate: --> [epilog] --> [done]


(Kristian Feldsam) #20

@uli you are right, I updated my previous post.


(Mirko) #21

First of all, thank you @feldsam for your work.
I think that a backup mechanism in OpenNebula is a must.
When I first used ON, I searched for hours for an integrated backup.
Integration with Sunstone would be the next step :wink:.

I support the question of “Snowman Martin”.

I’m using ON like a cloud ISP, and in this situation all VMs are non-persistent.
My customers create a VM from a template, use it for months or years and then terminate it.
At that point the non-persistent disk is deleted.
But in between, the non-persistent disks need to be backed up and ARE IN USE.
In this situation the process of “snapshot(+quiesce)+backup+blockcommit” is IMHO necessary.

Or am I using ON wrong :thinking:?

Thank you very much in advance.
Mirko