We are using it on our OpenNebula-powered platform (NodeWeaver): you can simply mount /var/lib/one on all nodes onto the LizardFS root and you're good to go, just as if it were an NFS share, but far more scalable.
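A minimal sketch of that mount, assuming the LizardFS master is reachable as `lizardfs-master` (the hostname and options are placeholders, not from the original post):

```
# Mount the LizardFS root over the OpenNebula directory; -H points
# mfsmount at the LizardFS master. Run this on every node.
mfsmount -H lizardfs-master /var/lib/one

# Or persist it via /etc/fstab (example entry):
# mfsmount  /var/lib/one  fuse  mfsmaster=lizardfs-master,_netdev  0 0
```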
Very reliable, and we achieved great performance with a bit of SSD caching and some tuning.
Check out these results:
A great advantage of LizardFS is that you can use it with only minor modifications to the shared-filesystem drivers, and host everything ONE-related in a single reliable datastore (the TM for MooseFS and LizardFS is here: http://wiki.opennebula.org/ecosystem:moosefs )
On a two-node setup (2 rotational devices + 2 EnhanceIO SSD caches) we got 11K write IOPS, and we easily reach 90 MB/s within the VMs.
Another advantage is the copy-on-write snapshot capability, which greatly extends what you can do with OpenNebula for thinly provisioned images, without performance problems.
Hello Carlo,
Short question: did you evaluate any other caching systems (e.g. flashcache, bcache, dm-cache)?
I am currently testing different setups, at the moment bcache and LSI CacheCade, but bcache looks promising. Perhaps you ran similar tests and can share your experiences.
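For reference, a hedged sketch of a typical bcache setup (device names and the cache-set UUID below are placeholders, not from this thread):

```
# Format the backing (HDD) and cache (SSD) devices; device names are examples.
make-bcache -B /dev/sdb        # backing device -> exposes /dev/bcache0
make-bcache -C /dev/nvme0n1    # cache set (prints its cache-set UUID)

# Attach the cache set to the backing device via sysfs.
echo <cset-uuid> > /sys/block/bcache0/bcache/attach

# Optionally switch to writeback mode for better write IOPS
# (at the cost of data on the SSD being dirty until flushed).
echo writeback > /sys/block/bcache0/bcache/cache_mode
```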
I also tried Gluster (which was very good from a performance point of view, especially on 10 Gb networks, but its usability is not yet very nice).
And at the last ONE conference everybody was fine with Ceph, but that requires a more expensive hardware footprint.
Sheepdog was horrendously unstable when I last used it. Sometimes, simple storage host reboots would destroy all data in the cluster.
I’ve been using Ceph with KVM on Ubuntu for a few years now (since the Argonaut release) and have had very few problems with it. I’ve only recently added OpenNebula to the mix, but it dropped right in with no changes needed to my Ceph config.
The biggest drawback is that Ceph is a network hog. Make sure you have lots and lots of bandwidth for it; if you don’t, you might start seeing enough I/O lag on your VMs to cause problems.
We are using bcache-backed GlusterFS bricks. Don’t expect miracles in benchmarks, but there is a measurable increase in IOPS. Bear in mind, though, that for KVM at least there are quite a few tunables that should be addressed before looking at SSD caching as a performance enhancer.
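The KVM tunables in question typically start with the disk cache and AIO mode. A hedged illustration of what the resulting qemu drive options might look like (paths and the memory/machine options are placeholders only):

```
# cache=none bypasses the host page cache (O_DIRECT), which in turn allows
# native Linux AIO; a common baseline on shared filesystems before adding
# SSD caching. In OpenNebula these usually map to the DISK attributes
# CACHE and IO in the VM template.
qemu-system-x86_64 -m 2048 -enable-kvm \
  -drive file=/var/lib/one/datastores/0/42/disk.0,format=qcow2,cache=none,aio=native
```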
I’m also not a fan of the qemu-glusterfs integration. It doesn’t feel complete yet, there is some work that needs to be done. Also keeping the shared filesystem layer separate from the hypervisor is easier for support. We are using glusterfs-fuse backed shared fs and it’s working great so far with qemu images.
We are using Sheepdog for a non-production cluster. It is much more stable as of version 0.9.1. You also need qemu version 1.7 or higher for auto-failover support.
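A small, testable sketch of that version check (the installed version string below is a stand-in; on a real host you would parse it from `qemu-system-x86_64 --version`):

```shell
required="1.7"
have="2.0.0"   # placeholder; take this from `qemu-system-x86_64 --version` in practice

# sort -V orders version strings numerically; if the required version sorts
# first (or equal), the installed qemu is new enough for sheepdog auto-failover.
if [ "$(printf '%s\n' "$required" "$have" | sort -V | head -n1)" = "$required" ]; then
  echo "qemu $have supports sheepdog auto-failover"
else
  echo "qemu $have is too old; need >= $required"
fi
```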
We are using GlusterFS with shared storage and the performance is good. The bad point is that sometimes a node hangs and must be rebooted (and the documentation is quite poor, IMHO).
Great thread! I am looking into the same issue. So far I am planning to try GlusterFS first. @nico_opennebula_org: could you describe the storage hardware you use?
I am running on IBM PureFlex nodes and a Storwize storage solution, attached via 10 Gbit FCoE.
We are using a very simple architecture: each of our clusters consists of 2 nodes. They have two network cards, one connected to the public network and one connected to the other host.
The hosts only use the replicated mechanism, and we build n of these Gluster clusters.
Hardware-wise they are mid-range servers (16-128 GiB RAM, 8-32 cores, 1-12 TB).
I finally managed to set Gluster up in my environment. I use three IBM PureFlex nodes, each attached via a 10 Gbit FCoE uplink to its own volume managed by an IBM Storwize V7000. These volumes are managed by GlusterFS servers on each node and accessed by OpenNebula from each node respectively. The volume has a combined size of 6 TB and runs in standard distributed mode (that is: no replication here).
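A hedged sketch of how a plain distributed volume like that is usually created (host and brick names are taken from the volume info quoted later in this thread; run from one of the peers):

```
# Probe the other peers, then create and start a 3-brick distribute volume.
# No 'replica' count is given, so data is distributed across bricks,
# not replicated.
gluster peer probe lanai
gluster peer probe maui
gluster volume create one transport tcp \
  molokai:/data/gluster/brick lanai:/data/gluster/brick maui:/data/gluster/brick
gluster volume start one
```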
Results from first tests:
Overall performance is pretty good (will conduct benchmarks later)
Deploying 40 VMs is very quick (<4 minutes)
Out of 40 simultaneous deployments, 6-10 VMs fail to deploy and have to be re-deployed
Does anybody see failing deployments with GlusterFS as well?
What options are you using? We’ve had no issues with GlusterFS 3.4.3 (our current version) and we have tested massive loads (50-60 Gbps) across our distributed-replicate clusters. The options on the VMs and the storage matter. I would hold back on the bleeding edge (GlusterFS 3.6.x) if possible. Are you sure your deployment issues are storage related?
And these are my volume infos, including the options set from the ‘virt’ group, which the OpenNebula documentation advises you to set:
Volume Name: one
Type: Distribute
Volume ID: e64309f5-88d8-4d55-9272-16611acebe25
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: molokai:/data/gluster/brick
Brick2: lanai:/data/gluster/brick
Brick3: maui:/data/gluster/brick
Options Reconfigured:
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.owner-gid: 9869
storage.owner-uid: 9869
server.allow-insecure: on
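For reference, most of the options above come from GlusterFS’s predefined ‘virt’ option group; a sketch of how they are typically applied (volume name taken from the output above, the uid/gid being oneadmin’s):

```
# Apply the virtualization-tuned option group shipped with GlusterFS
# (see /var/lib/glusterd/groups/virt), then set ownership and access options.
gluster volume set one group virt
gluster volume set one storage.owner-uid 9869
gluster volume set one storage.owner-gid 9869
gluster volume set one server.allow-insecure on
```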
As for the VM options: what do you mean? I use the qcow2 driver on the images if the images are in qcow2 format. What other options could I set on the VMs regarding GlusterFS?
I’m just wondering if we are exposing all the needed parameters for tuning VM performance. Are you currently relying on RAW? Could we benefit from exposing some of these parameters in Sunstone as advanced options?
Wed Mar 11 12:35:57 2015 [Z0][TM][I]: Command execution fail: /var/lib/one/remotes/tm/shared/clone molokai:/var/lib/one/datastores/114/3fbc702cdff9fdced57c7b95c33b2459 lanai:/var/lib/one//datastores/120/169/disk.0 169 114
Wed Mar 11 12:35:57 2015 [Z0][TM][I]: clone: Cloning /var/lib/one/datastores/114/3fbc702cdff9fdced57c7b95c33b2459 in lanai:/var/lib/one//datastores/120/169/disk.0
Wed Mar 11 12:35:57 2015 [Z0][TM][E]: clone: Command "cd /var/lib/one/datastores/120/169; cp /var/lib/one/datastores/114/3fbc702cdff9fdced57c7b95c33b2459 /var/lib/one/datastores/120/169/disk.0" failed: Warning: Permanently added 'lanai,141.22.29.23' (ECDSA) to the list of known hosts.
Wed Mar 11 12:35:57 2015 [Z0][TM][I]: sh: line 3: cd: /var/lib/one/datastores/120/169: No such file or directory
Wed Mar 11 12:35:57 2015 [Z0][TM][I]: cp: cannot create regular file '/var/lib/one/datastores/120/169/disk.0': No such file or directory
Wed Mar 11 12:35:57 2015 [Z0][TM][E]: Error copying molokai:/var/lib/one/datastores/114/3fbc702cdff9fdced57c7b95c33b2459 to lanai:/var/lib/one//datastores/120/169/disk.0
Wed Mar 11 12:35:57 2015 [Z0][TM][I]: ExitCode: 1
Wed Mar 11 12:35:57 2015 [Z0][TM][E]: Error executing image transfer script: Error copying molokai:/var/lib/one/datastores/114/3fbc702cdff9fdced57c7b95c33b2459 to lanai:/var/lib/one//datastores/120/169/disk.0
So the problem results from a directory not being created. I checked, and the directory really does not get created.
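The shared TM’s `cd ...; cp ...` step fails when the target system-datastore directory is missing on the destination host. A minimal, testable sketch of the check/fix, using a temporary path as a stand-in for /var/lib/one/datastores/120/169 (on a real cluster you would run it on the failing host, e.g. via ssh to lanai, with the right oneadmin ownership):

```shell
# Stand-in for the missing /var/lib/one/datastores/120/169 directory.
ds_dir="$(mktemp -d)/datastores/120/169"

# Create the parent chain the TM expects before it runs `cd ...; cp ...`.
mkdir -p "$ds_dir"
test -d "$ds_dir" && echo "datastore directory present"
```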