VM disk size problem on deploy

When a VM is created, the size of the HD is the same as the original image, not the one set in the Sunstone instantiate section.

After upgrading from OpenNebula 5.4.1 to 5.6.1, instantiating a VM does not change the disk size of the new VM, either from Sunstone or through the “onetemplate instantiate” command.

If the VM is created and the disk is resized afterwards, “qemu-img info” shows that the image has been modified, and after powering the VM off and on, the guest picks up the new size correctly. But it does not work when creating a VM with a size different from the original image.

I don’t know if it could be due to some bad configuration after the update or what might be happening.

Hello @vjjuidias,

I’ve tested it in OpenNebula 5.8.5 and everything seems to work fine. The only failure I’ve seen is when you use the slider, and only in one specific case. I’ve opened a GitHub issue for that (https://github.com/OpenNebula/one/issues/3868).

Could you try to upgrade to the latest version?

Also, could you check whether the //DISK/SIZE attribute has the correct value inside the VM body?

Hi @Christian_Gonzalez,

First, thanks for the reply.

About the slider: when creating the VM I type the new disk size directly, I do not use the slider. I have also tried entering the size in both GB and MB, in case the problem was with that field.

The VM template shows the size it should be:

DISK = [
  CLONE = "YES",
  CLONE_TARGET = "SYSTEM",
  CLUSTER_ID = "",
  DATASTORE = "",
  DATASTORE_ID = "",
  DEV_PREFIX = "hd",
  DISK_ID = "0",
  DISK_SNAPSHOT_TOTAL_SIZE = "0",
  DISK_TYPE = "FILE",
  DRIVER = "qcow2",
  IMAGE = "ubuntu",
  IMAGE_ID = "523",
  IMAGE_STATE = "2",
  IMAGE_UNAME = "",
  LN_TARGET = "NONE",
  ORDER = "1",
  ORIGINAL_SIZE = "10240",
  READONLY = "NO",
  SAVE = "NO",
  SIZE = "20480",
  SOURCE = "/var/lib/one//datastores/XXX/850374986d22c4d4a463b64aa308c724",
  TARGET = "hda",
  TM_MAD = "shared",
  TYPE = "FILE" ]

However, the image (disk.0) is not modified with respect to the original image.

qemu-img info disk.0
image: disk.0
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 2.6G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false

I will deploy a machine in 5.8 to see if the problem is solved, although it will take some time.

Again thanks for the support

OK, please let us know if the error persists in 5.8.

Hi again @Christian_Gonzalez,

I have installed a new machine in 5.8.5 and the problems persist.

I don’t know if I have a problem with the OpenNebula configuration or with the images, but before updating to 5.6.1 it worked fine.

Now, in version 5.8.5, resizing the disk once the VM is instantiated does not always work either, whereas in 5.6.1 it did.

Could you share the drivers you are using (TM_MAD and DS_MAD) and also the image information (oneimage show -x <img_id>)?

Also, just to make it clear, the problem is that you resize the disk size during the VM instantiation and this change is not reflected in the VM (disk information) when it starts, am I right?

Hi again,

TM_MAD = [
    EXECUTABLE = "one_tm",
    ARGUMENTS = "-t 15 -d dummy,lvm,shared,fs_lvm,qcow2,ssh,ceph,dev,vcenter,iscsi_libvirt"
]

There is no driver that responds to the name DS_MAD, only configurations for the different drivers (lvm, ceph, libvirt, dummy, etc. as DS_MAD_CONF), that’s why I copied DATASTORE_MAD. If it’s not what you expect let me know.

DATASTORE_MAD = [
    EXECUTABLE = "one_datastore",
    ARGUMENTS  = "-t 15 -d dummy,fs,lvm,ceph,dev,iscsi_libvirt,vcenter -s shared,ssh,ceph,fs_lvm,qcow2,vcenter"
]

Image information:

<IMAGE>
  <ID>542</ID>
  <UID>21</UID>
  <GID>0</GID>
  <UNAME>xxxx</UNAME>
  <GNAME>oneadmin</GNAME>
  <NAME>CentOS 7.6</NAME>
  <PERMISSIONS>
    <OWNER_U>1</OWNER_U>
    <OWNER_M>1</OWNER_M>
    <OWNER_A>0</OWNER_A>
    <GROUP_U>0</GROUP_U>
    <GROUP_M>0</GROUP_M>
    <GROUP_A>0</GROUP_A>
    <OTHER_U>0</OTHER_U>
    <OTHER_M>0</OTHER_M>
    <OTHER_A>0</OTHER_A>
  </PERMISSIONS>
  <TYPE>0</TYPE>
  <DISK_TYPE>0</DISK_TYPE>
  <PERSISTENT>0</PERSISTENT>
  <REGTIME>1562665508</REGTIME>
  <SOURCE><![CDATA[/var/lib/one//datastores/xxx/c61632cb586d2f44daae44b33baf60a1]]></SOURCE>
  <PATH><![CDATA[]]></PATH>
  <FSTYPE><![CDATA[qcow2]]></FSTYPE>
  <SIZE>10240</SIZE>
  <STATE>2</STATE>
  <RUNNING_VMS>2</RUNNING_VMS>
  <CLONING_OPS>0</CLONING_OPS>
  <CLONING_ID>-1</CLONING_ID>
  <TARGET_SNAPSHOT>-1</TARGET_SNAPSHOT>
  <DATASTORE_ID>100</DATASTORE_ID>
  <DATASTORE>xxx</DATASTORE>
  <VMS>
    <ID>4518</ID>
    <ID>4625</ID>
  </VMS>
  <CLONES/>
  <APP_CLONES/>
  <TEMPLATE>
    <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX>
    <DRIVER><![CDATA[qcow2]]></DRIVER>
  </TEMPLATE>
  <SNAPSHOTS>
    <ALLOW_ORPHANS><![CDATA[NO]]></ALLOW_ORPHANS>
    <NEXT_SNAPSHOT><![CDATA[0]]></NEXT_SNAPSHOT>
  </SNAPSHOTS>
</IMAGE>

Finally, the problem in 5.6.1 is the one you describe: when instantiating the VM and specifying a disk size different from the original image, the new size is not reflected on the VM disk (disk.0), even though both Sunstone and the VM template show the size that was defined.

In 5.6.1, when resizing the disk once the VM is running, the modification is successful.

In addition to the above, in 5.8.5 I have found that resizing the disk once the VM is in a running state does not always work. It does not work for the image whose information I copied above.

Thanks in advance.

Hello @vjjuidias

I need to know the specific DS_MAD and TM_MAD used in the image datastore where the image resides and the TM_MAD of the system datastore where the VM is deployed.

Hi @Christian_Gonzalez,

I am not sure this is the information you are requesting. Going by the image information, since it is qcow2, I think the TM_MAD is:

TM_MAD_CONF = [
    NAME = "qcow2", LN_TARGET = "NONE", CLONE_TARGET = "SYSTEM", SHARED = "YES",
    DRIVER = "qcow2"
]

As for DS_MAD, I am not sure which one the image in question uses. How can I find out exactly which one each image uses?

DS_MAD_CONF = [
    NAME = "dev", REQUIRED_ATTRS = "DISK_TYPE", PERSISTENT_ONLY = "YES"
]
DS_MAD_CONF = [
    NAME = "fs", REQUIRED_ATTRS = "", PERSISTENT_ONLY = "NO",
    MARKETPLACE_ACTIONS = "export"
]
DS_MAD_CONF = [
    NAME = "lvm", REQUIRED_ATTRS = "DISK_TYPE,BRIDGE_LIST",
    PERSISTENT_ONLY = "NO"
]

I presume it will be one of the above, because we use neither iSCSI, Ceph nor vCenter. If that is not it, could you tell me how or where to look up that data?

Hello again @vjjuidias,

You can retrieve this information by just running this command onedatastore show <ds_id> | grep MAD or just check the datastore information in Sunstone.
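
If the values need to be compared across several datastores (the image datastore and the system datastore, as asked above), the command can be wrapped in a small helper. This is only a sketch: ds_attr is a made-up name, and it assumes the "KEY : value" layout that onedatastore show prints in its header section.

```shell
#!/usr/bin/env bash
# Print the value of a "KEY   : value" line from `onedatastore show`
# output read on stdin. Lines such as DS_MAD="fs" (template section)
# contain no colon-separated key and are ignored.
ds_attr() {
    awk -F: -v key="$1" '
        $1 ~ "^" key "[ \t]*$" {
            gsub(/[ \t]/, "", $2)   # strip the padding around the value
            print $2
            exit
        }'
}
```

Usage would look like `onedatastore show <ds_id> | ds_attr TM_MAD`, run once for the image datastore and once for the system datastore.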

Hi again @Christian_Gonzalez,

Considering the output of the command

onedatastore show xxx | grep -i mad
DS_MAD         : fs                  
TM_MAD         : shared              
DS_MAD="fs"
TM_MAD="shared"


TM_MAD_CONF = [
    NAME = "shared", LN_TARGET = "NONE", CLONE_TARGET = "SYSTEM", SHARED = "YES",
    DS_MIGRATE = "YES", TM_MAD_SYSTEM = "ssh", LN_TARGET_SSH = "SYSTEM",
    CLONE_TARGET_SSH = "SYSTEM", DISK_TYPE_SSH = "FILE"
]


DS_MAD_CONF = [
    NAME = "fs", REQUIRED_ATTRS = "", PERSISTENT_ONLY = "NO",
    MARKETPLACE_ACTIONS = "export"
]

Thank you for the support again.

Hello again @Christian_Gonzalez ,

Here are the tests I have done:

The first installation was an upgrade from 5.6.1 to 5.8.5; now I am working with a clean installation of 5.8.5. In both cases I imported the DB from the production machine (5.6.1, itself upgraded from 5.4.1); the import upgrades the DB and corrects some errors.

As a test, I created a clean OS image in 5.8.5 and tried to deploy it with a new size, but the problem persists.

When doing the resize once the VM is running it does the qemu-img resize.

Do you have any idea what might be happening in the VM deployment? Any test I can perform?

For now I will try to go back to 5.4.1 (the latest version where it was working fine) and check that it is still working.

Again, thanks for the support.

Hi Víctor,

Could you check the CLONE_CMD variable in /var/lib/one/remotes/tm/shared/clone? If it is

CLONE_CMD="cd ${DST_DIR}; \
    rm -f ${DST_PATH}; \
    cp ${SRC_PATH} ${DST_PATH} \
    ${RESIZE_CMD}"

Try changing it to the following (note the semicolon ’;’ at the end of the cp command):

CLONE_CMD="cd ${DST_DIR}; \
    rm -f ${DST_PATH}; \
    cp ${SRC_PATH} ${DST_PATH}; \
    ${RESIZE_CMD}"

Or, better, use the current variant (from 5.8.x):

CLONE_CMD=$(cat <<EOF
    set -e -o pipefail
    cd ${DST_DIR}
    rm -f ${DST_PATH}
    cp ${SRC_PATH} ${DST_PATH}

    ${RESIZE_CMD}
EOF
)
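
To illustrate why the semicolon matters, here is a small stand-alone sketch. It is not the OpenNebula script: "echo copy" stands in for cp, RESIZE_CMD is a harmless placeholder, and the paths are invented. Inside double quotes the backslash-newline is a line continuation, so without the ';' the resize command is simply appended to the argument list of the previous command.

```shell
#!/usr/bin/env bash
# Illustration only: placeholder paths and commands, nothing is copied or resized.
SRC_PATH=/tmp/src.img
DST_PATH=/tmp/dst.img
RESIZE_CMD="echo resize-would-run"

# Variant without ';' after the copy step: RESIZE_CMD becomes extra
# arguments of the copy command.
BROKEN_CMD="echo copy ${SRC_PATH} ${DST_PATH} \
    ${RESIZE_CMD}"

# Variant with ';': RESIZE_CMD runs as a separate command.
FIXED_CMD="echo copy ${SRC_PATH} ${DST_PATH}; \
    ${RESIZE_CMD}"

broken_out=$(bash -c "$BROKEN_CMD")
fixed_out=$(bash -c "$FIXED_CMD")

printf 'without semicolon:\n%s\n' "$broken_out"
printf 'with semicolon:\n%s\n' "$fixed_out"
```

With a real cp, the extra words would be treated as additional file arguments, so the resize would never run as its own command.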

Hope this helps.

Best Regards,
Anton Todorov

Hi @atodorov_storpool ,

Thanks for the help. I have checked the CLONE_CMD variable and apparently it is like in the version that uses “cat”.

I have looked at the machine in production (5.6.1). I have also tried to put the old format by adding the semicolon.

In both cases the image resize is not done.

For testing I tried to execute the RESIZE_CMD command from a console and it does the resize correctly.

Thanks again for the support.

Hi Víctor,

My guess is that RESIZE_CMD is probably empty.
Can you try adding a log line just before the call to ssh_exec_and_log?

log "SIZE=$SIZE ORIGINAL_SIZE=$ORIGINAL_SIZE RESIZE_CMD=$RESIZE_CMD"

Then try to instantiate a VM. There should be a line logged showing the variables related to the resize operation and the exact command…

BR,
Anton

Hi again @atodorov_storpool,

I have added the line for the log, but I don’t see where that log goes.

log "SIZE=$SIZE ORIGINAL_SIZE=$ORIGINAL_SIZE RESIZE_CMD=$RESIZE_CMD"
ssh_exec_and_log $DST_HOST "$CLONE_CMD" \
    "Error copying $SRC to $DST"

Not even in the VM log file in /var/log/one/VMID.log or in /var/log/one/sched.log have I found anything.

Well,

Let’s move to plan B if it is not in /var/log/one/oned.log either :)

Replace/add the following line to have it logged by syslog instead:

logger -t shared_cp -- "SIZE=$SIZE ORIGINAL_SIZE=$ORIGINAL_SIZE RESIZE_CMD=$RESIZE_CMD"

If this file is called there should be a line logged in /var/log/messages or /var/log/syslog depending on the OS…

BR,
Anton

You were right: after logging to /var/log/messages, I can see that the variables are empty:

Oct 25 12:58:44 node2201-1 shared_cp: SIZE= ORIGINAL_SIZE= RESIZE_CMD=
Oct 25 12:58:50 node2201-1 shared_cp: SIZE= ORIGINAL_SIZE= RESIZE_CMD=

So far so good :)

Now we should figure out why these values are empty…

The following piece of code is responsible for this:

DISK_ID=$(basename ${DST_PATH} | cut -d. -f2)

XPATH="${DRIVER_PATH}/../../datastore/xpath.rb --stdin"

unset i j XPATH_ELEMENTS

while IFS= read -r -d '' element; do
    XPATH_ELEMENTS[i++]="$element"
done < <(onevm show -x $VMID| $XPATH \
                    /VM/TEMPLATE/DISK[DISK_ID=$DISK_ID]/SIZE \
                    /VM/TEMPLATE/DISK[DISK_ID=$DISK_ID]/ORIGINAL_SIZE \
                    /VM/HISTORY_RECORDS/HISTORY[last\(\)]/TM_MAD)

SIZE="${XPATH_ELEMENTS[j++]}"
ORIGINAL_SIZE="${XPATH_ELEMENTS[j++]}"
TM_MAD="${XPATH_ELEMENTS[j++]}"
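
The loop above reads NUL-delimited values from the pipeline, so if the pipeline prints nothing, every variable ends up empty. The sketch below isolates that mechanism with printf standing in for `onevm show -x ... | xpath.rb`; read_elements is a made-up wrapper name used only for this illustration.

```shell
#!/usr/bin/env bash
# Read NUL-delimited values from stdin into the global array
# XPATH_ELEMENTS, mirroring the loop used in the clone script.
read_elements() {
    unset i XPATH_ELEMENTS
    while IFS= read -r -d '' element; do
        XPATH_ELEMENTS[i++]="$element"
    done
}

# Normal case: three values arrive, in the order they were requested.
read_elements < <(printf '%s\0' 20480 10240 shared)
unset j
SIZE="${XPATH_ELEMENTS[j++]}"
ORIGINAL_SIZE="${XPATH_ELEMENTS[j++]}"
TM_MAD="${XPATH_ELEMENTS[j++]}"
echo "SIZE=$SIZE ORIGINAL_SIZE=$ORIGINAL_SIZE TM_MAD=$TM_MAD"

# Failure case: the producer prints nothing, so the array stays empty
# and any variable assigned from it is an empty string.
read_elements < /dev/null
echo "elements read from empty input: ${#XPATH_ELEMENTS[@]}"
```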

Let’s add a piece that saves the VM’s XML to /tmp. To do this, replace this line

done < <(onevm show -x $VMID| $XPATH \

with

done < <(onevm show -x $VMID | tee /tmp/vm-${VMID}-${DISK_ID}.xml | $XPATH \

and add to the logger the VMID and DISK_ID variables:

logger -t shared_cp -- "VMID=$VMID DISK_ID=$DISK_ID SIZE=$SIZE ORIGINAL_SIZE=$ORIGINAL_SIZE RESIZE_CMD=$RESIZE_CMD"

Then, when you instantiate a VM, each log line will show which VM and disk it was called for, and the VM’s XML will be saved in /tmp/vm-${VMID}-${DISK_ID}.xml.

You can then look at the XML files and check whether the size-related attributes are there…

BR,
Anton

I have modified the code as you suggest, but the VM xml is empty:

Oct 25 13:24:37 node2201-1 shared_cp: VMID=4635 DISK_ID=0 SIZE= ORIGINAL_SIZE= RESIZE_CMD=

-rw-r--r-- 1 oneadmin oneadmin 0 oct 25 13:24 /tmp/vm-4635-0.xml
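
A 0-byte file means that `onevm show -x` printed nothing on stdout when the script ran it, which would explain the empty SIZE, ORIGINAL_SIZE and RESIZE_CMD. One possible next diagnostic step (a sketch, not a confirmed fix) is to record the command's exit status and stderr as well. The helper name capture_cmd and the ".err" file suffix are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Run a command, save its stdout to the given file and its stderr to
# "<file>.err", then print a one-line summary: exit status, number of
# stdout bytes, and the first line of stderr.
capture_cmd() {
    local out_file=$1
    shift
    local err_file="${out_file}.err"
    local rc=0
    "$@" >"$out_file" 2>"$err_file" || rc=$?
    echo "rc=$rc bytes=$(wc -c <"$out_file" | tr -d ' ') stderr=$(head -n 1 "$err_file")"
    return "$rc"
}
```

Inside the clone script it could be used along the lines of `logger -t shared_cp -- "$(capture_cmd /tmp/vm-${VMID}-${DISK_ID}.xml onevm show -x $VMID)"`, so that the reason for the empty output shows up in syslog next to the existing line.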