Questions about ONe's internal scheduling logic

Hello everyone,

My colleagues are trying to pick apart the ONe scheduler and understand its internal logic. They asked me a few questions and I couldn’t answer some of them. So, I’m turning to you, the community :slight_smile:

  1. What happens when a datastore isn’t assigned to a specific cluster? Does it
    mean this datastore is available on all hosts (across all clusters), on hosts
    without any cluster, or nowhere? Does this logic apply to all objects which can
    be assigned a cluster?
  2. Why is it necessary to check the image datastore capacity before
    matching the datastore for a pending virtual machine?
    In the current scheduler implementation, each disk of an active VM
    consumes storage from an image datastore. VMs that require more
    storage than is currently available are filtered out (they will
    remain in the pending state).
  3. Is it true that if a datastore is shared, then its capacity can be found directly
    in the datastore metadata, but if it is not shared, then its capacity must be checked
    directly in the host monitoring information under DATASTORES?
    What about the MAX_DISK/HOST_SHARE element? Sometimes the capacity does
    not match the capacity in the DATASTORES element. What is the right
    way to check the capacity of datastores?

Regarding Q#2, we managed to identify a few key pieces of the code (see below). However, it didn’t help us understand what’s going on there.

The code where test_image_datastore_capacity is called:

The test_image_datastore_capacity method:

Here is the definition of ImageDatastorePoolXML:

And the test_capacity method:

Any help or advice will be greatly appreciated.

Thanks!

What happens when a datastore isn’t assigned to a specific cluster? Does it
mean this datastore is available on all hosts (across all clusters), on hosts
without any cluster, or nowhere? Does this logic apply to all objects which can
be assigned a cluster?

In OpenNebula 5.0 every resource is assigned to a cluster; the “default” cluster (ID: 0) is used if none is specified, so by default all resources belong to it.

Note that OpenNebula 4.14 had the “none” cluster (ID: -1), which was used to share some resources across all clusters. Fortunately, we got rid of this inconsistency in 5.0.
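If it helps to verify this from the API: each datastore lists its clusters under the CLUSTERS element of its XML. Here is a minimal Python sketch using the standard xmlrpc.client; the endpoint and credentials are placeholders for a default deployment:

```python
import xmlrpc.client
import xml.etree.ElementTree as ET

ONE_ENDPOINT = "http://localhost:2633/RPC2"  # default oned XML-RPC endpoint
ONE_AUTH = "oneadmin:password"               # "user:password" session string

server = xmlrpc.client.ServerProxy(ONE_ENDPOINT)

# one.datastorepool.info returns [success, body, ...]
ok, body, *_ = server.one.datastorepool.info(ONE_AUTH)
if not ok:
    raise RuntimeError(body)

for ds in ET.fromstring(body).findall("DATASTORE"):
    # A datastore created without an explicit cluster shows ID 0 ("default")
    clusters = [c.text for c in ds.findall("CLUSTERS/ID")]
    print(ds.findtext("NAME"), "-> clusters", clusters)
```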

Why is it necessary to check the image datastore capacity before
matching the datastore for a pending virtual machine?
In the current scheduler implementation, each disk of an active VM
consumes storage from an image datastore. VMs that require more
storage than is currently available are filtered out (they will
remain in the pending state).

This is needed for datastores with the attribute CLONE_TARGET="SELF". For example, in Ceph VM disks are cloned to the same pool, thus using space from the image datastore. VMs using other storage do not add their disk size to the ds_usage variable. So only the disks from a datastore whose TM driver sets the SELF flag count toward the image datastore capacity checks.
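In other words, the accounting looks roughly like this (a Python sketch of the described logic, not the actual scheduler code, which is C++; the field names are illustrative):

```python
# disks: list of {"datastore_id": int, "size_mb": int} for one pending VM
# datastores: {id: {"clone_target": str}} taken from the datastore metadata

def image_ds_usage_mb(disks, datastores):
    """MB the VM would consume per image datastore."""
    usage = {}
    for disk in disks:
        # Only datastores that clone to themselves (CLONE_TARGET="SELF",
        # e.g. Ceph) consume image datastore space for running VMs.
        if datastores[disk["datastore_id"]]["clone_target"] == "SELF":
            ds_id = disk["datastore_id"]
            usage[ds_id] = usage.get(ds_id, 0) + disk["size_mb"]
    return usage

def fits(usage, free_mb):
    """free_mb: {id: free MB of each image datastore}. A VM whose usage
    exceeds the free space anywhere is filtered out and stays pending."""
    return all(mb <= free_mb.get(ds_id, 0) for ds_id, mb in usage.items())
```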

Is it true that if a datastore is shared, then its capacity can be found directly
in the datastore metadata, but if it is not shared, then its capacity must be checked
directly in the host monitoring information under DATASTORES?

True. In the first case the information is obtained through the datastore monitoring script (new in 5.0), while for local (non-shared) datastores the capacity is obtained from the host monitoring.

What about the MAX_DISK/HOST_SHARE element? Sometimes the capacity does
not match the capacity in the DATASTORES element. What is the right
way to check the capacity of datastores?

This refers to the space available under /var/lib/one/datastores; it is only considered when there is no VM deployed on the host and the datastore directory does not exist. It may differ from the datastore size when the directory is mounted from another device.

The logic would be as follows (see the sketch below the list):

  1. Get the TM configuration for the datastore driver from oned.conf
  2. If SHARED="YES" you can use the datastore metadata for all hosts
  3. Otherwise you need to check the DS size in each host
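A rough Python sketch of those three steps, using the standard xmlrpc.client. The element paths (HOST_SHARE/DATASTORES/DS, FREE_MB) are what the 5.x XML exposes; verify them against your version, and treat the endpoint and credentials as placeholders:

```python
import xmlrpc.client
import xml.etree.ElementTree as ET

ONE_ENDPOINT = "http://localhost:2633/RPC2"
ONE_AUTH = "oneadmin:password"
server = xmlrpc.client.ServerProxy(ONE_ENDPOINT)

def datastore_free_mb(ds_id, host_ids, shared):
    """Return free MB per host. `shared` is the SHARED flag from the
    driver's TM_MAD_CONF section in oned.conf (step 1)."""
    if shared:
        # Step 2: shared datastore -- FREE_MB in the datastore metadata
        # (filled by the datastore monitor probes, new in 5.0) holds for
        # every host.
        ok, body, *_ = server.one.datastore.info(ONE_AUTH, ds_id)
        if not ok:
            raise RuntimeError(body)
        free = int(ET.fromstring(body).findtext("FREE_MB"))
        return {h: free for h in host_ids}
    # Step 3: non-shared datastore -- read the per-host figure from the
    # host monitoring data instead.
    result = {}
    for h in host_ids:
        ok, body, *_ = server.one.host.info(ONE_AUTH, h)
        if not ok:
            raise RuntimeError(body)
        for ds in ET.fromstring(body).findall("HOST_SHARE/DATASTORES/DS"):
            if ds.findtext("ID") == str(ds_id):
                result[h] = int(ds.findtext("FREE_MB"))
    return result
```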

Hope it helps

Cheers!

Thanks! That’s exactly what we needed.

Hello @ruben
I am working with the XML-RPC methods and have the following queries:

  1. The one.vm.action method takes two parameters, id and action, so how can we add a scheduled action using an XML-RPC call?
    The CLI provides a command to do this: `onevm poweroff 0 --schedule "09/23 14:15"`
    How can we set the scheduled time for an action through an API call? (See the sketch below.)
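For context, the plain (immediate) call looks like this; a minimal Python sketch with the standard xmlrpc.client, where the endpoint and credentials are placeholders. The part I can't figure out is where the scheduled time would go:

```python
import xmlrpc.client

server = xmlrpc.client.ServerProxy("http://localhost:2633/RPC2")

# On the wire, one.vm.action takes the session string first, then the
# two documented parameters: one.vm.action(session, action, vm_id).
ok, result, *_ = server.one.vm.action("oneadmin:password", "poweroff", 0)
print(ok, result)
```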

Please suggest and guide on the above queries.
I'm looking for a suggested approach, as I haven't found these covered in depth in any of the documentation.
Thanks in advance!!