Why VSAN and VVOLs are Not Software Defined

Chris Evans | Storage

Every so often the vendor community makes another attempt to put a definition to what we call “software defined storage” or SDS. The current discussion is on VMware Virtual SAN (VSAN) and VVOLs, software features focused on storage within the vSphere hypervisor ecosystem. This seems like a good opportunity for me to review what I’ve written in the past and see if there’s an improvement to be made to my existing definitions. It’s also an opportunity to explain why I don’t think VSAN or VVOLs are truly software defined.

Definitions

Let’s start with some definitions. We can split functionality into what’s commonly known as the control plane and the data plane. The control plane refers to the management features of the storage product, platform, or array. The data plane describes the components that move data back and forth between the host and the platform.

Within these definitions we need to go into more detail. At the control plane level, SDS should at a minimum cover the ability to configure, orchestrate or otherwise control the provisioning and consumption of storage resources. This can mean a LUN, a file share, an object or a VVOL. These functions should be managed through an API, enabling the platform to be integrated into management and orchestration frameworks and removing human intervention. Ideally, the initial configuration of storage resource layout or characteristics (such as protection levels and availability) should also be included, and some platforms allow this; however, I don’t consider it essential and I look at it as more like a one-time installation process.

At the data plane level, we should expect hardware abstraction. By this I mean breaking the dependence on hardware characteristics in delivering storage to a host; delivery should instead be based on policy metrics, including performance. To use performance as an example, policy should be applied to the platform object (say a LUN or volume) such that performance characteristics like minimum/maximum IOPS and latency can be independently attributed to each entity. Basically each volume has a set of prescribed performance characteristics and it’s up to the platform (e.g. an array) to deliver this.
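To make the two planes concrete, here is a minimal sketch in Python. It is not any vendor's API: `PerformancePolicy`, `Volume`, `StoragePlatform`, `provision` and `allowed_iops` are hypothetical names invented for illustration, and the numbers are arbitrary. It shows a volume being provisioned through a programmable interface with its own performance policy attached, and the platform capping I/O against that policy rather than against any hardware property.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class PerformancePolicy:
    """Logical performance targets with no reference to hardware (hypothetical)."""
    name: str
    min_iops: int          # floor the platform is expected to guarantee (not enforced here)
    max_iops: int          # ceiling applied to this volume only
    max_latency_ms: float  # latency target, recorded but not enforced in this sketch


@dataclass
class Volume:
    name: str
    size_gb: int
    policy: PerformancePolicy


@dataclass
class StoragePlatform:
    """Stand-in for an array or platform with an API-driven control plane."""
    volumes: Dict[str, Volume] = field(default_factory=dict)

    def provision(self, name: str, size_gb: int, policy: PerformancePolicy) -> Volume:
        # Control plane: create the volume and attach its policy in one call.
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        volume = Volume(name, size_gb, policy)
        self.volumes[name] = volume
        return volume

    def allowed_iops(self, name: str, requested_iops: int) -> int:
        # Data plane: cap the request at the volume's own policy ceiling.
        return min(requested_iops, self.volumes[name].policy.max_iops)


gold = PerformancePolicy("gold", min_iops=5000, max_iops=20000, max_latency_ms=1.0)
bronze = PerformancePolicy("bronze", min_iops=100, max_iops=1000, max_latency_ms=20.0)

platform = StoragePlatform()
platform.provision("db01", 500, gold)
platform.provision("archive01", 2000, bronze)

print(platform.allowed_iops("db01", 50000))       # 20000
print(platform.allowed_iops("archive01", 50000))  # 1000
```

The point of the sketch is that the two volumes sit on the same platform yet carry independent performance characteristics, and nothing in the policy mentions disks, cache or controllers.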

What should not happen in SDS is for objects to experience a change in performance levels when the hardware configuration is changed, for instance by adding more disks or replacing disks with faster models. Assuming a platform has sufficient performance capacity, delivery of I/O performance shouldn’t change if more entities are created and consumed (for example, when more LUNs are created and accessed).

Why would we want this level of abstraction? Server virtualisation provides a clue. A virtual machine should perform equally whether on a server with 4 cores or 24 cores. VM performance is simply measured in processor cycles. Similarly, storage performance should (subject to sufficient resources) be attributed to each volume based on logical ideals such as IOPS and latency.
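The “subject to sufficient resources” caveat can be sketched as a simple admission check. This is an illustrative model only, assuming a platform whose total capability can be summarised as a single IOPS figure; the `Platform` class and its methods are hypothetical. It shows that adding hardware changes the platform's headroom, not the performance attributed to existing volumes.

```python
class Platform:
    """Hypothetical platform that reserves guaranteed IOPS per volume."""

    def __init__(self, capacity_iops: int) -> None:
        self.capacity_iops = capacity_iops
        self.reserved = {}  # volume name -> guaranteed IOPS

    def headroom(self) -> int:
        # Capacity not yet promised to any volume.
        return self.capacity_iops - sum(self.reserved.values())

    def create_volume(self, name: str, guaranteed_iops: int) -> bool:
        # Admission control: refuse a volume whose guarantee cannot be met,
        # so existing volumes never lose performance to new ones.
        if guaranteed_iops > self.headroom():
            return False
        self.reserved[name] = guaranteed_iops
        return True

    def add_hardware(self, extra_iops: int) -> None:
        # More or faster disks raise capacity; existing guarantees are untouched.
        self.capacity_iops += extra_iops


p = Platform(capacity_iops=10000)
print(p.create_volume("vol1", 4000))  # True
print(p.create_volume("vol2", 4000))  # True
print(p.create_volume("vol3", 4000))  # False, only 2000 IOPS of headroom left

before = p.reserved["vol1"]
p.add_hardware(10000)
print(p.reserved["vol1"] == before)   # True, vol1 is unaffected by the upgrade
print(p.create_volume("vol3", 4000))  # True, the new headroom admits vol3
```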

SDS and VSAN

So why is VMware Virtual SAN not software defined? At the control plane, the feature provides the ability to assign storage resources to virtual machines within a programmable framework, so this ticks the box. VSAN also allows storage resources to be auto discovered and brought into the framework, so again it ticks the box. However, when we look at performance, VSAN is 100% dependent on the underlying hardware with respect to the I/O performance virtual machines receive.

In fact the VSAN 6.0 Design and Sizing Guide goes into specific detail about sizing cache to workload and the characteristics of various different types of spinning disk. It also talks about assigning physical resources to VM groups through parameters like FlashReadCacheReservation, rather than assigning policy metrics like IOPS or latency. Now, just to be clear, vSphere 6 allows IOPS limits to be applied to a VM, but this is a configuration rather than a policy option, so changing VMs en masse means changing each VM in turn (while powered down, I believe) rather than simply updating a policy definition. So VSAN fails the performance abstraction test and therefore isn’t software defined.
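The difference between a per-VM configuration and a policy can be shown with a short Python sketch. It does not use or model the vSphere API; `IopsPolicy`, `VmWithSetting` and `VmWithPolicy` are hypothetical types made up for illustration. With a per-VM setting, every VM has to be touched to change the limit; with a shared policy, one update applies to every VM that references it.

```python
from dataclasses import dataclass


@dataclass
class IopsPolicy:
    name: str
    iops_limit: int


@dataclass
class VmWithSetting:
    name: str
    iops_limit: int  # value stored on each VM individually


@dataclass
class VmWithPolicy:
    name: str
    policy: IopsPolicy  # shared reference to a policy definition


# Configuration approach: the limit lives on each VM, so each VM is edited in turn.
configured = [VmWithSetting(f"vm{i}", 1000) for i in range(3)]
touched = 0
for vm in configured:
    vm.iops_limit = 2000
    touched += 1

# Policy approach: the limit lives in one definition that the VMs point to.
silver = IopsPolicy("silver", 1000)
governed = [VmWithPolicy(f"vm{i}", silver) for i in range(3)]
silver.iops_limit = 2000  # a single update

print(touched)                                  # 3
print([vm.policy.iops_limit for vm in governed])  # [2000, 2000, 2000]
```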

SDS and VVOLs

Virtual Volumes or VVOLs allow external storage arrays to work at a granularity level of the VM rather than a LUN or volume (which typically matches to a datastore). Without VVOLs, policies applied to a LUN/datastore affect all VMs on that LUN, so even if an array supports features such as Quality of Service (QoS), the QoS policy would apply to all VMs within that datastore at the same time. Implementing multiple policies requires multiple datastores, which is time consuming to administer and potentially wastes resources.

On block-based arrays, VVOLs are essentially still LUNs but are a specific SCSI LUN type known as a dependent LUN. A dependent LUN in SCSI terms is a device associated with a primary LUN, which in VVOL terms is our protocol endpoint. Protocol endpoints and dependent LUNs are needed to overcome the ESXi restriction on LUN device addressing: the limit is quite low at 256 devices (iSCSI and FC), which is also the maximum number of protocol endpoints. It would have been perfectly possible for storage array vendors to implement VVOLs by making each VVOL a LUN; however, the addressing limit would have been reached very quickly, and increasing it would have led to SCSI device discovery performance issues.
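The addressing argument can be illustrated with a toy model in Python. It is a sketch under simplifying assumptions, not a description of how ESXi or any array implements protocol endpoints: `Host`, `ProtocolEndpoint`, `attach` and `bind` are hypothetical names, and the only figure taken from the text is the 256-device limit. It contrasts one LUN per VVOL with VVOLs addressed behind a single protocol endpoint.

```python
MAX_HOST_LUNS = 256  # ESXi LUN addressing limit quoted in the text above


class Host:
    """Hypothetical host with a fixed number of LUN addresses."""

    def __init__(self) -> None:
        self.lun_slots = {}  # LUN number -> attached device

    def attach(self, device) -> int:
        # Every directly attached device consumes one host LUN address.
        if len(self.lun_slots) >= MAX_HOST_LUNS:
            raise RuntimeError("no free LUN addresses on this host")
        lun = len(self.lun_slots)
        self.lun_slots[lun] = device
        return lun


class ProtocolEndpoint:
    """Hypothetical primary LUN that fronts many dependent LUNs (VVOLs)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.bindings = {}  # secondary ID -> VVOL name

    def bind(self, vvol_name: str) -> int:
        # A bound VVOL gets a secondary ID under the endpoint,
        # not a host LUN address of its own.
        secondary_id = len(self.bindings)
        self.bindings[secondary_id] = vvol_name
        return secondary_id


# Option 1: one LUN per VVOL. The host runs out of addresses at 256.
direct = Host()
for i in range(MAX_HOST_LUNS):
    direct.attach(f"vvol{i}")
try:
    direct.attach("one-too-many")
    overflowed = False
except RuntimeError:
    overflowed = True

# Option 2: one protocol endpoint, with VVOLs addressed as (PE LUN, secondary ID).
via_pe = Host()
pe = ProtocolEndpoint("pe0")
pe_lun = via_pe.attach(pe)
addresses = [(pe_lun, pe.bind(f"vvol{i}")) for i in range(1000)]

print(overflowed)             # True
print(len(via_pe.lun_slots))  # 1
print(len(addresses))         # 1000
```

In this model the host only ever discovers the protocol endpoint, which is why the approach sidesteps both the address limit and the device discovery cost.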

So, VVOLs are just LUNs, albeit special ones. Going back to our definitions, VVOLs can be administered through an API, so tick that box. However, when it comes to performance, there’s no intrinsic treatment of performance abstraction for a VVOL over a LUN. Therefore, if the underlying storage platform doesn’t support features such as QoS, VVOLs on that platform don’t get performance-based metrics applied to them and so aren’t Software Defined. SDS only applies when the underlying array offers that feature.

The Architect’s View®

The current VVOL implementation was a convenient workaround to fix the problem of ESXi addressing many thousands of potential LUNs on a single storage system. However, unless the array vendor provides SDS-based features, VVOLs are nothing more than a clever mapping system. VSAN is still stuck in the weeds of hardware specifics and doesn’t surface abstracted performance metrics to the VM. Calling both of these features Software Defined Storage is therefore perhaps overstepping the mark.

Copyright (c) 2009-2022 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.