This is the fourth in a series of posts looking at the StorPool software-defined storage platform. In this post, we will look at Kubernetes and provisioning storage to containerised applications.
- StorPool Review – Part 1 – Installation & Configuration
- StorPool Review – Part 2 – Performance
- StorPool Review – Part 3 – Connectivity & Scripting
Containerisation gained significant popularity with the development of Docker and the framework built up around the creation and deployment of application containers. Kubernetes has now surpassed Docker to become the de facto standard for container-based application deployments.
As with many new and emerging technologies, storage tends to lag behind in development. We saw this problem with both Docker and OpenStack, where applications initially started out stateless and transitioned to stateful workloads as the platforms matured. Kubernetes is no different in that the storage components have taken time to evolve to a useful degree of maturity. However, modern containerised applications demand persistent storage and can’t offer enterprise-class capabilities without it.
Container Storage Interface
The Kubernetes community has addressed the management of persistent storage through the Container Storage Interface or CSI. The design of CSI provides a pluggable framework for vendors to integrate their storage solutions without having to update the base Kubernetes platform. CSI offers the capability for multiple vendor support and for ongoing addition of features and functionality as new versions of Kubernetes are released.
CSI reached general availability with Kubernetes v1.13, with the GA announcement published on 15th January 2019. The current release of the CSI specification is v1.5.0. More details can be found here.
CSI and StorPool
StorPool provides block storage to Kubernetes clusters. As we examined in previous posts, a StorPool cluster can be run in HCI mode, where each node contributes and consumes storage, or in client-only mode, where a node consumes resources but doesn’t contribute any (see post 1 and figure 1 for more information).
Both client and storage nodes can form part of a Kubernetes cluster, enabling, for example, a 1:1 relationship between StorPool storage cluster nodes and Kubernetes nodes, or a Kubernetes cluster with only client nodes and storage delivered from elsewhere. A StorPool cluster also supports multiple Kubernetes clusters, so any combination of client and storage node configurations is supported, within the pre-requisites and installation processes documented by StorPool (listed here).
StorPool requires an additional management process to run on each Kubernetes node (alongside the kubelet), as well as the deployment of the CSI plugins. The process is straightforward, so not documented here.
StorPool, CSI and iSCSI
StorPool supports a second deployment model using iSCSI. As detailed in post #3, StorPool enables access to LUNs for non-Linux hosts via iSCSI. This mechanism also provides the ability for Kubernetes clusters not running StorPool data services to access a StorPool cluster.
The use of the iSCSI model is helpful when, for example, scaling workloads past the capability of a single cluster or deploying StorPool as a dedicated storage solution.
Testing StorPool & CSI
So, how do CSI and StorPool work together? The installation process for the StorPool CSI driver creates a StorageClass, effectively a definition for how persistent volumes will be created and assigned to Kubernetes pods. An example StorageClass definition for StorPool is shown below (figure 1).
```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: storpool-csi
#  namespace: kube-system
#  annotations:
#    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi-driver.storpool.com
reclaimPolicy: Delete
parameters:
  template: "nvme"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
The “provisioner” in this instance is the StorPool CSI driver, which talks to the StorPool cluster to make storage volumes available to Kubernetes pods. StorageClass definitions have several parameters. The reclaimPolicy determines what happens to a volume once its storage claim is removed; with “Delete”, the StorPool volume and its data are deleted along with the claim, a construct called a PVC or Persistent Volume Claim (more on this in a moment). The volumeBindingMode parameter determines when a volume will be created. StorPool only supports “WaitForFirstConsumer” and not “Immediate”, which effectively means a volume will be created once a pod is scheduled, rather than when the request for a volume (a Persistent Volume Claim) is created.
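As an illustrative sketch, a claim against this StorageClass might look like the following (the claim name spvol1 matches our test volumes later in this post; the requested size is an assumption, not taken from the StorPool documentation):

```yaml
# Hypothetical PVC requesting storage from the storpool-csi class.
# The 10Gi size is illustrative only.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spvol1
spec:
  accessModes:
    - ReadWriteOnce          # block storage, single-node access
  storageClassName: storpool-csi
  resources:
    requests:
      storage: 10Gi
```

Because the class uses WaitForFirstConsumer, applying this manifest alone leaves the claim pending; no StorPool volume is created until a pod references it.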
The parameters section provides the capability to pass configuration parameters to the StorPool CSI provisioner. In this StorageClass, we indicate the StorPool template used should be “nvme”.
Figure 2 shows three StorageClass definitions in our test cluster and the corresponding template on the StorPool cluster. There’s no need to add further parameters to the CSI StorageClass definitions, as these can all be accommodated through a template. This includes the ability to place QoS restrictions on volumes, as previously shown in post 2 of this series.
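As a sketch of how additional classes map to templates, a second tier only requires pointing at a different StorPool template (the template name "hdd" here is an assumption for illustration; any placement or QoS settings live in the template itself):

```yaml
# Illustrative second StorageClass; assumes a StorPool template named
# "hdd" exists on the cluster with its own placement/QoS configuration.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: storpool-csi-hdd
provisioner: csi-driver.storpool.com
reclaimPolicy: Delete
parameters:
  template: "hdd"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```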
Creating a Volume
Kubernetes uses a two-stage process for volume provisioning. A Persistent Volume Claim (PVC) represents a request for storage via a StorageClass (and therefore from a CSI storage provider). As we highlighted earlier, the StorageClass specifies the binding mode of volumes as either immediate or “just in time” (WaitForFirstConsumer). As we can see from figure 3, all eight volume claims created are in a pending state because no pod has yet requested to use them. No volumes are mapped to the claims until a pod attempts to use them.
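Attaching a claim happens in the pod spec. A minimal sketch is shown below (the pod name and image are illustrative; scheduling a pod like this is what triggers volume creation under WaitForFirstConsumer):

```yaml
# Hypothetical pod binding the spvol1 claim; once this pod is scheduled,
# the StorPool CSI driver creates and attaches the backing volume.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: spvol1
```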
In figure 4, both spvol1 and spvol2 have been attached to a pod and now have bound physical devices. Figure 5 shows the two devices in the StorPool CLI. The first volume is fully allocated, while the second volume has no data written to it. Both volumes have a system-generated name (starting with ~) and a tag that maps back to the persistent volume claim.
How does StorPool performance within Kubernetes compare to running StorPool natively with applications? We re-ran the performance tests from post #2 in this series, this time within a Kubernetes pod, similar to the performance tests run in this recent report.
The results (see figures 6 to 10) show that the performance difference between native fio and fio run within a container is negligible. Whilst we would expect this, because there’s very little overhead in container-based environments, it confirms that software-defined storage solutions are a good fit for containerised environments when compared to dedicated storage area networks.
The Architect’s View™
Support for Kubernetes is now table stakes. Storage vendors have to offer a CSI plugin that automates the provisioning process. The challenge for all vendors, including StorPool, is providing the right level of integration between the consumption and provisioning platform. StorPool’s use of templates enables the offload of storage attributes to the storage layer. These are then dynamically configurable even after the StorageClass is created, and can be extended as a superset of the features offered in CSI.
I’d like to see StorPool extend their Kubernetes support to integrate the creation of snapshots directly into the CSI functionality. Another useful feature would be to seed new volumes from snapshots, although that could be achieved with templates.
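For reference, upstream Kubernetes expresses CSI snapshot support through VolumeSnapshot objects. A generic sketch of what that integration would look like is shown below (the snapshot class name is hypothetical, and StorPool does not currently ship this functionality):

```yaml
# Generic CSI snapshot request; "storpool-snapclass" is a hypothetical
# VolumeSnapshotClass that a future StorPool integration might provide.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: spvol1-snap
spec:
  volumeSnapshotClassName: storpool-snapclass
  source:
    persistentVolumeClaimName: spvol1
```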
In our next posts, we will cover:
- Post 5 – Failure modes, managing device failures and integrity checking
This work has been made possible through sponsorship from StorPool.
Copyright (c) 2007-2021 – Post #317f – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.