Last month, HYCU, Inc. announced the availability of HYCU Protégé for Kubernetes. This new feature provides data protection capabilities for Kubernetes workloads, initially on Google Kubernetes Engine (GKE). In this post, we take a brief look at why this evolution is important and how it continues the unified backup strategy we’ve already seen from the HYCU team.
Stateful workloads are becoming more popular within the Kubernetes ecosystem. The introduction of CSI and more mature persistent volumes means it’s now possible to build out long-living applications with on-premises or cloud-managed Kubernetes clusters.
As we’ve said many times, backup is your responsibility, even in the public cloud. That applies to applications running under container-based clusters too. Any infrastructure is subject to failure, even if it’s built with cluster resiliency. We also have to remember that backup is there to protect against logical corruption (bad code) or human error (like deleting the wrong cluster).
Data protection for container-based applications can be broken into two parts. First, there is the protection of the data itself, typically held on persistent volumes. Second, the metadata associated with the cluster and the application must be protected, because it provides the association between the application and its persistent volume claims (PVCs) and enables recovery of the application definitions. The default naming of persistent volumes generally gives no indication of the source application, and the current implementation of CSI doesn’t mandate the level of essential metadata needed.
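To make the metadata point concrete, here is a minimal Python sketch of how an application-to-volume mapping can be rebuilt from cluster object definitions. It is an illustration only, not how Protégé works internally. The function name `map_apps_to_volumes` and the sample objects are hypothetical; the fields used (`spec.volumes[].persistentVolumeClaim.claimName` on pods, `spec.volumeName` on PVCs and the `app.kubernetes.io/name` label) are standard Kubernetes fields, and the sketch assumes the objects have already been exported as plain dictionaries.

```python
from collections import defaultdict

# Recommended Kubernetes label for the application name (assumed to be set).
APP_LABEL = "app.kubernetes.io/name"


def map_apps_to_volumes(pods, pvcs):
    """Return {application name: sorted list of (claim name, bound PV name)}."""
    # Index PVCs by (namespace, name) so pod references can be resolved.
    bound = {
        (c["metadata"]["namespace"], c["metadata"]["name"]):
            c["spec"].get("volumeName", "<unbound>")
        for c in pvcs
    }
    result = defaultdict(set)
    for pod in pods:
        meta = pod["metadata"]
        app = meta.get("labels", {}).get(APP_LABEL, "<unlabelled>")
        for vol in pod["spec"].get("volumes", []):
            claim = vol.get("persistentVolumeClaim")
            if claim is None:
                continue  # configMap, secret, emptyDir and so on
            name = claim["claimName"]
            pv = bound.get((meta["namespace"], name), "<unknown claim>")
            result[app].add((name, pv))
    return {app: sorted(vols) for app, vols in result.items()}


# Hypothetical sample objects, trimmed to the fields the sketch reads.
pods = [{
    "metadata": {"namespace": "fin", "name": "postgresql-0",
                 "labels": {APP_LABEL: "postgresql"}},
    "spec": {"volumes": [
        {"name": "data",
         "persistentVolumeClaim": {"claimName": "data-postgresql-0"}},
        {"name": "cfg", "configMap": {"name": "pg-config"}},
    ]},
}]
pvcs = [{
    "metadata": {"namespace": "fin", "name": "data-postgresql-0"},
    "spec": {"volumeName": "pvc-3f2a9c1e"},
}]

mapping = map_apps_to_volumes(pods, pvcs)
```

Note that the volume name alone (`pvc-3f2a9c1e` in this made-up example) says nothing about PostgreSQL. Without the pod and PVC definitions, the link back to the application is lost, which is why the metadata needs protecting alongside the data.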
Protégé for Kubernetes
HYCU has implemented data protection for Kubernetes as a service offering on Google Cloud Platform (GCP), extending the capabilities already available for protecting instances and SAP HANA databases. Last year, we looked at both the options for moving virtual machines into GCP and the protection of workloads in Azure.
The GCP implementation has the same look, feel and operation as the previous two implementations. Customers sign up, then access the Protégé portal through a web endpoint that links directly to their GCP account. There’s no additional installation or configuration from the customer’s perspective. Figure 1 shows a screenshot from the dashboard of the HYCU demo system on GCP.
The Protégé platform scans for new applications every 15 minutes or can be prompted to re-scan from the dashboard. Figure 2 shows a screenshot of a list of discovered applications. I’ve clicked on the postgresql application in the fin-cluster cluster to highlight specific backup details on the lower half of the screen.
GKE applications are assigned the “GKE Application” type; however, it’s worth noting that other application types are also available, specifically the SAP HANA database backups shown in the screenshot. In a recent blog post, we discussed the need for managed backup of managed databases, which aligns with the capability seen in this release of Protégé. Hopefully, over time HYCU will add further managed databases to the capabilities of Protégé on GCP and other platforms.
The targets and policies for data protection in Protégé on GCP follow the same standards as all other HYCU implementations. This design is essential because it enables the consistent application of data protection rules across disparate platforms, such as on-premises applications, Protégé in Azure or Protégé in GCP. These definitions can be downloaded or applied via API, which is more practical than manually re-entering definitions via the GUI.
Protégé offers three options for recovering data from backups (Figure 3). We can restore an entire application, just the persistent storage, or individual objects within the cluster. Figure 4 shows an expanded selection to restore part of a cluster application set. In this case, I’ve highlighted the postgresql application from earlier.
As we can see, Protégé provides recovery of both data and metadata components, to a granular level.
The Architect’s View
This quick example of the Protégé capability for Kubernetes on GCP highlights how easy it can be to ensure data protection is in place for all data types. At present, the implementation enables backup and restore to and from a Kubernetes cluster. I can’t (for example) restore a PVC into a virtual instance and access the data that way. However, because all the metadata is in a single location with Protégé for GCP, this is entirely possible to achieve; it just hasn’t been written this way yet.
The same applies to cross-platform backup and restore. As we showed in the Protégé test run mentioned above, it’s already possible to back up data from on-premises and restore it into GCP (and vice versa).
There are two changes that would be useful at this point. The first is cross-platform replication of backup data from any existing implementation to any other. The second is a universally unique attribute to describe applications, so that any backup from any platform can be listed under a single entity that shows all the historical backups taken, wherever they’re from. This is important if and when applications start to become more mobile and move around a dynamic infrastructure.
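As a rough illustration of the second suggestion, the Python sketch below groups backup records from several platforms under one application identifier and orders them by time. Everything here is hypothetical: the `BackupRecord` type, the `app_uid` field and the sample data are assumptions made for the example and don’t reflect any existing HYCU data model or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BackupRecord:
    app_uid: str        # hypothetical universally unique application ID
    platform: str       # where the backup was taken
    taken_at: datetime  # when the backup was taken


def history_by_application(records):
    """Group records by application ID, oldest backup first."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.app_uid].append(record)
    return {
        uid: sorted(items, key=lambda r: r.taken_at)
        for uid, items in grouped.items()
    }


# Made-up records for one application that moved between platforms.
records = [
    BackupRecord("app-0001", "gcp", datetime(2021, 2, 1)),
    BackupRecord("app-0001", "on-premises", datetime(2021, 1, 10)),
    BackupRecord("app-0002", "gcp", datetime(2021, 1, 15)),
    BackupRecord("app-0001", "azure", datetime(2021, 1, 20)),
]

history = history_by_application(records)
platforms_for_app = [r.platform for r in history["app-0001"]]
```

With a shared identifier, the history for `app-0001` reads as a single timeline across on-premises, Azure and GCP, rather than three unrelated lists held in three separate places.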
HYCU is moving inexorably to a unified data protection model, where local implementations provide the performance and agility for that platform. At the same time, the metadata rolls up into a unified data protection view. This final step will be an enabler for dynamic and flexible multi-cloud application deployments.
In future posts, we will be digging a bit deeper into the mechanics of Kubernetes backup and restore, with a practical demonstration of application cloning in GCP. We will also be updating our data protection eBook to include details of container backup solutions.
Copyright (c) 2007-2021 – Post #6bdd – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.