HYCU, Inc. has announced R-Cloud, a paradigm shift in the way data in SaaS applications is protected. Why do we need to address issues with SaaS protection, and how has R-Cloud changed the game?
Software-as-a-Service (SaaS) solutions are used widely across businesses. Within Architecting IT, we use about a dozen services that cover aspects as varied as banking, accounting & finance, time management, email & collaboration, CRM, process & project management and, of course, social media.
Whether your business is a start-up or well-established, SaaS will be used in some form. In a podcast we recorded in December 2022, HYCU CEO Simon Taylor highlighted the availability of around 16,000 SaaS applications in the US alone. This number grows daily.
Unlike on-premises systems, IaaS, or PaaS (managed databases, for example), SaaS solutions generally obfuscate their implementation from the customer. Interaction with SaaS tools occurs through a GUI or structured (and hopefully well-documented) CLI or API commands. If a SaaS provider replaces the back-end database, the customer doesn’t need to know. SaaS solutions can evolve, add new features, and implement efficiency improvements, all without any change to the customer’s experience.
However, the hidden nature of a SaaS implementation means the SaaS provider must either provide integrated data protection or expose APIs and functionality that enable the customer to implement 3rd party data protection. Two prominent examples of this requirement are Microsoft 365 and Salesforce. Neither company provides granular data recovery, committing only to restoring service in the event of a platform outage (such as a hardware failure). As a result, many 3rd party vendors now offer to protect the data in these solutions.
Data Loss Prevention
If the SaaS vendor implements platform recovery, why is external backup needed? Clearly, data loss scenarios aren’t limited to hardware failures. Software solutions have bugs that corrupt data, there are malicious actors looking to steal or encrypt data, plus there’s the perennial “fat finger” issue when someone innocently and inadvertently deletes your entire customer database. It’s also always a good idea to logically air gap backup data from production by storing it on a separate platform or remote system, as the unfortunate customers of OVH know only too well.
But why is SaaS different to the application deployment models used elsewhere? First, we need to look at how data protection operates in a diverse landscape. Backup works in a similar fashion to the way data is extracted from production systems to be used in data warehouses. We’ve extended the ETL (Extract, Transform, Load) paradigm with an Index option and created ETIL, as shown in figure 1.
All data protection solutions extract data from production systems for storage elsewhere. At the turn of the millennium, this process was achieved using agents deployed onto physical servers. Some data extraction processes use snapshots (which are then copied elsewhere), while another route is to use application-specific tools, like RMAN. Server virtualisation introduced the capability to collect data from APIs like VADP (now called VMware vSphere Storage APIs – Data Protection). We highlighted this different approach in a blog post from five years ago that discussed HCI data protection.
Once data is extracted from the primary system, backup software will transform it, removing duplication, parsing snapshots for individual files, and discarding anything that doesn’t need to be stored. The indexing step tracks the content, providing a timeline for the future recovery of individual files, systems, or virtual instances. Finally, these secondary copies of data are stored in a repository that could be an object store, a dedicated backup platform, or good old-fashioned tape.
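The ETIL flow just described can be illustrated with a short sketch. This is purely an illustrative model of the Extract, Transform, Index, Load stages, not any vendor's actual implementation; all function and variable names are hypothetical.

```python
import hashlib

def extract(source):
    """Pull raw records from the primary system (agent, snapshot, or API)."""
    return list(source)

def transform(records, seen_hashes):
    """Deduplicate: keep only content not already held in the repository."""
    unique = []
    for rec in records:
        digest = hashlib.sha256(rec.encode()).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            unique.append((digest, rec))
    return unique

def index(catalog, backup_time, records):
    """Track what was captured and when, enabling point-in-time recovery."""
    catalog.setdefault(backup_time, []).extend(d for d, _ in records)

def load(repository, records):
    """Store the deduplicated copies in the backup repository."""
    for digest, rec in records:
        repository[digest] = rec

# One backup run through the ETIL pipeline.
seen, catalog, repo = set(), {}, {}
batch = transform(extract(["fileA-v1", "fileB-v1", "fileA-v1"]), seen)
index(catalog, "2023-01-31T02:00", batch)
load(repo, batch)
```

Note that the duplicated record is discarded at the transform stage, so only two copies reach the repository, while the index retains the timeline needed to recover individual items later.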
The common theme across all these protection methods is the work required by the backup vendor. Agents are written and supported by the vendor for a wide range of platforms and operating systems. The backup vendor implements code to interface with RMAN and the hypervisor APIs. In fact, any new platform requires the data protection vendor to deliberately and specifically choose to implement support for backup by writing software to extract data from that primary system.
Once the vendor starts supporting a system, any updates or changes to the platform could require updates to agents or APIs. Most backup administrators will know the ongoing challenge of keeping agents compatible with operating systems and the effort involved in upgrading backup platforms. Server virtualisation certainly helped reduce the burden, but the trigger process is still the same – the platform vendor changes its system, and the backup vendor then reacts and builds in support for new features or functionality.
As we describe how the “Extract” process is implemented, it becomes clear that SaaS support has a problem. With 16,000 applications in play, the typical vendor approach is to apply the 80/20 rule or attack the “low-hanging fruit”. Adding support for Salesforce or Microsoft 365 is an easy way to address the backup of terabytes of customer data. But data in SaaS applications has a long tail. Which backup vendor can claim one application is more important than another? Only the businesses using those applications can know. Some tiny SaaS application may be critical to the ongoing operation of a business. If there’s no scope to protect that data, then the business is at risk, or that application can’t be used.
This is where we see a disconnect in typical application onboarding. The vendor picks the solutions that address the greatest number of customers, tying onboarding support to revenue. In some respects, we can’t blame backup vendors for taking this approach. After all, they’re looking to increase the TAM for backup. However, even if the backup vendor could onboard 16,000 applications, the rate of change occurring across those systems would make ongoing support untenable due to the enormous cost.
What if we could turn this onboarding process on its head? Well, this is what R-Cloud has done. Rather than have the backup vendor code an API for each SaaS solution, the SaaS vendor codes data protection on R-Cloud. This approach may seem like it simply pushes the work somewhere else, but remember that the SaaS vendor must provide some API for data protection in any case, so adding support for R-Cloud should be relatively trivial. This approach also brings other benefits.
- Scalability – rather than requiring the backup vendor to support 16,000 applications, 16,000 vendors code for one API. This process scales efficiently, as the support of each new vendor requires minimal interaction. Now there’s no need to target the “low-hanging fruit”.
- Compatibility – the SaaS vendor is now responsible for maintaining backup/restore compatibility and so can build this into the release timeline for new features. The vendor also has the responsibility of ensuring that new features don’t break backwards restore capability. Customers can then be sure that backup and restore will always work or be directed to take additional backup if systems materially change.
- Value-Add – for the SaaS vendor, integrated (and granular) backup can now be offered as an additional service to the customer. With a well-written API, HYCU doesn’t need to be involved in the onboarding of new customers, as they can come via the SaaS provider. For smaller SaaS vendors that might never be supported by the traditional backup vendors, the R-Cloud route can make them more attractive to the enterprise. Remember that HYCU is configured so the customer provides the storage space for the “load” function. This is another area where MSPs could add value.
- Unified view – for businesses that use SaaS solutions, R-Cloud can provide a unified view of application data and existing protection status. In addition, as HYCU Protégé is implemented through standardised single sign-on platforms such as Okta and Atlassian, auto-discovery becomes much easier to deliver. It’s no surprise that recent investment in HYCU has come from the venture arms of both of these companies.
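The inversion described above amounts to each SaaS vendor implementing a single, common contract once, rather than every backup vendor writing a connector per application. The sketch below illustrates that idea in the abstract; the interface, class names, and methods are all hypothetical, not HYCU's actual R-Cloud API.

```python
from abc import ABC, abstractmethod

class BackupProvider(ABC):
    """Hypothetical contract a SaaS vendor implements once, so any
    backup platform can protect the application without bespoke code."""

    @abstractmethod
    def export_data(self) -> dict:
        """Return the application's data in a vendor-defined format."""

    @abstractmethod
    def import_data(self, snapshot: dict) -> None:
        """Restore a previously exported snapshot."""

class CrmAppProvider(BackupProvider):
    """Example: a small CRM SaaS opts in by implementing the contract."""
    def __init__(self):
        self.records = {"cust-1": "Acme Ltd"}

    def export_data(self) -> dict:
        return dict(self.records)

    def import_data(self, snapshot: dict) -> None:
        self.records = dict(snapshot)

# The backup platform only ever talks to the common interface,
# so onboarding a new SaaS application requires no new connector code.
def run_backup(provider: BackupProvider) -> dict:
    return provider.export_data()

app = CrmAppProvider()
snapshot = run_backup(app)
app.records.clear()            # simulate an accidental deletion
app.import_data(snapshot)      # granular recovery via the provider
```

The key design point is that the maintenance burden now sits with the party that changes the application: when the SaaS vendor ships a new feature, it updates its own provider implementation, and the backup platform is unaffected.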
Naturally, the first question that springs to mind with R-Cloud is to ask, “How can a generic backup API support thousands of varied applications?” In the coming weeks, we’ll dig a little deeper into the specifics of the implementation. But for now, imagine how a single sign-on platform such as Active Directory can support business operational structures. This concept will be crucial to the success of R-Cloud – the ability to visualise and protect data with an application-centric rather than infrastructure view. Something we’ve been banging on about for years.
The Architect’s View®
Rarely does a new technology or concept emerge with the potential to change the approach to an entire segment of the IT landscape. If HYCU can pull it off, R-Cloud will represent the most significant paradigm shift in data protection for decades. The measure of success, of course, is how readily SaaS vendors adopt the technology. However, with much of the work in place (through the relationships with Okta and Atlassian), the chances look very high.
Copyright (c) 2007-2023 – Post #e3e5 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.