Geoffrey Moore argues that successful companies should launch a new product or service every decade. Each new offering needs to contribute 10% or more of revenue initially and eventually become a significant income generator for the business. Pure Storage started with FlashArray, then introduced FlashBlade in a textbook “Zone to Win” process. Is the partnership with Cohesity set to be the next decade’s money-spinner?
Successful companies have to expand their portfolio of offerings continually. EMC did this in the 1990s and 2000s through acquisition. This process put the company in a strong position to offer enterprise (Symmetrix), midrange (Clariion) and secondary storage (Data Domain) to customers.
Enterprises want to buy portfolio solutions because they offer a single purchasing point, a single set of contacts and (hopefully) greater integration and fewer risks. Buying more from a single vendor also creates the opportunity for cost savings through price reductions – “buy all your stuff from us, and we’ll give you an extra 10% off”.
When Pure Storage introduced ObjectEngine, it was clear that the solution was developed as a portfolio product. Pure has always competed directly against EMC (now Dell EMC), and ObjectEngine was a deliberate Data Domain competitor. The product originated from StorReduce, a smart technology to optimise AWS S3 object gateways by inserting a deduplication “proxy” in front of S3 buckets. ObjectEngine did much the same for FlashBlade and provided a way to optimise a platform with no native deduplication. Customers were already using FlashBlade for backup, so ObjectEngine reduced the cost of storing backups on FlashBlade and gave Pure a portfolio product to compete against Data Domain.
As reported by Chris Mellor in July this year, ObjectEngine has been quietly discontinued. Perhaps this wasn’t a surprise to many. We recorded a Storage Unpacked podcast with Brian Schwarz in July last year. Brian (who was our primary contact for ObjectEngine) left Pure for Google around May or June, which fits with the demise of ObjectEngine.
Remember that ObjectEngine is merely a deduplication engine using FlashBlade as the data repository. It’s not a full backup solution. Pure customers still need to have some software to perform backups and write to ObjectEngine as a target. Data protection partners handle that process.
Referring back to our portfolio analogy, Pure Storage needs a fully-featured solution to offer customers. ObjectEngine was good for customers that had already committed to a backup vendor; however, owning the customer means owning the platform solutions entirely. Buying a data protection platform is expensive, so partnering offers the next best thing. However, this strategy needs to create a solution that is greater than the sum of the parts. Otherwise, there’s no benefit in partnering.
Pure Storage and Cohesity have announced a partnership and a new solution called FlashRecover (not to be confused with the async replication feature that used the same name). The FlashRecover platform is a complete solution for data protection, built from FlashBlade and Cohesity compute nodes and software.
Looking deeper into the architecture, we can see that FlashBlade and the Cohesity nodes are connected via NFS. Now, looking back to the origins of the Cohesity platform, one of the strongest features of the solution is the distributed file system layer that spans all nodes. Back in June 2015, when Cohesity was just getting started, I had a briefing with CEO Mohit Aron in which we spent an hour going over the benefits of OASIS (Open Architecture for Scalable Intelligent Storage), the software layer that delivers the file system. This includes SnapTrees, which provide an unlimited number of snapshots without snapshot chains.
There are many great Tech Field Day presentations that cover OASIS (see the suggestions below).
However, this screenshot shows the complexity of the implementation, which includes metadata management, journaling, self-healing (via MapReduce) and TOWS, the Tier Optimised Write Scheme. I’ve also embedded a platform deep-dive video.
Clearly, the heart of the Cohesity platform can’t be ripped out to be replaced by FlashBlade. In practice, FlashBlade is simply a blob storage repository for data written to the Cohesity file system. This design has several implications.
First, the Cohesity server layer continues to need a minimum of three nodes (with four recommended). This is required to provide resiliency to the components of OASIS that still reside at the Cohesity layer.
Second, the data written to FlashBlade isn’t consumable outside of the Cohesity platform. This means customers can’t do any secondary analytics against the data without going through the Cohesity software layer.
Effectively, FlashBlade simply replaces the disks and SSDs that used to reside in the Cohesity nodes. However, the solution does provide scalability for compute and storage separately (a disaggregated architecture). It also enables existing FlashBlade customers to use their system as a Cohesity “target”.
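To make the disaggregation point concrete, here is a minimal sizing sketch in Python. Only the three-node minimum comes from the discussion above. The function names, the per-node throughput and the per-node and per-blade capacities are hypothetical values chosen for illustration, not vendor specifications.

```python
import math

# Minimum Cohesity node count, as stated in the article.
MIN_COHESITY_NODES = 3


def coupled_nodes(throughput_needed, capacity_needed_tb,
                  node_throughput, node_capacity_tb):
    """Nodes required when every node bundles compute and storage.

    The node count is driven by whichever resource runs out first.
    All per-node figures are hypothetical inputs.
    """
    by_compute = math.ceil(throughput_needed / node_throughput)
    by_capacity = math.ceil(capacity_needed_tb / node_capacity_tb)
    return max(MIN_COHESITY_NODES, by_compute, by_capacity)


def disaggregated(throughput_needed, capacity_needed_tb,
                  node_throughput, blade_capacity_tb):
    """Compute nodes and storage blades sized independently.

    Returns a (compute_nodes, blades) tuple. Per-node and per-blade
    figures are hypothetical inputs.
    """
    compute_nodes = max(MIN_COHESITY_NODES,
                        math.ceil(throughput_needed / node_throughput))
    blades = math.ceil(capacity_needed_tb / blade_capacity_tb)
    return compute_nodes, blades


if __name__ == "__main__":
    # Hypothetical workload: modest throughput, large capacity.
    print(coupled_nodes(2.0, 600.0, 1.0, 50.0))
    print(disaggregated(2.0, 600.0, 1.0, 50.0))
```

With a capacity-heavy workload like this one, the coupled model has to add full nodes just to gain storage, while the disaggregated model stays at the minimum compute footprint and grows only the storage tier.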
Performance, Simplicity, Scalability
So what exactly is Pure Storage offering as the selling points for this solution? There are three claims: higher performance (3x compared to disk-based solutions), simplicity (ease of management and integration between Cohesity and FlashBlade) and scalability, which we’ve already mentioned. One important aspect to highlight is the ability to recover data quickly. Recovery speed is much more important than backup speed because restores only occur when there’s a problem, which is exactly when time matters most. Restoration from flash will undoubtedly be quicker than from disk.
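To put the 3x claim in context, here is a rough back-of-the-envelope sketch of what it could mean for restore time. The 3x multiplier is the vendor claim quoted above. The dataset size and the disk baseline throughput are hypothetical figures, and the calculation assumes throughput is sustained for the whole restore with no other bottleneck.

```python
def restore_hours(dataset_tb: float, throughput_gb_per_s: float) -> float:
    """Hours to restore dataset_tb terabytes at a sustained rate in GB/s.

    Uses decimal units (1 TB = 1000 GB) and ignores network, metadata
    and application-level overheads.
    """
    seconds = (dataset_tb * 1000.0) / throughput_gb_per_s
    return seconds / 3600.0


if __name__ == "__main__":
    dataset_tb = 100.0           # hypothetical amount of data to restore
    disk_gb_per_s = 1.0          # hypothetical disk-based baseline
    flash_gb_per_s = 3.0 * disk_gb_per_s  # the claimed 3x improvement

    print(f"disk:  {restore_hours(dataset_tb, disk_gb_per_s):.1f} hours")
    print(f"flash: {restore_hours(dataset_tb, flash_gb_per_s):.1f} hours")
```

Under these assumptions, restore time falls in direct proportion to throughput, so a 3x faster target cuts a restore window to one third of its length.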
There are also the initial business benefits we discussed. The solution is offered by Pure Storage and supported by Pure, with Cohesity DataProtect software now on the Pure Storage price list. Note that the DataPlatform file services features are not available in this offering – customers are expected to use FlashBlade for their NFS and SMB needs.
The Architect’s View
I have to say I was a little disappointed with the details behind this first release. If I were an existing FlashBlade customer, there would be some benefit in using the shared storage as a scalable backup repository. However, if I were a highly distributed customer with many data centres, I wouldn’t be deploying FlashBlade in each location.
FlashRecover doesn’t provide any new analytics or access to content that couldn’t otherwise be achieved through a standard Cohesity deployment. So, to be competitive, customers need to see something more.
Perhaps the future will include the same Evergreen purchasing model available for storage, or possibly dedicated compute/analytics nodes that can find value in the secondary data. Even here, though, I’m not convinced there is value, as the platform is only protecting production data, likely to be applications in virtual machines and on physical servers.
As a portfolio play, FlashRecover could be successful if the solution is priced appropriately. I think at this point we’re waiting for generation two or three before there’s any clear technical or business advantage for customers.
Copyright (c) 2007-2020 – Post #73D0 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.