Cloud Services – Build, Buy or Fork?

Chris Evans

Cloud, Cloud Storage, Databases, Enterprise, Opinion, Storage

The subject of public cloud storage became a little more interesting this week with news that Google has acquired Elastifile, an Israeli start-up.  Storage services in public cloud have been a busy area as traditional vendors work with cloud service providers (CSPs) to deliver value-add storage solutions.  Looking further afield, we’ve also seen CSPs using open source to deliver database platforms, although that hasn’t proved popular with some.  So, should CSPs build, buy or fork to deliver the next wave of new services and applications in the cloud?

Cloud Storage

Cloud storage is one area where arguably the implementations from cloud service providers haven’t been as mature as on-premises technology.  In an unpublished article I wrote a few years ago, I compared the maturity of AWS EFS (Elastic File System) to NetApp ONTAP.  It seemed reasonable to choose ONTAP as the “gold standard” for file system features.  However, the comparison wasn’t well received by AWS and only highlighted the point I was making.  Native file services in the public cloud simply weren’t mature enough for enterprise users.


The reason for the level of immaturity, of course, is that developing file services from scratch is hard.  Put that into an environment that needs to support multi-tenancy, reporting, billing and regional resiliency and we can quickly see the scale of the challenge. 

If we need a clear demonstration of the ongoing issues of delivering public cloud storage services, then look no further than AWS and support for SMB.  EFS only supports the NFS protocol.  EFS isn’t supported on Windows EC2 instances, only Linux VMs.  Customers who want to use SMB and Windows need to use a separate product, Amazon FSx for Windows File Server.  This offering is essentially one or more Windows servers running SMB storage, something that I spent many years attempting to remove from enterprise accounts!  Incidentally, the pricing structures for the two storage offerings are totally different, making it hard to work out which solution to use from a financial perspective.
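
To illustrate why differing pricing structures make the comparison awkward, here is a minimal sketch in Python.  It assumes one offering is billed on the data actually stored and the other on capacity and throughput reserved up front.  The class names and every rate are hypothetical placeholders, not published AWS prices, so substitute real figures before drawing any conclusion.

```python
from dataclasses import dataclass


@dataclass
class UsageBasedPlan:
    """Billed on the data actually stored (hypothetical rate)."""
    rate_per_gb_month: float

    def monthly_cost(self, stored_gb: float) -> float:
        return stored_gb * self.rate_per_gb_month


@dataclass
class ProvisionedPlan:
    """Billed on capacity and throughput reserved up front (hypothetical rates)."""
    rate_per_gb_month: float
    rate_per_mbps_month: float

    def monthly_cost(self, provisioned_gb: float, throughput_mbps: float) -> float:
        return (provisioned_gb * self.rate_per_gb_month
                + throughput_mbps * self.rate_per_mbps_month)


# Placeholder numbers only; replace with the provider's published prices.
usage_plan = UsageBasedPlan(rate_per_gb_month=0.30)
provisioned_plan = ProvisionedPlan(rate_per_gb_month=0.10, rate_per_mbps_month=2.00)

stored_gb = 500.0         # data actually written
provisioned_gb = 1000.0   # capacity reserved to leave headroom
throughput_mbps = 16.0    # throughput reserved

usage_cost = usage_plan.monthly_cost(stored_gb)
provisioned_cost = provisioned_plan.monthly_cost(provisioned_gb, throughput_mbps)

print(f"usage-based: {usage_cost:.2f}")
print(f"provisioned: {provisioned_cost:.2f}")
```

The two plans take different inputs, so the answer depends on how much headroom and throughput is reserved, not just on how much data is stored.  That is what makes a like-for-like financial comparison difficult.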


In this instance, AWS has chosen to build a solution from existing technology.  Personally, other than the managed nature of the service, I don’t see a lot of value-add in this offering.  With no way to move data between the Linux and Windows offerings, cross-platform data sharing is also impossible.

Looking at the wider market, Microsoft Azure partnered with NetApp to bring us Azure NetApp Files, a fully managed file services offering with much greater performance than native solutions.  Native integration means being able to work with platform APIs and mesh storage with other cloud solutions.  The long-term benefit (if it is realised) will be the ability to move data between CSPs using a single consistent data platform (in this case, ONTAP).  ONTAP is also available on GCP and AWS.


Elastifile brought their cloud file system (ECFS) to GCP, with native support announced in December 2018.  As a pure software solution, ECFS can run in any GCP region without custom hardware.  Google has taken the next logical step and acquired Elastifile, giving GCP a solid scale-out file services platform.

The only negative scenario here is to consider what happens to on-premises deployments and how Google intends to support those.  It would make sense to continue to offer Elastifile on-premises because it supports a good multi-cloud story.  There may be many potential customers who have no intention of moving wholesale to GCP but who may want to move large volumes of data there and take advantage of GCP analytics tools.

I think it’s unlikely NetApp will be acquired simply for their file services, but it does raise the question as to whether other storage start-ups could make good acquisition targets. 


WekaIO, for example, could be attractive to CSPs looking to target the HPC market or extend to use cases around analytics.  WekaIO Matrix already runs on AWS today.  Looking further afield, NVMe SDS solutions must also be targets.  Imagine companies like Lightbits Labs, Excelero or E8 Storage that offer software-based NVMe connectivity.  A CSP could use this technology to implement high-performance instances that don’t require SSDs directly connected to the hypervisor. 

Update: as of 31 July 2019, Globes reported that E8 Storage had been acquired by Amazon to be integrated into Amazon Web Services.

There are many other examples of storage solutions being acquired by CSPs that set a precedent for future acquisitions.  Microsoft acquired StorSimple and Avere Systems.  Google acquired Looker (analytics-focused) and Velostrata (cloud/storage migration).  IBM acquired Cleversafe for IBM Cloud and I’m sure there are many more.  So, acquisitions will continue to form a strategy for building out cloud services.


More controversially, another approach is to simply fork an existing open source project and use it in the public cloud.  This idea isn’t new.  Just look at the underlying hypervisor platform used by AWS.  It’s based on open-source KVM and was previously implemented with Xen.

The controversy of using open source was highlighted earlier this year as AWS released a MongoDB-compatible platform called DocumentDB.  MongoDB Inc had already changed the licensing terms of its software going forward, requiring companies offering MongoDB as a service to release the source code of the management components that sit around it.  Effectively this forces companies like AWS to release proprietary code to the community – and their competitors.  It also means we can see exactly how services are delivered, warts and all.

Update: 6 Dec 2019 – Andy Jassy at re:Invent 2019 highlights how ECS is the preferred container orchestration tool.  This is a perfect example of forking infrastructure and heading off at a tangent.

In the case of DocumentDB, it appears that AWS emulated the MongoDB 3.6 API, which obviously isn’t (or wasn’t) the current release at the time.  DocumentDB appears to offer only a subset of MongoDB features and isn’t a drop-in replacement by any means.
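
Whether a compatible service is a drop-in replacement comes down to whether everything an application relies on falls inside the supported subset.  The sketch below shows that check in Python.  The feature names and both feature sets are hypothetical placeholders chosen for illustration; they are not a statement of what MongoDB or DocumentDB actually supports.

```python
from __future__ import annotations

# Hypothetical feature sets, for illustration only.
UPSTREAM_FEATURES = {"find", "aggregate", "transactions", "change_streams"}
COMPATIBLE_SERVICE_FEATURES = {"find", "aggregate"}


def missing_features(required: set[str], supported: set[str]) -> list[str]:
    """Return the required features the target platform lacks, sorted."""
    return sorted(required - supported)


def is_drop_in(required: set[str], supported: set[str]) -> bool:
    """A target is a drop-in replacement only if nothing required is missing."""
    return not missing_features(required, supported)


# Hypothetical application that needs one feature outside the subset.
app_requirements = {"find", "transactions"}

print(missing_features(app_requirements, COMPATIBLE_SERVICE_FEATURES))
print(is_drop_in(app_requirements, COMPATIBLE_SERVICE_FEATURES))
print(is_drop_in(app_requirements, UPSTREAM_FEATURES))
```

Running the same check against each new release of both platforms would also show whether the gap is closing or widening over time.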


What about the wider database market?  AWS offers lots of compatible solutions that rely on open source, including MySQL, PostgreSQL, MariaDB and MongoDB.  Many of these are simply packaging of the database software with management tools for automated deployment and upgrades. 

Is this practice such a bad thing?  After all, many other companies offer open source solutions that get used in commercial deployments.  Those organisations may also choose to get involved in developing the software, but it’s not a requirement to do so.  Ultimately that’s always been a risk to Open Source unless some form of copyleft licensing terms is used.

The question to answer is whether, with so much of IT moving to the public cloud, service providers should be contributing back to the communities that help to support their business.  Personally, I think copyleft licensing for the core components of a platform is the right choice, but expecting CSPs to release the encapsulating management tools is a step too far.

To Fork or Align?

Is taking a fork of an existing platform a good strategy?  The main implication of forking a software platform is that features and functionality can diverge over time.  DocumentDB, for example, may evolve a completely different set of features that makes compatibility with MongoDB impossible in the future.  This creates some nice lock-in for the CSP, but a future headache for the customer.

This issue takes us right back to the beginning of this discussion, the acquisition of storage solutions and the lock-in that can result from using a solution on only one platform. 

The Architect’s View

Congratulations are well deserved for Elastifile in reaching acquisition.  However, this new lack of independence means ECFS is unlikely to be the ubiquitous storage layer that meshes all of the cloud providers together.  Designing for multi-cloud just got a little harder.  As we can see from the database market, CSPs want mature solutions, but in the long term we risk increasing our level of lock-in as cloud providers look for ways to stop customers moving away from their ecosystems.

Disclaimer: WekaIO, NetApp and Elastifile have been customers of Brookend Ltd. No reproduction without permission.