Building Data Storage With Containers

Note: Updated on 18 January 2017 to include OpenEBS.

There’s an almost religious divide between those who see containers as entirely stateless objects and those who take the more pragmatic view that containers holding state are inevitable. In the stateless model, data is assumed to be replicated and protected across many container instances, so the loss of any individual container doesn’t lose data. In practical terms, this idea just doesn’t work, because in the enterprise we have to meet a set of standards around application availability, auditing and compliance. Assuming we want to containerise our databases (rather than leaving them as virtual machine instances), and we surely will, then persistent data is as inevitable as death or taxes. However, what about a more contrary approach? How about building storage systems from containers?

Stateless Persistence

The persistence of data is due to the media we store it on, not the system through which we access it. As an example, many vendors provide the ability to perform head upgrades on their dual controller-based systems. This is possible because persistent data and configuration information are stored on the media (HDDs and SSDs), and in many cases the media is self-describing. This means that after a software crash, metadata and configuration information can, in theory, be recovered by parsing the data on disk. Taking this idea to its logical conclusion, we can use stateless processes like containers to create storage systems, provided we ensure that state is stored on the physical media (and protected across that media). If the container running our storage platform crashes, then we simply respawn it and read configuration data back from disk.
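To make the respawn-and-re-read idea concrete, here is a minimal sketch in Python. It is not taken from any of the products discussed in this post; the StatelessStore class, its method names and the one-JSON-file-per-record layout are all assumptions made for illustration. The process keeps only a cache in memory, every record on "disk" carries its own key (so the media is self-describing), and a new instance rebuilds its index purely by parsing the directory.

```python
import hashlib
import json
import os
import tempfile


class StatelessStore:
    """Toy key-value store (hypothetical): all durable state lives in media_dir."""

    def __init__(self, media_dir):
        self.media_dir = media_dir
        os.makedirs(media_dir, exist_ok=True)
        # Nothing is carried over from a previous process; the index is
        # rebuilt from whatever is found on the media.
        self.index = self._recover()

    def _recover(self):
        index = {}
        for name in os.listdir(self.media_dir):
            if not name.endswith(".json"):
                continue  # ignore temp files left behind by an interrupted write
            path = os.path.join(self.media_dir, name)
            with open(path, encoding="utf-8") as f:
                record = json.load(f)
            # Each record stores its own key, so no separate catalogue is needed.
            index[record["key"]] = record["value"]
        return index

    def put(self, key, value):
        record = {"key": key, "value": value}
        name = hashlib.sha256(key.encode("utf-8")).hexdigest() + ".json"
        # Write to a temp file, flush it to the media, then rename it into
        # place, so a crash mid-write never leaves a half-written record.
        fd, tmp_path = tempfile.mkstemp(dir=self.media_dir, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(record, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, os.path.join(self.media_dir, name))
        self.index[key] = value

    def get(self, key):
        return self.index[key]


if __name__ == "__main__":
    media = tempfile.mkdtemp()          # stands in for the physical media
    first = StatelessStore(media)
    first.put("volume-1", {"size_gb": 10})
    del first                           # simulate the container crashing
    second = StatelessStore(media)      # the respawned container
    print(second.get("volume-1"))
```

A real storage platform would also need to protect the data across multiple devices, which this sketch does not attempt; it only shows that the process itself can be thrown away and recreated without losing anything.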

Building Storage with Containers

We are starting to see containers edging into the deployment of storage solutions. There are a number of reasons this is a good thing. First, if we’re already running containers, then accessing storage through one of those containers provides a lightweight way to get to our data. Docker already implemented something like this with their data volume containers (see this link on Docker storage options, plus other references at the end of this post). Second, containerising storage means we can build storage features as separate microservices, making management, upgrading and patching much easier.

Other vendors are starting to bring products to market that use containerised storage. Scality, an object storage vendor, recently released their S3 Server, a cut-down containerised version of the Scality RING platform written in Node.js. This runs as a single container image and so has limited support and availability, but it provides a way to test S3 compatibility with Scality RING. We could imagine the offering being extended in the future to have more functionality. Note: Scality did in fact extend the development to a new platform called Zenko.

StorageOS, a UK startup, has built a storage platform that runs in containers, for containers. The container footprint is (at present) a mere 40MB, which is an amazing achievement, although I can see this increasing as more functionality is added. Dell EMC’s VNX platform uses containers to implement VDMs (Virtual Data Movers). Portworx also has a storage solution that is built from containers. Currently this is available as a Developer edition (PX-Developer) that can be downloaded from GitHub, or as an Enterprise edition (PX-Enterprise). There’s also OpenEBS, an open source software solution for container-based storage that is also sold as a hardware solution through CloudByte.

The Architect’s View

The lines between storage and application are being blurred by the idea of using containers for data persistence. HCI (hyper-converged infrastructure) set the scene by running storage and application services on the same hardware; storage with containers takes this to another level. As with all storage solutions, one product doesn’t fit all requirements, so the idea of storage containers will (initially at least) have limited application. However, expect to see more solutions come to market as Software Defined Storage starts to find a true niche. Please let me know if you have any other examples of containers being used to deliver storage and I’ll add them to this post.

Further Reading