SFD7 – Primary Data and Data Virtualisation

Chris Evans

Storage virtualisation is a well-established technology in our industry. In the current “x86” era, I’ve been using and talking about it for over 10 years as a way to reduce costs, extend the life of older assets and migrate data.

All of the original storage virtualisation solutions (with the exception of Invista) sat within the data path and used a “store and forward” method of writing to the virtualised resources. The virtualising appliance would in some form represent a LUN or volume and perform LUN/LBA mapping and readdressing. These solutions were good, but they introduced latency into the data path. If they cached locally, they risked overwhelming the virtualised resource whenever the performance capabilities of the virtual layer and the storage layer were imbalanced. The other downside was that in most cases the virtual layer couldn’t simply be removed, because LUNs/volumes could be created from slices of storage spread across multiple physical entities.
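To make the LUN/LBA mapping idea concrete, here is a minimal sketch of how an in-band appliance might translate a virtual block address into a physical one when a volume is built from slices on different arrays. It is an illustration only, not any vendor's implementation: the `Extent` and `VirtualLun` names, the array names and the block counts are all assumptions made up for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """One slice of a virtual LUN (hypothetical structure)."""
    virtual_start: int   # first virtual LBA covered by this slice
    length: int          # number of blocks in the slice
    array: str           # made-up name of the physical array
    physical_start: int  # first LBA of the slice on that array


class VirtualLun:
    """A volume stitched together from slices on several arrays."""

    def __init__(self, extents):
        self.extents = sorted(extents, key=lambda e: e.virtual_start)
        self._starts = [e.virtual_start for e in self.extents]

    def resolve(self, lba):
        """Translate a virtual LBA into (array, physical LBA)."""
        i = bisect_right(self._starts, lba) - 1
        if i < 0:
            raise ValueError(f"LBA {lba} is not mapped")
        extent = self.extents[i]
        offset = lba - extent.virtual_start
        if offset >= extent.length:
            raise ValueError(f"LBA {lba} is not mapped")
        return extent.array, extent.physical_start + offset


lun = VirtualLun([
    Extent(virtual_start=0, length=1000, array="array-a", physical_start=5000),
    Extent(virtual_start=1000, length=500, array="array-b", physical_start=0),
])
print(lun.resolve(999))   # ('array-a', 5999)
print(lun.resolve(1200))  # ('array-b', 200)
```

Because the only record of which slice belongs to which volume lives in this table, taking the appliance away leaves two arrays holding fragments that no host can interpret on its own, which is why the virtual layer was so hard to remove.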

Primary Data Inc, a startup founded in August 2013 that exited stealth in November 2014, has developed a solution that takes away some of the issues associated with the abstraction of the physical placement of data. I recently received a briefing from the company and learned more about their technology at Storage Field Day 7. Rather than go through slides, CTO David Flynn (formerly CTO/CEO of Fusion-io) presented the company’s technology as a whiteboard session, layering on each of the issues involved and the product implementation as he went along. You can find a link to the videos at the end of this post and I recommend spending the time to watch them. Here’s a summary of how the technology works.

Data access can be thought of as working in the same way as a traditional SAN-based model, except that each client/host runs a software device driver in place of a physical HBA.  This device driver communicates directly with the storage (whether DAS, SAN, NAS or cloud) to store and retrieve information.  The mapping table that describes what data sits where is stored on data directors, a redundant pair of servers that don’t sit in the data path, but are the control plane in the configuration.  So far, so good.  At this basic level we have a distributed storage layer with metadata stored on the data directors and this is where things start to get interesting.  Abstracting the metadata away from each server allows lots of clever stuff to be done with the infrastructure.  For example, workloads can be rebalanced to take advantage of the fastest storage or to eliminate hotspots or bottlenecks.  The metadata can be used to create pointer-based snapshots or to perform data migrations.  This data and management plane effectively implements data mobility.
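The split between data directors and client drivers can be sketched in a few lines. This is a toy model built only from the description above, not Primary Data's actual design or API: the `DataDirector`, `ClientDriver` and `migrate` names are hypothetical, and plain dictionaries stand in for the storage backends.

```python
class DataDirector:
    """Control plane: holds the map of object -> backend, never the data."""

    def __init__(self):
        self._location = {}

    def locate(self, name):
        return self._location[name]

    def place(self, name, backend):
        self._location[name] = backend


class ClientDriver:
    """Data plane: runs on the host and talks to the storage directly."""

    def __init__(self, director, backends):
        self.director = director
        self.backends = backends  # backend name -> dict standing in for a store

    def write(self, name, data, backend):
        self.backends[backend][name] = data
        self.director.place(name, backend)

    def read(self, name):
        # Ask the director where the data lives, then fetch it directly.
        backend = self.director.locate(name)
        return self.backends[backend][name]


def migrate(director, backends, name, target):
    """Move an object to another backend by copying, then repointing metadata."""
    source = director.locate(name)
    if source == target:
        return
    backends[target][name] = backends[source][name]  # copy the data first
    director.place(name, target)                      # then update the map
    del backends[source][name]                        # finally free the source


backends = {"nas": {}, "flash": {}}
director = DataDirector()
client = ClientDriver(director, backends)

client.write("vm1.vmdk", b"blocks", backend="nas")
migrate(director, backends, "vm1.vmdk", "flash")
print(director.locate("vm1.vmdk"))  # flash
print(client.read("vm1.vmdk"))      # b'blocks'
```

The client never needs to know a migration happened: it asks the director for the current location on each read, so moving data to faster storage is a copy followed by a metadata update. A real system would also have to handle writes that arrive during the copy, which this sketch ignores.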

Now, there are already solutions in the market today (such as EMC’s ViPR) that separate the data and control planes. Unfortunately, ViPR implements a static data plane, with little or no ability to dynamically relocate the data on the underlying hardware. Instead, it simply provides an easier provisioning experience. Primary Data, however, are implementing data mobility at a level that allows for a much more dynamic infrastructure, with the flexibility that entails. This includes policy-based management, linear scalability and a greater level of efficiency, eliminating “islands” of underused resources.

The Architect’s View®

The dream of a software defined data centre (SDDC) has much more chance of becoming reality once we have the ability to place data wherever we want it, without having to consider the individual configuration requirements of the storage platform. Today virtual machines aren’t tied to a physical server and containers can be moved around at will, so why should data still be an issue? Of course, data is the persistent part of our IT infrastructure that we can’t afford to lose. VMs can be rebuilt and containers are designed to be transient, but storage has always been about preserving the asset of information. Technologies like that from Primary Data will enable end users to abstract their compute solutions even further, giving us the opportunity to make storage nothing more than a commodity.

Related Links

Disclaimer:  I was personally invited to attend Storage Field Day 7, with the event team covering my travel and accommodation costs.  However I was not compensated for my time.  I am not required to blog on any content; blog posts are not edited or reviewed by the presenters or Tech Field Day team before publication.  Connected Data provided all SFD7 attendees with a complimentary Transporter Personal device.

Copyright (c) 2009-2018 – Post #8828 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.