Pure1 META – Analytics for Pure Storage Arrays

Chris Evans
Pure Storage, Storage Management

It seems storage array analytics solutions are like opinions – everybody has one!  Pure Storage is the most recent entrant into the platform analytics game, with the introduction of Pure1 META at Pure Accelerate last month.

META

META claims to analyse the data from thousands of storage arrays, with over 7 petabytes of stored data and greater than 1 trillion data points produced per day.  Predicting data profiles to optimise your storage solution is big business.  HPE recently acquired Nimble Storage, partly, it seems, to gain access to InfoSight, Nimble’s analytics solution.

Tegile has Intellicare, while Kaminario has Clarity.  All of these solutions exploit the shared knowledge from hardware in the field to optimise internal data movement algorithms, highlight failing devices and otherwise keep storage in tip-top condition.

Blocks and More Blocks

Bearing in mind the platforms we’re looking at are mainly block-based storage, what exactly can we see from the telemetry coming back from storage arrays?  Well, before the days of “analytics”, enterprise storage vendors routinely collected data on their hardware.  Standard array installation consisted of power, networking and a dial-up phone line. Vendors could periodically dial into a storage array and pull back hardware stats on the configuration. 

Hardware

In my experience, most of this data was for hardware management.  It allowed the vendor to detect problems early and ship spares out as necessary, rather than the customer having to report a failing device.  Similarly, vendors could dial in and fix problems, look at the configuration and otherwise provide remote admin functions to assist the customer.  The remote access capability that EMC had on DMX, for example, saved me a few times.

Modern Telemetry

These days we have Internet connectivity and no need for dial-up modems.  Vendors collect a wide range of data on how arrays are performing and how they are configured, which can be used to predict potential issues, highlight bottlenecks and otherwise make arrays run smoother. Vendors can automatically open tickets and pass them back to the customer, speeding up the operational management of the hardware and making the whole storage management process more scalable. 

There’s no longer a need to explain problems to a level 1 support agent when the vendor can get a level 3 person to call you directly with a solution to the problem (that you didn’t even know you had).  For further information, you can see what Nimble are doing with InfoSight over on the Tech Field Day website where there are lots of videos that can provide some extra background knowledge.

With block-based arrays, where the storage array doesn’t really have any idea of the format of the content, what level of analysis can a storage array really do?  Obviously, systems can look at throughput and latency figures.  They can look at the types of I/O, whether sequential or random; they can look at block sizes.
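As a rough illustration of what can be derived without knowing anything about the content, here is a minimal sketch that summarises a block I/O trace.  The trace format (a list of offset and length pairs) and the function name `summarise_io` are assumptions made for this example; they are not taken from any vendor’s telemetry format.  A request counts as sequential when it starts exactly where the previous one ended.

```python
from typing import Dict, List, Tuple


def summarise_io(trace: List[Tuple[int, int]]) -> Dict[str, float]:
    """Summarise a trace of (offset_bytes, length_bytes) requests in arrival order.

    Hypothetical helper: a request is treated as sequential when its offset
    equals the end of the previous request; everything else is random.
    """
    if not trace:
        return {"requests": 0, "sequential_fraction": 0.0, "mean_block_size": 0.0}

    sequential = 0
    prev_end = None
    for offset, length in trace:
        if prev_end is not None and offset == prev_end:
            sequential += 1
        prev_end = offset + length

    # The first request has no predecessor, so there are len(trace) - 1 transitions.
    transitions = len(trace) - 1
    return {
        "requests": len(trace),
        "sequential_fraction": sequential / transitions if transitions else 0.0,
        "mean_block_size": sum(length for _, length in trace) / len(trace),
    }


# Three back-to-back 4 KiB requests, then a jump to an 8 KiB request elsewhere.
example = [(0, 4096), (4096, 4096), (8192, 4096), (1048576, 8192)]
summary = summarise_io(example)
```

Real arrays see interleaved streams from many hosts, so a production classifier would need to track several streams at once; this sketch only shows the basic idea.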

There’s some ability to look into the data and make assumptions on what the content actually is, but with virtualisation, that becomes increasingly hard.  Instead, vendors need to integrate with higher levels of the application stack and take configuration data from the hypervisor and application in order to make informed decisions.  More data sources always mean better insight.

Pure1 META

Let’s get back to our main topic of conversation, Pure1 META.  From what’s been announced, many of the functions offered by META seem to be similar to those of other vendors.  The platform highlights issues, finds known problems, raises tickets and so on.  However, perhaps a more interesting angle is the ability to model workloads across systems and do “what if” scenarios for new applications.

This can mean testing whether a new workload will fit on a specific platform, but also testing whether two sets of workloads will interact well together.  The logical conclusion to this is being able to optimise workloads across a range of existing deployments, then determine what is the best new purchase to make, without having to guess.  META is based on AI and machine learning, something all vendors are claiming.  However, without some kind of AI that has the ability to self-learn, analysing trillions of data points per day simply wouldn’t be possible.
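To make the “what if” idea concrete, here is a deliberately simple sketch, and not a description of how META works.  It assumes each workload is described by a load profile (for example, IOPS per time interval), that profiles simply add together, and that an array has a single performance limit with some safety headroom.  The names `combined_peak` and `fits`, and all the numbers, are hypothetical.

```python
from typing import List


def combined_peak(profiles: List[List[float]]) -> float:
    """Add the workloads interval by interval and return the busiest interval."""
    return max(sum(interval) for interval in zip(*profiles))


def fits(profiles: List[List[float]], limit: float, headroom: float = 0.2) -> bool:
    """Return True if the combined peak stays under the limit minus headroom."""
    return combined_peak(profiles) <= limit * (1 - headroom)


# Illustrative load per interval (arbitrary units), not measured data.
backup = [10, 10, 80, 80]       # busy in the later intervals
oltp_day = [70, 70, 20, 20]     # busy in the earlier intervals
oltp_night = [20, 20, 70, 70]   # busy at the same time as the backup

array_limit = 130
day_ok = fits([backup, oltp_day], array_limit)
night_ok = fits([backup, oltp_night], array_limit)
```

Here the backup and the daytime workload peak at different times and fit together, while the two workloads that peak together do not, even though each would fit on its own.  A real model would have to account for latency, data reduction and contention effects that simple addition ignores.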

The Architect’s View

It’s been impossible for storage administrators to manually tune storage arrays for some time.  With the ability for software to manage the process, we don’t need to bother anyway.  Tools like META allow administrators to focus on the data, not the bits and bytes at the lowest level.  Vendors like Pure and Nimble are benefiting from the “wisdom of the crowd” in consolidating their understanding across many customers, not just one.  As META rolls out, I’d like to see Pure publish details on exactly how customers have saved time and money and avoided outages.  While it’s good to talk about analytics, let’s also talk about the tangible benefits to the customer.

Over time, analytics will reduce the storage admin burden, allowing more focus on delivering value around data.  One final thought: how will the analytics game play out for hyper-converged and SDS vendors (in particular open-source storage)?  Is there a similar data-sharing capability?  It’s been assumed that the traditional storage “SAN” market is on the decline.  Perhaps features like analytics can keep it relevant for a little while longer yet.

Disclaimer: I was personally invited to attend Pure Accelerate 2017.  My flights, accommodation and meals were paid for by Pure Storage.  However, there is no commitment for me to blog on any subjects and Pure receive no editorial rights before content is published.

Copyright (c) 2009-2019 – Post #32ED – Chris M Evans, first published on https://www.architecting.it/blog, do not reproduce without permission.