
Enterprise Computing: Automated Tiering – Why Move The Data?


There’s been a lot of talk lately about automated storage tiering (not least from myself), and there’s one piece of the puzzle I’m not sure has been explored in depth.  Many of the posts refer to physically moving data between tiers of internal (or external) storage.  In legacy environments where LUNs are composed of disk slices – either whole slices or LUNs carved from RAID groups – moving all the data for a specific LUN is an architectural requirement.  Clearly block-level migration is auto-tiering nirvana, where only the specific hot blocks are moved onto faster disk.  But does this have to be achieved by physically moving that data?  The answer is no.

Making Use of Cache

All shared storage arrays use cache memory of some sort in order to smooth out the unpredictability of I/O response times from physical media like hard drives.  Read and write response times vary depending on the position of the drive read/write heads and the added overhead of rotational latency.  In addition, the use of RAID to stripe data across physical spindles requires cache in order to prepare a RAID stripe for writing to disk, with methodologies like RAID-5/6 requiring in-memory calculations of parity before committing data to its physical resting place.

Writes of data are always cached, and it’s possible reads are too, if the piece of data still resides in cache from a previous operation or has been pre-fetched.  So it is possible to simplify I/O operations as follows:

  1. Write I/O – write to cache; confirm response to host; destage to physical disk asynchronously; leave I/O block in cache.
  2. Read I/O – read from cache if available; if not, read from disk; leave I/O block in cache.
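The two I/O paths above can be sketched in a few lines of code. This is a deliberately minimal model – the class and method names are invented for illustration, and real array firmware is vastly more sophisticated – but it captures the essential behaviour: writes are acknowledged at cache speed and destaged later, while reads populate the cache on a miss.

```python
# Minimal sketch of write-back caching (all names illustrative).

class WriteBackCache:
    def __init__(self, disk):
        self.disk = disk          # backing store: block id -> data
        self.cache = {}           # in-memory copies of recently used blocks
        self.dirty = set()        # blocks written but not yet destaged

    def write(self, block, data):
        # Write I/O: land in cache, acknowledge immediately, destage later.
        self.cache[block] = data
        self.dirty.add(block)
        return "ack"              # host sees completion at cache speed

    def read(self, block):
        # Read I/O: serve from cache if present; otherwise fetch from
        # disk and leave the block in cache for subsequent reads.
        if block not in self.cache:
            self.cache[block] = self.disk[block]
        return self.cache[block]

    def destage(self):
        # Asynchronous flush of dirty blocks to physical media.
        for block in list(self.dirty):
            self.disk[block] = self.cache[block]
            self.dirty.discard(block)
```

Note that between `write()` and `destage()` the host’s view (cache) and the physical disk disagree – which is exactly the window the tiering trick below exploits.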

Cache for Tiers

Cache can be used to move data between tiers without actually moving data.  Imagine all blocks (or chunks) of a LUN are tagged with a performance profile that determines which tier of disk the chunk should reside on.  During normal operations, the chunk will be read, re-written and destaged to disk as normal.  At some point, the chunk is marked to move to another service tier, say SSD rather than FC storage.  At this point, the next time the chunk is written to cache, it gets destaged to a new location on SSD.  Once completed, the old chunk is logically released from the FC drive.  Voila!  The data is moved without an additional I/O operation, but by simply utilising the normal I/O operation. 
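Here’s a hypothetical sketch of that destage-time tiering idea. The tier names, profile table and chunk structure are all invented for illustration – no vendor implements it quite this simply – but it shows the key point: when a chunk’s profile changes, the next normal destage lands it on the new tier and releases the old location, with no dedicated migration I/O.

```python
# Hypothetical sketch: tier movement piggy-backed on the normal
# destage path (tier names and structure are illustrative only).

class TieringCache:
    def __init__(self):
        self.tiers = {"SSD": {}, "FC": {}}   # per-tier chunk id -> data
        self.profile = {}                    # chunk id -> target tier
        self.location = {}                   # chunk id -> current tier

    def set_profile(self, chunk, tier):
        # Mark the chunk for a new service tier; nothing moves yet.
        self.profile[chunk] = tier

    def destage(self, chunk, data):
        # Normal destage: write to wherever the profile says the chunk
        # belongs, then logically release the stale copy on the old tier.
        target = self.profile.get(chunk, "FC")
        old = self.location.get(chunk)
        self.tiers[target][chunk] = data
        if old and old != target:
            del self.tiers[old][chunk]       # release old location
        self.location[chunk] = target
```

The movement cost is zero extra I/O precisely because the chunk was going to be destaged anyway – which is also why (as noted below) cold data moving *down* tiers can’t get the same free ride.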

Of course, I’m simplifying the whole process here.  In reality things are much more complicated.  Vendors have developed sophisticated algorithms to pre-stage and de-stage data to and from cache to minimise mechanical drive impact and to maximise performance.  RAID calculations have to be managed.  In addition, this concept works well for active data but inactive data moving down tiers would still require additional I/O.  Also, spare resources would need to be set aside to make sure chunks were readily available when data profiles changed.  Otherwise there would be a risk of resource starvation as demand for one tier of storage outstripped supply.

Whilst this is a simple illustration, it does show that storage platforms whose underlying architecture is designed to handle I/O and LUN layout at the block level will have a massive advantage over legacy platforms using (previously good) methodologies for storing data.  I suspect vendors such as Hitachi/HP and EMC are having to spend a lot more time re-writing the fundamental operating principles of their enterprise storage products than they care to admit.  Why else would FAST for blocks be announced a full 12 months before a committed availability date?

About Chris M Evans

Chris M Evans has worked in the technology industry since 1987, starting as a systems programmer on the IBM mainframe platform, while retaining an interest in storage. After working abroad, he co-founded an Internet-based music distribution company during the .com era, returning to consultancy in the new millennium. In 2009 Chris co-founded Langton Blue Ltd (www.langtonblue.com), a boutique consultancy firm focused on delivering business benefit through efficient technology deployments. Chris writes a popular blog at http://blog.architecting.it, attends many conferences and invitation-only events and can be found providing regular industry contributions through Twitter (@chrismevans) and other social media outlets.
  • Craig

    I’m not an expert in storage design but Sun’s ZFS file system seems to utilise SSDs in a similar manner to what you’ve described. Effectively ZFS lets you create hybrid pools of storage, allocating faster disk (think SSD) as a very large read or write cache whilst using slow spinning drives for storage. The numbers look promising, though I’ve yet to test the theory out in a lab.


  • http://www.ibm.com/developerworks/blogs/page/storagevirtualization Barry Whyte


    You have a good point here. What if you didn’t actually “move” the data, but simply made a temporary second copy on SSD?

    The problem with DRAM cache is time. A block will typically age out of cache (due to capacity) in seconds. A block moved to SSD will typically live there for days or weeks (again due to order of magnitude more capacity).

    But what if you could discard data in the SSD tier as quickly as you can discard cached data? So don’t ‘move’ it in the first place… simply clone it… for as long as it was needed.

  • http://thestorageanarchist.com the storage anarchist

    Chris – I see you’re having some fun speculating over FAST implementation and strategies…

    I don’t think you’re really thinking this all the way through, though.

    The objective of promotion to a faster type of storage isn’t as simple as to try to stick the most recently requested data onto the fastest storage – heck – that would frequently be a waste of good Flash, because lots of data is accessed once and never again.

    No, the objective of FAST is to get the data that most often (and repeatedly) results in long response times when it is requested over a period of time promoted – and to get that data promoted BEFORE it is even requested. Thus, the secret of FAST is to PREDICT what needs to be in Flash, and to determine what can safely be demoted to slower storage without significantly impacting response times – and then to get that data there at the right time. I can’t get into all the details here (patents and all that), but the simple LRU scheme you describe simply isn’t sufficient for multi-tiered FAST. And in fact, the DRAM cache management algorithms themselves have to be adapted to accommodate the different performance characteristics of Flash, FC and SATA.

    That said, the actual implementation of sub-LUN FAST is indeed complex, but it is hardly a ground-up rewrite. No, in fact, Symm Virtual Provisioning was architected from the very beginning to support FAST. And Symmetrix has a strong legacy of pre-fetch algorithmic experience – it just has to be adapted to the performance characteristics of Flash and SATA and the different temporal space of these slower storage media. From there, it is really just a simple matter of programming (and QA, and use-case validation, and algorithm optimizations, and regression testing, and performance tuning, dot-dot-dot).

    Remember, the only reason God created the earth and the heavens in just 6 days was because He didn’t have to deal with an installed base.

  • http://thestorageanarchist.com the storage anarchist

    And to clarify, FAST algorithms have to adjust for the fact that some data is more efficiently cached a) at the server, and b) in the array cache. Accommodating the fact that these (faster) caches exist up-stream has a definitive impact on the FAST management algorithms.

    It’s all about minimizing latency on predictable cache misses…

  • http://thestorageanarchist.com the storage anarchist

    You presume that Symmetrix does not enjoy the benefits of “newer architectures”, but in fact, Virtual Provisioning is indeed a considerably different infrastructure for data storage. It truly virtualizes the data storage layer beneath it, and in fact has many of the attributes of what you would consider “new architecture.”

    And indeed, as BarryW says, it’s not the movement (or avoidance of movement) that’s so hard…it’s figuring out what needs to be moved when and where.

    EMC has been working on FAST for even longer than the announced date, analyzing traces from literally thousands of installed arrays and modelling different prefetch algorithms. And an almost equivalent amount of time and effort has gone into the definition of the management interfaces for FAST, leveraging a half dozen or so customer Technical Advisory Panels where different approaches have been modelled and analyzed by future FAST users.

    And it really isn’t a race to see who’s first; I’ll safely predict that FAST-like implementations will form the foundation of virtually every storage platform within a few years. There will be differences of implementations, but the idea of distributing data across multiple different types of storage to optimize both performance and cost is inevitable.

    IMHO, anyway :)


  • Chris Evans

    Barry, I understand what you say, however that’s why I specifically referenced the concept of a performance profile and a change in that profile affecting the status of the data. Agreed, things are much more complex than I explain. Let’s face it, I’m not a hardware engineer and I’m not privy to any vendor’s internal plans. However it makes good conversation to see how newer storage array architectures have the potential to be more flexible with automated tiering and other such features.

    However I’m more than happy if you *want* to tell me more! :-)

  • Chris Evans

    Craig, yes, I’ve evaluated the 7000 series and I liked it. Have a look back at previous posts.

