Build Your Own Scalable All-Flash Array With SVC

Jim Kelly (The Storage Buddhist blog) has an interesting post talking about using IBM’s SAN Volume Controller (SVC) in front of its new FlashSystem arrays to create a scale-out all-flash solution.  The overhead is a mere 100 microseconds, which as Jim indicates still makes a hybrid SVC/FlashSystem faster than many array solutions on the market.  It also adds extra functionality via the SVC.  However, is scale-out capacity the only consideration with this design?

SAN Volume Controller

SVC is IBM’s storage virtualisation appliance, which takes traditional Fibre Channel storage arrays and abstracts their storage behind the SVC servers (or nodes), creating virtual LUNs which appear to the host to be part of the SVC itself.  Storage virtualisation is not new: hardware solutions are available from Hitachi (VSP), HP (XP24000), EMC (VPLEX & VMAX) and NetApp (clustered ONTAP), some of which have been on the market for 10 years.  There are also a number of software-only solutions, which convert standard Windows/Linux servers into virtualisation appliances.  The benefits are focused around operability; storage virtualisation enables physical resources to be replaced and reconfigured dynamically, enables transparent migrations (once the appliance is in place) and allows the re-use of older storage resources.  It’s even possible to accelerate cheaper storage by using SVC with flash and IBM’s EasyTier feature, for example.

So while we can easily extol the virtues of storage virtualisation, what are the disadvantages?  Well, there are a few:

  • Complexity – solutions can be more complex, with data existing in multiple levels of cache, making fault determination more difficult.
  • Support – support for the underlying storage has to be provided by the virtualisation vendor, which needs validation with each code upgrade (however this also simplifies the front-end connection support matrix).
  • Manageability – it’s easy for virtualisation solutions to get messy without adequate standards and that can also lead to performance problems.
  • Maturity – a virtualisation layer requires some key functionality (such as replication) to move up to the virtual layer.  Advanced features (tiering, replication, etc) are not supported across all solutions.

Ultimately, when deciding to use storage virtualisation, it’s all about comparing the benefits against the disadvantages.  But what about SVC?  Jim’s article was specifically talking about scale-out and this is where SVC has a particular problem.

Mdisks and Vdisks

SVC uses the concept of extents to map physical storage (Mdisks) to virtual volumes (Vdisks).  Physical storage is divided into extents, deployed in pools and then recombined from the pools to create virtual volumes.  The extent is the feature that enables wide striping, thin provisioning, data migrations and tiering.  However, SVC is very limited in the number of extents it can support: only 2^22, or 4,194,304 (just over 4 million).  SVC copes with the issues of scale by simply deploying larger extents.  For example, with an extent size of 16MB, total capacity is limited to 64TB.  To support, say, four FlashSystem 840 devices with a total capacity of 192TB, extents of 48MB or larger are needed, which in practice means using 64MB as this is the next available permitted size.  Reaching the maximum LUN (volume) count of 8192 volumes would need an 8-node SVC cluster, and each volume would need to average at least 24GB in size simply to use the available capacity.
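
The arithmetic behind those figures can be checked with a short sketch.  The function names are hypothetical, binary units (1TB = 1024 × 1024 MB) are assumed, and the list of permitted extent sizes is assumed to be powers of two from 16MB upwards, so check it against the product documentation before relying on it.

```python
# Rough check of the extent arithmetic quoted in the text.
# Assumptions: binary units, and permitted extent sizes that are
# powers of two starting at 16MB (hypothetical list, not from the article).

EXTENT_LIMIT = 2 ** 22  # maximum number of extents, as quoted in the text
ASSUMED_PERMITTED_MB = (16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192)


def max_capacity_tb(extent_mb):
    """Largest capacity (TB) addressable with a given extent size (MB)."""
    return EXTENT_LIMIT * extent_mb / (1024 * 1024)


def min_extent_mb(capacity_tb):
    """Smallest extent size (MB) that can address the given capacity (TB)."""
    return capacity_tb * 1024 * 1024 / EXTENT_LIMIT


def next_permitted_extent_mb(capacity_tb, permitted=ASSUMED_PERMITTED_MB):
    """First permitted extent size that is large enough for the capacity."""
    needed = min_extent_mb(capacity_tb)
    for size in permitted:
        if size >= needed:
            return size
    raise ValueError("capacity exceeds the largest assumed extent size")


def avg_volume_gb(capacity_tb, volumes):
    """Average volume size (GB) needed to consume all of the capacity."""
    return capacity_tb * 1024 / volumes


if __name__ == "__main__":
    print(max_capacity_tb(16))            # capacity ceiling with 16MB extents
    print(min_extent_mb(192))             # extent size needed for 192TB
    print(next_permitted_extent_mb(192))  # rounded up to a permitted size
    print(avg_volume_gb(192, 8192))       # average size across 8192 volumes
```

With the numbers from the text, the four calls reproduce the 64TB ceiling, the 48MB requirement, the 64MB choice and the 24GB per-volume figure.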

Now, these limits may not seem a problem, but remember an extent can only be assigned to one volume.  This means that for space-efficient (thin provisioned) volumes, the growth unit would be 64MB, a wasteful size once host file system fragmentation sets in, requiring lots of host-based re-organisation to keep tidy.  So for our scale-out solution we have an 8-node SVC supporting four storage nodes using 64MB blocks, a totally imbalanced design.  One final thought on this: the extent size has to be set at deployment time, so if the solution were required to scale further, those scale-out requirements would have to have been included in the initial planning, potentially making thin provisioning even more wasteful.

The Architect’s View

I’m a big fan of storage virtualisation and SVC has many great features.  However, scale-out is not one of them.  The system may be capable of scaling capacity, but the small number of logical volumes and extents it supports is severely limiting for many deployments.  I am surprised that this limit has remained in place across many releases of the software; it really needs an upgrade.  Pairing FlashSystem with SVC is a great idea, but the restrictions around configuration (and of course the additional cost) make the solution impractical compared to more mature all-flash solutions already on the market.  While SVC & FlashSystem may work as a technical solution, the practicalities and cost make no sense at all.



Copyright (c) 2009-2014 – Chris M Evans, first published on http://blog.architecting.it, do not reproduce without permission.



About Chris M Evans

Chris M Evans has worked in the technology industry since 1987, starting as a systems programmer on the IBM mainframe platform, while retaining an interest in storage. After working abroad, he co-founded an Internet-based music distribution company during the .com era, returning to consultancy in the new millennium. In 2009 Chris co-founded Langton Blue Ltd (www.langtonblue.com), a boutique consultancy firm focused on delivering business benefit through efficient technology deployments. Chris writes a popular blog at http://blog.architecting.it, attends many conferences and invitation-only events and can be found providing regular industry contributions through Twitter (@chrismevans) and other social media outlets.
  • https://www.ibm.com/developerworks/community/blogs/storagevirtualization/ Barry Whyte
    • http://thestoragearchitect.com/ Chris M Evans

      Barry, thanks for joining in the discussion! A few things spring to mind.

      With extents (regardless of size) are you saying that the directory mapping that is maintained between extents and grains means that irrespective of where the (256KB) grain is written in the file system, they are written sequentially within one extent until the extent is consumed before another extent is taken?

      I noticed in the SVC best practices guide (v 6.2 seems to be the last version) that section 6.1.4 recommends not using thin provisioned volumes for high performance. Is that still the recommendation? If so, does that mean FlashSystem behind SVC should use thick (1:1) provisioning? What is the recommendation for compression?

      One final comment. I mentioned that 8 SVC nodes to support four FlashSystem arrays seemed somewhat excessive in order to get volume count scalability. This was based on the assumption that customers might want 8192 LUNs out of their system. Now if this isn’t the case, then I agree a system could be deployed with less hardware. However, I don’t see anything wrong with providing the capacity to reach 8K LUNs in a system and that means a lot of extra SVC hardware (If that limit is reached).



      • https://www.ibm.com/developerworks/community/blogs/storagevirtualization/ Barry Whyte


        Yes, although an extent is a contiguous space on the mdisk, only the grain is contiguous on the vdisk, i.e. a single extent can contain grains from anywhere in the filesystem / volume. So if you write 4KB at LBA0 then a 256K grain is allocated in extent 0 that stores LBA0-512. If you then write 4KB in the last 256KB of the volume, grain 2 within extent 0 stores LBAxxxx to xxxx+512.

        This is the main reason that we recommended, for ultimate performance, that you didn’t use Thin: you need to look up the grain mapping from the vdisk to the mdisk, so in the worst case 2 reads are needed for one host read, and the same goes for writes. However, we have a cache in the Thin Provisioning layer that caches meta-data and has some very efficient predictive algorithms. For FlashSystem and SSD it makes less of an impact, because any meta-data I/O is also at tens of microseconds of latency.

        Agreed, if you need to go beyond 2048 volumes, then yes you’d need more SVC hardware, but the key thing is: from 40TB usable, do you need more than 2048 volumes? The main reason for having lots of volumes in the past was to wide stripe at the OS or application level. When one volume can provide you with hundreds of thousands of IOPS, do many people need that many volumes? i.e. what’s the volume to TB allocation if performance isn’t an issue?
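
The grain allocation described in the reply above can be illustrated with a toy model.  This is only a sketch under stated assumptions, not IBM’s implementation: the class and method names are hypothetical, 512-byte sectors are assumed, the 64MB extent size is taken from the article’s example, and extents are simply numbered in the order a volume consumes them.

```python
# Toy model of thin-provisioned grain allocation (hypothetical names).
# Grains from anywhere in the volume are packed one after another into
# the current extent; a new extent is taken only when that one is full.

GRAIN_KB = 256        # grain size mentioned in the discussion
EXTENT_MB = 64        # extent size from the article's example
SECTOR_BYTES = 512    # assumed sector size

GRAINS_PER_EXTENT = EXTENT_MB * 1024 // GRAIN_KB
SECTORS_PER_GRAIN = GRAIN_KB * 1024 // SECTOR_BYTES


class ThinVolume:
    def __init__(self):
        # virtual grain index -> (extent number, slot within that extent)
        self.grain_map = {}
        self.next_slot = 0

    def write(self, lba):
        """Record a write at the given LBA and return where its grain lives."""
        vgrain = lba // SECTORS_PER_GRAIN
        if vgrain not in self.grain_map:
            # First touch of this grain: take the next free slot in sequence.
            self.grain_map[vgrain] = divmod(self.next_slot, GRAINS_PER_EXTENT)
            self.next_slot += 1
        return self.grain_map[vgrain]

    def extents_used(self):
        """Number of extents consumed so far (ceiling division)."""
        return -(-self.next_slot // GRAINS_PER_EXTENT)


if __name__ == "__main__":
    vol = ThinVolume()
    print(vol.write(0))        # first grain lands in the first extent
    print(vol.write(10 ** 9))  # a distant LBA takes the next slot, same extent
    print(vol.extents_used())
```

In this model, two writes at opposite ends of the volume share one extent, and every read has to go through `grain_map` first, which is the extra lookup the reply mentions.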
