This is the second post in a series looking at predictions for the storage industry in 2021. Find the first post here.
Storage systems have come a long way since EMC first introduced them in the form of Symmetrix and the ICDA (Integrated Cached Disk Array). After nearly 30 years of development, what can we expect in the coming decade?
The shared storage technology deployed in data centres today is very different from that introduced in the early 1990s. The first shared storage systems used SCSI interconnects, hard drives and mostly custom hardware. Storage frames or arrays were generally deployed as a single 19” rack that encompassed disks, controllers and management laptops. The provisioning of storage was complicated and companies like EMC made fortunes on the design, planning, installation and management processes.
However, what seems like a lifetime ago is still pretty much what we’re deploying today, albeit with much greater simplification and improved software. Modern storage appliances are generally based on 2U/4U servers with multiple controllers, disks (HDD and SSD) and web-based management. Thirty years of server evolution has enabled storage systems to be built from commodity components and for most features to move into software.
We could argue that storage features were always implemented in software. In early systems, however, an upgrade or patch was a major exercise: microcode updates required downtime and an army of onsite support staff. Today, upgrades can be performed remotely and without disruption.
Modern storage appliances have standardised on two main designs.
- Scale-up – building out increased capacity through the addition of faster controllers (compute) and/or more storage capacity/drives.
- Scale-out – building out increased capacity through the addition of physical nodes that add both storage and compute in a clustered configuration.
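The contrast between the two designs can be sketched in a few lines of Python. This is a purely illustrative model with hypothetical node and shelf sizes, not any vendor's sizing logic:

```python
# Illustrative model of the two appliance designs (all numbers hypothetical).

def scale_up(base_capacity_tb, shelves_added, shelf_capacity_tb):
    """Scale-up: the controller pair stays fixed; capacity grows by adding drive shelves."""
    return base_capacity_tb + shelves_added * shelf_capacity_tb

def scale_out(nodes, node_capacity_tb, node_compute_units):
    """Scale-out: every node added contributes both capacity and compute."""
    return nodes * node_capacity_tb, nodes * node_compute_units

print(scale_up(100, 4, 50))   # capacity grows, compute is unchanged
print(scale_out(6, 100, 1))   # capacity and compute grow together
```

The point the toy model makes is the trade-off: scale-up risks the controllers becoming the bottleneck as capacity grows, while scale-out keeps the compute-to-capacity ratio constant at the cost of cluster complexity.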
We won’t go over the merits of each design (check this post out for more details), but simply highlight the level of standardisation that has occurred. There are, as always, some outliers where vendors have developed some degree of bespoke technology (think Pure Storage and IBM flash drives). But technology is mostly consistent and rarely totally bespoke.
The introduction of scale-out solutions also saw a diversification of protocols and data types. Where the first systems focused purely on block-based storage, we’ve seen the introduction of filers (serving NFS and SMB) and object stores, both supporting unstructured data. Arguably, demand growth is now greater in the unstructured than the structured part of the market.
What have we gained over 30 years of development?
Growth and Density – In parallel with the improvements in storage media, storage system capacity has increased massively. It’s now easy (cost aside) to build petabyte systems, even as all-flash configurations. Scale-out technology allows almost unlimited scaling, with geo-dispersed deployments that operate efficiently across multiple locations.
Features – The increase in processor performance has enabled storage systems vendors to add many new features in software, including compression, de-duplication and encryption. Features that previously required dedicated hardware can now all be implemented in software. Note: some vendors are making tactical use of hardware, for example, NetApp with Pensando cards for data reduction, but this approach isn’t widespread.
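To illustrate how a feature like de-duplication can live entirely in software, here is a minimal content-addressed chunk store in Python. It's a toy sketch of the general technique (fixed-size chunking plus hash-based lookup), not any vendor's implementation:

```python
import hashlib

class DedupStore:
    """Toy block de-duplication: identical chunks are stored once, keyed by their hash."""
    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}    # chunk hash -> chunk bytes (stored once)
        self.volumes = {}   # volume name -> ordered list of chunk hashes

    def write(self, volume, data):
        refs = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)   # only new chunks consume space
            refs.append(digest)
        self.volumes[volume] = refs

    def read(self, volume):
        return b"".join(self.chunks[d] for d in self.volumes[volume])

store = DedupStore()
store.write("vol1", b"A" * 8192)   # two identical 4 KiB chunks...
print(len(store.chunks))           # ...stored once: prints 1
```

Production systems add variable-length chunking, reference counting and garbage collection, but the core idea is exactly this hash-indexed lookup, which is why faster general-purpose CPUs were enough to move it out of dedicated hardware.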
Ease of Use – Storage systems from 20 years ago required considerable technical skill and understanding. Storage administrators needed to understand disk and RAID group layouts, per-spindle IOPS, the details of Fibre Channel, and have good debugging skills. As storage software has evolved, most of these skills have been replaced by simplified management tools and SaaS-based analytics. Storage has become very much an administrative role and much less a technical one.
Of course, the hardware simplification and move to software have driven a big rise in software-defined storage (SDS). We’ll cover the details of that in part 3, but for now it’s enough to highlight the increased competition that has developed over the last ten years.
The SDS movement was spearheaded by companies like Nexenta and Nutanix, with VMware introducing vSAN and eliminating the need for shared storage altogether. For many IT organisations, HCI (hyper-converged infrastructure) has been a great step forward, offering potential cost savings, simplified management and a reduced need for specialist storage skills.
Container-attached storage or CAS (to be discussed in part 4) represents another fragmentation of the storage market. In these solutions, storage is delivered from containers running on the same infrastructure (typically Kubernetes) that delivers the application. The implementation is very similar to HCI in that it removes the need for a dedicated storage appliance.
We also shouldn’t forget the impact of Open Source in all of these discussions. We’ll look at that more in part 5 of this series.
With so much fragmentation in the market, it may seem that the future doesn’t look that great for shared storage appliances. It’s true that the traditional appliance market has stagnated, but many of the vendors building those original solutions have also released software-only implementations and even cloud-native storage. The definition of a storage appliance has become harder to pin down, especially when the customer can custom-build platforms like object stores and pay by licensed capacity.
The market is changing and diversifying, but there’s still a need for shared storage.
IT organisations will continue to need a reliable and predictable place to store persistent data. For companies still building out on-premises or hybrid solutions, economies of scale and application diversification (virtualisation and containers) mean some shared storage is always required. For some companies that may be a mix of large unstructured platforms and some block-based storage – or a single platform that deploys everything in one system.
What can we predict for the coming decade?
Declining costs and increased reliability. This is pretty obvious but does need to be called out. Cloud storage costs have plateaued and aren’t falling further (in fact, cloud service providers are adding more tiers instead). In contrast, the on-premises storage market is a cut-throat business, with many competing companies.
Service-based pricing. The coming year will see an increased push on buying using service-based models and consumption pricing. This means architectures need to be capable of incremental capacity growth, have remote management/monitoring and be highly commoditised. Vendors will need to become more transparent with their prices.
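A common shape for consumption pricing is a committed base capacity billed at one rate, with usage above the commitment billed at a higher burst rate. The sketch below uses entirely hypothetical rates to show the mechanics, not any vendor's actual price book:

```python
# Sketch of consumption-based billing (hypothetical rates, illustrative only).
RATE_PER_TB_MONTH = 25.0    # assumed $/TB/month for committed capacity
BURST_RATE_PER_TB = 40.0    # assumed higher $/TB/month above the commitment

def monthly_bill(used_tb, committed_tb):
    """Bill the committed capacity at the base rate and any overage at the burst rate."""
    base = committed_tb * RATE_PER_TB_MONTH
    overage = max(0.0, used_tb - committed_tb) * BURST_RATE_PER_TB
    return base + overage

print(monthly_bill(120, 100))   # 100 TB committed plus 20 TB of burst usage
```

Metering usage like this is exactly why such architectures need incremental capacity growth and remote monitoring: the vendor has to measure consumption continuously to produce the bill.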
Transition to all-flash. We’ve talked about the all-flash data centre for years. As flash costs have reduced, so have hard drive $/GB prices, so for large volumes of inactive data, flash will never be competitive (or certainly not in the next decade). However, for active data, flash and persistent memory (3D-XPoint) will eliminate the hard drive. I expect within five years the hard disk will be an archive-only solution.
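The economics behind that claim are simple raw-media arithmetic. Using hypothetical $/GB figures (real prices vary widely by vendor, tier and year), the gap at archive scale is stark:

```python
# Hypothetical $/GB figures, purely illustrative; real prices vary by vendor and tier.
FLASH_PER_GB = 0.10
HDD_PER_GB = 0.02

def media_cost(capacity_tb, price_per_gb):
    """Raw media cost for a given capacity (ignores controllers, power, space, etc.)."""
    return capacity_tb * 1000 * price_per_gb

# For a petabyte of cold archive data, the raw-media premium for flash remains large:
print(media_cost(1000, FLASH_PER_GB) - media_cost(1000, HDD_PER_GB))
```

Even if both prices fall in parallel, the multiple between them persists, which is why hard drives retain the cold/archive tier while flash takes the active tier.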
Feature stagnation. There’s not much left to add to shared storage solutions in terms of new features. Dell EMC thinks we’ll run applications on our storage, but I think this is a corner case. We see some new architectures, such as Pavilion Data, but these just address the standard requirements of increased performance and greater capacity. With such thin margins for storage vendors, I expect to see fewer new features and little change in storage platforms in the next ten years. We will continue to see new architectures that exploit new media better, but there aren’t many improvements to make outside of that.
The end of enterprise arrays. The classic “monolithic” storage appliance of the 1990s is all but dead, save for the connection to mainframe. Midrange designs are good enough for most requirements and I expect this design will be the dominant block-based storage platform going forward.
Integration. Perhaps the most significant opportunity for shared storage is greater integration. If container-based frameworks like Kubernetes become as dynamic as containers themselves, then we will need another layer to manage persistence. There’s a role here for shared storage to offer persistence, resilience and integrated data protection. We’ve also seen other solutions like NVIDIA GPU Direct, which bypasses the need to move data via the processor. I expect we will see greater integrations in this area, where storage appliances act more autonomously from the connected server.
Rise of the SoHo devices. In a classic Innovator’s Dilemma scenario, I envisage that at the lower end of the market, new entrants will emerge, building solutions that started as home systems but can deliver to an ever-wider set of requirements. These vendors, including QNAP and Synology, are building systems “good enough” to meet the entry-level needs of small and even some medium-sized businesses. The lower-capacity end of the market will be increasingly challenged by these companies and by the public cloud.
The Architect’s View
Shared storage is a mature market, with a lot of competition between vendors. Margins are very tight, and as a result, I anticipate the number of appliance vendors will shrink significantly over the coming decade. We’ve already seen vendor consolidation and acquisitions. From 2021 onwards, I expect to see a lot more of this happening, with only a handful of storage appliance vendors remaining by 2030. The golden age of storage appliances is fading, with much more focus on software and cloud. These will be the topics for the remainder of this series of posts.
Copyright (c) 2007-2021 – Post #2fb3 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.