This is the sixth and final post in our series looking at the StorPool software-defined storage platform. In this post, we compare StorPool to other solutions in the marketplace.
- StorPool Review – Part 1 – Installation & Configuration
- StorPool Review – Part 2 – Performance
- StorPool Review – Part 3 – Connectivity & Scripting
- StorPool Review – Part 4 – Kubernetes Support
- StorPool Review – Part 5 – Resiliency and Recovery
The last twenty years have seen significant evolution in the storage market. At the turn of the millennium, storage arrays or appliances were the primary way in which persistent storage resources were attached to multiple servers and applications. Fibre Channel drove the adoption of this architecture, removing the distance and other physical limitations of directly attached SCSI connections. Storage Area Networks and storage arrays provided more resilient, efficient, and manageable storage than hard drives in individual servers.
Of course, technology never stands still, and the benefits of shared storage began to represent challenges to the deployment of infrastructure. Fibre Channel was seen as complicated and expensive while representing a second physical network that had to be deployed just for storage.
Performance hard disk drives have been superseded by NAND flash, while NVMe has been adopted as a modern storage protocol to remove the overheads and bottlenecks of SAS and SATA. Storage vendors are now starting to adopt persistent memory as yet another tier in the storage hierarchy.
Probably the most significant change for the industry has been the move towards software-defined storage (SDS). All modern storage solutions are now essentially software-defined, with little or no dedicated firmware and widespread use of commodity (off-the-shelf) components. This transition means software is now where new storage features are implemented, generally exploiting the capabilities of the hardware and mitigating specific media characteristics (such as SMR and zoned NAND).
The move to software-defined storage hasn’t resulted in the widespread divergence of hardware and software that might have been expected a decade ago. The bifurcation of the storage array has allowed vendors to offer support for pre-validated hardware configurations while focusing on software as the place for feature evolution. New hardware components such as persistent memory mean that interoperability between software and hardware is more important than ever. As a case in point, storage vendors have started moving towards software-only sales, with certified hardware delivered through partners. This is as much a financial sleight of hand as a technical one.
A decade ago, software-defined storage was almost a novelty and a technology for enthusiasts looking to save money by building their own solutions.
Today, there is little practical difference between the packaged appliances sold by the likes of Dell and HPE and systems built from software-defined storage. Arguably, modern SDS offers the flexibility of hardware choice without the inevitable lag of vendor-supplied hardware (see our recent ebook for more details).
How does the modern shared storage experience fit in with the features and functionality offered by StorPool? First, we should set some categorisation for comparison. Modern storage should provide features in the following broad categories:
- Architecture – features designed to offer flexibility of deployment, the ability to use modern infrastructure (both storage and networking), plus support for a wide range of modern data protocols.
- Data Availability/Resiliency – features like on-disk checksums, automated failure detection and recovery, non-disruptive upgrades, data replication and snapshots/clones.
- Data Efficiency – thin provisioning, compression and deduplication.
- Data Management – quality of service, native platform support (virtualisation and containerisation), storage tiering, data tiering and locality, storage pools.
- Operations Management – CLIs and APIs, mature and extensive data collection and analysis tools.
- Performance – high throughput and low latency. Any performance capabilities should fully exploit the abilities of the underlying media.
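To make a categorised evaluation like this concrete, it can be reduced to a simple scoring table. The sketch below is purely illustrative; the solution names and 0–5 scores are hypothetical placeholders, not the actual ratings from this review.

```python
# Toy scoring model: rate each solution 0-5 in each category, then
# summarise with a per-solution average. All figures are placeholders.
CATEGORIES = ["Architecture", "Data Availability/Resiliency", "Data Efficiency",
              "Data Management", "Operations Management", "Performance"]

scores = {
    "SolutionA": [5, 4, 2, 4, 5, 5],  # hypothetical scores
    "SolutionB": [3, 4, 4, 3, 3, 2],  # hypothetical scores
}

def summarise(scores):
    """Return {solution: average category score}, rounded to one decimal."""
    return {name: round(sum(vals) / len(vals), 1) for name, vals in scores.items()}

print(summarise(scores))  # {'SolutionA': 4.2, 'SolutionB': 3.2}
```

An average hides as much as it reveals, of course; as the comparisons below show, the weighting of each category depends entirely on the end-user's requirements.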
In evaluating solutions, the most obvious route is to compare similar SDS offerings. This approach gives IT organisations a good view when comparing like-for-like architectural designs. However, in modern infrastructure, traditional SANs can now easily be replaced by SDS solutions such as StorPool, so it makes sense to compare StorPool with the broader market.
The storage appliance market continues to be dominated by Dell, HPE, NetApp, IBM, Hitachi and Pure Storage. However, this market has been flat in revenue terms for years. Product evolution has been limited, with “newcomers” such as Pure Storage releasing new products and solutions while others re-invent or rehash their existing solutions.
This approach by vendors is fascinating to explore. Dell/EMC focuses on PowerMax (Symmetrix evolved from 30 years ago) and PowerStore (legacy Clariion/VNX/Unity heritage) yet has arguably underdeveloped solutions like PowerFlex available in its portfolio. NetApp continues to sell ONTAP, the original storage appliance software developed in the early 1990s. IBM sells either DS series (based on PowerPC) from 30 years ago or FlashSystem based on 20+ year-old SVC technology. Curiously, Infinidat, with consistent year-on-year growth, still doesn’t appear in analysts’ sales ratings.
How can we compare StorPool (or any SDS solution) with these vendor products? Architecturally, none of these solutions is designed for true scale-out or a cloud-based infrastructure model. However, functionally, most have a more comprehensive set of storage-based features than StorPool, offering, for example, de-duplication, compression, or RAID-based resiliency (rather than mirroring). But when positioned for scale-out cloud infrastructure, StorPool offers greater integration and manageability than any of the appliance-based solutions.
Vendors such as StorPool will continue to gain share with MSPs and enterprises that have a greater need for cloud-like functionality and scale-out than for cost-saving data efficiency features, so any comparison needs to reflect the requirements of the end-user.
Software-defined storage offers much more to service providers and enterprises looking to emulate the public cloud. Hardware is disaggregated from the discussion, as most solutions use off-the-shelf components rather than bespoke hardware designs. We compared StorPool with four popular SDS solutions in the market and summarised the results in individual radar charts.
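As an aside, a radar chart of this kind is straightforward to construct: each category becomes an evenly spaced spoke, and a solution's per-category scores form a closed polygon. The sketch below (using hypothetical scores, not the ratings in this review) computes the polygon vertices; any plotting library can then render them.

```python
import math

# Hypothetical 0-5 scores for one solution, one value per category.
CATEGORIES = ["Architecture", "Availability/Resiliency", "Data Efficiency",
              "Data Management", "Operations Management", "Performance"]
scores = [4, 4, 2, 3, 5, 4]  # placeholder figures only

def radar_vertices(values):
    """Map per-category scores onto evenly spaced spokes and close the loop."""
    n = len(values)
    angles = [2 * math.pi * i / n for i in range(n)]
    pts = [(v * math.cos(a), v * math.sin(a)) for v, a in zip(values, angles)]
    # Repeat the first point so the polygon closes when drawn.
    return pts + pts[:1]

verts = radar_vertices(scores)
print(len(verts))  # 7 vertices: six categories plus the closing point
```

The closing vertex is the one detail that trips people up: without repeating the first point, most plotting libraries leave the polygon open between the last and first spokes.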
Commvault Distributed Storage (Hedvig)
Commvault Systems acquired Hedvig in September 2019 for $225 million. The Hedvig Distributed Storage Platform has since been renamed Commvault Distributed Storage (CDS). The CDS platform is a scale-out architecture that offers block and unstructured data protocols. The solution fits well with the service-provider market and generally scores highly across each of our categories but is beaten by StorPool in the architecture and performance areas. As a commercial product, we don’t know how well CDS is performing in the market. However, Commvault hasn’t pushed the solution too heavily, and we see little marketing news. In our opinion, CDS is more aligned to large-capacity unstructured data storage than high-performance block-based applications.
Ceph
Ceph is an open-source storage platform delivering block and unstructured storage. The implementation is scale-out and has a surprisingly long history, with early development dating to the mid-2000s. Red Hat acquired the leading developers of Ceph in 2014, selling commercial versions of the software as Red Hat Ceph Storage. Although Ceph is now seen as relatively feature-rich, the solution has a history of performance challenges and hardware inefficiency. The Ceph community has looked to resolve these issues with the introduction of a new storage backend, BlueStore.
Microsoft Storage Spaces Direct (S2D)
Microsoft first introduced Storage Spaces in Windows Server 2012. The technology expanded in Windows Server 2016 to offer a distributed scale-out architecture called Storage Spaces Direct (S2D). Windows customers can use S2D to build out storage clusters that support both block and file-based applications. The features of S2D are comprehensive and include many traditional resiliency functions like snapshots and replication. Data efficiency capabilities are implemented through ReFS (the Resilient File System).
The only challenge with using Windows as the basis of a storage platform is the overhead of Windows itself. Windows Server implementations are generally more resource-intensive than Linux, require licensing and don’t offer the chance to implement HCI solutions or run Kubernetes. In addition, Microsoft recently discontinued the free edition of Hyper-V Server, introducing questions around the long-term viability of on-premises Windows and the Hyper-V virtualisation platform.
Dell PowerFlex
PowerFlex is a rebranding of the ScaleIO platform, acquired by EMC in 2013 and subsequently a Dell product following the EMC acquisition. We heard relatively little from the ScaleIO team after the company was acquired. In 2018, Dell announced ScaleIO would only be sold as a package with hardware. The current release is only at version 3.6, which suggests little development has taken place while the product has been in the Dell stable. Architecturally, ScaleIO is a scale-out block storage solution using commodity hardware. From the available documentation, the platform seems to score well across the board. Unfortunately, PowerFlex appears to have been de-prioritised compared to PowerStore and PowerMax when it comes to marketing and development.
Finally, of course, we should include our assessment of StorPool. As we’ve highlighted before, the solution is perhaps an undiscovered gem in the storage world. The technology fits well with the service provider market, scoring highly on efficiency and usability. The SaaS backend reporting capabilities are probably the most extensive we’ve ever seen, with every possible metric covered. StorPool could be stronger in data efficiency by introducing de-duplication, compression and more RAID-like protection. However, these features would need to be developed in a way that doesn’t compromise performance (or could be set at a per-volume level). We’d also like to see more discussion of integrated solutions with products covering unstructured protocols (NAS and object). This would expand the TAM for StorPool considerably, bearing in mind the current appetite for fast file and object solutions.
In this brief comparison, we also considered a wide range of other market solutions. These were discounted because of insufficient market traction, lack of maturity, or lack of similar product fit.
The Architect’s View™
Making comparisons between storage products is not a straightforward task. The traditional appliance market isn’t an area of growth, whereas unstructured data and cloud-like infrastructure are becoming more popular. If we are to see widespread adoption of container-based applications, then scale-out container-aware storage is a must. The transition to software-defined solutions isn’t going to happen overnight but is gradually chipping away at the traditional storage base. StorPool is in a solid position to capitalise on this market, with some product enhancements and integrations required. We’re poised to see software-defined solutions become the dominant deployment model by the end of the decade.
This work has been made possible through sponsorship from StorPool.
Copyright (c) 2007-2021 – Post #3bcf – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.