One of the biggest debates in enterprise storage focuses on whether scale-out offers a better architectural choice than scale-up. With recent advances in media density and capacity, multi-core processors and persistent memory, are the use cases for scale-out still fully justified?
For decades the standard design of storage platforms focused on tightly coupled scale-up architecture. By “scale-up” we mean the ability to increase capacity in a single system by adding more HDDs or SSDs. Most scale-up solutions use a dual-controller architecture, which inherently limits performance to the speed and capacity of the controller itself. The throughput capability of the controllers is a limiting factor, irrespective of the amount of connected storage, as the controllers are directly in the data path.
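The controller bottleneck can be shown with a rough model. This is a minimal sketch that assumes both controllers are active and that throughput is simply the lower of the media limit and the controller limit; the function name and all figures are illustrative assumptions, not measurements from any product.

```python
def array_throughput_gbps(drive_count, per_drive_gbps, controller_gbps, controllers=2):
    """Estimate deliverable throughput for a scale-up array (illustrative model)."""
    media_limit = drive_count * per_drive_gbps        # what the drives could deliver
    controller_limit = controllers * controller_gbps  # what the controllers can pass
    return min(media_limit, controller_limit)


# Hypothetical figures: 2 GB/s per drive, 25 GB/s per controller.
for drives in (12, 24, 48, 96):
    print(drives, array_throughput_gbps(drives, per_drive_gbps=2.0, controller_gbps=25.0))
```

With these assumed numbers, throughput stops growing once the drives can collectively deliver more than the two controllers can pass, regardless of how much more storage is attached.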
I use the term “tightly coupled” to reference the internal management of persistent storage within a storage appliance. In tightly coupled solutions, the backend storage is accessible directly by both controllers, so some degree of co-ordination between controllers is required. Early systems used replicated DRAM (which had challenges), whereas modern solutions can share persistent storage (like NVRAM, Optane or NVMe SSDs) over a PCIe bus.
The alternative to scale-up is to scale out the architecture through the addition of extra compute resources, as either tightly or loosely coupled nodes. Tightly coupled solutions share storage, whereas loosely coupled designs assign storage to specific nodes and replicate data between nodes for resiliency. Depending on the design, a node can be “expendable”, provided the remaining nodes deliver data and performance resiliency. This means each node can be built around a lower-cost and less resilient design.
Scale-out designs are suitable for unstructured data, where the content can be spread across multiple nodes to improve resiliency and performance. Generally, with this type of data, the I/O profile doesn’t need the deterministic latency that block-based I/O demands. This is why we saw scale-out as a prominent design in file and object storage solutions. A scale-out solution also permits the capacity of a single volume to exceed that of a single node, so object stores or file systems that need to support large capacity are well-suited to scale-out designs.
It’s worth noting that many solutions sold as scale-out are either limited in their scalability or use a mix of scale-up and scale-out together. For example, in some designs, nodes consist of multiple controllers, plus storage and battery backup. Multiple nodes are then linked together in a tightly coupled cluster. To my mind, this configuration is less flexible than either scale-up or scale-out alone, because it both introduces additional complexity and unnecessarily increases the BOM (bill-of-materials) to gain resiliency.
Scalability is important to enterprise customers, as capacity management requires planning the addition of more capacity or performance. In shared storage environments, it’s impossible to predict the long-term balance between performance and capacity (the I/O density). Each new application added to a platform changes the I/O density ratio, while existing applications change over time, depending on the application usage.
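A short worked example shows how quickly the I/O density ratio can move. This sketch assumes I/O density is expressed as aggregate IOPS per TB of allocated capacity; the workload figures are invented for illustration.

```python
def io_density(workloads):
    """Return aggregate IOPS per TB for a list of (iops, capacity_tb) pairs."""
    total_iops = sum(iops for iops, _ in workloads)
    total_tb = sum(tb for _, tb in workloads)
    return total_iops / total_tb


# Two hypothetical existing applications: 25,000 IOPS over 150 TB.
existing = [(20_000, 50), (5_000, 100)]
print(round(io_density(existing), 2))

# Adding one small but busy application: 65,000 IOPS over 160 TB.
with_new = existing + [(40_000, 10)]
print(io_density(with_new))  # 406.25
```

In this example a single new application adds under 7% more capacity but more than doubles the I/O density of the platform.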
The solutions I designed focused on ensuring both capacity and performance could be expanded in a single array or appliance, subject to the acceptable fault-domain size (the impact of losing a single storage array). Where possible, I designed array deployments to cater for the addition of storage, more controllers, or both. This mitigated the issue of having to know up-front exactly how a system should be configured.
Modern storage systems have changed significantly over the past 20 years. The limitations of processor speeds and DRAM have reduced to the point where vendors are designing scale-up storage arrays with 100+ cores, terabytes of DRAM and petabyte-level capacity. Individual drive capacities now stretch to tens of terabytes, with 18-20TB HDDs and 38TB+ SSDs already on the market.
The result of this evolution is the ability to build a storage platform with incredible performance, all in a standard 2U chassis. In fact, the 2U appliance form factor is becoming the default design of choice for almost all vendors.
With so much capacity in a single appliance, federation becomes a better option than scale-out. In a federated architecture, multiple appliances act in a loosely coupled configuration. Each system is independent, with no software or hardware dependencies between them. A separate management platform provides capacity and performance metrics, while other processes are used to balance workloads across the architecture.
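The balancing process in a federation can be sketched as a simple placement decision. This is a hypothetical greedy policy, not the method of any particular management platform: it assumes the platform knows each appliance's free capacity and free IOPS, and places a workload on whichever appliance is left with the most relative headroom. The `Appliance` class, `place` function and fleet figures are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Appliance:
    """One independent array in the federation (all fields hypothetical)."""
    name: str
    free_tb: float
    free_iops: float


def place(workload_tb, workload_iops, appliances):
    """Assign a workload to the appliance left with the most relative headroom.

    Returns the chosen Appliance, or None if no appliance can hold the workload.
    """
    if workload_tb <= 0 or workload_iops <= 0:
        raise ValueError("workload capacity and IOPS must be positive")

    def headroom_after(a):
        # Fraction of free capacity and free IOPS remaining after placement;
        # the scarcer of the two resources decides the score.
        return min((a.free_tb - workload_tb) / a.free_tb,
                   (a.free_iops - workload_iops) / a.free_iops)

    candidates = [a for a in appliances
                  if a.free_tb >= workload_tb and a.free_iops >= workload_iops]
    if not candidates:
        return None
    chosen = max(candidates, key=headroom_after)
    chosen.free_tb -= workload_tb
    chosen.free_iops -= workload_iops
    return chosen


fleet = [Appliance("array-a", 100, 50_000),
         Appliance("array-b", 400, 20_000),
         Appliance("array-c", 300, 80_000)]
target = place(50, 30_000, fleet)
print(target.name if target else "no appliance fits")
```

Because each appliance is independent, the decision only needs the metrics the management layer already collects; nothing in the sketch requires the arrays to coordinate with one another.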
Virtualisation first provided this capability. VMware ESXi and vSphere make it easy to move virtual machines between datastores, either manually or using automated software (Storage DRS). We can expect to see a similar process in play as container-attached storage becomes more popular, as this software will provide the abstraction layer to load-balance across multiple physical appliances.
Federation offers a better solution for block-based workloads than scale-out. Each individual system has the option to be managed separately, from the perspective of system upgrades, code upgrades or decommissioning. In the block-storage world, metro-cluster designs enable data to be synchronously replicated between appliances without any outage. So internal load balancing within a single cluster of appliances is possible, even without virtualisation.
Picking the Right Architecture
Do the advancements in hardware mean scale-out is becoming redundant? I don’t think that’s the case. However, for deployments up to a petabyte in capacity, scale-up architectures can probably deliver all the requirements of an application without too much difficulty and across all protocols. Above a petabyte (and in those cases where a large, single pool or namespace is needed), the better choice is definitely scale-out. Scale-out architectures introduce better resiliency at scale through the use of erasure coding rather than RAID protection. A scale-out design also offers greater efficiency in a geo-distributed model, compared to outright replication of an entire appliance.
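The efficiency argument comes down to simple arithmetic. The sketch below assumes a k+m erasure-coding scheme (k data shards, m parity shards) and treats full replication as one data copy plus extra copies; the 8+2 and three-copy schemes and the 1,000TB figure are illustrative assumptions, not recommendations.

```python
def raw_needed_tb(usable_tb, data_shards, parity_shards):
    """Raw capacity required to store usable_tb under a k+m protection scheme."""
    return usable_tb * (data_shards + parity_shards) / data_shards


usable = 1000  # TB of user data (hypothetical)

# Three full copies: 1 data copy + 2 extra copies, survives the loss of 2 copies.
print(raw_needed_tb(usable, data_shards=1, parity_shards=2))  # 3000.0

# 8+2 erasure coding: also survives the loss of any 2 shards.
print(raw_needed_tb(usable, data_shards=8, parity_shards=2))  # 1250.0
```

Both schemes in this example tolerate two failures, but the erasure-coded layout needs less than half the raw capacity, and the gap widens as the data set grows.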
One other aspect to consider in the choice of scaling up or out will be the impact of SmartNICs. SmartNIC technology offloads some networking, storage and security functionality to intelligent adapters that can sit inside or outside the data path. Some storage and HCI vendors already use SmartNICs to offload data reduction functions. This technology may continue to disaggregate tasks away from the CPU, providing even greater throughput performance for scale-up technology. Of course, it may also lead to a new breed of scale-out solutions that are based more on an HCI model than traditional scale-out storage.
The Architect’s View
Both scale-out and scale-up are still valid storage architectures. However, improvements in technology mean the boundary between the two designs has shifted to a much higher entry level for scale-out. The capacity of individual media devices has reached the point where it’s almost impossible to build a scale-out solution with less than 500TB of capacity. For many smaller (and some larger) organisations, the use of scale-up to deliver block, file and object protocols will work just fine. At the multi-petabyte level, scale-out makes sense from an operational and cost perspective, which is typically where unstructured data will be the exclusive requirement.
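The entry-level claim is easy to check with back-of-the-envelope numbers. The node count and drives per node below are assumptions chosen for illustration; only the 18TB drive size comes from the capacities mentioned earlier.

```python
def minimum_raw_tb(nodes, drives_per_node, drive_tb):
    """Raw capacity of the smallest cluster that can be built from these parts."""
    return nodes * drives_per_node * drive_tb


# Hypothetical minimum cluster: 4 nodes, 8 drives each, 18TB HDDs.
print(minimum_raw_tb(nodes=4, drives_per_node=8, drive_tb=18))  # 576
```

Even this small assumed configuration starts above 500TB of raw capacity before any protection overhead is applied.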
Copyright (c) 2007-2020 – Post #39b1 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.