One of the more curious vendors at Flash Memory Summit 2022 was GRAID Technology. At first glance, the company appears to be marketing nothing more than a high-end RAID card, yet it won the “Most Innovative Flash Memory Startup” award. So, what’s going on here?
For servers where storage hardware protection isn’t provided by a scale-out architecture, the typical approaches to data resiliency are to use the on-board RAID provided by the server, install a dedicated RAID card, or use software RAID. The latest Dell PERC (PowerEdge RAID Controller) devices, for example, can deliver up to a million write IOPS and a few million read IOPS, depending on the configuration (see this report). However, a single NVMe SSD is now capable of delivering this level of performance on its own, so new techniques are needed to exploit a fully deployed server with 32 NVMe drives.
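To put the gap into numbers, here is a back-of-the-envelope sketch in Python. The 32-drive count and the one million write IOPS controller ceiling come from the paragraph above; the per-drive figure of one million IOPS is an assumption based on the "this level" comparison, not a measured value, and the function names are purely illustrative.

```python
# Rough comparison of raw drive capability against a RAID controller ceiling.
# DRIVES_PER_SERVER and RAID_CARD_WRITE_IOPS are taken from the article text.
# ASSUMED_IOPS_PER_DRIVE is an assumption, not a benchmark result.
DRIVES_PER_SERVER = 32
RAID_CARD_WRITE_IOPS = 1_000_000
ASSUMED_IOPS_PER_DRIVE = 1_000_000


def aggregate_iops(drives: int, iops_per_drive: int) -> int:
    """Theoretical IOPS if every drive ran flat out with no RAID overhead."""
    return drives * iops_per_drive


def controller_utilisation(card_iops: int, drives: int, iops_per_drive: int) -> float:
    """Fraction of the raw drive capability that a controller ceiling can expose."""
    return card_iops / aggregate_iops(drives, iops_per_drive)


if __name__ == "__main__":
    raw = aggregate_iops(DRIVES_PER_SERVER, ASSUMED_IOPS_PER_DRIVE)
    share = controller_utilisation(
        RAID_CARD_WRITE_IOPS, DRIVES_PER_SERVER, ASSUMED_IOPS_PER_DRIVE
    )
    print(f"Raw drive capability: {raw:,} IOPS")
    print(f"Share reachable through the controller: {share:.1%}")
```

Under these assumptions, a traditional controller in the data path would expose only a small fraction of what the drives could theoretically deliver.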
Implementing RAID in software consumes CPU cores, potentially making application server software more expensive when licensing is based on sockets. In any event, if storage I/O consumes a considerable proportion of CPU time, the overall efficiency of the server is impacted, and more hardware must be deployed.
Note: One of the main reasons for the use of storage area networks was to centralise storage resources and avoid the need for host-based resiliency, but that’s a story for another day.
GRAID Technology Inc., a start-up founded in 2020, markets two products that are essentially high-end NVMe RAID cards. The SupremeRAID SR-1000 supports PCIe Gen3 systems, while the SR-1010 offers PCIe Gen4 support, with an obvious uplift in performance.
The performance numbers are pretty impressive – up to 19 million random 4KB read IOPS and 6 million random 4KB write IOPS (RAID-10). Throughput is up to 110GB/s sequential read and 25GB/s sequential write. These numbers are for Linux. Windows throughput is significantly lower.
The SupremeRAID products aren’t RAID cards in the traditional sense, where the hardware sits in the data path. Instead, the GRAID cards perform the complex calculations needed to support RAID using an NVIDIA GPU. Data flows to and from drives using a virtual NVMe driver, so SupremeRAID is essentially an offload engine. I suspect the Windows performance is lower due to the architecture of the virtual device driver, which is perhaps harder to implement on Windows than on Linux.
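For readers who haven't looked at RAID maths for a while, the sketch below shows the kind of calculation that gets offloaded. It is a generic illustration of single-parity (RAID-5 style) XOR protection in plain Python, not GRAID's GPU implementation, and the function names are hypothetical. The article's headline figures are for RAID-10, which mirrors rather than computes parity, so treat this purely as an example of the work a parity-based layout generates.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks (single-parity RAID)."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same size")
    # zip(*blocks) walks the stripe column by column, one byte from each block.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover one lost block by XOR-ing the survivors with the parity block."""
    return xor_parity(surviving + [parity])


if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC"]  # three data blocks in one stripe
    parity = xor_parity(stripe)
    # Simulate losing the second drive and rebuilding its block.
    recovered = rebuild_missing([stripe[0], stripe[2]], parity)
    print(recovered == stripe[1])
```

Every write to a parity-protected stripe triggers this kind of computation, which is why doing it at millions of IOPS eats CPU cores unless something else takes on the job.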
The implementation design makes the GRAID solution an acceleration product, similar in concept to the many SmartNICs on the market today. Being outside the data path also makes the physical implementation easier, as drives don’t have to be connected directly to the card itself (which is generally the way SAS RAID cards would work).
SupremeRAID also supports NVMe-oF devices, which is an interesting option. It could be possible, for example, to use drives deployed in JBOD shelves that just expose NVMe devices over a fast network. It also means that a server doesn’t need physically local storage at all.
Of course, there is no free lunch with RAID calculations. The RAID overhead has to land somewhere, whether that is CPU cycles on the host processor, a RAID card or the GRAID GPU. A single GRAID SupremeRAID SR-1010 consumes 70W of power and requires an x16 PCIe Gen4 slot. Once the SupremeRAID solution is deployed at scale, the additional power overhead starts to add up.
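As a rough guide to how that overhead scales, the sketch below multiplies the 70W figure across a fleet. It assumes one card per server drawing its full rated power around the clock, which is a worst-case simplification rather than a measured profile, and the function name is illustrative.

```python
# Worst-case energy estimate for a fleet of SR-1010 cards.
# CARD_POWER_WATTS comes from the article; continuous full-power draw
# is an assumption made to keep the arithmetic simple.
CARD_POWER_WATTS = 70
HOURS_PER_YEAR = 24 * 365


def annual_energy_kwh(cards: int, watts_per_card: float = CARD_POWER_WATTS) -> float:
    """Energy in kWh consumed per year by the given number of cards."""
    return cards * watts_per_card * HOURS_PER_YEAR / 1000


if __name__ == "__main__":
    for fleet_size in (1, 100, 1000):
        print(f"{fleet_size:>5} cards: {annual_energy_kwh(fleet_size):,.1f} kWh per year")
```

Multiply the result by your own cost per kWh (and a cooling factor, if you track one) to get a figure that can be set against the CPU cores recovered.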
We also noted that SupremeRAID requires IOMMU to be disabled in the server BIOS. This could be an issue for some virtual environments with PCIe passthrough. Additionally, there isn’t a VMware driver, so this solution won’t work in virtual machine farms (and may not anyway, without IOMMU support).
The Architect’s View®
When is a RAID card not a RAID card? When it’s a RAID offload engine. Getting out of the data path is an elegant solution to gain I/O throughput at scale, where PCI Express itself could be the bottleneck. Support for NVMe-oF could lead to some fascinating design ideas. We could also see the GRAID products used to build massive I/O farms.
If your requirement is I/O throughput at any cost, then GRAID is for you. Outside of that, there’s a TCO calculation to be done in order to justify recovering CPU cycles by deploying a power-hungry add-in card. No doubt there will be a sweet spot where GRAID works well.
Infrastructure architecture continues to offer enterprises lots of choices, whether that be dedicated storage via a SAN, scale-out storage implemented in hardware (or software), or a GRAID solution used to build a powerful scale-up system – and perhaps some risk of analysis paralysis.
Copyright (c) 2007-2022 – Post #8e70 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.