Quantum Corporation has announced Myriad, a new scale-out object and file storage platform. We look at the architecture in this blog post, with a follow-up discussion on the relative strengths and weaknesses of the solution compared to the broader market.
Myriad was developed by Quantum as an entirely new platform rather than an evolution of an existing product. Although the solution is “software-defined” and can run on general-purpose storage hardware, Quantum is initially selling it as a bundled hardware and software combination.
The architecture is node (or server) based, with three distinct sets of components. NVMe storage nodes provide capacity, initially with ten 15TB SSDs per 1U server. Load balancer nodes (which are Linux-based network switches) provide front-end connectivity, initially over NFSv4, with NFSv3, SMB, S3 object storage and GPUDirect clients on the roadmap. Overall system management is handled by a deployment node, which installs, configures, and updates the nodes within a cluster. An example configuration is shown in Figure 1.
All the nodes are connected with 100Gb Ethernet networking, and RDMA-capable NICs (RNICs) in the storage nodes offload some networking functionality.
We’ve mentioned the hardware configuration first, as this is probably the least exciting part of the architecture. Myriad is primarily a software solution designed to run on-premises and in the public cloud. The architecture uses a microservices design, orchestrated with Kubernetes. Customers don’t need to understand or see the Kubernetes layer, as it is managed internally. However, building on Kubernetes means Myriad can be deployed on public cloud infrastructure running managed Kubernetes services like AKS or EKS.
Figure 2 shows the Myriad software stack. At the base layer, data is stored on NVMe SSDs using a transactional key/value store. The “transactional” nature is vital to guarantee data integrity, especially with file services that must be POSIX compliant. We’ve seen the key/value store architecture used before in platforms such as VAST Data’s Universal Storage and the now-defunct Stellus.
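To illustrate why the transactional property matters for file semantics, here is a minimal Python sketch of an all-or-nothing commit in a toy in-memory key/value store. The class `TransactionalKV`, its `commit` method, and the `dir/...` key naming are hypothetical names invented for this example; they are not Myriad's interfaces, and nothing here reflects how Quantum implements transactions.

```python
class TransactionalKV:
    """Toy in-memory key/value store with all-or-nothing commits (illustrative only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def commit(self, puts, deletes=(), expected=None):
        """Apply every put and delete together, or none of them.

        `expected` maps keys to the values they must currently hold;
        any mismatch aborts the transaction and leaves the store untouched.
        """
        for key, value in (expected or {}).items():
            if self._data.get(key) != value:
                return False  # precondition failed; nothing is changed
        staged = dict(self._data)  # work on a private copy
        for key in deletes:
            staged.pop(key, None)
        staged.update(puts)
        self._data = staged  # one reference swap publishes all changes together
        return True


# A rename touches two directory entries and must never be half-applied.
kv = TransactionalKV()
kv.commit({"dir/a": "inode-7"})
renamed = kv.commit({"dir/b": "inode-7"}, deletes=["dir/a"],
                    expected={"dir/a": "inode-7"})
# Repeating the rename fails because "dir/a" no longer exists.
repeated = kv.commit({"dir/c": "inode-7"}, deletes=["dir/a"],
                     expected={"dir/a": "inode-7"})
```

A rename is a useful test case because POSIX clients expect to see either the old name or the new one, never both or neither.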
The KV design abstracts data storage away from the storage hardware implementation, essentially creating a massive “bucket” of storage capacity that can be used to store objects, files, parts of large objects and files, and metadata. This abstraction makes it easier to think of unstructured data as simply data “pieces” to be stored within the system, whether the data is derived from files or objects. It also provides independence from the underlying storage hardware, which can be managed by a separate layer of indirection.
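The idea of treating files and objects as “pieces” in a single key/value space can be sketched in a few lines of Python. This is a toy under stated assumptions: the functions `put_blob` and `get_blob`, the `chunk:`/`meta:` key scheme, and the 4-byte chunk size are all invented for illustration and say nothing about Myriad's actual on-media layout.

```python
CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def put_blob(store, name, data, chunk_size=CHUNK_SIZE):
    """Split `data` into fixed-size pieces and store each under its own key."""
    keys = []
    for offset in range(0, len(data), chunk_size):
        key = f"chunk:{name}:{offset}"
        store[key] = data[offset:offset + chunk_size]
        keys.append(key)
    # The metadata record is just another key/value pair in the same store.
    store[f"meta:{name}"] = keys


def get_blob(store, name):
    """Reassemble a blob by reading its pieces in the order the metadata lists them."""
    return b"".join(store[key] for key in store[f"meta:{name}"])


store = {}  # a plain dict stands in for the key/value layer
put_blob(store, "report.txt", b"hello world")       # could be a file...
put_blob(store, "bucket/photo", b"\x89PNG\r\n")     # ...or an object
```

The calling code never needs to know whether a blob started life as a file or an object, which is the point of the abstraction.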
Myriad has all the features expected in a modern unstructured data store. These include snapshots, clones, and dynamic erasure coding, which adjusts data protection as systems expand and contract. The system also maintains availability in the event of a drive or node failure. Data optimisation is implemented through compression and deduplication. Deduplication relies on an in-memory cache: when incoming data matches what’s already on physical media, the system records an index reference to the existing copy rather than writing the data again.
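The deduplication behaviour described above can be sketched as follows. This is a generic content-fingerprint approach, not Quantum's algorithm: the class `DedupStore`, the use of SHA-256 as the fingerprint, and the list standing in for physical media are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy deduplicating block store with an in-memory fingerprint index."""

    def __init__(self):
        self.index = {}  # fingerprint -> location on media (held in memory)
        self.media = []  # stand-in for blocks written to physical media

    def write(self, block: bytes) -> int:
        """Return the location of `block`, writing it only if it is new."""
        fingerprint = hashlib.sha256(block).digest()  # assumed hash choice
        location = self.index.get(fingerprint)
        if location is None:
            location = len(self.media)
            self.media.append(block)  # first copy: write to media
            self.index[fingerprint] = location
        return location  # duplicates resolve to the existing copy


dedup = DedupStore()
first = dedup.write(b"block-A")
second = dedup.write(b"block-B")
again = dedup.write(b"block-A")  # matches existing data, so nothing new is written
```

Three writes arrive, but only two blocks reach the media; the third write is satisfied by an index lookup alone.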
One aspect worth mentioning here is the composability feature, essentially allowing for the creation of many file systems (and presumably in the future, many object buckets). We hope to see this translate into an ability to programmatically create file systems and buckets for dynamic environments.
There are some interesting architectural decisions in the Myriad implementation. The design uses NVMe Zones, enabling large-capacity SSDs to be divided into smaller key/value pools. Some data transfers are offloaded to RNICs, but the design ensures node CPUs remain aware of data moving around the system (for consistency purposes). Each storage node implements part of a global cache, caching only data for the drives within that node. Physical storage is mapped into pods, with local pods (within a node) connected to remote pods (in other nodes).
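The partitioned global cache can be pictured with a small Python sketch in which each node caches only keys held on its own drives, and any read is routed to the owning node. How Myriad actually places data is not described here, so the hash-modulo `owner` function is purely an assumption, as are the `Node` and `Cluster` classes.

```python
import hashlib


class Node:
    """A storage node that caches only data held on its own drives."""

    def __init__(self, name):
        self.name = name
        self.drives = {}  # stand-in for the node's NVMe capacity
        self.cache = {}   # this node's slice of the global cache

    def read(self, key):
        if key in self.cache:
            return self.cache[key], "cache"
        value = self.drives[key]
        self.cache[key] = value  # populate the cache on first read
        return value, "drive"


class Cluster:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]

    def owner(self, key):
        # Assumed placement rule: hash the key and take it modulo the node count.
        digest = hashlib.sha256(key.encode()).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def write(self, key, value):
        self.owner(key).drives[key] = value

    def read(self, key):
        # A request arriving anywhere is forwarded to the owner (east-west traffic).
        return self.owner(key).read(key)


cluster = Cluster(["node-1", "node-2", "node-3"])
cluster.write("chunk:42", b"payload")
cold = cluster.read("chunk:42")  # served from the owner's drives
warm = cluster.read("chunk:42")  # served from the owner's cache
```

Because a key is cached in exactly one place, total cache capacity grows with the node count, at the cost of a network hop whenever the request lands on a node that does not own the data.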
The shared-nothing architecture means that some east-west traffic must occur, especially to look up unique deduplicated data keys. Otherwise, the system would be limited to the cache capacity in each node. On the other hand, the shared-nothing design means there should be few, if any, limits on scalability.
It’s interesting to see how unstructured data stores have evolved in recent years as new architectures have been designed and implemented to make use of modern NVMe media. Myriad has echoes of Cleversafe (access and data nodes), VAST Data (scale-out NVMe nodes, KV store), Stellus (KV store) and Excelero (the use of RNICs and RDMA). However, there are also plenty of differences in implementation across these solutions that make all of them unique.
In terms of the broader Quantum portfolio, Myriad provides features that aren’t delivered by other solutions available to Quantum customers. ActiveScale isn’t designed for low-latency, high-performance workloads but does provide the next level of tiering. Similarly, StorNext (both the file system and appliances) fits a different part of the market. We’ll dig into the relative positioning of Myriad in another post. However, it’s safe to say at this point that the solution plugs gaps in the portfolio and is one that existing customers could quickly adopt.
The Architect’s View®
It’s a bold move to bring out a new unstructured data platform in the current market, with mature competition and some big start-up players all fighting for a share of the AI/ML, HPC and other high-performance storage use cases. However, we believe that Myriad is not about trying to compete directly against other vendors but instead is focused on filling gaps in the Quantum portfolio. With over 10,000 existing customers (including 52 of the Fortune 100), there’s a significant opportunity for the company to exploit. Part of this process will be to enable the synergy between the multiple platform offerings already available, including tape, unstructured archives, and file system products. We’ll discuss that in a separate blog post.
Copyright (c) 2007-2023 – Post #98c3 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.