The recent Future of Memory and Storage conference was notable for several announcements of 100+TB SSDs. Technically speaking, this capacity point was breached many years ago, but now we’re seeing mainstream vendors planning to offer NVMe SSD devices around the 128TB capacity mark. So, we can just replace those 32TB and 16TB models and save floor space, power, and cooling, right? Well, not so fast…
Background
Following on from pre-announcements at FMS, in recent days both Solidigm and Phison have announced 122TB SSDs. The Solidigm D5-P5336 is an enterprise drive currently supporting up to 61.44TB and debuted in July 2023 (see our article covering the announcement). The most recently announced upgrade to the D5-P5336 introduces a 122.88TB drive, effectively doubling the capacity of current models, and is due in Q1 2025.
It appears Solidigm has managed to cram more NAND chips into the existing form factors of the 61.44TB models, doubling the capacity and the total endurance of the drive (twice as much flash means twice the total data that can be written over the drive's life, but the DWPD rating remains the same). However, performance declines slightly from the 61.44TB model, with 930,000 read IOPS (1,005,000 on the 61TB drive), 25,000 write IOPS (38,000 for 61TB) and 3,200 MB/s sequential write (3,300 MB/s on 61TB). For read throughput, there is a slight improvement, with 7,400 MB/s sequential read (7,000 MB/s on 61TB). The interface on the 122TB D5-P5336 remains PCIe 4.0, which will be a limiting factor on performance, especially the read throughput detailed above.
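The relationship between capacity, DWPD and total endurance can be checked with some quick arithmetic. The sketch below is illustrative only: the 0.6 DWPD value is the figure quoted later in this article for the D5-P5336, the five-year warranty period is an assumption, and the function name is our own.

```python
def total_writes_pb(capacity_tb: float, dwpd: float, warranty_years: float = 5.0) -> float:
    """Lifetime writes in petabytes: capacity x drive writes per day x days in the warranty period."""
    return capacity_tb * dwpd * 365 * warranty_years / 1000


DWPD = 0.6  # rating quoted in this article for the D5-P5336; assumed identical at both capacities

for capacity_tb in (61.44, 122.88):
    lifetime_pb = total_writes_pb(capacity_tb, DWPD)
    print(f"{capacity_tb} TB at {DWPD} DWPD: {lifetime_pb:.1f} PB written over the warranty period")
```

Doubling the capacity at a fixed DWPD doubles the lifetime writes, which is why the rating itself does not need to change.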
Phison
The Phison Pascari D205V was announced this week, also promising 122.88TB of capacity in a U.2 or E3.L form factor. The drive has a slightly higher power footprint than the Solidigm D5-P5336, but is based on PCIe 5.0, delivering (up to) 14.6GB/s of read throughput, 3.2GB/s of write throughput, with up to 3 million random read IOPS and 35,000 write IOPS.
The Phison SSD performs significantly better than the Solidigm device for read I/O, which we believe is due to the use of PCIe 5.0. In terms of raw bandwidth, PCIe 4.0 delivers 1.969 GB/s per lane, or just under 8GB/s for a x4 interface (around the maximum read capability we see with 4.0 drives). PCIe 5.0 doubles those numbers, which aligns with the performance capabilities we see with the Solidigm and Phison devices.
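For readers who want to reproduce the bandwidth figures, here is a minimal sketch. It assumes the standard transfer rates of 16 GT/s for PCIe 4.0 and 32 GT/s for PCIe 5.0, with 128b/130b line encoding, and ignores protocol overhead above the physical layer. The function name is our own.

```python
def pcie_gb_per_s(transfer_rate_gt: float, lanes: int = 1) -> float:
    """Usable bandwidth in GB/s for a 128b/130b-encoded PCIe link."""
    encoding_efficiency = 128 / 130  # 128 payload bits carried in every 130 bits on the wire
    return transfer_rate_gt * encoding_efficiency / 8 * lanes  # divide by 8 to convert bits to bytes


for generation, rate_gt in (("PCIe 4.0", 16.0), ("PCIe 5.0", 32.0)):
    per_lane = pcie_gb_per_s(rate_gt)
    x4 = pcie_gb_per_s(rate_gt, lanes=4)
    print(f"{generation}: {per_lane:.3f} GB/s per lane, {x4:.2f} GB/s for x4")
```

The x4 result for PCIe 4.0 sits just under 8GB/s, so the 7,400 MB/s quoted for the Solidigm drive is already close to the ceiling of the interface.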
Other Entrants
Western Digital has also demonstrated a 122TB drive, which will probably arrive as an update to the DC SN655. Samsung indicated the BM1743 SSD announced in June 2024 would accommodate a 128TB model in the future. Pure Storage is expected to deliver 150TB drives (for systems use only) by the end of 2024. It is clear that the jump to 100TB+ drives is here (and arguably was first done by Nimbus Data in 2018) as vendors seek to gain a greater segment of the market for high-performance and high-capacity storage.
Write Performance
While the throughput numbers are good (especially for PCIe 5.0 devices), we should be looking at the random I/O figures and specifically random write I/O as an indication of the future direction of the market. Both the Solidigm and Phison SSDs are heavily biased towards read I/O, a characteristic we highlighted with the Western Digital DC SN655 earlier this year.
The IOPS read/write ratio for the highest capacity DC SN655 is 30.7:1, while the D5-P5336 122TB is estimated at 37.2:1. The Phison Pascari drive is much worse at 85.7:1, but that number is skewed by the capability of PCIe 5.0, which means PCIe 5.0 enabled Solidigm and WD drives could also have much higher ratios.
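The ratios are simply random read IOPS divided by random write IOPS. The sketch below recomputes them from the figures quoted earlier in this article. The DC SN655 is left out because its raw IOPS numbers are not quoted here, and the function and variable names are our own.

```python
def rw_ratio(read_iops: int, write_iops: int) -> float:
    """Random read IOPS divided by random write IOPS."""
    return read_iops / write_iops


# (random read IOPS, random write IOPS) as quoted in this article
drives = {
    "Solidigm D5-P5336 61.44TB": (1_005_000, 38_000),
    "Solidigm D5-P5336 122.88TB": (930_000, 25_000),
    "Phison Pascari D205V 122.88TB": (3_000_000, 35_000),
}

for name, (read_iops, write_iops) in drives.items():
    print(f"{name}: {rw_ratio(read_iops, write_iops):.1f}:1")
```

Note that the ratio for the D5-P5336 worsens as capacity doubles, because write IOPS falls faster than read IOPS.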
The discrepancy between read and write performance is due to the use of QLC NAND, which has more complex write cycles than (for example) SLC media, a topic we covered in this article from 2017. Contrast the above numbers with the Solidigm D7-P5810 announced in 2023, which uses SLC NAND. This drive delivers 865,000 random read IOPS and 490,000 random write IOPS, a ratio of just 1.75:1. This drive also has latency data specifications included, which we don’t see on the high-capacity QLC drives (another differentiating factor).
Now, these high-capacity QLC drives are all marketed as being suited to “read intensive” workloads, including for the technology du jour – artificial intelligence. So, it is a reasonable approach to use drives like the D5-P5336 for applications with a bias towards random read I/O. But these drives only offer a relatively low level of endurance – 0.60 DWPD for the D5-P5336 compared to 50 DWPD for the D7-P5810 – and this has an effect on how we use these drives for general application workloads. (Side note: they probably also have higher latency, but that data isn’t usually quoted for capacity SSDs).
Hierarchy
The NAND flash and SSD market is evolving into a broader range of products, each of which has different characteristics. As the technology has evolved from SLC to MLC, TLC, and now QLC, there has been a trade-off in performance and endurance for gains in capacity. SSD vendors, including Solidigm, have created families of products which target different I/O profiles, such as low latency, high endurance or read-intensive workloads. Western Digital discussed this diversity in a recent online event (The New Era of NAND), declaring the end of the “layer race” to produce ever-denser products. In that presentation, the company discusses client SSDs, performance SSDs and capacity SSDs as the three main categories of products.
Essentially, as NAND has matured, the one-dimensional nature of flash we saw in the early 2010s is now a multi-dimensional range of products that, by necessity of design, must be applied to a range of disparate workloads. When did we see this scenario last? With the technology SSDs are looking to displace – the hard disk drive.
HDD Hierarchy
Looking back at the history of the hard disk drive, we can see an evolution from monolithic 5.25” models towards an increasing range of form factors, including 3.5” drives, 2.5” drives and even 1” models that were used in early Apple iPods. At the same time, the HDD market introduced faster spinning media, from 5.4K to 15K RPM.
The desire to increase drive areal density has also seen new recording technologies such as PMR, SMR, MAMR and now HAMR. Over time, the HDD market has transitioned purely into capacity products, with the performance segment yielded to SSDs.
I/O density is an issue we’ve been discussing for literally decades. The performance of hard drives has hardly changed over the last two decades, as the speed of reading data from a device is based on the rotational speed of the drive and the linear density of bits on each track. Totally random read I/O has a latency that is a function of the seek time (time taken to move the read head to the track to be read) plus half the time to rotate the disk to the right point to read data (on average). As a result, increasing drive capacities have reduced the I/O density.
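The latency model described above can be turned into a rough estimate of I/O density (random IOPS per TB). This is a sketch under stated assumptions: the 8.5ms average seek time and the three capacities are hypothetical values chosen for illustration, and transfer time and queuing effects are ignored.

```python
def random_read_iops(seek_ms: float, rpm: int) -> float:
    """Random read IOPS from average seek time plus half a rotation, as described in the text."""
    rotational_latency_ms = 0.5 * 60_000 / rpm  # half of one full rotation, in milliseconds
    return 1000 / (seek_ms + rotational_latency_ms)


def io_density(iops: float, capacity_tb: float) -> float:
    """IOPS available per TB of stored data."""
    return iops / capacity_tb


SEEK_MS = 8.5  # hypothetical average seek time for a 7,200 RPM capacity drive
iops = random_read_iops(SEEK_MS, 7200)

for capacity_tb in (4, 12, 24):  # hypothetical drive capacities
    print(f"{capacity_tb} TB drive: {iops:.0f} IOPS, {io_density(iops, capacity_tb):.1f} IOPS/TB")
```

Because the mechanical terms in the model do not change with capacity, every increase in drive size divides the same IOPS across more data.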
During the heyday of the HDD and during the initial introduction of SSDs, storage systems vendors built solutions that used multiple media capacities and speeds. This approach was taken to optimise the placement of data onto the most cost and performance-efficient location possible. “Spindle count” was a concern, as, for example, vendors replaced 146GB drives with 300GB models, effectively halving the I/O capability of a system for the same capacity of storage.
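The spindle count effect is easy to quantify. In the sketch below, the per-drive IOPS figure and the target capacity are assumptions chosen for illustration (the capacity is picked so that both drive sizes divide it exactly), and the helper name is our own.

```python
def drives_needed(capacity_gb: int, drive_gb: int) -> int:
    """Number of drives required to reach a usable capacity, rounded up."""
    return (capacity_gb + drive_gb - 1) // drive_gb


IOPS_PER_DRIVE = 180  # assumed random IOPS per spindle, identical for both drive sizes
TARGET_GB = 43_800    # hypothetical system capacity (300 x 146GB)

for drive_gb in (146, 300):
    count = drives_needed(TARGET_GB, drive_gb)
    print(f"{drive_gb}GB drives: {count} spindles, {count * IOPS_PER_DRIVE} IOPS in total")
```

With the same per-drive performance, moving to 300GB drives cuts the spindle count from 300 to 146 and the aggregate IOPS by roughly half.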
Multiple media capacities were managed via tiering, while caching smoothed out issues of write I/O performance. These challenges were mostly eliminated with the introduction of “monolithic” flash products where the read/write performance was symmetric. In fact, NetApp CEO Tom Georgens declared the death of tiering in an earnings call in 2010.
Choices
As solid-state disks begin to mimic the characteristics of hard drives, the storage industry is once again revisiting the challenges experienced with the increase in capacity of hard disk drives. We can see several scenarios playing out.
- Tiering returns – tiering comes back into products using a mix of fast and capacity flash. This approach isn’t desirable and is something Pure Storage, for example, is totally against. Tiering is a compromise and introduces wasted I/O spent moving data between layers of media on some reactive assumption of future performance requirements.
- Architect better storage systems – another approach already in play is to build systems capable of exploiting new media more effectively. VAST Data achieved that with its Data Platform, wide striping read I/O and landing all write I/O onto fast media (initially Optane, now SLC NAND). StorONE is another example of a vendor that designed for new media, using a similar approach to manage read and write I/O independently. Infinidat, as yet another example, built an architecture with InfiniBox to mitigate the issues of slow hard drives that is equally applicable to large-capacity flash drives.
- Architect better storage media – another solution is to build tiering capabilities into the media itself. This process already occurs within devices that have some SLC capacity. However, the capability could be extended to create a variable SLC area or use NVMe namespaces to expose multiple media types to the storage system host. We believe that Pure Storage must be doing some kind of media management with Direct Flash Modules that is an analogue to these suggestions.
- Use new media – we could replace NAND flash altogether. Intel tried this approach with 3D XPoint and the Optane brand, but it wasn’t a success. There are many other persistent memory technologies on the market or on the horizon, but none so far that have the capacity and cost benefits of flash (remember the consumer market demand for flash made it possible to adopt it for the enterprise).
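To make the second scenario in the list above more concrete, here is a toy model of landing writes on fast media and destaging them to capacity media in larger batches. It is purely illustrative and is not a description of how any vendor named above implements its product. The class, its parameters and the stripe size are all hypothetical.

```python
class StagedStore:
    """Toy model: writes land on a small fast tier and are destaged to capacity media in batches."""

    def __init__(self, stripe_blocks: int = 4):
        self.stripe_blocks = stripe_blocks  # blocks collected before one destage (hypothetical value)
        self.fast = {}            # block id -> data, stands in for SLC or similar fast media
        self.capacity = {}        # block id -> data, stands in for QLC capacity media
        self.capacity_writes = 0  # number of batched writes issued to the capacity media

    def write(self, block_id: int, data: bytes) -> None:
        # Overwrites of a block still held in the fast tier never reach the capacity media.
        self.fast[block_id] = data
        if len(self.fast) >= self.stripe_blocks:
            self._destage()

    def _destage(self) -> None:
        # Move the whole batch in one operation, then free the fast tier.
        self.capacity.update(self.fast)
        self.fast.clear()
        self.capacity_writes += 1

    def read(self, block_id: int) -> bytes:
        # The newest copy lives in the fast tier if the block has not been destaged yet.
        if block_id in self.fast:
            return self.fast[block_id]
        return self.capacity[block_id]


store = StagedStore(stripe_blocks=4)
for block_id in range(8):
    store.write(block_id, b"payload")
print(f"8 host writes became {store.capacity_writes} writes to capacity media")
```

The point of the design is that the QLC media sees a small number of large writes rather than many small random ones, which plays to its strengths.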
Another option, which is probably part of the tiering option, is to use software such as the Cloud Storage Acceleration Layer (CSAL) announced by Solidigm in September 2023. CSAL effectively exposes the flash translation layer (FTL) to the host through the use of a virtual device driver, creating a logical aggregate drive. We discussed the concept at the time in this post.
The Architect’s View®
They say that there is nothing new in technology and all new ideas are just recycled old ones. While that is a gross generalisation, it is clear that the issues of scaling experienced in the HDD market will, once again, re-occur with NAND flash. There will not be one storage solution to fit all performance and capacity requirements.
With all our experience and the capabilities of modern server platforms, we believe there should be no need for a return to tiering. Vendors introducing the tiering model into their flash platforms will simply be taking the easy route and reverting to past behaviour.
While we’ve seen 30TB drives being used within storage systems, to our knowledge, no vendor has used 61TB drives. This will be an area to watch closely and understand how vendors will make these SSDs work in large-scale systems. We will cover this topic in more detail in our 2025 predictions, but for now, it looks like we should expect a range of vendor solutions to the “problem” of high-capacity NAND flash media.
Copyright (c) 2007-2024 – Post #2dc1 – Brookend Limited, first published on https://www.architecting.it/blog, do not reproduce without permission.

