QLC NAND – What can we expect from the technology?

Chris Evans

All-Flash Storage, NVMe

Since NAND flash storage was first introduced into enterprise computing, we’ve seen a rapid explosion in the types and capabilities of flash products that can be deployed in servers, HCI solutions and storage arrays.  QLC NAND is the next evolution of cost-reducing, capacity-increasing flash technology.  What is it and what can we expect it to deliver?

Update: This post was originally published on 11 August 2017.  It has since been updated to reflect the advances in QLC technology in the market.

QLC 101

Figure: Endurance threshold for NAND flash

Flash storage works by storing an electrical charge in a cell and using the presence of this charge to determine whether the cell is storing a “0” or a “1”.  Flash devices lay down billions of cells on a silicon substrate that can then be used to store gigabytes and now terabytes of information.  The original flash design stored a single bit in one cell, otherwise known as SLC or Single Level Cell technology.

Flash developers soon worked out that a cell could store multiple states by having a range of voltages in each cell and so MLC (Multi-Level Cell) was developed, with each cell storing four states and therefore capable of recording two bits of binary data – 00, 01, 10 and 11.  TLC (Triple Level Cell) extends this to eight states and values from 000 to 111.  QLC (Quad Level Cell) goes a step further, doubling the states to 16 and adding the ability to store an additional bit.  Now we can represent values from 0000 to 1111.
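The relationship between bits per cell and the number of charge states is simply a power of two.  The short Python sketch below illustrates it; the function and dictionary names are made up for this example.

```python
# Bits stored per cell for each NAND type described above.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def cell_states(bits_per_cell: int) -> list[str]:
    """Return every bit pattern a cell with the given bit count can represent."""
    return [format(value, f"0{bits_per_cell}b") for value in range(2 ** bits_per_cell)]


for name, bits in CELL_TYPES.items():
    states = cell_states(bits)
    # Each extra bit doubles the number of voltage levels the cell must distinguish.
    print(f"{name}: {bits} bit(s), {len(states)} states, {states[0]} to {states[-1]}")
```

Each added bit doubles the number of voltage levels that have to fit into the same cell, which is where the endurance and performance challenges described next come from.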

Increasing Density

There are a number of challenges that arise with increasing the bit density per cell.  First, writing to flash is a destructive process, with the integrity of each cell being slightly damaged each time it is written.  This means that flash has a finite lifetime for writes – a characteristic called endurance.  Endurance is measured in the number of P/E (program/erase) cycles that can be performed on the NAND flash itself.  Cells are not programmed individually: they are written in pages and erased in larger blocks, requiring some clever algorithms and software to manage updates.  This is one reason why flash is over-provisioned.

P/E cycles for SLC were around 100,000, MLC around 10,000 and TLC around 1,000, although vendors have since improved on these figures.  Each generation was roughly an order of magnitude worse than the last, so for QLC we were expecting figures of around 100.  However, manufacturers have improved the resilience and we now see around 3,000 P/E cycles for TLC and 1,000 for QLC (figures taken from Micron Reviewers’ Day, August 2018).
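To get a feel for what these P/E figures mean in practice, the sketch below turns them into a rough lifetime write budget.  It is a back-of-the-envelope estimate only: it assumes perfect wear levelling, and the write amplification value is a hypothetical parameter chosen for illustration, not a vendor figure.

```python
# Approximate P/E cycle figures quoted in the text (SLC/MLC original estimates,
# TLC/QLC as reported in August 2018).
PE_CYCLES = {"SLC": 100_000, "MLC": 10_000, "TLC": 3_000, "QLC": 1_000}


def lifetime_writes_tb(capacity_tb: float, pe_cycles: int,
                       write_amplification: float = 1.0) -> float:
    """Estimate host writes (TB) before the flash reaches its P/E limit.

    Assumes ideal wear levelling. write_amplification is the ratio of
    flash writes to host writes (1.0 means no amplification).
    """
    return capacity_tb * pe_cycles / write_amplification


for name, cycles in PE_CYCLES.items():
    # 2.0 is an illustrative write amplification factor, not a measured value.
    budget = lifetime_writes_tb(1.0, cycles, write_amplification=2.0)
    print(f"{name}: ~{budget:,.0f} TB of host writes per 1 TB of flash")
```

Real drives will differ, because over-provisioning, controller behaviour and workload all change the effective write amplification.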

Second, as more data is stored in each cell, the contents must be read before writing, because changing a single bit requires knowing what value was already present.  A further effect is the change in voltage required to jump between bit states.  This can cause instability in surrounding cells, so vendors use multiple steps to program QLC (as explained by Toshiba at their stand).  As a result, writes have become progressively slower with each move from SLC to MLC to TLC.  QLC is slower again, which has more impact on latency than on throughput.

Why QLC?

It’s a simple question: why develop another, less reliable medium? Surely what we have is good enough?  As with any storage technology, there’s a desire to condense more data into a smaller space at a lower cost.  Look at the improvements the hard drive manufacturers have been making for years.  The same applies to flash.  The move from TLC to QLC gains 1/3 more capacity for the same number of cells, allowing for greater densities and larger capacity drives.  The industry is driving down cost and increasing capacity every year with a range of technologies, many of which work together.  TLC, for example, has been combined with 3D-NAND and we’re seeing the same for QLC.
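The 1/3 figure comes straight from the bit counts: going from three bits per cell to four adds one bit on a base of three.  As a quick illustration (the function name is made up for this sketch), the same arithmetic applied to each generation shows the gain shrinking at every step:

```python
from fractions import Fraction


def capacity_gain(bits_before: int, bits_after: int) -> Fraction:
    """Relative capacity gained for the same number of cells."""
    return Fraction(bits_after - bits_before, bits_before)


steps = [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]
# Pair each generation with its successor and print the relative gain.
for (old_name, old_bits), (new_name, new_bits) in zip(steps, steps[1:]):
    gain = capacity_gain(old_bits, new_bits)
    print(f"{old_name} -> {new_name}: +{gain} ({float(gain):.0%}) capacity")
```

So each additional bit per cell buys proportionally less capacity than the one before it, while the endurance penalty described earlier keeps growing.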

The storage industry has also built diversification into storage media.  In the HDD market, we have performance and capacity drives, small format (2.5″ and 1.8″) and a range of cheaper solutions that use lower specification interfaces like SATA.  The same applies to flash.  As the technology has developed, we have seen a divergence of offerings covering high and low endurance, high and low performance and of course ranges in capacity.  With each of these, there’s a cost profile that matches the requirements to the price.  As an example, look at the latest Nytro 3000 drives from Seagate.  There are 43 models with varying capacity, endurance and features such as encryption.

The main issue with QLC will be reduced endurance (alongside the slower writes already mentioned), and that may be reflected in how the technology is used.

Who’s Doing QLC?

Both Toshiba and Western Digital have already announced the development of QLC (see related links below), based on 64-layer 3D-NAND using their BiCS3 technology (BiCS is Bit Cost Scaling, Toshiba’s name for 3D-NAND).  This promises 768Gb (96GB) die capacity, which translates to 1.5TB in a chip when combining 16 die into a single package.  The manufacturing process for these chips has likely been relaxed from the 1Xnm process seen in 2D planar NAND to something more like 40nm, in order to improve reliability.  We can then expect to see this shrink to increase capacities and, with BiCS4, an increase to 96 layers.
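The die and package figures are easy to mix up because die capacity is quoted in gigabits (Gb) while drive capacity is quoted in gigabytes (GB).  A minimal sketch of the conversion, using illustrative function names:

```python
def gigabits_to_gigabytes(gigabits: float) -> float:
    """Convert a die capacity in gigabits (Gb) to gigabytes (GB)."""
    return gigabits / 8


def package_capacity_gb(die_gigabits: float, dies_per_package: int) -> float:
    """Capacity in GB of a package built from identical stacked die."""
    return gigabits_to_gigabytes(die_gigabits) * dies_per_package


die_gb = gigabits_to_gigabytes(768)        # 768Gb die -> 96 GB
package_gb = package_capacity_gb(768, 16)  # 16 die per package -> 1536 GB
# Dividing by 1024 here to express the package as 1.5TB, as in the text.
print(f"{die_gb:.0f} GB per die, {package_gb:.0f} GB "
      f"({package_gb / 1024:.1f} TB) per 16-die package")
```

The same function can be applied to other die sizes and stack heights, although how many packages fit into a finished drive depends on the form factor and is not covered here.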

Samsung has pre-announced a 1Tb die with the potential to turn their QLC product into a 128TB drive, stacking 32 die per package.  Further details seem scarce, but hopefully we will start to see more on performance and endurance.  SK Hynix doesn’t appear to be jumping on QLC just yet, while Micron is apparently developing QLC using 64-layer technology, but I can’t find any more specific details at the moment.

Update: The previous paragraphs have been left in for context.  The next paragraphs provide an update as of August 2018 and announcements at Flash Memory Summit 2018.

Micron announced a QLC enterprise drive, the ION 5210, in May 2018.  General availability was subsequently announced in November 2018.  The drive will sit alongside existing TLC products, providing a lower-cost read-intensive option aimed at read/write ratios of 70/30 and above.  More details are available on this separate post.  We also spoke to Steve Hanna on a recent edition of Storage Unpacked.  The podcast is available here.

Samsung has announced it is manufacturing 4TB QLC consumer SSDs using 1Tb V-NAND chips.  The press release claims that QLC will be no slower than TLC, by using something called TurboWrite technology.  Products are expected later in 2018.

Intel also has a consumer device, based on the M.2 2280 format.  The SSD 660p has capacities of 512GB, 1TB and 2TB, with a relatively low endurance rating of 0.1 DWPD (drive writes per day), but remember this is a consumer product.  No doubt we will see an enterprise version in time.
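To put 0.1 DWPD into perspective, the sketch below converts it into a daily write allowance and a total write budget.  The five-year period is an assumption made for illustration rather than a figure from the product specification, and capacities are treated as decimal gigabytes.

```python
def daily_writes_gb(capacity_gb: float, dwpd: float) -> float:
    """Host writes per day (GB) permitted by a DWPD rating."""
    return capacity_gb * dwpd


def total_writes_tb(capacity_gb: float, dwpd: float, years: float) -> float:
    """Total host writes (TB) over the given period at the rated DWPD."""
    return daily_writes_gb(capacity_gb, dwpd) * 365 * years / 1000


ASSUMED_YEARS = 5  # hypothetical rating period, for illustration only

for capacity_gb in (512, 1000, 2000):
    per_day = daily_writes_gb(capacity_gb, 0.1)
    total = total_writes_tb(capacity_gb, 0.1, ASSUMED_YEARS)
    print(f"{capacity_gb} GB drive: ~{per_day:.0f} GB/day, "
          f"~{total:.0f} TB over {ASSUMED_YEARS} years")
```

For a typical desktop or laptop workload a daily allowance of this size is rarely a constraint, which is why a low DWPD rating is acceptable in a consumer product.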

Toshiba announced a 96-layer QLC BiCS 1.33Tb die at FMS; however, this isn’t currently shipping as a product.  Theoretically this could produce QLC SSDs with around 42TB of capacity (before over-provisioning).

The Architect’s View™

The range of NAND flash products is overwhelming as the industry races to produce ever faster, cheaper and more reliable products.  With such a range of flash to choose from, we’re going to see the emergence of more “hybrid all-flash” platforms that will once again take us into the tiering scenarios of previous shared storage.  It will be interesting to see what developments come from using high endurance flash as the cache layer and QLC as the capacity layer.  SDS solutions like VMware Virtual SAN already do this today, but perhaps not at as granular a level as we could start to see in shared storage platforms.

While capacity hard drives will continue for some time yet, QLC flash looks to squeeze out 10K drives, leaving 7.2K RPM bulk media as the main product line for the HDD manufacturers.  Existing 15K & 10K drives will probably become niche products, reserved for price-conscious solutions where flash is still too expensive.

What’s next after QLC?  PLC perhaps (pentuple – quintuple would be confusing)?  I suspect we’ve a way to go with the technology we have, so let’s see how TLC and QLC pan out first, rather than getting too far ahead of ourselves!

Related Links

Copyright (c) 2009-2022 – Post #3EA2 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.