Arm Architectures in the Public Cloud

Chris Evans | Cloud, Enterprise, Opinion, Processing | Practice: CPU & System Architecture, Processors

Two years ago, in January 2020, we asked whether Arm processors were ready for the enterprise data centre.  In this post, we look at what’s changed since then and how the balance of power could shift between Intel, AMD and Arm.

Background

The initial discussion justifying Arm in the enterprise was prompted by the announcement of Graviton2 processors by AWS at re:Invent in December 2019.  The original Graviton design was first announced in November 2018.  AWS had already developed the Nitro hardware ecosystem, and the team responsible, Annapurna Labs (acquired 2015), looked at the possibility of developing SoCs (systems-on-chip).

Graviton Evolution

Graviton1 is based on Cortex-A72 cores, whereas Graviton2 moved to the Neoverse N1 architecture based on Cortex-A76.  Graviton3 extends the capabilities of the N1 architecture (although AWS hasn’t expressly indicated what’s under the covers), adding DDR5 and PCIe 5.0 support and a performance improvement of more than 25% over Graviton2 (AWS numbers).

The Graviton designs are SoCs, or systems-on-chip, meaning the processor cores, memory controllers and I/O components are integrated into a single package.  This integration generally delivers better performance than designs that spread those components across a motherboard.  In image 1, you can see the physical layout of the chip, which has 64 cores and associated L3 cache, DDR5 memory controllers along the outer edges and two PCIe 5.0 controllers below.

Image 1

Internally, Graviton3 has architectural improvements over Graviton2, including double the SIMD capability, support for bfloat16 data, and generally increased processing capabilities.  There’s much debate in the industry as to whether Graviton3 is based on Neoverse N2 or V1 cores.  What’s clear is that each generation of Graviton introduces faster processing, faster memory and faster I/O. 

Why Arm?

As we’ve seen with the Apple transition to M1, Arm processors (or, more specifically, SoCs) offer better power/performance efficiency compared with traditional x86 processors.  In Apple’s case, having complete control over the hardware design to the processor level enables the company to deliver laptops and desktops that consume less power or last longer over the course of a working day.  Apple designs use a mix of performance and efficiency cores to create a hybrid architecture that can deliver a boost when needed but generally operates in an efficiency mode.

In the data centre, application requirements are somewhat different.  A mix of efficiency and performance cores makes little economic or practical sense.  Servers already consume significant power and run mixed workloads, potentially from multiple customers, so assigning each workload to the appropriate type of core would be a complex task (and could raise security concerns).  In contrast, desktop and portable systems are owned by a single user.  However, the flexibility in design offered by SoCs using Arm means cloud vendors like AWS have the opportunity to build solutions optimised very specifically for multiple workload profiles.

Graviton3 Instances

AWS has packaged Graviton3 in a form factor supporting three SoCs per server, with one Nitro card serving all three simultaneously.  This design offers greater rack density and the opportunity to use all the space available in a 42U rack.  C7g instances, powered by Graviton3, are aimed at compute-intensive workloads and offer a claimed 25% performance increase over Graviton2-based C6g instances.

For the end-user, widespread Linux support means moving to Graviton3 instances should be relatively straightforward.  The biggest challenge for developers is to determine the relative performance improvements compared to x86 and ensure applications take full advantage of the capabilities available.
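
One way to approach that comparison is to run the same workload on an x86 instance and a Graviton instance and compare the timings.  The following is a minimal sketch in Python, not a definitive benchmark method: the function names and the sample workload are hypothetical, and a real test would substitute representative application code.

```python
import platform
import time


def normalized_arch(machine=None):
    """Map the platform's machine string to 'arm64', 'x86_64' or 'other'."""
    name = (machine or platform.machine()).lower()
    if name in ("aarch64", "arm64"):
        return "arm64"
    if name in ("x86_64", "amd64"):
        return "x86_64"
    return "other"


def best_time(workload, repeats=5):
    """Run workload() several times; return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return min(timings)


def relative_improvement(baseline_seconds, candidate_seconds):
    """Fractional speed-up of the candidate over the baseline (0.25 means 25% faster)."""
    return baseline_seconds / candidate_seconds - 1.0


def sample_workload():
    # Hypothetical stand-in for real application code.
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    # Run this on each instance type, then compare the recorded timings.
    print(normalized_arch(), f"{best_time(sample_workload):.4f}s")
```

Running the script on each instance type and passing the two timings to relative_improvement gives a figure that can be set against vendor claims, although a synthetic loop like this one says little about how a real application will behave.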

Arm-based Services

However, for AWS, there’s a bigger story to tell.  As we discussed in this post from March 2021, EC2 instances are one building block for more complex services, specifically managed databases and analytics.  Graviton2 is already supported on Amazon DocumentDB, Aurora, RDS, ElastiCache, MemoryDB and Neptune.  Analytics application support includes OpenSearch and EMR.  Graviton2 can be used for Fargate and Lambda serverless applications.  The recently announced FSx for OpenZFS also uses Graviton2.

In many cases, these managed services default to Graviton2 unless the customer specifies otherwise.  We can also assume that many back-end services use Graviton2 instances.

AWS Future

What does the adoption of Arm mean for AWS?  Two years ago, the use of Arm and Graviton seemed tactical.  Today, with the rapid improvement in performance, the move by AWS appears to be more about control and the ability to highly optimise their ecosystem for cost, space, power and cooling.

As usual, AWS has played a long game.  Nitro offloads I/O-intensive tasks like networking and storage.  Graviton has enabled AWS to bring new platform components like DDR5 and PCIe 5.0 to EC2 sooner than if it had waited for Intel or AMD to deliver the transition on its behalf.  We’re now seeing the benefits of this long-term strategy, which still has some way to go.

Cloud Future

What about the other public cloud vendors?  Microsoft Azure is rumoured to be developing custom chips, but nothing has emerged yet.  Google hired Uri Frank in March 2021 (announcement here) to drive the internal development of new custom chip designs.  So far though, we’re no closer to understanding what those products might be.  Elsewhere, Oracle Cloud has instances based on the Ampere Altra processor, but that seems to be the only other implementation in the market.  So AWS remains the leader, and so far the only major player, in the widespread transition to Arm.

The Architect’s View™

In November 2020, we highlighted Apple’s move to custom silicon based on Arm designs.  At that point, AWS’ strategy was well underway, but the Apple announcement underscored the performance capabilities of Arm compared to the received wisdom from ten years ago.  Arm is no longer the slower, more efficient option.

Server architectures are being deconstructed, with SmartNICs offloading tasks that would usually be handled by the central CPU.  Arm-based instances will be able to handle general application workloads more efficiently than x86 solutions.  Graviton3 demonstrates an increased capability to manage AI-intensive tasks.  Vendors such as NVIDIA are looking to compete with hybrid designs like Grace.

At the outset of cloud computing, the dominance of the x86 architecture wasn’t in doubt.  However, as we move further into the 2020s, Intel and AMD have a battle on their hands.  Both of these vendors need new designs to compete with those from Arm.  However, one of the biggest hurdles for the x86 vendors to overcome could be the very nature of the way these solutions are packaged.  Arm licenses designs for customers to build themselves.  There’s no licensing model (yet) for x86 that would enable AWS or other cloud providers to build custom designs.  As a result, cloud providers using Arm can move at their own pace rather than at the release cadence of Intel or AMD.  We’ve seen this already: AWS has released three versions of Graviton in three years.

Although the public cloud remains very hardware-focused, we believe that in the long term, customers will not care about the underlying processor architecture for general computing.  Instead, the efficiency and cost of executing application code will be of greater importance.  AWS is chipping away at x86 dominance, and other cloud vendors are likely to follow suit.  The next decade is going to be very challenging for Intel and AMD unless one or both companies can catch up very quickly.


Copyright (c) 2007-2022 – Post #ade6 – Brookend Ltd, first published on https://www.architecting.it/blog, do not reproduce without permission.