AWS Graviton processors have improved steadily across generations, with each iteration delivering advances in computational performance, price performance, energy efficiency, and memory capacity. Today, Amazon announced the general availability of the new M9g and M9gd instances of its Elastic Compute Cloud (EC2) for general-purpose workloads. These are the first Amazon products powered by Graviton5, the latest generation of Amazon’s CPU.
After five generations of custom silicon and eight years of continuous investment, Graviton powers over 350 instance types that are suitable for workloads including web applications, microservices, analytics, databases, machine learning inference, electronic design automation, gaming, video encoding, and agentic AI.
Graviton5 doubles the number of cores from Graviton4, from 96 to 192, and it supports DDR5-8800 memory and the latest PCIe Gen6 interconnects. We’ve worked closely with leading DRAM manufacturers to reach the DDR5-8800 level of performance, and AWS Graviton instances deliver the fastest memory of any processor instance in the cloud.
With Graviton5, Amazon also moved to a three-nanometer process, enabling greater circuit density and faster on-chip communication. Not only does Graviton5 pack in more cores than Graviton4, but each of those cores offers 25% better performance.
We’ve said for a while that microbenchmarks are very different from large, real-life workloads. We design for our customers’ actual workloads: not small loops, but all the code and complexity of a real application such as a database.
To execute code quickly, modern processors predict the outcomes of branches that arise from control flow in programs and speculatively execute the predicted paths. The Neoverse V3 core used in Graviton5, co-defined by Arm and Amazon’s Annapurna Labs, substantially improves the branch prediction capability of the CPU, which in turn lets it run real applications such as databases up to 30% faster.
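To make the idea of branch prediction concrete, here is a minimal Python sketch of a classic two-bit saturating-counter predictor. This is a textbook scheme chosen purely for illustration; it is not a description of the Neoverse V3 predictor, whose design is not covered here. The function name and the branch pattern are hypothetical.

```python
from typing import Iterable


def count_correct_predictions(outcomes: Iterable[bool], initial_state: int = 2) -> int:
    """Simulate a textbook two-bit saturating counter for a single branch.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    This is an illustrative model only, not the Graviton5 predictor.
    """
    state = initial_state
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# Hypothetical loop branch: taken nine times, then falls through once,
# with the whole loop executed ten times.
loop_branch = ([True] * 9 + [False]) * 10
hits = count_correct_predictions(loop_branch)
print(f"{hits} of {len(loop_branch)} predictions correct")
```

Each misprediction forces the processor to discard speculatively executed work, which is why better prediction matters most on large applications with complex control flow.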
A CPU’s DRAM can be about 100 nanoseconds away. That doesn’t sound like a lot, but for a CPU that runs at 3.3 gigahertz, a single memory access at that latency costs about 330 clock cycles. CPUs use caches to bring data closer to the cores, and when a request can be fulfilled from one of these caches, the CPU doesn’t have to wait for the full DRAM latency. Graviton5 has 64-kilobyte first-level caches, two-megabyte second-level caches, and 192 megabytes of level-three cache, more than five times as much as the previous generation of Graviton.
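The cycle count above is simply latency multiplied by clock frequency. The short Python sketch below reproduces that arithmetic and adds a deliberately simplified average-latency model to show why cache hits matter. The hit rate and cache latency passed to the second function are hypothetical placeholders, not Graviton5 measurements.

```python
def latency_in_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Convert a latency in nanoseconds to clock cycles.

    One gigahertz is one cycle per nanosecond, so cycles = ns * GHz.
    """
    return latency_ns * clock_ghz


def average_latency_ns(hit_rate: float, cache_ns: float, dram_ns: float) -> float:
    """Simplified model: a hit costs cache_ns, a miss costs dram_ns."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * dram_ns


# Figures from the text: roughly 100 ns to DRAM at 3.3 GHz.
print(round(latency_in_cycles(100.0, 3.3)), "cycles per DRAM access")

# Hypothetical inputs for illustration only: 90% hit rate, 10 ns cache latency.
print(round(average_latency_ns(0.9, 10.0, 100.0), 1), "ns average latency")
```

Under this model, every additional request served from cache removes a full DRAM round trip from the average, which is the motivation for a larger level-three cache.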
Graviton3 was the first Graviton CPU to adopt a chiplet architecture, using seven dies across cores, DRAM controllers, and PCIe controllers. Graviton4 followed the same architecture as Graviton3, with a few refinements.
In Graviton5, however, we’ve changed the architecture substantially. The 192 cores are split across four chiplets, each containing DRAM controllers, PCIe controllers, and 48 cores. Custom die-to-die connectivity provides up to 420 gigabytes per second of bandwidth between chiplets, minimizing latencies between cores in the mesh. There is no longer a separate I/O die or a separate DRAM controller die. This organization allows us to configure two or four nonuniform-memory-access (NUMA) regions per chip and to partition the L3 cache to match the size of the virtual machines (VMs) running on the CPU, while reducing memory latency for VMs with 48 or fewer cores.
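The layout arithmetic behind these figures can be sketched in a few lines of Python. The core and chiplet counts come from the text; the function names are hypothetical, and the sketch makes no claim about how VMs are actually placed on the chip. It only shows why a VM with 48 or fewer cores can, in principle, fit within a single chiplet and its local DRAM controllers.

```python
TOTAL_CORES = 192  # from the text
CHIPLETS = 4       # from the text
CORES_PER_CHIPLET = TOTAL_CORES // CHIPLETS  # 48


def cores_per_numa_region(regions: int) -> int:
    """Cores in each NUMA region for the two configurations the text mentions."""
    if regions not in (2, 4):
        raise ValueError("the text describes two or four NUMA regions per chip")
    return TOTAL_CORES // regions


def min_chiplets_for_vm(vm_cores: int) -> int:
    """Smallest number of chiplets whose cores can hold a VM of this size.

    Pure arithmetic (ceiling division); real placement policy is not modeled.
    """
    if not 1 <= vm_cores <= TOTAL_CORES:
        raise ValueError("vm_cores must be between 1 and 192")
    return -(-vm_cores // CORES_PER_CHIPLET)


for size in (16, 48, 96, 192):
    print(f"{size}-core VM needs at least {min_chiplets_for_vm(size)} chiplet(s)")
```

A VM that fits on one chiplet never needs to cross a die-to-die link to reach its memory, which is consistent with the lower latency described for VMs with 48 or fewer cores.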
With these enhancements, Graviton5-based instances offer up to 25% better computational performance than Graviton4-based instances, with up to 35% faster performance for web applications, up to 35% faster for machine learning inference, and up to 30% faster for databases.
The M9g and M9gd instances powered by Graviton5 also raise the bar on security with the introduction of the Nitro Isolation Engine. The Nitro Isolation Engine is an enhancement to the Nitro System that enforces isolation of instances and uses formal verification to provide assurances of isolation with mathematical precision. It is a purpose-built component responsible for enforcing isolation between VMs, including mediating all access to VM memory, CPU register state, and I/O devices through a minimal set of APIs. Formal verification is a technique for mathematically demonstrating that hardware or software behaves as intended, not just in specific test cases. This intensive verification establishes Nitro as the first formally verified cloud hypervisor, pioneering a new standard for mathematically proven cloud security. To learn more about the Nitro Isolation Engine, read the Amazon Science blog post.