AMD Zen 3 Architecture + SoC Design
AMD Zen 3 Architecture
Codenamed Vermeer, Zen 3 is the next evolution of the Zen architecture, delivering a 19% increase in instructions per clock (IPC) through these improvements:
- Faster fetching, especially for branchy and large-footprint code
  - L1 branch target buffer doubled in size to 1024 entries for better prediction latency
  - Improved branch predictor bandwidth
  - Faster recovery from misprediction
  - “No bubble” prediction capabilities to make back-to-back predictions more quickly and better handle branchy code
  - Faster sequencing of op-cache fetches
  - Finer granularity in switching of op-cache pipes
- Reduce latency and enlarge structures to extract higher instruction-level parallelism (ILP)
  - New dedicated branch and st-data pickers for integer, now at 10 issues per cycle (+3 vs. Zen 2)
  - Larger integer window at +32 vs. Zen 2
  - Reduced latency for select float and int operations
  - Floating point has increased bandwidth by +2 for a total of 6-wide dispatch and issue
  - Floating point FMAC is now 1 cycle faster
- Larger structures and better prefetching to support the enhanced execution engine bandwidth
  - Overall higher bandwidth to feed the appetite of the larger/faster execution resources
  - Higher load bandwidth vs. Zen 2 by +1
  - Higher store bandwidth vs. Zen 2 by +1
  - More flexibility in load/store operations
  - Improved memory dependence detection
  - +4 table walkers in the TLB
- Reduce dependency on main memory accesses, reduce core-to-core latency, reduce core-to-cache latency
  - Unify all cores in a CCD into a single unified complex consisting of 4, 6, or 8 contiguous cores
  - Unify all L3 cache in a CCD into a single contiguous element of up to 32MB
  - Rearchitect core/cache communication into a ring system
AMD Zen 3 SoC Design
In addition to microarchitectural improvements, Zen 3 (Vermeer) also features SoC design changes.
In Zen 2, each CCD (Compute Die) is made up of two CCX (core complexes), each with a 16 MB L3 cache.
Zen 3 uses a unified complex, in which each CCD now contains a single CCX with a unified 32 MB L3 cache.
This unified CCD design eliminates CCX-to-CCX communication, greatly improving core-to-core latency.
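To make the CCX change concrete, here is a rough sketch that counts how many core pairs in one CCD share an L3 cache under each layout. It assumes a fully enabled 8-core CCD, with Zen 2 splitting it into two 4-core CCXs; the function name `shared_l3_pairs` is purely illustrative and not taken from any AMD tool.

```python
from itertools import combinations


def shared_l3_pairs(ccx_sizes):
    """Return (pairs sharing an L3, total core pairs) for one CCD.

    ccx_sizes lists the core count of each CCX in the CCD.
    Assumption: two cores share an L3 only if they sit in the same CCX.
    """
    # Tag every core with the index of the CCX it belongs to.
    core_to_ccx = [ccx for ccx, size in enumerate(ccx_sizes) for _ in range(size)]
    pairs = list(combinations(range(len(core_to_ccx)), 2))
    same_ccx = sum(1 for a, b in pairs if core_to_ccx[a] == core_to_ccx[b])
    return same_ccx, len(pairs)


zen2_shared, zen2_total = shared_l3_pairs([4, 4])  # two CCXs, 16 MB L3 each
zen3_shared, zen3_total = shared_l3_pairs([8])     # one CCX, 32 MB L3

print(f"Zen 2: {zen2_shared} of {zen2_total} core pairs share an L3")
print(f"Zen 3: {zen3_shared} of {zen3_total} core pairs share an L3")
```

Under these assumptions, only 12 of the 28 possible core pairs in a Zen 2 CCD share an L3 cache, while all 28 do in a Zen 3 CCD. This is only a counting exercise; it says nothing about the actual latency in nanoseconds.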
Zen 3 retains the chiplet design of Zen 2, with one or two CCDs (fabricated on 7 nm) paired with a 12 nm IOD (I/O Die).
Read bandwidth between the CCD and the IOD is still twice the write bandwidth, to conserve die area and transistor budget. Zen 3 also reuses the same IOD as Matisse (Zen 2).
The new Zen 3 CCD has 4.15 billion transistors, with a die size of 80.7 mm². The Matisse-era IOD remains the same – 2.09 billion transistors, with a die size of 125 mm².
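As a quick worked example, the die figures above can be turned into transistor densities and per-package totals. This is a minimal sketch using only the numbers quoted in this article; the helper names are hypothetical, and the package totals simply add up the dies without accounting for anything else on the package.

```python
# Figures quoted in the article.
CCD = {"transistors": 4.15e9, "area_mm2": 80.7}   # Zen 3 CCD (7 nm)
IOD = {"transistors": 2.09e9, "area_mm2": 125.0}  # Matisse-era IOD (12 nm)


def density_mtr_per_mm2(die):
    """Transistor density in millions of transistors per mm²."""
    return die["transistors"] / die["area_mm2"] / 1e6


def package_totals(ccd_count):
    """Sum transistors and die area for one IOD plus one or two CCDs."""
    if ccd_count not in (1, 2):
        raise ValueError("the article describes packages with one or two CCDs")
    transistors = ccd_count * CCD["transistors"] + IOD["transistors"]
    area_mm2 = ccd_count * CCD["area_mm2"] + IOD["area_mm2"]
    return transistors, area_mm2


print(f"CCD density: {density_mtr_per_mm2(CCD):.2f} MTr/mm²")
print(f"IOD density: {density_mtr_per_mm2(IOD):.2f} MTr/mm²")
for count in (1, 2):
    transistors, area = package_totals(count)
    print(f"{count} CCD + IOD: {transistors / 1e9:.2f} billion transistors, {area:.1f} mm²")
```

The CCD works out to roughly three times the density of the IOD, which is consistent with the 7 nm versus 12 nm split described above.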
AMD Ryzen 7 5800X Benchmarking Notes
In this review, we will take a look at the content creation and gaming performance of the AMD Ryzen 7 5800X, comparing it to six other processors:
- AMD Ryzen 7 3700X
- AMD Ryzen 7 2700X
- Intel Core i7-8700K
- AMD Ryzen 5 5600X
- AMD Ryzen 5 2600X
- AMD Ryzen 3 3300X
|Processor||Cores / Threads||Base Clock||Boost Clock||L2 Cache||L3 Cache||Memory|
|AMD Ryzen 7 5800X||8 / 16||3.8 GHz||4.7 GHz||4 MB||32 MB||DDR4-3200|
|AMD Ryzen 7 3700X||8 / 16||3.6 GHz||4.4 GHz||4 MB||32 MB||DDR4-3200|
|AMD Ryzen 7 2700X||8 / 16||3.7 GHz||4.3 GHz||4 MB||16 MB||DDR4-2933|
|AMD Ryzen 5 5600X||6 / 12||3.7 GHz||4.6 GHz||3 MB||32 MB||DDR4-3200|
|Intel Core i7-8700K||6 / 12||3.7 GHz||4.7 GHz||1.5 MB||12 MB||DDR4-2666|
|AMD Ryzen 5 2600X||6 / 12||3.6 GHz||4.2 GHz||3 MB||16 MB||DDR4-2933|
|AMD Ryzen 3 3300X||4 / 8||3.8 GHz||4.3 GHz||2 MB||16 MB||DDR4-3200|
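For a rough sense of what the 19% IPC figure quoted earlier could mean for these parts, the sketch below multiplies it by the ratio of boost clocks from the table. It assumes single-threaded performance scales linearly with IPC × clock and that both chips sustain their rated boost clock, which real workloads will not always match. Treat the output as a first-order estimate, not a benchmark result; the function name is illustrative only.

```python
ZEN3_IPC_UPLIFT = 1.19  # the 19% IPC gain quoted in this article

# Boost clocks (GHz) taken from the specification table.
BOOST_GHZ = {
    "Ryzen 7 5800X": 4.7,  # Zen 3
    "Ryzen 5 5600X": 4.6,  # Zen 3
    "Ryzen 7 3700X": 4.4,  # Zen 2
    "Ryzen 3 3300X": 4.3,  # Zen 2
}


def estimated_speedup(zen3_boost_ghz, zen2_boost_ghz, ipc_uplift=ZEN3_IPC_UPLIFT):
    """First-order single-thread estimate: IPC uplift times clock ratio."""
    return ipc_uplift * zen3_boost_ghz / zen2_boost_ghz


for zen3 in ("Ryzen 7 5800X", "Ryzen 5 5600X"):
    for zen2 in ("Ryzen 7 3700X", "Ryzen 3 3300X"):
        ratio = estimated_speedup(BOOST_GHZ[zen3], BOOST_GHZ[zen2])
        print(f"{zen3} vs {zen2}: ~{(ratio - 1) * 100:.0f}% faster (estimate)")
```

The estimate only applies to the Zen 2 comparisons, since the 19% figure is stated relative to Zen 2. The Zen+ and Intel parts in the table would need their own IPC ratios, which this article does not provide.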
Here are the specifications of the Intel and AMD testbeds we used.
|Component||Intel Testbed||AMD Testbed|
|Motherboard||ASUS ROG Strix Z370-F-Gaming||ASUS ROG Crosshair VIII Hero|
|Memory||G.SKILL Sniper X DDR4-3400 (8 GB x 2), Corsair Vengeance LPX DDR4-3200 (8 GB x 2)|
|Graphics||NVIDIA GeForce RTX 2080 SUPER (GeForce 457.09)|
|Storage||1 TB SanDisk Ultra 3D SSD|
|OS||Microsoft Windows 10 (64-bit)|