ARM Cortex-X1
The ARM Cortex-X1 is a central processing unit implementing the ARMv8.2-A 64-bit instruction set. It was designed by ARM Holdings' Austin design centre as part of ARM's Cortex-X Custom (CXC) program.[1][2]
General information | |
---|---|
Launched | 2020 |
Designed by | ARM Ltd. |
Performance | |
Max. CPU clock rate | Up to 3.0 GHz in phones and 3.3 GHz in tablets/laptops |
Address width | 40-bit |
Cache | |
L1 cache | 128 KiB (64 KiB I-cache with parity, 64 KiB D-cache) per core |
L2 cache | 512–1024 KiB per core |
L3 cache | 512 KiB – 8 MiB (optional) |
Architecture and classification | |
Microarchitecture | ARM Cortex-X1 |
Instruction set | ARMv8.2-A: A64, A32, and T32 (at EL0 only) |
Extensions | |
Physical specifications | |
Cores | |
Products, models, variants | |
Product code name(s) | |
Variant(s) | |
History | |
Successor(s) | ARM Cortex-X2 |
Design
The Cortex-X1 design is based on the ARM Cortex-A78, but redesigned purely for performance rather than for a balance of performance, power, and area (PPA).[1]
The Cortex-X1 is a 5-wide decode, out-of-order superscalar design with a 3K macro-op (MOP) cache. It can fetch 5 instructions and 8 MOPs per cycle, and can rename and dispatch 8 MOPs and 16 µOPs per cycle. The out-of-order window size has been increased to 224 entries. The backend has 15 execution ports with a pipeline depth of 13 stages, and the execution latencies consist of 10 stages. It also features 4x128b SIMD units.[3][4][5][6]
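As a rough illustration of what the 4x128b SIMD figure implies, the following Python sketch computes an upper bound on per-cycle SIMD lanes and floating-point throughput. The 3.0 GHz clock is the phone ceiling from the infobox, and the assumption that every SIMD unit retires one fused multiply-add per lane per cycle is an idealisation made here, not a figure from ARM; the function names are hypothetical.

```python
# Back-of-the-envelope peak SIMD throughput from the figures quoted above.
SIMD_UNITS = 4          # from the text: 4x128b SIMD units
SIMD_WIDTH_BITS = 128   # from the text
CLOCK_GHZ = 3.0         # assumption: phone clock ceiling listed in the infobox


def peak_simd_lanes(element_bits: int) -> int:
    """Elements processed per cycle if every SIMD unit is busy."""
    return SIMD_UNITS * (SIMD_WIDTH_BITS // element_bits)


def peak_gflops(element_bits: int,
                clock_ghz: float = CLOCK_GHZ,
                flops_per_lane: int = 2) -> float:
    """Upper bound in GFLOP/s.

    flops_per_lane=2 assumes one fused multiply-add per lane per cycle,
    which is an idealisation rather than a documented sustained rate.
    """
    return peak_simd_lanes(element_bits) * flops_per_lane * clock_ghz


if __name__ == "__main__":
    print("FP32 lanes per cycle:", peak_simd_lanes(32))
    print("Peak FP32 GFLOP/s at 3.0 GHz:", peak_gflops(32))
```

Real workloads would fall below this bound, since it ignores memory bandwidth, dispatch limits, and thermal throttling.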
ARM claims the Cortex-X1 offers 30% faster integer and 100% faster machine learning performance than the ARM Cortex-A77.[3][4][5][6]
The Cortex-X1 supports ARM's DynamIQ technology and is expected to serve as the high-performance core when combined with ARM Cortex-A78 mid cores and ARM Cortex-A55 little cores.[1][2]
Architecture changes in comparison with ARM Cortex-A78
- Around 20% performance improvement (+30% from A77)[7]
- 30% faster integer performance (relative to the Cortex-A77)
- 100% faster machine learning performance (relative to the Cortex-A77)
- Out-of-order window size has been increased to 224 entries (from 160 entries)
- Up to 4x128b SIMD units (from 2x128b)
- 15% more silicon area
- 5-way decode (from 4-way)
- 8 MOPs/cycle decoded cache bandwidth (from 6 MOPs/cycle)
- 64 KB L1D + 64 KB L1I (from 32/64 KB L1)
- Up to 1 MB/core L2 cache (from 512 KB/core max)
- Up to 8 MB L3 cache (from 4 MB max)
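The structural changes listed above can be put on a common scale by expressing each as a percentage increase over the Cortex-A78. The sketch below does this in Python using only the before/after values from the list; the dictionary keys and function names are illustrative choices, and the percentages describe resource sizes, not measured performance.

```python
from fractions import Fraction

# (Cortex-A78 value, Cortex-X1 value), copied from the list above.
CHANGES = {
    "out-of-order window (entries)": (160, 224),
    "128-bit SIMD units": (2, 4),
    "decode width (instructions/cycle)": (4, 5),
    "MOP cache bandwidth (MOPs/cycle)": (6, 8),
    "max L2 per core (KB)": (512, 1024),
    "max L3 (MB)": (4, 8),
}


def relative_increase(old: int, new: int) -> Fraction:
    """Exact fractional increase from old to new."""
    return Fraction(new - old, old)


def summarize(changes: dict = CHANGES) -> dict:
    """Map each resource to its percentage increase, rounded to 0.1%."""
    return {
        name: round(float(relative_increase(a78, x1)) * 100, 1)
        for name, (a78, x1) in changes.items()
    }


if __name__ == "__main__":
    for name, pct in summarize().items():
        print(f"{name}: +{pct}%")
```

Seen this way, the front-end widening (25% for decode, about 33% for MOP cache bandwidth) is smaller than the doubling of SIMD units and maximum cache sizes.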
Licensing
The Cortex-X1 is available as a SIP core to partners of ARM's Cortex-X Custom (CXC) program, and its design makes it suitable for integration with other SIP cores (e.g. GPU, display controller, DSP, image processor) on a single die to form a system on a chip (SoC).[1][2]
Usage
- Samsung Exynos 2100[8]
- Qualcomm Snapdragon 888(+)[9]
- Google Tensor[10]
See also
- ARM Cortex-A78, related high performance microarchitecture
- ARM Neoverse V1 (Zeus), server sister core to the Cortex-X1
- Comparison of ARMv8-A cores, ARMv8 family
References
- "Introducing the Arm Cortex-X Custom program". community.arm.com. Retrieved 2020-06-18.
- Arm Ltd. "Cortex-X Custom CPU program". Arm | The Architecture for the Digital World. Retrieved 2020-06-18.
- Frumusanu, Andrei. "Arm's New Cortex-A78 and Cortex-X1 Microarchitectures: An Efficiency and Performance Divergence". www.anandtech.com. Retrieved 2020-06-18.
- "Arm Cortex-X1: The First From The Cortex-X Custom Program". WikiChip Fuse. 2020-05-26. Retrieved 2020-06-18.
- McGregor, Jim. "Arm Unleashes CPU Performance With Cortex-X1". Forbes. Retrieved 2020-06-18.
- "Arm Cortex-X1 and Cortex-A78 CPUs: Big cores with big differences". Android Authority. 2020-05-26. Retrieved 2020-06-18.
- "Cortex-X1 – Microarchitectures – ARM – WikiChip". en.wikichip.org. Retrieved 2021-02-13.
- "Exynos 2100 5G Mobile Processor: Specs, Features | Samsung". Samsung Semiconductor. Retrieved 2021-01-13.
- "Qualcomm Snapdragon 888 5G Mobile Platform | Latest 5G Snapdragon Processor | Qualcomm". www.qualcomm.com. Retrieved 2021-01-13.
- Amadeo, Ron (2021-10-19). "The "Google Silicon" team gives us a tour of the Pixel 6's Tensor SoC". Ars Technica.