Feature: Intel’s 12th Generation Core processor technology explained

Intel’s 12th generation Core processors usher in a new hybrid microarchitecture design along with a new platform with DDR5 and PCI-E 5.0 capabilities. We will explain how it all works and combines to create Intel’s most advanced platform to date.

Intel decided to utilize two types of cores, and therefore two microarchitectures, for its new 12th generation Core processors, code named “Alder Lake”: “Golden Cove” and “Gracemont”. A short history lesson first. The Golden Cove microarchitecture is ultimately based on Sunny Cove, which was developed as a successor to Skylake (6th generation, Intel’s longest running microarchitecture). Sunny Cove was developed by the same team in Haifa that produced Skylake, a team known for developing microarchitectures that offer huge leaps in performance. Sunny Cove delivers roughly 18% more IPC than Skylake at the same frequency. Intel’s 11th generation processors (“Rocket Lake”) used Cypress Cove, which was also based on Sunny Cove but was built on Intel’s 14nm technology. Alder Lake is built on the “Intel 7” node, which was previously known as Intel’s 10nm Enhanced SuperFin (ESF) node.

Alder Lake’s Performance Cores (P-Cores) are high throughput cores designed to tackle very intensive tasks. Intel has also added Efficient Cores (E-Cores) on the same die. The E-Cores are low power cores used to complement the P-Cores, and they use Intel’s Gracemont microarchitecture found in Atom processors. The E-Cores are also made on the Intel 7 node and are designed to be roughly on par with Skylake cores, offering strong multi-threaded performance. To help the system determine which cores should be responsible for each task, Intel developed Thread Director.

Intel’s 12th Generation processors introduce a new hybrid landscape to the Core family, but they also require a new socket (LGA 1700), which means a new motherboard generation. Not only do we get a new motherboard chipset, we also get DDR5 and PCI-E 5.0, two next-generation protocols never before introduced to the consumer desktop market. While PCI-E is backwards compatible, DDR5 is not, so Intel implemented the ability for Alder Lake CPUs to use DDR4 if the motherboard supports it. To top that all off, the Windows 11 launch coincides with the Alder Lake launch, and Windows 11 is considered the optimal operating system for Alder Lake. While new CPU and chipset launches are complex, they are also common. Adding two brand new connection protocols for the CPU as well as a newly optimized OS produces the perfect storm, and Intel is tackling it effectively along with its partners.

Compared to Skylake, the E-Cores can offer up to 40% more single-threaded performance at the same power, or can produce the same performance at 40% of the power a Skylake core might use. Four E-Cores offer 80% more performance while consuming less power than two Skylake cores running four threads. Put differently, they can offer the same performance throughput while consuming 80% less power.
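To put those claims on a common scale, the short sketch below converts them into performance-per-watt ratios against a Skylake baseline of 1.0 performance at 1.0 power. The function name `perf_per_watt_gain` is our own, and the inputs are simply the percentages quoted above, not measurements of ours.

```python
def perf_per_watt_gain(relative_performance: float, relative_power: float) -> float:
    """Return performance-per-watt relative to a baseline of 1.0 perf at 1.0 power."""
    if relative_power <= 0:
        raise ValueError("relative_power must be positive")
    return relative_performance / relative_power


# Single-threaded claims versus one Skylake core.
same_power = perf_per_watt_gain(1.40, 1.00)  # 40% more performance, same power
same_perf = perf_per_watt_gain(1.00, 0.40)   # same performance, 40% of the power

# Four E-Cores versus two Skylake cores running four threads.
iso_throughput = perf_per_watt_gain(1.00, 0.20)  # same throughput, 80% less power

print(f"{same_power:.2f}x, {same_perf:.2f}x, {iso_throughput:.2f}x")
```

The same-power and same-performance figures are two different points on the power curve, which is why they do not yield the same efficiency ratio.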

The performance monitoring unit (PMU) inside Alder Lake offers hardware telemetry, and Intel says it is the most advanced in the industry. Thread Director uses the PMU as its secret weapon: it monitors the instruction mix, the state of each core, and other telemetry, and provides the OS scheduler with workload information so it can better place threads on the right cores. The addition of hardware telemetry is what allows it to improve on software-only scheduling.

A new cache architecture is also part of the design. The L3 cache is shared among the P-Cores, E-Cores, and integrated graphics. Both L3 and L2 cache sizes have been increased to improve performance. Each cluster of four E-Cores shares a common L2 cache.

The chipset (platform controller hub/PCH) now offers up to x4 PCI-E 4.0 lanes. DDR4 support is available depending on the motherboard and chipset. The CPU itself offers x16 PCI-E 5.0 lanes, and as we see on many motherboards, these lanes are typically hard wired to the first PCI-E x16 slot. The switch and repeater chips required to carry PCI-E 4.0 over longer distances, or to switch lanes between slots, are already scarce and expensive. With PCI-E 5.0 running at double the throughput of PCI-E 4.0, sharing those lanes across slots is currently impractical.
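For a sense of scale, here is a small sketch that estimates one-direction link bandwidth from the per-lane signalling rates of each PCI-E generation (8, 16, and 32 GT/s for 3.0, 4.0, and 5.0) and the 128b/130b line coding. The helper name `link_bandwidth_gbytes` is ours, and the result ignores packet and protocol overhead, so real-world throughput will be somewhat lower.

```python
# Per-lane raw signalling rate in GT/s for each PCI-E generation.
GT_PER_SEC = {"3.0": 8.0, "4.0": 16.0, "5.0": 32.0}

# PCI-E 3.0 and later use 128b/130b line coding: 128 payload bits per 130 sent.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gbytes(generation: str, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    gbits_per_sec = GT_PER_SEC[generation] * ENCODING_EFFICIENCY * lanes
    return gbits_per_sec / 8  # bits to bytes


x16_gen5 = link_bandwidth_gbytes("5.0", 16)  # the CPU's x16 PCI-E 5.0 link
x16_gen4 = link_bandwidth_gbytes("4.0", 16)  # the same slot width at PCI-E 4.0
x4_gen4 = link_bandwidth_gbytes("4.0", 4)    # a typical NVMe drive link

print(f"x16 5.0: {x16_gen5:.1f} GB/s, x16 4.0: {x16_gen4:.1f} GB/s, x4 4.0: {x4_gen4:.1f} GB/s")
```

The doubling per generation comes entirely from the signalling rate, which is also what makes longer traces and slot switching harder to implement at PCI-E 5.0 speeds.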

The DMI bus is now x8 DMI 4.0, which doubles the bandwidth of the CPU to PCH link (probably to support the x4 PCI-E 4.0 lanes). Intel’s Volume Management Device (VMD), which was previously only available on enterprise platforms, is now available on the client platform. It allows direct control of NVMe devices over the PCI-E bus without a RAID card or other hardware adapter.

Thread Director (TD) will place low workload threads on E-Cores so they do not take up P-Cores. It lets the OS make better decisions about where to put work by providing a missing piece of the puzzle: the advanced hardware telemetry reduces any confusion the OS might have as to which cores to place workloads on. It is a real-time digital sorting system. With Windows 11 you get the latest Intel has to offer in regard to hybrid platform optimizations, but that leaves us with questions regarding other OSes, including Windows 10. Windows 10 implements an early version of TD and uses core performance and efficiency information, but it does not include the software-thread specific feedback that Windows 11 does. Support for Linux and ChromeOS is in development. Linux’s ITMT scheduler can recognize the difference between the P-Cores and E-Cores and can utilize Intel’s Speed Shift Technology to allow better P-Core and E-Core scheduling.

Intel’s Thread Director Monitor demonstrates real-time P-Core and E-Core sorting. You select the type of thread to be produced, and the program sends those threads to the CPU, using TD to schedule them. Green threads are typical high-demand threads, light blue are background threads, blue are vector floating point operations, and yellow are AVX2 VNNI operations. Starting out, the green threads were all on the P-Cores. When the vector floating point (blue) threads were introduced, some green threads were bumped to the E-Cores, and when the AVX2 (yellow) threads were introduced, they pushed the remaining green and some blue threads over to the E-Cores. Light blue background threads landed on the E-Cores. In summation, Intel’s Thread Director matches the thread priority hierarchy with appropriate cores using existing methods plus advanced hardware telemetry. It is required to reap the full benefits of the hybrid CPU.
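The behaviour seen in the demo can be approximated with a toy model. To be clear, this is not Intel’s algorithm or the OS scheduler: the priority order in `CLASS_PRIORITY` is our inference from the demo, the names are hypothetical, and we assume one thread per P-Core for simplicity.

```python
from dataclasses import dataclass

# Hypothetical priority order inferred from the demo: a higher number wins a P-Core.
CLASS_PRIORITY = {"avx2_vnni": 3, "vector_fp": 2, "typical": 1, "background": 0}


@dataclass
class Thread:
    name: str
    kind: str  # one of the CLASS_PRIORITY keys


def place_threads(threads, p_core_slots):
    """Give P-Core slots to the highest-priority foreground threads; the rest use E-Cores."""
    # Background threads always go to the E-Cores in this sketch.
    foreground = [t for t in threads if t.kind != "background"]
    background = [t for t in threads if t.kind == "background"]
    # sorted() is stable, so threads of equal priority keep their arrival order.
    ranked = sorted(foreground, key=lambda t: CLASS_PRIORITY[t.kind], reverse=True)
    return {"P": ranked[:p_core_slots], "E": ranked[p_core_slots:] + background}


# Recreate the demo sequence: green first, then blue, then yellow, plus a background thread.
threads = [Thread(f"green{i}", "typical") for i in range(8)]
threads += [Thread(f"blue{i}", "vector_fp") for i in range(3)]
threads += [Thread(f"yellow{i}", "avx2_vnni") for i in range(6)]
threads += [Thread("bg0", "background")]

placement = place_threads(threads, p_core_slots=8)
print("P-Cores:", [t.name for t in placement["P"]])
print("E-Cores:", [t.name for t in placement["E"]])
```

With eight P-Core slots, the six yellow threads and two of the blue threads keep the P-Cores, while the third blue thread, all of the green threads, and the background thread end up on the E-Cores, which mirrors what the monitor showed.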

We see a steady jump in performance over 10th generation with both 11th and 12th generation processors. The E-Cores seem to be on par with Skylake.

These results compare an 11900K (8 cores, 16 threads) with a 12900K (8+8 cores, 24 threads). We can see that power efficiency is better: at equal (1.0x) performance the comparison is 65W versus 250W. With an unrestricted power envelope, performance is around 50% better.

While we are talking about power, Intel is redefining Thermal Design Power (TDP). It is now replaced by Processor Base Power (PBP) and Maximum Turbo Power (MTP). MTP will be equivalent to PL2. Intel 12th Generation K-SKUs will, by default, enable constant (sustained) Maximum Turbo Power, so there is no time limit on the power burst as long as the processor’s hardware telemetry allows it. If you overheat your processor, it will still downclock itself.
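As a rough illustration of the difference between a timed burst and sustained turbo, consider the simplified model below. It is only a sketch: real processors track a running average of power rather than a hard cutoff, the class and function names are ours, and the wattages and the 56 second window are illustrative values, not figures from Intel’s presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerLimits:
    base_power_w: float       # Processor Base Power (PBP)
    max_turbo_power_w: float  # Maximum Turbo Power (MTP), equivalent to PL2
    turbo_window_s: float     # hypothetical burst duration for non-sustained parts
    sustained_turbo: bool     # True models the K-SKU default described above


def package_power_limit(limits: PowerLimits, seconds_under_load: float) -> float:
    """Return the power ceiling after a given time under load (simplified hard cutoff)."""
    if limits.sustained_turbo or seconds_under_load < limits.turbo_window_s:
        return limits.max_turbo_power_w
    return limits.base_power_w


# Illustrative values only.
k_sku = PowerLimits(125.0, 241.0, turbo_window_s=56.0, sustained_turbo=True)
timed = PowerLimits(125.0, 241.0, turbo_window_s=56.0, sustained_turbo=False)

for t in (10, 120):
    print(t, package_power_limit(k_sku, t), package_power_limit(timed, t))
```

In this model the sustained part stays at MTP indefinitely, while the timed part drops back to PBP once the window expires. Thermal throttling is a separate mechanism and is not modelled here.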

Shifting gears into new overclocking features, we can start with Intel’s hardware improvements to the CPU package. The die z-height (vertical thickness) has been reduced by 25%, the solder thermal interface material (STIM) thickness has been reduced by 15%, and the copper integrated heat spreader (IHS) is now 15% thicker. These changes should improve the thermal performance of the CPU package, or might just keep it on par with previous offerings. It was only a few years ago that some overclockers decided not only to melt the STIM and remove the copper IHS for direct die cooling, but also to sand down the top layer of the actual CPU die to improve thermals.

Intel has introduced many new overclocking features for these new 12th generation Core processors, including new knobs for DDR5 and E-Cores. DDR5 overclocking additions include XMP 3.0 for DDR5, Dynamic Memory Boost, and synthetic internal BCLK. Intel Extreme Tuning Utility (XTU) has been updated to version 7.5. Intel will also allow memory overclocking on motherboards beyond those with Z-series chipsets.

Some clock ratio/multiplier limits have been increased, allowing for the possibility of doubling Intel’s specified frequencies. A new E-Core ratio has been added to control the frequency of each cluster of four E-Cores. Ring and cache frequency controls are still there. Internal BCLK overclocking is available, but an external discrete clock generator can supply a different BCLK. An external PEG/DMI clock can be utilized as well in case you want to decouple it from the CPU BCLK.
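The relationship between these knobs is simple: a core’s frequency is the base clock (BCLK) multiplied by its ratio. The sketch below shows the arithmetic; the function name and the specific ratios are hypothetical examples, not recommended settings.

```python
def core_frequency_mhz(bclk_mhz: float, ratio: int) -> float:
    """Core frequency is the base clock multiplied by the core's ratio."""
    if bclk_mhz <= 0 or ratio <= 0:
        raise ValueError("bclk_mhz and ratio must be positive")
    return bclk_mhz * ratio


BCLK_MHZ = 100.0                  # stock base clock
p_core_ratio = 52                 # hypothetical P-Core ratio
e_cluster_ratios = (40, 39)       # hypothetical ratios, one per cluster of four E-Cores

p_core_mhz = core_frequency_mhz(BCLK_MHZ, p_core_ratio)
e_cluster_mhz = [core_frequency_mhz(BCLK_MHZ, r) for r in e_cluster_ratios]

# Raising BCLK lifts every domain derived from it, which is why a separate
# PEG/DMI clock is useful when pushing BCLK.
p_core_bclk_oc_mhz = core_frequency_mhz(102.5, p_core_ratio)

print(p_core_mhz, e_cluster_mhz, p_core_bclk_oc_mhz)
```

Ratios give coarse steps of one BCLK each, while BCLK adjustments allow finer tuning at the cost of affecting everything tied to that clock.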

New PLL (phase-locked loop) overclocking modes have been added so that higher frequencies can be reached on world record runs. A BCLK voltage/frequency curve is also present and automated, PEG/DMI overclocking is available, and the TjMax (maximum junction temperature) offset is still there as well. AVX offset and AVX disable controls are present. You can also selectively disable hyper-threading on each individual core to help reach maximum frequency for world records.

The new Intel XTU 7.5 adds E-Core control and telemetry monitoring. The XTU Benchmark has also been re-integrated. Real-time memory and system logging are now present.

Intel Speed Optimizer is a one-click option for overclocking, and if you have an i9 K-SKU it will raise the P-Cores by 100MHz and the E-Cores by 300MHz. At the time of this presentation, Intel hinted that this capability would come to i5 and i7 K-SKUs down the road.

Intel introduced XMP almost 15 years ago, and it maintains an XMP testing database on its website that shows which kits work on which motherboards and with which BIOS firmware versions.

XMP 3.0 offers new capabilities and features. There are up to five profiles: three come from the vendor, while two are customizable by the user. You can also rename your profiles to be more descriptive, with a limit of 16 characters. The VDD, VDDQ, and VPP voltages are internally derived from an input voltage provided by the motherboard.
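The slot and naming rules above can be modelled in a few lines. This is purely an illustration of the constraints as described; it is not the actual SPD byte layout, and the class names, field names, and example values are hypothetical.

```python
from dataclasses import dataclass

MAX_NAME_LENGTH = 16  # profile name limit described above
VENDOR_SLOTS = 3      # profiles supplied by the memory vendor
USER_SLOTS = 2        # profiles the user can customize


@dataclass
class XmpProfile:
    name: str
    data_rate_mts: int  # hypothetical field: data rate in MT/s
    vdd_v: float
    vddq_v: float
    vpp_v: float


class XmpProfileTable:
    """Toy model of the five-profile limit: three vendor slots plus two user slots."""

    def __init__(self, vendor_profiles):
        if len(vendor_profiles) > VENDOR_SLOTS:
            raise ValueError(f"at most {VENDOR_SLOTS} vendor profiles")
        self.vendor = tuple(vendor_profiles)  # read-only from the user's side
        self.user = []

    def add_user_profile(self, profile: XmpProfile) -> None:
        if len(profile.name) > MAX_NAME_LENGTH:
            raise ValueError(f"profile name exceeds {MAX_NAME_LENGTH} characters")
        if len(self.user) >= USER_SLOTS:
            raise ValueError(f"only {USER_SLOTS} user profiles are available")
        self.user.append(profile)


# Example values are made up for illustration.
table = XmpProfileTable([XmpProfile("Vendor 5200", 5200, 1.25, 1.25, 1.8)])
table.add_user_profile(XmpProfile("Daily 5600", 5600, 1.30, 1.30, 1.8))
print(len(table.vendor) + len(table.user), "profiles stored")
```

Keeping the vendor profiles immutable and the user profiles separate reflects the split described above, where a bad custom profile never overwrites what the kit shipped with.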

Intel has also enabled new DDR5 sandbox creations; the one above is an example from Corsair.

XMP 3.0 offers a lot of advantages over previous versions. There are now 184 bytes allocated for XMP SPD.

The last thing we will cover is Intel’s Dynamic Memory Boost Technology, which is like a turbo mode for your memory. Thanks to the platform’s real-time memory frequency capabilities, boosting is now possible on demand, without restarting and setting the frequency manually. It can switch between the JEDEC default (4800MHz) and whatever the XMP definition is on your kit. We will have a review up of the 12th generation Core i9 12900K and Core i5 12600K soon.

Steve