Intel 5-level paging: Difference between revisions

{{short description|Processor extension for the x86-64 line of processors}}
{{Use dmy dates|date=August 2018}}
 
[[File:Page Tables (5 levels).svg|thumb|A diagram of five levels of paging]]
'''Intel 5-level paging''', referred to simply as ''5-level paging'' in [[Intel]] documents, is a processor extension for the [[x86-64]] line of processors.<ref name="intel-white-paper">{{Cite web|url=https://www.intel.com/content/www/us/en/content-details/671442/5-level-paging-and-5-level-ept-white-paper.html|title=5-Level Paging and 5-Level EPT|publisher=Intel Corporation|date=May 2017}}</ref>{{Rp|11}} It extends the size of [[virtual address]]es from 48&nbsp;bits to 57&nbsp;bits by adding another level to x86-64's [[Page table#Multilevel page tables|multilevel page tables]], increasing the addressable [[virtual memory]] from 256&nbsp;[[tebibyte|TiB]] to 128&nbsp;[[pebibyte|PiB]]. The extension was first implemented in the [[Ice Lake (microprocessor)|Ice Lake]] processors.<ref name="anandtech-13699">{{Cite web|url=https://www.anandtech.com/show/13699/intel-architecture-day-2018-core-future-hybrid-x86/2|archive-url=https://web.archive.org/web/20190502215444/https://www.anandtech.com/show/13699/intel-architecture-day-2018-core-future-hybrid-x86/2|url-status=dead|archive-date=2 May 2019|title=Sunny Cove Microarchitecture: A Peek At the Back End|work=Intel's Architecture Day 2018: The Future of Core, Intel GPUs, 10nm, and Hybrid x86|last=Cutress|first=Ian|access-date=2019-10-15}}</ref>
 
== Technology ==
[[File:X86 Paging 64bit.svg|thumb|right|555px|4-level paging of the 64-bit mode]]
In the 4-level paging scheme (previously known as [[IA-32e]] paging), the 64-bit virtual memory address is divided into five parts. The lowest 12 bits contain the offset within the 4 KiB memory page, and the following 36 bits are evenly divided between the four 9-bit descriptors, each linking to a 64-bit [[X86-64#Page table structure|page table entry]] in a 512-entry page table for each of the four paging levels. This makes it possible to use bits 0 through 47 in the virtual address, for a total of 256&nbsp;TiB.<ref name="x86-software-developers-manual" />{{Rp|page=4{{hyp}}2}}
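
The split described above can be sketched in a few lines of Python. This is only an illustration: the function name <code>split_virtual_address</code> and its <code>levels</code> parameter are hypothetical and do not come from Intel's documentation.

```python
def split_virtual_address(addr, levels=4):
    """Return (table indices from the top level down, page offset)."""
    offset = addr & 0xFFF                        # low 12 bits: offset within the 4 KiB page
    indices = []
    for level in range(levels - 1, -1, -1):      # top-level table first
        shift = 12 + 9 * level
        indices.append((addr >> shift) & 0x1FF)  # 9 bits select one of 512 entries
    return indices, offset


# Highest address in the lower half of the 48-bit address space.
indices, offset = split_virtual_address(0x00007FFFFFFFFFFF)
print(indices, hex(offset))  # [255, 511, 511, 511] 0xfff
```

Each index selects one of the 512 entries in the table for its level, and the offset selects a byte within the final 4 KiB page.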
 
5-level paging adds another 9-bit page table descriptor, making it possible to use bits&nbsp;0 through&nbsp;56. This multiplies the address space by 512 and increases the limit to 128&nbsp;PiB.
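
The sizes quoted here follow directly from the bit widths, as this short check shows. The constant and function names are illustrative only.

```python
PAGE_OFFSET_BITS = 12   # offset within a 4 KiB page
INDEX_BITS = 9          # one 9-bit index per paging level
TIB = 2 ** 40
PIB = 2 ** 50


def virtual_address_bits(levels):
    """Number of virtual address bits translated with the given number of levels."""
    return PAGE_OFFSET_BITS + INDEX_BITS * levels


size_4 = 2 ** virtual_address_bits(4)
size_5 = 2 ** virtual_address_bits(5)
print(virtual_address_bits(4), size_4 // TIB)  # 48 256
print(virtual_address_bits(5), size_5 // PIB)  # 57 128
print(size_5 // size_4)                        # 512
```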
 
With 5-level paging enabled, bits&nbsp;57 through&nbsp;63 must be copies of bit&nbsp;56.<ref name="intel-white-paper" />{{Rp|17}} This is the same as with 4-level paging, where the high-order bits of a virtual address that do not participate in address translation must be the same as the most significant implemented bit.
5-level paging is enabled by setting bit 12 of the CR4 register (known as LA57).<ref name="intel-white-paper" />{{Rp|16}} The bit takes effect only when the processor is operating in 64-bit mode, and it may be modified only when the processor is not in that mode.<ref name="intel-white-paper" />{{Rp|16}} If the bit is not set, or the 5-level paging feature is not supported, the processor uses the 4-level page table structure when operating in 64-bit mode.<ref name="x86-software-developers-manual" />{{Rp|page=4{{hyp}}22}} This is similar to [[Physical Address Extension]] (PAE), where a third level of page tables, allowing 36-bit addressing, was enabled by setting a bit in [[Control register#CR4|the CR4 register]].<ref>{{Cite web|url=https://learn.microsoft.com/en-us/previous-versions/windows/hardware/design/dn613969(v=vs.85)|title=Operating Systems and PAE Support - Windows 10 hardware dev|last=Hudek|first=Ted|website=[[Microsoft Learn]]|date=June 2017 |language=en-us|access-date=2024-01-27}}</ref><ref name="x86-software-developers-manual" />{{Rp|page=4{{hyp}}14}}
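
As a rough sketch, the rule on the upper address bits and the LA57 test can be written as follows. The helper names are hypothetical, and the CR4 value is passed in as a plain integer because reading the real register requires privileged code.

```python
CR4_LA57 = 1 << 12  # bit 12 of CR4, as described above


def la57_enabled(cr4):
    """True if the LA57 bit is set in the given CR4 value."""
    return bool(cr4 & CR4_LA57)


def is_canonical(addr, la57=False):
    """Check that all bits above the top implemented bit are copies of it.

    addr is treated as an unsigned 64-bit value.
    """
    top_bit = 56 if la57 else 47             # most significant implemented bit
    upper = addr >> top_bit                  # bits top_bit..63
    all_ones = (1 << (64 - top_bit)) - 1
    return upper == 0 or upper == all_ones


print(is_canonical(0x0000800000000000))             # False with 4-level paging
print(is_canonical(0x0000800000000000, la57=True))  # True with 5-level paging
```

The example address has bit 47 set and bits 48 through 63 clear, so it is rejected under 4-level paging but becomes an ordinary lower-half address once 57 bits are implemented.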
 
Future processors may allow full 64-bit virtual address space by extending the size of page table descriptors to 12 bits (4096 page table entries) and memory offset to 16 bits (64 KiB page size) in the 4-level paging scheme or 21 bits (2 MiB page size) in the 5-level scheme.<Ref name=VA64/> Extending page table entry size from 64 to 128 bits would allow arbitrary page sizes, as additional hardware flags would change the size and operation of descriptors on lower paging levels.<Ref name=VA64/>
 
== Drawbacks ==
Adding another level of indirection makes [[page table]] "walks" longer.<ref>{{Cite conference|title=CSALT: Context Switch Aware Large TLB|book-title=MICRO-50: the 50th Annual IEEE/ACM International Symposium on Microarchitecture : proceedings |location=Cambridge, MA|publisher=Institute of Electrical and Electronics Engineers., IEEE Computer Society., ACM Special Interest Group on Microprogramming|doi=10.1145/3123939.3124549|page=450|isbn=978-1-4503-4952-9|oclc=1032337814|date = 14 October 2017}}</ref> A page table walk occurs when either the processor's [[memory management unit]] or the memory management code in the operating system navigates the tree of page tables to find the [[page table entry]] corresponding to a virtual address.<ref>{{Cite web|url=http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0301h/I1026235.html|title=ARM Information Center|website=infocenter.arm.com|access-date=2018-04-26}}</ref><ref name="x86-software-developers-manual">{{Cite book|url=https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html|title=Intel® 64 and IA-32 Architectures Software Developer's Manual|volume=3A|publisher=[[Intel Corporation]]}}</ref>{{Rp|page=4{{hyp}}22}} This means that, in the worst case, the processor or the memory manager has to access physical memory six times for a single virtual memory access, rather than five times as with 4-level paging.
This results in slightly reduced memory access speed.<ref name="cse-451-paging-tlbs-slides" /> In practice this cost is greatly mitigated by caches such as the [[translation lookaside buffer]] (TLB).<ref name="cse-451-paging-tlbs-slides">{{Cite web|url=https://courses.cs.washington.edu/courses/cse451/08au/lectures/10-paging_TLBs.pdf|title=CSE 451: Operating Systems: Paging & TLBs|last=Levy|first=Hank|author-link=Hank Levy (computer scientist)|date=Autumn 2008|website=[[University of Washington]]|access-date=26 April 2018}}</ref> Future extensions may shorten page walks by limiting the virtual address space per application, using dedicated hardware flags in an extended 128-bit page table entry, and by allowing larger 64&nbsp;KiB or 2&nbsp;MiB [[Page (computer memory)|page size]]s with backward compatibility for 4&nbsp;KiB page operations.<Ref name=VA64>{{cite patent | country = US | number = 9858198 | status = patent | title = 64KiB page system that supports 4KiB page operation | pridate = 2015-06-26 | fdate = 2015-06-26 | pubdate = 2016-12-29 | gdate = 2018-01-02 | invent1 = Larry Seiler | assign1 = Intel Corp.}}</ref>
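
The worst-case access counts and the effect of a TLB can be illustrated with a toy model. Everything here is a simplification made for illustration: the class <code>ToyMMU</code> is hypothetical, page tables are nested dictionaries, and the TLB is an unbounded dictionary rather than a small hardware cache.

```python
class ToyMMU:
    """Counts memory accesses for loads through a simulated page table tree."""

    def __init__(self, levels):
        self.levels = levels
        self.root = {}            # top-level page table
        self.tlb = {}             # virtual page number -> physical frame number
        self.memory_accesses = 0

    def _indices(self, vaddr):
        return [(vaddr >> (12 + 9 * level)) & 0x1FF
                for level in range(self.levels - 1, -1, -1)]

    def map_page(self, vaddr, frame):
        *upper, last = self._indices(vaddr)
        table = self.root
        for index in upper:
            table = table.setdefault(index, {})
        table[last] = frame

    def load(self, vaddr):
        page = vaddr >> 12
        if page in self.tlb:
            frame = self.tlb[page]          # TLB hit: no page table walk
        else:
            entry = self.root
            for index in self._indices(vaddr):
                entry = entry[index]
                self.memory_accesses += 1   # one read per paging level
            frame = entry
            self.tlb[page] = frame
        self.memory_accesses += 1           # the data access itself
        return (frame << 12) | (vaddr & 0xFFF)


for levels in (4, 5):
    mmu = ToyMMU(levels)
    mmu.map_page(0x1000, frame=7)
    mmu.load(0x1234)
    first = mmu.memory_accesses
    mmu.load(0x1238)
    print(levels, first, mmu.memory_accesses - first)
# 4 5 1
# 5 6 1
```

The first load to a page pays for the whole walk (five or six accesses in total), while a later load to the same page hits the TLB and needs only the data access.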
 
== Implementation ==
5-level paging is implemented by the [[Ice Lake (microprocessor)|Ice Lake]] [[microarchitecture]],<ref name="anandtech-13699"/> the [[Epyc#Fourth_generation_Epyc_(Genoa,_Bergamo_and_Siena)|EPYC 9004 and 8004 Series Processors]]<ref>{{Cite web|url=https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58020-epyc-9004-tg-high-perf-toolchain.pdf|title=Tuning Guide for AMD EPYC™ 9004 Processors|publisher=[[AMD]]|date=September 2023}}</ref><ref>{{Cite web|url=https://www.amd.com/content/dam/amd/en/documents/products/epyc/4th-gen-amd-epyc-processor-architecture-whitepaper.pdf|title=4TH GEN AMD EPYC™ PROCESSOR ARCHITECTURE|publisher=[[AMD]]|date=May 2024}}</ref> and the [[List_of_AMD_Ryzen_processors#Storm_Peak_desktop|Storm Peak]] [[Ryzen#Threadripper_series|Ryzen Threadripper]] PRO 7900WX series.<ref>{{Cite web|url=https://github.com/InstLatx64/InstLatx64/blob/37bd7b1ac29d95ddcf1fed90efaf7e82c89ab65d/AuthenticAMD/AuthenticAMD0A10F81_K19_StormPeak_01_CPUID.txt#L81C17-L81C25|title=CPUID dump for 96-Core AMD Ryzen Threadripper PRO 7995WX (Storm Peak) Zen4|website=[[GitHub]] |date=October 19, 2023}}</ref>
 
Version 4.14 of the [[Linux kernel]] added support for 5-level paging.<ref>{{Cite news|url=https://www.zdnet.com/article/first-linux-4-14-release-adds-very-core-features-arrives-in-time-for-kernels-26th-birthday/|title=First Linux 4.14 release adds "very core" features, arrives in time for kernel's 26th birthday|last=Tung|first=Liam|work=ZDNet|access-date=2018-04-25|language=en}}</ref>
Support for the extension was submitted as a set of patches to the [[Linux kernel]] on 8 December 2016.<ref name="phoronix">{{Cite web|url=https://www.phoronix.com/scan.php?page=news_item&px=Intel-5-Level-Paging|title=Intel Working On 5-Level Paging To Increase Linux Virtual/Physical Address Space - Phoronix|author=Michael Larabel|date=9 December 2016|website=[[Phoronix]]|language=en|access-date=2018-04-26}}</ref> As was reported on the [[Linux kernel mailing list]], it consisted of extending the Linux memory model to use five levels rather than four.<ref>{{Cite mailing list|url=http://lkml.iu.edu/hypermail/linux/kernel/1612.1/00383.html|title=[RFC, PATCHv1 00/28] 5-level paging|last=Shutemov|first=Kirill A.|mailing-list=[[Linux kernel mailing list]]|date=December 8, 2016|access-date=2018-04-26}}</ref> This is because, although Linux [[Abstraction (software engineering)|abstracts]] the details of the page tables, it still depends on having a number of levels in its own representation. When an [[Instruction set architecture|architecture]] supports fewer levels, Linux emulates extra levels that do nothing.<ref>{{Cite web|url=https://www.kernel.org/doc/gorman/html/understand/understand006.html|title=Page Table Management|website=www.kernel.org|access-date=2018-04-26}}</ref> A similar change was previously made to extend from three levels to four.<ref>{{Cite web|url=https://lwn.net/Articles/106177/|title=Four-level page tables [LWN.net]|date=October 12, 2004|website=lwn.net|access-date=2018-04-26}}</ref>
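
The idea of emulating extra levels that do nothing can be sketched as a toy walker that always visits five named levels and passes straight through any level the hardware lacks. This is not kernel code. The level names follow those commonly used in Linux, while the function <code>walk</code>, the nested-dictionary tables and the fixed 9-bit indices are assumptions made for illustration.

```python
GENERIC_LEVELS = ["pgd", "p4d", "pud", "pmd", "pte"]


def walk(root, vaddr, hardware_levels=4):
    """Walk a nested-dict page table through all five generic levels.

    Levels the hardware does not provide are "folded": they consume no
    address bits and simply hand the parent's entry on to the next level.
    """
    missing = len(GENERIC_LEVELS) - hardware_levels
    folded = GENERIC_LEVELS[1:1 + missing]   # with 4 hardware levels: ["p4d"]
    remaining = hardware_levels
    entry = root
    steps = []
    for name in GENERIC_LEVELS:
        if name in folded:
            steps.append((name, None))       # pass-through level
            continue
        remaining -= 1
        index = (vaddr >> (12 + 9 * remaining)) & 0x1FF
        entry = entry[index]
        steps.append((name, index))
    return entry, steps


four_level_root = {0: {0: {0: {1: 42}}}}     # maps virtual page 1 to frame 42
print(walk(four_level_root, 0x1000, hardware_levels=4))
# (42, [('pgd', 0), ('p4d', None), ('pud', 0), ('pmd', 0), ('pte', 1)])

five_level_root = {0: four_level_root}
print(walk(five_level_root, 0x1000, hardware_levels=5))
# (42, [('pgd', 0), ('p4d', 0), ('pud', 0), ('pmd', 0), ('pte', 1)])
```

The calling code is the same in both cases; only the number of levels that actually consume address bits differs.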
 
Windows 10, Windows 11 and the corresponding server versions also support the extension in their latest updates, where it is provided by a separate kernel image called [[ntoskrnl.exe|ntkrla57.exe]].<ref>{{cite tweet|user=aionescu|number=1142637363840946176|title=Old farts like me will remember the days of ntoskrnl.exe, ntkrnlpa.exe, ntkrnlmp.exe and ntkrpamp.exe.}}</ref>
 
== References ==