X86 memory segmentation: Difference between revisions

Real mode: OMG, turns out, per en.wikipedia.org/w/index.php?title=X86_memory_segmentation&diff=prev&oldid=441344024 the 1 byte was NOT a typo. So that's where Joh got it from? I CAN actually understand what OR made the OP think 1 byte, but that's not a real (mode) segment size, it's sectionable! Just bc you're not USING all 16 bytes doesn't mean the segment wasn't 16 bytes. If every "1-byte segment" MUST be followed by a 15-byte gap before the next starts, THAT'S ACTUALLY A 16-BYTE SEGMENT!
no sentence
Line 3:
{{short description|Memory segmentation on Intel x86}}
{{Use dmy dates|date=May 2019|cs1-dates=y}}
'''x86 memory segmentation''' is a term for the kind of [[memory segmentation]] characteristic of the Intel [[x86]] computer [[instruction set architecture]]. The x86 architecture has supported [[memory segmentation]] since the original [[Intel 8086]] (1978), but ''x86 memory segmentation'' is a plainly descriptive [[retronym]]. The introduction of memory segmentation mechanisms in this architecture reflects the legacy of earlier 80xx processors, which initially<ref>in the [[Intel 8008]]</ref> could only address 16, or later<ref>from the [[Intel 8080]]</ref> 64&nbsp;KB of memory (16,384 or 65,536&nbsp;[[byte]]s), and whose instructions and registers were optimised for the latter. Dealing with larger addresses and more memory was thus comparatively slower, as that capability was somewhat grafted onto the Intel 8086. Memory segmentation could keep programs compatible and relocatable in memory, and by confining significant parts of a program's operation to 64&nbsp;KB segments, the program could still run faster.
 
In 1982, the [[Intel 80286]] added support for [[virtual memory]] and [[memory protection]]; the original mode was renamed '''[[real mode]]''', and the new version was named '''[[protected mode]]'''. The [[x86-64]] architecture, introduced in 2003, has largely dropped support for segmentation in 64-bit mode.
 
In both real and protected modes, the system uses 16-bit ''segment registers'' to derive the actual memory address. {{anchor|Extra segment}}In real mode, the registers CS, DS, SS, and ES point to the currently used program [[code segment]] (CS), the current [[data segment]] (DS), the current [[stack segment]] (SS), and one ''extra'' segment determined by the system programmer (ES). The [[Intel 80386]], introduced in 1985, adds two additional segment registers, FS and GS, with no specific uses defined by the hardware. The way in which the segment registers are used differs between the two modes.<ref name=Arch />
 
The choice of segment is normally defaulted by the processor according to the function being executed. Instructions are always fetched from the code segment. Any data reference to the stack, including any stack push or pop, uses the stack segment; data references indirected through the BP register typically refer to the stack and so they default to the stack segment. The extra segment is the mandatory destination for string operations (for example MOVS or CMPS); for this one purpose only, the automatically selected segment register cannot be overridden. All other references to data use the data segment by default. The data segment is the default source for string operations, but it can be overridden. FS and GS have no hardware-assigned uses. The instruction format allows an optional ''segment prefix'' byte which can be used to override the default segment for selected instructions if desired.<ref>{{cite book|last=Intel Corporation|title=IA-32 Intel Architecture Software Developer's Manual Volume 1: Basic Architecture|date=2004|url=http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf}}</ref>
 
== Real mode ==
[[Image:Overlapping realmode segments.svg|thumb|right|300px|Three segments in [[real mode]] memory (click on image to enlarge). There is an overlap between segment 2 and segment 3; the bytes in the turquoise area can be used from both segment selectors. If however the program code dealing with segment 2 never uses offsets large enough to reach 0x77D0, then it can be thought of as a shorter, non-overlapping, and at most 30,672-byte segment.]]
 
In [[real mode]] or [[Virtual 8086 mode|V86 mode]], the fundamental size of a ''segment'' is 65,536&nbsp;[[byte]]s, with individual bytes being addressed using 16-bit ''offsets''.
 
The 16-bit segment selector in the segment register is interpreted as the most significant 16 bits of a linear 20-bit address, called a segment address, of which the remaining four least significant bits are all zeros. The segment address is always added to a 16-bit offset in the instruction to yield a ''linear'' address, which is the same as the [[physical address]] in this mode. For instance, the segmented address 06EFh:1234h (here the suffix "h" means [[hexadecimal]]) has a segment selector of 06EFh, representing a segment address of 06EF0h, to which the offset is added, yielding the linear address 06EF0h&nbsp;+&nbsp;1234h&nbsp;=&nbsp;08124h.
(The leading zeros of the linear address, segmented addresses, and the segment and offset fields are shown here for clarity. They are usually omitted.)
 
{|
|-
! style=width:18em | <code>&nbsp; </code><code style="background:#DED">0000 &nbsp;0110 &nbsp;1110 &nbsp;1111</code><code>0000</code>
| '''Segment'''
| 16 bits, shifted 4 bits left (or multiplied by 0x10)
|-
! style=width:18em | <code>+ &nbsp;&nbsp;&nbsp;&nbsp; </code><code style="background:#DDF">0001 &nbsp;0010 &nbsp;0011 &nbsp;0100</code>
| '''Offset'''
| 16 bits
|- style="text-decoration:line-through"
! style=width:18em | <code>&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</code>
|
|-
! style=width:18em | <code>&nbsp; </code><code style="background:#FDF">0000 &nbsp;1000 &nbsp;0001 &nbsp;0010 &nbsp;0100</code>
| '''Address'''
| 20 bits
|}
Because of the way the segment address and offset are added, a single linear address can be mapped to up to 2<sup>12</sup> = 4096 distinct segment:offset pairs. For example, the linear address 08124h can have the segmented addresses 06EFh:1234h, 0812h:0004h, 0000h:8124h, etc.
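
The translation and the aliasing it produces can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names <code>linear_address</code> and <code>aliases</code> are invented for this example, and the 20 address lines of the 8086/8088 are modelled with a simple bit mask.

<syntaxhighlight lang="python">
ADDRESS_MASK = 0xFFFFF  # assumed model: 20 address lines, as on the 8086/8088


def linear_address(segment, offset):
    """Real-mode translation: segment * 16 + offset, truncated to 20 bits."""
    return ((segment << 4) + offset) & ADDRESS_MASK


def aliases(linear):
    """Return every (segment, offset) pair that maps to the given linear address."""
    pairs = []
    for segment in range(0x10000):
        # Offset this segment would need in order to reach the target address.
        offset = (linear - (segment << 4)) & ADDRESS_MASK
        if offset <= 0xFFFF:  # only 16-bit offsets exist
            pairs.append((segment, offset))
    return pairs
</syntaxhighlight>

With this model, <code>linear_address(0x06EF, 0x1234)</code> gives 08124h, and <code>len(aliases(0x08124))</code> is 4096, the three pairs quoted above among them. Some of those pairs rely on the 20-bit truncation; on a CPU that does not truncate, a low address such as this one has fewer aliases, which is why the figure is an upper bound.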
 
This could be confusing to programmers accustomed to unique addressing schemes, but it can also be used to advantage, for example when addressing multiple nested data structures. While real mode segments are ''technically'' always 64&nbsp;[[Kilobyte|KB]] long, the practical effect is only that no segment can be ''longer'' than 64&nbsp;KB, rather than that every segment as actually used in a program ''must'' be treated as 64&nbsp;KB long. Dealing with effectively smaller segments is possible: usable sizes range from 16 through 65,536 bytes, in 16-byte steps. Because there is no protection or privilege limitation in real mode, it is still entirely up to the program to coordinate and keep within the bounds of any segments. This holds whether a segment is programmatically treated as smaller than 64&nbsp;KB or as the full 64&nbsp;KB; it is also true that any program can always access any memory simply by changing segments, since it can arbitrarily set segment selectors to change segment addresses with absolutely no supervision. Therefore, while real mode can just as well be thought of as allowing different segment lengths, and as allowing segments to be overlapping or non-overlapping as desired, none of this is restrictively enforced by the CPU.
 
The effective 20-bit [[address space]] of PC/XT-generation CPUs limits the [[memory address|addressable memory]] to 2<sup>20</sup>&nbsp;bytes, or 1,048,576&nbsp;bytes (1&nbsp;[[Megabyte|MB]]). This derived directly from the hardware design of the Intel&nbsp;8086 (and, subsequently, the closely related 8088), which had exactly 20 [[address bus|address pins]]. (Both were packaged in 40-pin DIP packages; even with only 20 address lines, the address and data buses were multiplexed to fit all the address and data lines within the limited pin count.)
 
{{anchor|Paragraph}}{{resize|105%|{{sidebox|above='''Example calculation:'''|text=A segment value of 0Ch (12) would give a linear address at C0h (192) in the linear address space. The address offset can then be added to this number. 0Ch:0Fh (12:15) would be C0h+0Fh=CFh (192+15=207), CFh (207) being the linear address.}}}}Each segment begins at a multiple of 16&nbsp;bytes, called a ''paragraph'', from the beginning of the linear (flat) address space; that is, at 16-byte intervals. Since all segments are technically 64&nbsp;KB long, this explains how overlap can occur between segments and why any location in the linear memory address space can be accessed with many segment:offset pairs. The actual location of the beginning of a segment in the linear address space can be calculated with ''segment'' × 16. Such address translations are carried out by the segmentation unit of the CPU.
 
=== End-of-address-space quirkiness ===
{{main article|A20 line}}
The last segment, FFFFh (65535), begins at linear address FFFF0h (1048560), 16&nbsp;bytes before the end of the 20-bit address space, and thus can reach, with offsets of up to FFFFh (65,535), as far as 65,520 (65536−16) bytes past the end of the 20-bit address space of the 8086 or 8088 CPU. The 4,094 segments immediately below it also cross the 1&nbsp;MB threshold, each by 16&nbsp;bytes less than the one above it. On the 8086 and 8088 CPUs, such accesses wrapped around to the beginning of the address space, so that 65535:16 would access address 0 and, for example, 65533:1000 would access address&nbsp;952 of the linear address space. Because some programs written for the 8088 and 8086 relied on this quirky wrap-around as a feature, it led to the [[Gate A20]] compatibility issues in later CPU generations, from the [[Intel 286]] onward, in which the linear address space was expanded past 20&nbsp;bits.
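
The wrap-around can be checked with a short Python sketch. It is a simplified model rather than a description of any particular machine: the function name <code>physical_address</code> and its <code>address_bits</code> parameter are assumptions made for this example, with 20 bits standing for the 8086/8088 and 24 bits for a later CPU whose 21st address line is not gated off.

<syntaxhighlight lang="python">
def physical_address(segment, offset, address_bits=20):
    """segment * 16 + offset, truncated to the width of the address bus."""
    return ((segment << 4) + offset) & ((1 << address_bits) - 1)


# 20 address lines: sums past 1 MB wrap around to low memory.
print(physical_address(0xFFFF, 16))     # 0
print(physical_address(65533, 1000))    # 952

# 24 address lines: the same kind of access lands above 1 MB instead.
print(hex(physical_address(0xFFFF, 0xFFFF, address_bits=24)))  # 0x10ffef

# Segments whose 64 KB window reaches past the 1 MB boundary.
crossing = [s for s in range(0x10000) if (s << 4) + 0xFFFF >= 0x100000]
print(len(crossing))                    # 4095: FFFFh plus the 4,094 below it
</syntaxhighlight>

Software that expected the first two results breaks when it gets the behaviour of the third, which is the compatibility problem the A20 gate was introduced to hide.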
 
In 16-bit real mode, enabling applications to make use of multiple memory segments for a single data structure (in order to access more memory than available in any one 64K-segment) is quite complex, but was viewed as a necessary evil for all but the smallest tools (which could do with less memory). The root of the problem is that no appropriate address-arithmetic instructions suitable for flat addressing of the entire memory range are available.{{Citation needed|date=July 2011}} Flat addressing is possible by applying multiple instructions, which however leads to slower programs.
 
The ''[[x86 memory models|memory model]]'' concept derives from the setup of the segment registers. For example, in the ''tiny model'' CS=DS=SS, that is the program's code, data, and stack are all contained within a single 64&nbsp;KB segment. In the ''small'' memory model DS=SS, so both data and stack reside in the same segment; CS points to a different code segment of up to 64&nbsp;KB.
Line 50 ⟶ 56:
{{refimprove section|date=August 2015}}
 
[[Image:Protected mode segments.svg|thumb|300px|left|Three segments in [[protected mode]] memory (click on image to enlarge), with the '''local descriptor table'''.]]
 
=== 80286 protected mode ===
The [[Intel 80286|80286]]'s [[protected mode]] extends the processor's address space to 2<sup>24</sup> bytes (16 megabytes), but not by adjusting the shift value used to calculate a segment address from the value in a segment register. Instead, each 16-bit segment register now contains an index into a table of [[segment descriptors]] containing 24-bit base addresses to which offsets are added. To support old software, the processor starts up in "real mode", a mode in which it uses the segmented addressing model of the 8086. There is a small difference though: the resulting physical address is no longer truncated to 20&nbsp;bits, so [[real mode]] pointers (but not 8086 pointers) can now refer to addresses from 100000<sub>16</sub> through 10FFEF<sub>16</sub>. This nearly 64-kilobyte region of memory was known as the [[High Memory Area]] (HMA), and later versions of [[DOS]] could use it to increase the available "conventional" memory (i.e. within the first [[Megabyte|MB]]), by moving parts of DOS from conventional memory into the HMA. With the addition of the HMA, the total address space is approximately 1.06&nbsp;MB. Though the 80286 does not truncate real-mode addresses to 20&nbsp;bits, a system containing an 80286 can do so with hardware external to the processor, by gating off the 21st address line, the [[A20 line]]. The IBM&nbsp;PC&nbsp;AT provided the hardware to do this (for full backward compatibility with software for the original [[IBM&nbsp;PC]] and [[IBM PC/XT|PC/XT]] models), and so all subsequent "[[IBM PC/AT|AT]]-class" PC clones did as well.
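
The descriptor-table lookup described at the start of this subsection can be sketched as follows. This is a deliberately simplified model, not the processor's full algorithm: the <code>Descriptor</code> class, the <code>translate</code> function, and the list-based tables are hypothetical names chosen for the example, and privilege checks, access rights, and descriptor caching are all left out.

<syntaxhighlight lang="python">
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int   # 24-bit segment base address
    limit: int  # highest valid offset within the segment


def translate(selector, offset, gdt, ldt):
    """Resolve selector:offset to a physical address (simplified model)."""
    index = selector >> 3                    # bits 3-15: index into the table
    table = ldt if selector & 0x4 else gdt   # bit 2: local or global table
    # Bits 0-1 hold the requested privilege level; it is ignored in this sketch.
    descriptor = table[index]
    if offset > descriptor.limit:
        # Stand-in for the fault the CPU would raise on a limit violation.
        raise ValueError("offset beyond segment limit")
    return descriptor.base + offset


# Entry 0 is left empty here; entry 1 describes a 64 KB segment above 1 MB.
gdt = [None, Descriptor(base=0x120000, limit=0xFFFF)]
print(hex(translate(0x0008, 0x1234, gdt, ldt=[])))  # 0x121234
</syntaxhighlight>

The point of the sketch is that the segment register value no longer contributes to the address arithmetically; it only selects a table entry, and the base address comes from that entry.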
 
286 protected mode was seldom used, as it would have excluded the large body of users with 8086/88 machines. Moreover, it still necessitated dividing memory into 64&nbsp;KB segments, as was done in real mode. This limitation can be worked around on 32-bit CPUs, which permit the use of memory pointers greater than 64&nbsp;KB in size; however, as the Segment Limit field is only 24 bits long, the maximum segment size that can be created is 16&nbsp;MB (although paging can be used to allocate more memory, no individual segment may exceed 16&nbsp;MB). This method was commonly used in Windows 3.x applications to produce a flat memory space, although, as the OS itself was still 16-bit, API calls could not be made with 32-bit instructions. Thus, it was still necessary to place all code that performs API calls in 64&nbsp;KB segments.
 
Once 286 protected mode was invoked, it could not normally be exited except by performing a hardware reset. Machines following the rising [[IBM PC/AT]] standard could feign a reset to the CPU via the standardised keyboard controller, but this was significantly sluggish. Windows 3.x worked around both of these problems by intentionally triggering a [[triple fault]] in the interrupt-handling mechanisms of the CPU, which would cause the IBM AT-compatible hardware to reset the CPU nearly instantly, thus causing it to drop back into real mode.<ref>{{Cite web|url=http://blogs.msdn.com/b/larryosterman/archive/2005/02/08/369243.aspx|title = DevBlogs}}</ref>
 
=== Detailed segmentation unit workflow ===
Line 73 ⟶ 79:
In the [[Intel 80386]] and later, protected mode retains the segmentation mechanism of 80286 protected mode, but a [[paging]] unit has been added as a second layer of address translation between the segmentation unit and the physical bus. Also, importantly, address offsets are 32-bit (instead of 16-bit), and the segment base in each segment descriptor is also 32-bit (instead of 24-bit). The general operation of the segmentation unit is otherwise unchanged. The paging unit may be enabled or disabled; if disabled, operation is the same as on the 80286. If the paging unit is enabled, addresses in a segment are now virtual addresses, rather than physical addresses as they were on the 80286. That is, the segment starting address, the offset, and the final 32-bit address the segmentation unit derived by adding the two are all virtual (or logical) addresses when the paging unit is enabled. When the segmentation unit generates and validates these 32-bit virtual addresses, the enabled paging unit finally translates these virtual addresses into physical addresses. The physical addresses are 32-bit on the [[Intel 80386|386]], but can be larger on newer processors which support [[Physical Address Extension]].
 
As mentioned above, the 80386 also added two new general-purpose data segment registers, FS and GS, to the original set of four segment registers (CS, DS, ES, and SS).
 
A 386 CPU can be put back into real mode by clearing a bit in the CR0 control register; however, this is a privileged operation, in order to enforce security and robustness. By way of comparison, a 286 could only be returned to real mode by forcing a processor reset, e.g. by a [[triple fault]] or using external hardware.
Line 86 ⟶ 92:
== Practices ==
Logical addresses can be explicitly specified in [[x86 assembly language]], e.g. (AT&T syntax):
{{codett|movl $42, %fs:(%eax)|asm}} (equivalent to M[fs:eax] ← 42 in [[Register Transfer Language|RTL]])
 
or in [[Intel syntax]]:
<syntaxhighlight lang="nasm">
mov dword [fs:eax], 42
</syntaxhighlight>
Line 96 ⟶ 102:
 
* All CPU instructions are implicitly fetched from the ''[[code segment]]'' specified by the segment selector held in the CS&nbsp;register.
* Most memory references come from the ''[[data segment]]'' specified by the segment selector held in the DS&nbsp;register. These may also come from the extra segment specified by the segment selector held in the ES&nbsp;register, if a segment-override prefix precedes the instruction that makes the memory reference. Most, but not all, instructions that use DS by default will accept an ES override prefix.{{fact|date=April 2025}}
* Processor [[run-time stack|stack]] references, either implicitly (e.g. '''push''' and '''pop''' instructions) or explicitly ([[stack-based memory allocation|memory accesses using the (E)SP or (E)BP registers]]) use the ''stack segment'' specified by the segment selector held in the SS&nbsp;register. For explicit references, the segment can be overridden.
* [[x86 string instructions|String instructions]] (e.g. '''stos''', '''movs'''), along with the data segment, also use the ''extra segment'' specified by the segment selector held in the ES&nbsp;register.
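
The defaulting rules in this list can be summarised as a small lookup table. The sketch below is only a model of those rules: the access-kind labels, the <code>SEGMENT_RULES</code> table, and the <code>segment_for</code> function are names invented for this example, and a prefix that cannot take effect is simply treated as ignored.

<syntaxhighlight lang="python">
# access kind -> (default segment register, can a prefix override it?)
SEGMENT_RULES = {
    "instruction fetch":        ("CS", False),
    "push/pop":                 ("SS", False),
    "stack-relative reference": ("SS", True),   # via (E)SP or (E)BP
    "string destination":       ("ES", False),
    "string source":            ("DS", True),
    "other data":               ("DS", True),
}


def segment_for(access, override=None):
    """Return the segment register used for an access in this simplified model."""
    default, overridable = SEGMENT_RULES[access]
    if override is not None and overridable:
        return override
    return default


print(segment_for("other data"))                 # DS
print(segment_for("other data", "FS"))           # FS, as in mov dword [fs:eax], 42
print(segment_for("string destination", "FS"))   # ES, the override has no effect
</syntaxhighlight>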