X86 memory segmentation: Difference between revisions

Real mode: shorter
Real mode: tone down, deconflict; was self-contradictory. I'm still not happy with this. I have now realised (d'oh!) that the edit summary of the Joh edit that sucked me in contains a glaring misunderstanding, at odds with the A20, etc. stuff further down below. Also, what was deduced from that misunderstanding was arguably OR. I will likely revisit. This isn't even my final form.
Line 12:
[[Image:Overlapping realmode segments.svg|thumb|right|300px|Three segments in [[real mode]] memory (click on image to enlarge). There is an overlap between segment 2 and segment 3; the bytes in the turquoise area can be used from both segment selectors.]]
 
In [[real mode]] or [[Virtual 8086 mode|V86 mode]], the effective size of a ''segment'' can range from 16 through 65,536 [[byte]]s, in 16-byte steps, with individual bytes being addressed using 16-bit ''offsets''.
 
The 16-bit segment selector in the segment register is interpreted as the most significant 16 bits of a 20-bit linear address, called a segment address, whose remaining four least significant bits are all zeros. The segment address is always added to a 16-bit offset in the instruction to yield a ''linear'' address, which is the same as the [[physical address]] in this mode. For instance, the segmented address 06EFh:1234h (here the suffix "h" means [[hexadecimal]]) has a segment selector of 06EFh, representing a segment address of 06EF0h, to which the offset is added, yielding the linear address 06EF0h + 1234h = 08124h.
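
The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name <code>linear_address</code> is hypothetical and not taken from any library.

```python
def linear_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment selector and a 16-bit offset.

    The selector is shifted left by four bits (multiplied by 16) to form
    the segment address, and the offset is then added to it.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset


# The worked example from the text: 06EFh:1234h
print(f"{linear_address(0x06EF, 0x1234):05X}h")  # prints 08124h
```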
Line 41:
The effective 20-bit [[address space]] of real mode limits the [[memory address|addressable memory]] to 2<sup>20</sup>&nbsp;bytes, or 1,048,576&nbsp;bytes (1&nbsp;[[Megabyte|MB]]). This limit derives directly from the hardware design of the Intel&nbsp;8086 (and, subsequently, the closely related 8088), which had exactly 20 [[address bus|address pins]]. (Both were packaged in 40-pin DIP packages; even with only 20 address lines, the address and data buses had to be multiplexed to fit all the address and data lines within the limited pin count.)
 
{{anchor|Paragraph}}Each segment begins at a multiple of 16&nbsp;bytes, called a ''paragraph'', from the beginning of the linear (flat) address space, that is, at 16-byte intervals. Since all segments are technically 64&nbsp;KB long, segments can overlap, and any ___location in the linear address space can be accessed through many segment:offset pairs. The linear address at which a segment begins is segment×16. For example, a segment value of 0Ch (12) places the start of the segment at linear address C0h (192). The offset is then added to this number: 0Ch:0Fh (12:15) gives C0h+0Fh=CFh (192+15=207), so CFh (207) is the linear address. Such address translations are carried out by the segmentation unit of the CPU. The last segment, FFFFh (65535), begins at linear address FFFF0h (1048560), 16&nbsp;bytes before the end of the 20-bit address space, and thus, with an offset of up to 65,535, can reach up to 65,520 (65536−16) bytes past the end of the 20-bit address space. On the 8086 and 8088, such accesses wrapped around to the beginning of the address space, so that 65535:16 would access address&nbsp;0 and 65533:1000 would access address&nbsp;952 of the linear address space. The use of this feature by programmers led to the [[Gate A20]] compatibility issues in later CPU generations, where the linear address space was expanded past 20&nbsp;bits.
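
The overlap and wraparound behaviour described above can be illustrated with a short Python sketch. The names <code>real_mode_address</code>, <code>aliases</code> and the <code>a20_enabled</code> parameter are hypothetical, chosen only for this example. The sketch models a CPU with 20 address lines by masking the result to 20 bits, and treats a later CPU with the A20 line enabled simply as "no masking", which is a simplification.

```python
ADDRESS_MASK_20 = 0xFFFFF  # 20 address lines, as on the 8086/8088


def real_mode_address(segment: int, offset: int, a20_enabled: bool = False) -> int:
    """Return the address reached by segment:offset.

    With a20_enabled=False the sum wraps around at 1 MB; with
    a20_enabled=True (assumed model of a later CPU) it does not.
    """
    linear = (segment << 4) + offset
    return linear if a20_enabled else linear & ADDRESS_MASK_20


def aliases(linear: int) -> list[tuple[int, int]]:
    """List segment:offset pairs that reach `linear`, ignoring wraparound."""
    pairs = []
    for segment in range(min(linear >> 4, 0xFFFF) + 1):
        offset = linear - (segment << 4)
        if offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


print(real_mode_address(65535, 16))     # wraps to 0
print(real_mode_address(65533, 1000))   # wraps to 952
print(hex(real_mode_address(0xFFFF, 0xFFFF, a20_enabled=True)))  # 0x10ffef
print(len(aliases(0xCF)))               # 13 pairs, from 0000h:00CFh to 000Ch:000Fh
```

Without wraparound, the highest reachable address is 10FFEFh, which is 65,520 bytes past the 1&nbsp;MB boundary, matching the figure given in the text.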
 
In 16-bit real mode, enabling applications to make use of multiple memory segments (in order to access more memory than is available in any one 64&nbsp;KB segment) is quite complex, but was viewed as a necessary evil for all but the smallest tools (which could make do with less memory). The root of the problem is that no address-arithmetic instructions suitable for flat addressing of the entire memory range are available.{{Citation needed|date=July 2011}} Flat addressing is possible by combining multiple instructions, but this leads to slower programs.
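
As a rough illustration of why flat addressing takes several steps, the following Python sketch contrasts plain 16-bit offset arithmetic, which silently wraps within one segment, with arithmetic that goes through the 20-bit linear address and converts back to a segment:offset pair. The function names are hypothetical, and the sketch models only the arithmetic, not any particular compiler's or CPU's instruction sequence.

```python
def naive_advance(segment: int, offset: int, delta: int) -> tuple[int, int]:
    """Add to the offset only; the result wraps within the same 64 KB segment."""
    return segment, (offset + delta) & 0xFFFF


def normalize(segment: int, offset: int) -> tuple[int, int]:
    """Rewrite segment:offset so that the offset is in the range 0..15."""
    linear = ((segment << 4) + offset) & 0xFFFFF
    return linear >> 4, linear & 0xF


def advance(segment: int, offset: int, delta: int) -> tuple[int, int]:
    """Advance a pointer by `delta` bytes using the linear address."""
    linear = ((segment << 4) + offset + delta) & 0xFFFFF
    return linear >> 4, linear & 0xF


# One byte past the end of the segment at 1000h:
print(naive_advance(0x1000, 0xFFFF, 1))  # (4096, 0): back at the start of 1000h
print(advance(0x1000, 0xFFFF, 1))        # (8192, 0): 2000h:0000h, linear 20000h
```

Each flat pointer operation here needs a shift, an addition, a mask and a split back into two 16-bit parts, whereas the naive version is a single 16-bit addition.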