32-bit x86 assembly programming: Difference between revisions

This page should be considered to belong to [[X86 assembly language]].
 
[[Protected mode]] is the mode in which most modern [[operating system]]s run their code. When the computer boots, it first enters [[real mode]]; the operating system is responsible for switching into protected mode.
 
== Application registers ==
See [[x86-assembly language in realmode]] for a better understanding of this article.
 
There are eight 32-bit general-purpose registers in protected mode:
The 80286 processor did have a 16-bit protected mode. It is of historic interest only. We will concentrate on the 32-bit protected mode, which is found on the 80386 processor and all its successors.
EAX, EBX, ECX, EDX, ESI, EDI, ESP and EBP.
All of them can be used both for addressing and for holding data. Some of these registers are, however, better suited to certain operations than others, because instructions that use certain registers can be encoded as shorter [[opcode]]s than the same instructions using other registers.
 
Four of the general-purpose registers have smaller, 16- and 8-bit variants of themselves:
All the general application registers (AX, BX, CX, DX, SI, DI, SP and BP) are extended to a total of 32 bits. To denote that you mean the 32-bit register instead of its low 16-bit part, add an E before the name of the register.
 
AX = bits 0-15 of EAX
EAX, EBX, ECX, EDX, ESI, EDI, ESP and EBP
BX = bits 0-15 of EBX
CX = bits 0-15 of ECX
DX = bits 0-15 of EDX
 
AL = bits 0-7 of EAX and AX
The segment registers remain 16 bits wide and do not change names.
BL = bits 0-7 of EBX and BX
CL = bits 0-7 of ECX and CX
DL = bits 0-7 of EDX and DX
 
AH = bits 8-15 of EAX and AX
CS, DS, ES, FS, GS and SS
BH = bits 8-15 of EBX and BX
CH = bits 8-15 of ECX and CX
DH = bits 8-15 of EDX and DX
 
''For example: if AL = 0x34 and AH = 0x12, then AX contains the number 0x1234.
The segment registers do not work like they did in [[realmode]]. Instead, each one holds a selector that picks an entry in a table pointed to by the GDTR or LDTR register.
If ECX initially contains the number 0x3A3F901D and CH changes to 0x44, then ECX will also change to contain 0x3A3F441D.''
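
The aliasing between the 32-, 16- and 8-bit register names can be modelled with a few bit masks. The following Python sketch is only an illustration; the helper names (<code>ax</code>, <code>al</code>, <code>ah</code>, <code>set_ch</code>) are made up for this example and do not belong to any assembler or library.

```python
def ax(eax):
    """Return AX, i.e. bits 0-15 of EAX."""
    return eax & 0xFFFF

def al(eax):
    """Return AL, i.e. bits 0-7 of EAX."""
    return eax & 0xFF

def ah(eax):
    """Return AH, i.e. bits 8-15 of EAX."""
    return (eax >> 8) & 0xFF

def set_ch(ecx, value):
    """Return ECX with CH (bits 8-15) replaced; all other bits are kept."""
    return (ecx & 0xFFFF00FF) | ((value & 0xFF) << 8)

eax = 0x00001234
print(hex(al(eax)), hex(ah(eax)), hex(ax(eax)))  # 0x34 0x12 0x1234
print(hex(set_ch(0x3A3F901D, 0x44)))             # 0x3a3f441d
```

Writing to an 8-bit or 16-bit name changes only those bits of the full register, which is why the upper half of ECX survives the write to CH.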
 
There is also a 32-bit wide [[flag-register]] that is used for conditional jumps and the like. The flag register is named EFLAGS in protected mode.
The FLAGS- and IP-registers are also extended to 32 bits.
 
The flag register contains flags that can each be either zero or one. If a flag is one, it is said to be set (high); otherwise it is said to be cleared (low). Important flags in the EFLAGS register are carry (bit 0), zero (bit 6), sign (bit 7) and overflow (bit 11).
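
Testing an individual flag comes down to shifting and masking the EFLAGS value. The sketch below uses the bit positions given in the Intel architecture manuals (carry 0, zero 6, sign 7, overflow 11); the table <code>FLAG_BITS</code> and the function <code>decode_flags</code> are hypothetical names chosen for this example.

```python
# Bit positions of four commonly used EFLAGS flags.
FLAG_BITS = {"CF": 0, "ZF": 6, "SF": 7, "OF": 11}

def decode_flags(eflags):
    """Return a dict that maps each flag name to True (set) or False (cleared)."""
    return {name: bool((eflags >> bit) & 1) for name, bit in FLAG_BITS.items()}

# 0x841 has bits 0, 6 and 11 set.
print(decode_flags(0x00000841))
# {'CF': True, 'ZF': True, 'SF': False, 'OF': True}
```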
EIP, EFLAGS
 
There is also a 32-bit [[instruction pointer]], named EIP. EIP points to the place in the program where the processor is currently executing its code. It cannot be read or written directly by the programmer; it changes implicitly through instructions such as jmp, call and ret.
There are also some new registers. They are useful ''only'' to system programmers.
 
== Mnemonics used in [[protected mode]] x86-assembly ==
CR0, CR2, CR3, CR4, GDTR, LDTR, IDTR, TR.
 
aaa, aad, aam, aas, adc, add, and, arpl, bound, bsf, bsr, bt, btc, btr, bts, call, cbw, cwde, clc, cld, cli, clts, cmc, cmp, cmps, cmpsb, cmpsw, cmpsd, cwd, cdq, daa, das, dec, div, enter, hlt, idiv, imul, in, inc, ins, insb, insw, insd, int, into, iret, iretd, ja, jae, jb, jbe, jc, jcxz, jecxz, je, jz, jg, jge, jl, jle, jmp, jna, jnae, jnb, jnbe, jnc, jne, jng, jnge, jnl, jnle, jno, jnp, jns, jnz, jo, jp, jpe, jpo, js, lahf, lar, lea, leave, lgdt, lidt, lgs, lss, lds, les, lfs, lldt, lmsw, lock, lods, lodsb, lodsw, lodsd, loop, loope, loopz, loopne, loopnz, lsl, ltr, mov, movsx, movzx, mul, neg, nop, not, or, out, outs, outsb, outsw, outsd, pop, popa, popad, popf, popfd, push, pusha, pushad, pushf, pushfd, rcl, rcr, rol, ror, rep, repe, repz, repne, repnz, ret, sahf, sal, sar, shl, shr, sbb, scas, scasb, scasw, scasd, seta, setae, setb, setbe, setc, sete, setg, setge, setl, setle, setna, setnae, setnb, setnbe, setnc, setne, setng, setnl, setnle, setno, setnp, setpe, setpo, sets, setz, sgdt, sidt, shld, shrd, sldt, smsw, stc, std, sti, stos, stosb, stosw, stosd, str, sub, test, verr, verw, wait, xchg, xlat, xlatb, xor.
They are used by the operating system and cannot be accessed by ordinary user programs.
 
This list does not include the [[floating point]] instructions, the [[single instruction multiple data|SIMD]] instructions and some others.
There are also some other registers, like the test registers, debug registers, MMX, MSR, XMM and others.
 
There are also some undocumented instructions, such as umov, which could be used with [[in circuit emulator]]s. (umov stands for "user move"; knowing about that instruction makes certain types of software debuggers much easier to write.)
How to switch to protected mode:
 
== The addressing model ==
 
It is important to distinguish between the different kinds of address in protected mode. There are ''physical addresses'', ''linear addresses'' and ''logical addresses''.

A logical address is a segment register and an offset paired together. In practice only the offset matters, because nearly all operating systems use ''flat addressing'' (see below). In other words, a logical address is a pointer inside a program. (Example: in C, the value stored in ''pointer'' is a logical address.)

A linear address is a logical address that has gone through the descriptor mechanism (see ''Descriptors'' below).
 
A physical address is a linear address that has gone through the paging mechanism (see ''Pages'' below).
 
This means that in protected mode each address goes through two layers of translation before it reaches the actual memory.
 
=== Descriptors ===
 
There is a ''Global Descriptor Table'' (GDT) and a ''Local Descriptor Table'' (LDT) that hold information about how the memory should look and behave. The GDT is located through the GDT register (GDTR) and the LDT through the LDT register (LDTR). The GDTR is 48 bits wide and contains two fields: a pointer to the beginning of the table (base) and a 16-bit field that describes how large the table is in bytes (limit). The part of the LDTR that software loads is a 16-bit selector naming an LDT descriptor in the GDT; the processor takes the base and limit of the LDT from that descriptor.
 
The base field is 32 bits wide. When the register is loaded with a 16-bit operand size, which is the default in a [[realmode]] environment, only the low 24 bits of the base are used.
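
The 48-bit value loaded into the GDTR can be pictured as a 6-byte little-endian structure with the 16-bit limit first and the 32-bit base after it. The following Python sketch builds such a structure with the standard <code>struct</code> module. It assumes, as the Intel manuals specify, that the limit field stores the table size in bytes minus one; the function names <code>pack_gdtr</code> and <code>unpack_gdtr</code> and the example table address are hypothetical.

```python
import struct

def pack_gdtr(base, table_size_bytes):
    """Build the 6-byte operand for LGDT: 16-bit limit, then 32-bit base."""
    limit = table_size_bytes - 1          # offset of the last valid byte
    return struct.pack("<HI", limit, base)

def unpack_gdtr(raw):
    """Split a 6-byte GDTR image back into (base, limit)."""
    limit, base = struct.unpack("<HI", raw)
    return base, limit

# A table of five 8-byte descriptors placed at the assumed address 0x1000.
image = pack_gdtr(0x00001000, 5 * 8)
print(image.hex())          # 270000100000
print(unpack_gdtr(image))   # (4096, 39)
```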
 
To address a location in memory, a segment register and an offset register are used. The segment registers are:
CS, DS, ES, FS, GS and SS.
 
CS points to the segment containing code and DS to the data segment. ES, FS and GS point to extra segments that can be used to store additional data. The SS segment is used to hold the [[Stack (computing)|stack]].
 
Each segment register points to a descriptor, and each descriptor describes a well-defined memory area. If the descriptor that GS points to defines an area that starts at 0x000A0010 and ends at 0x000C0000, and the EAX register contains the value 0x0001C234, then the combination GS:EAX points to 0x000A0010 + 0x0001C234 = 0x000BC244.
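
The translation from a logical address to a linear address can be sketched in a few lines of Python. This is a simplified model, not a description of everything the processor checks: it assumes the usual selector layout (bits 0-1 requested privilege level, bit 2 table indicator, bits 3-15 descriptor index) and treats the limit as the highest valid offset. The names <code>decode_selector</code> and <code>logical_to_linear</code> are invented for this example.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,                       # entry number in the table
        "table": "LDT" if selector & 0x4 else "GDT",  # table indicator bit
        "rpl": selector & 0x3,                        # requested privilege level
    }

def logical_to_linear(base, limit, offset):
    """Add the segment base to the offset after checking the segment limit."""
    if offset > limit:
        raise ValueError("offset beyond segment limit (the CPU would fault)")
    return (base + offset) & 0xFFFFFFFF

# Segment from 0x000A0010 to 0x000C0000, so the highest offset is 0x1FFF0.
print(hex(logical_to_linear(0x000A0010, 0x1FFF0, 0x0001C234)))  # 0xbc244
# With base 0 and a 4-gigabyte limit the offset passes through unchanged.
print(hex(logical_to_linear(0, 0xFFFFFFFF, 0x0001C234)))        # 0x1c234
print(decode_selector(0x10))  # {'index': 2, 'table': 'GDT', 'rpl': 0}
```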
 
The GDT and LDT contain ''descriptors'' that point to memory areas with different properties. Most often, there is one null descriptor, two data descriptors, two code descriptors and a number of TSS descriptors.
Most often, the code and data descriptors describe an area of memory that starts at 0 and ends at 4 gigabytes. This way, descriptors and segment registers become almost invisible to the application programmer. This is called the ''flat memory'' model. In flat memory, segment registers lose their importance, and only offset registers are used to specify addresses.
 
TSS descriptors are used to hold information about tasks. They are part of the hardware support for multitasking that x86 processors provide.
 
If a segment register points to a descriptor in the GDT or LDT that has a 32-bit base when the processor switches back to [[realmode]], the segment register will continue to use that 32-bit descriptor for as long as it stays unmodified. If the descriptor has a base of 0, a limit of 4 [[gigabyte]]s, and the D-flag set, then it becomes possible to use 32-bit addressing in realmode. This is sometimes called [[unreal mode]], as it is not entirely ordinary realmode behaviour.
 
=== Pages ===
 
Paging can be turned on and off with bit 31 of the CR0 register. Register CR3 points to the page directory. See [[paging]].
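
As a rough illustration, the following Python sketch tests the paging bit in a CR0 value and splits a linear address the way classic 32-bit paging does. It assumes two-level paging with 4-kilobyte pages (10-bit directory index, 10-bit table index, 12-bit offset) and ignores extensions such as large pages; <code>paging_enabled</code> and <code>split_linear</code> are hypothetical helper names.

```python
CR0_PG = 1 << 31  # paging-enable bit of CR0

def paging_enabled(cr0):
    """Return True if bit 31 of the given CR0 value is set."""
    return bool(cr0 & CR0_PG)

def split_linear(addr):
    """Split a linear address into (directory index, table index, page offset)."""
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF

print(paging_enabled(0x80000001))  # True
print(paging_enabled(0x00000001))  # False
print(split_linear(0x000BC244))    # (0, 188, 580)
```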
 
== Memory layout for PC-computers in protected mode ==
 
The layout is much the same as in realmode. Note, however, that on some PCs the 15th megabyte is occupied by the video card.
 
 00000-003FF   Application [[RAM]]
 00400-005FF   BDA ([[BIOS]] Data Area) *
 00600-9FFFF   Application RAM
 A0000-BFFFF   VGA video memory
 C0000-EFFFF   Optional ROMs (the VGA ROM is usually located at C0000)
 F0000-FFFFF   BIOS ROM
 
* = The BIOS is inactive in protected mode; therefore this area can be
considered "application RAM" as well.
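
The table above can also be written as data, which makes it easy to look up the region an address belongs to. The Python sketch below only restates the ranges listed in this section; <code>MEMORY_MAP</code> and <code>region_of</code> are names invented for the example.

```python
# (first address, last address, description), taken from the table above.
MEMORY_MAP = [
    (0x00000, 0x003FF, "Application RAM"),
    (0x00400, 0x005FF, "BIOS Data Area"),
    (0x00600, 0x9FFFF, "Application RAM"),
    (0xA0000, 0xBFFFF, "VGA video memory"),
    (0xC0000, 0xEFFFF, "Optional ROMs"),
    (0xF0000, 0xFFFFF, "BIOS ROM"),
]

def region_of(addr):
    """Return the description of the region containing addr, or None."""
    for start, end, name in MEMORY_MAP:
        if start <= addr <= end:
            return name
    return None  # above the first megabyte: not covered by this table

print(region_of(0xB8000))   # VGA video memory
print(region_of(0x100000))  # None
```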
 
== Supervisor mode ==
 
See [[supervisor mode]]
 
== Non-application registers ==
 
CR0, CR1, CR2, CR3, TR4, TR5, TR6, TR7, GDTR, LDTR, IDTR and TR.
 
== Interrupts in protected mode ==
 
Interrupts are much the same as in realmode, except that they are capable of performing more complicated switches. For example, an interrupt in protected mode can be programmed to switch automatically to a specific process or thread.
 
The Interrupt Descriptor Table (IDT) is located through the IDT register (IDTR), which is 48 bits wide and works just like the GDTR. (See above.)
 
== How to switch to protected mode ==
* Load GDTR with a pointer to the GDT.
* Load IDTR with a pointer to the IDT ''or'' disable interrupts.