Revision as of 23:53, 25 January 2004 by 213.101.123.22 (no edit summary) → Revision as of 20:31, 28 January 2004 by Dysprosia (edit summary: scrub up)
Line 1:

'''x86 assembly programming in protected mode''' uses 32-bit registers and 32-bit memory addressing, and enables further features such as protection and paging.

[[Protected mode]] is the mode in which most modern [[operating system]]s run their code. When the computer boots, it first enters [[real mode]]; the operating system is responsible for switching into protected mode.

== Application registers ==

In protected mode, there are eight 32-bit general-purpose registers:

* data registers
** EAX, the accumulator
** EBX, the base register
** ECX, the counter register
** EDX, the data register
* address registers
** ESI, the source register
** EDI, the destination register
** ESP, the stack pointer register
** EBP, the stack base pointer register

In addition there are non-application registers, which change the state of the processor:

* control registers
** CR0
** CR1 (reserved)
** CR2
** CR3
* test registers
** TR4
** TR5
** TR6
** TR7
* descriptor registers
** GDTR, the global descriptor table register (see below)
** LDTR, the local descriptor table register (see below)
** IDTR, the interrupt descriptor table register (see below)
* task register
** TR

All eight general-purpose registers can be used both for segmented addressing of memory and for holding data. Some of them are nevertheless better suited to certain operations than others, because instructions that use particular registers can be encoded as shorter [[opcode]]s than the same instructions using other registers.

The lower 16 bits of each 32-bit general-purpose register can be addressed separately, as a register in its own right (AX, BX, CX, DX, SI, DI, SP and BP). For the four data registers, the 16-bit register can in turn be split into two 8-bit registers: the 16 bits can be addressed 8 bits at a time, as the upper eight and the lower eight bits, and each half can be treated as a register in its own right.

Take the EAX register: it contains 32 bits, and its lower 16 bits can be addressed as the AX register. The upper 8 bits of AX can be addressed as the AH register and the lower 8 bits of AX as the AL register.
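The aliasing between a 32-bit register and its 16-bit and 8-bit parts can be modelled with masks and shifts. The following is a minimal C sketch, not processor code: the function names are hypothetical, and a plain <tt>uint32_t</tt> stands in for a register such as EAX or ECX.

```c
#include <stdint.h>

/* Illustrative model only: a 32-bit value stands in for a register such as
   EAX. All function names here are hypothetical. */

/* Bits 0-15 of the register (the AX view of EAX). */
uint16_t low16(uint32_t reg) {
    return (uint16_t)(reg & 0xFFFFu);
}

/* Bits 8-15 of the register (the AH view of EAX). */
uint8_t high8(uint32_t reg) {
    return (uint8_t)((reg >> 8) & 0xFFu);
}

/* Bits 0-7 of the register (the AL view of EAX). */
uint8_t low8(uint32_t reg) {
    return (uint8_t)(reg & 0xFFu);
}

/* Writing the high 8-bit part replaces bits 8-15 and leaves all other
   bits of the 32-bit register unchanged. */
uint32_t with_high8(uint32_t reg, uint8_t value) {
    return (reg & 0xFFFF00FFu) | ((uint32_t)value << 8);
}

/* Writing the low 8-bit part replaces bits 0-7 only. */
uint32_t with_low8(uint32_t reg, uint8_t value) {
    return (reg & 0xFFFFFF00u) | (uint32_t)value;
}
```

With a register value of 0x12345678, <tt>low16</tt> gives 0x5678, <tt>high8</tt> gives 0x56 and <tt>low8</tt> gives 0x78. Writing 0x44 into the high 8-bit part of 0x3A3F901D with <tt>with_high8</tt> gives 0x3A3F441D.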
For example, if ECX initially contains the number 0x3A3F901D and CH changes to 0x44, then ECX will also change to contain 0x3A3F441D.

There is also a 32-bit wide [[Flags (computing)|flags register]], named EFLAGS, which contains the processor state. Each flag is one bit, and is thus either 1 (also called set or high) or 0 (also called unset or low). Important flags in the EFLAGS register are: carry (bit 0), zero (bit 6), sign (bit 7) and overflow (bit 11).

Flags are notably used in the x86 architecture for comparisons. A comparison of two registers, for example, computes their difference and sets the flags according to the result. A conditional jump instruction then checks the relevant flag and jumps depending on its state. For example,

 cmp ax, bx
 jne do_something

first compares the AX and BX registers, and if they are unequal, the code branches off to the do_something label.

There is also a 32-bit [[instruction pointer]], named EIP.
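Returning to the flags: the comparison-and-jump sequence described above can be modelled in C. This is only a sketch under stated assumptions. The function and macro names are hypothetical, only four flags are modelled, and the bit positions used are the architectural ones (carry 0, zero 6, sign 7, overflow 11).

```c
#include <stdint.h>

/* Bit masks for the four flags modelled here (hypothetical macro names). */
#define FLAG_CARRY    (1u << 0)
#define FLAG_ZERO     (1u << 6)
#define FLAG_SIGN     (1u << 7)
#define FLAG_OVERFLOW (1u << 11)

/* Model of a 16-bit "cmp a, b": compute a - b, discard the result, and
   return the flag bits the subtraction would produce. */
uint32_t cmp16_flags(uint16_t a, uint16_t b) {
    uint16_t diff = (uint16_t)(a - b);
    uint32_t flags = 0;

    if (a < b) {
        flags |= FLAG_CARRY;      /* unsigned borrow */
    }
    if (diff == 0) {
        flags |= FLAG_ZERO;       /* operands were equal */
    }
    if (diff & 0x8000u) {
        flags |= FLAG_SIGN;       /* top bit of the result is set */
    }
    /* Signed overflow: the operands have different signs and the result's
       sign differs from the sign of the first operand. */
    if (((a ^ b) & (a ^ diff)) & 0x8000u) {
        flags |= FLAG_OVERFLOW;
    }
    return flags;
}

/* Model of "jne": the jump is taken when the zero flag is clear. */
int jne_taken(uint32_t flags) {
    return (flags & FLAG_ZERO) == 0;
}
```

In this model, comparing two equal values sets only the zero flag, so <tt>jne_taken</tt> returns 0 and execution would fall through; comparing 5 with 6 leaves the zero flag clear, so the jump would be taken.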
The EIP register points to the location in the program where the processor is currently executing its code. The EIP register cannot be accessed by the programmer directly.

== Mnemonics for opcodes ==

In protected mode, the following mnemonics are available:

aaa, aad, aam, aas, adc, add, and, arpl, bound, bsf, bsr, bt, btc, btr, bts, call, cbw, cwde, clc, cld, cli, clts, cmc, cmp, cmps, cmpsb, cmpsw, cmpsd, cwd, cdq, daa, das, dec, div, enter, hlt, idiv, imul, in, inc, ins, insb, insw, insd, int, into, iret, iretd, ja, jae, jb, jbe, jc, jcxz, jecxz, je, jz, jg, jge, jl, jle, jmp, jna, jnae, jnb, jnbe, jnc, jne, jng, jnge, jnl, jnle, jno, jnp, jns, jnz, jo, jp, jpe, jpo, js, lahf, lar, lea, leave, lgdt, lidt, lgs, lss, lds, les, lfs, lldt, lmsw, lock, lods, lodsb, lodsw, lodsd, loop, loope, loopz, loopne, loopnz, lsl, ltr, mov, movsx, movzx, mul, neg, nop, not, or, out, outs, outsb, outsw, outsd, pop, popa, popad, popf, popfd, push, pusha, pushad, pushf, pushfd, rcl, rcr, rol, ror, rep, repe, repz, repne, repnz, ret, sahf, sal, sar, shl, shr, sbb, scas, scasb, scasw, scasd, seta, setae, setb, setbe, setc, sete, setg, setge, setl, setle, setna, setnae, setnb, setnbe, setnc, setne, setng, setnl, setnle, setno, setnp, setpe, setpo, sets, setz, sgdt, sidt, shld, shrd, sldt, smsw, stc, std, sti, stos, stosb, stosw, stosd, str, sub, test, verr, verw, wait, xchg, xlat, xlatb, xor.

(not including the [[floating point]], [[single instruction multiple data|SIMD]] and some other instructions.)

There are also some undocumented instructions, such as umov, which could be used for [[in circuit emulator]]s. (umov stands for "user move"; knowing about that instruction makes it much easier to write certain types of software debuggers.)

== The addressing model in protected mode ==

It is important to distinguish between the different kinds of address in protected mode. There are ''physical addresses'', ''linear addresses'' and ''logical addresses''.

A logical address is a segment register paired with an offset. However, only the offset matters in practice, because nearly all operating systems use ''flat addressing'' (see below). In other words, a logical address is a pointer inside a program. (Example: in C, a pointer declared as <tt>*pointer</tt> holds a logical address.)

A linear address is a logical address that has gone through the descriptor mechanism (see ''Descriptors'' below).

Line 56 ⟶ 70:

=== Descriptors ===

There is a ''Global Descriptor Table'' (GDT) and a ''Local Descriptor Table'' (LDT), which hold information about how memory is laid out and how it behaves.
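The role a descriptor plays in turning a logical address into a linear address can be sketched in C. This is a simplified model, not the real descriptor format: the struct and function names are hypothetical, a descriptor is reduced to a base and a limit, and the limit is assumed to be the highest valid byte offset. Flat addressing is modelled as a segment with base 0 and the largest possible limit, so that the linear address equals the offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a segment descriptor: real descriptors
   carry more fields (type, privilege level, granularity and so on). */
struct segment_model {
    uint32_t base;   /* linear address at which the segment starts */
    uint32_t limit;  /* highest valid offset within the segment */
};

/* Translate an offset within a segment into a linear address.
   Returns false when the offset lies beyond the segment limit, which on
   real hardware would raise a protection fault instead. */
bool logical_to_linear(const struct segment_model *seg,
                       uint32_t offset, uint32_t *linear) {
    if (offset > seg->limit) {
        return false;
    }
    *linear = seg->base + offset;
    return true;
}
```

With a flat segment (base 0, limit 0xFFFFFFFF) every offset maps to the same linear address, which is why only the offset matters under flat addressing. With a base of 0x00100000, the offset 0x1234 maps to the linear address 0x00101234.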
The GDT is pointed to by the GDT register (GDTR) and the LDT is located through the LDT register (LDTR). The GDTR value is 48 bits wide and contains two fields: a 32-bit pointer to the beginning of the table (base), and a 16-bit field that describes how large the table is in bytes (limit; the stored value is the table size minus one). The LDTR is loaded with a selector for a GDT entry that supplies the base and limit of the LDT.

Line 80 ⟶ 93:

== Memory layout for PC-computers in protected mode ==

The memory layout for computers in protected mode is similar to that of real mode. Alas, some PC-computers have the 15th megabyte occupied by the video card.

 0-3FF Application [[RAM]]

Line 96 ⟶ 108:

See [[supervisor mode]]

== Interrupts in protected mode ==

32-bit x86 assembly programming: Difference between revisions