In the lists in this section, instructions marked with an asterisk (*) are particularly important. Instructions not so marked are not critical.
On the x86 processor, instructions are variable-sized, so disassembling backward is an exercise in pattern matching. To disassemble backward from an address, you should start disassembling at a point further back than you really want to go, then look forward until the instructions start making sense. The first few instructions may not make any sense because you may have started disassembling in the middle of an instruction. There is a possibility, unfortunately, that the disassembly will never synchronize with the instruction stream and you will have to try disassembling at a different starting point until you find a starting point that works.
For well-packed switch statements, the compiler emits data directly into the code stream, so disassembling through a switch statement will usually stumble across instructions that make no sense (because they are really data). Find the end of the data and continue disassembling there.
Instruction Notation
The general notation for instructions is to put the destination register on the left and the source on the right. However, there can be some exceptions to this rule.
Arithmetic instructions typically take two registers: the source and destination are combined, and the result is stored in the destination.
Some of the instructions have both 16-bit and 32-bit versions, but only the 32-bit versions are listed here. Not listed here are floating-point instructions, privileged instructions, and instructions that are used only in segmented models (which Microsoft Win32 does not use).
To save space, many of the instructions are expressed in combined form. In one such form, the first parameter must be a register, but the second can be a register, a memory reference, or an immediate value.
To save even more space, instructions can also be expressed in a form in which the first parameter can be a register or a memory reference, and the second can be a register, memory reference, or immediate value.
Unless otherwise noted, when this abbreviation is used, you cannot choose memory for both source and destination.
Furthermore, a bit-size suffix (8, 16, 32) can be appended to the source or destination to indicate that the parameter must be of that size. For example, r8 means an 8-bit register.
Memory, Data Transfer, and Data Conversion
Memory and data transfer instructions do not affect flags.
Effective Address
For example, LEA eax, [esi+4] means eax = esi + 4. This instruction is often used to perform arithmetic.
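As a rough illustration of why LEA is convenient for arithmetic, the following Python sketch models the address computation base + index*scale + displacement with 32-bit wraparound. The helper name `lea` and its parameters are hypothetical and chosen only for this example. Like the real instruction, the model computes an address and never reads memory.

```python
MASK32 = 0xFFFFFFFF

def lea(base=0, index=0, scale=1, disp=0):
    """Model LEA: compute base + index*scale + disp, truncated to 32 bits.

    Hypothetical helper for illustration; it only computes the address
    and never dereferences it.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32

esi = 0x1000
eax = lea(base=esi, disp=4)              # LEA eax, [esi+4]
times5 = lea(base=7, index=7, scale=4)   # LEA eax, [eax+eax*4] multiplies eax by 5
```

The second call shows the common trick of using a scaled index to multiply by a small constant without a separate multiply instruction.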
Data Transfer
MOVSX and MOVZX are special versions of the mov instruction that perform sign extension or zero extension from the source to the destination. These are the only instructions that allow the source and destination to be different sizes. (In fact, they must be different sizes.)
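The difference between the two extensions can be sketched in Python as follows. The function names mirror the instructions, but the signatures (explicit bit widths as arguments) are an assumption made for this illustration.

```python
def movzx(value, src_bits):
    """Zero-extend: keep the low src_bits; upper destination bits become 0."""
    return value & ((1 << src_bits) - 1)

def movsx(value, src_bits, dst_bits=32):
    """Sign-extend: copy the source's top bit into the upper destination bits."""
    value &= (1 << src_bits) - 1
    if value & (1 << (src_bits - 1)):   # source sign bit is set
        value -= 1 << src_bits          # reinterpret as a negative number
    return value & ((1 << dst_bits) - 1)

zero_extended = movzx(0x80, 8)   # 8-bit 0x80 treated as unsigned 128
sign_extended = movsx(0x80, 8)   # 8-bit 0x80 treated as signed -128
```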
Stack Manipulation
The stack is pointed to by the esp register. The value at esp is the top of the stack (most recently pushed, first to be popped); older stack elements reside at higher addresses.
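A toy model may make the direction of growth easier to picture. In this sketch, memory is a Python dict keyed by address, and the class name `Stack` and the starting esp value are arbitrary assumptions.

```python
class Stack:
    """Toy model of a 32-bit x86 stack; memory is a dict of address -> dword."""

    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4                        # stack grows toward lower addresses
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.mem[self.esp]           # the value at esp is the top of the stack
        self.esp += 4
        return value

s = Stack()
s.push(1)   # older element, ends up at the higher address
s.push(2)   # newest element, sits at esp
```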
The C/C++ compiler does not use the enter instruction. (The enter instruction is used to implement nested procedures in languages like Algol or Pascal.)
The leave instruction is equivalent to:
mov esp, ebp
pop ebp
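The two-instruction expansion above can be traced with a small Python model. The register dict, the memory dict, and the example addresses are all hypothetical values picked for illustration.

```python
def leave(regs, mem):
    """Model LEAVE as: mov esp, ebp; pop ebp. Hypothetical helper."""
    regs["esp"] = regs["ebp"]          # mov esp, ebp: discard the locals
    regs["ebp"] = mem[regs["esp"]]     # pop ebp: restore the caller's frame pointer
    regs["esp"] += 4

# A frame whose saved ebp (0x2000) lives at 0x0FF0, with 16 bytes of locals below it.
mem = {0x0FF0: 0x2000}
regs = {"ebp": 0x0FF0, "esp": 0x0FE0}
leave(regs, mem)
```

After the call, esp points just above the saved frame pointer, which is where the return address would sit in a conventional frame.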
Data Conversion
All conversions perform sign extension.
Arithmetic and Bit Manipulation
All arithmetic and bit manipulation instructions modify flags.
Arithmetic
Unsigned and signed division. The first register in the pseudocode explanation receives the remainder and the second receives the quotient. If the result overflows the destination, a division overflow exception is generated.
The state of flags after division is undefined.
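As a sketch of the unsigned 32-bit case only, the following Python function models a divide whose 64-bit dividend is held in edx:eax, returning the remainder first and the quotient second, in the order described above. The function name `div32` and the use of Python exceptions to stand in for the processor's divide exception are assumptions of this example. Signed division is not modeled.

```python
def div32(edx, eax, divisor):
    """Model unsigned 32-bit DIV: returns (edx, eax) = (remainder, quotient).

    ZeroDivisionError and OverflowError stand in for the CPU's divide
    exception; they are illustrative, not how a debugger reports it.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    dividend = (edx << 32) | eax             # 64-bit dividend in edx:eax
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:                # quotient must fit in eax
        raise OverflowError("quotient does not fit in eax")
    return remainder, quotient

small = div32(0, 17, 5)     # 17 / 5
wide = div32(1, 0, 2)       # 0x1_0000_0000 / 2
```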
If the condition cc is true, then the 8-bit value is set to 1. Otherwise, the 8-bit value is set to zero.
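A minimal Python sketch of this behavior follows. The helper `cmp_flags` is hypothetical and models only the zero and carry flags as a 32-bit unsigned compare would leave them. SETE and SETB are used as two representative conditions.

```python
def setcc(condition):
    """Model SETcc: produce the 8-bit value 1 if the condition holds, else 0."""
    return 1 if condition else 0

def cmp_flags(a, b):
    """Hypothetical helper: ZF and CF as a 32-bit unsigned CMP a, b would set them."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    return {"ZF": a == b, "CF": a < b}

flags = cmp_flags(3, 7)
al_sete = setcc(flags["ZF"])   # SETE: set if equal (ZF = 1)
al_setb = setcc(flags["CF"])   # SETB: set if below (CF = 1)
```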
Binary-coded Decimal
You will not see these instructions unless you are debugging code written in COBOL.
These instructions adjust the al and ah registers after performing an unpacked binary-coded decimal operation.
These instructions are remnants of the x86's CISC heritage and in recent processors are actually slower than the equivalent instructions written out the long way.
String Manipulation
After performing the operation, the source and destination registers are incremented or decremented by sizeof(T), according to the setting of the direction flag (up or down).
The instruction can be prefixed by REP to repeat the operation the number of times specified by the ecx register.
The rep movs instruction is used to copy blocks of memory.
The rep stos instruction is used to fill a block of memory with accT.
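The repeat-and-step behavior can be sketched in Python for the byte-sized case (T = byte). The function names, the use of a bytearray as memory, and the `df` parameter for the direction flag are assumptions made for this illustration.

```python
def rep_movsb(mem, esi, edi, ecx, df=0):
    """Model REP MOVSB on a bytearray: copy ecx bytes from esi to edi.

    df=0 steps upward, df=1 steps downward. Returns final (esi, edi, ecx).
    """
    step = -1 if df else 1
    while ecx != 0:
        mem[edi] = mem[esi]
        esi += step
        edi += step
        ecx -= 1
    return esi, edi, ecx

def rep_stosb(mem, edi, ecx, al, df=0):
    """Model REP STOSB: store al into ecx consecutive bytes starting at edi."""
    step = -1 if df else 1
    while ecx != 0:
        mem[edi] = al & 0xFF
        edi += step
        ecx -= 1
    return edi, ecx

mem = bytearray(b"hello" + bytes(11))                   # 16 bytes of toy memory
regs_after_copy = rep_movsb(mem, esi=0, edi=8, ecx=5)   # copy "hello" to offset 8
rep_stosb(mem, edi=13, ecx=3, al=0x2A)                  # fill 3 bytes with 0x2A
```

Returning the final register values makes it visible that ecx counts down to zero while esi and edi advance past the block.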
Flags
The cmpxchg instruction is the atomic version of the following:
cmp accT, r/m
jz match
mov accT, r/m
jmp done
match:
mov r/m, r1
done:
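The pseudocode above translates almost line for line into the following Python model. It is deliberately not atomic, and the function name, the dict used as memory, and the returned (accumulator, zero flag) pair are assumptions of this sketch.

```python
def cmpxchg(mem, addr, acc, r1):
    """Non-atomic model of cmpxchg following the pseudocode above.

    Returns (new_acc, zf). The real instruction performs the compare
    and the conditional store as a single step.
    """
    if mem[addr] == acc:         # cmp accT, r/m ; jz match
        mem[addr] = r1           # match: mov r/m, r1
        return acc, True
    return mem[addr], False      # mov accT, r/m

mem = {0x100: 5}
first = cmpxchg(mem, 0x100, acc=5, r1=9)    # values match, so memory becomes 9
second = cmpxchg(mem, 0x100, acc=5, r1=7)   # no match, so the accumulator receives 9
```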
Miscellaneous
The opcode for INT 3 is 0xCC. The opcode for NOP is 0x90.
When debugging code, you may need to patch out some code. You can do this by replacing the offending bytes with 0x90.
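A small Python sketch of the patching idea, operating on a byte buffer rather than on a live process, might look like this. The helper name `patch_out` and the sample bytes are hypothetical. The caller is assumed to pass a range that covers whole instructions, since the helper does no decoding.

```python
NOP = 0x90

def patch_out(code, offset, length):
    """Return a copy of code with length bytes at offset replaced by NOPs.

    Assumes the range covers whole instructions; nothing is decoded here.
    """
    if offset < 0 or length < 0 or offset + length > len(code):
        raise ValueError("patch range outside code buffer")
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)

# Sample bytes: xor eax, eax (33 C0); int 3 (CC); ret (C3)
original = bytes([0x33, 0xC0, 0xCC, 0xC3])
patched = patch_out(original, 2, 1)   # replace the int 3 with a NOP
```

Because NOP is a single byte, the patched buffer keeps its length, so the addresses of the surrounding instructions do not move.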
Idioms