jmp destination
According to Kip Irvine's "Assembly Language for x86 Processors", when the CPU executes an unconditional transfer, the offset of destination is moved into the instruction pointer.
Could someone explain this? I thought the address we want to jump to must be moved into the instruction pointer.
4.5.1 JMP Instruction
The JMP instruction causes an unconditional transfer to a destination, identified by a code label that is translated by the assembler into an offset. The syntax is
JMP destination
When the CPU executes an unconditional transfer, the offset of destination is moved into the instruction pointer, causing execution to continue at the new location.
Your confusion is understandable; this is poorly explained.
First of all, if an instruction says jmp destination, then it will set the instruction pointer equal to destination. You're right about that.
But the instruction behavior is being confused with the instruction encoding.
Instructions of the form jmp address are encoded using relative offsets in x86. The offsets are relative to the address immediately following the jmp instruction.
This can be encoded either as an EB followed by a signed byte offset or an E9 followed by a signed dword offset. (Integers are little-endian in x86.)
For example,
00010000: EB 01 CC 90
Disassembles to
loc_10000:
jmp loc_10003 ; EB 01
int3 ; CC
loc_10003:
nop ; 90
00010000: E9 01 00 00 00 CC 90
Disassembles to
loc_10000:
jmp loc_10006 ; E9 01 00 00 00
int3 ; CC
loc_10006:
nop ; 90
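The arithmetic behind these two disassemblies can be sketched in a few lines of Python. This is only an illustration, not a real disassembler: the function name decode_jmp is made up here, and the E9 branch assumes a 32-bit operand size (in 16-bit code, E9 takes a 16-bit offset instead).

```python
import struct

def decode_jmp(code: bytes, address: int):
    """Return (length, target) for a direct relative jmp located at `address`.

    Hypothetical helper for illustration; handles only EB rel8 and E9 rel32.
    """
    opcode = code[0]
    if opcode == 0xEB:                       # jmp rel8: opcode + 1 signed byte
        length = 2
        (offset,) = struct.unpack_from("<b", code, 1)
    elif opcode == 0xE9:                     # jmp rel32: opcode + 4 signed bytes (little-endian)
        length = 5
        (offset,) = struct.unpack_from("<i", code, 1)
    else:
        raise ValueError(f"not a direct relative jmp: {opcode:#04x}")
    # The offset is relative to the address immediately following the jmp.
    return length, address + length + offset

print(hex(decode_jmp(bytes.fromhex("EB01CC90"), 0x10000)[1]))        # 0x10003
print(hex(decode_jmp(bytes.fromhex("E901000000CC90"), 0x10000)[1]))  # 0x10006
```

In both cases the CPU ends up with the absolute target in the instruction pointer; only the encoding stores it as a distance.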
Note that this means instructions written the same way may have different encodings when located at different addresses. For example,
00010000: EB 02 EB 00 CC EB FD EB FB
Disassembles to
loc_10000:
jmp loc_10004 ; EB 02
jmp loc_10004 ; EB 00
loc_10004:
int3 ; CC
jmp loc_10004 ; EB FD (FD == -3)
jmp loc_10004 ; EB FB (FB == -5)
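Going the other way, here is a rough sketch of what an assembler has to do for each of those four jumps. The name encode_jmp and the "prefer the short form when it fits" policy are assumptions for illustration; real assemblers have their own rules for choosing between EB and E9.

```python
import struct

def encode_jmp(address: int, target: int) -> bytes:
    """Encode `jmp target` for an instruction placed at `address` (hypothetical helper)."""
    rel8 = target - (address + 2)            # the short form is 2 bytes long
    if -128 <= rel8 <= 127:
        return b"\xEB" + struct.pack("<b", rel8)
    rel32 = target - (address + 5)           # the near form is 5 bytes long
    return b"\xE9" + struct.pack("<i", rel32)

# Same source text "jmp loc_10004", four different locations:
for address in (0x10000, 0x10002, 0x10005, 0x10007):
    print(hex(address), encode_jmp(address, 0x10004).hex())
# 0x10000 eb02
# 0x10002 eb00
# 0x10005 ebfd
# 0x10007 ebfb
```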
Side note: There are several different forms of the jmp instruction, but the form you are asking about (a direct jump to a label) can only be encoded with a relative offset.
Anyway, what the author is saying is that, for an assembler to generate machine code for an instruction like jmp destination, it must convert destination to a byte offset relative to the end of the jmp instruction. Most of the time, you don't need to worry about this process, however. You can just define a label in your assembly and write jmp my_label, and the assembler will take care of everything for you.
address: bytes: comment:
0x0004 01 20 00 ; jmp destination ; here ip = 0x0004
0x0007 ?? repeated 0x19 times
destination:
0x0020 02 ; hlt ; here ip = 0x0020
assembled from this source:
.code
org 0x0004
jmp destination
org 0x0020
destination:
hlt
So the symbol destination here means the absolute address 0x0020 in section .code (to which I won't give any special meaning, but you can imagine a construction as complex as you wish; for example, see segment registers in the 16-bit mode of x86).
Then, if the instruction with opcode 0x01 (jmp) is "near", only the offset part of that absolute address is used, which is 0x0020 in this simple fake example.
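To make the fake example concrete, here is a tiny Python sketch of that made-up CPU. Everything in it is an assumption taken from the listing above, not a real instruction set: opcode 0x01 is a "near" jmp followed by a 16-bit offset (read as little-endian, since the bytes are 20 00), and opcode 0x02 is hlt.

```python
def run(memory: bytes, ip: int) -> list:
    """Trace the ip values visited by the made-up CPU until it reaches hlt."""
    trace = []
    while True:
        trace.append(ip)
        opcode = memory[ip]
        if opcode == 0x01:                   # fake "near" jmp: the offset goes straight into ip
            ip = memory[ip + 1] | (memory[ip + 2] << 8)   # little-endian assumed
        elif opcode == 0x02:                 # fake hlt: stop and report the trace
            return trace
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at {ip:#06x}")

memory = bytearray(0x21)                            # unused bytes stay 0 in this sketch
memory[0x0004:0x0007] = bytes([0x01, 0x20, 0x00])   # jmp destination
memory[0x0020] = 0x02                               # destination: hlt
print([hex(ip) for ip in run(bytes(memory), 0x0004)])  # ['0x4', '0x20']
```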
You can still have other variants of jmp on your CPU, such as a "relative" 0x03 jmp rel8, able to jump -128..+127 bytes from the current ip, or a "far" 0x04 jmp bank/segment:offset, which would set not only ip but also some banking/segment mechanism.
So the word "offset" dates back to the era of segment:offset addressing, where the full instruction pointer on x86 is cs:ip, not just ip (cs = code segment).
In a modern 32- or 64-bit x86 OS you usually don't have to touch cs; you work only with offsets inside a flat 32/64-bit virtual address space, so "address" means the same thing as "offset of address".
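For the segmented case, a short sketch of how 16-bit real mode combines cs and ip into one address may help. The function name and the sample values are made up for illustration, and the 20-bit wrap models the original 8086; later CPUs can behave differently at the top of the range.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address of segment:offset in 16-bit real mode (hypothetical helper)."""
    return ((segment << 4) + offset) & 0xFFFFF   # segment * 16 + offset, wrapped to 20 bits

print(hex(real_mode_linear(0x1000, 0x0020)))   # 0x10020
print(hex(real_mode_linear(0x1002, 0x0000)))   # 0x10020, the same byte via another pair
```

A near jmp changes only the offset half of the pair, which is why the book talks about "the offset of destination"; with a flat memory model the segment part drops out and the offset is the whole address.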