ASM_(x86-64)::Instructions_and_Registers
x86-64 refers to the 64-bit version of the x86 CPU architecture implemented by many CPU manufacturers. Assembly programs consist of an arrangement of instructions that will be executed by the CPU to interact with memory and registers.
What are instructions?
Instructions are the lowest-level unit of execution that the CPU interprets and executes. There is a wide variety of possible instructions, and it would be impractical to cover all of them.
The CPU executes instructions from lower addresses to higher addresses, with a special register, RIP, that holds the address of the next instruction to execute. This may sound confusing, but it is actually very similar to how humans read recipes: from the start of the recipe (low address) to the end of the recipe (high address), doing one step (instruction) at a time.
Let's take a look at an assembly code example from the previous lesson.
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
mov edx, DWORD PTR [rbp-4]
mov eax, DWORD PTR [rbp-8]
add eax, edx
pop rbp
ret
When the CPU executes this code, it starts from the first instruction, push rbp, and keeps executing one instruction at a time until it eventually reaches the last instruction, ret. As each instruction gets executed, the special RIP register is updated to point to the next instruction. If this sounds confusing, it will be explained shortly.
Baby's first instruction
While you may have been introduced to a variety of instructions in the past few examples, the first instruction we will explicitly go through is nop!
The nop instruction serves an important function: it does nothing!
Here is a sample assembly code snippet showcasing what nop does:
nop
nop
nop
After this code runs, nothing happens (except for RIP increasing).
What are registers?
Before we continue introducing more instructions, it will be important to go through the concept of registers. In traditional programming, you should have learnt about the concept of variables. Variables are temporary constructs that store information for intermediate calculations and operations. Registers behave very similarly, but only store certain data types, and they store information in the CPU itself!
In x86-64 CPUs, there are a few groups of registers: general-purpose, segment, and EFLAGS registers. On 64-bit systems like x86-64, most registers store and represent values using 64 bits of storage.
Here is a table of the register names.
Register Type | Register Names |
---|---|
General-purpose | RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15 |
Segment | ES, CS, SS, DS, FS, GS |
*EFLAGS | CF, PF, AF, ZF, SF ... |
*The EFLAGS "registers" listed here are individual bit flags within a single register.
General-purpose registers are used to store intermediate values, perform simple integer calculations, and hold memory pointers. The RSP register has some unique properties that will be discussed later, and generally isn't used the same way as the other general-purpose registers.
Segment registers are a more advanced concept that will not be discussed for now. Most simple assembly programs will not need to deal with these.
EFLAGS is a special register used for control-flow that will be discussed in a future lesson. Generally, this register is not directly operated on, but rather is modified as a side effect of instructions.
Using registers
Let's dive right in and start looking at some assembly code! The next instruction we'll cover is mov.
The syntax for mov is as follows:
mov <dst>, <src>
The mov instruction is quite straightforward: it copies the value from the <src> operand into the <dst> operand, leaving <src> unchanged. For now, the <dst> operand will be the name of the register you intend to write to, and the <src> operand can be either the name of the register you intend to read from or an immediate numerical value.
mov rax, 123
result:
rax: 123
You can also move values between registers.
mov rax, 123
mov rbx, rax
result:
rax: 123
rbx: 123
Sub-registers
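Each 64-bit general-purpose register can also be accessed in smaller pieces under different names, which is why the first example in this lesson uses eax, edx, edi and esi rather than rax, rdx, rdi and rsi. The following is a minimal sketch using rax as the example register. The value loaded is arbitrary, and the listing assumes that ; starts a comment, as in NASM-style assemblers.

```asm
mov rax, 0x1122334455667788   ; rax = 0x1122334455667788 (all 64 bits)
                              ; eax = 0x55667788         (lower 32 bits of rax)
                              ;  ax = 0x7788             (lower 16 bits of rax)
                              ;  al = 0x88               (lower 8 bits of rax)

mov al, 0xFF                  ; writing al changes only the lowest 8 bits
                              ; rax = 0x11223344556677FF

mov eax, 1                    ; writing a 32-bit sub-register clears the upper 32 bits
                              ; rax = 0x0000000000000001
```

The same naming pattern applies to the other general-purpose registers, for example rbx/ebx/bx/bl and r8/r8d/r8w/r8b.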
Instruction Pointer (RIP)
As mentioned earlier, there is a special register called RIP that acts as a pointer to the next instruction to be executed. It acts as a sort of cursor that tracks the program's current position in the code. The value of this register is never written to directly. Rather, as the program progresses, the RIP register is updated as a side effect of each instruction. Therefore, you will not see instructions that mov values into rip, for example.
We can observe this effect by stepping through assembly code one instruction at a time, for example in a debugger, and taking note of the value of the RIP register after each step.
In general, the side effect of each instruction is to advance RIP by the length of that instruction, so that it refers to the instruction immediately after it. Intuitively, you can think of it as a teacher pointing to the next line in a storybook every time the current line has been read. In later lessons, we will cover other instructions that may modify RIP in a very different manner. But in general, it is a safe guess that RIP will automatically point to the neighbouring instruction at every step.
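To make the idea of RIP moving to the neighbouring instruction more concrete, the listing below annotates the function from the start of this lesson with addresses. The starting address 0x401000 is made up for illustration, and the byte encodings shown are the ones an assembler typically emits for these instructions; the real values depend on the assembler and on where the code is loaded. The listing assumes that ; starts a comment, as in NASM-style assemblers.

```asm
push rbp                      ; at 0x401000, 1 byte  (55)       -> RIP = 0x401001
mov rbp, rsp                  ; at 0x401001, 3 bytes (48 89 e5) -> RIP = 0x401004
mov DWORD PTR [rbp-4], edi    ; at 0x401004, 3 bytes (89 7d fc) -> RIP = 0x401007
mov DWORD PTR [rbp-8], esi    ; at 0x401007, 3 bytes (89 75 f8) -> RIP = 0x40100a
mov edx, DWORD PTR [rbp-4]    ; at 0x40100a, 3 bytes (8b 55 fc) -> RIP = 0x40100d
mov eax, DWORD PTR [rbp-8]    ; at 0x40100d, 3 bytes (8b 45 f8) -> RIP = 0x401010
add eax, edx                  ; at 0x401010, 2 bytes (01 d0)    -> RIP = 0x401012
pop rbp                       ; at 0x401012, 1 byte  (5d)       -> RIP = 0x401013
ret                           ; at 0x401013, 1 byte  (c3)       -> RIP = the caller's return address
```

Notice that RIP does not grow by a fixed amount: x86-64 instructions vary in length, so each step advances RIP by the size of the instruction that was just executed. The final ret is an early glimpse of an instruction that sets RIP in a different manner.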
Quiz
Which register is not a general-purpose register?
Consider the following assembly code.
mov rax, 1
mov rbx, 2
mov rcx, rax
mov rcx, rbx
What will be the value of rcx after the code has completed running?