ASM_(x86-64)::Introduction
In this module, we introduce the x86-64 assembly language. If you've programmed before, you will know programming languages like Python, C++ and Java. But what are assembly languages?
Assembly Languages / Instructions
Assembly languages are low-level programming languages that map almost directly to the machine code instructions understood by computers. These instructions are what actually run our programs under the hood. The higher-level languages mentioned earlier abstract these instructions away from us, which is why we use them for most real-world programming.
At this point, the usefulness of assembly language and the concept of abstraction may seem confusing, so we've come up with an analogy to get the idea across.
Analogy
We can think of our computer, or more specifically the CPU in our computer, as Bob, the dumb chef.
Bob is not the brightest chef around. He can only understand simple instructions given by his superiors:
fill <bowl>, <ingredient/bowl>
stir <bowl>
chop <bowl>
boil <bowl>, <time>
To prepare a simple dish like fried rice, Bob may need the following recipe.
; cook rice
fill A, rice
fill A, water
boil A, 18 minutes
stir A
; ingredients
fill B, onions
fill B, carrots
chop B
fill B, peas
; mix to fry
fill B, A
Now you may notice from this example that, to prepare the ingredients for a simple recipe like fried rice, we've already had to provide many explicit instructions for Bob to follow.
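To make the analogy a little more concrete, here is a minimal sketch in C of Bob as a loop that carries out one simple instruction at a time. The names (`Op`, `Instr`, `Bowl`, `bob_execute`) are our own illustrative assumptions, and the model is deliberately crude: a fill only counts how many things went into a bowl, not which ingredient it was.

```c
#include <stddef.h>

/* The only operations Bob understands (names are illustrative). */
typedef enum { OP_FILL, OP_STIR, OP_CHOP, OP_BOIL } Op;

/* One line of a recipe: what to do, to which bowl, and for how long. */
typedef struct {
    Op  op;
    int bowl;     /* 0 = bowl A, 1 = bowl B */
    int minutes;  /* only used by OP_BOIL */
} Instr;

/* A deliberately crude model of what ends up in a bowl. */
typedef struct {
    int items;           /* number of "fill" steps applied */
    int stirred;         /* number of "stir" steps applied */
    int chopped;         /* number of "chop" steps applied */
    int boiled_minutes;  /* total boiling time */
} Bowl;

/* Bob executes the recipe strictly in order, one instruction at a time,
   and returns how many instructions he carried out. */
size_t bob_execute(const Instr *recipe, size_t count, Bowl *bowls) {
    size_t done = 0;
    for (size_t i = 0; i < count; i++) {
        Bowl *b = &bowls[recipe[i].bowl];
        switch (recipe[i].op) {
        case OP_FILL: b->items++;                             break;
        case OP_STIR: b->stirred++;                           break;
        case OP_CHOP: b->chopped++;                           break;
        case OP_BOIL: b->boiled_minutes += recipe[i].minutes; break;
        }
        done++;
    }
    return done;
}
```

Feeding the "cook rice" steps from the recipe (two fills, an 18-minute boil and a stir, all on bowl A) through `bob_execute` leaves bowl A with two items, 18 minutes of boiling and one stir. Bob never decides anything himself; he only follows the list.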
What if we needed to prepare much more complex recipes? It would be such a hassle to write out the recipe for Bob for anything more complex. That is where Alice comes in!
Compared to Bob, Alice is much more intelligent, and can understand high-level, complex instructions provided by her superiors. Furthermore, she is able to compile these complex instructions into a recipe containing only the simple instructions that Bob is able to understand and execute. It is this behaviour that makes us call Alice the compiler.
Here is the fried rice example again, but now with Alice involved.
Recipe given to Alice:
1. Boil rice for 18 minutes
2. Chop onions and carrots
3. Mix the onions, carrots and peas with the rice
Now Alice can compile these high-level instructions into the verbose recipe which we will provide to Bob to execute!
As you can see, making use of Alice the compiler allows us to abstract away the low-level instructions used by Bob the CPU, saving us effort and time.
Similarly, although we are used to programming in higher-level languages like C, C++ and Go, our programs are compiled by the corresponding compilers (gcc, g++, go) into machine code (x86-64!) before our CPU can actually run them.
Here's how it looks in the world of C and x86-64 instead of food: a small C function, followed by the x86-64 assembly a compiler may produce for it when optimisation is turned off.
int add(int a, int b) {
    return a + b;
}
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
mov edx, DWORD PTR [rbp-4]
mov eax, DWORD PTR [rbp-8]
add eax, edx
pop rbp
ret
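To see how the two listings line up, the sketch below rewrites `add` in C so that each statement mirrors one instruction of the assembly. The function name `add_step_by_step` and the variable names are our own, chosen to match the registers and stack slots in the listing. It assumes the calling convention that listing appears to follow (the System V convention used on Linux, where the first two integer arguments arrive in `edi` and `esi` and the result is returned in `eax`). Treat it as a reading aid, not as what the compiler literally does.

```c
#include <stdint.h>

/* Each statement mirrors one instruction of the listing.
   push rbp / mov rbp, rsp / pop rbp only set up and tear down the
   stack frame, so they have no counterpart here. */
int32_t add_step_by_step(int32_t a, int32_t b) {
    int32_t edi = a;        /* first integer argument arrives in edi  */
    int32_t esi = b;        /* second integer argument arrives in esi */

    int32_t slot_a = edi;   /* mov DWORD PTR [rbp-4], edi */
    int32_t slot_b = esi;   /* mov DWORD PTR [rbp-8], esi */

    int32_t edx = slot_a;   /* mov edx, DWORD PTR [rbp-4] */
    int32_t eax = slot_b;   /* mov eax, DWORD PTR [rbp-8] */

    eax = eax + edx;        /* add eax, edx */

    return eax;             /* ret: the caller reads the result from eax */
}
```

The round trip through `slot_a` and `slot_b` looks wasteful because it is. With optimisation turned off, a compiler typically stores each argument on the stack and loads it back before using it.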
Readers who are more familiar with this area may know that x86-64 is not the only CPU architecture that exists. We've decided to teach it first because it is what most home computers use. For extra knowledge, here is a table of devices and their corresponding *architectures.
*Note: There are bound to be exceptions, and this is a general overview.
Device | CPU Architecture(s) |
---|---|
Home Computers | x86 / x86-64 |
Routers | MIPS / ARM |
Smartphones | ARM |
M1 Macs | ARM |
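Because a compiler has to target one specific architecture, it also tells the program which one it is building for. The sketch below relies on the predefined macros that GCC and Clang set (such as `__x86_64__` and `__aarch64__`) and on their MSVC counterparts (such as `_M_X64`). The function name `target_architecture` is our own, and the list is not exhaustive.

```c
/* Reports the architecture this file was compiled for, based on the
   compiler's predefined macros. Anything not listed is "unknown". */
const char *target_architecture(void) {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM (64-bit)";
#elif defined(__arm__) || defined(_M_ARM)
    return "ARM (32-bit)";
#elif defined(__mips__)
    return "MIPS";
#else
    return "unknown";
#endif
}
```

The same C source can be compiled for any of these targets. What changes is the assembly the compiler emits.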
Thus, by learning the x86-64 assembly language, we learn what's going on under the hood of the programs we use every day. Such understanding also opens the door to low-level development, binary analysis and exploitation.
Quiz
From 1994 to 2006, Apple's computers ran on this architecture before migrating to x86.
Hint: P _ _ _ _ P _
(take note of capitalisation)
Why learn Assembly?
As mentioned earlier, higher-level programming languages and compilers exist largely to abstract low-level, hardware-dependent languages away from the programmer. Thus, it may seem counter-intuitive that we are now learning low-level assembly code and becoming less productive as a result.
However, there are many benefits to learning how computers work at a low level, especially when going into security-focused binary analysis. Strong knowledge of low-level concepts and assembly code is useful in the following fields.
Operating System Development (OSDev)
Operating systems power almost all of our hardware devices nowadays, from the Windows/Mac operating systems on our computers to the Android/iOS operating systems on our mobile devices.
One of the major components of an OS is the kernel, which lies at the core of the operating system and handles many low-level functionalities. To work in kernel development, an understanding of the hardware and CPU architecture is paramount, as many functionalities are implemented using hardware-specific features.
Hypervisor Development
Hypervisors are another widely used technology in both business and consumer products. Modern hypervisors leverage CPU-specific optimisations and features to achieve the best performance and security.
Malware Analysis
Malware analysis is the art of dissecting and understanding malware in order to prevent future attacks and mitigate the damage of attacks that have already occurred. Understanding how computer programs run is one of the key skills required to perform malware analysis, and knowledge of assembly code is an important part of that.
Exploit Development
Exploit development is one of the topics we will cover in future modules. It involves finding bugs in software and creating exploits that leverage these vulnerabilities to perform malicious actions.
In this area, it is crucial to reverse-engineer the inner workings of programs to find vulnerabilities, and later to take advantage of those inner workings when writing an exploit. Knowledge of assembly is essential to both activities.