Introduction to ARM64
Last updated
After a long journey with ARM32, it's time to talk briefly about ARM64.
As I mentioned at the start of this book, I won't go deep into assembly and architecture because that is outside the scope of this book.
I want this book to be a hands-on experience.
So what's different about ARM64?
AArch64 or ARM64 is just the 64-bit version of the ARM architecture.
So there are differences in the number of registers, instructions, instruction formats, etc.
ARMv8-A, ARMv8.1-A, ARMv8.2-A, etc. are some examples of architecture versions that support AArch64. (ARMv7-A is 32-bit only.)
ARM64 matters because most smartphones today are built on ARM64 processors. There are also laptops, such as Apple's M1 and M2 machines, running on ARM64.
If you want to read more about the architecture, go through the links below.
Now let's look at the registers.
The architecture provides 31 general-purpose registers. Each register is 64 bits wide. They are named x0 to x30.
These registers, together with a few special ones, are used as follows.
x0
Can be used to contain the return value.
x0 - x7
Hold the first 8 arguments. Any remaining arguments are passed on the stack.
x8 - x18
General purpose. Holds data and variables.
x19 - x28
If used by a function, must have their values preserved and later restored upon returning to the caller
x29 (fp)
Frame Pointer (points to bottom of frame)
x30 (lr)
Link Register. Holds the return address of a call
x16
On iOS/macOS, holds the system call number for the SVC 0x80 call. (On Linux, the system call number goes in x8 and the call is SVC 0.)
sp / (x/w)zr (register number 31)
There is no register named x31. Depending on the instruction, register number 31 is either the stack pointer (sp), which points to the top of the stack, or the zero register (xzr/wzr), which always reads as 0.
PC
Program Counter. Contains the address of the next instruction to be executed
PSTATE (shown as cpsr in gdb)
Process state. Holds the condition flags (N, Z, C, V). This plays the role that the CPSR/APSR plays on ARM32.
AArch64 also has a set of registers for floating-point and single-instruction/multiple-data (SIMD) operations. For more details, refer to the ARM documentation.
The instructions are similar to those of ARM32, with some minor changes. We will only look at them briefly.
MOV
Used to copy values.
MOVN
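Move NOT. Writes the bitwise inverse of an (optionally shifted) immediate into the register. Commonly used to load negative values.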
LSL/LSR
Logical shift left, Logical shift right
LDR
Load data from the memory to the register
STR
Store data from the register to the memory
LDP
Load a pair of values to the registers.
STP
Store a pair of values to memory.
ADR
Computes a PC-relative address (within ±1 MiB of the instruction) into a register, without performing a data load.
ADRP
ADRP is similar to ADR, but it:
works in units of 4 KiB pages (the P in ADRP stands for Page) relative to the current page, instead of in bytes
zeroes out the lower 12 bits of the result
CMP
Compares two values and updates the condition flags.
B
B <label> performs a direct, PC-relative branch to <label>.
BR
BR <Xn> performs an indirect, or absolute, branch to the address held in Xn.
BLR
Branch with a register and save the return address to lr
BEQ
BEQ stands for "Branch if Equal" (written B.EQ in A64 assembly). The branch is only taken if the Z flag is set.
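To make the CMP and BEQ pairing concrete, here is a minimal C sketch that models only the Z flag. The function names cmp_sets_z and beq_next_pc are illustrative and not part of any real API, and the other flags (N, C, V) are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of CMP Xn, Xm: the CPU computes Xn - Xm, throws the result away,
 * and sets Z when that result is zero (the operands are equal). */
static bool cmp_sets_z(uint64_t xn, uint64_t xm)
{
    return (xn - xm) == 0;
}

/* Model of a branch-if-equal at address pc: returns the address where
 * execution continues. */
static uint64_t beq_next_pc(bool z, uint64_t pc, uint64_t target)
{
    return z ? target : pc + 4;   /* A64 instructions are 4 bytes each */
}
```

With equal operands, cmp_sets_z returns true and beq_next_pc returns the target. Otherwise execution simply falls through to the instruction after the branch.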
There are many other instructions. We don't need to learn each and every instruction for exploitation. We can learn them as we go.
If you still want to look into other instructions with examples, go through the PDF and the link below.
This is the most important part.
Let's see how functions work in ARM64.
The first 8 arguments to the function are passed through registers x0 to x7 and the rest are passed through the stack.
x0 is generally used to hold the return value.
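As a small illustration of this rule, consider a function with ten integer parameters. The function sum10 is a hypothetical example. The register assignments in the comments follow the convention described above, and a compiler is free to inline or optimize the call away.

```c
/* Hypothetical function with ten int parameters.
 * Under the calling convention described above:
 *   a..h -> w0..w7 (the 32-bit views of x0..x7)
 *   i, j -> passed on the stack by the caller
 */
int sum10(int a, int b, int c, int d, int e, int f, int g, int h,
          int i, int j)
{
    /* The result goes back to the caller in w0. */
    return a + b + c + d + e + f + g + h + i + j;
}
```

Disassembling a call such as sum10(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) in an unoptimized build should show eight register moves and two stores to the stack before the call.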
The RET instruction is used to return from the function.
x30 (link register) holds the return address.
The instructions used to branch to a subroutine are BL, BLR, and BR.
Branch with link (BL) copies the address of the next instruction (after the BL) into the link register (x30) before branching
BR is used to branch to register, for example: br x7
BLR is used to branch to a register and stores the address of the next instruction after the BLR (the return address) in the link register (x30).
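The difference between these branches comes down to what happens to x30. The following is a minimal sketch that models only pc and x30. The struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Tiny model of the two registers involved in calls and returns. */
struct cpu {
    uint64_t pc;   /* address of the instruction being executed */
    uint64_t x30;  /* link register */
};

/* BL <label> and BLR Xn: save the return address, then branch. */
static void branch_with_link(struct cpu *c, uint64_t target)
{
    c->x30 = c->pc + 4;   /* next instruction: A64 instructions are 4 bytes */
    c->pc = target;
}

/* BR Xn: branch only, x30 is left untouched. */
static void branch(struct cpu *c, uint64_t target)
{
    c->pc = target;
}

/* RET: continue at the address held in x30. */
static void ret(struct cpu *c)
{
    c->pc = c->x30;
}
```

Calling branch_with_link and then ret brings pc back to the instruction right after the call. A plain branch gives ret nothing new to return to, because x30 still holds whatever it held before.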
Let's see an example now.
Now compile it using GCC.
gcc func.c -o arm64functions
Let's load it into gdb and do a disassembly of the main function.
So do you see anything new here?
Let's identify the function prologue of the main function.
We can see an 'stp' instruction and a 'mov' instruction.
The 'stp' instruction stores the values of the x29 and x30 registers as a pair at the specified stack location.
"!" denotes pre-indexing: sp is updated first and the store then uses the new value. So sp is first set to sp - 32.
In short,
sp = sp - 32
Then,
[sp] = x29
[sp + 8] = x30
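The same steps can be written as a small C model of the stack. This is only a sketch: the stack is a byte array, sp is an offset into it, and the instruction is presumed to be "stp x29, x30, [sp, #-32]!", which matches the description above. All names are illustrative.

```c
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 256

struct machine {
    uint64_t x29, x30;
    uint64_t sp;                  /* byte offset into stack[] */
    uint8_t stack[STACK_SIZE];
};

/* Write a 64-bit value at byte offset addr of the simulated stack. */
static void store64(struct machine *m, uint64_t addr, uint64_t value)
{
    memcpy(&m->stack[addr], &value, sizeof value);
}

/* Model of: stp x29, x30, [sp, #-32]!  (presumed prologue instruction) */
static void stp_fp_lr_preindex(struct machine *m)
{
    m->sp -= 32;                     /* "!": sp is updated first        */
    store64(m, m->sp, m->x29);       /* [sp]     = x29                  */
    store64(m, m->sp + 8, m->x30);   /* [sp + 8] = x30                  */
}
```

Starting with sp at offset 128, the call leaves sp at 96, the old x29 at offsets 96 to 103, and the old x30 at offsets 104 to 111.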
Let's see this using the examine command.
Lastly, the 'mov x29,sp' instruction will set the frame pointer (x29) to the current stack pointer (sp).
Now let's see what the 'adrp' instruction does.
Before executing 'adrp' the x0 register is 1.
After execution, we can see that x0 is loaded with the address 0x400000.
Next there is an 'add' instruction that adds the value '0x6b0' to the x0 register.
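The arithmetic behind this adrp/add pair can be sketched in C. The helper adrp below is an illustrative model and not a real API. The pc value in the usage note is an assumed address inside main, since any address in the 0x400000 page gives the same result.

```c
#include <stdint.h>

/* Model of ADRP Xd, <label>: take the 4 KiB page that contains pc and
 * move page_delta pages away from it. The low 12 bits of the result
 * are always zero. */
static uint64_t adrp(uint64_t pc, int64_t page_delta)
{
    uint64_t page_base = pc & ~(uint64_t)0xFFF;       /* clear low 12 bits */
    return page_base + (uint64_t)(page_delta * 4096); /* step in pages     */
}
```

For example, adrp(0x4005f0, 0) yields 0x400000, and adding 0x6b0 to that gives 0x4006b0, the full address passed to printf in x0.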
After the add instruction, we can see the branch and link instruction.
On x86, the 'call' instruction is used to call a function. ARM64, like ARM32, uses the 'bl' instruction instead; here it invokes the printf() function.
If we then step through using the 'ni' command, we will reach the next branch-and-link instruction, which calls the function we wrote (function1).
Our two arguments are passed in the registers w0 and w1, and the compiler sets them up from right to left.
The reason we see w0 and w1 instead of x0 and x1 is that our values are 32-bit ints. w0 and w1 are the lower 32 bits of x0 and x1, so the compiler uses the 32-bit view of the registers for them.
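The relationship between a w register and its x register can be modelled with plain integer casts. This is a sketch with illustrative function names: reading wN gives the low 32 bits of xN, and writing wN zero-extends the value into xN.

```c
#include <stdint.h>

/* Reading wN yields the low 32 bits of xN. */
static uint32_t read_w(uint64_t x)
{
    return (uint32_t)x;
}

/* Writing wN replaces xN with the zero-extended 32-bit value. */
static uint64_t write_w(uint64_t old_x, uint32_t value)
{
    (void)old_x;              /* the old upper 32 bits are discarded */
    return (uint64_t)value;
}
```

So after "mov w0, #1", x0 holds exactly 1 regardless of what was in its upper half before.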
Step through these instructions till you reach the 'bl' instruction.
Now take note of the address of the next instruction after the 'bl' instruction.
That is
0x0000000000400608 <+32>: str w0, [x29,#28]
Now let's step into the function using the 'si' command.
If we check the x30 register now, it holds the return address. The return address is 0x0000000000400608,
and it is the address of the next instruction after the 'bl' instruction in the main function.
Let's now walk through the instructions inside our function.
sub sp, sp, #0x20
Makes space for the variables.
str w0, [sp,#12]
Stores the first argument (value 1) at [sp + 12].
str w1, [sp, #8]
Stores the second argument (value 2) at [sp + 8].
ldr w1, [sp, #12]
ldr w0, [sp, #8]
Loads the first and second arguments (values 1 and 2) into w1 and w0.
So w1 = 1 and w0 = 2.
add w0, w1, w0
This will add w1 and w0 and save the result to w0.
w0 = w0 + w1 = 2 + 1 = 3.
str w0, [sp, #28]
Stores the value in w0 into [sp + 28]
ldr w0, [sp, #28]
Loads the value 3 from [sp + 28] into w0.
add sp, sp, #0x20
Cleans up the stack. It deallocates the space that was allocated at the beginning.
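The whole sequence can be mirrored statement by statement in C. This is a sketch for reading alongside the disassembly: the 32-byte frame is modelled as an array of eight 32-bit words, so a byte offset such as #12 becomes index 12 / 4. The name function1_steps is made up.

```c
#include <stdint.h>

int32_t function1_steps(int32_t w0, int32_t w1)
{
    int32_t frame[8];        /* sub sp, sp, #0x20  (32 bytes = 8 words) */

    frame[12 / 4] = w0;      /* str w0, [sp, #12] */
    frame[8 / 4]  = w1;      /* str w1, [sp, #8]  */
    w1 = frame[12 / 4];      /* ldr w1, [sp, #12] */
    w0 = frame[8 / 4];       /* ldr w0, [sp, #8]  */
    w0 = w1 + w0;            /* add w0, w1, w0    */
    frame[28 / 4] = w0;      /* str w0, [sp, #28] */
    w0 = frame[28 / 4];      /* ldr w0, [sp, #28] */
    return w0;               /* add sp, sp, #0x20 followed by ret */
}
```

Calling function1_steps(1, 2) follows the same path as the walkthrough and returns 3.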
Now we have reached the 'ret' instruction.
When we execute the 'ret' instruction, the pc will be loaded with the return address in the x30 register.
As we can see now both pc and x30 contain the same address and we returned to the main function.
Let's walk through the instructions in the main function.
str w0, [x29, #28]
Stores the value in w0 (3) into [x29 + 28]
mov w0, #0x0
Assigns the value zero to the w0 register. This is the return value of our main function.
Next, we have our function epilogue
ldp x29, x30, [sp], #32
This will load,
x29 = [sp]
x30 = [sp + 8]
and then increment
sp = sp + 32
Here it restores the values of the x29 and x30 registers that were saved on the stack at the beginning of the main function.
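The post-indexed load can be modelled in C in the same spirit. This is a sketch with illustrative names: the stack is a byte array and sp is an offset into it.

```c
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 256

struct machine {
    uint64_t x29, x30;
    uint64_t sp;                  /* byte offset into stack[] */
    uint8_t stack[STACK_SIZE];
};

/* Read a 64-bit value from byte offset addr of the simulated stack. */
static uint64_t load64(const struct machine *m, uint64_t addr)
{
    uint64_t value;
    memcpy(&value, &m->stack[addr], sizeof value);
    return value;
}

/* Model of: ldp x29, x30, [sp], #32 */
static void ldp_fp_lr_postindex(struct machine *m)
{
    m->x29 = load64(m, m->sp);        /* x29 = [sp]                      */
    m->x30 = load64(m, m->sp + 8);    /* x30 = [sp + 8]                  */
    m->sp += 32;                      /* post-index: sp updated afterwards */
}
```

Whatever 8 bytes sit at [sp + 8] when this runs become the address that 'ret' jumps to.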
So when 'ret' is hit the program counter will be loaded with the return address from x30.
The key takeaway here is how functions work. Everything else is quite similar to ARM32. The return address is held in x30, and a function that calls other functions saves x30 on the stack and reloads it before 'ret'. So when we do our overflows, we should keep an eye on x30, because gaining control over the value loaded into x30 is the way to hijack program execution.
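To see why the saved copy of x30 matters, here is a safe simulation of an overflow. Everything in it is an assumption made for illustration: the stack is a byte array, the offsets are invented, and the original return address is a placeholder value. It models a callee whose 16-byte local buffer sits just below the caller's frame, where the caller's x29 and x30 were saved by its prologue.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 128

/* Simulated layout (byte offsets into stack[], all hypothetical):
 *   64..95  caller frame: saved x29 at 64, saved x30 at 72
 *   32..63  callee frame: 16-byte local buffer at 48..63
 */
enum { CALLEE_BUF = 48, SAVED_X30 = 72 };

/* Copies len bytes into the callee's buffer without checking its size,
 * the way an unsafe copy would. Returns the value the caller's
 * "ldp x29, x30, [sp], #32" would later load into x30. */
static uint64_t saved_x30_after_copy(const uint8_t *input, size_t len)
{
    uint8_t stack[STACK_SIZE] = {0};
    uint64_t original_ret = 0x400608;   /* placeholder return address */
    uint64_t saved;

    memcpy(&stack[SAVED_X30], &original_ret, sizeof original_ret);

    if (len > STACK_SIZE - CALLEE_BUF)  /* keep the simulation in bounds */
        len = STACK_SIZE - CALLEE_BUF;
    memcpy(&stack[CALLEE_BUF], input, len);  /* may run past 16 bytes */

    memcpy(&saved, &stack[SAVED_X30], sizeof saved);
    return saved;
}
```

With 16 bytes of input the saved x30 is untouched. With 32 bytes of 'A' it becomes 0x4141414141414141, which is where the caller's 'ret' would then jump.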