# Introduction to ARM64

### Introduction

After a long journey with ARM32, it's time to talk briefly about ARM64.

As I mentioned at the start of this book, I won't go deep into assembly and architecture because that is outside the scope of this book.

I want this book to be a hands-on experience.

So what's different about ARM64?

**AArch64** or **ARM64** is just the 64-bit version of the ARM architecture.

So there will be differences in the number of registers, instructions, instruction formats, etc.

ARMv8-A, ARMv8.1-A, ARMv8.2-A, etc. are some examples of architecture versions that support AArch64. ARMv7-A, by contrast, is 32-bit only.

The reason why it's important to look into ARM64 is that most smartphones today are built on ARM64 processors. There are also laptops running on ARM64, such as those using the Apple M1 and M2.

If you want to read more about the architecture, go through the links below.

{% embed url="<https://en.wikipedia.org/wiki/AArch64>" %}

{% embed url="<https://devblogs.microsoft.com/oldnewthing/20220726-00/?p=106898>" %}

Now let's look at the registers.

### Registers

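The architecture provides 31 general-purpose registers. Each register is 64-bit. They are named from x0 to x30.

Each register can be accessed in two ways: as a full 64-bit register (x0 to x30) or as a 32-bit register (w0 to w30), which is the lower half of the corresponding x register.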

{% embed url="<https://documentation-service.arm.com/static/6364eac8c5a70d2cdb15ff10?token=>" %}
From ARM developer Documentation
{% endembed %}

{% embed url="<https://i.imgur.com/1NFTdnb.jpg>" %}
From : <https://hackmd.io/>
{% endembed %}

| Registers         | Purpose                                                                                                                                                   |
| ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| x0                | Holds the return value.                                                                                                                                   |
| x0 - x7           | The first 8 arguments. Any remaining arguments are passed on the stack.                                                                                   |
| x8 - x18          | General purpose. Holds data and variables.                                                                                                                |
| x19 - x28         | If used by a function, must have their values preserved and later restored upon returning to the caller                                                   |
| x29 (fp)          | Frame Pointer (points to bottom of frame)                                                                                                                 |
| x30 (lr)          | Link Register. Holds the return address of a call                                                                                                         |
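| x16               | On iOS/macOS, holds the system call number for an `SVC 0x80` call. On Linux, the system call number goes in x8 and the call is made with `SVC #0`.       |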
| x31 / sp/ (x/w)zr | Stack Pointer (sp): Points to the top of the stack. This can also be the zero register depending on the instruction. The zero register holds the value 0. |
| PC                | Program Counter. Contains the address of the next instruction to be executed                                                                              |
| NZCV / PSTATE     | Holds the condition flags (N, Z, C, V). This takes the place of the APSR / CPSR of ARM32; gdb still displays it as `cpsr`.                               |

AArch64 also has a set of registers for floating-point and single-instruction/multiple-data (SIMD) operations. For more details, refer to the ARM documentation.

{% embed url="<https://developer.arm.com/documentation/>" %}

### Instructions

The instructions are similar to those of ARM32, with some minor changes. We will only look into this briefly.

| Opcode  | Purpose                                                                                                                                                                                                |
| ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| MOV     | Used to copy values.                                                                                                                                                                                   |
| MOVN    | Move NOT: writes the bitwise inverse of an immediate value to the register. Often used to load negative values.                                                                                        |
| LSL/LSR | Logical shift left, Logical shift right                                                                                                                                                                |
| LDR     | Load data from the memory to the register                                                                                                                                                              |
| STR     | Store data from the register to the memory                                                                                                                                                             |
| LDP     | Load a pair of values to the registers.                                                                                                                                                                |
| STP     | Store a pair of values to memory.                                                                                                                                                                      |
| ADR     | Loads an address within a certain range, without performing a data load.                                                                                                                               |
| ADRP    | <p>ADRP is similar to ADR, but it:</p><ul><li>works in units of 4KiB pages (the P in ADRP stands for Page) relative to the page of the current instruction, instead of in bytes</li><li>produces an address whose lower 12 bits are zero</li></ul> |
| CMP     | Compare two values, flags are updated automatically                                                                                                                                                    |
| B       | `B <label>` performs a direct, PC-relative, branch to `<label>`.                                                                                                                                       |
| BR      | `BR Xn` performs an indirect, or absolute, branch to the address held in `Xn`.                                                                                                                         |
| BLR     | `BLR Xn` branches to the address held in `Xn` and saves the return address in lr (x30).                                                                                                                |
| B.EQ    | "Branch if Equal" (written BEQ in ARM32), which means the branch will only be taken if the Z flag is set.                                                                                              |

There are many other instructions. We don't need to learn each and every instruction for exploitation; we can learn them as we go.

If you still want to look into other instructions with examples, go through the PDF and the links below.

{% embed url="<https://iitd-plos.github.io/col718/ref/arm-instructionset.pdf>" %}

{% embed url="<https://0xinfection.github.io/reversing/pages/arm-64-course.html>" %}

{% embed url="<https://cit.dixie.edu/cs/2810/arm64-assembly.html>" %}

### Functions

This is the most important part.

Let's see how functions work in ARM64.

* The first 8 arguments to the function are passed through registers x0 to x7 and the rest are passed through the stack.
* x0 is generally used to hold the return value.
* `RET` instruction is used to return from the function.
* x30 (link register) holds the return address.
* The instructions used to call a subroutine are BL and BLR. BR also branches to a register, but it does not save a return address.
* Branch with link (BL) copies the address of the next instruction (after the BL) into the link register (x30) before branching.
* BR is used to branch to the address held in a register, for example: br x7
* BLR is used to branch to the address held in a register, and it stores the address of the next instruction (after the BLR), that is, the return address, into the link register (x30).

Let's see an example now.

```c
#include <stdio.h>

/* Adds its two arguments and returns the result. */
int function1(int a, int b) {
    int c;
    c = a + b;

    return c;
}

int main() {
    int res;
    printf("Hello world");
    res = function1(1, 2);

    return 0;
}
```

Now compile it using GCC.

`gcc func.c -o arm64functions`

Let's load it into gdb and do a disassembly of the main function.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FGFsIFWxUIg3APLEWy7Jk%2Farmfunc.png?alt=media&#x26;token=2e7250dd-20af-451f-b590-18d7d7441585" alt=""><figcaption></figcaption></figure>

So do you see anything new here?

Let's identify the function prologue of the main function.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2F9ODrOC5psvS0DnmWfjGC%2Fprologue-edited.png?alt=media&#x26;token=1c9d97dd-efe0-4ebe-8f0c-be21b85d6190" alt=""><figcaption></figcaption></figure>

We can see an **'stp'** instruction and a 'mov' instruction.

The **'stp'** instruction stores a pair of values, taken from the x29 and x30 registers, at the specified stack location.

"!" denotes pre-indexed addressing with write-back. So sp is first set to sp - 32, and the two registers are then stored at the new sp.

In short,

*sp = sp - 32*

*Then,*

*\[sp] = x29*

*\[sp + 8] = x30*

Let's see this using the examine command.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2F6RSmxvyFbPEu7doGUXq6%2Fgef.png?alt=media&#x26;token=c71f87e4-d283-4a30-8caf-e3c7cf8886de" alt=""><figcaption></figcaption></figure>

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FXmmQafJjemkBmTzq6K3H%2Fgef2.png?alt=media&#x26;token=63806fab-1308-4701-b03a-d02b0268404b" alt=""><figcaption></figcaption></figure>

Lastly, the 'mov x29,sp' instruction will set the frame pointer (x29) to the current stack pointer (sp).

Now let's see what the 'adrp' instruction does.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2Fn7ZsdA7X9DGPBoErtWB7%2Fbefore.png?alt=media&#x26;token=c2d5f933-c20d-42c3-8e2b-8de2637477ce" alt=""><figcaption></figcaption></figure>

Before executing 'adrp' the x0 register is 1.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FPaLaJCOhch6Had64IKMD%2Fafter.png?alt=media&#x26;token=5bd4836a-1c15-4e28-b859-68b6e5b4c976" alt=""><figcaption></figcaption></figure>

After execution, we can see that x0 is loaded with the address 0x400000.

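Now there is an **add** instruction that adds the value '0x6b0' to the x0 register, so x0 becomes 0x400000 + 0x6b0 = 0x4006b0. This 'adrp' and 'add' pair is how a full address is built, and here x0 ends up as the first argument of the upcoming printf() call.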

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2Fga1NHx1INWnt3YKtEiZ4%2Fadd.png?alt=media&#x26;token=cb12d2be-25b3-4715-93e8-679560f8e542" alt=""><figcaption></figcaption></figure>

After the add instruction, we can see the **branch and link instruction**.

Unlike x86, ARM has no 'call' instruction. Just as in ARM32, the 'bl' instruction is used to call a function, and here we can see it invoking the printf() function.

If we step through using the 'ni' command, we will reach the next branch and link instruction, which calls the function we wrote (function1).

Our two arguments are passed through the registers **w0 and w1**: the first argument (1) goes in w0 and the second (2) goes in w1, even if the compiler sets the registers up from right to left.

The reason why we see the w0 and w1 registers instead of x0 and x1 is that both arguments are of type `int`, which is 32 bits wide, so the compiler uses the 32-bit views of the registers.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FaUeZ8XFfO4tyrZEuxptv%2Farguments.png?alt=media&#x26;token=070ff83b-4395-4eb6-9595-e1b15986085b" alt=""><figcaption></figcaption></figure>

Step through these instructions till you reach the 'bl' instruction.

Now take note of the address of the next instruction after the 'bl' instruction.

That is:

`0x0000000000400608 <+32>: str w0, [x29,#28]`

Now let's step into the function using the 'si' command.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FMf7Z3dKKOo4n9i2hZ3KH%2Fx30.png?alt=media&#x26;token=5fff7e21-201a-4399-a5a8-57e12815664c" alt=""><figcaption></figcaption></figure>

If we check the x30 register now, it holds the return address. The return address is `0x0000000000400608`, which is the address of the next instruction after the 'bl' instruction in the main function.

Let's now walk through the instructions inside our function.

`sub sp, sp, #0x20`

Makes space for the variables.

`str w0, [sp,#12]`

Stores the first argument (value 1) at \[sp + 12].

`str w1, [sp, #8]`

Stores the second argument (value 2) at \[sp + 8].

`ldr w1, [sp, #12]`

`ldr w0, [sp, #8]`

Loads the first and second arguments (values 1 and 2) into w1 and w0 respectively.

So w1 = 1 and w0 = 2.

`add w0, w1, w0`

This adds w1 and w0 and saves the result in w0.

w0 = w1 + w0 = 1 + 2 = 3.

`str w0, [sp, #28]`

Stores the value in w0 into \[sp + 28]

`ldr w0, [sp, #28]`

Loads the value 3 from \[sp + 28] into w0.

`add sp, sp, #0x20`

Frees the stack space: it deallocates the 0x20 bytes that were allocated at the beginning.

Now we have reached the 'ret' instruction.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FKa31ID0IkBLTblmWj8hj%2Fret.png?alt=media&#x26;token=69e61a8b-eaab-4ac6-9e58-96c5690854d6" alt=""><figcaption></figcaption></figure>

When we execute the 'ret' instruction, the pc will be loaded with the return address in the x30 register.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FxETAfngi3aAKoT4zaEmk%2Fx30.png?alt=media&#x26;token=cdfd9cfe-695d-4998-b487-21790854c242" alt=""><figcaption></figcaption></figure>

As we can see, both **pc** and **x30** now contain the same address, and we have returned to the main function.

Let's walk through the instructions in the main function.

`str w0, [x29, #28]`

Stores the value in w0 (3) into \[x29 + 28]

`mov w0, #0x0`

Assigns the value zero to the w0 register. This is the return value of our main function.

Next, we have our function epilogue.

<figure><img src="https://3643148735-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuoH6DVWgxRANBNbHnsUM%2Fuploads%2FbMNYnMf93S4fokL3wWFc%2Fpaint-edt.png?alt=media&#x26;token=041d8297-04e9-4799-9de5-80702d190e85" alt=""><figcaption></figcaption></figure>

`ldp x29, x30, [sp], #32`

This will load,

`x29 = [sp]`

`x30 = [sp + 8]`

and then increment

`sp = sp + 32`

This restores the values of the x29 and x30 registers that were saved on the stack at the beginning of the main function.

So when 'ret' is hit the program counter will be loaded with the return address from x30.

The key takeaway here is how functions work. Everything else is broadly similar to ARM32. The return address is held in x30, and a function that calls other functions saves x30 on the stack in its prologue and restores it in its epilogue. So when we are doing our overflows we should keep an eye on the x30 register, because controlling the value that ends up in x30 is the way to hijack program execution.
