Lecture 7: exploitation techniques
Buffer overflows:
- common mistake
- can exploit locally and remotely
- can modify both data and control flow
- architecture and OS version dependent
- example buffer overflow was contiguous, arbitrary-length, null-terminated, stack-based. variations in these are possible.
- typical signs of buffer overflows: fixed-length buffers, passing pointer to buffer without size, array access without size check, pointer arithmetic without size/end pointer
- vulnerable functions:
  - gets(): reads up to newline - replace with fgets()
  - strcpy()/strcat(): copies up to null byte - replace with strncpy()/strncat()
  - sprintf() etc.: length depends on format arguments - replace with snprintf() etc.
  - scanf() etc.: length depends on input string - put bound on %s formats
- own input functions might be sloppy, always check assumptions
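The replacements listed above could look roughly like the sketch below. The helper names (copy_bounded, format_pair, first_word) are made up for illustration. Note that strncpy() does not write a terminator when it truncates, so the sketch adds one explicitly.

```c
#include <stdio.h>
#include <string.h>

/* Copy src into dst, never writing more than dst_size bytes.
 * Returns 1 if src had to be truncated, 0 otherwise. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) return 1;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not terminate on truncation */
    return strlen(src) >= dst_size;
}

/* Format "name=value" with snprintf instead of sprintf.
 * Returns 1 if the output did not fit, 0 otherwise. */
int format_pair(char *dst, size_t dst_size, const char *name, int value) {
    int n = snprintf(dst, dst_size, "%s=%d", name, value);
    return n < 0 || (size_t)n >= dst_size;
}

/* Read the first word of input with a bounded %s (15 chars + terminator).
 * Returns 1 if a word was read, 0 otherwise. */
int first_word(const char *input, char word[16]) {
    return sscanf(input, "%15s", word) == 1;
}
```

All three helpers report truncation or failure through the return value, so the caller can decide whether shortened data is acceptable.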
Array overflow (provides arbitrary write)
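#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
    long array[8];
    if (argc < 3) return 1;                          /* need index and value */
    long index = strtol(argv[1], NULL, 10);          /* attacker-chosen, never range-checked */
    long value = (long)strtoul(argv[2], NULL, 16);   /* attacker-chosen value */
    array[index] = value;                            /* writes anywhere relative to array */
    return 0;
}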
You can load shellcode into an environment variable, then pick an index past the end of this array so that the write overwrites the saved return address with the shellcode's address.
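For contrast, a minimal sketch of the missing check. store_checked and ARRAY_LEN are hypothetical names. Because the index is signed, both ends of the range need checking: a negative index reaches memory below the array.

```c
#include <stddef.h>

#define ARRAY_LEN 8

/* Store value at array[index] only if index lies inside the array.
 * Returns 0 on success, -1 if the index was rejected. */
int store_checked(long array[ARRAY_LEN], long index, long value) {
    if (index < 0 || index >= ARRAY_LEN)
        return -1;              /* would have been an out-of-bounds write */
    array[index] = value;
    return 0;
}
```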
Off-by-one errors
- e.g. wrong comparison operator, forget about string terminator
- similar to regular overflows, but can overwrite only one element above array size
- this can be exploited to overflow adjacent buffers
- note: on x86-64, every user-space pointer contains 2 null bytes (its two most significant bytes); integers are stored little endian, so these null bytes come at the end in memory
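A sketch of the "forgot the string terminator" case. To keep the behaviour well defined, the buffer and its neighbour are modelled as one array of BUF_LEN + 1 bytes. The last byte stands in for whatever sits directly after the buffer in a real program. All names are made up for illustration.

```c
#include <stddef.h>

#define BUF_LEN 8

/* Buggy: copies up to BUF_LEN characters and then appends the terminator,
 * so the terminator can land at index BUF_LEN, one past the buffer. */
void copy_off_by_one(char *buf, const char *src) {
    size_t i;
    for (i = 0; i < BUF_LEN && src[i] != '\0'; i++)
        buf[i] = src[i];
    buf[i] = '\0';              /* i may equal BUF_LEN here */
}

/* Fixed: leaves room for the terminator inside the buffer. */
void copy_fixed(char *buf, const char *src) {
    size_t i;
    for (i = 0; i < BUF_LEN - 1 && src[i] != '\0'; i++)
        buf[i] = src[i];
    buf[i] = '\0';
}

/* Copies src with the given routine and returns the neighbour byte,
 * which starts out as 1. */
int neighbour_after(void (*copy)(char *, const char *), const char *src) {
    char region[BUF_LEN + 1];   /* [0..BUF_LEN-1] buffer, [BUF_LEN] neighbour */
    region[BUF_LEN] = 1;
    copy(region, src);
    return region[BUF_LEN];
}
```

With an 8-character input the buggy copy zeroes the neighbour byte. The fixed copy truncates the string and leaves the neighbour alone.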
Data/BSS overflows
Data and BSS store global variables
No return address reachable for contiguous overflows.
What can you do?
- overwrite a function pointer
- overwrite a saved frame pointer (attacker can set up fake stack, later return from this stack)
- overwrite a C++ object pointer (can hijack virtual function calls)
- overwrite other variables: changing strings/integers often breaks security
- change pointers
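The function pointer case can be simulated as follows. This is a sketch under assumptions: the global record, both handlers and the payload layout are invented. The copy is capped at the struct size so the demo stays well defined, whereas a real overflow would be an unchecked copy into name.

```c
#include <stddef.h>
#include <string.h>

static int last_called;                     /* records which handler ran */
void normal_handler(void)   { last_called = 1; }
void attacker_handler(void) { last_called = 2; }

/* A global record: fixed-length buffer directly followed by a function pointer. */
struct record {
    char name[16];
    void (*handler)(void);
};
struct record rec = { "", normal_handler };

/* Copies len input bytes over the record starting at name, then calls the
 * handler. Returns 1 if normal_handler ran, 2 if attacker_handler ran. */
int run_with_input(const unsigned char *input, size_t len) {
    rec.handler = normal_handler;
    if (len > sizeof rec) len = sizeof rec; /* keep the simulation in bounds */
    memcpy(&rec, input, len);               /* no check against sizeof rec.name */
    rec.handler();
    return last_called;
}

/* Builds filler bytes up to the handler field, followed by the address of
 * attacker_handler. Returns the payload length. */
size_t build_payload(unsigned char *out) {
    void (*target)(void) = attacker_handler;
    size_t off = offsetof(struct record, handler);
    memset(out, 'A', off);
    memcpy(out + off, &target, sizeof target);
    return off + sizeof target;
}
```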
Heap overflows
explicit allocation functions return memory on heap, which survives function return but needs explicit deallocation.
harder to exploit: no return addresses, relative locations depend on order and malloc implementation
instead you target e.g. metadata
heap organisation:
- heap grows towards higher memory address
- memory managed through in-band control structures (metadata is between buffers)
- control structures can be manipulated through heap overflows for arbitrary code execution
- depends on architecture and OS (especially libc)
- heap is divided in chunks, adjacent free blocks are merged
dlmalloc (basis of glibc's malloc) implementation of malloc:
- find free chunk from free list
- if not found, allocate more memory from OS and add to free list
- if chunk too large, split in two and add new chunk to free list
- Remove chunk from free list
- Mark chunk as used in metadata
- Return pointer to data area in chunk
dlmalloc’s free:
- Locate chunk with data pointer
- Mark chunk as free in metadata
- If next chunk also free, merge with next chunk
- If previous chunk also free, merge with previous chunk
- Add chunk to free list
Metadata at start of every chunk:
struct malloc_chunk {
size_t prev_size;
size_t size;
struct malloc_chunk* fd; // used only if free, otherwise data pointer starts here
struct malloc_chunk* bk;
};
Chunk size:
- 8 bytes overhead per allocated block (only size field always used)
- size always multiple of 16 (data size+overhead rounded up, four low-order bits always 0)
- low-order bits of size field used for status
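The size rules above can be sketched as below, assuming a 64-bit layout. The constant names are made up. The 32-byte minimum (room for prev_size, size, fd and bk) and the use of the lowest bit as a "previous chunk in use" flag are assumptions based on typical dlmalloc-style allocators, not something stated in these notes.

```c
#include <stddef.h>

#define OVERHEAD    ((size_t)8)    /* size field, always present */
#define ALIGN_MASK  ((size_t)15)   /* chunk sizes are multiples of 16 */
#define MIN_CHUNK   ((size_t)32)   /* assumed minimum chunk size */
#define PREV_INUSE  ((size_t)1)    /* assumed status flag in the lowest bit */

/* Chunk size for a requested data size: add overhead, round up to 16. */
size_t chunk_size_for(size_t request) {
    size_t size = (request + OVERHEAD + ALIGN_MASK) & ~ALIGN_MASK;
    return size < MIN_CHUNK ? MIN_CHUNK : size;
}

/* The real size is the size field with the low-order status bits cleared. */
size_t size_without_flags(size_t size_field) {
    return size_field & ~ALIGN_MASK;
}

/* Reads the assumed "previous chunk in use" flag. */
int prev_in_use(size_t size_field) {
    return (size_field & PREV_INUSE) != 0;
}
```

Under these assumptions a 24-byte request fits in a 32-byte chunk, while a 25-byte request already needs 48 bytes.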
Free list:
- used to find free block to allocate
- doubly linked list using fd and bk fields
- insertion in free list: free(), splitting large block in malloc
- removal from free list: malloc(), merging free blocks in free()
Exploiting dlmalloc:
- assume we find heap buffer overflow
- overwrite fd and bk (requires free block)
- make program call unlink (e.g. to merge block when block before is freed)
- unlink writes chosen data (fd) at chosen location (bk)
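A sketch of the classic unlink step and why corrupted fd/bk fields turn it into a write primitive. The struct mirrors malloc_chunk. unlink_chunk and the demo are illustrative and use ordinary stack objects instead of real heap chunks. Current glibc versions verify that fd->bk and bk->fd both point back at the chunk before unlinking, so this plain form only describes older allocators.

```c
#include <stddef.h>

struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;
    struct chunk *bk;
};

/* Classic unlink without integrity checks: two pointer writes whose
 * targets come straight from the chunk being removed. */
void unlink_chunk(struct chunk *p) {
    struct chunk *fd = p->fd;
    struct chunk *bk = p->bk;
    fd->bk = bk;    /* writes bk at address fd + offsetof(struct chunk, bk) */
    bk->fd = fd;    /* writes fd at address bk + offsetof(struct chunk, fd) */
}

/* Simulated attack: a heap overflow has replaced victim's fd and bk.
 * Returns 1 if unlink performed the attacker-chosen writes. */
int demo_corrupted_unlink(void) {
    struct chunk target = {0};  /* stands in for memory the attacker wants to change */
    struct chunk value  = {0};  /* stands in for the attacker-chosen pointer value */
    struct chunk victim = {0};
    victim.fd = &target;        /* overwritten by the overflow */
    victim.bk = &value;         /* overwritten by the overflow */
    unlink_chunk(&victim);
    return target.bk == &value && value.fd == &target;
}
```

Both writes always happen, so the attacker-chosen "data" must itself point to writable memory.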
In stack buffer overflows, the return address is at a fixed offset from the buffer (so it’s easy to reach).
Heap overflows and format string writes go to an absolute address, so the attacker needs a target at a known location.
Alternative target is Global Offset Table
- used to lazily load library functions
- address is looked up and stored on first call
- has a fixed location
- can use printf
Integer overflow
Integers have a fixed size, each integer type has limited range.
If result does not fit in range of integer, CPU still computes result but discards bits that don’t fit
Classification:
- truncation: integer is cast to a smaller type, discarding extra bits
- arithmetic overflow: computation result out of range for type, wrapping around
- signedness: negative integer cast to unsigned type, incorrectly interpreting sign bit
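The three classes can be illustrated with a few small functions, plus one possible overflow check for size computations. All function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Truncation: only the low 8 bits survive the cast. */
uint8_t truncate_to_u8(uint32_t v) {
    return (uint8_t)v;
}

/* Arithmetic overflow: the 32-bit product wraps around, keeping the low 32 bits. */
uint32_t mul_wrapping_u32(uint32_t count, uint32_t elem_size) {
    return (uint32_t)((uint64_t)count * elem_size);
}

/* Signedness: a negative length becomes a huge unsigned value. */
size_t as_length(int n) {
    return (size_t)n;
}

/* Checked multiplication for allocation sizes.
 * Returns 0 and stores the product in *out, or -1 if it would overflow. */
int mul_checked(size_t count, size_t elem_size, size_t *out) {
    if (count != 0 && elem_size > SIZE_MAX / count)
        return -1;
    *out = count * elem_size;
    return 0;
}
```

A wrapped size such as mul_wrapping_u32(0x40000001, 4) == 4 is how an integer overflow typically turns into a buffer overflow: the allocation is tiny, but the loop that fills it still uses the original count.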
Format string vulnerabilities
printf and related functions take a format string and parameters
careless programmers might let user specify format string
parameter passing is just like for other functions (registers, then stack)
missing parameters filled with whatever happened to be there – information leaks, or position to reach all of stack
- %n writes to memory: it stores the number of characters output so far through a pointer passed as parameter. So controlling the format string implies an arbitrary write.
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
    char buf[256];
    int y = 1;
    if (argc < 2) return 1;
    snprintf(buf, sizeof(buf), argv[1]); // missing parameter! so format string is attacker-controlled
    printf("buffer (%zu): %s\n", strlen(buf), buf);
    printf("y is %d/0x%x (@ %p)\n", y, y, (void *)&y);
    return 0;
}
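A sketch of the usual fix, plus a harmless demonstration of what %n does. The function names are hypothetical. The fix is to keep the format string constant and pass user text as an argument, so conversion specifiers in the input are copied rather than interpreted. Some hardened C libraries refuse %n altogether, so the second function assumes a libc that still supports it for constant format strings.

```c
#include <stdio.h>

/* Safe variant: user text is data for a fixed "%s" format.
 * Returns the number of characters that were (or would have been) written. */
int format_user_text(char *buf, size_t size, const char *user) {
    return snprintf(buf, size, "%s", user);
}

/* Legitimate use of %n: stores the number of characters produced so far
 * into the int that the matching argument points to. */
int chars_before_n(void) {
    char buf[32];
    int count = -1;
    snprintf(buf, sizeof buf, "hello%n world", &count);
    return count;
}
```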
Temporal errors
Spatial errors let attacker access outside space allocated for buffer.
Temporal errors let attacker access buffer before/after intended time frame
Main types:
- use after free
- uninitialized variables
Use after free:
- sometimes program retains pointer to freed memory location:
  - malloc buffer that was freed
  - local variable/alloca buffer after function return
- future allocation/function call can re-use memory
- dereferencing dangling pointer results in undefined behavior
- attacker can craft input to overwrite memory with own data:
  - program allocates X
  - program uses X to store some data
  - program frees X
  - program allocates Y overlapping with X
  - data written to Y also overwrites relevant part of X
  - program uses X, causing incorrect result
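The allocate/free/reallocate sequence can be replayed with a toy allocator. This is a simulation under assumptions: toy_alloc and toy_free are invented and hand out a single static slot, so the stale pointer still refers to valid storage and the demo stays well defined. With a real malloc the same access would be undefined behavior.

```c
#include <stddef.h>
#include <string.h>

static char slot[32];       /* the toy heap: one 32-byte slot */
static int slot_in_use;

/* Hands out the slot if it is free, otherwise fails. */
char *toy_alloc(void) {
    if (slot_in_use) return NULL;
    slot_in_use = 1;
    return slot;
}

/* Marks the slot free; the contents and old pointers are left untouched. */
void toy_free(char *p) {
    if (p == slot) slot_in_use = 0;
}

/* Replays the sequence and returns what the stale pointer X reads. */
const char *stale_read_demo(void) {
    char *x = toy_alloc();          /* program allocates X */
    strcpy(x, "role=user");         /* program stores data in X */
    toy_free(x);                    /* program frees X; x now dangles */
    char *y = toy_alloc();          /* Y reuses the same memory */
    strcpy(y, "role=admin");        /* attacker-influenced data written to Y */
    toy_free(y);
    return x;                       /* program uses X again */
}
```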
Uninitialized variables
- local variables and malloc buffers not automatically zeroed
- instead contain whatever happened to be there
- compilers try to warn, but e.g. arrays, struct/union members, malloc buffers not checked
- attacker can initialize variable in way that programmer didn’t expect:
  - program allocates X
  - program uses X to store data under attacker control
  - program frees X
  - program allocates Y overlapping X
  - program doesn’t initialize some/all of Y, causing attacker-provided data from X to stay there
  - program uses Y, causing incorrect result
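The same kind of toy allocator shows the uninitialized case and the usual fix, which is to zero the buffer on allocation, as calloc() does. The pool functions and the meaning of the first byte as a flag are assumptions made for this sketch.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 16
static unsigned char pool_slot[SLOT_SIZE];  /* toy heap with a single slot */

/* malloc-like: returns the slot with whatever it contained before. */
unsigned char *pool_alloc(void) {
    return pool_slot;
}

/* calloc-like: clears the slot before handing it out. */
unsigned char *pool_alloc_zeroed(void) {
    memset(pool_slot, 0, SLOT_SIZE);
    return pool_slot;
}

/* X: the program stores attacker-controlled bytes, then "frees" the slot
 * (freeing is a no-op in this toy). */
void attacker_fills_slot(void) {
    unsigned char *x = pool_alloc();
    memset(x, 0xFF, SLOT_SIZE);
}

/* Y without initialization: the flag byte still holds the attacker's data. */
int flag_without_init(void) {
    unsigned char *y = pool_alloc();
    return y[0] != 0;
}

/* Y with zeroing allocation: the flag byte starts out cleared. */
int flag_with_init(void) {
    unsigned char *y = pool_alloc_zeroed();
    return y[0] != 0;
}
```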
Type confusion
C++ provides classes (basically structs tying data to functions)
Instance of class is object, can be on stack or on heap
Classes can inherit from one or more classes, can call all functions available from parent.
Object pointer can be cast from child to parent.
C++ typecasts:
- reinterpret_cast: no checks (fast), assumes programmer knows their shit (unsafe)
- static_cast: compile-time check (fast at runtime), allows any possibly valid cast including parent-to-child (still unsafe)
- dynamic_cast: run-time check (slow), ensures runtime type is consistent with compile-time type (safe)
static_cast is common but unsafe:
- object may be cast to wrong type
- incorrect cast causes mismatch between runtime type and compile-time type, members read/written according to wrong type
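The difference between an unchecked and a checked downcast can be approximated in C, the language used for the other sketches in these notes. This is a loose analogue with invented names, not C++: the "parent" is embedded as the first member of each "child", and an explicit kind tag plays the role of the runtime type information that dynamic_cast consults.

```c
#include <stddef.h>

enum kind { KIND_CIRCLE, KIND_RECT };

struct shape  { enum kind kind; };                      /* "parent" */
struct circle { struct shape base; double radius; };    /* "child" 1 */
struct rect   { struct shape base; void (*on_draw)(void); }; /* "child" 2 */

/* Unchecked downcast, in the spirit of static_cast: trusts the caller.
 * Using the result when s is really a circle would treat radius bytes
 * as a function pointer. */
struct rect *as_rect_unchecked(struct shape *s) {
    return (struct rect *)s;
}

/* Checked downcast, in the spirit of dynamic_cast: consults the runtime
 * tag and returns NULL on a mismatch. */
struct rect *as_rect_checked(struct shape *s) {
    return s->kind == KIND_RECT ? (struct rect *)s : NULL;
}
```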