Packers
Binary packers
A packer takes a binary program P and produces a new program that contains an unpacker and a packed version of P.
- at run time, the loader loads the new binary (the unpacker), and the unpacker then unpacks and loads the original program
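The idea can be illustrated with a toy example. This is only a sketch, and it packs a Python script rather than a native binary: the output is a new script made of a small unpacker stub plus a compressed copy of the original. The names `pack` and `STUB`, and the choice of zlib and base64, are assumptions made for this illustration and say nothing about how real binary packers are built.

```python
import base64
import zlib

# Unpacker stub: the only code visible in the packed script.
# At run time it decodes and decompresses the payload, then executes it.
STUB = (
    "import base64, zlib\n"
    "exec(zlib.decompress(base64.b64decode({payload!r})).decode())\n"
)


def pack(source: str) -> str:
    """Return a new script holding the unpacker and a packed copy of `source`."""
    payload = base64.b64encode(zlib.compress(source.encode())).decode()
    return STUB.format(payload=payload)


original = "result = 6 * 7\n"
packed = pack(original)

# The original code no longer appears in clear text in the packed script.
assert "6 * 7" not in packed

# Running the packed script unpacks the original program and runs it.
namespace = {}
exec(packed, namespace)
assert namespace["result"] == 42
```

A scanner that only looks at the packed script sees the stub and an opaque payload, which is the property that makes packing attractive for evasion.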
What’s a binary?
A binary is compiled code stored in a binary file format (PE for Windows, ELF for Linux, Mach-O for macOS).
The format
- defines what the file looks like on disk and in memory
- contains info about the machine to run it on, whether the file is an executable or a library, the entry point, and the sections
ELF format:
- used for executables, libraries, and others, on many architectures and OSes
- dual nature
- view on logical sections: described by section header table (.data, .text, .bss, etc.)
- view on structure in memory: what segments are executable and which are read/write (data), how large they are – described by program header table
- structure
- ELF header at beginning: magic number 7F 45 4C 46, file type, architecture, entry point, program and section header table offsets, index of the section name string table
- program headers divide data in segments, providing easy mapping from data to memory
- array of structures for type of segment, position in ELF file, address in memory, physical address, size on disk, size in memory, flags for r/w/x, alignment in memory
- section headers define sections
- one entry for each section: index in string table, what kind of info it has, flags for write/alloc/exec, base address in memory, location in elf file, some other info
- elf program headers have everything that kernel needs to load file
- sections:
- examples:
- .text: code
- .data: initialised data
- .bss: uninitialized data
- .got/.plt: for dynamic linking
- .ctors/.dtors: constructors/destructors
- used at link time
- do not have predefined structure, but described by section headers that do
- symbol tables:
- SYMTAB: contains all symbols needed to link/debug files, not needed for running
- DYNSYM: contains symbols for dynamic linking, loaded in memory at runtime so as small as possible
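To make the ELF header fields listed above concrete, here is a minimal sketch that decodes them with Python's `struct` module. It assumes a 64-bit little-endian ELF file; the function name `parse_elf_header` and the small lookup tables are choices made for this example and cover only a few common values.

```python
import struct

ELF_MAGIC = b"\x7fELF"  # magic number 7F 45 4C 46
# ELF64 header (64 bytes): e_ident, e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
FILE_TYPES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}
MACHINES = {3: "EM_386", 62: "EM_X86_64", 183: "EM_AARCH64"}


def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF header found at the start of `data`."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the byte order (1 = little-endian).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    (_ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     e_shstrndx) = ELF64_HEADER.unpack_from(data)
    return {
        "type": FILE_TYPES.get(e_type, e_type),
        "machine": MACHINES.get(e_machine, e_machine),
        "entry": e_entry,
        "program_headers": (e_phoff, e_phnum),   # (file offset, entry count)
        "section_headers": (e_shoff, e_shnum),   # (file offset, entry count)
        "shstrndx": e_shstrndx,                  # section holding section names
    }
```

On a Linux machine it could be tried with something like `parse_elf_header(open("/bin/ls", "rb").read(64))`, where the path is only an example; `readelf -h` reports the same fields.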
Stripped binaries
The symbol table can be removed with strip -s <program>
- the dynamic symbol table (DYNSYM) has to be preserved, because it is needed for functions imported from shared libraries
- all other names of functions and variables are gone
Functions and global symbols
The addresses of global symbols imported from external libraries are computed when the binary is loaded into memory.
- this can be done by load-time relocation or with position-independent code (PIC): the code is freely relocatable, at the cost of a level of indirection via the Global Offset Table and the Procedure Linkage Table
- every time code has to reference a global symbol, it uses the Global Offset Table (GOT, .got) in the data section
- at runtime, GOT entries modified by dynamic linker to point to intended data
If code needs to call a function in a different module, the linker creates an array of read-only jump stubs: the Procedure Linkage Table (PLT, .plt)
- stubs use GOT entries to call the right function
- lazy binding: the GOT entries initially point back to the resolver code in their .plt counterpart, and are patched with the real address on the first call
- relocation is confined to .got and .got.plt rather than .text
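The lazy binding mechanism can be mimicked with a toy model. This is only an analogy: a real PLT consists of machine-code stubs and the GOT is a table of addresses, whereas here both are Python objects. All names (`GOT`, `plt_puts`, `resolver`, `SHARED_LIBRARY`) are hypothetical and chosen for this sketch.

```python
def puts(text):
    """Stands in for a function that lives in a shared library."""
    return f"puts({text!r})"


SHARED_LIBRARY = {"puts": puts}
resolver_calls = []  # records how often the "dynamic linker" had to step in


def resolver(symbol):
    """Plays the dynamic linker: look the symbol up and patch the GOT entry."""
    resolver_calls.append(symbol)
    GOT[symbol] = SHARED_LIBRARY[symbol]
    return GOT[symbol]


def make_lazy_entry(symbol):
    """Initial GOT entry: goes to the resolver instead of the real function."""
    def lazy(*args):
        return resolver(symbol)(*args)
    return lazy


# The GOT is the writable part; it starts out pointing at the resolver path.
GOT = {"puts": make_lazy_entry("puts")}


def plt_puts(*args):
    """Read-only PLT stub: it never changes, it always jumps through the GOT."""
    return GOT["puts"](*args)


plt_puts("first call")   # goes through the resolver, which patches the GOT
plt_puts("second call")  # goes straight to the real function
assert resolver_calls == ["puts"]
```

Only the table is modified at run time; the stub and its callers stay untouched, which mirrors why the code section does not need to be relocated.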
Process creation in Linux
- kernel loads segments defined by program headers into memory
- if interpreter defined, load it too
- kernel sets up stack and starts at interpreter’s entry point
- if no interpreter, use process’ entry point
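The decisions in the steps above depend only on the program headers. As a rough sketch, assuming a 64-bit little-endian ELF image held in memory, the following function lists the loadable segments and works out where execution would start. The name `plan_load` and the shape of its return value are inventions for this example; it does not load anything.

```python
import struct

ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
# p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
PT_LOAD, PT_INTERP = 1, 3
PF_X, PF_W, PF_R = 1, 2, 4


def plan_load(image: bytes) -> dict:
    """Summarise what a loader would map and where execution would begin."""
    header = ELF64_HEADER.unpack_from(image)
    e_entry, e_phoff = header[4], header[5]
    e_phentsize, e_phnum = header[9], header[10]

    segments, interpreter = [], None
    for index in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr, _paddr, p_filesz, p_memsz,
         _align) = ELF64_PHDR.unpack_from(image, e_phoff + index * e_phentsize)
        if p_type == PT_LOAD:
            perms = "".join(
                letter if p_flags & bit else "-"
                for letter, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X))
            )
            segments.append((p_vaddr, p_memsz, perms))
        elif p_type == PT_INTERP:
            # The segment contents are the NUL-terminated interpreter path.
            raw = image[p_offset:p_offset + p_filesz]
            interpreter = raw.rstrip(b"\0").decode()

    start = "interpreter entry point" if interpreter else hex(e_entry)
    return {"segments": segments, "interpreter": interpreter, "start": start}
```

For a dynamically linked program the result names the interpreter (the dynamic linker) as the starting point; for a statically linked one it reports the program's own entry address.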
ELF auxiliary vectors
Mechanism to transfer kernel level info to user processes (such as pointer to system call entry point in memory).
ELF Loader:
- parses ELF file
- maps various program segments in memory
- sets up entry point, initializes process stack
- puts ELF auxiliary vectors on process stack, along with argc, argv, envp
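The auxiliary vector is a sequence of (type, value) pairs terminated by an AT_NULL entry. Below is a minimal sketch of a decoder, assuming 64-bit little-endian entries; the function name `parse_auxv` is made up for this example, and the table names only a handful of entry types.

```python
import struct

AT_NULL = 0  # marks the end of the vector
# A few entry types, with their numeric values from <elf.h>.
AT_NAMES = {
    3: "AT_PHDR",           # address of the program headers in memory
    6: "AT_PAGESZ",         # system page size
    9: "AT_ENTRY",          # entry point of the program
    33: "AT_SYSINFO_EHDR",  # address of the vDSO image
}


def parse_auxv(raw: bytes) -> dict:
    """Decode (type, value) pairs until the AT_NULL terminator."""
    entries = {}
    for a_type, a_value in struct.iter_unpack("<QQ", raw):
        if a_type == AT_NULL:
            break
        # Unknown types are kept under their numeric value.
        entries[AT_NAMES.get(a_type, a_type)] = a_value
    return entries
```

On 64-bit Linux, a process can inspect its own vector with `parse_auxv(open("/proc/self/auxv", "rb").read())`.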
Packers
Packers were initially used for compression, but they are also convenient for malware to evade antivirus detection, and many packers include anti-debugging techniques as well.
We want to run the malware, let it unpack itself, and dump memory at the right moment (when it’s completely unpacked).
- the right moment is when the process starts showing the “normal behavior” of the original program
- check system calls, e.g. using strace
- you can dump memory in gdb: dump binary memory dump_name start end
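The start and end addresses for the gdb command can be read from the process memory map. As a sketch, the helper below turns the text of /proc/<pid>/maps into one `dump binary memory` command per executable mapping. The function name, the output file naming, and the decision to look only at executable mappings are assumptions made for this example; the sample map is invented.

```python
def gdb_dump_commands(maps_text: str, prefix: str = "dump") -> list:
    """Build one gdb `dump binary memory` command per executable mapping."""
    commands = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        address_range, perms = fields[0], fields[1]
        if "x" not in perms:
            continue  # only executable regions are of interest here
        start, end = (int(part, 16) for part in address_range.split("-"))
        commands.append(
            f"dump binary memory {prefix}_{start:x}.bin {start:#x} {end:#x}"
        )
    return commands


# Invented sample in the format of /proc/<pid>/maps.
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 131 /tmp/sample
00600000-00601000 rw-p 00000000 08:01 131 /tmp/sample
7f0000000000-7f0000021000 rwxp 00000000 00:00 0
"""

for command in gdb_dump_commands(SAMPLE_MAPS):
    print(command)
```

An anonymous mapping that is both writable and executable, like the last line of the sample, is often where unpacked code ends up, so it is a natural first candidate to dump.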
Analysing a binary
Static
- file: determine file type
- readelf: display information about contents of ELF files
- readelf -h: file header
- readelf -l: program headers
- readelf -S: section headers
- readelf -s: symbol table
- ldd: print shared libraries
- nm: list symbols from object files
- strings: print strings of printable characters
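To show what a tool like strings does, here is a simplified approximation: it extracts runs of printable ASCII characters of a minimum length (4 is the default of GNU strings). The function name `printable_strings` is made up for this sketch, and the real tool handles more cases, such as other encodings.

```python
import re


def printable_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of at least `min_length` printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Invented sample bytes: two readable strings surrounded by binary data.
blob = b"\x00\x01hello world\x00ab\x00/lib/ld.so\xff"
print(printable_strings(blob))
```

On a packed binary this kind of scan usually finds little of interest, because the original program's strings are stored in compressed or encrypted form.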
Dynamic methods
- /proc/<pid>: general information about process with <pid>
- /proc/<pid>/cmdline: command line
- /proc/<pid>/environ: environment
- /proc/<pid>/maps: memory map
- strace: tracks system calls performed by process
- can also follow child process, show signals, decode syscall arguments
- strace -i: print instruction pointer at time of syscall
- ltrace: tracks dynamically linked library calls