Binary and Malware Analysis

Assembly

A low-level, processor-specific symbolic language. We focus on user-mode 64-bit x86 (x86-64) assembly in AT&T syntax.

A program is composed of instructions and data.

Instructions

Form: mnemonic source, destination. For example, movq %rax, %rbx copies the contents of %rax into %rbx.

Intel uses little-endian byte ordering: the least significant byte of a multi-byte value is stored at the lowest address, and the most significant byte at the highest.
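A minimal sketch of the byte order, written in Python rather than assembly so it can be run anywhere. The value 0x12345678 and the helper name hexdump are illustrative assumptions, not part of the notes.

```python
import struct

value = 0x12345678  # arbitrary example value

# Little endian: the least significant byte (0x78) sits at the lowest address.
little = value.to_bytes(4, byteorder="little")
# Big endian, shown only for contrast: the most significant byte comes first.
big = value.to_bytes(4, byteorder="big")

# struct's "<I" format packs an unsigned 32-bit integer in little-endian order.
packed = struct.pack("<I", value)


def hexdump(data):
    """Return the bytes as space-separated hex, lowest address first."""
    return " ".join(f"{byte:02x}" for byte in data)


if __name__ == "__main__":
    print("little:", hexdump(little))
    print("big:   ", hexdump(big))
```

Read left to right as increasing addresses, the little-endian form is 78 56 34 12, which is what a memory dump of an x86 process shows for this value.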

Signed integers are expressed in two's complement: to negate a value, flip all of its bits and add one.
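The flip-and-add-one rule can be sketched as follows. The function names negate and to_signed and the default 64-bit width are assumptions chosen for illustration; Python integers are unbounded, so the width has to be imposed with a mask.

```python
def negate(pattern, bits=64):
    """Two's complement negation of a bit pattern: flip all bits, add one."""
    mask = (1 << bits) - 1
    flipped = pattern ^ mask          # flip every bit within the given width
    return (flipped + 1) & mask       # add one, discarding any carry out


def to_signed(pattern, bits=64):
    """Interpret a bit pattern as a signed two's complement integer."""
    sign_bit = 1 << (bits - 1)
    if pattern & sign_bit:
        return pattern - (1 << bits)  # top bit set means the value is negative
    return pattern


if __name__ == "__main__":
    minus_five = negate(5, bits=8)
    print(f"{minus_five:#04x}", to_signed(minus_five, bits=8))
```

With an 8-bit width, negating 5 (0x05) gives the pattern 0xfb, which reads back as -5. Negating twice returns the original pattern.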

Comparisons:

Conditional jumps

Data

Data objects in data segment:

.data
    myvar: .long 0x1234567, 0x23456789
    bar: .word 0x1234
    mystr: .asciz "foo"
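As a rough illustration of what the directives above emit, the sketch below rebuilds the same bytes in Python. It assumes the GNU assembler on x86, where .long is 4 bytes, .word is 2 bytes, and .asciz appends a terminating zero byte; the names data_segment and offsets are hypothetical.

```python
import struct

# "<" = little endian with no padding; "I" = 4-byte .long; "H" = 2-byte .word
myvar = struct.pack("<II", 0x1234567, 0x23456789)
bar = struct.pack("<H", 0x1234)
mystr = b"foo" + b"\x00"  # .asciz adds the trailing zero byte

# The objects are laid out back to back in the order they are declared.
data_segment = myvar + bar + mystr

# Offset of each label from the start of .data (no alignment directives used).
offsets = {
    "myvar": 0,
    "bar": len(myvar),
    "mystr": len(myvar) + len(bar),
}

if __name__ == "__main__":
    print(data_segment.hex(" "))
    print(offsets)
```

Note that 0x1234567 has only seven hex digits, so as a 4-byte value it is 0x01234567 and is stored as 67 45 23 01.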

Stack frames

The stack grows downwards (towards lower memory addresses). The stack pointer (%rsp) points to the top of the stack, which is the lowest stack address currently in use.
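A toy model of downward growth, assuming 8-byte slots as used by 64-bit push and pop. The class name Stack and the starting address are made up for the example; real stack addresses are chosen by the operating system.

```python
class Stack:
    """Toy stack: memory is a dict keyed by address, slots are 8 bytes."""

    def __init__(self, top=0x1000):
        self.memory = {}
        self.rsp = top  # hypothetical initial stack pointer

    def push(self, value):
        # push: decrement %rsp first, then store at the new top of stack
        self.rsp -= 8
        self.memory[self.rsp] = value

    def pop(self):
        # pop: load from the top of stack, then increment %rsp
        value = self.memory.pop(self.rsp)
        self.rsp += 8
        return value


if __name__ == "__main__":
    stack = Stack()
    stack.push(0xAAAA)
    stack.push(0xBBBB)
    print(hex(stack.rsp))    # lower than the starting address
    print(hex(stack.pop()))  # last value pushed comes off first
```

Each push moves %rsp 8 bytes lower, so the most recently pushed value always sits at the lowest address.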

The stack is composed of frames, one pushed for each function call. The address of the current frame is stored in the frame pointer register (%rbp on x86-64).

Each frame contains

Parameter passing in caller function for Linux

Prologue in called function

Epilogue in called function
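The notes do not spell out the prologue and epilogue, so the following is only a sketch of the conventional frame-pointer sequence that compilers commonly emit on x86-64, with the corresponding AT&T instruction in each comment. The class name Machine, the addresses, and the local-variable size are assumptions for illustration; optimized code often omits the frame pointer entirely.

```python
class Machine:
    """Toy model of %rsp, %rbp and an 8-byte-slot stack."""

    def __init__(self, stack_top=0x7000, caller_rbp=0x7100):
        self.mem = {}
        self.rsp = stack_top
        self.rbp = caller_rbp  # frame pointer of the calling function

    def push(self, value):
        self.rsp -= 8
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

    def call(self, return_address):
        self.push(return_address)  # call: pushes the return address

    def prologue(self, local_bytes):
        self.push(self.rbp)        # push %rbp       (save caller's frame)
        self.rbp = self.rsp        # mov %rsp, %rbp  (start the new frame)
        self.rsp -= local_bytes    # sub $N, %rsp    (reserve local storage)

    def epilogue(self):
        self.rsp = self.rbp        # mov %rbp, %rsp  (discard locals)
        self.rbp = self.pop()      # pop %rbp        (restore caller's frame)
        return self.pop()          # ret             (pop the return address)


if __name__ == "__main__":
    m = Machine()
    m.call(0x401000)
    m.prologue(16)
    print(hex(m.rbp), hex(m.rsp))
    print(hex(m.epilogue()), hex(m.rbp), hex(m.rsp))
```

In this model the saved frame pointer sits at the address in %rbp and the return address 8 bytes above it. The first two epilogue steps together are what the single leave instruction does.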