Transcript document
Part 2: Advanced Static Analysis
Chapter 4: A Crash Course in x86 Disassembly
Chapter 5: IDA Pro
Chapter 6: Recognizing C Code Constructs in
Assembly
How software works
gcc compiler driver pre-processes, compiles,
assembles and links to generate executable
Links
together object code (i.e. game.o) and static
libraries (i.e. libc.a) to form final executable
Links in references to dynamic libraries for code
loaded at load time (i.e. libc.so.1)
Executable may still load additional dynamic
libraries at run-time
hello.c
Program
Source
Preprocessor
hello.i
Modified
Source
Compiler
hello.s
Assembly
Code
Assembler
hello.o
Object
Code
Linker
hello
Executable
Code
Static libraries
Suppose you have utility code in x.c, y.c, and z.c
that all of your programs use
Link
together individual .o files
gcc –o hello hello.o x.o y.o z.o
Create
a library libmyutil.a using ar and ranlib and
link library in statically
libmyutil.a : x.o y.o z.o
ar rvu libmyutil.a x.o y.o z.o
ranlib libmyutil.a
gcc –o hello hello.c –L. –lmyutil
Note: library code copied directly into binary
Dynamic libraries
Avoid having multiple copies of common code on disk
Problem: libc
“gcc program.c –lc” creates an a.out with entire libc object
code in it (libc.a)
Almost all programs use libc!
Solution:
Have binaries compiled with a reference to a
library of shared objects versus an entire copy of the
library
Libraries loaded at run-time from file system
“ldd <binary>” to see which dynamic libraries a program relies
upon
gcc flags “–shared” and “-soname” for handling and generating
dynamic shared object files
The linking process (ld)
Merges object files
Merges multiple relocatable (.o) object files into a single executable program.
Resolves external references
References to symbols defined in another object file.
Relocates symbols
Relocates symbols from their relative locations in the .o files to new absolute
positions in the executable.
Updates all references to these symbols to reflect their new positions.
References in both code and data
» code: a();
» data: int *xp=&x;
/* reference to symbol a */
/* reference to symbol x */
Executables
Various file formats
Linux
= Executable and Linkable Format (ELF)
Windows = Portable Executable (PE)
ELF
Standard binary format for object files in Linux
One unified format for
Relocatable
object files (.o),
Shared object files (.so)
Executable object files
Better support for shared libraries than old a.out
formats.
More complete information for debuggers.
ELF Object File Format
ELF header
Magic number, type (.o, exec, .so), machine,
byte ordering, etc.
Program header table
Page size, virtual addresses of memory
segments (sections), segment sizes, entry point
.text section
Code
.data section
Initialized (static) data
.bss section
Uninitialized (static) data
“Block Started by Symbol”
0
ELF header
Program header table
(required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.rel.data
.debug
Section header table
(required for relocatables)
ELF Object File Format (cont)
.symtab section
Symbol table
Procedure and static variable names
Section names and locations
.rel.text section
Relocation info for .text section
Addresses of instructions that will need to be
modified in the executable
Instructions for modifying.
.rel.data section
Relocation info for .data section
Addresses of pointer data that will need to be
modified in the merged executable
.debug section
Info for symbolic debugging (gcc -g)
0
ELF header
Program header table
(required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.rel.data
.debug
Section header table
(required for relocatables)
PE (Portable Executable) file format
Windows file format for executables
Based on COFF Format
Magic Numbers, Headers, Tables, Directories, Sections
Disassemblers
Overlay Data with C Structures
Load File as OS Loader Would
Identify Entry Points (Default & Exported)
Example C Program
m.c
a.c
int e=7;
extern int e;
int main() {
int r = a();
exit(0);
}
int *ep=&e;
int x=15;
int y;
int a() {
return *ep+x+y;
}
Merging Relocatable Object Files into an
Executable Object File
Relocatable Object Files
system code
.text
system data
.data
Executable Object File
0
headers
system code
main()
m.o
a.o
a()
main()
.text
int e = 7
.data
more system code
a()
.text
int *ep = &e
int x = 15
int y
.data
system data
int e = 7
int *ep = &e
int x = 15
uninitialized data
.bss
.text
.symtab
.debug
.data
.bss
Program execution
Operating system provides
Protection and resource allocation
Abstract view of resources (files, system calls)
Virtual memory
Uniform memory space abstraction for each process
Gives the illusion that each process has entire memory space
How does a program get loaded?
The operating system creates a new process.
Including
among other things, a virtual memory
space
Important: any hardware-based debugger must
know OS state in page tables to map accesses to
virtual addresses
System loader reads the executable file from the
file system into the memory space.
Reads
executable from file system into memory
space
Executable contains code and statically link libraries
Done via DMA (direct memory access)
Executable in file system remains and can be executed
Loading Executable Binaries
Executable object file for
example program p
0
ELF header
Program header table
(required for executables)
.text section
Process image
init and shared lib
segments
.data section
.bss section
.text segment
(r/o)
Virtual addr
0x080483e0
0x08048494
.symtab
.rel.text
.rel.data
.data segment
(initialized r/w)
0x0804a010
.debug
Section header table
(required for relocatables)
0x0804a3b0
.bss segment
(uninitialized r/w)
More on relocation
Assembly code with relative and absolute
addresses
With
VM abstraction, old linkers decide layout and
can supply definitive addresses
Windows “.com” format
Linker can statically bind the program to virtual addresses
Now, they provide hints as to where they would like to be
placed
But….this
could also be done at load time (address
space layout randomization)
Windows “.exe” format
Loader rewrites addresses to proper offsets
System needs to force position-independent code
» Force compiler to make all jumps and branches relative to current
location or relative to a base register set at run-time
ELF uses Global Offset Table
Program execution
CPU
Memory
Addresses
Registers
E
I
P
Object Code
Program Data
OS Data
Data
Condition
Codes
Instructions
Stack
Programmer-Visible State
EIP - Instruction Pointer
a. k. a. Program Counter
Address of next instruction
Register File
Heavily used program data
Condition Codes
Store status information about most recent arithmetic
operation
Used for conditional branching
Memory
Byte addressable array
Code, user data, OS data
Includes stack used to support
procedures
Run-time data structures
0xffffffff
kernel virtual memory
(code, data, heap, stack)
0xc0000000
user stack
(created at runtime)
0x40000000
memory
invisible to
user code
%esp (stack pointer)
memory mapped region for
shared libraries
brk
run-time heap
(managed by malloc)
read/write segment
(.data, .bss)
0x08048000
0
read-only segment
(.init, .text, .rodata)
unused
loaded from the
executable file
Registers
The processor operates on data in registers
(usually)
movl
(%eax), %ecx
Fetch data at address contained in %eax
Store in register %ecx
movl
$array, %ecx
Move address of variable array into %ecx
Typically,
data is loaded into registers, manipulated
or used, and then written back to memory
The IA32 architecture is “register poor”
Few
general purpose registers
Source or destination operand is often memory
locations
IA32 General Registers
31
15
87
0
%ax
%eax
%ah
%al
%cx
%ecx
%ch
%cl
%dx
General purpose
registers (mostly)
%edx
%dh
%dl
%bx
%ebx
%bh
%bl
%esi
%si
%edi
%di
%esp
%sp
Stack pointer
%ebp
%bp
Frame pointer
Special purpose
registers
Operand types
A typical instruction acts on 1 or more operands
addl
%ecx, %edx adds the contents of ecx to
edx
Three general types of operands
Immediate
Like a C constant, but preceded by $
e.g., $0x1F, $-533
Encoded with 1, 2, or 4 bytes based on instruction
Register:
the value in one of the 8 integer registers
Memory: a memory address
There are many modes for addressing memory
Operand examples using mov
Source
movl
Destination
C Analog
movl $0x4,%eax
temp = 0x4;
movl $-147,(%eax)
*p = -147;
Imm
Reg
Mem
movl %eax,%edx
temp2 = temp1;
Reg
Reg
Mem
movl %eax,(%edx)
*p = temp;
Mem
Reg
movl (%eax),%edx
temp = *p;
Memory-memory
single instruction
transfers cannot be done with
Addressing Modes
Immediate and registers have only one mode
Memory on the other hand …
Absolute
specify the address of the data
Indirect
use register to calculate address
Base + displacement
use register plus absolute address to calculate address
Indexed
Indexed
» Add contents of an index register
Scaled index
» Add contents of an index register scaled by a constant
Summary of IA32 Operand Forms
Type
Form
Operand Value
Name
Immediate
$Imm
Imm
Immediate
Register
Ea
R[Ea]
Register
Memory
Imm
M[Imm]
Absolute
Memory
(Ea)
M[R[Ea]]
Indirect
Memory
Imm(Eb)
M[Imm + R[Eb]
Base + displacment
Memory
(Eb, Ei)
M[R[Eb] + R[Ei]]
Indexed
Memory
Imm(Eb, Ei)
M[Imm + R[Eb] + R[Ei]]
Indexed
Memory
(, Ei, s)
M[R[Ei] * s]
Scaled Indexed
Memory
Imm(, Ei, s)
M[Imm + R[Ei] * s]
Scaled Indexed
Memory
(Eb, Ei, s)
M[R[Eb] + R[Ei] * s]
Scaled Indexed
Memory
Imm (Eb, Ei, s)
M[Imm + R[Eb] + R[Ei] * s]
Scaled Indexed
x86 instructions
Rules
Source
operand can be memory, register or
constant
Destination can be memory or register
Only one of source and destination can be memory
Source and destination must be same size
Flags set on each instruction
EFLAGS
Conditional
branches handled via EFLAGS
What’s the “l” for on the end?
addl 8(%ebp),%eax
It stands for “long” and is 32-bits
It tells the size of the operand.
Baggage from the days of 16-bit processors
For x86, x86_64
8
bits is a byte
16 bits is a word
32 bits is a double word
64 bits is a quad word
IA32 Standard Data Types
C Declaration
Intel Data Type
GAS Suffix
Size in bytes
char
Byte
b
1
short
Word
w
2
int
Double word
l
4
unsigned
Double word
l
4
long int
Double word
l
4
unsigned long
Double word
l
4
char *
Double word
l
4
float
Single precision
s
4
double
Double precision
l
8
long double
Extended precision
t
10/12
Global vs. Local variables
Global variables stored in either .data or .bss
section of process
Local variables stored on stack
Global vs local example
int x = 1;
int y = 2;
void a()
{
x = x+y;
printf("Total = %d\n",x);
}
int main(){a();}
void a()
{
int x = 1;
int y = 2;
x = x+y;
printf("Total = %d\n",x);
}
int main() {a();}
Global vs local example
int x = 1;
int y = 2;
void a()
{
x = x+y;
printf("Total = %d\n",x);
}
int main(){a();}
080483c4 <a>:
80483c4:
80483c5:
80483c7:
80483ca:
80483d1:
80483d8:
80483db:
80483de:
80483e1:
80483e5:
80483ec:
80483f1:
80483f2:
push
mov
sub
movl
movl
mov
add
mov
mov
movl
call
leave
ret
%ebp
%esp,%ebp
$0x18,%esp
$0x1,-0x8(%ebp)
$0x2,-0x4(%ebp)
-0x4(%ebp),%eax
%eax,-0x8(%ebp)
-0x8(%ebp),%eax
%eax,0x4(%esp)
$0x80484f0,(%esp)
80482dc <printf@plt>
void a()
{
int x = 1;
int y = 2;
x = x+y;
printf("Total = %d\n",x);
}
int main() {a();}
080483c4 <a>:
80483c4:
80483c5:
80483c7:
80483ca:
80483d0:
80483d5:
80483d8:
80483dd:
80483e2:
80483e6:
80483ed:
80483f2:
80483f3:
push
mov
sub
mov
mov
lea
mov
mov
mov
movl
call
leave
ret
%ebp
%esp,%ebp
$0x8,%esp
0x804966c,%edx
0x8049670,%eax
(%edx,%eax,1),%eax
%eax,0x804966c
0x804966c,%eax
%eax,0x4(%esp)
$0x80484f0,(%esp)
80482dc <printf@plt>
Arithmetic operations
void f(){
int a = 0;
int b = 1;
a = a+11;
a = a-b;
a--;
b++;
}
int main() { f();}
08048394 <f>:
8048394:
8048395:
8048397:
804839a:
80483a1:
80483a8:
80483ac:
80483af:
80483b2:
80483b6:
80483ba:
80483bb:
push
mov
sub
movl
movl
addl
mov
sub
subl
addl
leave
ret
%ebp
%esp,%ebp
$0x10,%esp
$0x0,-0x8(%ebp)
$0x1,-0x4(%ebp)
$0xb,-0x8(%ebp)
-0x4(%ebp),%eax
%eax,-0x8(%ebp)
$0x1,-0x8(%ebp)
$0x1,-0x4(%ebp)
Machine Instruction Example
int sum(int x, int y)
{
int t = x+y;
return t;
}
_sum:
pushl %ebp
movl %esp,%ebp
movl 12(%ebp),%eax
addl 8(%ebp),%eax
movl %ebp,%esp
popl %ebp
ret
C Code
Add
two signed integers
Assembly
Add 2 4-byte integers
“Long” words in GCC parlance
Same instruction whether signed or
unsigned
Operands:
x:
y:
t:
Register
Memory
Register
%eax
M[%ebp+8]
%eax
» Return function value in %eax
Object Code
0x401046:
03 45 08
3-byte
instruction
Stored at address 0x401046
Condition codes
The IA32 processor has a register called eflags
(extended flags)
Each bit is a flag, or condition code
CF Carry Flag SF Sign Flag
ZF Zero Flag
OFOverflow Flag
As programmers, we don’t write to this register and seldom
read it directly
Flags are set or cleared by hardware depending on the
result of an instruction
Condition Codes (cont.)
Setting condition codes via compare instruction
cmpl b,a
Computes a-b without setting destination
CF set if carry out from most significant bit
Used for unsigned comparisons
set if a == b
SF set if (a-b) < 0
OF set if two’s complement overflow
ZF
(a>0 && b<0 && (a-b)<0) || (a<0 && b>0 &&
(a-b)>0)
Byte
and word versions cmpb, cmpw
Condition Codes (cont.)
Setting condition codes via test instruction
testl b,a
Computes a&b without setting destination
Sets condition codes based on result
Useful to have one of the operands be a mask
Often used to test zero,
testl %eax, %eax
positive
set when a&b == 0
SF set when a&b < 0
Byte and word versions testb, testw
ZF
if statements
void f(){
int x = 1;
int y = 2;
if (x==y) {
printf("x equals y.\n");
} else {
printf("x is not equal to y.\n");
}
}
int main() { f();}
080483c4 <f>:
80483c4:
80483c5:
80483c7:
80483ca:
80483d1:
80483d8:
80483db:
80483de:
80483e0:
80483e7:
80483ec:
80483ee:
80483f5:
80483fa:
80483fb:
push
mov
sub
movl
movl
mov
cmp
jne
movl
call
jmp
movl
call
leave
ret
%ebp
%esp,%ebp
$0x18,%esp
$0x1,-0x8(%ebp)
$0x2,-0x4(%ebp)
-0x8(%ebp),%eax
-0x4(%ebp),%eax
80483ee <f+0x2a>
$0x80484f0,(%esp)
80482d8 <puts@plt>
80483fa <f+0x36>
$0x80484fc,(%esp)
80482d8 <puts@plt>
if statements
int a = 1, b = 3, c;
if (a > b)
c = a;
else
c = b;
00000018: C7 45 FC 01 00 00 00
mov dword ptr [ebp-4],1
; store a = 1
0000001F: C7 45 F8 03 00 00 00
mov dword ptr [ebp-8],3
; store b = 3
00000026: 8B 45 FC
mov eax,dword ptr [ebp-4]
; move a into EAX register
00000029: 3B 45 F8
cmp eax,dword ptr [ebp-8]
; compare a with b (subtraction)
0000002C: 7E 08
jle 00000036
; if (a<=b) jump to line 00000036
0000002E: 8B 4D FC
mov ecx,dword ptr [ebp-4]
; else move 1 into ECX register &&
00000031: 89 4D F4
mov dword ptr [ebp-0Ch],ecx
; move ECX into c (12 bytes down) &&
00000034: EB 06
jmp 0000003C
; unconditional jump to 0000003C
00000036: 8B 55 F8
mov edx,dword ptr [ebp-8]
; move 3 into EDX register &&
00000039: 89 55 F4
mov dword ptr [ebp-0Ch],edx
; move EDX into c (12 bytes down)
Loops
int factorial_do(int x)
{
int result = 1;
do {
result *= x;
x = x-1;
} while (x > 1);
return result;
}
factorial_do:
pushl
movl
movl
movl
.L2:
imull
decl
cmpl
jg
leave
ret
%ebp
%esp, %ebp
8(%ebp), %edx
$1, %eax
%edx, %eax
%edx
$1, %edx
.L2
C switch statements
Implementation options
Series of conditionals
testl followed by je
Good if few cases
Slow if many cases
Jump table (example below)
Lookup branch target from a table
Possible with a small range of integer constants
GCC picks implementation based on structure
Example:
.L3
switch (x) {
case 1:
case 5:
code at L0
case 2:
case 3:
code at L1
default:
code at L2
}
.L2
.L0
.L1
.L1
.L2
.L0
1. init jump table at .L3
2. get address at .L3+4*x
3. jump to that address
Example
int switch_eg(int x)
{
int result = x;
switch (x) {
case 100:
result *= 13;
break;
case 102:
result += 10;
/* Fall through */
case 103:
result += 11;
break;
case 104:
case 106:
result *= result;
break;
default:
result = 0;
}
return result;
}
int switch_eg(int x)
{
int result = x;
switch (x) {
case 100:
result *= 13;
break;
case 102:
result += 10;
/* Fall through */
case 103:
result += 11;
break;
case 104:
case 106:
result *= result;
break;
default:
result = 0;
leal -100(%edx),%eax
cmpl $6,%eax
ja .L9
jmp *.L10(,%eax,4)
.p2align 4,,7
.section
.rodata
.align 4
.align 4
.L10:
.long .L4
.long .L9
.long .L5
.long .L6
.long .L8
.long .L9
.long .L8
.text
.p2align 4,,7
.L4:
leal (%edx,%edx,2),%eax
leal (%edx,%eax,4),%edx
jmp .L3
.p2align 4,,7
.L5:
addl $10,%edx
.L6:
addl $11,%edx
jmp .L3
.p2align 4,,7
.L8:
imull %edx,%edx
jmp .L3
.p2align 4,,7
.L9:
xorl %edx,%edx
.L3:
movl %edx,%eax
}
return result;
}
Key is jump table
41 at L10
Array of pointers to jump locations
x86-64 conditionals
Modern CPUs with deep pipelines
Instructions
fetched far in advance of execution
Mask the latency going to memory
Problem: What if you hit a conditional branch?
Must predict which branch to take!
Branch prediction in CPUs well-studied, fairly effective
But, best to avoid conditional branching altogether
x86-64 conditionals
Conditional instruction execution
Conditional Move
Conditional move instruction
cmovXX src, dest
Move value from src to dest if condition XX holds
No branching
Handled as operation within Execution Unit
Added with P6 microarchitecture (PentiumPro onward)
Example
movl 8(%ebp),%edx
movl 12(%ebp),%eax
cmpl %edx, %eax
cmovll %edx,%eax
#
#
#
#
Get x
rval=y
rval:x
If <, rval=x
Current version of GCC won’t use this instruction
Thinks it’s compiling for a 386
Performance
14 cycles on all data
More efficient than conditional branching (simple control flow)
But overhead: both branches are evaluated
x86-64 conditional example
int absdiff(
int x, int y)
{
int result;
if (x > y) {
result = x-y;
} else {
result = y-x;
}
return result;
}
absdiff:
movl
%edi,
movl
%esi,
subl
%esi,
subl
%edi,
cmpl
%esi,
cmovle %edx,
ret
# x in %edi, y in %esi
%eax # eax = x
%edx # edx = y
%eax # eax = x-y
%edx # edx = y-x
%edi # x:y
%eax # eax=edx if <=
IA32 Stack
Stack “Bottom”
Region
of memory
managed with stack
discipline
Grows toward lower
addresses
Register %esp indicates
lowest stack address
Increasing
Addresses
address of top element
Stack
Pointer
%esp
Stack Grows
Down
Stack “Top”
IA32 Stack Pushing
Stack “Bottom”
Pushing
pushl
Src
Decrement %esp by 4
Increasing
Addresses
Fetch operand at Src
Write operand at address
given by %esp
e.g.
pushl %eax
subl $4, %esp
movl %eax,(%esp)
Stack Grows
Down
Stack
Pointer
%esp
-4
Stack “Top”
IA32 Stack Popping
Stack “Bottom”
Popping
popl
Dest
Read operand at address
Increasing
Addresses
given by %esp
Write to Dest
Increment %esp by 4
e.g.
popl %eax
movl (%esp),%eax
addl $4,%esp
Stack
Pointer
%esp
Stack Grows
Down
+4
Stack “Top”
Stack Operation Examples
Initially
pushl %eax
popl %edx
0x110
0x110
0x110
0x10c
0x10c
0x10c
0x108
123
0x108
123
0x108
123
Top
0x104
213
0x104
213
Top
%eax
213
%edx
%esp
%eax
213
%edx
0x108
%esp
0x108
0x104
Top
%eax
213
%edx
555
213
%esp
0x104
0x108
Procedure Control Flow
Procedure call:
call label
Push address of next instruction (after the call) on stack
Jump to label
Procedure return:
ret
Pop address from stack into eip register
Procedure Call Example
804854e:
8048553:
e8 3d 06 00 00
50
call
8048b90 <main>
next instruction
call
0x110
0x110
0x10c
0x10c
0x108
123
8048b90
0x108
123
0x104
0x8048553
%esp
0x108
%esp
0x108
0x104
%eip
0x804854e
%eip
0x804854e
0x8048b90
%eip is program counter
Procedure Return Example
8048e90:
c3
ret
ret
0x110
0x110
0x10c
0x10c
0x108
123
0x104
0x8048553
%esp
0x104
%esp
0x104
0x108
%eip
0x8048e90
%eip
0x8048553
0x8048e91
%eip is program counter
0x108
123
0x8048553
Procedure Control Flow
When procedure foo calls who:
foo is the caller, who is the callee
Control is transferred to the ‘callee’
When procedure returns
Control is transferred back to the ‘caller’
Last-called, first-return (LIFO) order
Naturally implemented via the stack
foo(…)
{
• • •
who();
• • •
}
call
who(…)
{
• • •
amI();
• • •
amI();
ret • • •
}
call
amI(…)
{
• • •
ret • • •
}
Procedure calls and stack frames
How does the ‘callee’ know where to return later?
Return address placed in a well-known location on
stack within a “stack frame”
How are arguments passed to the ‘callee’?
Arguments placed in a well-known location on stack
within a “stack frame”
Upon procedure invocation
Stack frame is pushed onto program stack
Upon procedure return
Its frame is popped off of stack
Caller’s stack frame is recovered
foo’s
stack
frame
who’s
stack
frame
amI’s
stack
frame
Call chain: foo => who => amI
increasing addresses
Stack frame created for the procedure
stack growth
Stack bottom
Keeping track of stack frames
The stack pointer (%esp) moves around
Can
be changed within procedure
Problem
How can we consistently find our parameters?
The
base pointer (%ebp)
Points to the base of our current stack frame
Also called the frame pointer
Within each function, %ebp stays constant
Most information on the stack is referenced
relative to the base pointer
Base pointer setup is the programmer’s
Actually usually the compiler’s job
job
IA32/Linux Stack Frame
Current Stack Frame (Yellow) (From Top
to Bottom)
Parameters for function about to be
called
Caller
Frame
“Argument build” of caller
Arguments
Local variables
If can’t keep in registers
Saved register context
Old frame pointer
Frame Pointer
(%ebp)
Saved
Registers
+
Local
Variables
Caller Stack Frame (Pink)
Return address
Pushed by call instruction
Return Addr
Old %ebp
Arguments for this call
“Argument build” of callee
etc…
Stack Pointer
(%esp)
Argument
Build
swap
Calling swap from call_swap
int zip1 = 15213;
int zip2 = 91125;
void call_swap()
{
swap(&zip1, &zip2);
}
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
call_swap:
• • •
pushl $zip2
pushl $zip1
call swap
• • •
# Global Var
# Global Var
•
•
•
Resulting
Stack
&zip2
&zip1
Rtn adr
%esp
swap
swap:
pushl %ebp
movl %esp,%ebp
pushl %ebx
movl
movl
movl
movl
movl
movl
12(%ebp),%ecx
8(%ebp),%edx
(%ecx),%eax
(%edx),%ebx
%eax,(%edx)
%ebx,(%ecx)
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
Setup
Body
Finish
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
swap Setup #1
Resulting
stack
Entering
Stack
%ebp
%ebp
•
•
•
•
•
•
&zip2
yp
&zip1
xp
Rtn adr
%esp
Rtn adr
Old %ebp
swap:
pushl %ebp
movl %esp,%ebp
pushl %ebx
%esp
swap Setup #2
Resulting
stack
Stack before
instruction
%ebp
•
•
•
•
•
•
yp
yp
xp
xp
Rtn adr
Rtn adr
Old %ebp
%esp
Old %ebp
%ebp
%esp
swap:
pushl %ebp
movl %esp,%ebp
pushl %ebx
swap Setup #3
Resulting
Stack
Stack before
instruction
•
•
•
•
•
•
yp
yp
xp
xp
Rtn adr
Rtn adr
Old %ebp
%ebp
Old %ebp
%ebp
%esp
Old %ebx
%esp
swap:
pushl %ebp
movl %esp,%ebp
pushl %ebx
Effect of swap Setup
Resulting
Stack
Entering
Stack
%ebp
•
•
•
Offset
(relative to %ebp)
•
•
•
&zip2
12
yp
&zip1
8
xp
4
Rtn adr
0
Old %ebp
%ebp
Old %ebx
%esp
Rtn adr
%esp
movl 12(%ebp),%ecx # get yp
movl 8(%ebp),%edx # get xp
. . .
Body
swap Finish #1
swap’s
Stack
•
•
•
Offset
Offset
•
•
•
12
yp
12
yp
8
xp
8
xp
4
Rtn adr
4
Rtn adr
0
Old %ebp
%ebp
0
Old %ebp
%ebp
-4
Old %ebx
%esp
-4
Old %ebx
%esp
Observation
Saved
& restored register %ebx
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
swap Finish #2
swap’s
Stack
Offset
swap’s
Stack
•
•
•
Offset
•
•
•
12
yp
12
yp
8
xp
8
xp
4
Rtn adr
4
Rtn adr
0
Old %ebp
%ebp
0
Old %ebp
-4
Old %ebx
%esp
%ebp
%esp
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
swap Finish #3
swap’s
Stack
Offset
•
•
•
12
yp
8
xp
4
Rtn adr
0
Old %ebp
%ebp
swap’s
Stack
Offset
%ebp
•
•
•
12
yp
8
xp
4
Rtn adr
%esp
%esp
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
swap Finish #4
%ebp
swap’s
Stack
%ebp
•
•
•
•
•
•
12
yp
&zip2
8
xp
&zip1
4
Rtn adr
Offset
Exiting
Stack
%esp
%esp
Observation
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
& restored register %ebx
Didn’t do so for %eax, %ecx, or %edx
Saved
swap
swap:
pushl %ebp
movl %esp,%ebp
pushl %ebx
Setup
movl
movl
movl
movl
movl
movl
Body
12(%ebp),%ecx
8(%ebp),%edx
(%ecx),%eax
(%edx),%ebx
%eax,(%edx)
%ebx,(%ecx)
movl -4(%ebp),%ebx
movl %ebp,%esp
popl %ebp
ret
void swap(int *xp, int *yp)
{
int t0 = *xp;
int t1 = *yp;
*xp = t1;
*yp = t0;
}
Save old %ebp of caller frame
Set new %ebp for callee (current) frame
Save state of %ebx register from caller
Retrieve parameter yp from caller frame
Retrieve parameter xp from caller frame
Perform swap
Finish
Restore the state of caller’s %ebx register
Set stack pointer to bottom of callee frame (%ebp)
Restore %ebp to original state
Pop return address from stack to %eip
Equivalent to single leave instruction
Local variables
Where are they in relation to ebp?
Stored
“above” %ebp (at lower addresses)
How are they preserved if the current function
calls another function?
updates %esp beyond local variables
before issuing “call”
Compiler
What happens to them when the current function
returns?
Are
lost (i.e. no longer valid)
Register Saving Conventions
When procedure foo calls who:
foo is the caller, who is the callee
Can Register be Used for Temporary Storage?
Conventions
“Caller Save”
Caller saves temporary in its frame before calling
“Callee
Save”
Callee saves temporary in its frame before using
IA32 Register Usage
Integer Registers
Two have special uses
%ebp, %esp
Three managed as callee-save
%eax
Caller-Save
Temporaries
%ebx, %esi, %edi
Old values saved on stack
prior to using
Three managed as caller-save
%eax, %edx, %ecx
Do what you please, but
expect any callee to do so, as
well
Return value in %eax
%edx
%ecx
%ebx
Callee-Save
Temporaries
%esi
%edi
%esp
Special
%ebp
simple.c
gcc –O2 –c simple.c
int simple(int *xp, int y)
{
int t = *xp + y;
*xp = t;
return t;
}
_simple:
pushl
movl
movl
movl
movl
addl
movl
popl
ret
%ebp
Setup stack frame pointer
%esp, %ebp
8(%ebp), %edx
get xp
12(%ebp), %ecx get y
(%edx), %eax
move *xp to t
%ecx, %eax
add y to t
%eax, (%edx)
store t at *xp
%ebp
restore frame pointer
return to caller
Function pointers
Pointers in C can also point to code locations
Function pointers
Store and pass references to code
Some uses
Dynamic “late-binding” of functions
Dynamically “set” a random number generator
Replace large switch statements for implementing dynamic event handlers
» Example: dynamically setting behavior of GUI buttons
Emulating “virtual functions” and polymorphism from OOP
qsort() with user-supplied callback function for comparison
» man qsort
Operating on lists of elements
» multiplicaiton, addition, min/max, etc.
Malware leverages this to execute its own code
Using pointers to functions
// function prototypes
int doEcho(char*);
int doExit(char*);
int doHelp(char*);
int setPrompt(char*);
// dispatch table section
typedef int (*func)(char*);
typedef struct{
char* name;
func function;
} func_t;
func_t func_table[] =
{
{ "echo",
doEcho },
{ "exit",
doExit },
{ "quit",
doExit },
{ "help",
doHelp },
{ "prompt", setPrompt },
};
// find the function and dispatch it
for (i = 0; i < cntFuncs; i++) {
if (strcmp(command,func_table[i].name)==0){
done = func_table[i].function(argument);
break;
}
}
if (i == cntFuncs)
printf("invalid command\n");
#define cntFuncs
(sizeof(func_table) / sizeof(func_table[0]))
Function pointers example
main:
leal
#include <sys/time.h>
#include <stdio.h>
void fp1(int i){ printf("Even\n“,i);}
void fp2(int i) { printf("Odd\n”,i); }
andl
pushl %ebp
main(int argc, char **argv) {
void (*fp)(int);
int i = argc;
}
%esp, %ebp
pushl %ecx
subl
$4, %esp
movl
(%ecx), %eax
movl
$fp2, %edx
testb $1, %al
jne
.L4
movl
$fp1, %edx
movl
%eax, (%esp)
.L4:
call
mashimaro % ./funcp a
Even 2
mashimaro % ./funcp a b
Odd 3
mashimaro %
$-16, %esp
pushl -4(%ecx)
movl
if (argc%2)
fp=fp2;
else
fp=fp1;
fp(i);
4(%esp), %ecx
*%edx
addl
$4, %esp
popl
%ecx
popl
%ebp
leal
ret
-4(%ecx), %esp
Uses in operating system
Interrupt descriptor table
Pointers
to interrupt handler functions
IDTR points to IDT
System services descriptor table
Pointers
to system call functions
Import address table
Pointers
to imported library calls
Malware attacks all of these
More disassembly
Code patterns in assembly
Calling
conventions (fast vs. standard vs. cdecl)
ebp omission
ecx use as C++ this pointer
C++ vtables (virtual function table)
WinXP SP2 prologue with patching support
For detours
Exception
handlers (FS register)
Linked list of functions stored in exception frames on
stack
Advanced disassembly
Windows examples
Largely
the same with small modifications
Size of operands (i.e. dword) specified (not in
operator suffix)
Reverse ordering of operands
Disassembly example
0000
mov ecx, 5
for(int i=0;i<5;i++)
0003
push aHello
{
0009
call printf
000E
loop 00000003h
0014
...
0000
cmp ecx, 100h
if(x == 256)
0003
jnz 001Bh
{
0009
push aYes
000F
call printf
}
0015
jmp 0027h
else
001B
push aNo
{
0021
call printf
0027
...
printf(“Hello”);
}
printf(“Yes”);
printf(“No”);
}
Disassembly example
int main(int argc, char **argv)
{
WSADATA wsa;
SOCKET s;
struct sockaddr_in name;
unsigned char buf[256];
// Initialize Winsock
if(WSAStartup(MAKEWORD(1,1),&wsa))
return 1;
// Create Socket
s = socket(AF_INET,SOCK_STREAM,0);
if(INVALID_SOCKET == s)
goto Error_Cleanup;
name.sin_family = AF_INET;
name.sin_port = htons(PORT_NUMBER);
name.sin_addr.S_un.S_addr = htonl(INADDR_ANY);
// Bind Socket To Local Port
if(SOCKET_ERROR == bind(s,(struct sockaddr*)&name,sizeof(name)))
goto Error_Cleanup;
// Set Backlog parameters
if(SOCKET_ERROR == listen(s,1))
goto Error_Cleanup;
push ebp
mov ebp, esp
sub esp, 2A8h
lea eax, [ebp+0FFFFFE70h]
push eax
push 101h
call 4012BEh
test eax, eax
jz 401028h
mov eax, 1
jmp 40116Fh
push 0
push 1
push 2
call 4012B8h
mov dword ptr [ebp+0FFFFFE6Ch], eax
cmp dword ptr [ebp+0FFFFFE6Ch], byte 0FFh
jnz 401047h
jmp 401165h
mov word ptr [ebp+0FFFFFE5Ch], 2
push 800h
call 4012B2h
mov word ptr [ebp+0FFFFFE5Eh], ax
push 0
call 4012ACh
mov dword ptr [ebp+0FFFFFE60h], eax
push 10h
lea ecx, [ebp+0FFFFFE5Ch]
push ecx
mov edx, [ebp+0FFFFFE6Ch]
push edx
call 4012A6h
cmp eax, byte 0FFh
jnz 40108Dh
jmp 401165h
push 1
mov eax, [ebp+0FFFFFE6Ch]
push eax
call 4012A0h
cmp eax, byte 0FFh
jnz 4010A5h
jmp 401165h
Tools for disassembling
IDA Pro, IDA Pro Free
–
Disassembler
–
Execution graph
–
Cross-referencing
–
Searching
–
Function analysis
–
Function and variable labeling
Tools for disassembling
objdump
objdump -d <object_file>
Analyzes bit pattern of series of instructions
Produces approximate rendition of assembly code
Can be run on either executable or relocatable (.o) file
gdb Debugger
gdb p
disassemble sum
Disassemble procedure
x/13b sum
Examine the 13 bytes starting at sum
In-class exercise
Lab 5-1 (Steps 1-17)
–
Use IDA Pro to bring up the code of DllMain
–
Bring up Figures 5-1L, the equivalent of 5-2L, and 5-3L
–
Find the remote shell routine in which memcmp is used to
compare command strings received over the network
–
Show the code for the function called if the command robotwork
is invoked
–
Show IDA Pro graphs of DLLMain and sub_10004E79
–
Explain what the assembly code on p. 499 does
–
Find the socket call referred to in Table 5-1L and change its
integer constants to symbolic ones
–
Show the assembly on p. 500. Find the routine that calls this
assembly which shows that it is an anti-VM check.
In-class exercise
Lab 6-1
–
Show the imported network functions in any tool
–
Show the output of executing the binary
–
Load binary in IDA Pro to generate Figure 6-1L
Lab 6-2
–
Generate Listing 6-1L and 6-2L using a tool of your choice. What
calls hint at this code's function?
–
Using either Wireshark or netcat with Apate DNS, execute the
malware to generate Listing 6-3L
–
In IDA Pro, show the functions called by main. What does each
one do?
–
In IDA Pro, show the order that the WinINet calls are used and
explain what each one does.
–
Generate Listing 6-5L and explain what each cmp does.
Windows
Chapter 7: Analyzing Malicious Windows
Programs
Types
Hungarian notation
word
(w) = 16 bit value
double word (dw) = dword = 32 bit value
•
Handles
•
Long
dwSize = A type that is a 32-bit value
(H)
HWND = A handle to a window
Pointer (LP)
Callback
File system functions
Malware often hits file system
CreateFile,
ReadFile, WriteFile
Memory mapping calls: CreateFileMapping,
MapViewOfFile
Trickiness
•
•
•
Alternate Data Streams (special file data)
\Device\PhysicalMemory (accesses memory)
\\.\ (accesses device)
Registry functions
Malware often hits registry
Registry
stores OS and program configuration
information
HKEY_LOCAL_MACHINE (HKLM) – Settings global
to the machine
HKEY_CURRENT_USER (HKCU) – Settings for
current user
Regedit tool for examining values
Functions: RegOpenKeyEx, RegSetValueEx,
RegGetValue (Listing 7-1)
Networking APIs
Berkeley sockets API
socket,
bind, listen, accept, connect, recv, send
Listing 7-3
WinINet API
InternetOpen,
InternetOpenURL, InternetReadFile
DLLs
Dynamic link libraries
Store
code that is re-used amongst applications
including malware
Can be used to store malicious code for injection
into a process
Malware uses standard Windows DLLs to interact
with OS
Malware uses third-party DLLs (e.g. Firefox DLL) to
avoid re-implementing functions
Processes
Execute code outside of current process
CreateProcess
Listing
7-4
Hijack execution of current process
Injecting
code via debugger or DLLs
Companion execution
Store
executable in resource section of PE
Program extracts executable and writes it to disk
upon execution
Threads
Windows threads share same memory space but
have separate registers and stack
Used
by Malware to insert a malicious DLL into a
process's address space
CreateThread
address
with address of LoadLibrary as start
Services
Processes run in the background
Scheduled
and run by Windows service manager
without user input
OpenSCManager,
CreateService, StartService
Allows
malware to maintain persistence on a
machine
Types
•
•
•
WIN32_SHARE_PROCESS = allows multiple
processes to contact service (e.g. svchost.exe)
WIN32_OWN_PROCESS = independent process
KERNEL_DRIVER = loads code into kernel
COM
Microsoft Component Object Model
Interface
standard that allows software components
to call each other
•
•
OleInitialize, CoInitializeEx
CLSID = class identifier, IID = interface identifier
“Navigate”
•
•
Malware
•
•
function in IWebBrowser2 interface
Used by malware to launch browser
Listing 7-11
implemented as COM server
Browser helper objects
Detect COM servers running via its calls
–
DllCanUnloadNow, DllGetClassObject, DllInstall,
DllRegisterServer, DllUnregisterServer
Exceptions
Allow program to handle exceptional conditions
during program execution
Windows
•
•
•
•
Used
Structured Exception Handling
Exception handling information stored on stack
Listing 7-13
Not all handlers respond to all exceptions
Thrown to caller's frame if not handled
by malware to hijack execution
•
•
Handler address replaced by address to
injected malicious code
Adversary then triggers exception
Kernel-mode malware
Windows API calls (Kernel32.dll)
Typically
call into underlying Native API (Ntdll.dll)
Code
in Ntdll then transfers to kernel
(Ntoskrnl.exe) via INT 0x2E, SYSENTER,
SYSCALL
•
Figure 7-3
Malware
often calls Ntdll directly to avoid
detection via interposition of security programs
between Kernel32.dll and Ntdll.dll
•
•
Example: Windows API (ReadFile, WriteFile)
versus Native API (NtReadFile, NtWriteFile)
Figure 7-4
Kernel-mode malware
Other Native API calls
NtQuerySystemInformation,
NtQueryInformationProcess,
NtQueryInformationThread,
NtQueryInformationFile, NtQueryInformationKey
•
Can also carry “Zw” prefix
NtContinue
•
•
Used to return from an exception
Location to return is specified in exception
context, but can be modified to transfer
execution in nefarious ways
Kernel-mode malware
Legitimate programs typically do not use
NativeAPI exclusively
Programs that are native applications (as
specified in subsytem part of PE header) are
likely malicious
In-class exercise
Lab 7-2
Using strings, identify the network resource being used by the
malware
What imports give away the mechanism this malware uses to
launch the browser?
Go to the code snippet shown on p. 518. Follow the references
to show the values of rclsid and riid in memory.
Debug the program and break at the call shown on p. 519. Run
the call to show the browser being launched with the embedded
URL
Extra
Run-time data structures
More code snippets
Registry modifications for disabling task manager and changing browser default
page
HKEY_CURRENT_USER\Software\Policies\Microsoft\Internet Explorer\Control Panel,Homepage
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Policies\SystemDisableRegistryTools
HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\MainStart Page
HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_buzz content url
HKEY_CURRENT_USER\Software\Yahoo\pager\View\YMSGR_Launchcast DisableTaskMgr
More code snippets
Kills anti-virus, zone-alarm, firewall processes
More code snippets
New variants
Download worm update files and register them as services
regsvr32 MSINET.OCX
Internet Transfer ActiveX Control
Check for updates