Hello ASM World:
A Painless and Contextual
Introduction to x86 Assembly
rogueclown
DerbyCon 3.0
September 28, 2013
who?
• security consultant by vocation
• mess around with computers, code, CTFs
by avocation
• frustrated when things feel like a black
box
what is assembly language?
• not exactly machine language…but close
– instructions: mnemonics for machine
operations
– normally a one-to-one correlation between
ASM instruction and machine instruction
• varies by processor
– today, we will be discussing 32-bit x86
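A quick illustration of that near one-to-one mapping, as a sketch rather than anything taken from the talk's slides. Each mnemonic below assembles to the bytes shown in its comment (assuming NASM, Intel syntax, 32-bit mode).

```nasm
bits 32                     ; assemble for 32-bit x86

    mov ebx, 0xfee1dead     ; BB AD DE E1 FE  (opcode BB, then the value little-endian)
    inc eax                 ; 40
    int 0x80                ; CD 80
    ret                     ; C3
```

To check the bytes yourself, assemble the file with `nasm -f bin` and run the output through `ndisasm -b 32`. This snippet is only meant to be assembled and inspected, not executed.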
why learn assembly language?
• some infosec disciplines require it
• curious about lower-level details of
memory or interfacing with an operating
system
• it’s fun and challenging!
how does assembly
language work?
hello memory
• what parts of computer memory does
assembly language commonly access?
• how does assembly language access those
parts of computer memory?
where is this memory?
• what one “normally” thinks of as memory
– RAM
– virtual memory
• CPU
– registers
computer memory layout
• heap
– dynamically allocated memory, requested at
run time (global variables live in .data and .bss)
– envision a bookshelf…that won’t let you push
books together when you take one out
• stack
– local, contextual variables
– envision a card game discard pile
– you will use this when coding ASM. a lot.
registers
• memory located on the CPU
• registers are awesome because they are
fast.
• registers are a pain because they are tiny.
registers
• general purpose registers
– alphabet soup
• eax, ebx, ecx, edx
• can address in parts: ax, ah, al
– stack and base pointers
• esp
• ebp
– index registers
• esi, edi
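How "can address in parts" plays out, as a minimal sketch for 32-bit Linux with NASM. The program itself is illustrative and not from the talk; it exits through a system call only so the result is visible as the exit status.

```nasm
section .text
global _start
_start:
    mov eax, 0x11223344     ; eax = 0x11223344
                            ;  ax = 0x3344 (low 16 bits of eax)
                            ;  ah = 0x33   (high byte of ax)
                            ;  al = 0x44   (low byte of ax)
    mov al, 0xff            ; only the low byte changes: eax = 0x112233ff
    mov ah, 0x00            ; only bits 8-15 change:     eax = 0x112200ff
    movzx ebx, al           ; ebx = 0x000000ff (zero-extended copy of al)

    mov eax, 1              ; sys_exit
    int 0x80                ; exit status = 255
```

One way to build and run it, with hypothetical file names: `nasm -f elf32 parts.asm -o parts.o`, then `ld -m elf_i386 parts.o -o parts`, then `./parts; echo $?`.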
registers
• instruction pointer
– eip
– holds the address of the next instruction for
the program to execute
• other registers
– eflags
– segment registers
instructions
• mov
– copies a value into a register
– can either specify a value, or specify a
register where a value resides
• syntax in assembly
– Intel syntax: mov ebx, 0xfee1dead
– AT&T syntax: mov $0xfee1dead, %ebx
instructions
• interrupt
– int 0x80
– int 0x3
• system calls
– how a program
interacts with the
kernel of the OS
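A minimal sketch of a system call through `int 0x80`, assuming 32-bit Linux and NASM. The syscall number goes in eax and the first argument in ebx; the exit status of 7 is an arbitrary value chosen for illustration. (`int 0x3`, the other interrupt on the slide, raises the breakpoint trap that debuggers rely on.)

```nasm
section .text
global _start
_start:
    mov eax, 1      ; syscall number 1 = sys_exit on 32-bit Linux
    mov ebx, 7      ; first argument: the exit status
    int 0x80        ; hand control to the kernel
```

After building with `nasm -f elf32` and linking with `ld -m elf_i386`, running the program and then `echo $?` should print 7.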
instructions
• mathematical instructions
– add, sub, mul, div
mov eax, 10
cdq ; edx is now 0
mov ebx, 3 ; div takes a register or memory operand, not an immediate
div ebx ; eax is now 3, edx is now 1
– dec, inc – useful for looping
mov ecx, 3
dec ecx ; ecx is now 2
jumps
• jge, jg, jle, jl
– work with a compare (cmp) instruction
• jz, jnz, js, jns
– check zero flag or sign flag for jump
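A sketch of a compare-and-jump loop, assuming 32-bit Linux and NASM. The label name and the sum-to-five task are made up for illustration. `cmp` sets the flags and `jle` reads them.

```nasm
section .text
global _start
_start:
    xor ebx, ebx        ; ebx = running total, starts at 0
    mov ecx, 1          ; ecx = counter i, starts at 1
sum_loop:
    add ebx, ecx        ; total += i
    inc ecx             ; i++
    cmp ecx, 5          ; compare i with 5 (sets the flags)
    jle sum_loop        ; jump back while i <= 5 (signed comparison)

    mov eax, 1          ; sys_exit, with the total already in ebx
    int 0x80            ; exit status = 1+2+3+4+5 = 15
```

A countdown loop can skip the `cmp` entirely: `dec ecx` followed by `jnz` works because `dec` sets the zero flag when the result reaches 0.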
instructions
• stack operations: push and pop
mov eax, 10
push eax ; 10 on top of stack
inc eax ; eax is now 11
push eax ; 11 on top of stack
pop ebx ; ebx is now 11
pop ecx ; ecx is now 10
instructions
• function access instructions
– call
• places the address of the next instruction on top
of the stack
• moves execution to identified function
– ret
• returns to the memory address on top of the
stack
• designed to work in tandem with the “call”
instruction…but we’re hackers, yes?
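A sketch of both halves of that slide, assuming 32-bit Linux and NASM, with hypothetical labels `add_one` and `finish`. It shows an ordinary call/ret pair, then a `ret` that "returns" to an address the code pushed itself, with no matching `call`.

```nasm
section .text
global _start
_start:
    mov eax, 41
    call add_one        ; pushes the address of the next instruction, jumps to add_one
    mov ebx, eax        ; execution resumes here after ret; ebx = 42

    push finish         ; put an address of our choosing on top of the stack...
    ret                 ; ...and ret jumps to it
    mov ebx, 0          ; never reached

finish:
    mov eax, 1          ; sys_exit
    int 0x80            ; exit status = 42

add_one:
    inc eax             ; eax = 42
    ret                 ; pops the saved address and jumps back to the caller
```

The push/ret pair is the same mechanism an attacker leans on when a return address on the stack gets overwritten: `ret` trusts whatever is on top.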
sections of ASM code
• .data
– initialized variables and constants, with values fixed at build time
• .bss
– declarations of variables that are set or
changed at runtime
• .text
– executable instructions
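How the three sections might look in one small NASM file, as a sketch assuming 32-bit Linux. The names `start` and `result` are hypothetical.

```nasm
section .data
    start   dd 40           ; initialized: the value 40 is stored in the binary

section .bss
    result  resd 1          ; reserved: 4 bytes, given a value only at runtime

section .text
global _start
_start:
    mov eax, [start]        ; load the initialized value
    add eax, 2              ; eax = 42
    mov [result], eax       ; store into the runtime variable
    mov ebx, [result]       ; use it as the exit status
    mov eax, 1              ; sys_exit
    int 0x80                ; exit status = 42
```

In NASM, a bare name such as `start` is an address, and the square brackets read or write the memory at that address.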
$%&#@%^ instructions:
how do they work?
putting it together
• time to take a bit of C code, and
reimplement it in assembly language!
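The C program used in the live demo is not part of this transcript. As a stand-in, here is a sketch of the classic case, assuming 32-bit Linux and NASM, with the assumed C equivalent in the comments.

```nasm
; Assumed C equivalent (not the talk's own listing):
;   #include <unistd.h>
;   int main(void) { write(1, "Hello ASM World\n", 16); return 0; }

section .data
    msg     db "Hello ASM World", 0x0a  ; the string plus a newline
    msg_len equ $ - msg                 ; length computed by the assembler (16)

section .text
global _start
_start:
    mov eax, 4          ; sys_write
    mov ebx, 1          ; file descriptor 1 = stdout
    mov ecx, msg        ; pointer to the bytes to write
    mov edx, msg_len    ; number of bytes
    int 0x80

    mov eax, 1          ; sys_exit
    xor ebx, ebx        ; status 0
    int 0x80
```

Each C argument maps to a register: for `int 0x80` on 32-bit Linux, the syscall number goes in eax and the arguments go in ebx, ecx, edx in order.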
where does shellcode
come in?
what is shellcode?
• instructions injected into a running
process
• lacks some of the luxuries of writing a
stand-alone program
– no laying out nice memory segments in a .bss
or .data section
– basically, just one big .text section
a first stab at shellcode…
• this is going to look mostly familiar,
except for how data is handled.
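The talk's shellcode listing is not in this transcript, so here is a stand-in sketch. It assumes 32-bit Linux and NASM, and uses a harmless payload that prints "hi!" and exits. With no .data section to lean on, the string is pushed onto the stack and addressed through esp. The comments show the assembled bytes, since those bytes are what would actually get injected.

```nasm
section .text
global _start
_start:
    push 0x0a216968     ; 68 68 69 21 0A   "hi!\n" now sits on the stack
    mov ecx, esp        ; 89 E1            ecx points at the string
    mov eax, 4          ; B8 04 00 00 00   sys_write
    mov ebx, 1          ; BB 01 00 00 00   stdout
    mov edx, 4          ; BA 04 00 00 00   length
    int 0x80            ; CD 80

    mov eax, 1          ; B8 01 00 00 00   sys_exit
    mov ebx, 0          ; BB 00 00 00 00   status 0
    int 0x80            ; CD 80
```

Built as a normal program (`nasm -f elf32`, then `ld -m elf_i386`), this runs fine. `objdump -d` on the result shows the same byte columns.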
why did it fail?
• bad characters
– shellcode is often passed to an application as
a string.
– if a character makes a string act funny, you
may not want it in your shellcode
• 0x00, 0x0a, 0x0d, etc.
– use an encoder, or do it yourself
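One do-it-yourself approach, sketched here with an illustrative print-and-exit payload that is not the talk's own code (assuming 32-bit Linux and NASM). An instruction like `mov eax, 1` assembles to B8 01 00 00 00, which is full of 0x00 bytes. Zeroing the register with `xor` and then setting only the low byte does the same job without them. The string also drops its newline, since 0x0a is on the bad-character list.

```nasm
section .text
global _start
_start:
    xor eax, eax        ; 31 C0            eax = 0
    xor ebx, ebx        ; 31 DB            ebx = 0
    xor edx, edx        ; 31 D2            edx = 0
    push 0x21216968     ; 68 68 69 21 21   "hi!!" on the stack
    mov ecx, esp        ; 89 E1            ecx points at the string
    mov al, 4           ; B0 04            sys_write
    mov bl, 1           ; B3 01            stdout
    mov dl, 4           ; B2 04            length
    int 0x80            ; CD 80

    xor eax, eax        ; 31 C0            clear write's return value
    inc eax             ; 40               eax = 1 = sys_exit
    xor ebx, ebx        ; 31 DB            status 0
    int 0x80            ; CD 80
```

None of the bytes in the comments is 0x00, 0x0a, or 0x0d. Which characters count as "bad" depends on how the target application handles its input, so check the list for each case.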
try that shellcode again…
where can i learn more about
assembly language?
suggested resources
• dead trees
– “Hacking: The Art of Exploitation” by Jon
Erickson
– “Practical Malware Analysis” by Michael
Sikorski and Andrew Honig
– “Gray Hat Python” by Justin Seitz
suggested resources
• the series of tubes
– http://ref.x86asm.net – quick and dirty opcode
reference
– http://www.nasm.us/doc – Netwide Assembler
documentation
• system calls
– Linux:
• /usr/include/asm/unistd.h
• man 2 $syscall
– Windows:
• http://msdn.microsoft.com/library/windows/desktop/hh920508%28vs.85%29 – Windows API reference
how to find me
• Twitter: @rogueclown
• email: [email protected]
• IRC: #derbycon, #misec, or #burbsec on
Freenode
• or, just wave me down at the con