LLVM - Loda`s blog

Introduce LLVM from a hacker's view.
Loda Chou.
For HITCON 2012
Who am I?
▪ I am Loda.
▪ I work for 豬屎屋 (DeSign House).
▪ Familiar with the MS Windows system and the Android/Linux kernel.
▪ Sometimes I also do some software cracking.
▪ I like to dig into new technology and share technical articles with the public.
▪ Motto
➢ The way of a fool seems right to him, but a wise man listens to advice. (Proverbs 12:15)
What is LLVM?
▪ Created by Vikram Adve and Chris Lattner in 2000.
▪ Supports different front-end compilers (gcc/clang/...) and different languages (C/C++, Objective-C, Fortran, Java bytecode, Python, ActionScript) that generate BitCode.
▪ The core of LLVM is the intermediate representation (IR). Front-ends compile source code to SSA-based IR, and the IR is then translated into native code for each target platform.
▪ Provides RISC-like instructions (load/store, etc.), unlimited registers, exception handling (setjmp/longjmp), etc.
▪ Provides the LLVM interpreter and the LLVM compiler to run LLVM applications.
Let's enjoy it.
Android Dalvik RunTime
[Diagram layers: Dalvik ByteCode AP in dex/odex | Dalvik ByteCode Framework in JAR | Dalvik Virtual Machine | Java Native Interface | Partial Dalvik AP implemented in native .so | Native .so library | Linux Kernel]
The features of Dalvik VM
➢ One VM per process.
➢ The JDK compiles Java to Sun's bytecode; Android uses dx to convert the Java bytecode to Dalvik bytecode.
➢ Supports a Portable Interpreter (in C), a Fast Interpreter (in assembly) and a Just-In-Time compiler.
➢ The Just-In-Time compiler is Trace-Run based.
▪ Uses counters to find the hot zones.
▪ Translates Dalvik bytecode to ARMv32/NEON/Thumb/Thumb2/etc. CPU instructions.
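The counter idea above can be sketched in a few lines of C. This is only an illustration of counter-based hot-zone detection, not Dalvik's real code: the names `note_trace_head`, `HOT_THRESHOLD` and `COUNTER_SLOTS`, the threshold value and the table size are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not Dalvik's real settings. */
#define HOT_THRESHOLD 40u
#define COUNTER_SLOTS 256u /* must be a power of two */

static uint8_t hot_counters[COUNTER_SLOTS];

/* Count one visit to a possible trace head. Returns 1 on the visit that
 * reaches HOT_THRESHOLD (the point where a JIT would compile the trace),
 * otherwise 0. Different addresses may share a slot in this small table. */
static int note_trace_head(uint32_t bytecode_pc)
{
    uint32_t slot = (bytecode_pc >> 1) & (COUNTER_SLOTS - 1u);

    assert(HOT_THRESHOLD <= UINT8_MAX);
    if (++hot_counters[slot] < HOT_THRESHOLD)
        return 0;
    hot_counters[slot] = 0; /* start counting again after reporting */
    return 1;
}
```

An interpreter loop would call `note_trace_head(pc)` at each branch target and hand the address to the JIT when it returns 1. The counting itself is the run-time cost that the appendix compares against LLVM.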
LLVM Interpreter RunTime
[Diagram layers: LLVM BitCode AP | Running by LLI (Low Level Virtual Machine Interpreter & Dynamic Compiler) | Native .so library | Linux Kernel]
Why LLVM?
▪ An LLVM application can run with the performance of a native application.
▪ It can generate small BitCode, translate it to the target platform's assembly code and then compile that into a native executable (the final size is almost the same as compiling directly from source with GCC or another compiler).
▪ Lets C/C++/... programs execute seamlessly on various hardware platforms.
➢ x86, ARM, MIPS, PowerPC, Sparc, XCore, Alpha, etc.
▪ Google is applying it to Android and to the browser (Native Client).
The LLVM Compiler Work-Flows.
[Diagram: C/C++ (clang -emit-llvm), Java, Fortran, ...etc -> BitCode -> LLVM Compiler, then one branch per target:
llc -mcpu=x86-64 -> X86 Assembly -> gcc -> X86 Execution File
llc -mcpu=cortex-a9 -> ARM Assembly -> arm-none-linux-gnueabi-gcc -mcpu=cortex-a9 -> ARM Execution File]
LLVM in Mobile Device
[Diagram: C/C++ -> BitCode; Java -> ByteCode -> BitCode; RenderScript -> BitCode; BitCode -> Application]
LLVM in Browser
[Diagram: Chromium Browser (HTML/JavaScript) <-> SRPC / NPAPI over IMC <-> Native Client APP (UnTrust Part; it must pass the security checking before execution) -> calls to the run-time framework: Service Framework over SRPC / IMC, with Storage and Service (Trust Part)]
IMC : Inter-Module Communications
SRPC : Simple RPC
NPAPI : Netscape Plugin Application Programming Interface
LLVM Compiler Demo.
Use clang to compile the source into a BitCode file.
[root@localhost reference_code]# clang -O2 -emit-llvm sample.c -c -o sample.bc
[root@localhost reference_code]# ls -l sample.bc
-rw-r--r--. 1 root root 1956 May 12 10:28 sample.bc
Convert the BitCode file to x86-64 platform assembly code.
[root@localhost reference_code]# llc -O2 -mcpu=x86-64 sample.bc -o sample.s
Compile the assembly code to an x86-64 native execution file.
[root@localhost reference_code]# gcc sample.s -o sample -ldl
[root@localhost reference_code]# ls -l sample
-rwxr-xr-x. 1 root root 8247 May 12 10:36 sample
Convert the BitCode file to ARM Cortex-A9 platform assembly code.
[root@localhost reference_code]# llc -O2 -march=arm -mcpu=cortex-a9 sample.bc -o sample.s
Compile the assembly code to an ARM Cortex-A9 native execution file.
[root@localhost reference_code]# arm-none-linux-gnueabi-gcc -mcpu=cortex-a9 sample.s -ldl -o sample
[root@localhost reference_code]# ls -l sample
-rwxr-xr-x. 1 root root 6877 May 12 10:54 sample
What are the problems of LLVM?
Let's look at a simple sample code.
LLVM dlopen/dlsym Sample.
[root@www LLVM]# clang -O2 -emit-llvm dlopen.c -c -o dlopen.bc
[root@www LLVM]# lli dlopen.bc
libraryHandle:86f5e4c8h
puts function pointer:85e81330h
loda
#include <dlfcn.h>
#include <stdio.h>
int (*puts_fp)(const char *);
int main()
{
    void * libraryHandle;
    /* load libc at run time and look up puts() by name */
    libraryHandle = dlopen("libc.so.6", RTLD_NOW);
    printf("libraryHandle:%xh\n",(unsigned int)libraryHandle);
    puts_fp = dlsym(libraryHandle, "puts");
    printf("puts function pointer:%xh\n",(unsigned int)puts_fp);
    /* call the native function through the pointer */
    puts_fp("loda");
    return 0;
}
Make execution code as data buffer
▪ Place a piece of machine code in a data buffer to verify the native and LLVM run-time behaviors.
0000000000000000 <AsmFunc>:
   0: 55                push %rbp
   1: 48 89 e5          mov %rsp,%rbp
   4: b8 04 00 00 00    mov $0x4,%eax
   9: bb 01 00 00 00    mov $0x1,%ebx
   e: b9 00 00 00 00    mov $0x0,%ecx
                        f: R_X86_64_32 gpHello
  13: ba 10 00 00 00    mov $0x10,%edx
  18: cd 80             int $0x80
  1a: b8 11 00 00 00    mov $0x11,%eax
  1f: c9                leaveq
  20: c3                retq
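For readers who do not read the registers at a glance: on Linux, `int $0x80` uses the 32-bit system call numbering, where number 4 is `write`. AsmFunc therefore writes 0x10 bytes from gpHello to file descriptor 1 and returns 0x11. A rough C equivalent, as a sketch (the function name `asm_func_equivalent` is made up for this example), would be:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* C-level equivalent of AsmFunc: write 0x10 bytes to stdout, return 0x11. */
static int asm_func_equivalent(const char *msg)
{
    assert(msg != NULL);
    /* eax=4 (write), ebx=1 (stdout), ecx=msg, edx=0x10 (length) */
    ssize_t written = write(1, msg, 0x10);
    (void)written; /* AsmFunc also ignores the system call result */
    return 0x11;   /* mov $0x11,%eax : the value main() later prints as 17 */
}
```

The message passed in must be at least 16 bytes long, like gpHello ("Hello Loda!ok!\n" plus the terminating NUL).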
Native Program Run Code in Data Segment
[root@www LLVM]# gcc self-modify.c -o self-modify
[root@www LLVM]# ./self-modify
Segmentation fault
#include <stdio.h>
int (*f2)();
char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x40,
0x0c,0x60,0x00,0xba,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3};
char gpHello[]="Hello Loda!ok!\n";
int main()
{
    int vRet;
    unsigned long vpHello=(unsigned long)gpHello;
    /* patch the operand of "mov $imm32,%ecx" (bytes 16..19) with the address of gpHello */
    TmpAsmCode[19]=vpHello>>24 & 0xff;
    TmpAsmCode[18]=vpHello>>16 & 0xff;
    TmpAsmCode[17]=vpHello>>8 & 0xff;
    TmpAsmCode[16]=vpHello & 0xff;
    /* call into the data segment; the page is not executable, so this faults */
    f2=(int (*)())TmpAsmCode;
    vRet=f2();
    printf("vRet=:%d\n",vRet);
    return 0;
}
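The four assignments to TmpAsmCode[16..19] store the low 32 bits of the string address in little-endian order, as the operand of the `mov $imm32,%ecx` instruction (opcode 0xb9 at index 15). A small helper makes that step explicit; `patch_imm32_le` and `fits_in_imm32` are names invented for this sketch and are not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low 32 bits of `value` at code[offset..offset+3], little-endian. */
static void patch_imm32_le(unsigned char *code, size_t offset, uint32_t value)
{
    assert(code != NULL);
    for (size_t i = 0; i < 4; i++)
        code[offset + i] = (unsigned char)((value >> (8 * i)) & 0xffu);
}

/* Returns 1 when an address survives truncation to a 32-bit immediate. */
static int fits_in_imm32(uint64_t address)
{
    return address <= UINT32_MAX;
}
```

Patching offset 16 with 0x00600c40 reproduces the bytes 0x40,0x0c,0x60,0x00 that already sit in the array as a placeholder. An address that does not fit in 32 bits would be silently truncated by the original four assignments, so a check such as `fits_in_imm32` is worth doing first.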
Native Program Run Code in Data Segment with Page EXEC-settings
[root@www LLVM]# gcc self-modify.c -o self-modify
[root@www LLVM]# ./self-modify
Hello Loda!ok!
vRet=:17
#include <stdio.h>
#include <sys/mman.h>
int (*f2)();
char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x40,
0x0c,0x60,0x00,0xba,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3};
char gpHello[]="Hello Loda!ok!\n";
int main()
{
    int vRet;
    unsigned long vpHello=(unsigned long)gpHello;
    /* make the 4KB page that holds TmpAsmCode readable, writable and executable */
    unsigned long page = (unsigned long) TmpAsmCode & ~( 4096 - 1 );
    if(mprotect((char*) page,4096,PROT_READ | PROT_WRITE | PROT_EXEC ))
        perror( "mprotect failed" );
    /* patch the operand of "mov $imm32,%ecx" (bytes 16..19) with the address of gpHello */
    TmpAsmCode[19]=vpHello>>24 & 0xff;
    TmpAsmCode[18]=vpHello>>16 & 0xff;
    TmpAsmCode[17]=vpHello>>8 & 0xff;
    TmpAsmCode[16]=vpHello & 0xff;
    f2=(int (*)())TmpAsmCode;
    vRet=f2();
    printf("vRet=:%d\n",vRet);
    return 0;
}
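The sample above hard-codes a 4096-byte page and protects exactly one page. That is enough when the 34-byte buffer sits inside a single page, but a buffer that crosses a page boundary needs both pages. The following sketch computes the page base and the length to pass to `mprotect`; the helper names `page_base` and `protect_length` are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address down to the start of its page (page_size is a power of two). */
static uintptr_t page_base(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Number of bytes, in whole pages, needed to cover [addr, addr + len). */
static size_t protect_length(uintptr_t addr, size_t len, uintptr_t page_size)
{
    assert(len > 0);
    uintptr_t first = page_base(addr, page_size);
    uintptr_t last = page_base(addr + len - 1, page_size);
    return (size_t)(last - first + page_size);
}
```

With these, the call could look like `mprotect((void *)page_base(a, ps), protect_length(a, sizeof TmpAsmCode, ps), PROT_READ | PROT_WRITE | PROT_EXEC)`, where `a` is the buffer address and `ps` comes from `sysconf(_SC_PAGESIZE)` instead of the constant 4096.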
LLVM AP Run Code in Data Segment with EXEC-settings
[root@www LLVM]# clang -O2 -emit-llvm llvm-self-modify.c -c -o llvm-self-modify.bc
[root@www LLVM]# lli llvm-self-modify.bc
Hello Loda!ok!
vRet=:17
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
int (*f2)();
char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x40,
0x0c,0x60,0x00,0xba,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3};
char gpHello[]="Hello Loda!ok!\n";
int main()
{
    int vRet;
    unsigned long vpHello=(unsigned long)gpHello;
    /* make the 4KB page that holds TmpAsmCode readable, writable and executable */
    unsigned long page = (unsigned long) TmpAsmCode & ~( 4096 - 1 );
    if(mprotect((char*) page,4096,PROT_READ | PROT_WRITE | PROT_EXEC ))
        perror( "mprotect failed" );
    /* copy the string to a heap buffer and use that address instead */
    char *base_string=malloc(256);
    strcpy(base_string,gpHello);
    vpHello=(unsigned long)base_string;
    /* patch the operand of "mov $imm32,%ecx" (bytes 16..19) */
    TmpAsmCode[19]=vpHello>>24 & 0xff;
    TmpAsmCode[18]=vpHello>>16 & 0xff;
    TmpAsmCode[17]=vpHello>>8 & 0xff;
    TmpAsmCode[16]=vpHello & 0xff;
    f2=(int (*)())TmpAsmCode;
    vRet=f2();
    printf("vRet=:%d\n",vRet);
    return 0;
}
LLVM AP Run Code in Data Segment without EXEC-settings?
[root@www LLVM]# clang -O2 -emit-llvm llvm-self-modify.c -c -o llvm-self-modify.bc
[root@www LLVM]# lli llvm-self-modify.bc
Hello Loda!ok!
vRet=:17
It still works!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int (*f2)();
char TmpAsmCode[]={0x90,0x55,0x48,0x89,0xe5,0xb8,0x04,0x00,0x00,0x00,0xbb,0x01,0x00,0x00,0x00,0xb9,0x40,
0x0c,0x60,0x00,0xba,0x10,0x00,0x00,0x00,0xcd,0x80,0xb8,0x11,0x00,0x00,0x00,0xc9,0xc3};
char gpHello[]="Hello Loda!ok!\n";
int main()
{
    int vRet;
    unsigned long vpHello=(unsigned long)gpHello;
    /* no mprotect() call this time */
    char *base_string=malloc(256);
    strcpy(base_string,gpHello);
    vpHello=(unsigned long)base_string;
    /* patch the operand of "mov $imm32,%ecx" (bytes 16..19) */
    TmpAsmCode[19]=vpHello>>24 & 0xff;
    TmpAsmCode[18]=vpHello>>16 & 0xff;
    TmpAsmCode[17]=vpHello>>8 & 0xff;
    TmpAsmCode[16]=vpHello & 0xff;
    f2=(int (*)())TmpAsmCode;
    vRet=f2();
    printf("vRet=:%d\n",vRet);
    return 0;
}
So... what did we get?
▪ LLVM can run the data segment as execution code.
▪ LLVM doesn't provide a strict sandbox to prevent unexpected program flows.
▪ For installed applications, maybe that is OK (they can be protected by the Android kernel-level Application Sandbox).
▪ How about LLVM running in a web browser?
[Diagram: LLVM BitCode AP, running by LLI (Low Level Virtual Machine Interpreter & Dynamic Compiler), with bidirectional function calls between Code and Data]
Technology always comes from humanity!!!
Native Client (NaCl) - a vision of the future
▪ Lets the browser run web applications in native code.
▪ Based on Google's sandbox, it loses only about 5% performance compared to the original native application.
▪ Already available in the Chrome browser.
▪ The Native Client SDK only supports C/C++ on x86 32/64-bit platforms.
▪ Provides Pepper APIs (derived from Mozilla NPAPI). Pepper v2 added more APIs.
Hack Google's Native Client and get $8,192
http://www.zdnet.com/blog/google/hackgoogles-native-client-and-get-8192/1295
Security of Native Client
▪ Data integrity
➢ Native Client's sandbox works by validating the untrusted code (the compiled Native Client module) before running it.
▪ No support for process creation / subprocesses
➢ You can call pthread.
▪ No support for raw TCP/UDP sockets (WebSockets for TCP and peer connect for UDP)
▪ No unsafe instructions
➢ Inline assembly must be compatible with the Native Client validator (the ncval utility can be used to check).
http://code.google.com/p/nativeclient/issues/list
How does Native Client work?
Browsing a web page with Native Client:
1. The server provides the Native Client page:
lib64/libc.so.3c8d1f2e
lib64/libdl.so.3c8d1f2e
lib64/libgcc_s.so.1
lib64/libpthread.so.3c8d1f2e
lib64/runnable-ld.so
hello_loda.html
hello_loda.nmf
hello_loda_x86_32.nexe
hello_loda_x86_64.nexe
2. The Chromium browser downloads the main process and the dynamic run-time libraries to C:\Users\loda\AppData\Local\Temp:
6934.Tmp (=libc.so.3c8d1f2e)
6922.Tmp (=libdl.so.3c8d1f2e)
6933.tmp (=libgcc_s.so.1)
6912.tmp (=libpthread.so.3c8d1f2e)
67D8.tmp (=runnable-ld.so)
66AE.tmp (=hello_loda.nmf)
6901.Tmp (=hello_loda_x86_64.nexe)
3. The Chromium browser launches nacl64.exe to execute the NaCl executable (*.NEXE) file.
Dynamic libraries inheritance relationship
[Diagram, top to bottom:
Hello Loda Process (.NEXE)
libpthread.so.3c8d1f2e
libgcc_s.so.1
libdl.so.3c8d1f2e
libc.so.3c8d1f2e
runnable-ld.so (=ld-nacl-x86-64.so.1)]
Portable Native Client (PNaCl)
▪ PNaCl (pronounced "pinnacle")
▪ Based on LLVM to provide an ISA-neutral format for compiled NaCl modules, supporting a wide variety of target platforms without recompilation from source.
▪ Supports the x86-32, x86-64 and ARM instruction sets now.
▪ Still keeps the security and performance properties of Native Client.
LLVM and PNaCl
Refer to Google's 'PNaCl Portable Native Client Executables' document.
PNaCl Shared Libraries
[Diagram:
libtest.c -> Libtest.pso -> pnacl-translate (translate to native code) -> Libtest.so
app.c -> App.bc -> App.pexe -> pnacl-translate (translate to native code) -> App.nexe
Libtest.so and App.nexe execute under the Native Client RunTime Environment]
http://www.chromium.org/nativeclient/pnacl/pnacl-shared-libraries-final-picture
Before SFI
▪ Trust with Authentication
➢ With ActiveX technology in Microsoft Windows, the browser (MS Internet Explorer) downloads the native web application plug-in, and the user must authorize the application to run in the browser.
▪ User-ID based Access Control
➢ The Android Application Sandbox uses Linux user-based protection to identify and isolate application resources. Each Android application runs as its own user in a separate process; applications cannot interact with each other and have limited access to the operating system.
General User/Kernel Space Protection
[Diagram: Applications #1, #2 and #3 run in user space, each in its own process memory space (the figure marks UnTrust Code and Trust Code inside an application). Applications talk to each other through user-space RPC, and use kernel-provided services through system calls. Kernel space holds the device drivers and kernel modules.]
Fault Isolation
▪ CFI (CISC Fault Isolation)
➢ Uses the x86 code/data segment registers to reduce the overhead; NaCl CFI adds around 2% overhead.
▪ SFI
➢ NaCl SFI adds about 5% overhead on the Cortex-A9 out-of-order ARM processor, and about 7% overhead on the x86_64 processor.
1. ARM instruction length is fixed at 32 bits or 16 bits (depending on the ARMv32, Thumb or Thumb2 ISA).
2. x86 instruction length is variable, from 1 to 15 bytes.
CISC Fault Isolation
Data/Code Dedicated Register = (Target Address & And-Mask Register) | Segment Identifier Dedicated Register
[Diagram: address sandboxing maps every address into the running memory space; an x86_64 Native Client runs under a specific 32-bit, 4GB memory space.]
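The address sandboxing formula on this slide is a mask followed by an OR. A minimal C sketch, assuming 32-bit addresses and a hypothetical function name `sandbox_address`, could look like this (the mask and segment values used with it are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* (Target Address & And-Mask) | Segment Identifier
 * The mask keeps the offset bits; the segment identifier supplies the rest,
 * so the result always lands inside the segment. */
static uint32_t sandbox_address(uint32_t target, uint32_t and_mask,
                                uint32_t segment_id)
{
    /* the two operands must not overlap, or the offset would be corrupted */
    assert((and_mask & segment_id) == 0);
    return (target & and_mask) | segment_id;
}
```

For example, with a 16MB segment (mask 0x00ffffff) whose identifier is 0x12000000, the wild pointer 0xdeadbeef becomes 0x12adbeef: still a wrong address from the program's point of view, but one that cannot leave the segment.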
Software Fault Isolation
[Diagram: running in the Software Fault Isolation model. Inside one process memory space in user space, SFI separates the Trust Code from the UnTrust Code, and control passes between them through a Call Gate and a Return Gate. Applications use kernel-provided services through system calls. Kernel space holds the device drivers and kernel modules.]
NaCl SFI SandBox
▪ NaCl downloads the whole execution environment (with dynamic libraries).
▪ The x86_64 environment is used as the verification sample here.
▪ Each x86_64 app uses a 4GB memory space.
▪ An ARM app only uses the 0-1GB memory space.
▪ The x86_64 R15 register is defined as the "Designated Register RZP" (Reserved Zero-address base Pointer) and is initialized to a 4GB-aligned base address that maps the untrusted memory space. For the untrusted code, R15 is read-only.
RSP/RBP Register Operation
▪ Every modification of the 64-bit RSP/RBP is replaced by a set of instructions that keeps RSP/RBP inside the allowed 32-bit range.
testB(0x1234);
    mov $0x1234,%edi
    callq 400624 <testB>
int testB(int I)
{
    int vRet;
    void *libraryHandle;
    …
}
//by x86_64 GCC
0000000000400624 <testB>:
  400624: 55             push %rbp
  400625: 48 89 e5       mov %rsp,%rbp
  400628: 48 83 ec 20    sub $0x20,%rsp
  40062c: 89 7d ec       mov %edi,-0x14(%rbp)
//by x86_64 pnacl-clang (function in 32-byte alignment)
00000000010003e0 <testB>:
  10003e0: 55            push %rbp
  10003e1: 48 89 e5      mov %rsp,%rbp
  10003e4: 83 ec 10      sub $0x10,%esp            // clear RSP high 32 bits
  10003e7: 4a 8d 24 3c   lea (%rsp,%r15,1),%rsp    // use R15 to limit RSP in specific 4GB space
  10003eb: 89 7d fc      mov %edi,-0x4(%rbp)
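The two-instruction stack update in the pnacl-clang listing can be modeled in C. This is a sketch of the arithmetic only; the function name `sandbox_rsp_sub` and the base address used in the example are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Model of:  sub $amount,%esp ; lea (%rsp,%r15,1),%rsp
 * A 32-bit operation on %esp zeroes the upper 32 bits of %rsp, and the
 * lea then adds the 4GB-aligned sandbox base held in R15. */
static uint64_t sandbox_rsp_sub(uint64_t rsp, uint32_t amount, uint64_t r15)
{
    assert((r15 & 0xffffffffu) == 0); /* base must be 4GB aligned */
    uint32_t esp = (uint32_t)rsp - amount;
    return r15 + esp;
}
```

Whatever value RSP held before, the result is always `r15 + (some 32-bit offset)`, so the stack pointer stays within 4GB of the base even when the subtraction wraps around.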
Function Call (1/3)
▪ The function target address is aligned to 32 bytes and limited to the allowed 32-bit range by R15.
char gpHello[]="Hello Loda!\n";
int (*f1)(const char *);
……
f1 = dlsym(libraryHandle, "puts");
f1(gpHello);
//by x86_64 GCC
  40067c: e8 77 fe ff ff     callq 4004f8 <dlsym@plt>           //rax = function puts address
  40068f: bf d0 0b 60 00     mov $0x600bd0,%edi
  400694: ff d0              callq *%rax
//by x86_64 pnacl-clang
  100045b: e8 20 fd ff ff    callq 1000180 <dlsym@plt+0x120>    //rax = function puts address
  1000466: bf 00 02 01 11    mov $0x11010200,%edi
  …….
  1000478: 83 e0 e0          and $0xffffffe0,%eax               // clear RAX high 32 bits
  100047b: 4c 01 f8          add %r15,%rax                      // use R15 to limit RAX in specific 4GB space
  100047e: ff d0             callq *%rax
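The `and`/`add` pair in front of the indirect call can be modeled the same way. The sketch below only reproduces the arithmetic of those two instructions; `sandbox_branch_target` is a name made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Model of:  and $0xffffffe0,%eax ; add %r15,%rax
 * The 32-bit `and` drops the upper 32 bits and the low 5 bits, so the
 * target is a 32-byte aligned offset; the `add` moves it into the sandbox. */
static uint64_t sandbox_branch_target(uint64_t rax, uint64_t r15)
{
    assert((r15 & 0xffffffffu) == 0); /* base must be 4GB aligned */
    uint32_t eax = (uint32_t)rax & 0xffffffe0u;
    return r15 + eax;
}
```

A function pointer that points into the middle of a 32-byte bundle is rounded down to the start of that bundle, which is why pnacl-clang also places function entry points on 32-byte boundaries.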
Function Call (2/3)
▪ Calling a function through the ELF PLT (Procedure Linkage Table)
➢ The function target address is aligned to 32 bytes and limited to the allowed 32-bit range by R15.
int (*f1)(const char *);
……
f1 = dlsym(libraryHandle, "puts");
//by x86_64 GCC
callq 4004f8 <dlsym@plt>
  4004f8: ff 25 ba 06 20 00        jmpq *0x2006ba(%rip)         # 600bb8 <_GLOBAL_OFFSET_TABLE_+0x38>
  4004fe: 68 04 00 00 00           pushq $0x4
  400503: e9 a0 ff ff ff           jmpq 4004a8 <_init+0x18>
//by x86_64 pnacl-clang
callq 1000180 <dlsym@plt+0x120>
  1000080: 4c 8b 1d 49 01 01 10    mov 0x10010149(%rip),%r11    # 110101d0 <_GLOBAL_OFFSET_TABLE_+0x20>
  1000087: 45 89 db                mov %r11d,%r11d
  100008a: 41 83 e3 e0             and $0xffffffe0,%r11d        // R11 in 32-byte alignment
  100008e: 4d 01 fb                add %r15,%r11                // use R15 to limit R11 in specific 4GB space
  1000091: 41 ff e3                jmpq *%r11
Function Call (3/3)
▪ A direct call to an internal untrusted function does not need to be filtered through R15.
vRet=0x1234*testA(0x888);
//by x86_64 GCC
  400696: bf 88 08 00 00        mov $0x888,%edi
  40069b: e8 54 ff ff ff        callq 4005f4 <testA>      // directly call to UnTrust function
  4006a0: 69 c0 34 12 00 00     imul $0x1234,%eax,%eax
//by x86_64 pnacl-clang
  1000480: bf 88 08 00 00       mov $0x888,%edi
  …………………..
  100049b: e8 e0 fe ff ff       callq 1000380 <testA>     // directly call to UnTrust function
  10004a0: 69 c0 34 12 00 00    imul $0x1234,%eax,%eax
Function Return
▪ The function return address is aligned to 32 bytes and limited to the allowed 32-bit range by R15.
int testB(int I)
{
    int vRet;
    ……………
    vRet=0x1234*testA(0x888);
    return vRet;
}
//by x86_64 GCC
  40069b: e8 54 ff ff ff        callq 4005f4 <testA>
  4006a0: 69 c0 34 12 00 00     imul $0x1234,%eax,%eax
  4006a6: 89 45 f4              mov %eax,-0xc(%rbp)
  4006a9: 8b 45 f4              mov -0xc(%rbp),%eax
  4006ac: c9                    leaveq
  4006ad: c3                    retq
//by x86_64 pnacl-clang
  100049b: e8 e0 fe ff ff       callq 1000380 <testA>
  10004a0: 69 c0 34 12 00 00    imul $0x1234,%eax,%eax
  10004a6: 89 45 f8             mov %eax,-0x8(%rbp)
  10004a9: 83 c4 10             add $0x10,%esp               // clear RSP high 32 bits
  10004ac: 4a 8d 24 3c          lea (%rsp,%r15,1),%rsp       // use R15 to limit RSP in specific 4GB space
  10004b0: 8b 2c 24             mov (%rsp),%ebp
  10004b3: 4a 8d 6c 3d 00       lea 0x0(%rbp,%r15,1),%rbp
  10004b8: 83 c4 08             add $0x8,%esp
  10004bb: 4a 8d 24 3c          lea (%rsp,%r15,1),%rsp
  10004bf: 59                   pop %rcx
  10004c0: 83 e1 e0             and $0xffffffe0,%ecx         // RCX in 32-byte alignment
  10004c3: 4c 01 f9             add %r15,%rcx                // use R15 to limit RCX (return address) in specific 4GB space
  10004c6: ff e1                jmpq *%rcx
Read/Write Memory
▪ Read/write memory addresses are limited to the 4GB (32-bit) range by R15.
int testA(int I)
{
    int vRet;
    char *p1=(char *)0x10000000;
    char *p2=(char *)0x20000000;
    char *p3;
    char p4;
    p3=(char *)((int)p1*2+p2);
    p3[0x11]=0x66; //Write
    p4=p3[0x22]; //Read
    ….
}
0000000001000380 <testA>:
  ……
  1000387: c7 45 f4 00 00 00 10    movl $0x10000000,-0xc(%rbp)
  100038e: c7 45 f0 00 00 00 20    movl $0x20000000,-0x10(%rbp)
  ………………..
//Write
  10003a3: 89 c0                   mov %eax,%eax                         // clear RAX high 32 bits
  10003a5: 41 c6 84 47 11 00 00    movb $0x66,0x20000011(%r15,%rax,2)    // use R15 to limit RAX in specific 4GB space
  ………………
//Read
  10003b1: 89 c0                   mov %eax,%eax                         // clear RAX high 32 bits
  10003b3: 41 8a 44 07 22          mov 0x22(%r15,%rax,1),%al             // use R15 to limit RAX in specific 4GB space
  10003b8: 88 45 eb                mov %al,-0x15(%rbp)
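The addressing mode used for the sandboxed write and read, `disp(%r15,%rax,scale)` after `mov %eax,%eax`, can be written out as plain arithmetic. This sketch only computes the effective address; the function name is hypothetical, and the sketch says nothing about how the runtime treats a sum that lands slightly above base + 4GB because of the scale and displacement.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of  disp(%r15,%rax,scale)  once  mov %eax,%eax
 * has cleared the upper 32 bits of RAX. */
static uint64_t sandboxed_effective_address(uint64_t r15, uint64_t rax,
                                            unsigned scale, int32_t disp)
{
    assert((r15 & 0xffffffffu) == 0); /* base must be 4GB aligned */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    uint64_t index = (uint32_t)rax;   /* mov %eax,%eax */
    return r15 + index * scale + (uint64_t)(int64_t)disp;
}
```

With RAX holding p1 (0x10000000), the write in the listing resolves to base + 0x40000011, which matches `p3[0x11]` for `p3 = p1*2 + p2 = 0x40000000` in the C source.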
NaCl Software Fault Isolation in ARM
[Diagram: NaCl ARMv32 instruction Software Fault Isolation memory. Trust Code Region: 3-1GB. UnTrust Code Region: 0-1GB.]
SP Register Operation in ARMv32
▪ Every modification of SP is replaced by a set of instructions that keeps the 32-bit SP below the 1GB address.
testB(0x1234);
  250: e3010234    movw r0, #4660 ; 0x1234
  ..............
  25c: ebfffffe    bl 160 <testB>
int testB(int I)
{
    int vRet;
    void *libraryHandle;
    …
}
//by armv7 pnacl-clang (function in 16-byte alignment)
00000160 <testB>:
  160: e92d4800    push {fp, lr}
  164: e24dd010    sub sp, sp, #16
  168: e3cdd103    bic sp, sp, #-1073741824 ; 0xc0000000    // limit the SP address below 1GB address
Function Call in ARMv32
▪ The function target address is aligned to 16 bytes and limited below the 1GB address.
char gpHello[]="Hello Loda!\n";
int (*f1)(const char *);
……
f1 = dlsym(libraryHandle, "puts");
f1(gpHello);
//by armv7 pnacl-clang
  1bc: ebfffffe    bl <dlsym>
  ……
  1e8: e3c1113f    bic r1, r1, #-1073741809 ; 0xc000000f    // r1 in 16-byte alignment, limited below 1GB address
  1ec: e12fff31    blx r1
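The two `bic` masks seen in the ARM listings (0xc0000000 for SP and data addresses, 0xc000000f for branch targets) can be modeled as simple bit-clear operations. The macro and function names below are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define ARM_DATA_MASK 0xc0000000u /* bic rX, rX, #0xc0000000 */
#define ARM_CODE_MASK 0xc000000fu /* bic rX, rX, #0xc000000f */

/* Clear the top two bits: the address ends up below 1GB. */
static uint32_t arm_sandbox_data(uint32_t addr)
{
    uint32_t masked = addr & ~ARM_DATA_MASK;
    assert(masked < 0x40000000u);
    return masked;
}

/* Clear the top two bits and the low four bits: the branch target ends up
 * below 1GB and on a 16-byte boundary. */
static uint32_t arm_sandbox_code(uint32_t addr)
{
    uint32_t masked = addr & ~ARM_CODE_MASK;
    assert(masked < 0x40000000u && (masked & 0xfu) == 0);
    return masked;
}
```

Unlike the x86_64 case there is no base register to add: the untrusted region starts at address 0, so clearing bits is enough.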
Function Return in ARMv32
▪ "pop {pc}" is not allowed directly.
▪ The function return address is aligned to 16 bytes, and the target address in LR is limited below the 1GB address.
int testB(int I)
{
    int vRet;
    ……………
    vRet=0x1234*testA(0x888);
    return vRet;
}
//by armv7 pnacl-clang
  214: e3cdd103    bic sp, sp, #-1073741824 ; 0xc0000000    // limit the SP address below 1GB address
  218: e8bd4800    pop {fp, lr}
  21c: e320f000    nop {0}
  220: e3cee13f    bic lr, lr, #-1073741809 ; 0xc000000f    // LR in 16-byte alignment, limited below 1GB address
  224: e12fff1e    bx lr
Read/Write Memory in ARMv32
▪ Read/write memory addresses are limited below the 1GB address (through r0 in this example).
int testA(int I)
{
    int vRet;
    char *p1=(char *)0x10000000;
    char *p2=(char *)0x20000000;
    char *p3;
    char p4;
    p3=(char *)((int)p1*2+p2);
    p3[0x11]=0x66; //Write
    p4=p3[0x22]; //Read
    ….
}
00000110 <testA>:
  110: e24dd018    sub sp, sp, #24
  114: e3cdd103    bic sp, sp, #-1073741824 ; 0xc0000000
  118: e3a01201    mov r1, #268435456 ; 0x10000000
  ............
  120: e58d100c    str r1, [sp, #12]
  124: e3a01202    mov r1, #536870912 ; 0x20000000
  128: e59d000c    ldr r0, [sp, #12]
  12c: e3a0207b    mov r2, #123 ; 0x7b
  130: e58d1008    str r1, [sp, #8]
  ...............
//Write
  148: e3a01066    mov r1, #102 ; 0x66
  14c: e320f000    nop {0}
  150: e3c00103    bic r0, r0, #-1073741824 ; 0xc0000000    // limit the r0 address below 1GB address
  154: e5c01000    strb r1, [r0]
  .............
//Read
  164: e3c00103    bic r0, r0, #-1073741824 ; 0xc0000000    // limit the r0 address below 1GB address
  168: e5d00022    ldrb r0, [r0, #34] ; 0x22
  16c: e5cd0003    strb r0, [sp, #3]
  ...............
For Hacker's View
Inner sandbox: binary validation
[Diagram of int func(…..) { … }: the stack is limited to the specific 4GB memory space; untrusted function calls are limited to the specific 4GB space by R15; the return to the caller function is limited to 32-byte alignment and the specific 4GB memory space.]
Outer sandbox: OS system-call interception
Conclusion
▪ LLVM supports an IR and can run on various processor platforms.
▪ Portable Native Client with LLVM should be a good candidate for browser usage, and LLVM should also play a good role on the Android platform.
▪ It is a new security protection model: it uses a user-space sandbox to run native code and validates the native instructions without involving kernel-level privilege.
▪ Break the Sandbox!!!
Appendix
The differences between Dalvik and LLVM (1/2)
▪ Compiled execution code
➢ LLVM translates to 100% native code. The Dalvik VM has to rely on the JIT Trace-Run counter.
▪ Reuse of JIT native code
➢ After the Dalvik VM process restarts, the JIT Trace-Run procedures have to be performed again. Once an LLVM application has been translated to 100% native code, it always runs as a native application.
▪ CPU run-time loading
➢ A Dalvik application has to maintain the Trace-Run counter at run time and perform JIT. An LLVM-based native application saves this extra CPU load.
The differences between Dalvik and LLVM (2/2)
▪ Run-time memory footprint
➢ A Dalvik application converted to JIT native code needs extra memory for the JIT cache. If the user compiles C code to BitCode with Clang and then uses the LLVM compiler to compile the BitCode to native assembly, it saves run-time memory.
➢ If a Dalvik application moves the load to a JNI native .so library, the developer has the extra work of providing a .so for each target processor's instruction set.
▪ Storage usage
➢ A general Dalvik application needs the original APK with a .dex file plus an extra .odex file in dalvik-cache. An LLVM application doesn't need them.
▪ System security point of view
➢ LLVM supports pointers, function pointers and inline assembly, so it has more potential security concerns than Java.
Native Client Page Content
hello_loda.html:
<html>
<body ...>
....
<div id="listener">
.....
<embed name="nacl_module"
id="hello_loda"
width=200 height=200
src="hello_loda.nmf"
type="application/x-nacl" />
</div>
</body>
</html>
hello_loda.nmf:
{
"files": {
"libgcc_s.so.1": {
"x86-64": {
"url": "lib64/libgcc_s.so.1"
},
.....
},
"main.nexe": {
"x86-64": {
"url": "hello_loda_x86_64.nexe"
},
.....
},
"libdl.so.3c8d1f2e": {
.....
},
"libc.so.3c8d1f2e": {
.....
},
"libpthread.so.3c8d1f2e": {
.....
},
"program": {
"x86-64": {
"url": "lib64/runnable-ld.so"
},
.....
}
}
End