Presentation - Computing and Information Sciences

Transcript Presentation - Computing and Information Sciences

Bytecode Verification on Java
Smart Cards
Xavier Leroy
Presentation(day 1)
-Nithya
Overview






JavaCard Architecture
Sandbox model
Sun’s Bytecode verification
Our Verification Algorithm
Performance Analysis of both
Differences
Overview of JVM

Stack based abstract machine
- Instructions pop their arguments off the stack
- Push their results on the stack


Set of registers (local variables) can be
accessed via “load” and “store”
- push the value of a register on the stack
- store the top of the stack in the register
Stack + register -> activation record for a method
Type-correctness eg. of JVM

Iadd :
- Stack initially contains at least 2 elements of type
int and it pushes back result of int



Getfield C.f. τ :
- Top of the stack contains a reference to an instance
of class C
- it then pops it and pushes back a value of type
(the value of the field f)
τ
Conditions for JVM’s proper
operation





Type Correctness
No stack overflow or underflow
Code containment
Register initialization
Object initialization
Smart Cards




What are Smart Cards?
Security tokens - applications
Traditional Smart Cards
- C or Assembler
Challenged by Open Architectures
- Multos, Java Card
Java Card Architecture



Applications written in Java and so are
portable across all Java cards
Java Cards can run multiple applications
which can communicate thru shared
objects
New applications called applets can be
downloaded on the card post issuance
Security Issues (by malicious
applet)



Leaking confidential information outside
(eg: PINS and secret cryptographic
keys)
Modifying sensitive information (eg:
balance of an electonic purse)
Interfering with other honest
application on the card (Trojan Attack)
Solution




Put forward by Java Programming
Environment
Execute the applets in a so called
“Sandbox”
Its an insulation layer preventing direct
access to the hardware resources
Also implements a suitable access
control policy
Sandbox Model-3 components



Applets --> Bytecode
VM manipulates secure abstractions of
data than hardware processor
No direct access to hardware
resources but to a set of API classes
Upon Downloading, applet subject to
static analysis called Bytecode
Verification
Bytecode Verification- Purpose


1.
2.
3.
4.
To check ->code is well typed
Doesn’t bypass protections 1 & 2 by
performing ill typed operations at runtime:
Forging object references from integers
Illegal casting of obj ref from one class to
another
Calling private methods of API
Jumping in middle of API method or
jumping to data as if it were code
Java Card Arch & Sandbox



Component 1: Applets executed by JVM
Component 2: Java Card runtime
environment provides access control
through its “firewall”
Component 3: Bytecode Verifier
missing!
-
complex and expensive process
requires large amounts of working memory
Approach 1




Relying on off-card tools
Attesting a cryptographic signature on
well-typedness of the applet
Oncard downloading restricted to
signed applets
Drawbacks:
- to extend the trusted computing base to off card
components
- practical issues: (how to deploy the signature keys?)
- legal issues: (who takes liability for a buggy applet?)
Defensive VM Approach






To type-check dynamically during applet
execution
VM computes bytecode instructions
Keeps track of all data it manipulates
Arguments- correct types?stack overflow or
underflow?
Are class member accesses allowed?
Drawback:
- Dynamic Type-checks expensive in terms of execution speed
and memory
Traditional Bytecode Verifn

For Straight line code?
-Verifier checks every instruction of the method code
- Checks whether stack has enough entries
- Checks whether the entries are of correct types
- Effect of the instruction on Operand stack and registers
Intialisation:
- Stack ->empty
- Reg (0…n-1) -> method parameters
- Reg(n…m-1)-> T (uninitialised reg)

Method invocations ?
Traditional Bytecode Verifn



For Branches- Forks and Joins
Forks?
- must propagate inferred stack & reg type to all successors
Joins?
- makes sure the types of the stack and registers along all
paths agree
Handling joins
Dictionary





A data structure associating a stack and register type
to each program point that is the target of a branch
or exception handler
During analysis, Updates ->type associated with
target of branch
Replacing by -> LUB(prev type,type inferred for the
instruction)
If changed -> the corr.instrn & successors are
reanalysed until a fixpoint is reached
This way, the dictionary entry for a branch target
contains LUB inferred on all branches of control that
lead to this instrn.
Errors
1.
Stack heights may differ
An instruction can be reached through several paths
with inconsistent operand stacks
1.
Types for a particular stack entry or
register may be incompatible
eg: branch 1: short
branch 2 : Obj Ref
Type set to T in dictionary
Semi-Lattice
Interfaces and LUB


Property: Every pair of types possesses
a smallest common supertype (LUB)
Does not hold ->
Interfaces are types and classes can
implement several interfaces
Example
interface I { ... }
interface J { ... }
class C1 implements I, J { ... }
class C2 implements I, J { ... }
Subtyping relation
Approach
Solution to Interface problem





Bytecode verification ignores interfaces
Treats all interface types as class type
“Object”
Verifier contains proper classes and subtyping
between them is simply inheritance relation
Subtyping relation is tree shaped
->forms a semilattice.
LUB of 2 classes ->closest common ancestor
in inheritance tree
Performance Analysis


Straight line Code:
Time: as fast as executing instruction
Space: 1 stack and 1 set of reg types
(if each type-3 bytes,
then memory=3S+3N bytes)
Similar for method invocation:
Space for method invocation = space
for verification
Performance Analysis




1.
2.
3.
4.
Branches-
costly since instructions are analyzed several
times to reach fixpoint
Real issue: Memory space for storing dictionary
If B ->no of branch targets, memory =(3S+3N+3)*B bytes [3
bytes of overhead for each Dic. Entry]
Drawbacks:
No. of branch targets proportional to method size ->
dictionary size increases linearly
Dictionary entries quite often redundant
Merging several methods into a larger one result in nonverifiable code
Storing the dictionary in EEPROM not an option.
Questions?
Part II – Novel Bytecode
Verification Algorithm
Background for Our
Verification Algorithm

Operand stack is empty at the
beginning and end of every statement
JCVM for e1 ? e2 : e3 :

code to evaluate e1
ifeq lbl1
code to evaluate e2
goto lbl2
lbl1: code to evaluate e3
Lbl2: …
(Branch to lbl2 occurs with a non-empty operand stack)
Background…



In Java source, a local variable has only one
type throughout the method: τ
In JCVM bytecode, register with type T
acquires the type τ at 1st store in reg.
For Java code,
A x;
if (cond)
x = new B(); // B is a subclass of A
else
x = new C(); // C is another subclass of A
Register x has type A (ie,LUB(B,C)) when two arms merge
Background
{ short x; ... }
{ C y; ... }
 Scopes of x & y disjoint ->store x & y in
same register
Requirements



R1: Operand stack is empty at all
branch and branch target instructions
R2: Each register has only one type
throughout the method code
R3: On method entry,the VM initialises
all registers that are not parameters to
the bit pattern representing null object
reference
The Algorithm





Global variables:
Nr,Ns,r[Nr],s[Ns],sp,chg
Initialisation:
Set sp <- 0
Set r[0],…, r[n - 1] to the types of the method arguments

Set r[n],…, r[Nr - 1] to ┴

Set chg <- true
Algorithm
While chg:
Set chg -> false
For each instruction i of the method, in code order:
If i is the target of a branch instruction:
If sp != 0 and the previous instruction falls through, error
Set sp -> 0
If i is the target of a JSR instruction:
If the previous instruction falls through, error
Set s[0] -> retaddr and sp -> 1
If i is a handler for exceptions of class C:
If the previous instruction falls through, error
Set s[0] -> C and sp -> 1
If two or more of the cases above apply, error
Algorithm
Determine the types a1,…, an of the arguments of i
If sp < n, error (stack underflow)
For k = 1,…,n: If s[sp - n + k - 1] is not subtype of ak, error
Set sp -> sp - n
Determine the types r1,…, rm of the results of i
If sp + m > Ns, error (stack overflow)
For k = 1,…,m: Set s[sp + k - 1] -> rk
Set
sp -> sp + m
If i is a store to register number k:
Determine the type t of the value written to the register
Set r[k] -> lub(T, r[k])
If r[k] changed, set chg -> true
If i is a branch instruction and sp ! = 0, error
Differences with straight line
code verification
Stack empty?
 When checking Store,reg type ->LUB
 Typechecking until a fixpoint
 Initialisation:
Stack -> Empty
reg(0..n-1) -> method parameter types
reg(n..m-1)-> ┴ & not T (REQ R3)

Next class




Subroutines
Object Initialisation
Off card code transformations
Bytecode Transformation on small
computers

Presentation - Computing and Information Sciences

Transcript Presentation - Computing and Information Sciences

Directory