Beyond Java - Tolerant Systems

UC Irvine – project transprose: transporting programs securely
New Approaches to Mobile Code: Reconciling Execution Efficiency with Provable Security
Michael Franz
University of California at Irvine
Technical Objective (1)

• design the third line of defense in a mobile-code system
– first line of defense: access control (physical, logical)
– second line of defense: authentication
– new third line of defense: prevent execution unless provably secure
[Figure: a malicious mobile program arriving through intrusion or false authentication is stopped by the third line of defense]
Technical Objective (2)

• make this “third line of defense” a pervasive property
of every computer system, not just a luxury good
afforded by only a few expensive ultra-secure high-end installations
• rather than simply demonstrating the viability of
mobile-code security, also make it practical across a
wide spectrum of applications
• in this context, practical means scalable to large
applications, with excellent final code quality, at
reasonable just-in-time compilation speed and cost
Existing Practice: Java
• “Java” is the de facto standard format for distributing
mobile programs
• when we speak of “distributing mobile programs using
Java”, we in fact usually mean “using the Java Virtual
Machine” (JVM)
• the JVM has an instruction set that has been designed
specifically for representing Java programs
– interestingly enough, there still are JVM programs for which
no legal equivalent Java source program exists
Existing Practice: Java Security

• although the Java programming environment is type-safe,
programs compiled from Java into JVM-code
must be re-checked upon arrival because they may
have been corrupted in transit

class MyClient {
    import MyLibrary;
    {...MyLibrary.NoSecret();…}
}

JVM-code stream:
    call MyLibrary.NoSecret
    ...

class MyLibrary {
    public void NoSecret();
    private void ASecret();
}
Existing Practice: Java Security

• the same client and library, but with the JVM-code stream
corrupted in transit: the call now targets the private
method ASecret instead of the public method NoSecret

corrupted JVM-code stream:
    call MyLibrary.ASecret
    ...
Existing Practice: Java Security

• Java’s byte-code security model requires time-consuming
static verification and/or dynamic checking
while the code is being executed

    IF
    THEN ...
    ELSE MyLibrary.ASecret()

• systematic study of security issues is still in its infancy
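
The re-check on arrival described on the Java Security slides can be illustrated with a small sketch. It is not the JVM verifier; it only models one rule (a private method may be called solely from its own class), and the class name AccessCheckSketch and the method callIsLegal are hypothetical names introduced for illustration. MyLibrary and MyClient mirror the classes on the slide.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class AccessCheckSketch {
    // Stand-ins for the slide's classes; bodies are empty because only visibility matters here.
    static final class MyLibrary {
        public void NoSecret() {}
        private void ASecret() {}
    }

    static final class MyClient {}

    // Returns true only if `caller` may invoke the no-argument method `name` declared in `target`.
    // Simplification (assumption): only `private` is modelled; protected and package access are ignored.
    static boolean callIsLegal(Class<?> caller, Class<?> target, String name) {
        try {
            Method m = target.getDeclaredMethod(name);
            return caller == target || !Modifier.isPrivate(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false; // a call to a method that does not exist is rejected as well
        }
    }
}
```

Under this sketch, the call in the intact stream (MyLibrary.NoSecret) is accepted and the call in the corrupted stream (MyLibrary.ASecret) is rejected, which is the distinction the receiving side has to re-establish for every incoming program.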
Existing Practice: JVM Performance
• upon arrival at a target machine, most JVM code is
translated into the appropriate native code “just-in-time”
• performance resulting from “just-in-time” compilation
is not competitive with off-line compilers
– compilation systems such as Sun’s HotSpot are incredibly
complex and haven’t delivered on their promise
• JVM approach is unlikely to scale to large programs
requiring top-level performance
Raising JVM Performance
• raising the performance of JVM-code has been
addressed by “annotating” the byte-code stream with
compiler back-end related information
• “annotated” class-files run much faster if an
annotation-aware byte-code compiler is available on
the target platform
• security is lost: the “annotations” are not optional to
the annotation-aware compiler; if an adversary
falsifies them, the compiler will create a program that
may be unsafe!
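
A minimal sketch of why a falsified annotation is dangerous, under stated assumptions: the annotation is reduced to a single boolean meaning "the bounds check on this load is redundant", memory is simulated by one int array, and the names AnnotationTrustSketch and compileLoad are hypothetical. No real annotation format or compiler is being reproduced.

```java
import java.util.function.IntUnaryOperator;

final class AnnotationTrustSketch {
    // Simulated flat memory: cells 0..3 are the mobile program's array, cell 4 holds host data.
    static final int[] MEMORY = {10, 20, 30, 40, 99};
    static final int ARRAY_LENGTH = 4;

    // "Compiles" an array load. If the (hypothetical) annotation claims the bounds check
    // is redundant, a trusting compiler emits the unchecked variant.
    static IntUnaryOperator compileLoad(boolean annotationSaysCheckRedundant) {
        if (annotationSaysCheckRedundant) {
            return index -> MEMORY[index]; // no check against ARRAY_LENGTH
        }
        return index -> {
            if (index < 0 || index >= ARRAY_LENGTH) {
                throw new IndexOutOfBoundsException("index " + index + " outside the array");
            }
            return MEMORY[index];
        };
    }
}
```

With an honest annotation both variants behave identically on indices 0 to 3. With a falsified one, compileLoad(true).applyAsInt(4) reads the host's cell, whereas the checked variant throws.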

Emerging Practice: PCC
• proof-carrying code (PCC): ship a native program along
with a “proof” that it doesn’t violate a given security policy
• although more general security policies are
imaginable, current PCC systems essentially use type
safety (and concomitant memory safety) as their
security policy (our approach does the same)
• PCC drastically reduces the size of the trusted
computing base

Emerging Practice: PCC - Problems

• PCC is based on native code
– (otherwise the trusted computing base would become larger
again, defeating the main advantage of PCC)
• PCC has the performance advantages of fully
optimized code, but requires multiple versions for
multiple platforms
• also, in the long run, dynamically generated code
(using feedback from dynamic profiling) will generally
outperform native code
Our Technical Approach

• study the interaction of security-related information,
optimization-enhancing information, and compression,
rather than considering them separately
– use syntax-directed compression as a means of obtaining
guaranteed referential integrity
– transport compiler-related annotations to obtain top-level
performance on the eventual target machine
– use a proof-based approach to guard the compiler-related
annotations from falsification in transit
Our Technical Approach (2)
• no single focus on security, code quality, or encoding
density, but attempt to study their interaction and make
progress along all three dimensions
• preliminary evidence suggests that these three topics
are strongly interrelated and that representations based
on adaptive compression of syntax trees are ideally
suited for transporting mobile programs
• this research is orthogonal and complementary to work
on authentication and security policies

Our Policy Assumptions

• type safety using the typing model of the source
language
– all of the host’s library routines are guaranteed to be called
with parameters of the correct types
– capabilities (object pointers) owned by the host can be
manipulated by the mobile client application only as
specified in the host’s interface definition
(private, protected, …) and cannot be forged
• type safety is guaranteed by our mobile code
transportation scheme
Compression vs. Security
• code compression and security may often be
complementary
• idea: choose an encoding that can express only legal
programs
• example:

    int i, j, k, l;
    float r, s, t, u;
    {i=j}

    :=   operator
    i    first operand (1 out of 8)
    j    second operand (1 out of 3 or 4!)

• higher-level encodings: enumerate all legal
assignments = at most $2 \cdot \binom{n}{2}$ possibilities
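
The operand counts in the example can be reproduced with a short sketch. It assumes, as a simplification, that an assignment is legal only when both sides have exactly the same type (Java itself would also permit widening from int to float), and it uses a fixed-width index rather than the adaptive compression the project proposes. The class TypedOperandEncoding and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TypedOperandEncoding {
    // Symbol table from the slide: four int and four float variables, in declaration order.
    static final Map<String, String> SYMBOLS = new LinkedHashMap<>();
    static {
        for (String v : new String[] {"i", "j", "k", "l"}) SYMBOLS.put(v, "int");
        for (String v : new String[] {"r", "s", "t", "u"}) SYMBOLS.put(v, "float");
    }

    // Variables of the required type; null means "any variable".
    static List<String> candidates(String requiredType) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : SYMBOLS.entrySet()) {
            if (requiredType == null || e.getValue().equals(requiredType)) result.add(e.getKey());
        }
        return result;
    }

    // Type of a declared variable; unknown names are rejected.
    static String typeOf(String variable) {
        String type = SYMBOLS.get(variable);
        if (type == null) throw new IllegalArgumentException("undeclared variable " + variable);
        return type;
    }

    // Bits needed to distinguish `choices` (>= 1) alternatives with a fixed-width code.
    static int bitsFor(int choices) {
        return 32 - Integer.numberOfLeadingZeros(choices - 1);
    }

    // Encodes the right-hand side of `target = source` as an index among variables of target's type.
    static int encodeSource(String target, String source) {
        int index = candidates(typeOf(target)).indexOf(source);
        if (index < 0) throw new IllegalArgumentException(target + " = " + source + " is not expressible");
        return index;
    }

    // Decoding can only ever yield a variable of the target's type.
    static String decodeSource(String target, int index) {
        return candidates(typeOf(target)).get(index);
    }
}
```

The first operand is one of 8 variables (3 bits); once it is known to be the int variable i, the second operand is one of 4 (2 bits), or one of 3 if the self-assignment i = i is excluded. Because the decoder interprets the index relative to the target's type, no bit pattern can denote the ill-typed assignment i = r, so tampering with the stream yields a different legal program rather than an illegal one.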
Virtual Machines vs. Graphs
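• information is lost when compiling to the “flat”
representation of virtual machines
• many native code optimizations require this
information to be re-discovered
[Figure: the same program fragment shown in a graph-based representation and in a virtual machine representation, the latter a flat instruction sequence containing a relative branch “bra +2”]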
Performance-Enhancing Information
• compiler-related information intended for improving
code quality re-introduces redundancy that can be
exploited by an adversary
• for example, a program can be encoded with
guaranteed referential integrity using a grammar close
to the semantics of the source language
• but in order to allow optimizations, the grammar needs
to be relaxed
• the “holes” in the relaxed grammar need to be guarded
by other means based on proof-carrying code concepts

General Approach Taken
• use encoding-inherent security wherever possible
(a well-formedness property of the encoding itself)
• use proof-based security where necessary to support
optimizations
– transporting results of alias analysis
– removing range or type checks
• this approach applies regardless of the semantic level
on which the program is being transported
• but the correct choice of such a semantic level must
also be considered!
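
The "removing range checks" case can be sketched as follows. This is an illustration under stated assumptions, not the project's mechanism: the shipped proof is replaced by a claim that the consumer can re-validate directly (claimHolds), and the class GuardedRangeCheckSketch, its counter, and its method names are hypothetical. The point shown is only that the optimization is applied when the claim checks out and silently dropped when it does not.

```java
final class GuardedRangeCheckSketch {
    // Counts explicit range checks executed, so the effect of the optimization is observable.
    static int checksExecuted = 0;

    // Consumer-side validation of a (hypothetical) annotation claiming that every index
    // in [start, end) lies within the array. This stands in for checking a shipped proof.
    static boolean claimHolds(int start, int end, int arrayLength) {
        return 0 <= start && start <= end && end <= arrayLength;
    }

    // Sums memory[start..end). `memory` may be larger than the logical array of `arrayLength` cells.
    static long sum(int[] memory, int arrayLength, int start, int end, boolean annotated) {
        // The per-iteration check is removed only if the annotation is present AND validated.
        boolean skipChecks = annotated && claimHolds(start, end, arrayLength);
        long total = 0;
        for (int i = start; i < end; i++) {
            if (!skipChecks) {
                checksExecuted++;
                if (i < 0 || i >= arrayLength) throw new IndexOutOfBoundsException("index " + i);
            }
            total += memory[i];
        }
        return total;
    }
}
```

A truthful annotation removes all per-iteration checks; a falsified one fails validation, so the loop falls back to the checked path and the out-of-range access is caught.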

Highest-Level Encoding
• simple and easily understood security policy based on
type-safety
• ultra-compact representation using grammar-based
compression
• guaranteed referential integrity provided essentially
“for free” by the encoding
– relatively small amount of proof-based security required only
for additional performance-enhancing annotations
– e.g., exceptions, alias analysis, escape analysis,
dynamic type safety
• time required for dynamic compilation may be a problem
Project Workflow “High Level Thread”
1. guarantee complete static semantics through encoding
2. compression of Java programs
3. reduced verification effort due to abstract grammar encoding
[Diagram labels: Compression (P2K, JAG, arithmetic encoding, dictionary encoding); Encoding (Java abstract grammar); Proofs (theorem prover, well-formedness, combination heuristics); other labels: (enhanced) static semantics, annotated JAG, efficient annotated JAG]
Lowest-Level Encoding
• compiler-oriented intermediate representation
• goal is to provide much better code quality with far
less effort at the code consumer’s site
• requires more proof-based security than the “high-level”
approach, but still far less than the “original
PCC idea” where the goal is to reduce the trusted computing base (TCB)
• more voluminous transportation format
• could be more difficult to reason about safety because
further removed from the source language

Project Workflow “Low Level Thread”
1. universal (source-language neutral) abstract
syntax tree representation
2. UAST after performing all
target-machine independent optimizations
3. encoding for the proofs
required to guard the TASSA
4. provably secure target-machine independent low-level
representation
[Diagram labels: Compression (SSA-directed encoding, annotation encoding); Encoding (typed SSA); Proofs (theorem prover); other labels: annotated typed SSA, secure annotated typed SSA]
Third Way: Core Calculus

• two-stage mapping of the mobile code
– source constructs are mapped to the core calculus
– mapping may be transported as well,
or assumed global shared knowledge
• simple and easily understood security policy
• only approach that is easily extensible even by third
parties
• not clear if this approach will yield adequate native code
quality at the consumer’s site
• the relative trade-offs are as yet unknown

Current Status and Rationale
• developed a comprehensive library of stream
compressors in Java
• “high-level” encoding prototype is up and running
– working on a contribution to PLDI 2001 on Java
compression
• “low-level” encoding and “core calculus” prototypes
will be operational over the summer
• the relative trade-offs (encoding density vs.
decoding/dynamic compilation speed vs. code quality)
can only be determined by collecting experience with
actual prototypes

Quantitative Metrics

• security
– publish complete design specification and rationale and open
the design to public scrutiny and external validation
• efficiency
– measure by comparing generated code quality with that of
existing on-the-fly compilers
• code density
– measure by comparing with competing proof-carrying code
and mobile-code distribution formats
Expected Major Achievements
• demonstrate that graph-based encoding formats are
superior to virtual machines
• explore the relative trade-offs that can only be
determined by building an actual prototype
– encoding density/network transfer speed vs.
– decoding/dynamic compilation speed vs.
– code quality, especially when using the core-calculus
approach
• publish a design rationale that can form the basis of a
subsequent standardization effort
Long-Term Impact
• enable an educated choice of a replacement technology
at the end of the Java Virtual Machine’s life-cycle
• royalty-free and free of particular proprietary
intellectual property claims
• developed under the scrutiny of and in dialogue with
the security community

Task Schedule
Timeline: 1999 to 2002
Y1 Milestones:
• source-level representation => Java compression
• low-level representation
• core calculus representation
investigate:
• multiple source languages
• graph-based encoding schemes
• proof-carrying code
Y2 Milestones:
• 3 system prototypes
• trade-off analysis
• encoding format: comprehensive definition
investigate:
• requirements of optimizing code generators
• integration of security vs. compiler-related data
End of Project:
• system deliverable
• comprehensive documentation
investigate:
• mutual interaction of security, efficiency, and compression density
• security of system
Transition of Technology
• the final design rationale document will provide
enough detail that unrelated third parties will be able
to replicate our code-transportation scheme(s)
• our prototype implementation(s) will be made
available in source form
• the graduate students involved in this work are likely
to transfer into the industrial sector

Thank You