Coherence - Department of Computer Science
Download
Report
Transcript Coherence - Department of Computer Science
Types, Proofs, and
Safe Mobile Code
The unusual effectiveness of logic
in programming language research
Peter Lee
Carnegie Mellon University
January 22, 2001
NSF/CISE Workshop on The Unusual Effectiveness of Logic in Computer Science
Carnegie
Mellon
Logic in PL research
Logic is the foundation of modern PL research.
Carnegie
Mellon
This talk
A somewhat personal account, by
necessity a rather narrow slice.
But similar stories can be found
in almost all areas of PL research.
Logic as Language
Carnegie
Mellon
Logic and languages
To PL researchers, logics and
languages are often interchangeable.
A vivid example of this is in formal
proofs.
Consider:
Write “x is a proof of P” as x:P.
Carnegie
Mellon
Formal proofs
We can write formal proofs by
application of inference rules.
An example:
If we have a proof x of P and a proof
y of Q, then x and y together
constitute a proof of P Q.
Or, more compactly:
Given x:P, y:Q then (x,y):P*Q.
Carnegie
Mellon
Formal proofs
Another familiar example:
Assume we have a proof x of P. If
we can then obtain a proof b of Q,
then we have a proof of P Q.
Given [x:P] b:Q then
fn (x:P) => b : P Q.
More:
Given x:P*Q then fst(x):P
Given y:P*Q then snd(y):Q
Carnegie
Mellon
Proofs and types
So, for example:
fn (x:P*Q) => (snd(x), fst(x))
: P*Q Q*P
We can develop full metalanguages
based on this principle of proofs as
programs, propositions as types.
Typechecking gives us proofchecking!
Codified in languages such as ML.
Carnegie
Mellon
Applications
This isomorphism has had many
applications in logic and in CS.
Proof development systems.
NuPrl, Coq, LCF, …
Advanced programming languages.
Prolog.
Logical framework languages.
Edinburgh Logical Framework.
Carnegie
Mellon
Logical frameworks
The Edinburgh Logical Framework
(LF) is a particularly useful
language for specifying logics.
Carnegie
Mellon
LF example
Fragment of first-order logic, Pfenning’s Elf syntax.
exp : type
pred : type
pf
: pred -> type
true
/\
=>
all
:
:
:
:
truei
andi
andel
impi
alli
alle
pred
pred -> pred -> pred
pred -> pred -> pred
(exp -> pred) -> pred
:
:
:
:
:
:
pf true
{P:pred} {R:pred} pf P -> pf R -> pf (/\ P R)
{P:pred} {R:pred} pf (/\ P R) -> pf P
{P:pred} {R:pred} (pf P -> pf R) -> pf (=> P R)
{P:exp -> pred} ({X:exp} pf (P X)) -> pf (all P)
{P:exp -> pred} {E:exp} pf (all P) -> pf (P E)
Carnegie
Mellon
LF example
The representation of P P P
for some predicate P:
(=> P (/\ P P))
The proof of this predicate has
the following Elf representation:
(impi P (/\ P P)
([X:pf P] andi P P x x))
Language as Logic
Carnegie
Mellon
Languages and logic
To PL researchers, languages and
logics are often interchangeable.
A vivid example of this is in type
theory.
Carnegie
Mellon
Type theory
A standard application of type
theory involves the following:
Operational (run-time) semantics is
defined by an inference system.
Type system is also defined by an
inference system.
Logic is used to prove the soundness
of the type system wrt the semantics.
A programming language is a logic.
Carnegie
Mellon
Soundness
Soundness:
Well-typed programs are guaranteed
to stay within the boundaries defined
by the operational semantics.
Well-typed programs won’t go wrong.
Carnegie
Mellon
Practical benefits
Soundness can be hard to prove.
But it essentially converts the very
difficult negative property (program
won’t go wrong) into a positive
property (program is well-typed).
Only need to prove soundness once.
Carnegie
Mellon
Applications
Current research often involves
defining the logical core of a
language and then studying its
properties.
Existing languages.
ML, Haskell, JVML, …
New design and implementation
features.
Type-directed compiling, region
inference, linear typing, …
Proofs, Types, and
Safe Mobile Code
Carnegie
Mellon
The code safety problem
Please install
and execute
this.
Full cartoon
Carnegie
Mellon
Code Safety
Code
Is this safe to execute?
CPU
Trusted Host
Carnegie
Mellon
Formal verification
Code
Theorem
Prover
CPU
Trusted Host
Flexible and
powerful.
But really really
really hard and
must be correct.
Carnegie
Mellon
A Key Idea: Explicit Proofs
Code
Proof
Checker
CPU
Trusted Host
Proof
Certifying
Prover
Carnegie
Mellon
Proof-Carrying Code
[Necula & Lee, OSDI’96]
Typically native
Formal proof safety in LF
A
or VM code
B
rlrrllrrllrlrlrllrlrrllrrll…
Carnegie
Mellon
Proof-Carrying Code
Reasonable in size (0-10%).
Code
Proof
Checker
CPU
Proof
Simple,
small (<52KB),
and fast.
Certifying
Prover
No longer need to
trust this component.
Carnegie
Mellon
The Role of
Languages and Logic
Civilized programming languages can
provide “safety for free”.
Well-formed/well-typed safe.
Idea: Arrange for the compiler to
“explain” why the target code it
generates preserves the safety
properties of the source program.
Automation via
Certifying Compilation
Carnegie
Mellon
Object
code
Certifying
Prover
Certifying
Compiler
Source
code
Proof
Proof
Checker
CPU
Looks and smells like a compiler.
% spjc foo.java bar.class baz.c -ljdk1.2.2
Carnegie
Mellon
Safety Policies in LF
Fragment of rules for the Java type system.
jfloat : exp.
jinstof : exp -> exp.
of
: exp -> exp -> pred.
faddf
: {E:exp} {E':exp}
pf (of E jfloat) ->
pf (of E' jfloat) ->
pf (of (fadd E E') jfloat).
ext
: {E:exp} {C:exp} {D:exp}
pf (jextends C D) ->
pf (of E (jinstof C)) ->
pf (of E (jinstof D)).
Carnegie
Mellon
Program checking
A proof for
(saferd4 (add src_1 (add (imul edx_1 4) 8)))
in the Java specification looks like
this (excerpt):
(rdArray4 A0 A2 (sub0chk A3) szint
(aidxi 4 (below1 A7)))
This proof can be easily validated via
LF type checking.
Themes and Conclusions
Carnegie
Mellon
Coherence
Main Entry: co·her·ence
Pronunciation: kO-'hir-&n(t)s, -'herFunction: noun
Date: 1580
1 : the quality or state of cohering: as
a : systematic or logical connection or
consistency b : integration of diverse
elements, relationships, or values
2 : the property of being coherent
Research in programming languages
is largely directed towards achieving
coherence in software systems.
Carnegie
Mellon
Coherence
Coherence requires:
ability to analyze/verify
components, and
communicate descriptions of
components.
Logic is canonical, in the sense of
being the only foundation for this.
Carnegie
Mellon
Esthetics vs. pragmatics
Many of the methods and results are
motivated as much by esthetic as
they are by pragmatic concerns.
Practical engineering consequences:
Minimality and clarity of expression.
Scaling up by study of “core” logics.
Can sometimes be divorced from realworld problems, but hard to predict.
Carnegie
Mellon
Practicality
In recent years, a trend towards
picking the low-hanging fruit.
Eliminate “simple errors”.
Exposed by the Web, plug-ins,
embedded systems, etc.
Even machine languages!
Small theorems about big programs.
Carnegie
Mellon
Conclusions
For much of the history of CS, PL
research meant design of
languages and compiler
technology.
Today, PL technology and concepts
advance logic and are applied
directly to software artifacts.
“LF in the Unix kernel.”
Carnegie
Mellon
A logical approach
Please install
and execute
this.
Code producer
OK, but let me
quickly look over the
instructions first.
Host
Carnegie
Mellon
Code producer
A logical approach
Host
Carnegie
Mellon
A logical approach
This store
instruction is
dangerous!
Code producer
Host
Carnegie
Mellon
A logical approach
Can you prove
that it is
always safe?
Code producer
Host
Carnegie
Mellon
Yes! Here’s the proof I
got from my certifying
Java compiler!
Code producer
A logical approach
Can you prove
that it is
always safe?
Host
Carnegie
Mellon
A logical approach
Code producer
Your proof checks
out. I believe you
because I believe
in logic.
Host