Transcript Document

Typed Assembly Languages
COS 441, Fall 2004
Frances Spalding
Based on slides from Dave Walker and Greg Morrisett
Type Safety

Progress: if a state is well-typed then it is not stuck
If  ` e: then either e is a value or
9 e’ such that e !* e’

Preservation: each step in evaluation preserves
typing
If  ` e: and e !* e’ then  ` e’:
12/10/2004
COS441 - Typed Assembly Languages
2
High Level vs. Low Level
Type Safe Source Code
Very large,
complex compiler
that’s almost
certainly full of all
sorts of insidious
bugs
Assembly Code
12/10/2004

If the source language has
been proved to be type
safe, what do we know
about the safety of the code
we actually run?

Nothing!

What if the compiler
produced a typed assembly
language?
COS441 - Typed Assembly Languages
3
Proof Carrying Code
Source Program

Compilation
&
Certification
Native
Code
Safety
Proof
Code
Producer

Code
Consumer
Proof Validation

CPU
12/10/2004
The code producer
provides a proof of
safety along with the
executable
The code consumer
checks that proof
before executing the
code
A typing derivation of
the low-level code can
be a safety proof
COS441 - Typed Assembly Languages
4
High-level code for factorial
12/10/2004
COS441 - Typed Assembly Languages
5
Assembly code for factorial
12/10/2004
COS441 - Typed Assembly Languages
6
TAL0 Syntax








Registers: r1, r2, …
Labels: L, …
Integers: n
Operands: v ::= r | n | L
Arithmetic Ops: aop ::= add | sub | mul | …
Branch Ops: bop ::= beq | bgt | …
Instructions:  ::= aop rd, rs, v | bop r, v | mov
r, v
Blocks: B ::= jmp v | ; B
12/10/2004
COS441 - Typed Assembly Languages
7
TAL0 Dynamic Semantics

Machine State  = (H,R,B)




H maps labels to basic blocks
R maps registers to values (integers and labels)
B is a basic block (corresponding to the current
program counter)
Model evaluation as a transition function
mapping machine states to machine states:

12/10/2004
COS441 - Typed Assembly Languages
8
TAL0 Dynamic Semantics
12/10/2004
COS441 - Typed Assembly Languages
9
TAL0 Static Semantics





 ::= int |  ! {}
 ::= {r1:1, r2:2, …}
Code labels have type {r1:1, r2:2, …} ! {}
To jump to code with this type, r1 must
contain a value of type , etc
Notice that functions never return – the code
block always ends with a jump to another
label
12/10/2004
COS441 - Typed Assembly Languages
10
Example Program with Types
12/10/2004
COS441 - Typed Assembly Languages
11
Type Checking Basics

We need to keep track of:




The types of the registers at each point in the
code
The types of the labels on the code
Heap Types:  will map labels to label types
Register Types:  will map registers to types
12/10/2004
COS441 - Typed Assembly Languages
12
Basic Typing Judgments



;  ` n: int
;  ` r: (r)
;  ` L : (L)
12/10/2004
COS441 - Typed Assembly Languages
13
Subtyping on Register Files


Our program will never crash if the register
file contains more values than necessary to
satisfy some typing precondition
Width subtyping:
{r1:1,…,ri-1:i-1,ri:i} <: {r1:1,…,ri-1:i-1}
12/10/2004
COS441 - Typed Assembly Languages
14
Subtyping on Code Types

Like ordinary function types, code types are
contravariant
’ <: 
 ! {} <: ’ ! {}
12/10/2004
COS441 - Typed Assembly Languages
15
Typing Instructions

The typing judgment for instructions
describes the registers on input to the
function and the registers on output
 `  : 1  2

 is invariant. The types of heap objects
cannot change as the program executes
12/10/2004
COS441 - Typed Assembly Languages
16
Typing Instructions
; ` rs : int
; ` v : int
 ` add rd, rs, v :  ! [rd := int]
; ` r : int
; ` v :  ! {}
 ` ble r, v :  ! 
; ` v : 
 ` mov rd, v :  ! [rd := ]
12/10/2004
COS441 - Typed Assembly Languages
17
Basic Block Typing

All basic blocks end with jump instructions
;  ` v :  ! {}
 ` jmp v :  ! {}

Instruction Sequences
 `  : 1 ! 2  ` B : 2 ! {}
 ` ;B : 1 ! {}
12/10/2004
COS441 - Typed Assembly Languages
18
Heap and Register File Typing
Heap Typing
Dom(H)=Dom() 8 L 2 Dom(H). ` H(L):(L)
`H:


Register File Typing
8 r 2 Dom(). ; ` R(r) : (r)
`R:
12/10/2004
COS441 - Typed Assembly Languages
19
Machine Typing

Main Typing Judgment
` H :   ` R :   ` B :  ! {}
` (H,R,B)
12/10/2004
COS441 - Typed Assembly Languages
20
Type Safety for TAL0

Progress:
If ` 1 then 9 2 such that 1  2

Preservation
If ` 1 and 1  2 then ` 2
12/10/2004
COS441 - Typed Assembly Languages
21
Limitations of TAL0

What about situations where we don’t care
which specific type is in a register?


Eg. A function that swaps the contents of two
registers
We can use polymorphic types to express
this.
12/10/2004
COS441 - Typed Assembly Languages
22
TAL1 : Polymorphism

 ::= … | 
8 , . { r1:, r2:, r31: {r1:, r2:} ! {}} ! {}


Describes a function that swaps the values in r1
and r2 for values of any two types
To jump to polymorphic functions, first
explicitly instantiate type variables: v[]
12/10/2004
COS441 - Typed Assembly Languages
23
Polymorphism Example
12/10/2004
COS441 - Typed Assembly Languages
24
Callee-Saved Registers

We can use polymorphism to implement
callee-saved registers
We’d like to be able to enforce that a function
doesn’t change certain registers. In other words,
when it returns, certain registers have the same
values as when it started
8 . { r2: int, …, r5:, …., r31: {r1: int, …, r5:, …} ! {}
} ! {}
 Why does this enforce the invariant we want?

12/10/2004
COS441 - Typed Assembly Languages
25
TAL1 Semantics

Changes to the static semantics to support
polymorphism



Add the judgment  `  ok
Modify the other judgments to take this into
account (just like homework 8!)
Changes to the dynamic semantics

Add rule to do instantiation
(H, R, jmp v[])  (H, R, B[/])
12/10/2004
COS441 - Typed Assembly Languages
26
Full TAL Syntax
12/10/2004
COS441 - Typed Assembly Languages
27
Limitations of TAL

Almost every compiler uses a stack


As storage space for local variables and other
temporaries when we run out of space in registers
To keep track of control flow


12/10/2004
Control Links
Activation Links
COS441 - Typed Assembly Languages
28
STAL: Add a Stack

Machine States: (H, R, S, B)



Stack S ::= nil | v :: S
A designated register sp points to the top of the
stack
 ::= salloc n | sfree n | sld rd, n | sst rs, n


push v ´ salloc 1; sst v, 1
pop v ´ sld r, 1; sfree 1
12/10/2004
COS441 - Typed Assembly Languages
29
Factorial Example
12/10/2004
COS441 - Typed Assembly Languages
30
Extensions for STAL

Stack types  ::= nil | :: | 




nil represents the empty stack
:: represents a stack v::S where v: and S:
 is a stack variable which we can use to
describe an unknown stack tail
Use polymorphism
8 . {sp: int::, r1: int} ! {}
12/10/2004
COS441 - Typed Assembly Languages
31
Stack Instruction Typing

Stack Allocation
;  ` salloc n : [sp := ]![sp := ?::…::?::]

Stack Free
;  ` sfree : [sp := 1::…::n::]![sp := ]
12/10/2004
COS441 - Typed Assembly Languages
32
Stack Instruction Typing

Stack load
(sp) = 1 :: … :: n :: 
;  ` sld r, n :  ! [r := n]

Stack store
; ;  ` v :  (sp) = 1 :: … :: n :: 
;  ` sst v, n :  ! [sp := 1 :: … ::  :: ]
12/10/2004
COS441 - Typed Assembly Languages
33
STAL Features



Polymorphism and polymorphic recursion are
crucial for encoding standard procedure call/return
We didn’t have to bake in the notion of procedure
call/return. Basic jumps were good enough
Can combine features to encode up a variety of
calling conventions




Arguments on stack or in registers?
Results on stack or in registers?
Return address? Caller pops? Callee pops?
Caller saves? Callee saves?
12/10/2004
COS441 - Typed Assembly Languages
34
STAL Syntax
12/10/2004
COS441 - Typed Assembly Languages
35
General Limitations of TAL/STAL


More functionality can be added to TAL
There are still some limitations

Must know the order of data allocated on the
stack


sp: 1 @  @ 2 @ ’ @ 3
Must distinguish between heap and stack pointers

12/10/2004
Can’t have a function that takes in two generic int
pointers and adds their contents
COS441 - Typed Assembly Languages
36
For more information on TAL/STAL


From System F to Typed Assembly
Language. G. Morrisett, D. Walker, K. Crary,
and N. Glew.
Stack-Based Typed Assembly Language. G.
Morrisett, K. Crary, N. Glew, D. Walker.
12/10/2004
COS441 - Typed Assembly Languages
37
What Makes Stack Typing So Tricky?

In the heap, we use the abstraction that a
location never changes type



The garbage collector handles the reuse
When modeling instructions like push/pop,
we have to break this abstraction and use
strong updates to change types
Aliasing can make this unsound if we’re not
careful
12/10/2004
COS441 - Typed Assembly Languages
38
My Research


A different approach to typed assembly
language
Use logic to describe what memory looks like
before and after each instruction
(r1 ) int) (r2 ) S(ℓ)) (ℓ ) bool)
load [r2], r1
(r1 ) int) (r2 ) S(ℓ)) (ℓ ) int)
12/10/2004
COS441 - Typed Assembly Languages
39
Modeling Stack “Evolution”


Keep track of the different “versions” of each
stack location and how they are related
We can do this using a pair of a tree and a
pointer to a node in the tree
12/10/2004
COS441 - Typed Assembly Languages
40
Stack Depth
Time
ℓ0
k1
TOP
k2
k3
k5
k4
ℓ1
k6
ℓ2
ℓ3
…
12/10/2004
COS441 - Typed Assembly Languages
41
Tagged Locations

All locations are “tagged” with their version
number


(r1 ) S(k3. ℓ2)) is different from (r1 ) S(k6. ℓ2))
The tags act as “guards” on the location


Before accessing a tagged location, you must
prove that it is “live”
In other words, the tag must be on the path
between the root and Top
12/10/2004
COS441 - Typed Assembly Languages
42
For more information…
Certifying Compilation for a Language with
Stack Allocation [Draft]. L. Jia, F. Spalding, D.
Walker, N. Glew.
12/10/2004
COS441 - Typed Assembly Languages
43