Department of Computer Science

Download Report

Transcript Department of Computer Science

Alfred V. Aho
[email protected]
E6998-2: Advanced Topics in
Programming Languages and Compilers
Lecture 1 – Introduction to Course
September 6, 2011
1
Al Aho
Introduction
Professor Alfred V. Aho
http://www.cs.columbia.edu/~aho
[email protected]
Lectures: Tuesdays, 4:10-6:00pm, 337 Mudd
Office hours: Tuesdays 2:45-3:45pm,
513 Computer Science Building
Course webpage:
http://www.cs.columbia.edu/~aho/cs6998
2
Al Aho
Course Objectives
• Understanding how modern language and compiler
technology can be used to make more reliable software
• Learning the major concepts and design principles
underlying programming languages
• Understanding modern program analysis techniques and
tools
• Awareness of language and compiler issues in dealing
with parallelism and concurrency
• A highlight of this course is a semester-long project in
which you can explore an advanced topic of your own
choosing in more depth.
3
Al Aho
Potential Project Topics
•
•
•
•
•
•
•
•
•
•
•
Computational thinking in language design
Lambda calculus and functional languages
Concurrency and parallelism
Program analysis techniques
Interprocedural analysis
Pointer analysis
Binary decision diagrams
Model checking
Satisfiability modulo theory solvers
Abstract interpretation
Report on a “most influential PLDI paper”
– http://www.sigplan.org/award-pldi.htm
4
Al Aho
Recent Most Influential PLDI Papers
•
Automatic predicate abstraction of C programs
•
Dynamo: A Transparent Dynamic Optimization System
•
A Fast Fourier Transform Compiler
•
The Implementation of the Cilk-5 Multithreaded Language
•
Exploiting Hardware Performance Counters with Flow and Context Sensitive
Profiling
5
•
TIL: A Type-Directed Optimizing Compiler for ML
•
Selective Specialization for Object-Oriented Languages
•
ATOM: a system for building customized program analysis tools
•
Space Efficient Conservative Garbage Collection
•
Lazy Code Motion
•
A data locality optimizing algorithm
• Al Aho
Profile guided code positioning
Additional Project Topics
• Garbage collection
• Data-flow analysis schemas
• Instruction-level parallelism
• Optimizing for parallelism and locality
• Interprocedural analysis
• New intermediate representations
– Static single-assignment form
• New compiler development tools
– Phoenix
• Compiler collections
– GCC
– LLVM
– .NET
6
Al Aho
Prerequisites and Background Text
• Fluency in at least one major programming language such
as C, C++, C#, Java, or OCaml
• COMS W4115: Programming Languages and Translators,
or equivalent
• Text: Compilers, Techniques, and Tools
(Second Edition), Aho, Lam, Sethi, and
Ullman, Addison-Wesley, 2007.
7
Al Aho
Course Project and Grade
• Each student should select a programming language or
compiler topic to pursue in more depth.
• Students will regularly discuss their projects with the class
and, at the end of the semester, will present their project
and hand in a final project report summarizing their
findings.
• The project and classroom discussions will determine the
final grade.
8
Al Aho
The Age of Computational Thinking
Computational advertising
Computational biology
Computational chemistry
Computational legal studies
Computational linguistics
Computational physics
Computational science
Computational thinking in programming
language design
9
Al Aho
Computational Thinking – Jeannette Wing
Computational thinking is a fundamental skill
for everyone, not just for computer scientists.
To reading, writing, and arithmetic, we should
add computational thinking to every child’s
analytical ability. Just as the printing press
facilitated the spread of the three Rs, what is
appropriately incestuous about this vision is
that computing and computers facilitate the
spread of computational thinking.
[Jeannette Wing,
Computational
Thinking, CACM,
March, 2006]
10
Al Aho
Computational thinking involves solving
problems, designing systems, and
understanding human behavior, by drawing
on the concepts fundamental to computer
science. Computational thinking includes a
range of mental tools that reflect the breadth
of the field of computer science.
Computational Thinking
The thought processes involved in formulating
problems so their solutions can be represented
as computation steps and algorithms.
A. V. Aho
Computation and Computational Thinking
Ubiquity Symposium, ACM, 2010
http://ubiquity.acm.org/symposia.cfm
11
Al Aho
A Good Way to Learn Computational Thinking
Design and implement your own programming
language!
12
Al Aho
Computational Thinking in Language Design
Problem
Domain
Conceptual
Formulation
Algorithms+
Computational Model
Programming
Language
13
Al Aho
Evolution of Programming Languages
1971
2011
Fortran
Java
Lisp
C
Cobol
C++
Algol 60
PHP
APL
C#
Snobol 4
Objective-C
Simula 67
Basic
Basic
Python
PL/1
Perl
Pascal
JavaScript
[http://www.tiobe.com, September 2011]
14
Al Aho
Issues in Language Design
• Domain of application
– exploit domain restrictions for expressiveness, performance
• Computational model
– simplicity, ease of expression
• Abstraction mechanisms
– reuse, suggestivity
• Type system
– reliability, security
• Usability
– readability, writability, efficiency
15
Al Aho
Computational Thinking in Language Design
Underlying every programming language is a model of
computation:
C, C++, C#, Java: Von Neumann
SQL: Relational algebra
Prolog: Logic
Haskell, ML: Lambda calculus
16
Al Aho
Computational Model of AWK
AWK is a simple language designed to perform routine
data-processing tasks on strings and numbers
Problem: given a list of name-value pairs, print the total value
associated with each name.
alice 10
eve 20
bob 15
alice 30
An AWK program
is a sequence of
pattern-action statements
{ total[$1] += $2 }
END { for (x in total) print x, total[x] }
eve 20
bob 15
alice 40
17
Al Aho
Theory in Practice: Regular Expression Pattern
Matching in Perl, Python, Ruby vs. AWK
Time to check whether a?nan matches an
regular expression and text size n
Russ Cox, Regular expression matching can be simple and fast (but is slow in Java,
Perl, PHP, Python, Ruby, ...) [http://swtch.com/~rsc/regexp/regexp1.html, 2007]
18
Al Aho
Evolutionary Forces on Languages and Compilers
Increasing diversity of applications
Stress on increasing productivity
Need to improve software reliability
and maintainability
Target machines more diverse
Parallel machine architectures
Massive compiler collections
19
Al Aho
Target Languages
Another programming language
CISCs
RISCs
Parallel machines
Multicores
GPUs
Quantum computers
20
Al Aho
Programming Languages Today
Today there are thousands of programming languages.
The website http://www.99-bottles-of-beer.net
has programs in 1,434 different programming
languages to print the lyrics to the song
“99 Bottles of Beer.”
21
Al Aho
“99 Bottles of Beer”
99 bottles of beer on the wall, 99 bottles of beer.
Take one down and pass it around, 98 bottles of beer on the wall.
98 bottles of beer on the wall, 98 bottles of beer.
Take one down and pass it around, 97 bottles of beer on the wall.
.
.
.
2 bottles of beer on the wall, 2 bottles of beer.
Take one down and pass it around, 1 bottle of beer on the wall.
1 bottle of beer on the wall, 1 bottle of beer.
Take one down and pass it around, no more bottles of beer on the wall.
No more bottles of beer on the wall, no more bottles of beer.
Go to the store and buy some more, 99 bottles of beer on the wall.
[Traditional]
22
Al Aho
“99 Bottles of Beer” in AWK
BEGIN {
for(i = 99; i >= 0; i--) {
print ubottle(i), "on the wall,", lbottle(i) "."
print action(i), lbottle(inext(i)), "on the wall."
print
}
}
function ubottle(n) {
return sprintf("%s bottle%s of beer", n ? n : "No more", n - 1 ? "s" : "")
}
function lbottle(n) {
return sprintf("%s bottle%s of beer", n ? n : "no more", n - 1 ? "s" : "")
}
function action(n) {
return sprintf("%s", n ? "Take one down and pass it around," : \
"Go to the store and buy some more,")
}
function inext(n) {
return n ? n - 1 : 99
}
[Osamu Aoki, http://people.debian.org/~osamu]
23
Al Aho
“99 Bottles of Beer” in Perl
''=~(
.('`'
.'=='
^'+')
.';-'
.('['
.'_\\{'
).(('`')|
).('['^'/')
'\\"'.('['^
'{'^"\[").(
('{'^'[').(
'`'|"\%").(
'\\"\\}'.+(
'+_,\\",'.(
'`'|"\+").(
'{'^"\[").(
'[').("\["^
')').("\["^
'.').("\`"|
'+').("\!"^
'`'|('%')).
'(?{'
|'!')
.('['
.'||'
.'-'.
^'.')
.'(\\$'
'/').').'
.('['^'/').
'#').'!!--'
'`'|"\"").(
'`'|"\/").(
'{'^"\[").(
'['^"\+").(
'{'^('[')).
'`'|"\%").(
'`'|"\$").(
'+').("\`"|
'/').("\{"^
'.').("\`"|
'+').'\\"'.
'++\\$="})'
.('`'
.('`'
^'+')
.(';'
'\\$'
.('`'
.';=('.
.'\\"'.+(
('`'|',').(
.'\\$=.\\"'
'`'|"\%").(
'`'|"\.").(
'['^"\,").(
'['^"\)").(
('\\$;!').(
'{'^"\[").(
'`'|"\/").(
'!').("\["^
'[').("\`"|
'$')."\,".(
('['^',').(
);$:=('.')^
|'%')
|',')
.('`'
&'=')
.'=;'
|'"')
'\\$=|'
'{'^'[').
'`'|('%')).
.('{'^'[').
'`'|"\%").(
'{'^"\[").(
'`'|"\!").(
'`'|"\)").(
'!'^"\+").(
'`'|"\/").(
'['^"\,").(
'(').("\["^
'!').("\["^
'!'^('+')).
'`'|"\(").(
'~';$~='@'|
.('['
.'"'.
|'/')
.(';'
.('['
.('!'
."\|".(
('`'|'"')
'\\".\\"'.(
('`'|'/').(
'['^(')')).
'['^"\/").(
'`'|"\,").(
'`'|"\.").(
'{'^"\/").(
'`'|"\.").(
'`'|('.')).
'(').("\{"^
')').("\`"|
'\\",_,\\"'
'`'|"\)").(
'(';$^=')'^
^'-')
'\\$'
.('['
&'=')
^'(')
^'+')
'`'^'.'
.('`'|'/'
'['^('(')).
'`'|"\&").(
'\\").\\"'.
'`'|"\(").(
'`'|(',')).
'['^('/')).
'`'|"\!").(
'`'|"\%").(
','.(('{')^
'[').("\`"|
'/').("\["^
.'!'.("\!"^
'`'|"\,").(
'[';$/='`';
[Andrew Savage, http://search.cpan.org/dist/Acme-EyeDrops/lib/Acme/EyeDrops.pm]
24
Al Aho
“99 Bottles of Beer” in the Whitespace Language
[Andrew Kemp, http://compsoc.dur.ac.uk/whitespace/]
25
Al Aho
Conlangs: Made-Up Languages
Okrent lists 500 invented languages including:
• Lingua Ignota [Hildegaard of Bingen, c. 1150]
• Esperanto [L. Zamenhof, 1887]
• Klingon [M. Okrand, 1984]
Huq Us'pty G'm (I love you)
• Proto-Central Mountain [J. Burke, 2007]
• Dritok [D. Boozer, 2007]
Language of the Drushek, long-tailed beings with
large ears and no vocal cords
[Arika Okrent, In the Land of Invented Languages, 2009]
[http://www.inthelandofinventedlanguages.com]
26
Al Aho
Grammars are Used for Specifying Syntax
The grammar S → aSbS | bSaS | ε generates all strings of
a’s and b’s with the same number of a’s as b’s.
This grammar is ambiguous: abab has two parse trees.
S
b
a
S
b
S
S
a
S
ε
ε
(ab)n
27
Al Aho
S
a
ε
1  2n 
  parse trees
has
n 1 n 
S
b
ε
a
S
S
ε
b
S
ε
Programming Languages are not
Inherently Ambiguous
This grammar G generates the same language
S → aAbS | bBaS | ε
A → aAbA | ε
B → bBaB | ε
S
a
G is unambiguous and has
only one parse tree for
every sentence in L(G).
A
b
S
ε
a
A
ε
28
Al Aho
b
S
ε
Natural Languages are Inherently Ambiguous
I made her duck.
[5 meanings: D. Jurafsky and J. Martin, 2000]
One morning I shot an elephant in my pajamas. How he got
into my pajamas I don’t know.
[Groucho Marx, Animal Crackers, 1930]
List the sales of the products produced in 1973 with the
products produced in 1972.
[455 parses: W. Martin, K. Church, R. Patil, 1987]
29
Al Aho
Methods for Specifying the Semantics of
Programming Languages
Operational semantics
Program constructs are translated to an understood language.
Axiomatic semantics
Assertions called preconditions and postconditions specify
the properties of statements.
Denotational semantics
Semantic functions map syntactic objects to semantic values.
30
Al Aho
Phases of a Compiler
source
program
Lexical
Analyzer
target
program
Syntax
Analyzer
token
stream
Semantic
Analyzer
syntax
tree
Interm.
Code
Gen.
annotated
syntax
tree
Code
Optimizer
interm.
rep.
Code
Gen.
interm.
rep.
Symbol Table
[A. V. Aho, M. S. Lam, R. Sethi, J. D. Ullman, Compilers: Principles, Techniques, & Tools, 2007]
31
Al Aho
Compiler Component Generators
source
program
32
Al Aho
lex
specification
yacc
specification
Lexical
Analyzer
Generator
(lex)
Syntax
Analyzer
Generator
(yacc)
Lexical
Analyzer
token
stream
Syntax
Analyzer
syntax
tree
Lex Specification for a Desk Calculator
number
[0-9]+\.?|[0-9]*\.[0-9]+
%%
[ ]
{ /* skip blanks */ }
{number}
{ sscanf(yytext, "%lf", &yylval);
return NUMBER; }
\n|.
{ return yytext[0]; }
[M. E. Lesk and E. Schmidt, Lex – A Lexical Analyzer Generator]
33
Al Aho
Yacc Specification for a Desk Calculator
%token NUMBER
%left '+'
%left '*'
%%
lines : lines expr '\n'
| /* empty */
;
expr : expr '+' expr
| expr '*' expr
| '(' expr ')'
| NUMBER
;
%%
#include "lex.yy.c"
{ printf("%g\n", $2); }
{ $$ = $1 + $3; }
{ $$ = $1 * $3; }
{ $$ = $2; }
[Stephen C. Johnson, Yacc: Yet Another Compiler-Compiler ]
34
Al Aho
Creating the Desk Calculator
Invoke the commands
lex desk.l
yacc desk.y
cc y.tab.c –ly –ll
Result
1.2 * (3.4 + 5.6)
35
Al Aho
Desk
Calculator
10.8
Computational Thinking for Quantum Computing
36
Al Aho
Physical
System
Subatomic particles
Mathematical
Formulation
Quantum mechanics
Algorithms
Shor, Grover, etc.
Model of
Computation
Quantum circuits
Computational Thinking for Quantum Computing
The Four Postulates of Quantum Mechanics
M. A. Nielsen and I. L. Chuang
Quantum Computation and Quantum Information
Cambridge University Press, 2000
37
Al Aho
State Space Postulate
Postulate 1
The state of an isolated quantum system can be described
by a unit vector in a complex Hilbert space.
38
Al Aho
Qubit: Quantum Bit
• The state of a quantum bit can be described by a unit vector in a 2dimensional complex Hilbert space (in Dirac notation)
  0   1
where α and β are complex coefficients called the amplitudes of the
basis states | 0  and | 1  , and
   1
2
2
• In linear algebra
 0 
1 
 0
         
 0
1 
 1 
39
Al Aho
Time-Evolution Postulate
Postulate 2
The evolution of a closed quantum system
can be described by a unitary operator U.
(An operator U is unitary if U†U = I.)

state of
the system
at time t1
40
Al Aho
U
U
state of
the system
at time t2
Useful Quantum Operators: Hadamard Operator
The Hadamard operator has the matrix representation
1
H
2
1 1 
1  1


H maps the computational basis states as follows
1
H0 
(0 1)
2
1
H1 
(0 1)
2
Note that HH = I.
41
Al Aho
Composition-of-Systems Postulate
Postulate 3
The state space of a combined physical system is
the tensor product space of the state spaces of the
component subsystems.
If one system is in the state  1 and another is in
the state  2 , then the combined system is in the
state  1   2 .
 1   2 is often written as  1  2 or as  1 2 .
42
Al Aho
Useful Quantum Operators: the CNOT Operator
1
0

0

0
The two-qubit CNOT
(controlled-NOT) operator
has the matrix
representation:
CNOT flips the target bit t
iff the control bit c has
the value 1:
c
0 0 0
1 0 0
0 0 1

0 1 0
.
t
c
c t
The CNOT gate maps
00  00 ,
43
Al Aho
01  01 ,
10  11 ,
11  10
Measurement Postulate
Postulate 4
Quantum measurements can be described by a
collection {M m } of operators acting on the state space
of the system being measured. If the state of the
system is  before the measurement, then the
probability that the result m occurs is
p(m)   M M m 
†
m
and the state of the system after measurement is
Mm 
 M Mm 
†
m
44
Al Aho
Measurement
The measurement operators satisfy the
completeness equation:
M
†
m
Mm  I
m
The completeness equation says the
probabilities sum to one:
 p(m)   
m
45
Al Aho
m
M M  1
†
m
Computational Abstraction: Quantum Circuits
Quantum circuit to create Bell (Einstein-Podulsky-Rosen)
states:
x
H
 xy
y
Circuit maps
00 
( 00  11 )
2
, 01 
( 01  10 )
2
, 10 
( 00  11 )
2
, 11 
( 01  10 )
2
Output is an entangled state, one that cannot be written in
a product form. (Einstein: “Spooky action at a distance.”)
46
Al Aho
Alice and Bob’s Qubit-State Delivery Problem
• Alice knows that she will need to send to Bob the state
of an important secret qubit sometime in the future.
• Her friend Bob is moving far away and will have a very
low bandwidth Internet connection.
• Therefore Alice will need to send her qubit state to Bob
cheaply.
• How can Alice and Bob solve their problem?
47
Al Aho
Alice and Bob’s Solution: Quantum Teleportation!

 00
H
M1
M2
X
Z

• Alice and Bob generate an EPR pair  .
00
• Alice takes one half of the pair; Bob the other half. Bob moves far away.
• Alice interacts her secret qubit  with her EPR-half and measures the
two qubits.
• Alice sends the two resulting classical measurement bits to Bob.
• Bob decodes his half of the EPR pair using the two bits to discover  .
48
Al Aho
Quantum Computer Architecture
Quantum
Memory
Quantum
Logic Unit
Classical Computer
Knill [1996]: Quantum RAM, a classical computer combined with a
quantum device with operations for initializing registers of qubits
and applying quantum operations and measurements
E. Knill
Conventions for Quantum Pseudocode
Los Alamos National Laboratory, LAUR-96-2724, 1996
49
Al Aho
Quantum Computer Compiler
QIR: quantum intermediate representation
QASM: quantum assembly language
QPOL: quantum physical operations language
quantum
source
program
QIR
Front
End
quantum
mechanics
quantum
circuit
Technology
Independent
CG+Optimizer
QASM
Technology
Dependent
CG+Optimizer
quantum
circuit
QPOL
Technology
Simulator
quantum
device
Computational abstractions
50
Al Aho
K. Svore, A. Aho, A. Cross, I. Chuang, I. Markov
A Layered Software Architecture for Quantum Computing Design Tools
IEEE Computer, 2006, vol. 39, no. 1, pp.74-83
MIT Ion Trap Simulator
51
Al Aho
Design Flow with Fault Tolerance and
Error Correction
Mathematical Model:
Quantum mechanics,
unitary operators,
tensor products
Computational
Formulation:
Quantum bits,
gates, and circuits
EPR Pair Creation
Quantum Circuit Model
QCC:
QIR,
QASM
QIR
Software:
QPOL
QASM
QPOL
Physical System:
Laser pulses
applied
to ions in traps
Machine Instructions
Physical Device
B
A 123 B
A
123
Fault Tolerance and Error Correction (QEC)
QEC
Moves
QEC
52
Al Aho
Moves
K. Svore
PhD Thesis
Columbia, 2006
Why Quantum Computation is Still Challenging
• States are superpositions
• Operators are unitary transforms
• States of qubits can become entangled
• Measurements are destructive
• No-cloning theorem: you cannot copy an unknown
quantum state!
53
Al Aho
Some Computational Thinking Lessons
Learned in COMS W4115
• “Designing a language is hard and designing a simple
language is extremely hard!”
• “During this course we realized how naïve and
overambitious we were, and we all gained a newfound
respect for the work and good decisions that went into
languages like C and Java which we’ve taken for
granted for years.”
54
Al Aho
Open Question: Is computational thinking innate?
55
Al Aho