Transcript pptx

Research in Compilers and
How it Relates to
Software Engineering
Part I: Compiler Research
Tomofumi Yuki
EJCP 2015
June 22, Nancy
Background
 Defended Ph.D. in C.S. on October 2012


Colorado State University
Advisor: Dr. Sanjay Rajopadhye
 Currently Inria Chargé de Recherche

Rhône-Alpes, LIP @ ENS Lyon
 Optimizing compiler + programming language



static analysis (polyhedral model)
parallel programming models
High-Level Synthesis
2
What is this Course About?
 Research in compilers

a bit about compiler itself
 Understand compiler research
what are the problems?
 what
techniques?
Be
ableare
to the
(partially)
understand work
 what
are thepeople”
applications?
by
“compiler
at conferences.


may be do research in compilers later on!
EJCP 2015, June 22, Nancy
3
What is a Compiler?
 What does compiler mean to you?
EJCP 2015, June 22, Nancy
4
Compiler Advances
 Old compiler vs recent compiler


modern architecture
different versions of gcc
 How much speedup by compiler alone after
20 years of research?
EJCP 2015, June 22, Nancy
5
Compiler Advances
 Old compiler vs recent compiler

modern architecture
different versions of gcc

2x difference after 20 years (anecdotal)

 Not so much?
EJCP 2015, June 22, Nancy
6
Compiler Advances
 Old compiler vs recent compiler

modern architecture
different versions of gcc

2x difference after 20 years (anecdotal)

 Not so much?
“The most remarkable accomplishment by far
of the compiler field is the widespread use of
high-level languages.”
by Mary Hall, David Padua, and Keshav Pingali
[Compiler Research: The Next 50 Years, 2009]
EJCP 2015, June 22, Nancy
7
Earlier Accomplishments
 Getting efficient assembly



register allocation
instruction scheduling
...
 High-level language features




object-orientation
dynamic types
automated memory management
...
EJCP 2015, June 22, Nancy
8
What is Left?
 Parallelism


multi-cores, GPUs, ...
language features for parallelism
 Security/Reliability


verification
certified compilers
 Power/Energy


data movement
voltage scaling
EJCP 2015, June 22, Nancy
9
Agenda for today
 Part I: What is Compiler Research?
 Part II: Compiler Optimizations
 Part III: Connection to Software Engineering
 Lab: Introduction to Loop Transformations
EJCP 2015, June 22, Nancy
10
What is a Compiler?
 Bridge between “source” and “target”
source
compile
target
EJCP 2015, June 22, Nancy
11
Compiler vs Assembler
 What are the differences?
source
compile
target
assembl
y
assemble
object
EJCP 2015, June 22, Nancy
12
Compiler vs Assembler
 Compiler


Many possible targets (semi-portable)
Many decisions are taken
 Assembler


Specialized output (non-portable)
Usually a “translation”
EJCP 2015, June 22, Nancy
13
Goals of the Compiler
 Higher abstraction


No more writing assemblies!
enables language features

loops, functions, classes, aspects, ...
 Performance



while increasing productivity
speed, space, energy, ...
compiler optimizations
EJCP 2015, June 22, Nancy
14
Productivity vs Performance
 Higher Abstraction ≈ Less Performance
Python
Abstractio
n
Java
C
Fortran
Assembly
PerformancEJCP 2015, June 22, Nancy
e
15
Productivity vs Performance
 How much can you regain?
Python
Python
Abstractio
n
Java
Java
C
C
FortranFortran
Assembly
PerformancEJCP 2015, June 22, Nancy
e
16
Productivity vs Performance
 How sloppy can you write code?
PythonPython
Abstractio
n
Java
Java
C
C
Fortran
Fortran
Assembly
PerformancEJCP 2015, June 22, Nancy
e
17
Compiler Research
 Branch of Programming Languages






Program Analysis, Transformations
Formal Semantics
Type Theory
Runtime Systems
Compilers
...
EJCP 2015, June 22, Nancy
18
Current Uses of Compiler
 Optimization



important for vendors
many things are better left to the compiler
parallelism, energy, resiliency, ...
 Code Analysis


IDEs
static vs dynamic analysis
 New Architecture

IBM Cell, GPU, Xeon-Phi, ...
EJCP 2015, June 22, Nancy
19
Case 1: Register Allocation
 Classical optimization problem
3 registers
8 instructions
naïve translation
load
load
add
store
load
load
add
store
%r1,
%r2,
%r3,
%r3,
%r1,
%r2,
%r3,
%r3,
A
B
%r1, %r2
C
B
C
%r1, %r2
D
C = A + B;
D = B + C;
2 registers
6 instructions
smart compilation
load
load
add
store
add
store
%r1,
%r2,
%r1,
%r1,
%r1,
%r1,
A
B
%r1, %r2
C
%r2, %r1
D
EJCP 2015, June 22, Nancy
20
Register Allocation in 5min.
 Often viewed as graph coloring
a
a
b
c
c
d
c = a + b;
d = b + c;
d
Interference
Graph
b
add
add
%r1, %r1, %r2
%r1, %r2, %r1
Live Range
Analysis
 Live Range: when a value is “in use”
 Interference: both values are “in use”

e.g., two operands of an instruction
 Coloring: conflicting nodes to different reg.
EJCP 2015, June 22, Nancy
21
Register Allocation in 5min.
 Registers are limited
a
b
c
d
x
y
a
b
c
d
x
z
c
d
x
y
=
=
=
=
a
b
c
a
+
+
+
+
a
b
c
d
x
y
a
b
c
d
x
y
b;
c;
d;
x;
Live Range Splitting
a
c
d
x
z
y
=
=
=
=
=
=
load A;
a + b;
b + c;
c + d;
load A;
z + x;
z
EJCP 2015, June 22, Nancy
22
Research in Register Allocation
 How to do a good allocation


which variables to split
which values to
spill
“Solved”
 How to do it fast?


Graph-coloring is expensive
Just-in-Time compilation
EJCP 2015, June 22, Nancy
23
Case 2: Instruction Scheduling
 Another classical problem
X = A * B * C;
Y = D * E * F;
naïve translation
R
X
S
Y
=
=
=
=
A
R
D
S
*
*
*
*
B;
C;
E;
F;
smart compilation
Pipeline Stall
(if mult. takes 2 cycles)
R
S
X
Y
=
=
=
=
A
D
R
S
*
*
*
*
B;
E;
C;
F;
Also done in hardware
(out-of-order)
EJCP 2015, June 22, Nancy
24
Instruction Scheduling in 5min.
 DAG Scheduling + Resource Constraints
A
B
C
×
×
X
X = A * B * C;
Y = D * E * F;
Scheduling:
- dependences
- latency of ops
Resource Allocation:
- #of ALUs
- #of mem.
requests
D
E
F
×
×
Y
EJCP 2015, June 22, Nancy
25
Research in Instruction
Scheduling
 Not much anymore for speed/parallelism


beaten to death
hardware does it for you
 Remains interesting in specific contexts



faster methods for JIT
energy optimization
“predictable” execution

in-order cores, VLIW, etc.
EJCP 2015, June 22, Nancy
26
Case 1+2: Phase Ordering
 Yet another classical problem

practically no solution
 Given optimization A and B



A after B vs A before B
which order is better?
can you solve the problem globally?
 Parallelism requires more memory

trade-off: register pressure vs parallelism
EJCP 2015, June 22, Nancy
27
Case 3: Bug Detection
 Static Analysis to help programming
short A = 30000;
short B = 30000;
short C = A + B;
Is there a problem?
EJCP 2015, June 22, Nancy
28
Case 3: Bug Detection
EJCP 2015, June 22, Nancy
29
Case 4: High-Level Synthesis
 Hardware Description ≈ Assembly for HW

Still in heavy use!
for (i=0; i<N; i++)
for (j=i; j<N; j++)
C[i] += A[i][j] * B[i];
i,j
if j=N-1
i+1,i
if j<N-1
i,j+1
A
×
+
C
B
EJCP 2015, June 22, Nancy
30
Tracks in PLDI 2015
 Correctness
 Verification
 Optimization
 Concurrency I/II
 Parallelism
 Performance
 Synthesis I/II
 Semantics I/II
 Analysis
 Logic
EJCP 2015, June 22, Nancy
31
Job Market
 Where do they work at?




IBM
Mathworks
amazon
start-ups
 Many opportunities in France


Mathworks @ Grenoble
Many start-ups
EJCP 2015, June 22, Nancy
32