Lecture 0 - Introduction, Course Overview, and Computer Organization

Lecture 0: Introduction
EEN 312: Processors:
Hardware, Software, and
Interfacing
Department of Electrical and Computer Engineering
Fall 2012, Dr. Rozier (UM)
Welcome to EEN 312!
Professor Eric Rozier
ROSE-E-A
Who am I?
• BS in Computer Science
from William and Mary
• Studied models of
agricultural pests (flour
beetles).
• And load balancing of
super computers.
Who am I?
• First job – NASA Langley
Research Center
• Researched problems in
aeroacoustics
– Primarily on the XV-15
– Precursor to the better
known V-22
Who am I?
• PhD in CS/ECE from the
University of Illinois
• Studied non-linear
dynamics of
transactivation
networks in
economically important
species… corn…
Who am I?
• PhD in CS/ECE from the
University of Illinois
• Worked with the NCSA
on problems in super
computing, reliability,
and big data.
• Research led to
patented advances with
IBM
Who am I?
• Served as a visiting
scientist and IBM Fellow
at the IBM Almaden
Research Center in San
Jose, CA
• Helped advance the state of
the art in fault tolerance, and our
understanding of why
systems fail
Who am I?
• Postdoctoral work at
the Information Trust
Institute
– Worked on Blue Waters
Super Computer, first
sustained Petaflop
machine
– Designed new fault-tolerant methods for
data protection on large-scale systems
Who am I?
• Assistant Professor at
UM ECE
– Head of the Trustworthy
Systems Lab
– Working on problems in:
• Cloud computing
• Big Data
• Reliability
• Security
• Compliance
How to get in touch with me?
• Office
– Department of Electrical and Computer Engineering
– Fifth Floor, Room 517
• Contact Information
– Email: [email protected]
– Phone: 8-9752
• Currently looking for motivated students
– Research projects and papers
Office Hours
• Office
– Department of Electrical and Computer Engineering
– Fifth Floor, Room 517
Day | Hours
Tuesday | 10:00a – 11:00a
Thursday | 10:00a – 11:00a
Or by appointment
COURSE SUMMARY AND OVERVIEW
EEN 312
• Processors: Hardware, Software, and
Interfacing
– Class: MM 102
– Lab: McArthur Engineering Building 402
• Class website
– http://performalumni.org/erozier2/een312.html
The syllabus…
Grades
Grade Component | Percentage
Midterm I | 10%
Midterm II | 10%
Laboratory Projects | 50%
Final Exam | 30%
Grades
• Guaranteed Grades
• A+’s are assigned on the basis of exceptional
work, scoring 99 or 100 for the entire course.
Labs
• Labs are a HUGE component of this course
– Lab sessions will be held based on the session you have
been assigned and registered for.
– Labs for this class will be very demanding. It is unlikely you
will finish them during the assigned sessions.
– You will need to make good use of your assigned
laboratory time to seek guidance from your TAs, but you
should expect to spend significant time outside of lab
working on your lab assignments.
Active Learning
• After 2 weeks we tend to remember:
– Passive learning
• 10% of what we read
• 20% of what we hear
• 30% of what we see
• 50% of what we hear and see
– Active learning
• 70% of what we say
• 90% of what we say and do
Bloom’s Taxonomy
Evaluation
Synthesis
Analysis
Application
Comprehension
Knowledge
Training Good Engineers
• Understanding processors isn’t our only goal
– Critical Reading
– Critical Reasoning
• Ask questions!
• Think through problems!
• Challenge assumptions!
[Figure: Computer Engineering (ECN) curriculum flowchart (128 Cr), 2012 - 2013, dated June 2012, showing prerequisite and pre- or corequisite links between courses; EEN 312 (Microprocessor, 4 Cr.) is marked.]
[Figure: Computer Engineering (ECN) curriculum flowchart (128 Cr), 2012 - 2013, with EEN 118 (Introduction to Programming), EEN 304 (Logic Design), and EEN 312 (Microprocessor) marked.]
Course overview
• Understanding the abstractions beneath your
applications and programs.
• We will focus on:
– How programs are translated into machine
language.
– How hardware executes machine
instructions.
– How computers are organized and
designed.
Course Components
• Class time
– High level concepts
– Hands on exercises and application
– Discussions
• Labs
– The heart of the course
– 1-2 weeks each
– In-depth exploration of an aspect of system design and organization
• Exams
– 2 Midterms + 1 Final
– Test your understanding of concepts and mathematics
Textbook
• Be sure to get the 4th edition!
• Available from the bookstore
– New: $89.95
– Used: $67.50
• Available online
– Softback: $61.98 (Amazon)
– Kindle: $71.99
– Kindle Rental: ~$35
• The textbook is essential for this course.
LABORATORIES
Laboratories
• TAs
– Yilin Yan
• [email protected]
– Murat Aykin
• [email protected]
• Lab Sections
– Wednesday 2:30 – 4:50p
– Friday 2:30 – 4:50p
Lab Procedure
• Labs will be completed in groups of 2-3.
– You may complete labs as a group, but you must
each hand in a separate lab assignment.
– You may change groups with each lab.
Raspberry Pi
Lab Pis
• We have a set of 16 Raspberry Pis available for
the class. Each group will be assigned one for
each lab.
– Don’t use an unassigned Pi!
– Some of our labs will have the potential to reboot
the platform, or worse! One group per Pi!
• Pis used for the lab are accessible from the
school network.
Laboratory Assignments
• The labs for this class will require a lot of time.
• Start them early.
• Labs will be assigned in class on Tuesday
before the first lab session.
– It is recommended you prepare any questions for
your first laboratory session in advance!
• Labs are typically due at the beginning of your
lab session, 2 weeks after they are assigned.
Laboratory Assignments
• Each student is allocated 3 slip days for the semester.
– A slip day can be used to extend the due date for a laboratory by 1 day,
no questions asked.
– You should indicate on your submitted assignment how many slip days
are being used.
• No other extensions will be granted except in the case of a
documented emergency.
– Late work suffers
• A -20% on the first day it is late.
• A -40% on the second day it is late.
• A -60% on the third day it is late.
• No credit for four days or more late.
Examinations
• Examinations
– Midterm I – February 13th in class
– Midterm II – March 3rd in class
– Final Exam – May 1st from 11:00a – 1:30p in MM102.
Course Plan
Week | Topics | Lab / Exam
1 | Introduction, Computer Organization, Performance | Lab 0
2 | Instructions, operands, load/store, and numbers | Lab 1
3 | Branches, conditions, loops, procedures and the stack |
4 | Arithmetic, ALUs, Processors | Lab 2
5 | Data path, control, pipelining | MIDTERM I
6 | Jumps, branches, and pipelines | Lab 3
7 | Pipeline hazards, branch prediction, exceptions |
8 | Memory hierarchy, caches, addressing | Lab 4
3/11 | Spring Break |
9 | Cache performance, block replacement, caching algorithms | Lab 4
10 | Virtual memory, paging, page faults, protection | Lab 5
11 | Intro to storage systems | MIDTERM II
12 | Storage systems, reliability, deduplication, RAID, flash and PCM | Lab 6
13 | Connecting processors, memory, and I/O |
14 | Parallel processing, concurrency, and course synthesis | No Lab
• University of Miami Honor Code is in effect
– Open hands policy on assignments
• Late policy
– Late assignments are only accepted if arrangements are made ahead of time
• Electronic device policy
– Laptops and tablets are ok as long as they’re being used for class
– Silence cell phones please
ON ABSTRACTIONS
Abstraction and Reality
• Most courses in CS/ECE emphasize
abstractions
– Abstract data types
– Abstract analysis
• Abstractions have limits
– Reality rears its ugly head as bugs in design and
implementation.
– Understanding the details of underlying systems
becomes important!
Some Realities
What is an int?
What is a float?
Some Realities
Reality #1
• An int is not an integer!
• A float is not a real number!
• Example: Is x^2 >= 0?
– Floats? Yes.
– Ints?
• 40000 * 40000 -> 1600000000
• 50000 * 50000 -> ??
Doesn’t behave like an
integer!
Some Realities
• Is addition associative?
– Does (x + y) + z = x + (y + z)?
– Ints? Yes.
– Floats? No!
– (1e20 + -1e20) + 3.14 -> 3.14, but 1e20 + (-1e20 + 3.14) -> 0
Computer Arithmetic
• It isn’t random!
– Operations have mathematical properties they
adhere to.
• May not be the ones we assume as “usual”
– Finite representation in the hardware matters!
• Observation:
– Understanding the hardware implementation is
necessary.
What kind of abstractions are we
using?
• Code found in BSD implementation of
getpeername
/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */
int copy_from_kernel(void *user_dest, int maxlen) {
    /* Byte count len is minimum of buffer size and maxlen */
    int len = KSIZE < maxlen ? KSIZE : maxlen;
    memcpy(user_dest, kbuf, len);
    return len;
}
Intended Usage
• Code found in BSD implementation of
getpeername
#define MSIZE 528

void getstuff() {
    char mybuf[MSIZE];
    copy_from_kernel(mybuf, MSIZE);
    printf("%s\n", mybuf);
}
Malicious Usage
• Code found in BSD implementation of
getpeername
#define MSIZE 528

void getstuff() {
    char mybuf[MSIZE];
    copy_from_kernel(mybuf, -MSIZE);
    . . .
}
• A negative maxlen passes the signed minimum check, and memcpy then treats it as a huge unsigned length.
Reality #2
• Knowing assembly is essential to your future!
– You will probably NEVER write assembly programs
outside of this class.
• Compilers are better at it than you are.
– Understanding assembly is key to understanding
machine execution.
• Behavior of programs with bugs.
• Performance tuning.
• System Software.
• Malware analysis.
Some Realities
Reality #3
• Memory matters!
– Random access memory is an abstraction with
little basis in the physical world.
– Memory is not unbounded.
• Memory must be allocated and managed.
• Many applications are memory dominated.
– Memory reference bugs are very difficult.
– Memory performance is non-uniform.
Memory Referencing Bug Example
double fun(int i)
{
    volatile double d[1] = {3.14};
    volatile long int a[2];
    a[i] = 1073741824; /* Possibly out of bounds */
    return d[0];
}
fun(0) ➙ 3.14
fun(1) ➙ 3.14
fun(2) ➙ 3.1399998664856
fun(3) ➙ 2.00000061035156
fun(4) ➙ 3.14, then segmentation fault
• Result varies based on architecture.
Memory Referencing Bug Example
Explanation:
Location accessed by fun(i):
4 | Saved State
3 | d7 ... d4
2 | d3 ... d0
1 | a[1]
0 | a[0]
Memory Referencing Errors
• C/C++ do not provide memory protection
– Out of bound references
– Invalid pointer values
– Abuses of allocation
• Can lead to hard to debug situations
– Dependent on architecture and compiler
– Action at a distance
Memory Performance Example
• How big of a difference does this simple
change make?
void copyij(int src[2048][2048],
            int dst[2048][2048])
{
    int i,j;
    for (i = 0; i < 2048; i++)
        for (j = 0; j < 2048; j++)
            dst[i][j] = src[i][j];
}
void copyji(int src[2048][2048],
            int dst[2048][2048])
{
    int i,j;
    for (j = 0; j < 2048; j++)
        for (i = 0; i < 2048; i++)
            dst[i][j] = src[i][j];
}
• 21x slowdown from copyij to copyji!
Memory Performance Example
[Figure: read throughput (MB/s) plotted against stride (x8 bytes, s1 to s32) and size (bytes, 2K to 64M), with regions labeled L1, L2, L3, and Mem; copyij and copyji are marked on the surface.]
Some Realities
Reality #4
•There is more to performance than asymptotic
complexity.
– Constants matter too!
– Exact op count doesn’t even fully describe the
situation!
– Must optimize at all levels: algorithm, data
representation, functions, loops.
Some Realities
Reality #4
• There is more to performance than asymptotic
complexity.
– Need to understand how systems work to
optimize them!
• How are programs compiled?
• How are they executed?
• How do we measure performance?
• How do we find bottlenecks?
• How do we improve performance without affecting the
code?
Performance of Matrix Multiply
[Figure: performance of matrix multiply; the best code (K. Goto) is 160x faster than the triple loop.]
• Same computer.
• Same compiler.
• Same flags.
• Exactly the same number of operations.
• Why does this happen?
Performance of Matrix Multiply
Multiple threads: 2x
Vector instructions: 4x
Memory hierarchy and other optimizations: 20x
• Reasons for 20x improvement: Blocking or tiling, loop unrolling, array
scalarization, instruction scheduling, search to find best choice.
• Effect: Fewer register spills, L1/L2 cache misses, and TLB misses.
Some Realities
Reality #5
• Computers do more than execute programs
– Need to get data in and out
• I/O systems are critical to performance and reliability.
– Communicate over networks
• How to cope with unreliable media?
• Dealing with concurrency?
• Cross platform issues?
COMPUTER ORGANIZATION
Classes of Computers
• Desktop Computers
– General purpose, run a variety of software for many
applications
– Subject to cost/performance tradeoffs
• Server Computers
– Network based
– High capacity, performance, and reliability
– Range from small servers to super computers
• Embedded Computers
– Parts of systems, cyberphysical controllers
– Power/performance/cost constraints
Market Trends
Components of a Computer
• All computers have similar
philosophies of organization.
• Get input, perform computation,
produce output.
– User interface: Display,
keyboard, sensors
– Storage devices: Hard disk,
CD/DVD, flash
– Communication: Network,
wifi, etc.
– Compute: CPU, GPU, Memory
Internals of a Computer
Internals of a Processor (CPU)
• Datapath: Performs operations on data.
• Control: sequences datapath, memory
• Cache memory
– Small fast SRAM memory for immediate access to
data.
Abstractions and the CPU
• Abstractions help us deal with complexity and
hide low level details.
• Instruction set architecture (ISA)
– The hardware/software interface
• Application binary interface
– ISA + system software interface
Why Abstractions?
• What is an instruction?
Why Abstractions?
• What is an instruction?
– A collection of bits the computer understands and
can “execute” or perform.
– Example:
00000010001100100100000000100000
– Tells a computer to add two numbers.
– How does the computer know?
Why Abstractions?
00000010001100100100000000100000
op code | rs | rt | rd | shamt | function code
• op code - Code for the basic operation of the instruction
• rs - The first register source operand
• rt - The second register source operand
• rd - The register destination operand, gets the result of the operation
• shamt - Shift amount
• function code - Function code, selects the specific variant of the operation
indicated by the op code.
What the heck does
this even mean?
add $t0, $s1, $s2
The Hardware
Why abstractions
This:
add $t0, $s1, $s2
Is easier than this:
00000010001100100100000000100000
Why abstractions
• You know what is even easier?
This:
c = a + b;
• For a human at least…
Abstraction Layers
• High-level language
– Level of abstraction is
close to the problem
domain.
– Allows us to be
productive!
– Allows the code to be
machine portable
• Different machines have
different instructions!
Abstraction Layers
• Assembly language
– Assembly language is
produced by a compiler.
– The compiler takes a high-level language and
generates the instructions
necessary to carry out
the indicated algorithm.
– Assembly language is a
symbolic version of
binary instructions.
Abstraction Layers
• Machine Language
– Created by the
assembler.
– Translates from symbolic
assembly language into
the binary
representation in
machine language which
the computer actually
understands.
WRAP UP
For next time
• Read Chapter 1, Sections 1.1 – 1.5,
and 1.8.
• Start Lab 0 early!!!