EI 209
Computer Organization
Fall 2015
Chapter 2: Instructions:
Language of the Computer
Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/ )
Course Website:
http://nsec.sjtu.edu.cn/teaching/ei209.html
[Adapted from Computer Organization and Design, 4th Edition,
Patterson & Hennessy, © 2012, MK]
EI209 Chapter 2.1
Haojin Zhu, SJTU, 2016
Assignment I
Chapter I: 1.16
Coding Assignment I:
Write a program that checks whether your computer is big
endian or little endian.
Please submit the source code and the execution
results.
Submission Due: Oct. 8
The Language a Computer Understands
Word a computer understands: instruction
Vocabulary of all words a computer understands:
instruction set (aka instruction set architecture or ISA)
Different computers may have different vocabularies (i.e.,
different ISAs)
iPhone (ARM) is not the same as MacBook (x86)
Or the same vocabulary (i.e., same ISA)
iPhone and iPad computers have same instruction set (ARM)
The Language a Computer Understands
Why not all the same? Why not all different? What might
be pros and cons?
Single ISA (to rule them all):
- Leverage common compilers, operating systems, etc.
- BUT fairly easy to retarget these for different ISAs (e.g., Linux, gcc)
Multiple ISAs:
- Specialized instructions for specialized applications
- Different tradeoffs in resources used (e.g., functionality, memory
demands, complexity, power consumption, etc.)
- Competition and innovation are good, especially in emerging
environments (e.g., mobile devices)
MIPS: Instruction Set for EI209
MIPS is a real-world ISA (see www.mips.com)
Standard instruction set for networking equipment
Was also used in original Nintendo-64!
Elegant example of a Reduced Instruction Set Computer
(RISC) instruction set
Invented by John Hennessy @ Stanford
Why not Berkeley/Sun RISC invented by Dave Patterson? Ask him!
RISC Design Principles
Basic RISC principle: “A simpler CPU (the hardware that
interprets machine language) is a faster CPU” (CPU Core)
Focus of the RISC design is reduction of the number and
complexity of instructions in the ISA
A number of the more common strategies include:
Fixed instruction length, generally a single word;
Simplifies process of fetching instructions from memory
Simplified addressing modes;
Simplifies process of fetching operands from memory
Fewer and simpler instructions in the instruction set;
Simplifies process of executing instructions
Only load and store instructions access memory;
E.g., no add memory to register, add memory to memory, etc.
Let the compiler do it. Use a good compiler to break complex high-level
language statements into a number of simple assembly language statements
Mainstream ISAs
ARM (Advanced RISC Machine) is most popular RISC
In every smart phone-like device (e.g., iPhone, iPad, iPod, …)
Intel 80x86 is another popular ISA and is used in Macbook
and PCs (Core i3, Core i5, Core i7, …)
x86 is a Complex Instruction Set Computer (CISC)
About 20x as many ARM as 80x86 processors sold (i.e., 5 billion vs. 0.3 billion)
MIPS Green Card
[Figure: the MIPS reference data card ("Green Card"), shown on two slides]
Two Key Principles of Machine Design
1. Instructions are represented as numbers and, as such, are
   indistinguishable from data
2. Programs are stored in alterable memory (that can be read or
   written to) just like data
Stored-program concept
Programs can be shipped as files of binary numbers: binary compatibility
Computers can inherit ready-made software provided they are
compatible with an existing ISA, which leads industry to align around a
small number of ISAs
[Figure: Memory holding the Accounting prg (machine code), the C compiler
(machine code), Payroll data, and the Source code in C for Acct prg]
MIPS-32 ISA
Instruction Categories
    Computational
    Load/Store
    Jump and Branch
    Floating Point
     - coprocessor
    Memory Management
    Special
Registers
    R0 - R31
    PC
    HI
    LO
3 Instruction Formats: all 32 bits wide
    R format:  op | rs | rt | rd | sa | funct
    I format:  op | rs | rt | immediate
    J format:  op | jump target
MIPS (RISC) Design Principles
Simplicity favors regularity
    fixed size instructions
    small number of instruction formats
    opcode always the first 6 bits
Smaller is faster
    limited instruction set
    limited number of registers in register file
    limited number of addressing modes
Make the common case fast
    arithmetic operands from the register file (load-store machine)
    allow instructions to contain immediate operands
Good design demands good compromises
    three instruction formats
MIPS Arithmetic Instructions
MIPS assembly language arithmetic statement
    add $t0, $s1, $s2
    sub $t0, $s1, $s2
Each arithmetic instruction performs one operation
Each specifies exactly three operands that are all
contained in the datapath's register file ($t0, $s1, $s2)
    destination = source1 op source2
Instruction Format (R format), shown for sub $t0, $s1, $s2:
    op = 0 | rs = 17 | rt = 18 | rd = 8 | shamt = 0 | funct = 0x22
MIPS Instruction Fields
MIPS fields are given names to make them easier to
refer to
    op | rs | rt | rd | shamt | funct
op      6 bits   opcode that specifies the operation
rs      5 bits   register file address of the first source operand
rt      5 bits   register file address of the second source operand
rd      5 bits   register file address of the result's destination
shamt   5 bits   shift amount (for shift instructions)
funct   6 bits   function code augmenting the opcode
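The field widths above are enough to pack an R-format instruction into a 32-bit word by hand. The following is a minimal Python sketch of that packing, not part of the original slides; the helper names encode_r and decode_r are hypothetical and chosen only for illustration. It uses the sub $t0, $s1, $s2 field values from the R-format slide (op = 0, rs = 17, rt = 18, rd = 8, shamt = 0, funct = 0x22).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    # Field widths, left to right: 6, 5, 5, 5, 5, 6 bits (32 bits total).
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r(word):
    """Split a 32-bit word back into (op, rs, rt, rd, shamt, funct)."""
    return (
        (word >> 26) & 0x3F,
        (word >> 21) & 0x1F,
        (word >> 16) & 0x1F,
        (word >> 11) & 0x1F,
        (word >> 6) & 0x1F,
        word & 0x3F,
    )


# sub $t0, $s1, $s2: rs = 17 ($s1), rt = 18 ($s2), rd = 8 ($t0), funct = 0x22
sub_word = encode_r(0, 17, 18, 8, 0, 0x22)
print(f"0x{sub_word:08x}")  # 0x02324022
```

Changing only funct to 0x20 gives the word for add $t0, $s1, $s2, which shows how the function code augments an opcode shared by the arithmetic instructions.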
Computer Hardware Operands
High-Level Programming languages:
could have millions of variables
Instruction sets have fixed, small number
Called registers
“Bricks” of computer hardware
Fastest way to store data in computer hardware
Visible to (the “assembly language”) programmer
MIPS Instruction Set has 32 integer registers
Why Just 32 Registers?
RISC Design Principle: Smaller is faster
But you can be too small …
Hardware would likely be slower with 64, 128, or 256
registers
32 is enough for compiler to translate typical C programs,
and not run out of registers very often
ARM instruction set has only 16 registers
May be faster, but compiler may run out of registers
too often (aka “spilling registers to memory”)
Size of Registers
Bit is the atom of Computer Hardware:
contains either 0 or 1
True “alphabet” of computer hardware is 0, 1
Will eventually express MIPS instructions as
combinations of 0s and 1s (in Machine Language)
MIPS registers are 32 bits wide
MIPS calls this quantity a word
Some computers use 16-bit or 64-bit wide words
E.g., Intel 8086 (16-bit), MIPS64 (64-bit)
Some Examples of 64-bit processors
What is the first 64-bit Apple phone?
    iPhone 5s
What is the first 64-bit Android phone?
    HTC Desire 820
MIPS Register File
Holds thirty-two 32-bit registers
    Two read ports and
    One write port
Registers are
    Faster than main memory
     - But register files with more locations are slower (e.g., a 64 word
       file could be as much as 50% slower than a 32 word file)
     - Read/write port increase impacts speed quadratically
    Easier for a compiler to use
     - e.g., (A*B) - (C*D) - (E*F) can do multiplies in any order vs.
       stack
    Can hold variables so that
     - code density improves (since registers are named with fewer bits
       than a memory location)
[Figure: Register File with 32 locations of 32 bits each. Inputs: src1 addr
(5 bits), src2 addr (5 bits), dst addr (5 bits), write data (32 bits),
write control. Outputs: src1 data (32 bits), src2 data (32 bits).]
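The port structure of the register file can be mimicked in a few lines. This is a rough behavioral sketch in Python, not a hardware description and not taken from the slides; the class name RegisterFile and its method names are assumptions made for illustration. It models thirty-two 32-bit locations, two read ports, one write port gated by a write control signal, and register 0 fixed at constant 0.

```python
class RegisterFile:
    """Behavioral model: thirty-two 32-bit registers, 2 read ports, 1 write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, src1_addr, src2_addr):
        # Two read ports: both source operands come out of a single access.
        # Each address is a 5-bit value, so it must lie in 0..31.
        for addr in (src1_addr, src2_addr):
            if not 0 <= addr < 32:
                raise ValueError("register address must fit in 5 bits")
        return self.regs[src1_addr], self.regs[src2_addr]

    def write(self, dst_addr, data, write_control=True):
        # One write port; nothing is stored unless write_control is asserted.
        if not 0 <= dst_addr < 32:
            raise ValueError("register address must fit in 5 bits")
        if write_control and dst_addr != 0:  # register 0 stays constant 0
            self.regs[dst_addr] = data & 0xFFFFFFFF  # keep 32 bits


# sub $t0, $s1, $s2 expressed as one two-port read and one write
rf = RegisterFile()
rf.write(17, 5)            # $s1 = 5
rf.write(18, 3)            # $s2 = 3
a, b = rf.read(17, 18)
rf.write(8, a - b)         # $t0 = $s1 - $s2
print(rf.read(8, 0))       # (2, 0)
```

A three-operand arithmetic instruction maps directly onto this interface: one access on both read ports, then one access on the write port.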
Aside: MIPS Register Convention
Name         Register Number   Usage                     Preserve on call?
$zero        0                 constant 0 (hardware)     n.a.
$at          1                 reserved for assembler    n.a.
$v0 - $v1    2-3               returned values           no
$a0 - $a3    4-7               arguments                 no
$t0 - $t7    8-15              temporaries               no
$s0 - $s7    16-23             saved values              yes
$t8 - $t9    24-25             temporaries               no
$gp          28                global pointer            yes
$sp          29                stack pointer             yes
$fp          30                frame pointer             yes
$ra          31                return addr (hardware)    yes
MIPS Memory Access Instructions
MIPS has two basic data transfer instructions for
accessing memory
    lw $t0, 4($s3)    #load word from memory
    sw $t0, 8($s3)    #store word to memory
The data is loaded into (lw) or stored from (sw) a register
in the register file, selected by a 5 bit address
The memory address, a 32 bit address, is formed by
adding the contents of the base address register to the
offset value
    The offset is a 16-bit field, meaning access is limited to memory
    locations within a region of ±2^13 or 8,192 words (±2^15 or 32,768
    bytes) of the address in the base register
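The reach of a 16-bit offset follows from treating the field as a signed two's-complement number. The short Python sketch below works through that arithmetic; it is illustrative only, and the helper names sign_extend16 and effective_address are assumptions, not names from the slides.

```python
OFFSET_BITS = 16


def sign_extend16(field):
    """Interpret the low 16 bits of field as a two's-complement number."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


def effective_address(base, offset_field):
    """Address used by lw/sw: base register contents plus the signed offset."""
    return (base + sign_extend16(offset_field)) & 0xFFFFFFFF


# Reach of the offset on either side of the base register
max_forward_bytes = (1 << (OFFSET_BITS - 1)) - 1   # 32767
max_backward_bytes = 1 << (OFFSET_BITS - 1)        # 32768
words_each_way = max_backward_bytes // 4           # 8192 words of 4 bytes

print(max_forward_bytes, max_backward_bytes, words_each_way)
print(hex(effective_address(0x1000, 4)))        # 0x1004
print(hex(effective_address(0x1000, 0xFFFC)))   # 0xffc (offset field encodes -4)
```

The second call shows a negative offset: the field 0xFFFC sign-extends to -4, so the access lands one word below the base address.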
Machine Language - Load Instruction
Load/Store Instruction Format (I format):
    lw $t0, 24($s3)
    op = 35 | rs = 19 | rt = 8 | offset = 24 (decimal)
Address calculation:
    24 (decimal) + $s3 =
          . . . 0001 1000
        + . . . 1001 0100
          . . . 1010 1100 = 0x120040ac
[Figure: Memory with word addresses (hex) 0x00000000, 0x00000004,
0x00000008, 0x0000000c, ... up to 0xffffffff. $s3 holds 0x12004094; the
data at address 0x120040ac is loaded into $t0.]
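The worked example can be checked by building the instruction word from its fields, decoding it again, and repeating the address addition. This Python sketch is an illustration under stated assumptions: the function names decode_i and load_address and the dictionary standing in for the register file are hypothetical, while the field values and the content of $s3 are the ones given on the slide.

```python
def decode_i(word):
    """Split an I-format word into (op, rs, rt, signed 16-bit immediate)."""
    imm = word & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit field
        imm -= 0x10000
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, imm


def load_address(word, regs):
    """Memory address a load/store word refers to, given register contents."""
    _, rs, _, offset = decode_i(word)
    return (regs[rs] + offset) & 0xFFFFFFFF


# lw $t0, 24($s3): op = 35, rs = 19 ($s3), rt = 8 ($t0), offset = 24
lw_word = (35 << 26) | (19 << 21) | (8 << 16) | 24
regs = {19: 0x12004094}       # $s3, as on the slide

print(f"0x{lw_word:08x}")                       # 0x8e680018
print(decode_i(lw_word))                        # (35, 19, 8, 24)
print(f"0x{load_address(lw_word, regs):08x}")   # 0x120040ac
```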
Byte Addresses
Since 8-bit bytes are so useful, most architectures
address individual bytes in memory
    Alignment restriction - the memory address of a word must be
    on natural word boundaries (a multiple of 4 in MIPS-32)
Big Endian: leftmost byte is word address
    IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
Little Endian: rightmost byte is word address
    Intel 80x86, DEC Vax, DEC Alpha (Windows NT)
[Figure: one word drawn from msb (left) to lsb (right). Little endian
numbers the bytes 3 2 1 0, so byte 0 is the rightmost byte; big endian
numbers them 0 1 2 3, so byte 0 is the leftmost byte.]
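The two byte orders are easy to see by laying out one word as bytes. The Python sketch below uses the standard struct module for this; the sample value 0x0A0B0C0D and the helper name host_byte_order are arbitrary choices for illustration, not part of the slides.

```python
import struct
import sys

word = 0x0A0B0C0D

# Index 0 of each result is the byte stored at the word's own address.
big = struct.pack(">I", word)      # big endian: byte 0 is the leftmost (msb)
little = struct.pack("<I", word)   # little endian: byte 0 is the rightmost (lsb)

print(big.hex())      # 0a0b0c0d
print(little.hex())   # 0d0c0b0a


def host_byte_order():
    """Report the machine's byte order by looking at byte 0 of the word 1."""
    # "=I" packs with the native byte order; on a little endian machine the
    # least significant byte (value 1) lands at the lowest address.
    return "little" if struct.pack("=I", 1)[0] == 1 else "big"


print(host_byte_order() == sys.byteorder)  # True
```

The same idea carries over to C by storing an integer and inspecting its first byte through a character pointer.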
Aside: Loading and Storing Bytes
MIPS provides special instructions to move bytes
    lb $t0, 1($s3)    #load byte from memory
    sb $t0, 6($s3)    #store byte to memory
Machine format (I format), shown for sb $t0, 6($s3):
    op = 0x28 | rs = 19 | rt = 8 | 16 bit offset
What 8 bits get loaded and stored?
    load byte places the byte from memory in the rightmost 8 bits of
    the destination register
     - what happens to the other bits in the register?
    store byte takes the byte from the rightmost 8 bits of a register
    and writes it to a byte in memory
     - what happens to the other bits in the memory word?
Example of Loading and Storing Bytes
Given the following code sequence and memory state, what is
the state of the memory after executing the code?
    add $s3, $zero, $zero
    lb  $t0, 1($s3)
    sb  $t0, 6($s3)
Memory
    Word Address (Decimal)   Data
    24                       0x00000000
    20                       0x00000000
    16                       0x00000000
    12                       0x10000010
     8                       0x01000402
     4                       0xFFFFFFFF
     0                       0x009012A0
What value is left in $t0?
    $t0 = 0xFFFFFF90 (lb sign-extends the loaded byte 0x90; lbu would
    leave 0x00000090)
What word is changed in Memory and to what?
    mem(4) = 0xFFFF90FF
What if the machine was little Endian?
    $t0 = 0x00000012
    mem(4) = 0xFF12FFFF
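The example can be replayed on a small byte-addressed memory model. The Python sketch below is an approximation for illustration, with hypothetical helper names (make_memory, lb, sb, word_at, run). It models lb as sign-extending, which is how the MIPS lb instruction behaves, so the big-endian run leaves 0xFFFFFF90 in $t0; the zero-extended value 0x00000090 is what lbu would leave. The stored byte, and therefore the memory result, is the same either way.

```python
WORDS = [0x009012A0, 0xFFFFFFFF, 0x01000402, 0x10000010, 0, 0, 0]  # addresses 0..24


def make_memory(words, byteorder):
    """Lay out 32-bit words (lowest address first) as a flat byte array."""
    mem = bytearray()
    for w in words:
        mem += w.to_bytes(4, byteorder)
    return mem


def lb(mem, addr):
    """Load byte into the rightmost 8 bits and sign-extend to 32 bits."""
    b = mem[addr]
    return b | 0xFFFFFF00 if b & 0x80 else b


def sb(mem, addr, reg):
    """Store the rightmost 8 bits of reg; the other bytes of the word are untouched."""
    mem[addr] = reg & 0xFF


def word_at(mem, addr, byteorder):
    return int.from_bytes(mem[addr:addr + 4], byteorder)


def run(byteorder):
    mem = make_memory(WORDS, byteorder)
    s3 = 0                    # add $s3, $zero, $zero
    t0 = lb(mem, s3 + 1)      # lb  $t0, 1($s3)
    sb(mem, s3 + 6, t0)       # sb  $t0, 6($s3)
    return t0, word_at(mem, 4, byteorder)


for order in ("big", "little"):
    t0, mem4 = run(order)
    print(f"{order}: $t0 = 0x{t0:08X}, mem(4) = 0x{mem4:08X}")
# big: $t0 = 0xFFFFFF90, mem(4) = 0xFFFF90FF
# little: $t0 = 0x00000012, mem(4) = 0xFF12FFFF
```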
Speed of Registers vs. Memory
Given that
Registers: 32 words (128 Bytes)
Memory: Billions of bytes (2 GB to 8 GB on laptop)
and the RISC principle is…
Smaller is faster
How much faster are registers than memory??
About 100-500 times faster!
in terms of latency of one access
MIPS Immediate Instructions
Small constants are used often in typical code
Possible approaches?
    put "typical constants" in memory and load them
    create hard-wired registers (like $zero) for constants like 1
    have special instructions that contain constants!
        addi $sp, $sp, 4     #$sp = $sp + 4
        slti $t0, $s2, 15    #$t0 = 1 if $s2<15
Machine format (I format), shown for slti $t0, $s2, 15:
    op = 0x0A | rs = 18 | rt = 8 | immediate = 0x0F
The constant is kept inside the instruction itself!
    Immediate format limits values to the range +2^15 - 1 to -2^15
    what about upper 16 bits?
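Because the constant lives in a 16-bit field, an assembler has to reject values outside the signed range before packing the word. The Python sketch below shows one way this check and packing could look; the names OPCODES, fits_imm16 and encode_immediate are hypothetical, and only the two instructions from the slide are included (addi with opcode 0x08, slti with opcode 0x0A).

```python
OPCODES = {"addi": 0x08, "slti": 0x0A}

IMM_MIN = -(1 << 15)       # -32768
IMM_MAX = (1 << 15) - 1    # +32767


def fits_imm16(constant):
    """True if the constant fits the signed 16-bit immediate field."""
    return IMM_MIN <= constant <= IMM_MAX


def encode_immediate(name, rt, rs, constant):
    """I-format word for `name rt, rs, constant`."""
    if not fits_imm16(constant):
        raise ValueError(f"{constant} is outside {IMM_MIN}..{IMM_MAX}")
    # Negative constants are stored in two's complement in the low 16 bits.
    return (OPCODES[name] << 26) | (rs << 21) | (rt << 16) | (constant & 0xFFFF)


print(f"0x{encode_immediate('slti', 8, 18, 15):08x}")    # slti $t0, $s2, 15 -> 0x2a48000f
print(f"0x{encode_immediate('addi', 29, 29, 4):08x}")    # addi $sp, $sp, 4  -> 0x23bd0004
print(f"0x{encode_immediate('addi', 29, 29, -4):08x}")   # addi $sp, $sp, -4 -> 0x23bdfffc
```

A constant such as 40000 fails the range check, which is the case the upper 16 bits question points at.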
Aside: How About Larger Constants?
We'd also like to be able to load a 32 bit constant into a
register; for this we must use two instructions
A new "load upper immediate" instruction
    lui $t0, 1010101010101010
    Machine format (I format):
        op = 0x0F | rs = 0 | rt = 8 | immediate = 1010101010101010 (binary)
Then must get the lower order bits right, use
    ori $t0, $t0, 1010101010101010
        $t0 after lui:    1010101010101010 0000000000000000
        ori immediate:    0000000000000000 1010101010101010
        $t0 after ori:    1010101010101010 1010101010101010
Why can't addi be used as the second instruction for this 32 bit constant?
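The difference between ori and addi as the second instruction comes down to how each treats the 16-bit immediate: ori zero-extends it, addi sign-extends it. The Python sketch below models the three instructions on 32-bit values to make this visible. It is a simplified model with hypothetical function names, and it ignores the overflow trap that the real addi can raise.

```python
MASK32 = 0xFFFFFFFF


def lui(imm16):
    """Load upper immediate: immediate in the upper 16 bits, zeros below."""
    return ((imm16 & 0xFFFF) << 16) & MASK32


def ori(reg, imm16):
    """OR with the immediate zero-extended to 32 bits."""
    return (reg | (imm16 & 0xFFFF)) & MASK32


def addi(reg, imm16):
    """Add the immediate sign-extended to 32 bits (overflow trap not modeled)."""
    imm = imm16 & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return (reg + imm) & MASK32


half = 0b1010101010101010          # 0xAAAA, the pattern from the slide

t0 = lui(half)
print(f"0x{t0:08X}")               # 0xAAAA0000
print(f"0x{ori(t0, half):08X}")    # 0xAAAAAAAA, the intended constant
print(f"0x{addi(t0, half):08X}")   # 0xAAA9AAAA, upper half disturbed
```

Since bit 15 of the lower half is 1, addi adds 0xFFFFAAAA rather than 0x0000AAAA, which subtracts one from the upper half that lui had already set.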