Transcript Week14_2

Other Processors
• Having learnt MIPS, we can now look at other major
processors.
• We cannot cover everything, so we will
pick out the interesting aspects.
ARM
• Advanced RISC Machine
• The major processor for mobile and
embedded electronics, such as the iPad
• Simple, low power, low cost
ARM
• One of its most interesting features is
conditional execution.
• That is, an instruction executes only if some
condition is true; otherwise it does
nothing (it is turned into a nop).
ARM
• A set of flags records the relation between two
numbers: gt, equal, lt.
– cmp Ri, Rj # set the flags depending on the
values in Ri and Rj
– subgt Ri, Ri, Rj # i = i - j if flag is gt
– sublt Ri, Ri, Rj # i = i - j if flag is lt
– bne Label # goto Label if flag is not equal
ARM
• How do we implement the following loop?
while (i != j) {
if (i > j)
i -= j;
else
j -= i;
}
ARM
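• In MIPS, assume i is in $s0, j in $s1:
Loop: beq $s0, $s1, Done # exit the loop when i == j
slt $t0, $s1, $s0 # $t0 = 1 if j < i, that is, i > j
beq $t0, $0, L1 # if not (i > j), go to the else part
sub $s0, $s0, $s1 # i -= j
j Loop
L1:
sub $s1, $s1, $s0 # j -= i
j Loop
Done: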
ARM
• In ARM,
Loop: cmp Ri, Rj
subgt Ri, Ri, Rj
sublt Rj, Rj, Ri
bne Loop
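The effect of conditional execution can be modelled in C. This is only a sketch: the function name gcd_predicated is made up for illustration, and it assumes both inputs are positive. The comparison is done once per iteration, like cmp, and each subtraction is guarded by the saved result, like subgt and sublt.

```c
/* Model of the ARM loop: the flags are set once by "cmp", then each
 * subtraction runs only if its condition holds (otherwise it is a nop).
 * Assumes i > 0 and j > 0; the loop does not terminate otherwise. */
int gcd_predicated(int i, int j)
{
    for (;;) {
        int gt = i > j;          /* cmp Ri, Rj : set the flags once */
        int lt = i < j;
        if (gt) i -= j;          /* subgt Ri, Ri, Rj */
        if (lt) j -= i;          /* sublt Rj, Rj, Ri (same flags, not recomputed) */
        if (!gt && !lt) break;   /* bne Loop : leave when the flags say equal */
    }
    return i;
}
```

Note that lt is not recomputed after i changes. This mirrors the hardware, where subgt and sublt leave the flags from cmp untouched.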
ARM
• Discussion: Given the MIPS hardware setup,
can we support conditional execution?
Introduction to Intel IA-32 and Intel 64
Instruction Set Architectures
History
9/27/2007 11:23:26 PM
week06-3.ppt
10
Recent Intel Processors
• The Intel® Pentium® 4 Processor Family (2000-2006)
• The Intel® Xeon® Processor (2001-2006)
• The Intel® Pentium® M Processor (2003-Current)
• The Intel® Pentium® Processor Extreme Edition (2005-2007)
• The Intel® Core™ Duo and Intel® Core™ Solo
Processors (2006-Current)
• The Intel® Xeon® Processor 5100 Series and Intel®
Core™2 Processor Family (2006-Current)
History
Recent Intel Processors
Intel Core 2 Duo Processors
Intel Core 2 Quad Processors
Bit and Byte Ordering
Intel Assembly
• Each instruction is represented by
label: mnemonic argument1, argument2, argument3
– Where label is an identifier that names the line
– A mnemonic is a reserved name for a class of
instruction opcodes which have the same
function.
– The operands argument1, argument2, and
argument3 are optional. There may be from zero
to three operands, depending on the instruction
Memory Modes
Addressing
• The processors use byte addressing
• Intel processors support segmented addressing
– Each address is specified by a segment register and byte
address within the segment
Intel Registers
Basic Program Execution Registers
• General purpose registers
– There are eight registers (note that they are not
quite general purpose as some instructions
assume certain registers)
• Segment registers
– They hold up to six segment selectors
• EIP register – Instruction pointer (offset of the next instruction to be executed)
• EFLAGS – Program status and control register
General Purpose and Segment Registers
General Purpose Registers
• EAX - Accumulator for operands and results data
• EBX - Pointer to data in the DS segment
• ECX - Counter for string and loop operations
• EDX - I/O pointer
• ESI - Pointer to data in the segment pointed to by the DS
register; source pointer for string operations
• EDI - Pointer to data (or destination) in the segment pointed
to by the ES register; destination pointer for string operations
• ESP - Stack pointer (in the SS segment)
• EBP - Pointer to data on the stack (in the SS segment)
Alternative General Purpose Register Names
Registers in Intel 64
Segment Registers
Operand Addressing
• Immediate addressing
– Maximum value allowed varies among
instructions and it can be 8-bit, 16-bit, or 32-bit
• Register addressing
– Register addressing depends on the mode (IA-32
or Intel 64)
Register Addressing
Memory Operand
• A memory operand is specified by a segment selector and an offset
Offset
• Displacement - An 8-, 16-, or 32-bit value.
• Base - The value in a general-purpose register.
• Index — The value in a general-purpose
register.
• Scale factor — A value of 2, 4, or 8 that is
multiplied by the index value.
Effective Address
Effective Address
• Common combinations
– Displacement
– Base
– Base + displacement
– (Index * scale) + displacement
– Base + index + displacement
– Base + (Index * scale) + displacement
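The offset computation can be written out as a small C function. This is a sketch for illustration only: the name effective_address is hypothetical, and 32-bit wraparound is assumed. A component that is absent from a given combination is passed as zero.

```c
#include <stdint.h>

/* Offset = Base + (Index * Scale) + Displacement, wrapped to 32 bits.
 * Pass 0 for any component that the addressing mode does not use. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}
```

For example, with an array of 4-byte elements starting 8 bytes past the address held in the base register, element 3 is at effective_address(base, 3, 4, 8).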
Addressing Mode Encoding
Fundamental Data Types
Example
Pointer Data Types
• Near pointer
• Far pointer
128-Bit SIMD Data Types
BCD Integers
• Intel also supports BCD integers, where each
digit (0-9) is represented by 4 bits
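As an illustration of the 4-bits-per-digit encoding, the sketch below converts a two-digit value to and from one byte of packed BCD (two digits per byte). The function names are hypothetical, and the input to to_packed_bcd is assumed to be in the range 0 to 99.

```c
#include <stdint.h>

/* Pack a value in the range 0..99 into one byte of packed BCD:
 * tens digit in the high 4 bits, ones digit in the low 4 bits. */
uint8_t to_packed_bcd(unsigned value)
{
    return (uint8_t)(((value / 10) << 4) | (value % 10));
}

/* Recover the decimal value from one packed-BCD byte. */
unsigned from_packed_bcd(uint8_t bcd)
{
    return (bcd >> 4) * 10u + (bcd & 0x0Fu);
}
```

Written in hexadecimal, the packed byte reads like the decimal number: 42 becomes 0x42.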
Floating Point Numbers
General Purpose Instructions
• Data transfer instructions
Data Transfer Instructions
Data Transfer Instructions
Binary Arithmetic Instructions
Decimal Arithmetic Instructions
Logical Instructions
Shift and Rotate Instructions
Bit and Byte Instructions
Bit and Byte Instructions
Control Transfer Instructions
String Instructions
I/O Instructions
• These instructions move data between the
processor’s I/O ports and a register or memory
Enter and Leave Instructions
• These instructions provide machine-language support
for procedure calls in block structured languages
Segment Register Instructions
• The segment register instructions allow far pointers
(segment addresses) to be loaded into the segment
registers
Procedure Call Types
• The processor supports procedure calls in
the following two different ways:
– CALL and RET instructions.
– ENTER and LEAVE instructions, in conjunction
with the CALL and RET instructions
Stack
Calling Procedures Using CALL and RET
• Near call (within the current code segment)
• Near return
Far Call and Far Return
• Far call
• Far return
Stack During Call and Return
Parameter Passing
• Passing parameters through the general-purpose
registers
– Can pass up to six parameters by copying them into
the general-purpose registers (all except ESP and EBP)
• Passing parameters on the stack
– Stack can be used to pass a large number of parameters
and also return a large number of values
• Passing parameters in an argument list
– Place the parameters in an argument list
– A pointer to the argument list can then be passed
to the called procedure
Saving Procedure State Information
• The processor does not save general purpose
registers
– A calling procedure should explicitly save the values in any
of the general-purpose registers that it will need when it
resumes execution after a return
– One can use PUSHA and POPA to save and restore all the
general purpose registers (except ESP)
Calls to Other Privilege Levels
Stack For Calling and Called Procedure
Procedure Calls For Block-structured Languages
• ENTER and LEAVE instructions automatically
create and release, respectively, stack frames
for called procedures
– The ENTER instruction creates a stack frame
compatible with the scope rules typically used in
block-structured languages
– The LEAVE instruction, which does not have any
operands, reverses the action of the previous
ENTER instruction
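What ENTER and LEAVE do to ESP and EBP can be modelled on a toy stack in C. This is a sketch under stated assumptions: nesting level 0 only, a word-addressed stack with no bounds checking, and the struct and function names are made up for illustration.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Toy machine state: a word-addressed stack that grows downward. */
struct cpu {
    uint32_t stack[STACK_WORDS];
    uint32_t esp;   /* index of the top-of-stack word */
    uint32_t ebp;   /* index of the current frame base */
};

static void push(struct cpu *c, uint32_t v) { c->stack[--c->esp] = v; }
static uint32_t pop(struct cpu *c)          { return c->stack[c->esp++]; }

/* ENTER locals, 0 behaves like: push ebp; mov ebp, esp; sub esp, locals */
void enter(struct cpu *c, uint32_t local_words)
{
    push(c, c->ebp);          /* save the caller's frame base */
    c->ebp = c->esp;          /* the new frame starts here */
    c->esp -= local_words;    /* reserve space for local variables */
}

/* LEAVE behaves like: mov esp, ebp; pop ebp */
void leave(struct cpu *c)
{
    c->esp = c->ebp;          /* discard the local variables */
    c->ebp = pop(c);          /* restore the caller's frame base */
}
```

After an enter followed by a leave, both esp and ebp hold the values they had before the call to enter.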
ENTER Instruction
IA-32 and Intel 64 Instruction Format
Examples of Instruction Formats
ADD Instructions
ADD Instructions
Add Instruction Description
SCAS/SCASB/SCASW/SCASD—Scan String
SIMD in IA-32 and Intel 64
• To improve performance, Intel adopted SIMD (single
instruction multiple data) instructions
– MMX technology (Pentium II processor family)
– SSE
10/7/2007 9:37:48 PM
week07-1.ppt
73
MMX
• MMX introduced
– Eight new 64-bit data registers,
called MMX registers
– Three new packed data types:
• 64-bit packed byte integers (signed
and unsigned)
• 64-bit packed word integers
(signed and unsigned)
• 64-bit packed double word
integers (signed and unsigned)
– Instructions that support the
new data types
MMX
• Packed integer types allow operations to be applied
on multiple integers
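A packed byte addition can be modelled in portable C. This is a sketch of the behaviour only, not how the hardware does it: the function name packed_add_bytes is hypothetical, and it models the wraparound form of the operation (as in the MMX PADDB instruction), where each byte lane wraps on its own and no carry crosses into the next lane.

```c
#include <stdint.h>

/* Add eight unsigned bytes packed in a 64-bit word, lane by lane,
 * with wraparound inside each lane (no carry into the next byte). */
uint64_t packed_add_bytes(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t y = (uint8_t)(b >> (8 * lane));
        uint8_t sum = (uint8_t)(x + y);          /* wraps modulo 256 */
        result |= (uint64_t)sum << (8 * lane);
    }
    return result;
}
```

The loop here runs eight times; the point of the packed instruction is that the hardware performs all eight lane additions in a single instruction.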
SSE
• SSE introduced eight 128-bit data registers (called
XMM registers)
– In 64-bit mode, 16 XMM registers (XMM0 - XMM15) are available, all 128 bits wide
– The 128-bit packed single-precision floating-point data
type, which allows four single-precision operations to be
performed simultaneously
• They can be used in parallel with MMX registers
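To make "four single-precision operations performed simultaneously" concrete, the sketch below adds two groups of four floats. The function name add4 is hypothetical. When the compiler targets SSE it uses the intrinsics from xmmintrin.h; otherwise it falls back to a plain loop, so the example stays portable.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, ... */
#endif

/* out[k] = a[k] + b[k] for k = 0..3 */
void add4(const float *a, const float *b, float *out)
{
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a);              /* load four floats, any alignment */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));   /* one packed add covers all four lanes */
#else
    for (size_t k = 0; k < 4; k++)            /* scalar fallback for non-SSE targets */
        out[k] = a[k] + b[k];
#endif
}
```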
SSE Execution Environment
XMM Registers
• In 64-bit mode, eight additional 128-bit
registers are also available (XMM8 - XMM15)
SSE Data Type
• SSE extensions introduced one new data type
– 128-Bit Packed Single-Precision Floating-Point Data Type
• SSE2 introduced five more data types
Packed and Scalar Double-Precision Floating-Point Operations
SSE Instructions
• SSE Data Transfer Instructions
SSE Packed Arithmetic Instructions
SSE Packed Arithmetic Instructions
SSE Comparison, Logical, and Shuffle Instructions
SSE2 Instructions
SSE2 128-Bit SIMD Integer Instructions
Horizontal Addition/Subtraction
Horizontal Data Movements
Conversion Between Different Types
Using MMX/SSE Instructions in C/C++ Programs
• Data types for MMX and SSE instructions
– These types are defined in GCC header files, for example:
• /usr/lib/gcc/i386-redhat-linux/3.4.3/include/mmintrin.h
• /usr/lib/gcc/i386-redhat-linux/3.4.3/include/pmmintrin.h
• /usr/lib/gcc/i386-redhat-linux/3.4.3/include/emmintrin.h
Built-in Functions
• Built-in functions are C-style functional interfaces
to MMX/SSE instructions
See http://developer.apple.com/documentation/DeveloperTools/gcc-4.0.1/gcc/X86-Built_002din-Functions.html
Intel MMX/SSE Intrinsics
• Intrinsics are C/C++ functions and procedures for
MMX/SSE instructions
– With intrinsics, one can use these instructions
from C/C++ without writing assembly directly
– In general, there is a one-to-one correspondence between
MMX/SSE instructions and intrinsics
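The correspondence between intrinsics and instructions can be seen in a short SSE2 integer example. This is a sketch: the function name add4_i32 is hypothetical, and the instruction named in each comment is the one the intrinsic usually maps to (the compiler is free to pick an equivalent encoding). A plain loop is used when SSE2 is not available.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: __m128i, _mm_add_epi32, ... */
#endif

/* out[k] = a[k] + b[k] for four 32-bit integers */
void add4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128((const __m128i *)a);   /* usually MOVDQU */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);   /* usually MOVDQU */
    __m128i sum = _mm_add_epi32(va, vb);                /* usually PADDD  */
    _mm_storeu_si128((__m128i *)out, sum);              /* usually MOVDQU */
#else
    for (int k = 0; k < 4; k++)                         /* portable fallback */
        out[k] = a[k] + b[k];
#endif
}
```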
GCC Inline Assembly
• GCC inline assembly allows us to embed
assembly code in C/C++ functions
– GCC provides a way to specify input and output
operands as C variables
– Basic inline assembly
– Extended inline assembly
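A minimal extended inline assembly example, assuming a GCC-compatible compiler on an x86 target (the function name add_asm is made up for illustration). The operand lists tie the C variables a and b to registers chosen by the compiler. On other targets the sketch falls back to plain C so that it still compiles.

```c
/* Returns a + b. On x86 with a GCC-compatible compiler the addition is
 * done by an ADD instruction written in extended inline assembly
 * (AT&T syntax: source operand first, destination second). */
int add_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("addl %1, %0"
             : "+r" (a)      /* output operand %0: a is both read and written */
             : "r" (b)       /* input operand %1: b in any general register */
             : "cc");        /* clobber: the instruction changes the flags */
    return a;
#else
    return a + b;            /* portable fallback for other targets */
#endif
}
```

Basic inline assembly has no operand lists, so it cannot refer to C variables this way; that is the main reason to prefer the extended form.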
GCC Inline Assembly
• Some examples
GCC Inline Assembly
GCC Inline Assembly