The Arrival of the 64bit CPUs


The Arrival of the 64bit CPUs - Itanium
Mr. ชนินท์ วงษ์ใหญ่, student ID 43650076
Mr. สุนัย สุขเอนก, student ID 43650340
What is the Intel® Itanium™ Architecture?

EPIC (Explicitly Parallel Instruction Computing)
IA-64
Compatibility with the IA-32 instruction set

CPU speed: 800 MHz, 6.4 GFLOPS

3-level cache
  - On-die: L1 128 KB, L2 256 KB
  - L3: 4 MB

2^64-byte address space

25.4 million transistors
Data Types



Integer: 1, 2, 4, and 8 bytes
Floating-point: single, double, and double-extended formats
Pointers: 8 bytes
Intel® Itanium™ Instruction Format
[(qp)] mnemonic[.comp1][.comp2] dests = srcs
Simple instruction:
    add r1 = r2, r3
Predicated instruction:
    (p4) add r1 = r2, r3
Instruction with immediate:
    add r1 = r2, r3, 1
Instruction with completer:
    cmp.eq p3 = r2, r4
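
To make the syntax line above concrete, here is a minimal Python sketch that splits an instruction into its parts. The regular expression and the function name parse_instruction are illustrative assumptions; the pattern covers only the forms shown on this slide and is not a full IA-64 assembler grammar.

```python
import re

# Hypothetical pattern for: [(qp)] mnemonic[.comp1][.comp2] dests = srcs
INSTRUCTION = re.compile(
    r"^\s*(?:\((?P<qp>p\d+)\)\s*)?"        # optional qualifying predicate, e.g. (p4)
    r"(?P<mnemonic>[a-z][a-z0-9]*)"        # mnemonic, e.g. add or cmp
    r"(?P<completers>(?:\.[a-z0-9]+)*)"    # zero or more completers, e.g. .eq
    r"\s+(?P<dests>[^=]+?)\s*=\s*(?P<srcs>.+?)\s*$"
)


def parse_instruction(text):
    """Split one instruction into predicate, mnemonic, completers, dests, srcs."""
    match = INSTRUCTION.match(text)
    if match is None:
        raise ValueError(f"not in the expected format: {text!r}")

    def operands(field):
        return [part.strip() for part in field.split(",")]

    return {
        "qp": match.group("qp"),  # None when the instruction is not predicated
        "mnemonic": match.group("mnemonic"),
        "completers": [c for c in match.group("completers").split(".") if c],
        "dests": operands(match.group("dests")),
        "srcs": operands(match.group("srcs")),
    }


if __name__ == "__main__":
    for line in ["add r1 = r2, r3", "(p4)add r1 = r2, r3",
                 "add r1 = r2, r3, 1", "cmp.eq p3 = r2, r4"]:
        print(parse_instruction(line))
```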
Memory Organization
The architecture defines a single, uniform, linear address space of 2^64 bytes.
  - Single means that data and instructions share the same memory range.
  - Uniform means that no address regions have predefined functionality.
  - Linear means that the address space contains no segments; all 2^64 bytes are consecutive.

Supports two byte orders: little-endian and big-endian.
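
As a small illustration of the two byte orders, the Python sketch below lays out the same 8-byte integer both ways. The value and variable names are arbitrary choices for the example; nothing here is specific to Itanium hardware.

```python
value = 0x0102030405060708  # an arbitrary 8-byte integer

# Little-endian stores the least significant byte at the lowest address.
little = value.to_bytes(8, byteorder="little")
# Big-endian stores the most significant byte at the lowest address.
big = value.to_bytes(8, byteorder="big")

print(little.hex())  # 0807060504030201
print(big.hex())     # 0102030405060708

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(little, byteorder="little") == value
assert int.from_bytes(big, byteorder="big") == value

# Size of the linear address space in bytes.
ADDRESS_SPACE_BYTES = 2 ** 64
```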
Instruction-Level Parallelism (ILP)

The architecture supports ILP by:
Enabling the compiler or assembly writer to explicitly indicate parallelism.
Providing a three-instruction-wide word, called a bundle, that facilitates parallel processing of instructions.
Providing a large number of registers, so that different variables can use different registers and register contention is avoided.
Instruction Groups

An instruction group is a set of instructions that have no read-after-write or write-after-write dependencies between them and may execute in parallel.
An instruction group must contain at least one instruction; the number of instructions in a group is not limited.
Instruction groups are delimited in the code by cycle breaks (stops, written ;; in assembly). An instruction group may also end dynamically at run time at a taken branch.
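
The grouping rule can be sketched in Python as a greedy pass that starts a new group whenever an instruction reads or writes a register already written in the current group. The tuple representation and the name split_into_groups are assumptions made for this example, and the sketch ignores the special cases a real assembler has to handle.

```python
def split_into_groups(instructions):
    """Split (dests, srcs) register tuples into dependency-free groups.

    A new group starts when an instruction has a read-after-write (RAW) or
    write-after-write (WAW) dependency on the current group.
    """
    groups, current, written = [], [], set()
    for dests, srcs in instructions:
        raw = any(src in written for src in srcs)
        waw = any(dest in written for dest in dests)
        if current and (raw or waw):
            groups.append(current)          # this is where a cycle break would go
            current, written = [], set()
        current.append((dests, srcs))
        written.update(dests)
    if current:
        groups.append(current)
    return groups


program = [
    (["r1"], ["r2", "r3"]),  # add r1 = r2, r3
    (["r4"], ["r5", "r6"]),  # add r4 = r5, r6   (independent of the first)
    (["r7"], ["r1", "r4"]),  # add r7 = r1, r4   (RAW on r1 and r4)
    (["r7"], ["r8", "r9"]),  # add r7 = r8, r9   (WAW on r7)
]
print([len(group) for group in split_into_groups(program)])  # [2, 1, 1]
```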
Instruction Bundles

Instruction groups are composed of instructions contained in bundles.
Each bundle contains three instructions and a template field, which are set during code generation by the compiler or the assembler.
The template allows the processor to dispatch all three instructions in parallel.
Bundles are aligned at 16-byte boundaries.
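
A bundle can be modeled in Python as one 128-bit integer. The field widths used here (a 5-bit template in the low bits followed by three 41-bit instruction slots) come from the IA-64 architecture definition rather than from the slide, and the function names are hypothetical.

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slot0, slot1, slot2):
    """Pack a template and three instruction slots into a 128-bit integer."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    bundle = template
    for index, slot in enumerate((slot0, slot1, slot2)):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each slot must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Return (template, slot0, slot1, slot2) from a 128-bit bundle."""
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    )
    return (bundle & TEMPLATE_MASK,) + slots


def is_bundle_aligned(address):
    """Bundles are 16 bytes long and start on 16-byte boundaries."""
    return address % 16 == 0


bundle = pack_bundle(0x10, 1, 2, 3)
print(unpack_bundle(bundle))      # (16, 1, 2, 3)
print(bundle.bit_length() <= 128)  # True
```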
Registers

128 General registers

128 Floating-point registers

64 Predicate registers

8 Branch registers

128 Application registers

Instruction Pointer (IP) register
Register Validity

Register validity bits enable propagating the validity or invalidity of a speculative load result.
Each general register has a corresponding NaT (Not a Thing) bit.
Floating-point registers use a special instance of pseudo-zero, called NaTVal.
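
The propagation idea can be sketched in Python by pairing each register value with a NaT flag. The class GeneralRegister and the function add are illustrative names, and the sketch models only the general registers, not NaTVal.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class GeneralRegister:
    value: int = 0
    nat: bool = False  # True marks the value as "Not a Thing" (invalid)


def add(a, b):
    """64-bit add whose result is invalid if either source is invalid."""
    return GeneralRegister((a.value + b.value) & MASK64, a.nat or b.nat)


good = GeneralRegister(5)
failed_load = GeneralRegister(0, nat=True)  # result of a failed speculative load

print(add(good, GeneralRegister(7)))  # GeneralRegister(value=12, nat=False)
print(add(good, failed_load).nat)     # True
```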
Branching in the Intel® Itanium™ Architecture

Relative direct branches, using a 21-bit displacement that is added to the instruction pointer of the bundle containing the branch.
Indirect branches, using 64-bit addresses in the branch registers.
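
The relative branch calculation can be sketched in Python as follows. One detail is taken from the architecture definition rather than the slide: the 21-bit displacement is sign-extended and counted in 16-byte bundles, so it is shifted left by 4 before being added. The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def relative_branch_target(bundle_ip, imm21):
    """Target of an IP-relative branch located in the bundle at bundle_ip."""
    if bundle_ip % 16 != 0:
        raise ValueError("bundle addresses are 16-byte aligned")
    displacement = sign_extend(imm21 & 0x1FFFFF, 21) << 4  # bundles to bytes
    return (bundle_ip + displacement) & ((1 << 64) - 1)


print(hex(relative_branch_target(0x1000, 1)))         # 0x1010 (one bundle forward)
print(hex(relative_branch_target(0x1000, 0x1FFFFF)))  # 0xff0  (one bundle back)
```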
Predication
Allows the processor to execute all possible branch paths in parallel.
Instruction 1
Instruction 2
Instruction 3 (branch)
Branch path 1 (predicate P1):
  Instruction 4 (P1)
  Instruction 5 (P1)
  Instruction 6 (P1)
Branch path 2 (predicate P2):
  Instruction 7 (P2)
  Instruction 8 (P2)
  Instruction 9 (P2)
Predication
The compiler rearranges the instructions in this order, pairing instructions 4 and 7, 5 and 8, and 6 and 9 for parallel execution. The instructions are packed into 128-bit long instruction words.
Instruction 1, Instruction 2, Instruction 3 (branch)
Instruction 4 (P1), Instruction 7 (P2)
Instruction 5 (P1), Instruction 8 (P2)
Instruction 6 (P1), Instruction 9 (P2)
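
The effect of predication can be sketched in Python: instructions from both paths are issued, but only those whose predicate is true update the registers. The function run_predicated, the register names, and the operations are hypothetical and chosen only to show the mechanism.

```python
def run_predicated(x, y):
    """Compute |x - y| into r1 and max(x, y) into r2 without branching."""
    # A compare sets two complementary predicates.
    p1 = x > y
    p2 = not p1
    registers = {}
    # (predicate, destination, operation): the two paths are interleaved.
    schedule = [
        (p1, "r1", lambda: x - y), (p2, "r1", lambda: y - x),
        (p1, "r2", lambda: x),     (p2, "r2", lambda: y),
    ]
    for predicate, dest, operation in schedule:
        value = operation()      # every instruction is issued
        if predicate:            # only a true predicate lets it update state
            registers[dest] = value
    return registers


print(run_predicated(9, 2))  # {'r1': 7, 'r2': 9}
print(run_predicated(2, 9))  # {'r1': 7, 'r2': 9}
```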
Reduced Memory Access Costs

Hiding memory latency enables the processor to bring in data in time and avoid stalling.
Memory latency is hidden through the use of:
  - Data speculation: the execution of an operation before its data dependency is resolved.
  - Control speculation: the execution of an instruction before its control dependency is resolved.
Hiding Memory Latencies

With speculative loads, error/exception detection is deferred until the final result is actually required:
If no error/exception is detected, the latency is hidden.
If an error/exception is detected, the memory accesses and dependent instructions must be redone by an exception handler.
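
A minimal Python sketch of this deferral, assuming memory is modeled as a dictionary and a missing address stands in for a faulting access. The names speculative_load and checked_use are illustrative and do not correspond to real instructions or APIs.

```python
def speculative_load(memory, address):
    """Return (value, nat); a bad address sets the NaT flag instead of raising."""
    if address in memory:
        return memory[address], False
    return 0, True  # the value is meaningless; the flag marks it invalid


def checked_use(memory, address, branch_taken):
    # The load is hoisted above the branch, so it runs even if unused.
    value, nat = speculative_load(memory, address)
    if not branch_taken:
        return None  # a deferred fault on an unused value is simply dropped
    if nat:
        # The check failed: redo the access non-speculatively. A bad address
        # now raises KeyError, standing in for the real exception.
        value = memory[address]
    return value + 1  # the dependent instruction


memory = {0x100: 41}
print(checked_use(memory, 0x100, True))   # 42
print(checked_use(memory, 0x999, False))  # None
```

Calling checked_use(memory, 0x999, True) raises KeyError, which is the point at which the deferred error finally surfaces.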
Speculative loading
Fetch data before the program needs it, even beyond a branch that has not yet executed.
Instruction 1
Instruction 2
Speculative load (the load from instruction 8, moved above the branch)
Instruction 3 (branch)
Instruction 4
Instruction 5
Instruction 6
Instruction 7
Instruction 8 (load data), replaced by a speculative check
Instruction 9 (use data)
Floating Point and Multimedia

Support for single, double, and double-extended IEEE formats.
Support for multimedia, or data-parallel, applications:
  - Integer data and SIMD computations, similar to MMX™ technology.
  - Floating-point data and SIMD-FP computations, similar to the IA-32 Streaming SIMD Extensions.
Itanium™ Architecture Floating-point Features

128 floating-point registers.
A multiply-and-accumulate instruction (fma) with four different floating-point registers for operands (f = a * b + c). It performs a multiply and an add in the same number of cycles as a single add or multiply instruction.
Load and store to and from memory. A single load can also fill two floating-point registers.
Data transfer between floating-point and general registers.
A status register with multiple status fields, which enables speculation on floating-point operations.
Quick conversion from integer to floating-point and vice versa.
Rotating floating-point registers.
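
Besides saving cycles, a fused multiply-add rounds only once; that property is not stated on the slide but is part of how fma is defined. The Python sketch below emulates the single rounding with exact rational arithmetic and compares it with an ordinary multiply followed by an add. The function names and input values are chosen for illustration.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """a * b + c computed exactly, then rounded once to a float."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a, b, c):
    """Ordinary float arithmetic: the product is rounded before the add."""
    return a * b + c


a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

# The exact product is 1 - 2**-54, which rounds to 1.0 as a double,
# so the separate version loses the small term entirely.
print(fused_multiply_add(a, b, c) == -(2.0 ** -54))  # True
print(separate_multiply_add(a, b, c))                # 0.0
```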
Multimedia Support

Integer multimedia support is provided by a set of instructions that treat the general registers as 8 x 8-bit, 4 x 16-bit, or 2 x 32-bit elements, and by specific instructions for operating on these data elements.
This support is semantically compatible with MMX™ technology.
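
Treating a 64-bit register as independent lanes can be sketched in Python as a wrap-around parallel add. The function parallel_add is a hypothetical model for this example; it shows modular (non-saturating) arithmetic only.

```python
def parallel_add(a, b, lane_bits):
    """Add two 64-bit values lane by lane; carries never cross lanes."""
    if lane_bits not in (8, 16, 32):
        raise ValueError("lanes are 8, 16, or 32 bits wide")
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        lane = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane & mask) << shift  # drop the carry out of the lane
    return result


# Four 16-bit lanes: the 0xFFFF lane wraps to 0 without disturbing its neighbor.
print(hex(parallel_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001, 16)))
# 0x2000000040005
```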
Chipset

AL460GX (supports 4 CPUs)
BS460GX (supports 2 CPUs)
400 MHz bus speed
DDR-SDRAM or RAMBUS memory only
FSB slot supporting the Itanium CPU
Board with 10 PCI slots
References

http://www.pctechguide.com/02proc3.htm#Itanium
http://www.sysopt.com/articles/64bit/index.html
http://developer.intel.com/design/ia-64
Any questions?