Real instruction set architectures

Part 2: internal CPU storage, overview of Intel architectures
Big-Endian vs. Little-Endian: quick recap
• In a big-endian machine, the bytes of a multi-byte
data item are stored most significant byte first: the
MSB sits at the lowest address (the first byte
of the item, the “big end”)
• Little-endian is just the opposite: the least
significant byte sits at the lowest address (the
“little end”), and the MSB is stored last
• Note that, in either case, the bits within each byte
keep the same order, so a little-endian integer is
the big-endian integer with its bytes reversed, not
its bits
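A minimal sketch of the two layouts, using Python's built-in `int.to_bytes`. The sample value `0x12345678` and the helper name `byte_layout` are illustrative choices, not part of the original slides.

```python
def byte_layout(value, order, size=4):
    """Return the bytes of `value` in memory order (lowest address first)."""
    return [f"{b:02X}" for b in value.to_bytes(size, byteorder=order)]

value = 0x12345678  # 0x12 is the most significant byte

big = byte_layout(value, "big")
little = byte_layout(value, "little")

print("big-endian   :", big)     # ['12', '34', '56', '78']
print("little-endian:", little)  # ['78', '56', '34', '12']

# Reversing the *bits* of the whole word gives a different pattern:
# each byte keeps its internal bit order under both conventions.
bit_reversed = int(f"{value:032b}"[::-1], 2)
print(f"bit-reversed : {bit_reversed:08X}")
```

Only the order of whole bytes changes between the two layouts; the byte `0x78` is still `0x78` in either case.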
Byte ordering & data movement
• Network byte order (as used by TCP/IP) is big endian:
– Little endian machines must convert integers
(e.g. network device addresses) before they
can be passed over the network
– Little endian machines must also convert
integers retrieved from the network to the
native mode for the machine
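The conversion described above can be sketched with Python's standard `struct` module, where the `!` prefix selects network (big-endian) byte order. The port number, the address and the function names are made-up illustrations.

```python
import struct

def to_network(port, ipv4):
    """Pack a 16-bit port and a 32-bit IPv4 address in network byte order."""
    # "!" = network (big-endian) order, H = unsigned 16-bit, I = unsigned 32-bit
    return struct.pack("!HI", port, ipv4)

def from_network(data):
    """Unpack the same two fields back into native Python integers."""
    return struct.unpack("!HI", data)

wire = to_network(8080, 0xC0A80001)   # hypothetical port and address 192.168.0.1
print(wire.hex())                     # 1f90c0a80001

# A receiver that forgets to convert sees a different port number.
misread_port = int.from_bytes(wire[:2], "little")
print(from_network(wire), hex(misread_port))
```

Because the format string fixes the byte order explicitly, the same code gives the same bytes on big-endian and little-endian hosts.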
Byte ordering & data movement
• Any program that reads/writes file data
must be aware of byte ordering
– For example, Windows BMPs were developed
on a little endian machine; an application on a
big endian machine that reads a BMP must
reverse byte order
– Photoshop, JPEG, MacPaint, Sun raster files:
big endian
– GIF, PC Paintbrush, RTF: little endian
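As a rough illustration of the BMP case, the sketch below reads the 32-bit file-size field that follows the 2-byte `BM` signature in a BMP file header. The header bytes are built by hand for the example (no real file is read), and `read_bmp_file_size` is a hypothetical helper.

```python
import struct

def read_bmp_file_size(header, byte_order="<"):
    """Read the 32-bit file-size field that follows the 2-byte 'BM' signature."""
    # "<" = little endian, ">" = big endian; 2s = 2 raw bytes, I = unsigned 32-bit
    magic, size = struct.unpack_from(byte_order + "2sI", header, 0)
    if magic != b"BM":
        raise ValueError("not a BMP header")
    return size

# Hand-built 14-byte file header for a 70-byte BMP (remaining fields zeroed).
sample_header = b"BM" + (70).to_bytes(4, "little") + bytes(8)

correct = read_bmp_file_size(sample_header)      # little endian, as BMP stores it
wrong = read_bmp_file_size(sample_header, ">")   # same bytes read as big endian
print(correct, wrong)
```

Stating the byte order in the format string, rather than relying on the host's native order, is what keeps a reader portable across machines.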
Internal CPU storage
• 3 choices for data storage in CPU:
– Stack architecture:
• Use stack to execute instructions; operands stored
at top of stack
• No random access
– Accumulator architecture:
• Minimum of internal complexity; short instructions
• One (implicit) operand stored in accumulator
• Involves high volume of memory traffic
– General Purpose register: see next slide
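To make the first two choices concrete, here is a toy sketch that evaluates Z = X * Y + W on a stack machine and on an accumulator machine. The mnemonics (`PUSH`, `POP`, `LOAD`, `STORE`, `ADD`, `MULT`) and the interpreter functions are invented for illustration and do not model any real instruction set.

```python
def run_stack(program, mem):
    """Stack machine: arithmetic operands are implicit (top of stack)."""
    stack = []
    for op, *arg in program:
        if op == "PUSH":
            stack.append(mem[arg[0]])
        elif op == "POP":
            mem[arg[0]] = stack.pop()
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "MULT":
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown instruction {op}")
    return mem

def run_accumulator(program, mem):
    """Accumulator machine: one operand is always the (implicit) accumulator."""
    acc = 0
    for op, name in program:
        if op == "LOAD":
            acc = mem[name]
        elif op == "STORE":
            mem[name] = acc
        elif op == "ADD":
            acc += mem[name]      # every arithmetic step touches memory
        elif op == "MULT":
            acc *= mem[name]
        else:
            raise ValueError(f"unknown instruction {op}")
    return mem

stack_program = [("PUSH", "X"), ("PUSH", "Y"), ("MULT",),
                 ("PUSH", "W"), ("ADD",), ("POP", "Z")]
acc_program = [("LOAD", "X"), ("MULT", "Y"), ("ADD", "W"), ("STORE", "Z")]

stack_result = run_stack(stack_program, {"X": 3, "Y": 4, "W": 5, "Z": 0})
acc_result = run_accumulator(acc_program, {"X": 3, "Y": 4, "W": 5, "Z": 0})
print(stack_result["Z"], acc_result["Z"])  # 17 17
```

The accumulator program is shorter, but every one of its instructions names a memory location, which is where the high memory traffic comes from.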
General Purpose Register (GPR)
• Set (>1) of GPRs
• Most common architecture in use today
• Registers are faster than memory; easier for
other parts of the CPU to handle register data
(than data from memory)
• As hardware has become cheaper, the number of
registers in the CPU has tended to increase
• GPRs mean longer instructions, because
register(s) must be specified; takes more time to
fetch/decode longer instructions
Classification of GPR architectures
• Memory to memory (VAX):
– Instruction uses 2-3 operands, stored in memory
– Instructions can perform operations without involving
registers
• Register to memory (Intel, Motorola): at least
one operand must be in a register
• Load-store (SPARC, MIPS, Alpha, PowerPC):
Requires movement of data to registers before
any operations performed
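The difference between the memory-to-memory and load-store styles can be sketched by running Z = X * Y + W on two toy interpreters and counting data-memory accesses (instruction fetches are not counted). The instruction formats and register names are assumptions made for this example only.

```python
import operator

OPS = {"ADD": operator.add, "MULT": operator.mul}

def run_memory_memory(program, mem):
    """Three-operand memory-to-memory style: no registers involved."""
    accesses = 0
    for op, dest, src1, src2 in program:
        mem[dest] = OPS[op](mem[src1], mem[src2])
        accesses += 3                      # two reads and one write
    return accesses

def run_load_store(program, mem):
    """Load-store style: only LOAD and STORE touch memory."""
    regs = {}
    accesses = 0
    for op, *args in program:
        if op == "LOAD":                   # LOAD reg, address
            regs[args[0]] = mem[args[1]]
            accesses += 1
        elif op == "STORE":                # STORE reg, address
            mem[args[1]] = regs[args[0]]
            accesses += 1
        else:                              # ADD/MULT dest_reg, src_reg1, src_reg2
            regs[args[0]] = OPS[op](regs[args[1]], regs[args[2]])
    return accesses

mm_program = [("MULT", "Z", "X", "Y"), ("ADD", "Z", "Z", "W")]
ls_program = [("LOAD", "R1", "X"), ("LOAD", "R2", "Y"), ("MULT", "R1", "R1", "R2"),
              ("LOAD", "R2", "W"), ("ADD", "R1", "R1", "R2"), ("STORE", "R1", "Z")]

mm_mem = {"X": 3, "Y": 4, "W": 5, "Z": 0}
ls_mem = dict(mm_mem)
mm_accesses = run_memory_memory(mm_program, mm_mem)
ls_accesses = run_load_store(ls_program, ls_mem)
print(mm_mem["Z"], mm_accesses, ls_mem["Z"], ls_accesses)  # 17 6 17 4
```

In this small case the load-store version needs more instructions but fewer data-memory accesses, because the intermediate product stays in a register.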
Operand number / instruction length
• Instructions can be formatted 2 ways:
– Fixed-length: fast, but wastes space
– Variable-length: more complex to decode, but
saves space
• Real-life compromise often involves 2-3
instruction lengths (each length is fixed, but
the length varies from instruction to instruction)
Some historical architectures
• VAX: Digital’s line of midsize computers,
dominant in academia in the 70s and 80s
• Characteristics:
– Variable-length instructions; anywhere from 0 to 6
operands
– Full set of addressing modes: operands can be
anywhere; single instruction could take up to 31 bytes
– “High level” instructions: complexity built into
instruction set to make programmers’ task easier
– Extensive set of data types at machine level
Some historical architectures
• Motorola’s 68000 series
– Initial Apple Macintosh, early Sun
workstations
– Variable-length instructions: 0-2 operands
– Wide variety of addressing modes (but not as
many as VAX)
– Could not start an instruction until previous
one was completed
Intel architectures
• 8086 chip: first produced in 1978
– Handled 16-bit data, 20-bit addresses
– Could address 2^20 bytes (1 MB) of memory
– CPU split into 2 parts:
• Execution unit: contained GPRs & ALU
• Bus interface unit: included instruction queue,
segment registers, instruction pointer (SR & IP are
special-purpose registers)
8086 GPRs
• AX: accumulator
• BX: base register: could be used to extend
addressing
• CX: count register
• DX: data register
• Some 8086 instructions require use of
specific GPR, but in general, could use
any of these to hold data
Byte-level addressing
• Each GPR addressable at word or byte
level
• For example, AX divided into:
– AH (high-order byte, bits 8-15)
– AL (low-order byte, bits 0-7)
• Same for BX, CX, DX
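The word/byte split can be sketched with shifts and masks. The helper names below are invented for this example; the code models the register as a plain 16-bit integer.

```python
def split_ax(ax):
    """Return (AH, AL): the high and low bytes of a 16-bit AX value."""
    return (ax >> 8) & 0xFF, ax & 0xFF

def set_al(ax, value):
    """Write the low byte only; AH is left untouched."""
    return (ax & 0xFF00) | (value & 0xFF)

def set_ah(ax, value):
    """Write the high byte only; AL is left untouched."""
    return (ax & 0x00FF) | ((value & 0xFF) << 8)

ax = 0x12AB
ah, al = split_ax(ax)
print(f"AH={ah:02X} AL={al:02X}")       # AH=12 AL=AB
print(f"{set_al(ax, 0xFF):04X}")        # 12FF
print(f"{set_ah(ax, 0x00):04X}")        # 00AB
```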
Other registers in 8086
• Pointer registers:
– SP: stack pointer: used as offset into stack
– BP: base pointer: used to reference parameters
pushed on stack; marks the base of the current stack frame
– IP: holds address of next instruction (like Pep/8’s PC)
• Index registers:
– SI: source index; used as source pointer for string
operations
– DI: destination index; used as destination pointer for
string operations
– Both SI & DI sometimes used to supplement GPRs
Other registers in 8086
• Status flags register: bits indicate CPU
status & results of operations (overflow, carry,
sign, zero, etc.)
• Segment registers
– 8086 assembly language programs divided
into specialized blocks of code called
segments
– Each segment holds specific types of
information
8086 Segments
• Code segment: program itself
(instructions)
• Data segment: program data
• Stack segment: program’s runtime stack
(for procedure calls)
8086 segments
• To access information in a segment, had to specify
item’s offset from segment start
• CPU needed to store segment start addresses – these
were stored in segment registers:
– CS: code segment
– DS: data segment
– SS: stack segment
– ES: extra segment (used by some string operations to handle
memory addressing)
• Addresses specified in segment/offset form:
XXX:YYY
Where XXX is the value stored in a segment register, and YYY is
the offset from the start of the segment
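A short sketch of how a segment/offset pair becomes a 20-bit address. It relies on the standard 8086 rule that the 16-bit segment value is shifted left by 4 bits (multiplied by 16) before the 16-bit offset is added; the function name and sample values are illustrative.

```python
def physical_address(segment, offset):
    """Combine 16-bit segment and offset values into a 20-bit 8086 address."""
    # Shift the segment left 4 bits, add the offset, keep only 20 address bits.
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350 (same location, different pair)
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0 (wraps past the 1 MB limit)
```

Two 16-bit values give a 20-bit result, which is how a 16-bit chip reaches 1 MB of memory; a side effect is that many different segment/offset pairs name the same physical byte.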
Evolution of Intel platform
• Basic 8086 ISA used in many successor
chips:
– 8087
• Introduced in 1980
• Math co-processor for the 8086: added floating-point
instructions and a stack of eight 80-bit registers
– 80286
• Introduced 1982
• Could address up to 16 MB of memory
Evolution of Intel platform
• 80386
– Could address 4 GB of RAM
– 32-bit chip, with 32-bit bus, 32-bit word
– To achieve backward compatibility, Intel kept
same basic architecture, register sets
– Used new naming convention in registers:
EAX, EBX, etc. were 32-bit (extended)
versions of AX, BX, etc.; could still access
original 16-bit registers (and their byte
components) using original names
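The naming convention can be modelled as several views onto one 32-bit value. The class below is a hypothetical sketch (the name `Register32` and its properties are made up for illustration); it shows that writing the 16-bit or 8-bit name leaves the remaining bits of the extended register unchanged.

```python
class Register32:
    """One 32-bit register with 16-bit and 8-bit views, in the style of EAX/AX/AH/AL."""

    def __init__(self, value=0):
        self.eax = value & 0xFFFFFFFF

    @property
    def ax(self):                      # low 16 bits
        return self.eax & 0xFFFF

    @ax.setter
    def ax(self, value):
        self.eax = (self.eax & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def ah(self):                      # bits 8-15
        return (self.eax >> 8) & 0xFF

    @ah.setter
    def ah(self, value):
        self.eax = (self.eax & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def al(self):                      # bits 0-7
        return self.eax & 0xFF

    @al.setter
    def al(self, value):
        self.eax = (self.eax & 0xFFFFFF00) | (value & 0xFF)

reg = Register32(0x12345678)
print(f"{reg.ax:04X} {reg.ah:02X} {reg.al:02X}")  # 5678 56 78
reg.ax = 0xFFFF
print(f"{reg.eax:08X}")                           # 1234FFFF
reg.al = 0x00
print(f"{reg.eax:08X}")                           # 1234FF00
```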
Evolution of Intel platform
• 80486
– Added high-speed cache memory for performance
improvement
– Integrated math co-processor
• Pentium™ series
– Intel quit using numbers: couldn’t trademark them
– 32-bit registers, 64-bit bus
– Employed superscalar design, with multiple ALUs;
could run instructions in parallel, handling more than
one instruction per clock cycle
Pentium™ series
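• Pro added out-of-order (dynamic) execution and
improved branch prediction
• II added MMX (first seen on the Pentium MMX) to
the Pro design
• III added SSE: increased support for 3D
graphics using floating-point instructions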
• P4: 1.4 GHz and higher clock rates; 42
million transistors per CPU; 400MHz (and
faster) system bus, refinements to cache &
floating-point operations
Itanium
• Itanium: Intel’s first 64-bit chip (IA-64, a separate line from the Pentium™ series)
– Employs hardware emulator to maintain
backward compatibility with x86
– 4 integer ALUs, 4 floating-point ALUs, 4 cache
levels, 128 registers each for integers and
floating-point numbers
– Multiple miscellaneous registers for dealing
with efficient instruction loading for branching
– Addresses up to 16 GB of RAM
CISC vs. RISC
• CISC: complex instruction set computing
– Employed by Intel up through the original Pentium
– Pentium Pro, II and III used combined CISC/RISC: CISC
architecture with RISC-like core; CISC instructions
are translated into RISC-like micro-operations
• RISC: reduced instruction set computing
• CISC emphasizes complexity in hardware,
simplicity in software; RISC is opposite
• RISC has generally been considered superior in
performance, largely because simple, uniform
instructions are easier to pipeline