Document - Oman College of Management & Technology
Assembly Language for Intel-Based
Computers, 4th Edition
Kip R. Irvine
Chapter 2: IA-32 Processor
Architecture
Slides prepared by Kip R. Irvine
Revision date: 07/21/02
(c) Pearson Education, 2002. All rights reserved. You may modify and copy this slide show for your personal use, or for
use in the classroom, as long as this copyright statement, the author's name, and the title are not changed.
Chapter Overview
• General Concepts
• IA-32 Processor Architecture
• IA-32 Memory Management
• Components of an IA-32 Microcomputer
• Input-Output System
Irvine, Kip R. Assembly Language for Intel-Based Computers, 2003.
General Concepts
• Basic microcomputer design
• Instruction execution cycle
• Reading from memory
• How programs run
Basic Microcomputer Design
• The clock synchronizes CPU operations with the other computer components.
• The control unit (CU) coordinates the sequence of execution steps.
• The ALU performs arithmetic and bitwise processing.
• The memory unit (RAM) is used to store user data and programs.
• I/O devices are used to communicate between the system and the user.
• The system bus consists of data, address, and control lines, which transfer data, addresses, and control signals between the CPU and the other units.
[Figure: block diagram of the Central Processor Unit (CPU), containing registers, ALU, CU, and clock, connected to the Memory Storage Unit, I/O Device #1, and I/O Device #2 by the data bus, control bus, and address bus.]
Clock
• synchronizes all CPU and BUS operations
• machine (clock) cycle measures time of a single
operation
• clock is used to trigger events
[Figure: square-wave clock signal alternating between 1 and 0, with one cycle marked.]
Instruction Execution Cycle
• Fetch
• Decode
• Fetch operands
• Execute
• Store output
[Figure: instruction execution cycle. The PC points into the program (I-1, I-2, I-3, I-4) in memory; an instruction is fetched into the instruction register and decoded; operands op1 and op2 are read from memory or registers; the ALU executes and sets the flags; the output is written to registers or memory.]
Instruction Execution Cycle
1- Fetch: bring the current instruction from the user program in memory and store it in the instruction register inside the CPU. The PC (program counter) is a register that contains the address of the next instruction to be executed.
2- Decode: the control unit recognizes the instruction type to determine the steps required to execute that instruction.
3- Fetch operands: bring the values that the instruction operates on.
4- Execute: the control unit directs the ALU to perform the operation.
5- Store output: write the result of the operation into either memory or the registers.
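The five steps can be sketched as a small simulator. This is only an illustration in Python, not IA-32 code: the three-instruction program, the tuple instruction format, and the register names r0 to r2 are hypothetical.

```python
# Toy machine: each instruction is (opcode, destination, source1, source2).
# The instruction format and register names are invented for illustration.
program = [
    ("LOAD", "r0", 5, None),       # r0 = 5
    ("LOAD", "r1", 7, None),       # r1 = 7
    ("ADD",  "r2", "r0", "r1"),    # r2 = r0 + r1
]
registers = {"r0": 0, "r1": 0, "r2": 0}

def run(program, registers):
    pc = 0                                      # program counter
    while pc < len(program):
        instruction_register = program[pc]      # 1. fetch
        pc += 1                                 # PC now addresses the next instruction
        opcode, dest, src1, src2 = instruction_register   # 2. decode
        if opcode == "LOAD":
            result = src1                       # 3. operand is an immediate value
        elif opcode == "ADD":
            op1, op2 = registers[src1], registers[src2]   # 3. fetch operands
            result = op1 + op2                  # 4. execute (the ALU's job)
        else:
            raise ValueError(f"unknown opcode {opcode}")
        registers[dest] = result                # 5. store output
    return registers

run(program, registers)
print(registers["r2"])   # 12
```

A real CPU overlaps these steps in hardware; the loop only makes their order explicit.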
Multi-Stage Pipeline
• Pipelining makes it possible for a processor to execute instructions in parallel
• Instruction execution is divided into discrete stages
Example of a nonpipelined processor. Many wasted cycles.

Cycle   S1    S2    S3    S4    S5    S6
  1     I-1
  2           I-1
  3                 I-1
  4                       I-1
  5                             I-1
  6                                   I-1
  7     I-2
  8           I-2
  9                 I-2
 10                       I-2
 11                             I-2
 12                                   I-2

For k stages and n instructions, the number of required cycles is: k*n
Pipelined Execution
• More efficient use of cycles, greater throughput of instructions:

Cycle   S1    S2    S3    S4    S5    S6
  1     I-1
  2     I-2   I-1
  3           I-2   I-1
  4                 I-2   I-1
  5                       I-2   I-1
  6                             I-2   I-1
  7                                   I-2

For k stages and n instructions, the number of required cycles is: k + (n – 1)
Wasted Cycles (pipelined)
• When one of the stages requires two or more clock cycles, clock
cycles are again wasted.

Cycle   S1    S2    S3    S4 (exe)   S5    S6
  1     I-1
  2     I-2   I-1
  3     I-3   I-2   I-1
  4           I-3   I-2   I-1
  5                 I-3   I-1
  6                       I-2        I-1
  7                       I-2              I-1
  8                       I-3        I-2
  9                       I-3              I-2
 10                                  I-3
 11                                        I-3

For k stages and n instructions, the number of required cycles is: k + (2n – 1)
Superscalar
A superscalar processor has multiple execution pipelines. In the
following, note that Stage S4 has left and right pipelines (u and v).

Cycle   S1    S2    S3    S4-u   S4-v   S5    S6
  1     I-1
  2     I-2   I-1
  3     I-3   I-2   I-1
  4     I-4   I-3   I-2   I-1
  5           I-4   I-3   I-1    I-2
  6                 I-4   I-3    I-2    I-1
  7                       I-3    I-4    I-2   I-1
  8                              I-4    I-3   I-2
  9                                     I-4   I-3
 10                                           I-4

For k stages and n instructions, the number of required cycles is: k + n
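The four cycle-count formulas from the pipeline slides can be compared side by side. The function names below are hypothetical; the formulas and the sample values of k and n are the ones shown in the slide diagrams (six stages).

```python
def nonpipelined_cycles(k, n):
    """k stages, n instructions, one instruction in the processor at a time."""
    return k * n

def pipelined_cycles(k, n):
    """Single pipeline in which every stage takes one cycle."""
    return k + (n - 1)

def pipelined_two_cycle_stage(k, n):
    """Single pipeline in which one stage needs two cycles."""
    return k + (2 * n - 1)

def superscalar_cycles(k, n):
    """Two pipelines (u and v) for the two-cycle stage."""
    return k + n

k = 6
print(nonpipelined_cycles(k, 2))         # 12
print(pipelined_cycles(k, 2))            # 7
print(pipelined_two_cycle_stage(k, 3))   # 11
print(superscalar_cycles(k, 4))          # 10
```

Each printed value matches the last cycle number in the corresponding diagram.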
Reading from Memory
• Multiple machine cycles are required when reading from memory, because it responds much more slowly than the CPU. The steps are:
• address placed on address bus
• Read Line (RD) set low (0)
• CPU waits one cycle for memory to respond
• Read Line (RD) goes high (1), indicating that the data is on the data bus
[Figure: timing diagram over Cycle 1 to Cycle 4 showing the CLK, Address (ADDR), RD, and Data (DATA) signals.]
Cache Memory
• High-speed expensive static RAM both inside and
outside the CPU.
• Level-1 cache: inside the CPU
• Level-2 cache: outside the CPU
• Cache hit: when data to be read is already in cache
memory
• Cache miss: when data to be read is not in cache
memory. In that case we need to transfer the
required data from RAM to cache.
How a Program Runs
[Figure: how a program runs. The user sends the program name to the operating system, which searches for the program in the current directory and the system path, gets the starting cluster from the directory entry, and loads and starts the program. The program returns to the operating system.]
Multitasking
• The operating system (OS) can run multiple programs at the same time.
• Multiple threads of execution (parts of the same program) can run within the same program.
• The scheduler utility assigns a given amount of CPU time to each running program (it divides the CPU time among several tasks).
• Rapid switching of tasks
• gives illusion that all programs are running at once
• the processor must support task switching.
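The way a scheduler divides CPU time among tasks can be sketched as follows. The slide does not name a scheduling policy, so round-robin is an assumption here, and the task names and time units are hypothetical.

```python
from collections import deque

def round_robin(tasks, time_slice):
    """tasks maps a task name to the CPU time it still needs.
    Returns the order in which tasks receive the CPU."""
    queue = deque(tasks.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        schedule.append(name)                  # the task gets the CPU for one slice
        remaining -= time_slice
        if remaining > 0:
            queue.append((name, remaining))    # not finished: back of the queue
    return schedule

print(round_robin({"editor": 3, "compiler": 5, "player": 2}, time_slice=2))
# ['editor', 'compiler', 'player', 'editor', 'compiler', 'compiler']
```

With a short enough time slice, the switching is fast enough that all three tasks appear to run at once.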
IA-32 Processor Architecture
• Modes of operation
• Basic execution environment
• Floating-point unit
• Intel Microprocessor history
Modes of Operation
• Basic Modes:
• Protected mode: programs run under the control of an operating system
• native mode (Windows, Linux)
• Real-address mode
• native MS-DOS (Microsoft Disk Operating System)
• System management mode
• power management, system security, diagnostics
• Virtual-8086 mode
• hybrid of Protected mode
• each program has its own 8086 computer
Basic Execution Environment
• Addressable memory
• General-purpose registers
• Index and base registers
• Specialized register uses
• Status flags
• Floating-point, MMX, XMM registers
Addressable Memory
• Addressable memory is the number of bytes the CPU can access.
• The number of bytes the CPU can address depends on the number of address lines available.
• Number of memory locations = 2^(number of address lines)
• Protected mode
• 4 GB
• 32-bit address
• Real-address and Virtual-8086 modes
• 1 MB space
• 20-bit address
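The relationship between address lines and addressable memory can be checked with a short calculation. The function name addressable_bytes is hypothetical.

```python
def addressable_bytes(address_lines):
    """Number of distinct byte addresses reachable with the given address lines."""
    return 2 ** address_lines

print(addressable_bytes(20))   # 1048576 bytes = 1 MB (Real-address mode)
print(addressable_bytes(32))   # 4294967296 bytes = 4 GB (Protected mode)
```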
General-Purpose Registers
Named storage locations inside the CPU, optimized for speed. They are called general-purpose because the programmer can use them freely in programs.
32-bit General-Purpose Registers: EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI
16-bit Segment Registers: CS, SS, DS, ES, FS, GS
EFLAGS
EIP
Accessing Parts of Registers
• Use 8-bit name, 16-bit name, or 32-bit name
• Applies to EAX, EBX, ECX, and EDX
[Figure: EAX (32 bits) contains AX (16 bits) in its lower half; AX is made up of AH and AL (8 bits + 8 bits).]
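The overlap between EAX, AX, AH, and AL can be illustrated with bit masks. This is a Python sketch of the layout, not assembly code; the sample value and the helper set_al are hypothetical.

```python
eax = 0x12345678

ax = eax & 0xFFFF        # lower 16 bits of EAX
ah = (ax >> 8) & 0xFF    # upper 8 bits of AX
al = ax & 0xFF           # lower 8 bits of AX
print(hex(ax), hex(ah), hex(al))   # 0x5678 0x56 0x78

def set_al(eax, value):
    """Writing AL changes only the lowest byte of EAX."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)

print(hex(set_al(eax, 0xFF)))      # 0x123456ff
```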
Index and Base Registers
• Some registers have only a 16-bit name for their lower half:
32-bit   16-bit
ESI      SI
EDI      DI
EBP      BP
ESP      SP
Some Specialized Register Uses (1 of 2)
• General-Purpose
• EAX – accumulator (used for collecting results)
• ECX – loop counter (e.g., in for (i = 0; i <= 10; i++), i is a counter)
• ESP – stack pointer (the stack is a memory area used for storing data temporarily)
• ESI, EDI – index registers (used to hold index addresses in memory, for example for arrays and pointers)
• EBP – extended frame pointer (stack) (used to reference parameters and local variables on the stack)
• Segment: The program is divided into a number of segments (data, code, stack), and the address of each segment is stored in a segment register.
• CS – code segment
• DS – data segment
• SS – stack segment
• ES, FS, GS – additional segments
Some Specialized Register Uses (2 of 2)
• EIP – instruction pointer: this register stores the address of the next instruction to be executed, so it sequences program execution (it is sometimes called the program counter (PC)).
• EFLAGS: this register contains a number of bits; each bit represents a different flag that indicates a condition of the result after an instruction executes.
• status and control flags
• each flag is a single binary bit
Status Flags
• Carry
• unsigned arithmetic out of range
• Overflow
• signed arithmetic out of range
• Sign
• If the result is negative, the sign flag is set (1); otherwise it is reset (0).
• Zero
• If the result is zero, the zero flag is set; otherwise it is reset.
• Auxiliary Carry
• If there is a carry from bit 3 to bit 4, the auxiliary carry flag is set; otherwise it is reset.
• Parity
• If the number of 1 bits in the low byte of the result is even, the parity flag is set; otherwise it is reset.
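The flag definitions can be made concrete by computing them for an 8-bit addition. This is a minimal sketch that assumes both inputs are in the range 0 to 255; the function name add8_flags and the two-letter flag keys are chosen for illustration.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags)."""
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                                # unsigned result out of range
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,   # signed result out of range
        "SF": (result & 0x80) != 0,                        # high bit set: negative
        "ZF": result == 0,                                 # result is zero
        "AF": ((a & 0x0F) + (b & 0x0F)) > 0x0F,            # carry from bit 3 to bit 4
        "PF": bin(result).count("1") % 2 == 0,             # even number of 1 bits
    }
    return result, flags

result, flags = add8_flags(0x7F, 0x01)
print(hex(result), [name for name, is_set in flags.items() if is_set])
# 0x80 ['OF', 'SF', 'AF']
```

Adding 1 to 7Fh (+127) overflows the signed range, so Overflow and Sign are set while Carry stays clear.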
Floating-Point, MMX, XMM Registers
• Eight 80-bit floating-point data registers (for numbers with integer and fractional parts)
• ST(0), ST(1), . . . , ST(7)
• arranged in a stack
• used for all floating-point arithmetic
• Eight 64-bit MMX (multimedia) registers, used for manipulating audio and video data
• Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations
[Figure: the 80-bit data registers ST(0) through ST(7) and the opcode register.]
Intel Microprocessor History
• Intel 8086, 80286
• IA-32 processor family
• P6 processor family
• CISC and RISC
Early Intel Microprocessors
• Intel 8080
• 64K addressable RAM
• 8-bit registers
• CP/M operating system
• S-100 BUS architecture
• 8-inch floppy disks!
• Intel 8086/8088
• IBM-PC used the 8088
• 1 MB addressable RAM
• 16-bit registers
• 16-bit data bus (8-bit for 8088)
• separate floating-point unit (8087)
The IBM-AT
• Intel 80286
• 16 MB addressable RAM
• Protected memory
• several times faster than 8086
• introduced IDE (Integrated Drive Electronics) bus architecture, a standard electronic interface between a computer motherboard's data paths (bus) and the computer's disk storage devices
• 80287 floating point unit
Intel IA-32 Family
• Intel386
• 4 GB addressable RAM, 32-bit
registers, paging (virtual memory)
• Intel486
• instruction pipelining
• Pentium
• superscalar, 32-bit address bus, 64-bit internal data path (the bus inside the processor)
Intel P6 Family
• Pentium Pro
• advanced optimization techniques in microcode
• Pentium II
• MMX (multimedia) instruction set
• Pentium III
• SIMD (streaming extensions) instructions
• Pentium 4
• NetBurst micro-architecture, tuned for multimedia
CISC and RISC
Two approaches to processor design:
• CISC – complex instruction set computer
• large instruction set
• high-level operations
• requires microcode interpreter (large hardware size)
• examples: Intel 80x86 family
• RISC – reduced instruction set computer
• simple, atomic instructions
• small instruction set
• directly executed by hardware
• examples:
• ARM (Advanced RISC Machines)
• DEC Alpha (now Compaq)
IA-32 Memory Management
• Real-address mode
• Calculating linear addresses
• Protected mode
• Multi-segment model
• Paging
Real-Address mode
• 1 MB RAM maximum addressable
• Application programs can access any area
of memory
• Single tasking (only one program can execute at a time)
• Supported by MS-DOS operating system
Segmented Memory
Memory is divided into 16 segments, each 64 KB in size.
Segmented memory addressing: the absolute (linear) address is formed by combining a 16-bit segment value with a 16-bit offset.
[Figure: 1 MB memory map from 00000 to F0000 in steps of 10000h. One segment spans 8000:0000 to 8000:FFFF; offset 0250 within it is addressed as 8000:0250 (seg:ofs).]
Calculating Linear Addresses
• Given a segment address, multiply it by 16 (add a
hexadecimal zero), and add it to the offset
• Example: convert 08F1:0100 to a linear address
Adjusted segment value:   0 8 F 1 0
Add the offset:             0 1 0 0
Linear address:           0 9 0 1 0
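The same calculation can be written as a short function. This is a sketch in Python; the name linear_address is hypothetical, and wrap-around past 1 MB is not modeled.

```python
def linear_address(segment, offset):
    """Real-address mode: multiply the segment by 16 and add the offset."""
    return (segment << 4) + offset   # shifting left by 4 bits appends a hexadecimal zero

print(f"{linear_address(0x08F1, 0x0100):05X}")   # 09010
```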
Your turn . . .
What linear address corresponds to the segment/offset
address 028F:0030?
028F0 + 0030 = 02920
Always use hexadecimal notation for addresses.
Your turn . . .
What segment addresses correspond to the linear address
28F30h?
Many different segment-offset addresses can produce the
linear address 28F30h. For example:
28F0:0030, 28F3:0000, 28B0:0430, . . .
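To see why many pairs map to one linear address, every 16-bit segment value can be tried in turn. This is a brute-force sketch in Python; the function name is hypothetical, and address wrap-around past 1 MB is ignored.

```python
def segment_offset_pairs(linear):
    """Yield every (segment, offset) pair that produces the given linear address."""
    for segment in range(0x10000):
        offset = linear - (segment << 4)
        if 0 <= offset <= 0xFFFF:        # the offset must fit in 16 bits
            yield segment, offset

pairs = list(segment_offset_pairs(0x28F30))
print(len(pairs))                        # 4096
print((0x28F0, 0x0030) in pairs)         # True
print((0x28F3, 0x0000) in pairs)         # True
print((0x28B0, 0x0430) in pairs)         # True
```

Raising the segment by 1 lowers the offset by 16, which is why the segments overlap.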
Protected Mode (1 of 2)
• 4 GB addressable RAM
• (00000000h to FFFFFFFFh)
• Each program is assigned a memory partition that is protected from other programs (only that program can use the partition).
• Designed for multitasking or multiprogramming (loading more than one program into RAM at the same time).
• Supported by Linux & MS-Windows
Protected mode (2 of 2)
• Segment descriptor tables (a data structure used to store the memory information (start address and length) of each program loaded in memory).
• Program structure
• code, data, and stack areas
• CS, DS, SS segment descriptors
• global descriptor table (GDT) (the table that contains the information for all the programs).
• MASM programs use the Microsoft flat memory model (GDT).
Multi-Segment Model
• Each program has a local descriptor table (LDT)
• holds descriptor for each segment used by the program
Local Descriptor Table
base       limit   access
00026000   0010
00008000   000A
00003000   0002
[Figure: RAM map showing the three segments starting at 26000, 8000, and 3000.]
Example
Draw the memory layout for a program whose LDT contains the following information:
Base address   Limit
00002000       000F
0000AC00       0013
Paging (Virtual Memory)
• Supported directly by the CPU and the OS
• Divides each segment into 4096-byte (4 K) blocks
called pages
• Sum of all programs can be larger than physical
memory
• Part of running program is in memory, part is on disk
• Virtual memory manager (VMM) – OS utility that
manages the loading and unloading of pages
• Page fault – issued by CPU when a page must be
loaded from disk
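With 4096-byte pages, an address splits into a page number and an offset within the page. The sketch below works on a plain linear address for simplicity; the sample address, the resident_pages set, and the function names are hypothetical.

```python
PAGE_SIZE = 4096                 # bytes per page (4 K)

def split_address(address):
    """Return (page number, offset within the page)."""
    return address // PAGE_SIZE, address % PAGE_SIZE

resident_pages = {0, 1, 18}      # hypothetical set of pages currently in memory

def access(address):
    """Report whether the page holding address is in memory."""
    page, _ = split_address(address)
    return "ok" if page in resident_pages else "page fault"

page, offset = split_address(0x12ABC)
print(page, hex(offset))         # 18 0xabc
print(access(0x12ABC))           # ok
print(access(0x5000))            # page fault
```

On a page fault, the virtual memory manager would load the missing page from disk before the access is retried.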
Components of an IA-32 Microcomputer
• Motherboard
• Video output
• Memory
• Input-output ports
Motherboard
• CPU socket (microprocessor chip)
• External cache memory slots
• Main memory slots
• BIOS chips (ROM memory that contains the basic input/output system)
• Sound synthesizer chip (optional)
• Video controller chip (optional)
• IDE, parallel, serial, USB, video, keyboard, joystick, network, and mouse connectors
• PCI (Peripheral Component Interconnect) bus connectors (expansion cards) for adding new devices to the computer system
Intel D850MD Motherboard
[Figure: Intel D850MD motherboard with labeled parts: video; mouse, keyboard, parallel, serial, and USB connectors; audio chip; PCI slots; memory controller hub; Intel 486 socket; AGP slot; dynamic RAM; firmware hub; I/O controller; speaker; battery; power connector; diskette connector; IDE drive connectors.]
Source: Intel® Desktop Board D850MD/D850MV Technical Product Specification
Video Output
• Video controller
• on motherboard, or on expansion card
• AGP (accelerated graphics port technology)
• Video memory (VRAM)
• Video CRT (Cathode Ray Tube) display
• uses raster scanning
• horizontal retrace
• vertical retrace
• Direct digital LCD (Liquid Crystal Display) monitors
• no raster scanning required
Sample Video Controller (ATI Corp.)
• 128-bit 3D graphics
performance powered by
RAGE™ 128 PRO
• 3D graphics performance
• Intelligent TV-Tuner with
Digital VCR
• TV-ON-DEMAND™
• Interactive Program Guide
• Still image and MPEG-2 motion
video capture
• Video editing
• Hardware DVD video playback
• Video output to TV or VCR
Memory
• ROM
• read-only memory
• EPROM
• erasable programmable read-only memory
• Dynamic RAM (DRAM)
• inexpensive; must be refreshed constantly
• Static RAM (SRAM)
• expensive; used for cache memory; no refresh required
• Video RAM (VRAM)
• dual ported; optimized for constant video refresh
• CMOS RAM
• complementary metal-oxide semiconductor
• system setup information
See: Intel platform memory (Intel technology brief: link address may change)
Input-Output Ports
• USB (universal serial bus)
• intelligent high-speed connection to devices
• up to 12 megabits/second
• USB hub connects multiple devices
• enumeration: computer queries devices
• supports hot connections
• Parallel
• short cable, high speed
• common for printers
• bidirectional, parallel data transfer
• Intel 8255 controller chip
Input-Output Ports (cont)
• Serial
• RS-232 serial port
• one bit at a time
• uses long cables and modems
• 16550 UART (universal asynchronous receiver transmitter)
• programmable in assembly language
Levels of Input-Output
• Level 3: Call a library function (C++, Java)
• easy to do; abstracted from hardware; details hidden
• slowest performance
• Level 2: Call an operating system function
• specific to one OS; device-independent
• medium performance
• Level 1: Call a BIOS (basic input-output system)
function
• may produce different results on different systems
• knowledge of hardware required
• usually good performance
Displaying a String of Characters
When a HLL program displays a string of characters, the following steps take place:
Level 3: Application Program
Level 2: OS Function
Level 1: BIOS Function
Level 0: Hardware
ASM Programming levels
ASM programs can perform input-output at each of the following levels:
Level 2: OS Function
Level 1: BIOS Function
Level 0: Hardware
42 69 6E 61 72 79