Chapter 1: Fundamental of Computer Design
Rung-Bin Lin
Chapter 1. Fundamentals of Computer Design
• Introduction
– Performance improvement is due to
(1). Advances in technology
(2). Innovation in computer design
– 1945-1970: both (1) and (2) made major contributions to
performance improvement
– 1970~: 25% to 30% per year performance improvement
for mainframes and minicomputers.
– 1975~: 35% per year performance improvement for
microprocessors, due simply to (1).
Performance Growth for Microprocessors
Changes in the Marketplace That Made New
Architectures Successful
• The virtual elimination of assembly language programming
reduced the need for object code compatibility
• The creation of standardized, vendor-independent
operating systems, such as Unix and Linux, lowered
the cost and risk of bringing out a new architecture
• Consequences of the changes
– Enabled the development of RISC architectures, which focus on
• Exploitation of instruction-level parallelism
• Use of caches
– Led to a 50% increase in performance per year
The Effect of the Growth Rate in Computer
Performance
• Significantly enhanced the capability available to computer
users
• Led to the dominance of microprocessor-based computers
across the entire range of computer design.
– Workstations and PCs have emerged as major products.
– Servers have replaced minicomputers.
– Multiprocessors have replaced mainframe computers and
supercomputers.
• The advance of IC technology
– Emergence of RISC
– Renewal of CISC, such as x86 (IA-32) microprocessors.
The Changing Face of Computing
1960s: Large mainframes
» Business data processing and scientific computing
1970s: Minicomputers
» Time-sharing
1980s: Desktop computing (personal computing)
1990s: Internet and World Wide Web (servers)
2000s: Embedded computing, mobile computing,
and pervasive computing
Tasks of a Computer Designer
• Determine what attributes are important for a new machine,
then design a machine to maximize performance while staying
within cost constraints.
• Task aspects: Instruction set design, functional organization,
logic design, and implementation.
• In the past, “Computer Architecture” often referred only to
instruction set design. Other aspects of computer design were
called “implementation”.
• In this book, “Computer Architecture” is intended to cover all
three aspects of computer design: instruction set architecture,
organization and hardware.
• “Instruction set architecture” refers to the actual
programmer-visible instruction set. It serves as the
boundary between the hardware and software.
• “Organization” includes the high-level aspects of a computer’s
design, such as the memory system, the bus structure, and the
design of the internal CPU.
• NEC VR 5432 and NEC VR 4122 have the same instruction
set architecture but different organizations.
• “Hardware” includes the detailed logic design and
packaging technology of the machine.
– For example, different Pentium microprocessors running
at different clock frequencies have the same instruction set
architecture and organization but different hardware
implementations.
• Organization and hardware are the two components of
implementation.
Functional Requirements (Fig. 1.4)
• Application Area:
– General purpose, scientific and server, commercial,
embedded computing
• Level of Software Compatibility
– Compatibility at the programming language level, or at the
object code (binary code) level
• Operating System Requirements
– Size of address space, memory management, protection
• Standards
– Floating point, I/O bus, operating systems, networks,
programming languages
Technology Trends
• A successful instruction set architecture must be designed to survive
changes in computer implementation technology.
• Trends in implementation technology:
– Integrated circuit logic technology:
• Transistor density: 35% increase per year, quadrupling in somewhat over 4 years.
• Die size: 10%~20% increase per year.
• Transistor count per chip: 55% increase per year.
• Transistor speed: scales more slowly.
– DRAM:
• Density: 40%~60% increase per year recently.
• Cycle time: decreases by about one-third in 10 years.
– Magnetic disk:
• Density: 100% increase per year recently; 30% increase per year
(doubling in about 3 years) prior to 1990.
– Network technology:
• Ethernet: 10 Mb/s to 100 Mb/s to 1 Gb/s bandwidth.
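Annual growth rates like those listed above compound, so each one implies a doubling or quadrupling time. The following is a minimal Python sketch of that arithmetic using only the rates quoted on this slide; the helper name `years_to_multiply` is introduced here for illustration.

```python
import math

def years_to_multiply(annual_rate, factor):
    """Years needed to grow by `factor` at a compound `annual_rate` (0.35 means 35%)."""
    return math.log(factor) / math.log(1.0 + annual_rate)

# Growth rates quoted in the technology-trends list.
trends = {
    "transistor density (35%/yr)": 0.35,
    "transistors per chip (55%/yr)": 0.55,
    "disk density before 1990 (30%/yr)": 0.30,
    "disk density recently (100%/yr)": 1.00,
}

for name, rate in trends.items():
    doubling = years_to_multiply(rate, 2)
    quadrupling = years_to_multiply(rate, 4)
    print(f"{name}: doubles in {doubling:.1f} years, quadruples in {quadrupling:.1f} years")
```

Under pure compounding, 35% per year quadruples in between four and five years, and 30% per year doubles in a little under three years, so the round figures on the slide are approximations.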
Scaling of IC Technology
• IC Process Technology
– 10 µm (1971) to 0.18 µm (2001)
• IC Technology and Computer Performance
– Transistor performance
– Wire delay
– Power consumption
Cost, Price and Their Trends
• Cost reduction factors
– Learning curve: manufacturing costs decrease over time, e.g.,
through yield improvement.
– High volume (i.e., mass production)
– Commoditization: commodities are essentially identical products sold
by multiple vendors in large volumes, so competition drives prices down.
– Price of DRAM (fig. 1.5)
– Price of Pentium III (fig. 1.6)
• Cost of an integrated circuit
– Cost of die = f(die area)
– The computer designer affects die size both by what functions are
included on the die and by the number of I/O pins.
• Distribution of cost in a system (fig. 1.9, 1.10)
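The slide states only that die cost is a function of die area. To illustrate why that function grows faster than linearly, here is a sketch based on one commonly used textbook-style model, which is an assumption here and not taken from the slide: wafer cost is divided by the number of good dies, where dies per wafer shrink with area and yield also falls with area. All numeric inputs (wafer cost, wafer diameter, defect density, `alpha`) are hypothetical.

```python
import math

def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Approximate dies per wafer: wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_cm / 2.0
    whole_area_term = math.pi * radius ** 2 / die_area_cm2
    edge_loss_term = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return whole_area_term - edge_loss_term

def die_yield(defects_per_cm2, die_area_cm2, alpha=4.0, wafer_yield=1.0):
    """Assumed yield model: the fraction of good dies drops as die area grows."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)

def die_cost(wafer_cost, wafer_diameter_cm, die_area_cm2, defects_per_cm2):
    """Cost per good die = wafer cost / (dies per wafer * die yield)."""
    good_dies = dies_per_wafer(wafer_diameter_cm, die_area_cm2) * die_yield(
        defects_per_cm2, die_area_cm2
    )
    return wafer_cost / good_dies

# Hypothetical inputs: $5000 wafer, 30 cm diameter, 0.6 defects per cm^2.
for area in (0.5, 1.0, 2.0):
    print(f"die area {area} cm^2 -> cost per good die ${die_cost(5000, 30, area, 0.6):.2f}")
```

With these made-up inputs, doubling the die area more than doubles the cost per good die, which is the reason the designer's choices about on-die functions and I/O pins matter for cost.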
Prices of DRAM
Price of Pentium III
Price for $1000 PC
Measuring and Reporting Performance
• “X is n times faster than Y” means
$$n = \frac{\text{Execution time}(Y)}{\text{Execution time}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$
• The term “system performance” is used to refer to
elapsed time on an unloaded system.
• CPU performance refers to user CPU time on an
unloaded system.
• A user evaluating a new system would compare the execution
time of her workload, the mixture of programs and
operating system commands that she runs on a machine.
Choosing Programs to Evaluate Performance
• Best case: measure the execution time of the system’s own workload.
• General case: five levels of programs are used:
– Real programs: C compiler, TeX, Spice, etc.
– Modified (scripted) applications: A collection of real applications…
– Kernels: small, key pieces from real programs, e.g., Livermore
loops and Linpack.
– Toy benchmarks: 10 to 100 lines of code that produce a result the user
already knows, e.g., puzzle, quicksort,…
– Synthetic benchmarks: try to match the average frequency of
operations and operands of a large set of programs, e.g., Whetstone
and Dhrystone.
• Performance prediction accuracy:
– Real programs are best, while synthetic benchmarks are worst.
– Reporting performance results (fig. 1.9 & 1.10)
Benchmark Suites
• SPEC (Standard Performance Evaluation
Corporation)
– www.spec.org
• Benchmark types
– Desktop benchmarks
– Server benchmarks
– Embedded benchmarks
Desktop Benchmarks
• SPEC Benchmarks
– SPEC CPU2000 (SPEC95, SPEC92, SPEC89) (Fig. 1.12)
– Graphics benchmarks
• SPECviewperf
• SPECapc
• Windows OS benchmarks (Fig. 1.11)
– Business Winstone
– CC Winstone
– Winbench
Server Benchmarks
• SPEC
– File server benchmarks: SPECSFS
• Measuring NFS performance
– Web server benchmarks: SPECWeb
• Simulate multiple clients requesting both static and dynamic pages.
• TPC (Transaction Processing Performance Council)
– TPC-A, TPC-C, TPC-H, TPC-R, TPC-W
• Simulate business-oriented transactions (queries)
• www.tpc.org
Embedded Benchmarks
• EDN Embedded Microprocessor Benchmark
Consortium (EEMBC) (Fig. 1.13)
– Automotive/industrial
– Consumer
– Networking
– Office automation
– Telecommunications
Reporting Performance Results
• Guiding principle
– Performance measurements should be reproducible.
• A report needs to state
– Hardware configurations
– Software used
• Is source code modification for benchmarks allowed?
Comparing Performance
                    Computer A   Computer B   Computer C
P1 (secs)                    1           10           20
P2 (secs)                 1000          100           20
Total time (secs)         1001          110           40
• Total execution time: A consistent summary measure
• Other metrics
– Average execution time (arithmetic mean)
– Harmonic mean
– Weighted execution time: $\sum_{i=1}^{n} \text{Weight}_i \times \text{Time}_i$
– Geometric mean: $\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$
Quantitative Principles of Computer Design
• Make the common case fast: A fundamental law, called
Amdahl’s Law, can be used to quantify this principle.
• Amdahl’s Law
– the performance improvement to be gained from using some faster
mode of execution is limited by the fraction of the time the faster mode
can be used.
– Amdahl’s Law defines speedup as
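$$\text{Speedup} = \frac{\text{Performance with enhancement}}{\text{Performance without enhancement}} = \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$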
• Example on pages 41 and 42
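Amdahl's Law is easy to evaluate numerically. The sketch below is a minimal Python version of the speedup formula; the function name and the sample inputs (an enhancement that is 10 times faster and usable 40% of the time) are hypothetical values chosen for illustration and are not claimed to match the textbook example.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the original execution time
    is sped up by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Hypothetical case: a 10x faster mode that applies to 40% of the execution time.
print(amdahl_speedup(0.4, 10))    # 1 / (0.6 + 0.04) = 1.5625

# Even an enormous enhancement is capped by the unenhanced fraction: 1 / 0.6.
print(amdahl_speedup(0.4, 1e12))
```

The second call shows the limiting behavior: the overall speedup cannot exceed 1 / (1 - Fraction_enhanced), no matter how fast the enhanced mode is.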
CPU Performance Equation
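$$\begin{aligned}
\text{CPU time} &= \text{CPU clock cycles for a program} \times \text{Clock cycle time} \\
&= \text{IC} \times \text{CPI} \times \text{Clock cycle time} \\
&= \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}
\end{aligned}$$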
– Dependency
• Clock cycle time - Hardware technology and organization
• CPI - Organization and instruction set architecture
• Instruction count - Instruction set architecture and compiler
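– Sometimes
$$\text{CPU time} = \left( \sum_{i=1}^{n} \text{IC}_i \times \text{CPI}_i \right) \times \text{Clock cycle time}$$
and overall
$$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times \left( \frac{\text{IC}_i}{\text{Instruction count}} \right)$$
where IC_i is the number of times instruction class i is executed and CPI_i is the average number of clock cycles for that class.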
– Example on page 44.
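The per-class form of the CPU performance equation can be checked with a few lines of Python. In this sketch the instruction mix, the per-class CPI values, and the 2 ns clock cycle are all hypothetical numbers chosen for illustration, and the function names are introduced here.

```python
def cpu_time(instruction_counts, cpis, clock_cycle_time):
    """CPU time = (sum of IC_i * CPI_i) * clock cycle time."""
    total_cycles = sum(ic * cpi for ic, cpi in zip(instruction_counts, cpis))
    return total_cycles * clock_cycle_time

def overall_cpi(instruction_counts, cpis):
    """Overall CPI = sum of CPI_i * (IC_i / instruction count)."""
    instruction_count = sum(instruction_counts)
    return sum(cpi * ic / instruction_count for ic, cpi in zip(instruction_counts, cpis))

# Hypothetical mix: ALU, load/store, and branch instructions.
ic_per_class = [50_000_000, 30_000_000, 20_000_000]
cpi_per_class = [1.0, 2.0, 3.0]
cycle_time = 2e-9  # seconds, i.e. an assumed 500 MHz clock

cpi = overall_cpi(ic_per_class, cpi_per_class)
seconds = cpu_time(ic_per_class, cpi_per_class, cycle_time)
print(f"overall CPI = {cpi:.2f}, CPU time = {seconds:.3f} s")

# The two forms of the equation agree: IC * CPI * clock cycle time.
print(sum(ic_per_class) * cpi * cycle_time)
```

Both forms give the same CPU time, which makes it clear that a change in any one of instruction count, CPI, or clock cycle time affects CPU time proportionally if the other two stay fixed.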
Measuring the Components of CPU
Performance
• Clock cycle time: Timing simulator or timing verifiers
• IC(instruction count)
– Direct measurement from running the applications on hardware
– Instruction set simulator - slow but accurate
– Instrumentation code approach: the binary program is modified by
inserting some extra code into every basic block. Fast, but needs
instruction set translation if the simulated machine differs from the
simulating machine.
• CPI: very difficult to measure
CPI = Pipeline CPI + Memory system CPI
• Basic block: a straight-line code sequence that is entered only at its
first instruction (typically a label) and left only at its last
instruction (typically a branch).
• Use the CPU performance equations to compute performance
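The instrumentation approach can be pictured as one counter per basic block: since every instruction in a block executes whenever the block is entered, the instruction count is the sum of executions times block size. The Python sketch below simulates that bookkeeping; the block names, block sizes, and the execution trace are hypothetical.

```python
from collections import Counter

# Hypothetical program: number of instructions in each basic block.
block_sizes = {"entry": 4, "loop_body": 6, "exit": 2}

def instruction_count(block_trace, sizes):
    """IC = sum over basic blocks of (times the block ran) * (instructions in it)."""
    executions = Counter(block_trace)  # plays the role of the inserted counters
    return sum(count * sizes[block] for block, count in executions.items())

# Hypothetical run: the entry block once, the loop body 100 times, the exit block once.
trace = ["entry"] + ["loop_body"] * 100 + ["exit"]
print(instruction_count(trace, block_sizes))  # 4 + 100 * 6 + 2 = 606
```

Counting per block rather than per instruction is what keeps the overhead low: one counter update covers every instruction in the block.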
Principle of Locality
• Application of Amdahl’s Law
– A program spends 90% of its execution time in only 10% of the
code.
– Temporal locality: Recently accessed items are likely to be
accessed in the near future.
– Spatial locality: Items whose addresses are near one
another tend to be referenced close together in time.
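The effect of the two kinds of locality can be seen with a toy cache model. The Python sketch below is an illustration under stated assumptions, not a model of any real machine: it assumes a small fully associative cache with LRU replacement, 4 blocks of 8 addresses each, and three made-up access patterns.

```python
from collections import OrderedDict

def hit_rate(addresses, num_blocks=4, block_size=8):
    """Fraction of accesses that hit in a small fully associative LRU cache."""
    cache = OrderedDict()  # keys are block numbers, ordered from least to most recent
    hits = 0
    for address in addresses:
        block = address // block_size
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(addresses)

sequential = list(range(64))               # spatial locality: neighbors share a block
repeated = [0, 1, 2, 3] * 16               # temporal locality: same items reused
strided = [i * 8 for i in range(64)]       # neither: every access touches a new block

print("sequential:", hit_rate(sequential))  # 56 hits out of 64 = 0.875
print("repeated:  ", hit_rate(repeated))    # 63 hits out of 64 = 0.984375
print("strided:   ", hit_rate(strided))     # 0 hits = 0.0
```

The sequential scan hits because each miss brings in neighboring addresses, the repeated pattern hits because the same block is reused, and the strided pattern gains nothing from either property.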
Put it All Together
• Performance and Price-Performance
– Desktop computers
– Server computers
– Embedded processors
Price and Performance of Desktops (1)
Price and Performance of Desktops (2)
Price and Performance of Servers (1)
Price and Performance of Servers (2)
Price and Performance of Embedded Processors (1)
Price and Performance of Embedded Processors (2)
Price and Performance of Embedded Processors (3)