The ARM Microprocessor:
A Little British Success Story
Michelle Nabavian
V22.0480 Microprocessors
Professor Robert Dewar
Spring 2002
The ARM Chip
Meant to be “MIPS for the masses”
• RISC load/store design
• Simple
• originally with a short 3-stage pipeline
• operating in either big- or little-endian mode
Leading provider of 16/32-bit embedded
RISC microprocessor solutions
• High performance
• Low cost
• Power efficient
Established as standard in wireless communications
• Dominates mobile telephone market today
ARM: Then and Now
Original ARM
• ARM1, 2 and 3
• 32-bit CPU
• 26-bit addressing
ARM today
• The ARMv5 instruction set and architecture are still being
developed and are widely used
• This slide set mostly concentrates on the ARM10
family of processors
• Completely 32-bit
• Retains 26-bit modes for compatibility with
previous models
The Instruction Set
It is important to increase the chip’s
capabilities through efficient code
Strengths
• Dense code unlike other load/store
processors
Shortcomings
• Low clock rate
• Relatively short pipeline
• 3-6 stages depending on the model
Code Efficiency and Density
Density achieved through
• Thumb 16-bit instruction set
• Instruction predication
• Improved branch prediction
• A barrel shifter
Other extensions include
• DSP instruction set
• Jazelle technology
FOR MORE INFO...
www.armdevzone.com
The Thumb Instruction Set
16-bit version of the ARM
A low cost solution for code density
• Provides memory savings of up to 35% over
equivalent 32-bit code
Recodes a subset of ARM instructions
into 16 bits
• Decoded to 32-bit instructions with no penalty
• 32-bit code can be mixed with 16-bit code when the
full instruction set is needed
• Retains access to full 32-bit address space
Instruction Predication
Instruction predication introduced in
microprocessors by ARM
• All instructions are predicated using a 4-bit condition
code
• Predicate bits determine whether the current
instruction is executed or skipped
Condition Code Propagation
• One bit in each instruction indicates whether the
condition codes should be set
• Prevents some intervening instructions from
changing condition codes
• Eliminates many branches and speeds execution
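The predication scheme above can be sketched in a few lines of Python. This is an illustrative model only, not an emulator: the function names `condition_passed` and `abs_predicated` are made up for this example, the flags are passed as plain booleans, and registers are ordinary Python integers rather than 32-bit values. The condition encodings follow the standard ARM condition field (EQ, NE, ..., AL).

```python
def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with this 4-bit condition field executes.

    n, z, c, v are the Negative, Zero, Carry and oVerflow flags.
    """
    if cond == 0b1111:
        # Not modelled in this sketch.
        raise ValueError("condition 0b1111 is not handled here")
    checks = {
        0b0000: z,                    # EQ: equal
        0b0001: not z,                # NE: not equal
        0b0010: c,                    # CS/HS: carry set
        0b0011: not c,                # CC/LO: carry clear
        0b0100: n,                    # MI: negative
        0b0101: not n,                # PL: positive or zero
        0b0110: v,                    # VS: overflow
        0b0111: not v,                # VC: no overflow
        0b1000: c and not z,          # HI: unsigned higher
        0b1001: (not c) or z,         # LS: unsigned lower or same
        0b1010: n == v,               # GE: signed >=
        0b1011: n != v,               # LT: signed <
        0b1100: (not z) and n == v,   # GT: signed >
        0b1101: z or n != v,          # LE: signed <=
        0b1110: True,                 # AL: always
    }
    return bool(checks[cond])


def abs_predicated(r0):
    """Branch-free absolute value, modelling:  CMP r0, #0 ; RSBLT r0, r0, #0"""
    # CMP always sets the flags. Comparing with 0 never borrows or overflows,
    # so C is set and V is clear.
    n, z, c, v = r0 < 0, r0 == 0, True, False
    # RSBLT executes only if LT holds; it does not set flags itself
    # (its "set condition codes" bit is clear), so the flags survive.
    if condition_passed(0b1011, n, z, c, v):
        r0 = 0 - r0
    return r0
```

For example, `abs_predicated(-5)` takes the RSBLT path and `abs_predicated(7)` skips it, with no branch in either case.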
Branch Prediction
The prefetch unit
• Responsible for grabbing instructions from the
memory system that are processed as required by
the integer core
• A prefetch buffer holds up to three instructions and
allows for accurate branch prediction
• Provides branch target addresses to later stages of
the pipeline
• When a branch is predicted as not taken, it can be
removed entirely from the instruction stream
• The target address of the branch is still calculated
in case the prediction is incorrect
More About Branches
Branch folding
• Branch removal from instruction stream based on a
prediction
• Substitution with predicted next instruction
• Condition codes of branch are “folded” into next
predicted instruction
• Branch itself takes zero cycles
Note: The branch prediction mechanism is
static (uses no history information)
• Conditional forward branches are predicted as not
taken while conditional backward branches are
predicted as taken
• Mispredicted branches have a 3-cycle penalty
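The static rule above is simple enough to sketch directly. The following Python model is an illustration under stated assumptions: the names `predict_taken` and `branch_cycles` are hypothetical, a correctly predicted branch is assumed to cost zero cycles (as with branch folding), and a branch to its own address is treated as backward.

```python
MISPREDICT_PENALTY = 3  # cycles, as stated on the slide


def predict_taken(branch_addr, target_addr):
    """Static prediction: backward branches taken, forward branches not taken."""
    # Assumption: a branch to its own address counts as backward.
    return target_addr <= branch_addr


def branch_cycles(branch_addr, target_addr, actually_taken):
    """Cycles charged to one conditional branch under this simplified model."""
    predicted = predict_taken(branch_addr, target_addr)
    return 0 if predicted == actually_taken else MISPREDICT_PENALTY


# A loop that runs 10 times: the backward branch at 0x100 jumps to 0x80
# nine times and falls through once, so only the final exit is mispredicted.
loop_cost = sum(branch_cycles(0x100, 0x80, i < 9) for i in range(10))
```

Under this model the whole loop pays a single 3-cycle penalty, which is why a history-free predictor can still work well on loop-heavy code.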
The Barrel Shifter
Operates on the second operand of most
ALU (Arithmetic Logic Unit) operations
Allows shifts to be combined with most
operations as well as index registers for
addressing
Can combine two or more instructions
into one
Used for decoding and scaling operations
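As a rough illustration of how a shift folds into an ALU operation, the sketch below models the four basic shift types on 32-bit values and an add whose second operand passes through the shifter, in the spirit of `ADD rd, rn, rm, LSL #2`. The helper names are invented for this example, and only shift amounts 0 to 31 are modelled; the special-case encodings of real hardware are left out.

```python
MASK32 = 0xFFFFFFFF


def lsl(x, n):
    """Logical shift left, truncated to 32 bits."""
    return (x << n) & MASK32


def lsr(x, n):
    """Logical shift right (zero fill)."""
    return (x & MASK32) >> n


def asr(x, n):
    """Arithmetic shift right (sign fill)."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32          # reinterpret the bit pattern as a negative number
    return (x >> n) & MASK32


def ror(x, n):
    """Rotate right within 32 bits."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32


SHIFTS = {"LSL": lsl, "LSR": lsr, "ASR": asr, "ROR": ror}


def add_shifted(rn, rm, shift="LSL", amount=0):
    """Model of  ADD rd, rn, rm, <shift> #amount  (amount in 0..31)."""
    return (rn + SHIFTS[shift](rm, amount)) & MASK32


# Address of element 3 in an array of 4-byte words based at 0x1000:
element_addr = add_shifted(0x1000, 3, "LSL", 2)

# Multiply by 5 in one instruction: x + (x << 2)
times_five = add_shifted(7, 7, "LSL", 2)
```

Both examples would otherwise need a separate shift or multiply instruction before the add, which is the sense in which the shifter combines two instructions into one.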
DSP and Jazelle Extensions
DSP instruction set
• A set of arithmetic instructions for DSP applications
• For systems that require flexibility of a
microcontroller as well as data-processing
capabilities of a DSP (Digital Signal Processor)
• Instruction set offers 16-bit and 32-bit arithmetic
capabilities in addition to the existing capabilities
of the CPU
Jazelle technology
• Enables direct execution of Java byte-code
• Provides developers freedom to run Java code
alongside other applications on a single chip
• Offers higher performance and reduced power
consumption
VFP Coprocessor
Adds full vector floating point operations
to the ARM core
• Accompanied by 32 32-bit registers which can be
loaded, stored and operated on for vector-vector or
vector-scalar operations
Tools like Matlab can be used to derive the
application code
Offers increased performance for imaging
applications
• Scaling and 2D/3D transforms
• Font generation
• Digital filters
Power-Saving Modes
ARM10 family includes two new power-saving
modes for lower power consumption, an
important feature for low-power embedded
applications
• NAP: power down core (preserves state in caches)
• SLEEP: power down entire chip (state must be saved
in memory)
A low-power processor matters more as
demand for performance skyrockets due
to high-security multimedia applications
• Low-power processors offer a fast and inexpensive way
to reduce power consumption
Dominating the Mobile Market
ARM-architecture CPU cores account
for a 70% share of the mobile
telephone market
• Palm and Microsoft plan to use the CPU
core for their newest PDAs (Personal Digital
Assistants) because they expect it to be
the most widely used core in the future
FOR MORE INFO...
www.arm.com/news
www.nikkeibp.asiabiztech.com
A Stable Architecture
Minimal variations to the architecture
and a stable instruction set
• Strictly managed instruction set, expanded
only gradually to meet the needs of embedded
applications, while always maintaining
downward compatibility
• Made it possible for many OS and
development tools to support ARM CPUs
• Ideal for cell phones and digital home
appliances
• Simple maintenance for large number of
models
Attracting Licensees
Convinced integrated circuit (IC)
manufacturers to adopt the core
• Internal bus disclosure
• Permitting licensees to develop a wide variety of IC chips
using the same CPU core
Cooperative activities with semiconductor
firms
• Architecture licenses with Intel and Motorola
• Licensees gain design and manufacturing rights
• Intel’s StrongARM and newer XScale have additional Intel
coprocessor instructions used only by Intel
• Next generation ARM architecture will be jointly
developed
An Expanding Foundry Program
Program builds a 3-way partnership
between ARM, an approved silicon
foundry and an original equipment
manufacturer
ARM provides design kits for its
microprocessor cores, leaving the
manufacturing to foundry partners
• Increases access to ARM technology
• Assures availability of design and manufacturing
resources
• Lowers development costs and accelerates time-to-market
Targeting New Markets
Growing markets include:
• Networking
• Consumer entertainment
• Storage
• Imaging
• Automotive
• Security
• Industrial control
Success is mostly due to a reputation of
being low power and high performance
The Future
Future processors based on the ARMv6
architecture
• Co-developed with Intel and Texas Instruments
• Design targeted towards new markets and next
generation applications
• Maintenance of backwards compatibility is important
• So is the promise of higher levels of performance
with maintained power efficiency
A continuing expansion of foundry
program, licenses, and partnerships
• Ensures the successful introduction of new innovations in
the architecture