D75P 34 – HNC Computer
Architecture
Lecture 15
RISC Program Analysis and Processor
Comparisons.
© C Nyssen/Aberdeen College 2004
All images © C Nyssen/Aberdeen College except where stated
HP 820e Pocket PC © Hewlett Packard, with permission
Atari Jaguar ©2004 AtariAge, with permission
Prepared 21/01/04
Last time we saw how
any computer program
could be split into three
separate “stages”, and we
were given an assembly
code program to analyse.
The program was written for a CISC (Complex Instruction
Set) processor. The time to execute each instruction varied,
depending on how complex the instruction was.
Today we will analyse the same program, but this time it has
been specifically written for a RISC (Reduced Instruction
Set) processor.
CISC chips…
Can have 8, 16, 32 or 64 bit registers.
Use complex instructions (more than 1
machine cycle each)
Don’t use pipelining.
Instructions are interpreted by a
nanoprocessor, then executed.
Instructions vary in size and format.
Have a large instruction set.
RISC chips…
Usually have 32-bit registers
Use simple instructions (1 machine cycle
each)
Are very highly pipelined
Instructions are executed directly.
Instructions are all the same size.
Have a small instruction set.
There has been great debate in
recent years as to the respective
merits of each architecture. You
can read about this for yourself in
Workbook 3.
This HP 820e handheld PC uses a 190 MHz
32-bit StrongARM, with a specially written RISC version of Windows CE.
Because the instructions are simpler, the RISC program
needs more of them to accomplish the same thing.
For this reason, RISC assembly programs are usually much
longer than their CISC equivalents.
But because RISC
instructions are quicker to
execute, RISC processors
are usually built to run at
much lower frequencies.
This Acorn is based on a 600 MHz
StrongARM. The operating system
is RISC OS 3.7.
These machines are still in great
demand by C and C++ developers.
Whatever the RISC instruction is, each one takes exactly
the same time to execute: usually 1 or 2 t-cycles,
depending on the processor model.
Again, you can write it next to the source code like this -
Split the program into the three sections, using any
unconditional jumps as a clue to where the loop might be.
Everything in the red box will happen in the context of
one complete loop. This also shows us what happens
once only before and after the loop.
We can now begin to calculate the total number of t-cycles
required using the following formula (same as last week) –
Total t-cycles =
t-cycles at start +
t-cycles at end +
(t-cycles in each complete loop x number of loops) +
t-cycles in last loop (which may be only a partial loop)
This time the jumps are easier to count, because we don’t
have to worry about two values. Begin to fill in the formula -
Total t-cycles =
t-cycles at start (10) +
t-cycles at end (6) +
t-cycles in each complete loop x number of loops (30 x ?) +
t-cycles in last loop, which may be only a partial loop (20)
The only figure now missing is for how many complete
loops are run. To establish this, we have to look at the
data provided.
We are told that the user inputs the following values each
time the prompt is displayed –
6, o, %, f and s
The stopping condition was defined on the line
So again, the condition is met on the fifth pass of the loop.
Remember that on the fifth pass, the loop is only the partial
one, as the stopping condition is triggered halfway down.
We therefore have four complete loops to count -
Total t-cycles =
t-cycles at start (10) +
t-cycles at end (6) +
t-cycles in each complete loop x number of loops (30 x 4 = 120) +
t-cycles in last loop, which may be only a partial loop (20)
= 156
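The formula can be checked with a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, and the number of complete loops is taken as one fewer than the number of inputs, because the slides say the stopping condition is met on the fifth pass.

```python
def total_t_cycles(start, end, per_loop, complete_loops, last_loop):
    """Apply the lecture formula: start + end + (loop x count) + last loop."""
    return start + end + per_loop * complete_loops + last_loop


# The five values typed in by the user, as given on the slide.
inputs = ["6", "o", "%", "f", "s"]

# The fifth pass is only a partial loop, so four passes are complete.
complete_loops = len(inputs) - 1

total = total_t_cycles(start=10, end=6, per_loop=30,
                       complete_loops=complete_loops, last_loop=20)
print(total)  # 156
```

Changing any one figure (for example a different number of inputs) shows straight away how the total moves.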
You will now be asked to apply both RISC and
CISC values to the given speeds of several
processors.
Start by working out the length of the t-cycle for
each one.
Oyster (RISC): t-cycle = 1 / (200 x 10^6) seconds.
1 / (200 x 10^6) x 156 (number of t-cycles required to run RISC version) = 0.78 x 10^-6 seconds.
Purslane (CISC): t-cycle = 1 / (800 x 10^6) seconds.
1 / (800 x 10^6) x 546 (number of t-cycles required to run CISC version) = 0.68 x 10^-6 seconds.
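The same two sums can be sketched in Python. The function name is hypothetical; the figures (156 and 546 t-cycles, 200 MHz and 800 MHz) are the ones used above. The result is scaled to units of 10^-6 seconds to match the notation on the slides.

```python
def run_time_microseconds(t_cycles, clock_mhz):
    """Time to run a program, in seconds x 10^-6."""
    t_cycle_seconds = 1 / (clock_mhz * 1e6)   # length of one t-cycle
    return t_cycles * t_cycle_seconds * 1e6   # scale seconds to 10^-6 s


oyster = run_time_microseconds(156, 200)     # RISC version on Oyster
purslane = run_time_microseconds(546, 800)   # CISC version on Purslane

print(f"Oyster:   {oyster:.2f} x 10^-6 seconds")    # 0.78
print(f"Purslane: {purslane:.2f} x 10^-6 seconds")  # 0.68
```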
Now apply the t-cycle figures to each of the processor
speeds. Remember that the RISC number goes with the
RISC models and the CISC total with the CISC ones!
You should now have a
table showing the
comparable times taken to
execute the program on
various speeds and models
of processor.
At its release in 1994, this Atari Jaguar was way ahead of its rivals, with its
64-bit RISC processor running at 26.59 MHz. Sadly, it couldn’t compete
economically and production stopped in 1996.
Processor model     Processor speed   Time taken in seconds x 10^-6
Oyster (RISC)       200 MHz           (156/200) = 0.78
Scallop (RISC)      300 MHz           (156/300) = 0.52
Mussel (RISC)       400 MHz           (156/400) = 0.39
Pandora (RISC)      500 MHz           (156/500) = 0.31
Purslane (CISC)     800 MHz           (546/800) = 0.68
Samphire (CISC)     900 MHz           (546/900) = 0.61
Dulse (CISC)        1 GHz             (546/1000) = 0.55
Bootlace (CISC)     1.1 GHz           (546/1100) = 0.50
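If you want to check your table, a short Python sketch like the one below reproduces it. The list layout and names are illustrative only; the processor names, speeds and t-cycle totals are the ones from the lecture. Dividing t-cycles by the speed in MHz gives the time directly in seconds x 10^-6.

```python
RISC_T_CYCLES = 156   # total for the RISC version of the program
CISC_T_CYCLES = 546   # total for the CISC version of the program

# (model, architecture, speed in MHz)
processors = [
    ("Oyster", "RISC", 200),
    ("Scallop", "RISC", 300),
    ("Mussel", "RISC", 400),
    ("Pandora", "RISC", 500),
    ("Purslane", "CISC", 800),
    ("Samphire", "CISC", 900),
    ("Dulse", "CISC", 1000),
    ("Bootlace", "CISC", 1100),
]

rows = []
for model, arch, mhz in processors:
    # The RISC total goes with RISC models, the CISC total with CISC ones.
    cycles = RISC_T_CYCLES if arch == "RISC" else CISC_T_CYCLES
    rows.append((model, arch, mhz, round(cycles / mhz, 2)))

for model, arch, mhz, time_us in rows:
    print(f"{model} ({arch})  {mhz} MHz  {time_us:.2f} x 10^-6 s")
```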
The next stage is to draw a graph to demonstrate these
figures.
This is where the scientific notation of numbers is useful: you can draw your graph based on fairly large numbers,
applying a scientific notation to the axis!
[Graph: time taken to execute program (seconds x 10^-6, axis 0 to 0.9) against processor speed in megahertz (axis 200 to 1200). RISC models: 0.78, 0.52, 0.39, 0.31. CISC models: 0.68, 0.61, 0.55, 0.50.]
You will also be asked to provide an estimate of speed
using your graph ...
[Graph: the same plot of time taken to execute program (seconds x 10^-6) against processor speed in megahertz, used to read off the estimate.]
The answer is approximately 1.05 GHz.
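An estimate read off the graph can be cross-checked by turning the sum around: speed = t-cycles / time. The question asked on the slide is not reproduced in this transcript, so the example below is only an assumption for illustration: it asks what CISC speed would match the 0.52 x 10^-6 seconds of the 300 MHz Scallop. The function name is hypothetical.

```python
CISC_T_CYCLES = 546   # total t-cycles for the CISC version


def clock_mhz_for_time(t_cycles, target_microseconds):
    """Speed in MHz needed to finish in the given time (seconds x 10^-6)."""
    return t_cycles / target_microseconds


# Assumed target: the Scallop (RISC) time from the table.
needed_mhz = clock_mhz_for_time(CISC_T_CYCLES, 0.52)
print(f"{needed_mhz:.0f} MHz = {needed_mhz / 1000:.2f} GHz")  # 1050 MHz = 1.05 GHz
```

Under that assumption the calculation agrees with the approximate figure quoted from the graph.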
You will also have to state how much space in memory
each program will use.
Complex instructions are different sizes. The more
complicated the instruction, the more space it needs.
Reduced instructions are
always the same size: usually 32 bits, sometimes
64, depending on the processor
model.
This Psion 5MX used a 36MHz ARM710T
RISC CPU and ran EPOC as an operating
system.
You can write the sizes next to the
instructions, like this…..
And then just add them up!
The CISC program uses 40 bytes ….
… but the RISC one occupies 92!
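Adding up the sizes is easy to sketch in Python. Two things here are assumptions and not taken from the slides: the RISC instruction count of 23 is inferred from 92 bytes at 32 bits (4 bytes) each, and the short list of CISC sizes is invented purely to show varying sizes (it is not the 40-byte program from the lecture).

```python
def program_size_bytes(instruction_sizes):
    """Memory used by a program: the sum of its instruction sizes in bytes."""
    return sum(instruction_sizes)


# RISC: every instruction is the same size (assumed 32 bits = 4 bytes).
RISC_INSTRUCTION_BYTES = 4
risc_instruction_count = 23   # inferred from 92 / 4, not stated on the slide
risc_size = program_size_bytes([RISC_INSTRUCTION_BYTES] * risc_instruction_count)
print(risc_size)  # 92

# CISC: sizes vary with complexity. These four sizes are made up.
example_cisc_sizes = [1, 3, 2, 4]
print(program_size_bytes(example_cisc_sizes))  # 10
```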
You don’t have to worry about the loop - because every
time a jump is executed, it resets the program counter!
So the MAR just “re-visits” the same areas of memory,
over and over, until the stopping condition is met.
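A toy simulation makes the point. Everything in it is hypothetical (a six-instruction program with a three-instruction loop, 4 bytes per instruction, every pass a complete one); it is not the program from the lecture. It simply counts how many fetches happen and how many different addresses are visited.

```python
def fetched_addresses(num_instructions, loop_start, loop_end, passes, size=4):
    """Return the address of every instruction fetch, in order.

    The instruction at index loop_end jumps back to index loop_start
    until the loop has run `passes` times, then execution falls through.
    """
    pc = 0                      # program counter, in bytes
    remaining = passes
    fetched = []
    while pc < num_instructions * size:
        fetched.append(pc)
        if pc == loop_end * size and remaining > 1:
            remaining -= 1
            pc = loop_start * size   # the jump resets the program counter
        else:
            pc += size               # otherwise move to the next instruction
    return fetched


fetched = fetched_addresses(num_instructions=6, loop_start=1, loop_end=3, passes=5)
print(len(fetched))           # 18 fetches in total
print(len(set(fetched)))      # 6 distinct addresses
print(len(set(fetched)) * 4)  # 24 bytes of memory, however many loops run
```

The number of fetches grows with the number of loops, but the memory occupied stays at the size of the program.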
Summary [1]
To analyse any type of source code -
• establish how many t-cycles each instruction takes.
• establish what happens at the beginning, in the middle
and at the end of the program.
• work out how many whole loops will be completed, and
whether the last loop is a partial one.
Work out how many t-cycles the RISC and the CISC
versions will require to run.
Fill in the table, keeping the results in scientific notation.
Summary [2]
Apply the total number of t-cycles to the processor
speeds, to work out in real time, how long the source
code will take to run.
Keep all results in matching scientific notation, e.g.
0.50 x 10^-6 seconds.
This makes it easier to depict on a graph, which will
be used to estimate times for other theoretical
models.
Remember to work out how much memory space
each program will require!