ECE 353Lab C

Download Report

Transcript ECE 353Lab C

ECE 353 Lab C
Pipeline Simulator
Aims
• Further experience in C programming
• Handling strings
• Further experience in the use of assertions
• Reinforce concepts from pipelining
• Techniques for simulating parallel operations with a sequential
program
Statement of Work
• Write a C program to simulate the register contents of a MIPS
machine running an abbreviated instruction set
• Should be cycle-accurate with respect to register contents.
• Simulates a machine consisting of add, sub, addi, mul, lw, sw,
beq instructions only.
• Will need to provide some error detection wrt the assembly program input.
Simulation Approaches
• State change in simulation
• Event-driven Simulation: Follows events as they occur.
• Time-driven Simulation: Driven by a clock; does something every clock
transition.
• Computational Platform
• Sequential: Simpler to program and debug; slower.
• Parallel: More complicated program (need to take interactions into account);
much higher throughput (potentially).
Lab C involves a sequential, time-driven simulation.
Inputs to the Simulator
• MIPS assembly language program which will be in a file read by the
simulator.
• Parameters of the simulated machine:
•
•
•
•
Memory access time: c cycles.
Multiply time: m cycles.
All other EX operations: n cycles.
ID and WB stages take one cycle to complete their tasks.
Memory
• Separate instruction and data memories (separate address spaces).
• Instruction Memory:
• Addressed by the PC.
• Size: 2K bytes (512 words).
• Data Memory:
• Accessible to the program.
• Size: 2K bytes (512 words).
Implement both as
linear arrays.
Processor
• Five stages: IF, ID, EX, MEM, WB
• Each stage follows the rules you learned in ECE 232, except that IF
and EX can take multiple cycles to complete an operation.
IF
• Fetches from the Instruction Memory.
• Will have to freeze if there is an unresolved branch.
• Branches are detected in the ID stage.
• When a branch is detected, no further instructions are decoded until the
branch is resolved.
• Branches are resolved in the EX stage where a subtraction operation is carried
out and the result used to determine if the branch is taken or not.
ID
• Decodes instructions (takes 1 cycle to do this).
• Checks for data (RAW) and control hazards.
• Holds up further processing until these hazards have been resolved (no
forwarding in this machine).
• Provides the EX stage with the operands it needs.
EX
• Executes the specified operation on the operands it receives from ID.
• Can take multiple cycles depending on the operation.
• EX is not itself pipelined; only one instruction can be in EX at any time.
• Use a counter to keep track of ongoing operations.
MEM
• Is only used by the lw and sw instructions.
• Receives the data memory address from EX and carries out the data
memory read/write operation.
• Can take multiple cycles to do this (c cycles).
WB
• Writes back into the register file.
• Will have nothing to do if the instruction does not do a register write.
• Register reads/writes follow the approach in Hennessy & Patterson.
Simulator Modules
• progScanner(): Reads from the file provided by the user,
containing the assembly language program. Note that this will have
to flexible in dealing with empty spaces.
• parser():
• From the instruction passed to it by progscanner(), break it up into its
constituent fields (opcode, source registers, immediate field, destination
register).
• Detect a certain subset of errors in the program:
•
•
•
•
Illegal opcode
Unrecognized register name
Register number out of bounds
Immediate field that is too large
Simulator Modules (contd.)
• Pipeline stage modules:
•
•
•
•
•
IF
ID
EX
MEM
WB
• Pipeline stages will communicate via latches separating them. A latch
can only hold information relating to a single instruction.
• Deal with structural hazards.
• Use valid/empty bits to specify if (a) the contents of a latch are valid for use
by the stage to its right and (b) the latch is available to be written into by the
stage to its left.
Simulator Execution
• We need to simulate parallel operations with a sequential program.
How?
• Maintain a counter to represent the clock value.
• Each time the counter increments (representing an advance in time by one
cycle), call the stages in reverse order: WB, MEM, EX, ID, IF.
• Why should this work?
Assertions
• Use assertions to help with correctness.
• Assertions should have been thought through and contribute meaningfully to
the reliability of the program.
• Preconditions, postconditions, invariants can all be checked. For a good
introduction, see
http://ptolemy.eecs.berkeley.edu/~johnr/tutorials/assertions.html
Some Design Issues
• Coordinating producer and consumer operation with respect to the
latch between them.
• Multicycle instruction handling
• Reading assembly instruction lines and parsing them
• Information that needs to be carried along with the instruction down
the pipeline
• Dealing with hazards:
• Structural: Can only move from a stage if the next stage is free.
• Data: RAW.
• Control: Pending branch freezes all subsequent ops until it is resolved.
Producer/Consumer Latch Management
• Rules for correct functioning:
• Producer stage should not load into a latch before the consumer has consumed the
previous item it placed there.
• Consumer stage should not read from a latch before the producer has placed a new
item there.
• Solution: Use a validItem bit, stored in the latch itself.
• Set by the producer when it places a new item in the latch
• Cleared by the consumer
• Must be done in the same cycle in which the consumer places its result in its output latch or
writes back to the register file.
• This approach is only safe since there is no race condition between producer and
consumer stages in the simulation. Do not try this if doing a hardware
implementation.
Multicycle Operations
• IF takes c cycles to access instruction memory
• EX takes m cycles for multiply and n for other operations
• MEM takes c cycles to access data memory for lw and sw instructions
• Implementation:
• Use a counter initialized appropriately when an operation starts in a stage.
• The counter will be decremented each cycle.
Reading Assembly Code
• Need to find a suitable function from the standard C library for this.
• fscanf() is not the most convenient option: why?
• There are other options, e.g.,
char *fgets(char *string, int n, FILE *stream);
Parsing Assembly Code
• Break the code down into the opcode and other fields.
• Opcode: Is delimited to its right by at least one empty space.
• Other fields: Delimited by a comma. Empty spaces can be ignored.
• Programming Approaches:
• Walk down the string returned by fgets() looking for delimiters.
• Use a library function strtok().
• Caution: strtok() modifies its first argument…
• Some useful functions: strcmp(); atoi(); isdigit().
• Make sure you know what these do before using them!
Some error detection
• Illegal opcodes: Match the opcode field against all legal opcodes (the
MIPS subset plus haltSimulation).
• Register not recognized or number out of bounds:
• Registers can be specified either symbolically (e.g., $s0, $t5) or numerically
(e.g., $8): check first character beyond $ to see which it is.
• If numerically specified, registers must be numbered 0-31.
• If symbolically specified, the character string must match a legal register
name.
Error detection (continued)
• Immediate field must be representable in 16 bits (two’s complement).
• Check if the numerical quantity is within bounds.
• Memory accesses must be aligned to integer boundaries
• Data addresses must be a multiple of 4.
• Memory alignment problems are detected by MEM; other errors are
detected by ID.
• When an error is detected, the simulator should print an error
message specifying the error type and then stop.
Carrying information down the pipeline
• Since control signals are not required to be simulated, you can pass
the entire instruction down the pipeline.
• To handle beq, carry the PC as well.
• exResult will go from EX to MEM and then to WB.
Hazards: Structural
• Each stage can be occupied by no more than one instruction at any
time.
• A pipeline latch will be cleared by the consuming stage only in the
cycle in which it fills the latch to its right, if any.
• A completed operation with an uncleared latch to its right has to wait
until that latch has been cleared (by setting validItem=0) by the
next stage. This can cause backpressure in the pipeline.
Hazards: Control
• beq is the only instruction in this set that causes a control hazard.
• When beq is decoded in ID:
• A branchPending flag is set by ID.
• branchPending=1 causes the instruction following the one decoded to be
removed.
• IF idles while a branch is pending; it can only resume in the cycle after
branchPending has been reset (be careful about how you implement this
since you are calling IF() after EX() in your simulation!).
• The branch is resolved in EX.
• If branch is taken, PC is set to point to the target instruction, else to the next
instruction immediately following the beq.
• EX resets branchPending.
Hazards: Data
• Three types of data hazard: RAW, WAR, WAW.
• Due to the linear nature of the simple MIPS pipeline, MIPS is only
subject to RAW hazards.
• Detected in the ID stage.
• ID checks to see if any latch in front of it has an instruction producing data
that will be consumed by the instruction it is decoding.
• If so, hold that instruction in ID until all such hazards resolve.
Statistics
• Stage Utilization: Fraction of cycles for which the stage is doing useful
work.
• Useful work does NOT include waiting for some hazard to clear.
• Keep a counter in each function which counts the number of cycles the stage
has been busy.
• Global.
• Static.
• Remember to initialize all counters!
Implementation: IF Stage
• Takes the PC value and fetches the appropriate instruction from the
Instruction Memory (takes c cycles for this).
• Sends the instruction along with its PC value to the next stage.
• Freezes whenever instructed to (IF_ID latch full or branchPending
set).
• Increments PC and starts fetching next instruction unless it is made to
freeze.
• Increments usefulCyclesIF counter whenever it is doing useful
work.
Implementation: ID Stage
• When IF_ID latch has a valid element, decodes the instruction.
•
•
•
•
Detects all data hazards.
If the instruction is beq, sets branchPending flag.
Waits for any structural hazards to clear.
Fetches the operands for use by the EX stage and places them in the ID_EX
latch.
Implementation: EX Stage
• Does not need to worry about data or control hazards; these have
been resolved by the time this instruction has got to EX.
• Calculates the result of the requested operation: this takes m cycles
for MULT and n cycles for all other operations.
• If the EX_MEM latch is free, places the result in this latch and clears
the ID_EX latch for use by ID; else waits for the EX_MEM latch to
clear.
Implementation: MEM Stage
• All non-memory instructions simply pass through this stage.
• lw and sw instructions involve memory access which takes c cycles.
• sw instructions complete at MEM; the result for lw is passed to the
WB stage along with the instruction.
• Does MEM have to worry about structural hazards?
Implementation: WB Stage
• Has no latch in front of it; hence no structural hazards can impede it.
• Takes just one cycle.
• Does nothing for beq and sw instructions.
Submission
• Summary sheet on Moodle
•
•
•
•
Authorship/testing information.
Call graph of code
Testing procedures used
Assertions used
• Upload program code on quark.
• Only a single file should be uploaded: no zipfiles.