Transcript Notes

• programmable logic
• Altera devices and the Altera tools
• major tasks in the silicon programming process
• using a “.vec” file for testing
• UP3 core library and I/O modules
(note: references are to textbook by Hamblen et al)
"silicon compilation":
basic idea: restrict possible physical configurations; sacrifice area /
performance for "regularity" of design; use regular physical structures to
enable AUTOMATION of layout
All CAD tools will sacrifice some area/performance for automation and the
ability to do "large" designs, just as software compilers sacrifice some
efficiency for the ability to use a high-level language instead of assembly
language; designer productivity will increase substantially, however
SW Programming:
Write Program (HLL); Link to Libraries; Compile; Load/Execute
Silicon Programming:
Write Program (HDL/Scm); Compile/Link; Fit; Simulate; Program Device/Execute
examples (Altera, Xilinx, etc.):
"cell" typically contains LUT (look-up table),
memory, I/O
ex: 3-input LUT: the "address" formed from inputs a, b, c selects the output o
inputs    output
a b c     o
0 0 0     f0
0 0 1     f1
0 1 0     f2
0 1 1     f3
1 0 0     f4
1 0 1     f5
1 1 0     f6
1 1 1     f7
"device" consists of cells, local routing, global
routing, specialized memory arrays; manufacturer
provides "families" of devices--different sizes, power
usage, operating conditions, etc.
Example: A Generic Programmable Logic Device Architecture
[Figure: PLD cell, or LE (Logic Element): LOGIC (look-up table) with CARRY-IN, 1-bit MEMORY with CLOCK and RESET, and IN/OUT connections to the LOCAL and GLOBAL buses]
[Figure b: block of PLD cells (Altera: LAB, Logic Array Block) with a RAM block (MEM IN, MEM OUT)]
maxplus2 "compiler": netlist extractor
Device families:
Example: “Cyclone”—we will use EP1C6 or EP1C2
features:
» logic elements (LE’s)
» RAM blocks
» Global clock + phase-locked loops for clock configuration
» >= 170 I/O pins
Cyclone LE—figure 3.7
Cyclone LABs and interconnects: figure 3.9
Example: using a lookup table to describe a gate network:
f(A,B,C) = A'B'C + A'BC' + A'BC + ABC
Inputs: ABC    out
000            0
001            1
010            1
011            1
100            0
101            0
110            0
111            1
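The table above can be modeled in software as an 8-entry memory addressed by the inputs. What follows is a minimal Python sketch, not Altera code; the names LUT_CONTENTS, lut_output, and sum_of_products are made up for illustration. It checks that the stored table and the gate-level expression for f agree on every input.

```python
# Hypothetical software model of a 3-input LUT (all names are illustrative).
# Entry i holds the output for the input combination whose binary value is i,
# with A as the most significant address bit.
LUT_CONTENTS = [0, 1, 1, 1, 0, 0, 0, 1]  # the "out" column, rows 000..111


def lut_output(a: int, b: int, c: int) -> int:
    """Use (a, b, c) as a 3-bit address into the table."""
    address = (a << 2) | (b << 1) | c
    return LUT_CONTENTS[address]


def sum_of_products(a: int, b: int, c: int) -> int:
    """f(A,B,C) = A'B'C + A'BC' + A'BC + ABC, evaluated term by term."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & nb & c) | (na & b & nc) | (na & b & c) | (a & b & c)


if __name__ == "__main__":
    for address in range(8):
        a, b, c = (address >> 2) & 1, (address >> 1) & 1, address & 1
        print(a, b, c, lut_output(a, b, c), sum_of_products(a, b, c))
```

Changing the function implemented by the cell only means loading different values into LUT_CONTENTS; the addressing logic stays the same.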
Altera Project Flow (“Rapid Prototyping”):
1. (Hierarchical) DESIGN
• design entry
  schematic (mydesign.gdf)
  vhdl (mydesign.vhd)
  other formats (Verilog, AHDL, EDIF, ...)
  IP cores
2. Compilation
• translation, optimization, synthesis (“netlist”)
• device fitting (placement and routing)
• Floorplan editor: figure 1.23
• Report generation
3. “Execution”
• Timing analysis
• simulation (functional / timing)
• device programming, hardware verification
After we have completed our design (schematic or
HDL), the compiler converts it into a design for an
actual working circuit using a "technology mapping":
design library --> technology library
on the Altera boards we are using, there is one chip,
from the “CYCLONE” family
UP3 BOARD
[Figure: UP3 board with labeled parts: Cyclone chip, SRAM, FLASH, parallel port, serial port, VGA port, PS2 port, USB port, LC Display, power, on/off switch, global reset, user-definable pushbuttons, user-definable DIP switches, user-definable LEDs, +3.3V supply LED, +5V supply LED, invalid input voltage LED]
http://users.ece.gatech.edu/~hamblen/UP3/ and
http://users.ece.gatech.edu/~hamblen/UP3/UP3%20Reference%20Manual.pdf
Technology: SRAM
General description:
http://en.wikipedia.org/wiki/Static_Random_Access_Memory
General information on “programmable” devices:
http://www.tutorial-reports.com/computer-science/fpga/user-programmability.php
CYCLONE chips:
http://www.altera.com/literature/hb/cyc/cyc_c51002.pdf
2900-20,000 LE’s; 10 LE’s grouped into one LAB
Dedicated RAM blocks
Specs: page 2-2
LE architecture: page 2-6
Embedded memory specs: page 2-18
Global clock and up to 2 PLLs (UP3 clock: 48 MHz)
Device grades, operating conditions:
http://www.altera.com/literature/hb/cyc/cyc_c51004.pdf
Physical behavior of devices:
• operating conditions--recommended/absolute maximum
temperature
• gate delays
• pin voltage levels
• output loading
• power-supply management
• device programming erasure
• power evaluation: worksheet available at:
www.altera.com/support/devices/estimator/pow-powerplay.html
Reconfigurable computing:
We can create arrays of programmable devices
Programmability will allow us to change the
hardware capabilities "on the fly"--e.g., reprogram
some devices while others are being used for
processing; this allows us to "reconfigure" the
hardware to adapt to specific processing needs, just
as we can now rewrite software
example: Xilinx Virtex FPGA boards
One more useful Altera option:
note that the devices we have access to will allow us
to produce fairly "large" designs. To adequately test
these designs, we will need to input files of test
vectors rather than relying solely on inputting
waveforms (and we will need to do
HIERARCHICAL design AND testing)
A test vector file (myfile.vec) can be created in the
text editor. Here is an example file to test a module
with inputs A, B, RESET, and CLOCK and outputs
X,Y,Z.
[Figure: module block with inputs A, B, RESET, CLOCK and outputs X, Y, Z]
%test vector file for above module%
% units default to ns %
START 0 ;        % time to start simulation %
STOP 1000 ;      % time to end (in ns) %
INTERVAL 100 ;
INPUTS CLOCK ;
PATTERN          % pattern of clock values: CLOCK ticks every 100 ns %
01;
INPUTS A B ;
PATTERN          % test every combination of A and B; change A,B at given times %
0> 0 0
220> 1 0
320> 1 1
570> 0 1
720> 1 1
;
INPUTS RESET ;
PATTERN
0> 1
100> 0
;
OUTPUTS X Y Z ;
PATTERN % check output at every Clock pulse --these are expected values%
% relative time vector values %
=XXX
=000
=000
=100
=001
=001
=011
=011
=111
=111
=111
=111;
using the .vec file: open the simulator; then
on the "File" menu choose inputs/outputs;
then choose your .vec file; you must do this
BEFORE opening a .scf file
Note: results of the simulation cannot be
saved as a .vec file. To save your results,
save them as either a waveform (.vwf) or a
table output (.tbl) file.

The UP3 core library

input and output for the Altera board

random number generation
We will be covering material from
chapters 4,5,10,11 on I/O
I/O is the "hardest" type of module to
build, since it requires transition between
electrical domain and other energy
domains (e.g., mechanical, light)
We will also discuss a way to generate
pseudo random numbers (Appendix A)
Both I/O and random number generation
will probably be useful for your project.
UP3 functions:
an IP (intellectual property) core
described in chapter 5 of text
VHDL versions are available
8 modules--perform I/O “housekeeping” functions
use in project:
example VHDL package—UP3pack.vhd
modules must be “visible” in your path or included
in your design in some way (directly, package, etc.)
module names:
• Debounce (pushbuttons)
• OnePulse (pushbuttons)
• LCD_Display--LCD panel character display
• Clk_Div--gives slower clock speeds
• VGA_Sync--video sync generation (output)
• Char_ROM--codes for display characters (output)
• Keyboard--keyboard connector (input)
• Mouse--mouse connector (input)
COMPONENT LCD_Display
GENERIC (Num_Hex_Digits: INTEGER);
PORT (Hex_Display_Data:
IN STD_LOGIC_VECTOR ((Num_Hex_Digits*4)-1 DOWNTO 0);
reset, clock_48MHz: IN STD_LOGIC;
LCD_RS, LCD_E, LCD_RW: OUT STD_LOGIC;
DATA_BUS: INOUT STD_LOGIC_VECTOR (7 DOWNTO 0));
END COMPONENT;
input: 4-bit hex digit values, which are converted to ASCII hex digits and sent to the
LCD panel (note: Appendix D contains ASCII to hex table)
Num_Hex_Digits is a Generic parameter which can be given a value in a VHDL file
or in a schematic (16 characters, 2 lines available)
Outputs                  PIN (important!)
LCD_RS                   108
LCD_E                    50
LCD_RW                   73
DATA_BUS (7 DOWNTO 0)    113, 106, 104, 102, 100, 98, 96, 94
COMPONENT Debounce
PORT (pb, clk_100Hz:IN STD_LOGIC;
pb_debounced:OUT STD_LOGIC);
END COMPONENT;
pb is the input from a pushbutton (see I/O pins, chapter 2)
since pushbuttons have a mechanical “bounce”, this component samples the input
over several clock cycles and filters out the bounces; it will register the pushbutton
input only when several sequential samples of the input agree
the clock input is used by the bounce filter (see example below)
when “push” is registered, output goes low: it remains low until button is released
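The filtering idea can be sketched as a small behavioral model. This is a Python illustration under stated assumptions, not the UP3 core source: the sample count N_SAMPLES = 4 and the function name debounce are hypothetical, and the pushbutton is assumed to be active low (0 means pressed).

```python
# Behavioral sketch of a debounce filter (an assumption, not the UP3 core code).
# The pushbutton input is taken to be active low: 0 means "pressed".
N_SAMPLES = 4  # hypothetical number of consecutive samples that must agree


def debounce(samples, n=N_SAMPLES):
    """Return the debounced output for each sample taken on the slow clock."""
    history = [1] * n  # shift register of recent samples, starts "released"
    outputs = []
    for sample in samples:
        history = history[1:] + [sample]
        # The output goes low only when the last n samples are all low;
        # any high sample (a bounce or a release) returns it high.
        outputs.append(0 if not any(history) else 1)
    return outputs


if __name__ == "__main__":
    bouncy_press = [1, 0, 1, 0, 0, 0, 0, 0, 1, 1]
    print(debounce(bouncy_press))
```

With a 100 Hz sampling clock, four agreeing samples correspond to roughly 40 ms, which is why a slow clock is fed to the filter.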
COMPONENT OnePulse
PORT (PB_debounced, clock:IN STD_LOGIC;
PB_single_pulse:OUT STD_LOGIC);
END COMPONENT;
after the push button signal is “debounced”, this component can be used to ensure
that the output read from the pushbutton is high for only one clock cycle, no matter
how long the pushbutton is held down
this is useful for building finite state machines--an edge-triggered flip-flop can be
used to build a state and each input will be active for only one clock cycle
the “clock” input is the clock signal being used to drive the state machine
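A one-cycle pulse generator amounts to an edge detector: remember the previous input value and compare it with the current one. The Python sketch below models that behavior cycle by cycle. It assumes the debounced input goes low when the button is pushed; the function name one_pulse is illustrative, not taken from the library.

```python
# Behavioral sketch of a single-pulse generator (illustrative, not the UP3 core code).
# Assumes the debounced pushbutton signal is low while the button is held down.
def one_pulse(debounced):
    """Emit 1 for exactly one clock cycle each time the input falls from 1 to 0."""
    previous = 1  # value of the input seen on the previous clock edge
    pulses = []
    for current in debounced:
        pulses.append(1 if (previous == 1 and current == 0) else 0)
        previous = current
    return pulses


if __name__ == "__main__":
    held_for_three_cycles = [1, 1, 0, 0, 0, 1, 0]
    print(one_pulse(held_for_three_cycles))
```

However long the input stays low, only the first low cycle produces a pulse, so a state machine clocked by the same signal advances once per button push.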
COMPONENT Clk_Div
PORT (
clock_48MHz: IN STD_LOGIC;
clock_1MHz, clock_100KHz, clock_10KHz,
clock_1KHz, clock_100Hz, clock_10Hz, clock_1Hz:
OUT STD_LOGIC);
END COMPONENT;
the input is from the (48MHz) on-board clock (pin 29 for the Cyclone chip); JP3
jumper must be set to select the 48MHz USB clock; this is the default setting
the outputs are clock signals of various frequencies which can be used in designs
Note: actual frequency will be
(listed frequency)*(1.007 +/- .005%)
Example:
[Figure: pushbutton --> Debounce --> OnePulse --> fsm; Clock (pin 29) --> Clk_Div, whose outputs include Clock_100Hz (to Debounce) and Clock_1MHz]
COMPONENT Mouse
PORT ( clock_48Mhz,reset: IN STD_LOGIC;
mouse_data, mouse_clk:INOUT STD_LOGIC;
left_button,right_button: OUT STD_LOGIC;
mouse_cursor_row,mouse_cursor_column:
OUT STD_LOGIC_VECTOR(9 DOWNTO 0));
END COMPONENT;
the input is from the (48MHz) on-board clock (pin 29 for the Cyclone chip);
mouse_data is pin 13, mouse_clk is pin 12: BIDIRECTIONAL
(also used for keyboard)
cursor outputs give position in 640 x 480 pixel screen (VGA); cursor is initialized to
the middle of the screen
button outputs are high when the corresponding button is pushed
COMPONENT Keyboard
PORT
( keyboard_clk,keyboard_data, clock_48Mhz,
reset, read: IN STD_LOGIC;
scan_code: OUT STD_LOGIC_VECTOR(7 DOWNTO 0);
scan_ready: OUT STD_LOGIC);
END COMPONENT;
Reads PS/2 keyboard scan code; converts serial data from keyboard to parallel
clock input is from the (48MHz) on-board clock (pin 29 for the Cyclone chip);
keyboard_data is pin 13, keyboard_clk is pin 12: INPUTS
(also used for mouse)
read clears the scan_ready signal; reset clears flip-flops for serial-to-parallel conversion
scan_code: table of values in Table 11.3
--“make” code: key is hit; “break” code: key is released
ex: ‘A’ make = 1C, break = F0 1C; ‘shift’ make = 12, break = F0 12
(if key is held down, several makes will be sent before a break)
scan_ready goes high when a new scan code is sent and can be used to make sure each scan
code is read only once
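User logic that consumes scan codes has to pair each F0 byte with the code that follows it. The Python sketch below shows one way to do that for the simple make/break codes quoted above; the function name decode_scan_codes is hypothetical, and extended (multi-byte) key codes are deliberately not handled.

```python
# Sketch of make/break decoding for single-byte PS/2 scan codes (illustrative only).
BREAK_PREFIX = 0xF0  # a break is sent as F0 followed by the key's make code


def decode_scan_codes(codes):
    """Turn a stream of scan-code bytes into ("make" | "break", code) events."""
    events = []
    releasing = False  # True right after an F0 prefix has been seen
    for code in codes:
        if code == BREAK_PREFIX:
            releasing = True
            continue
        events.append(("break" if releasing else "make", code))
        releasing = False
    return events


if __name__ == "__main__":
    # shift down, 'A' down (repeated while held), 'A' up, shift up
    stream = [0x12, 0x1C, 0x1C, 0xF0, 0x1C, 0xF0, 0x12]
    for kind, code in decode_scan_codes(stream):
        print(kind, format(code, "02X"))
```

In hardware the same idea is a small state machine: one state bit remembers that F0 was the previous byte, and the read input is pulsed after each byte so that scan_ready can signal the next one.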
COMPONENT VGA_Sync
PORT (clock_48MHz, red, green, blue: IN STD_LOGIC;
red_out, green_out, blue_out,
horiz_sync_out, vert_sync_out: OUT STD_LOGIC;
pixel_row, pixel_column: OUT STD_LOGIC_VECTOR(9 DOWNTO 0));
END COMPONENT;
clock_48MHz signal must come from pin 29 (Cyclone chip)
user logic generates the input “color” (red, green, blue)
Cyclone chip:
horiz_sync --> pin 226, vert_sync --> pin 227
red_out --> pin 228, green_out --> pin 122, blue_out --> pin 170
pixel_row and pixel_column give the pixel address
how many colors are available? how many pixels?
(“dithering”: one color on odd cycles, a different color on even cycles --> twice as many colors)
[Figure: example of pattern sent (even/odd cycles) and pattern observed]
COMPONENT Char_ROM
PORT (clock: IN STD_LOGIC;
character_address: IN STD_LOGIC_VECTOR (5 DOWNTO 0);
font_row, font_col: IN STD_LOGIC_VECTOR (2 DOWNTO 0);
row_mux_output: OUT STD_LOGIC);
END COMPONENT;
generates text for a video display--each character requires an 8 x 8 pixel pattern (see
codes, table 9.1--a memory initialization file, tcgrom.mif, is provided; the font data
can be stored in one M4K memory block)
character_address addresses the character to be displayed
font_row and font_col step through the 64 pixels (8x8) needed to display one
character
Clock loads the address register and should be tied to the video pixel_clock
row_mux_output is the pixel value to be output for this character at this position and
can be used to generate the correct RGB pixel color
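The addressing can be illustrated with a small software model. The Python sketch below assumes a particular layout that the notes do not state: each ROM word holds one 8-pixel font row, the word address is the 6-bit character address followed by the 3-bit font_row, and column 0 is the leftmost (most significant) bit. The glyph is a made-up pattern, not data from tcgrom.mif.

```python
# Software sketch of a character-ROM lookup (layout and glyph are assumptions).
GLYPH_ROWS = [  # hypothetical 8 x 8 pattern for one character code
    0b00011000,
    0b00111100,
    0b01100110,
    0b01111110,
    0b01100110,
    0b01100110,
    0b01100110,
    0b00000000,
]


def build_rom(glyphs):
    """Pack {character_address: eight row bytes} into a flat 512-word ROM."""
    rom = [0] * (64 * 8)  # 64 characters x 8 rows of 8 bits = 4096 bits
    for character_address, rows in glyphs.items():
        for font_row, bits in enumerate(rows):
            rom[(character_address << 3) | font_row] = bits
    return rom


def row_mux_output(rom, character_address, font_row, font_col):
    """Return the single pixel bit for this character at (font_row, font_col)."""
    word = rom[(character_address << 3) | font_row]
    return (word >> (7 - font_col)) & 1


if __name__ == "__main__":
    rom = build_rom({1: GLYPH_ROWS})
    for row in range(8):
        print("".join("#" if row_mux_output(rom, 1, row, col) else "." for col in range(8)))
```

The total of 64 x 8 x 8 = 4096 bits is consistent with the note that the font fits in one M4K memory block.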
How does output occur (examples: chapter 10):
monitor contains CRT (cathode ray tube)
screen consists of pixels, 640 in a row and 480 in a column (VGA format)
“refresh rate”: how quickly these pixels are scanned
standard rate is 60 times / second (60 Hz)
(human eye can detect “flicker” below 30Hz)
if there are 640 X 480 pixels, with a 60Hz refresh rate, how much time is available
to scan one pixel?
What clock speed is therefore required?
What is the onboard clock speed?
(note: UP3 has PLL which can be used to obtain faster refresh rates)
Sync signals tell the monitor when to start a new row (horizontal sync) or a new frame (vertical sync)
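The timing questions above reduce to simple arithmetic, sketched here in Python. The function name pixel_budget is made up for illustration, and the calculation counts visible pixels only. A real monitor also needs blanking time for retrace, which is why the standard 640 x 480, 60 Hz mode uses a pixel clock of about 25 MHz rather than the lower bound computed here (that last fact comes from the VGA standard, not from these notes).

```python
# Back-of-the-envelope VGA timing (visible pixels only; blanking is ignored).
def pixel_budget(columns=640, rows=480, refresh_hz=60):
    """Return (pixels per frame, ns available per pixel, minimum pixel clock in MHz)."""
    pixels = columns * rows
    pixel_rate_hz = pixels * refresh_hz
    return pixels, 1e9 / pixel_rate_hz, pixel_rate_hz / 1e6


if __name__ == "__main__":
    pixels, ns_per_pixel, min_clock_mhz = pixel_budget()
    print("pixels per frame:", pixels)
    print("time per pixel (ns):", round(ns_per_pixel, 2))
    print("minimum pixel clock (MHz):", round(min_clock_mhz, 3))
    # One bit each for red, green, and blue gives 2**3 basic colors.
    print("colors with 1-bit R, G, B:", 2 ** 3)
```

The 48 MHz on-board clock is comfortably above the lower bound, so a divided or PLL-derived clock can serve as the pixel clock.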
random number generation (Appendix A):
actually generates “pseudorandom” numbers
Q: what is the difference?
Method: example: n = 32--will give 32-bit pseudorandom sequence of bits
from table, read “XOR from bits 32,22,2,1” (bits are 32--1, not 31--0)
build a 32-bit shift register that shifts left one bit per cycle
next bit to be input into lsb should be the XOR of bits 32,22,2,1
this will generate a sequence in “pseudorandom order”
initial value in the register is the “seed”; 0 should not be used (why?)
Example:
n = 3--table gives bits 3,2
step    pattern    (bit 3) xor (bit 2)
0       111        0
1       110        0
2       100        1
3       001        0
4       010        1
5       101        1
6       011        1
7       111        0---from here, the sequence will repeat
we have a sequence of the numbers 1-7: 7,6,4,1,2,5,3
this is the longest nonrepeating sequence we can have
order will always be the same, seed only determines where we start
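The shift-register method can be checked with a short software model. The Python sketch below follows the description in these notes (bits numbered n down to 1, shift left, XOR of the tap bits fed into the lsb); the function name lfsr_sequence and its parameters are illustrative. It reproduces the n = 3 example and shows what happens with a seed of 0.

```python
# Software model of the shift-register pseudorandom generator described above.
def lfsr_sequence(n, taps, seed, steps):
    """Return the register values, starting with the seed.

    Bits are numbered n..1 (bit n is the msb). Each step shifts left by one
    and feeds the XOR of the tap bits into the lsb.
    """
    mask = (1 << n) - 1
    state = seed & mask
    values = [state]
    for _ in range(steps):
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (tap - 1)) & 1
        state = ((state << 1) | feedback) & mask
        values.append(state)
    return values


if __name__ == "__main__":
    print(lfsr_sequence(3, (3, 2), 0b111, 7))  # the n = 3 example
    print(lfsr_sequence(3, (3, 2), 0b000, 3))  # a seed of 0 never changes
```

For the 32-bit case the same function can be called with n = 32 and taps (32, 22, 2, 1); a different nonzero seed only changes the starting point within the same cycle.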
How good are the random numbers generated?
Reference: Shruthi Narayanan, M.S. 2005, ATI Technologies
Hardware implementation of genetic algorithm modules for
intelligent systems:
[Figures: random numbers generated by one shift register; random numbers generated by multiple shift registers]