Transcript Document

Introduction to PLD.
Presented by:
‫בס''ד‬
Contents:






7/18/2015
Idea;
History;
High-Capacity PLD Architecture & Overview of ALTERA PLDs;
Computer-aided design (CAD) flow for PLDs;
Introduction to VHDL/Verilog HDL;
Getting started.
Idea – Definition.
Today, chips fall into three groups:
• ASICs (Application-Specific Integrated Circuits);
• Chips with a hardware realization of data-processing algorithms (microprocessors & microcontrollers);
• FPGAs & CPLDs.

Idea – Definition.
A programmable logic device can be defined as an integrated circuit containing configurable logic and/or storage elements, which are linked together using programmable interconnect.
In general, the following resources are distinguished:
– Logic;
– Interconnect;
– I/O.

Idea – Details.
• One or more resources are configurable;
• Reprogrammable or in-system programmable (ISP):
– Reduces the design and debug cycle
– Allows field upgrades of existing logic systems
– Provides a small amount of fault tolerance
– Raises the possibility of reconfigurable computing, but there are still many problems to be solved before this is realized
• Nowadays all resources are configurable;
• Special structures are added to the devices, like RAMs, DLLs, …

Idea - Advantages of PLDs.
The programmability offers:
• Short development time;
• Short turnaround time;
• Rapid prototyping;
• Flexibility with respect to engineering change orders;
• Savings in board space;
• Small NRE (non-recurring engineering) costs.
History – PLA.
The first device developed specifically for implementing logic circuits was the PLA (programmable logic array).
[Figure: PLA with inputs A, B, C; programmable switches or fuses in an AND plane that feeds an OR plane; outputs f1 and f2, each a sum of product terms.]
A PLA consists of two levels of logic gates:
• a programmable “wired” AND-plane;
• a programmable “wired” OR-plane.

History – PAL.
[Figure: PAL with inputs A, B, C; programmable switches or fuses in an AND plane that feeds fixed OR gates; outputs f1 and f2.]
PALs feature only a single level of programmability, consisting of a programmable “wired” AND-plane that feeds fixed OR-gates.

History – SPLD.
[Figure: PAL-like output stage with inputs A, B, C, an AND plane, a D flip-flop (Clock), a MUX (Select), an output Enable, and output f1.]
All small PLDs, including PLAs, PALs, and PAL-like devices, are grouped into a single category called Simple PLDs (SPLDs), whose most important characteristics are low cost and very high pin-to-pin speed-performance.

History – CPLD.
[Figure: CPLD with several PLD blocks, each with its own I/O block, connected through a central interconnection matrix.]
The way to provide large-capacity devices is to integrate multiple SPLDs onto a single chip and provide interconnect to programmably connect the SPLD blocks together. PLDs that have this basic structure are referred to as Complex PLDs (CPLDs).

History - FPGA.
[Figure: FPGA with an array of logic blocks, interconnection switches, and I/O on all four sides.]
FPGAs comprise an array of uncommitted circuit elements, called logic blocks, and interconnect resources. FPGA configuration is performed through programming by the end user.
FPGAs are one type of PLD that supports very high logic capacity, and they have been responsible for a major shift in the way digital circuits are designed.

High-Capacity PLDs Architecture.
Difference between FPGAs and CPLDs
CPLD:
• CPLDs often consist of a limited set of complex reconfigurable blocks.
• A Complex Programmable Logic Device consists of multiple SPLD blocks that are interconnected to realize larger digital systems.
FPGA:
• FPGAs are configured at a fine-grain level from many equivalent logic blocks or an array of configurable gates.
• A Field-Programmable Gate Array has narrower logic choices and more memory elements. A LUT (lookup table) may replace actual logic gates.

High-Capacity PLDs Architecture.
Difference between FPGAs and CPLDs
[Figure: side-by-side block diagrams. CPLD: PLD blocks and I/O blocks around an interconnection matrix. FPGA: logic array surrounded by I/O.]

History - important terminology.
Definitions of Relevant Terminology:
• CPLD: a Complex PLD that consists of an arrangement of multiple SPLD-like blocks on a single chip;
• FPGA: a Field-Programmable Gate Array is a PLD featuring a general structure that allows very high logic capacity;
• Interconnect: the wiring resources in a PLD;
• Programmable Switch: a user-programmable switch that can connect a logic element to an interconnect wire, or one interconnect wire to another;
• Logic Cell
– The basic building block of an Altera device
• Macrocell
– The basic building block of a product-term-based device (MAX9000, MAX7000)

History - important terminology.
• Logic Element
– The basic building block of a look-up-table-based device (FLEX10K, FLEX8000, FLEX6000)
• Logic Array Block (LAB)
– A collection or group of logic cells
• Logic Capacity: the capacity of a PLD is measured by the size of a gate array with comparable logic capacity. It can be thought of as a “number of 2-input NAND gates”;
• Logic Density: the amount of logic blocks per unit area in a PLD.
• Speed-Performance: measures the maximum operable speed of a circuit when implemented in a PLD.

History - Overview.
1967: Fairchild’s “Micromosaic”
Early 70s: first appearance of PLDs
Late 70s: arrival of CPLDs
1985: first SRAM-based devices
Late 80s: arrival of FPGAs
1991: ISP (Lattice)
1998: first 1-million-gate device
2000: first 3-million-gate device
2005: first 10-million-gate device
The chart summarizes the categories of FPDs by listing the logic capacities available in each of the three categories.
FPGA - Generic Structure
FPGA building blocks:
• Programmable logic blocks
Implement combinatorial and sequential logic. Based on programmable logic (LUT) and a DFF. Look-up tables are made from small RAM cells. Programmable logic blocks can also be used as small memory blocks.
• Programmable interconnect
Wires that connect inputs and outputs to logic blocks. Programmable interconnects use switching matrices. Several types of interconnect:
– clocks,
– short-distance local connections,
– long-distance connections across the chip.
• Programmable I/O blocks
Special logic blocks at the periphery of the device for external connections. I/O buffers support various voltages and a tri-state option.
[Figure: array of logic blocks with interconnection switches, surrounded by I/O.]

High-Capacity PLDs Architecture.
Today’s FPGA Devices Meet Embedded System Requirements
• Embedded RAM
• Wide range of fast I/O
• High-performance Digital Signal Processing (DSP) blocks
• Abundant logic
• Substantial embedded memory
• Low-cost FPGA and Structured ASIC families
• Soft processor cores

ALTERA Cyclone II FPGA
Overview
Highest Density
• 68K Logic Elements (700K ASIC Gates)
Highest Performance
High-Performance DSP
• 250 MHz Performance
Embedded Systems
• Software Nios II Processor

ALTERA Cyclone II FPGA
Architecture
Features:
• 90-nm low-k dielectric process
• High-density architecture with 4,608 to 68,416 LEs
• Up to 1.1 Mbits of RAM
– True dual-port operation (one read and one write, two reads, or two writes) for x1, x2, x4, x8, x16 and x18 modes
– Variable port configurations (x1, x2, x4, x8, x16, x32 and x36)
– Up to 260 MHz operation
• Embedded multipliers
• Advanced I/O support
• Flexible clock management circuitry
– Hierarchical clock network for up to 402.5 MHz
– Up to 4 PLLs
– Up to 16 global clock lines

Overview of ALTERA Cyclone II Device
[Figure: device floorplan with the following labeled regions.]
• Logic Array
• M4K Memory Blocks
• Embedded Multipliers
• Phase-Locked Loops
• Top & Bottom I/O Elements with Support for Memory Interfaces
• Side I/O Elements with Support for PCI/PCI-X & Memory Interfaces

High-Capacity FPDs Architecture.
Configurable logic
[Figure: a LUT drawn as a 4-to-1 MUX with data inputs D0 to D3 holding the values 1, 1, 0, 1 and select inputs S0, S1; the MUX output Y feeds a D flip-flop with Clock and CLR.]
• Logic resources are arranged in arrays or in rows.
• Logic blocks hold a large or small amount of logic (fine-grained architecture).

Look-Up Tables (LUT)
• A LUT with N inputs can be used to implement any combinatorial function of N inputs.
• The LUT is programmed with the truth table.
Truth table:
A B C D | Z
0 0 0 0 | 0
0 0 0 1 | 1
0 0 1 0 | 1
0 0 1 1 | 1
0 1 0 0 | 0
0 1 0 1 | 1
0 1 1 0 | 1
0 1 1 1 | 1
1 0 0 0 | 0
1 0 0 1 | 1
1 0 1 0 | 1
1 0 1 1 | 1
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
[Figure: the same function shown as a LUT implementation (inputs A, B, C, D; output Z) and as a gate implementation.]
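The idea that a LUT is simply a small memory addressed by its inputs can be sketched in VHDL. The entity name `lut4_example` and the generic `INIT` are hypothetical names chosen for this illustration, and `ieee.numeric_std` is used here rather than the packages shown later in the deck. The rows of the truth table above match Z = NOT (A AND B) AND (C OR D); the entry for A = B = C = D = 1 is not shown in the table and is assumed to be '0'.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral model of a 4-input LUT: INIT plays the role of the
-- configuration memory, and the inputs select one of its 16 bits.
entity lut4_example is
    generic (
        -- Bit i of INIT is the output for input value i = 8*A + 4*B + 2*C + D.
        -- Entry 15 is an assumption (row not present in the table).
        INIT : std_logic_vector(15 downto 0) := "0000111011101110"
    );
    port (
        A, B, C, D : in  std_logic;
        Z          : out std_logic
    );
end entity lut4_example;

architecture rtl of lut4_example is
    signal addr : std_logic_vector(3 downto 0);
begin
    -- A is the most significant address bit, D the least significant.
    addr <= A & B & C & D;
    -- Read the selected configuration bit.
    Z <= INIT(to_integer(unsigned(addr)));
end architecture rtl;
```

Changing only the `INIT` constant turns the same structure into any other function of four inputs, which is the point made by the first bullet above.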

LUT Implementation
• Example: 3-input LUT
• Based on multiplexers (pass transistors)
• LUT entries are stored in configuration memory cells
[Figure: eight configuration memory cells (each holding 0/1) feed a multiplexer tree controlled by X1, X2 and X3; the output is F.]

ALTERA Cyclone II
• Look-up table (LUT) to implement combinatorial logic
• Register for sequential circuits
• Additional logic:
– Carry logic for arithmetic functions
– Expansion logic for functions requiring more than 4 inputs
[Figure: Logic Element with inputs In1 to In4 and Clock, carry inputs In0 and In1, LUT chain and register chain connections, a LUT and a register (REG), outputs to local and general routing, and carry outputs Out0 and Out1.]
The basic logic block, called a Logic Element (LE), contains a four-input LUT, a flip-flop, and special-purpose carry circuitry for arithmetic circuits. The LE also includes cascade circuitry that allows efficient implementation of wide AND functions.

Logic Element: Normal Mode
[Figure: LE in normal mode. Inputs addnsub, cin, data1 to data4, LUT chain input, register chain input and register control signals; a 4-input LUT followed by sync load & clear logic and a register; outputs to row, column & DirectLink routing, local routing, LUT chain output and register chain output, with register feedback.]
The normal mode is suitable for general logic applications and wide decoding functions that can take advantage of a cascade chain. In normal mode, four data inputs from the LAB local interconnect and the carry-in signal are the inputs to a 4-input LUT.

Logic Element: Dynamic Arithmetic Mode
[Figure: LE in dynamic arithmetic mode. LAB carry-in, Carry-In0 and Carry-In1 feed the carry-in logic; addnsub, data1, data2 and data3 feed a sum calculator and a carry calculator; the sum passes through sync load & clear logic to the register; carry-out logic drives Carry-Out0 and Carry-Out1.]
The arithmetic mode offers two 3-input LUTs that are ideal for implementing adders, accumulators, and comparators. One LUT provides a 3-input function; the other generates a carry bit.

Dynamic Arithmetic Mode
Full adder
S = (A xor B) xor Ci
Co = (A * B) + (Ci * (A xor B))
# | A B Ci | Co S
0 | 0 0 0  | 0  0
1 | 0 0 1  | 0  1
2 | 0 1 0  | 0  1
3 | 0 1 1  | 1  0
4 | 1 0 0  | 0  1
5 | 1 0 1  | 1  0
6 | 1 1 0  | 1  0
7 | 1 1 1  | 1  1
[Figure: full-adder block with inputs A, B, Ci and outputs S, Co.]
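The two full-adder equations above translate directly into concurrent VHDL assignments. This is a minimal sketch; the entity name `full_adder` is chosen for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- One-bit full adder written straight from the slide's equations.
entity full_adder is
    port (
        A, B, Ci : in  std_logic;
        S, Co    : out std_logic
    );
end entity full_adder;

architecture dataflow of full_adder is
begin
    -- Sum: S = (A xor B) xor Ci
    S  <= (A xor B) xor Ci;
    -- Carry out: Co = (A and B) or (Ci and (A xor B))
    Co <= (A and B) or (Ci and (A xor B));
end architecture dataflow;
```

Each output is a function of three inputs, which is why one LE in arithmetic mode (two 3-input LUTs) can hold one adder bit.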

Logic Element: Dynamic Arithmetic Mode
[Figure: the LE in dynamic arithmetic mode, annotated with the full-adder signals A, B, Ci, Co and S.]

Counters
Ripple counter
[Figure: three D flip-flops with outputs D0, D1, D2 and a common Reset; clk drives only the first flip-flop, and each following flip-flop is clocked from the previous stage.]
Ripple counters are not recommended in FPGA designs because of their asynchronous nature.
Synchronous design: ripple-carry counter
[Figure: three D flip-flops with outputs D0, D1, D2 sharing clk and Reset; a carry chain runs from CIN to COUT.]
D0 = Q0 xor Cin
C0 = Q0 and Cin
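A synchronous counter of the kind recommended above can be sketched as follows. The entity name `sync_counter`, the generic `WIDTH`, the active-high asynchronous reset and the use of `ieee.numeric_std` are assumptions made for this illustration; every flip-flop shares the same clock, and the carry chain is left to the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sync_counter is
    generic (WIDTH : positive := 3);
    port (
        clk   : in  std_logic;
        reset : in  std_logic;  -- asynchronous, active high (assumed polarity)
        cin   : in  std_logic;  -- carry-in, acts as count enable
        q     : out std_logic_vector(WIDTH - 1 downto 0);
        cout  : out std_logic
    );
end entity sync_counter;

architecture rtl of sync_counter is
    constant ALL_ONES : unsigned(WIDTH - 1 downto 0) := (others => '1');
    signal count : unsigned(WIDTH - 1 downto 0);
begin
    process (clk, reset)
    begin
        if reset = '1' then
            count <= (others => '0');
        elsif rising_edge(clk) then
            -- All bits update on the same clock edge.
            if cin = '1' then
                count <= count + 1;
            end if;
        end if;
    end process;

    q <= std_logic_vector(count);
    -- Carry out is asserted when every bit is '1' and carry-in is '1'.
    cout <= cin when count = ALL_ONES else '0';
end architecture rtl;
```

For bit 0 the `+ 1` reduces to the slide's equations: the next value is Q0 xor Cin and the carry into bit 1 is Q0 and Cin.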

Dynamic Arithmetic Mode
Ripple-carry counter
[Figure: one stage of the synchronous ripple-carry counter (D flip-flop with clk and Reset, CIN and COUT) mapped onto an LE in dynamic arithmetic mode, using the sum calculator, the carry calculator and the carry-in/carry-out logic.]

Dynamic Arithmetic Mode
Comparator
A = B if A(0) = B(0) and A(1) = B(1) … and A(n-1) = B(n-1)
or
AeqB = (A(0) xnor B(0)) and (A(1) xnor B(1)) and … etc.
[Figure: two 3-bit implementations taking A(2..0) and B(2..0) and producing “A eq B”.]
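The AeqB expression above can be written for any width with a loop over the bits. This is a sketch only; the entity name `comparator_eq` and the generic `N` are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity comparator_eq is
    generic (N : positive := 3);
    port (
        A, B : in  std_logic_vector(N - 1 downto 0);
        AeqB : out std_logic
    );
end entity comparator_eq;

architecture rtl of comparator_eq is
begin
    process (A, B)
        variable eq : std_logic;
    begin
        -- Start from '1' and AND in the XNOR of each bit pair.
        eq := '1';
        for i in A'range loop
            eq := eq and (A(i) xnor B(i));
        end loop;
        AeqB <= eq;
    end process;
end architecture rtl;
```

Whether the tool builds this as a chain through the carry logic or as a tree of LUTs is a synthesis decision and is not fixed by the code.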

Logic Array Blocks (LAB)
Each LAB contains:
• 16 LEs
• Local interconnect
• LAB control signals
• LE carry chains
• Register chains
Control signals:
• 2 CLK
• 2 CLK EN
• 2 ACLR
• 1 SCLR
• 1 SLOAD
[Figure: LAB with LE1 to LE16 on a fast local interconnect fed by 30 LAB input lines and 10 LE feedback lines; direct link interconnect from the left and right LAB, M4K memory block, embedded multiplier, PLL or IOE output; direct link interconnect to the left and to the right (up to 48 LEs each).]

High-Capacity PLDs Architecture.
Several types of configurable interconnects
Switch matrix
• 6 pass transistors per switch matrix interconnect point
• Pass transistors act as programmable switches
• Pass transistor gates are driven by configuration memory cells
[Figure: a switch matrix interconnect point before programming and after programming.]

Programmable Interconnect
Interconnect hierarchy (not shown)
– Fast local interconnect
– Horizontal and vertical lines of various lengths
[Figure: LEs connected through switch matrices.]

High-Capacity PLDs Architecture.
Several types of I/O
Configurability of the user I/O varies to a great extent. There are dedicated pins for power and configuration, and some of the user I/O pins may be reserved for special functions during configuration.
– In/out/tri-state;
– Flip-flops, latches;
– Pull-up/pull-down;
– DDR;
– Series resistors;
– Bus keeper;
– Drive strength control;
– Slew rate control;
– Single-ended/differential.

High-Capacity FPDs Architecture.
Several types of I/O
[Figure: I/O cell with an output flip-flop (from array, Enable, Clock) and an input flip-flop (to array, Clock).]
• Several low-voltage I/O standards;
• Mixed-voltage I/O bank capability;
• Delay lines;
• Boundary scan (JTAG);
• Differential I/O.

Basic I/O Block Structure
[Figure: I/O block. Output path: a three-state control register and an output register (D/Q, Clock) drive a three-state output buffer. Input path: direct input and registered input (D/Q).]

I/O Bank Numbers & Locations
[Figure: I/O bank numbers and locations on the device.]

High-Capacity FPDs Architecture.
Special structures
1. On-chip RAMs and ROMs:
– Nearly all vendors offer devices with on-chip RAM blocks;
– RAM blocks may be cascaded;
– RAM blocks can be configured in different ways (single ported, dual ported, synchronous, asynchronous, CAM).
2. Clock management: on-chip DLLs or PLLs:
– High-end devices have up to 8 DLLs/PLLs on chip;
– Used to deskew on/off-chip clock signals (e.g. to RAM banks);
– Provide clock division and multiplication capabilities;
– DLLs have a minimum operating frequency.
3. DSP options and applications:
• On-chip multipliers;
• On-chip MACs.
4. On-chip microprocessor cores;
5. Support for various interface standards;
6. High-speed serial I/Os;
7. Boundary scan/JTAG.

Cyclone II Embedded Memory
• 4-Kbit blocks
– 250-MHz performance
– Fully synchronous
– True dual-port mode
– Simple dual-port mode
– Single-port memory
• Flexible capabilities
– Mixed-clock mode
– Mixed-width mode
– Shift register mode
– Read-only mode
– Byte enables
– Initialization support
[Figure: memory block with Port A and Port B, each with DATA, ADDR, WREN, CLK, CLKENA, OUT and CLR.]
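A fully synchronous simple dual-port memory of the kind listed above is usually described behaviorally and left to the synthesis tool to map onto a memory block. The sketch below assumes a 512 x 8 organization (4 Kbit); the entity and port names are hypothetical, and whether a given tool actually infers an embedded block from this description depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity simple_dual_port_ram is
    generic (
        ADDR_WIDTH : positive := 9;  -- 2**9 = 512 words
        DATA_WIDTH : positive := 8   -- 512 x 8 = 4 Kbit
    );
    port (
        clk     : in  std_logic;
        wren    : in  std_logic;
        wr_addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
        wr_data : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
        rd_addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
        rd_data : out std_logic_vector(DATA_WIDTH - 1 downto 0)
    );
end entity simple_dual_port_ram;

architecture rtl of simple_dual_port_ram is
    type ram_t is array (0 to 2**ADDR_WIDTH - 1)
        of std_logic_vector(DATA_WIDTH - 1 downto 0);
    signal ram : ram_t;
begin
    process (clk)
    begin
        if rising_edge(clk) then
            -- Write port
            if wren = '1' then
                ram(to_integer(unsigned(wr_addr))) <= wr_data;
            end if;
            -- Read port: registered (synchronous) read
            rd_data <= ram(to_integer(unsigned(rd_addr)));
        end if;
    end process;
end architecture rtl;
```

Changing the two generics gives the other organizations of the same capacity, for example 4096 x 1 or 256 x 16.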

Block RAM Port Aspect Ratios
Port width | Organization
1    | 4k x 1
2    | 2k x 2
4    | 1k x 4
8    | 512 x 8
8+1  | 512 x (8+1)
16   | 256 x 16
16+2 | 256 x (16+2)
32   | 128 x 32
32+4 | 128 x (32+4)

Overview of ALTERA PLDs.
Embedded Array Block
• EAB size is flexible
• Combine EABs to create larger blocks

Global Clock Network & Phase-Locked Loops
• Clock management is important within digital systems design
– High-speed designs require low-latency, low-skew clock solutions
  • Low latency: a minimum propagation delay time throughout the device
  • Low skew: a minimum difference between actual clock edges as seen at various points on the device
– Sources for clock skew?
• Cyclone II devices provide the following for clock management
– A global clock network
– Up to four phase-locked loops (PLLs)

PLLs and global clock network
Parameter | Cyclone II PLL
Input frequency range | 11 to 311 MHz
Output frequency range | 10 to 402.5 MHz
Time to lock from power-up | 1 ms
VCO operating range | 300 MHz to 1 GHz
Global clock network (GCLK):
• 16 total nets
• Used as clock sources for all device blocks
• Fed by
– Dedicated clock pins
– PLL outputs
– Internal logic
[Figure: device with PLLs 1 to 4 in the corners, clock pins CLK[3..0], CLK[7..4], CLK[11..8] and CLK[15..12], and the GCLK network.]

Phase-Locked Loops (PLLs)
• A PLL is a closed-loop feedback control system that maintains a generated signal in a fixed phase relationship to a reference signal
• Applications include:
– Frequency synthesizers for digitally tuned radio receivers and transmitters
– FM and AM radio signal demodulation
– Clock multipliers in digital systems that allow internal elements to run faster (or slower) than external connections, while maintaining precise timing relationships (our basic application in this course)
• Cyclone II PLLs provide general-purpose clocking with clock multiplication and phase shifting as well as outputs for differential I/O support

Cyclone II PLL Details
Basic PLL Operation
[Figure: PLL block diagram. The reference clock passes through the N counter to the PFD; the PFD drives the charge pump (CP), loop filter (LF) and VCO; the M counter closes the feedback loop; post-scale counters G0, G1 and EG drive the global clock network and an I/O buffer; a lock-detect block monitors the loop.]
• The main purpose of a PLL is to synchronize the phase and frequency of a voltage-controlled oscillator (VCO) to an input reference clock
• A number of components make up a PLL to achieve this phase alignment
– The PLL compares the rising edge of the reference input clock to a feedback clock using a phase-frequency detector (PFD)
– The PFD produces an up or down signal that determines whether the VCO needs to operate at a higher or lower frequency
– The PFD output is applied to a charge pump and loop filter, which produces a control voltage for setting the frequency of the VCO
  • If the PFD transitions the up signal high, then the VCO frequency increases
  • If the PFD transitions the down signal high, then the VCO frequency decreases

Cyclone II PLL Details
Basic PLL Operation
• The loop filter converts these up/down signals to a voltage that is used to bias the VCO
– If the charge pump receives a logic high on the up signal, current is driven into the loop filter
– If the charge pump receives a logic high on the down signal, current is drawn from the loop filter
• The voltage from the charge pump determines how fast the VCO operates
• The VCO is implemented as a four-stage differential ring oscillator
– A divide counter, m, is inserted in the feedback loop to increase the VCO frequency above the input reference frequency, making the VCO frequency
fVCO = m × fREF
• The feedback clock, fFB, applied to one input of the PFD, is locked to the input reference clock, fREF (fIN/n), applied to the other input of the PFD
• The VCO output can feed up to three post-scale counters (c0, c1, c2)
– These post-scale counters allow a number of harmonically related frequencies to be produced by the PLL
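Putting the relations above together gives the output frequency of each post-scale counter. The worked numbers below are illustrative assumptions, not values from the slides; they are chosen only so that the VCO stays inside the 300 MHz to 1 GHz operating range quoted earlier.

```latex
\begin{align*}
f_{REF} &= \frac{f_{IN}}{n}, &
f_{VCO} &= m \cdot f_{REF} = \frac{m}{n}\, f_{IN}, &
f_{c_i} &= \frac{f_{VCO}}{c_i} = \frac{m}{n \cdot c_i}\, f_{IN} \\[6pt]
\text{Example: } f_{IN} &= 50~\text{MHz},\; n = 1,\; m = 12
  &\Rightarrow\; f_{VCO} &= 600~\text{MHz} \\
c_0 &= 6 &\Rightarrow\; f_{c_0} &= 100~\text{MHz} \\
c_1 &= 4 &\Rightarrow\; f_{c_1} &= 150~\text{MHz}
\end{align*}
```

Both example outputs derive from the same VCO frequency, which is what "harmonically related" means here.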

Cyclone II Embedded Multiplier
[Figure: embedded multiplier with 18-bit inputs X and Y, Sign_X and Sign_Y controls, input registers, a 36-bit product, output registers, Clock and Clear.]
250-MHz performance
Note: Fastest speed grade with registers activated in 18x18 or 9x9 mode
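A registered 18x18 multiplier matching the structure in the figure (input registers, multiplier, output registers) might be described as below. This is a sketch under assumptions: the entity name `mult18x18` is hypothetical, both operands are treated as signed (the Sign_X/Sign_Y controls are not modeled), `clear` is assumed to be asynchronous and active high, and whether a tool maps the description onto an embedded multiplier depends on the tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mult18x18 is
    port (
        clk   : in  std_logic;
        clear : in  std_logic;  -- asynchronous, active high (assumed)
        x, y  : in  std_logic_vector(17 downto 0);
        p     : out std_logic_vector(35 downto 0)
    );
end entity mult18x18;

architecture rtl of mult18x18 is
    signal x_reg, y_reg : signed(17 downto 0);
    signal p_reg        : signed(35 downto 0);
begin
    process (clk, clear)
    begin
        if clear = '1' then
            x_reg <= (others => '0');
            y_reg <= (others => '0');
            p_reg <= (others => '0');
        elsif rising_edge(clk) then
            -- Input registers
            x_reg <= signed(x);
            y_reg <= signed(y);
            -- Output register: 18-bit x 18-bit gives a 36-bit product
            p_reg <= x_reg * y_reg;
        end if;
    end process;

    p <= std_logic_vector(p_reg);
end architecture rtl;
```

The product appears two clock cycles after the operands are applied, one cycle for each register stage.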

High-Capacity FPDs Architecture.
Programming technologies
An EEPROM or EPROM transistor is used as a programmable switch for CPLDs (and also for many SPLDs) by placing the transistor between two wires in a way that facilitates implementation of wired-AND functions.

High-Capacity FPDs Architecture.
Programming technologies
FPGA products are based on either SRAM or antifuse technologies. The example of SRAM-controlled switches shows two applications of SRAM cells: controlling the gate nodes of pass-transistor switches, and controlling the select lines of multiplexers that drive logic block inputs.
Computer aided design (CAD) flow for PLD
CAD tools are important not only for complex devices like CPLDs and FPGAs, but also for SPLDs. A typical CAD system for PLDs includes software for the following tasks: initial design entry, logic optimization, device fitting, simulation, and configuration.
[Figure: CAD flow with the stages text entry, schematic capture, merge & translate, optimize equations, device fitter, simulate, configuration file, programming unit and PLD, plus hardware (board) simulation & test on an evaluation board.]

Quartus II Development System
• Fully integrated design tool
• Multiple design entry methods
• Logic synthesis
• Place & route
• Simulation
• Timing & power analysis
• Device programming

Computer aided design (CAD) flow for PLD
Text entry and schematic capture
Merge & translate
[Figure: netlist of LUT0 to LUT5 with flip-flops FF1 and FF2.]
Optimize equations
[Figure: FPGA.]
Device fitter: placing & routing
[Figure: FPGA with programmable connections.]
Simulate PLD

VHDL Coding for Synthesis
Non-synthesizable VHDL
Delays:
Delays are not synthesizable. Statements such as
    wait for 5 ns ;
    a <= b after 10 ns ;
will not produce the required delay and should not be used in code intended for synthesis.
Initializations:
Declarations of signals (and variables) with initialized values, such as
    SIGNAL a : STD_LOGIC := '0' ;
are not portably synthesizable and thus should be avoided. If present, they may be ignored by the synthesis tools.
Use set and reset signals instead.
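The advice to use a reset instead of a declaration-time initial value can be sketched as follows. The entity name `reg_with_reset` and the active-high asynchronous reset are assumptions made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg_with_reset is
    port (
        clk   : in  std_logic;
        reset : in  std_logic;  -- asynchronous, active high (assumed polarity)
        d     : in  std_logic;
        q     : out std_logic
    );
end entity reg_with_reset;

architecture rtl of reg_with_reset is
begin
    process (clk, reset)
    begin
        if reset = '1' then
            -- Known starting value comes from the reset, not from ":= '0'"
            q <= '0';
        elsif rising_edge(clk) then
            q <= d;
        end if;
    end process;
end architecture rtl;
```

The register reaches a defined state whenever reset is asserted, both in simulation and in hardware, independent of how a particular tool treats initial values.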

Synthesizable VHDL
Register Transfer Level (RTL) Design Description
[Figure: registers (D flip-flops with a common clock) separated by blocks of combinational logic.]

VHDL Design Styles
• dataflow: concurrent statements
• structural: components and interconnects
• behavioral: sequential statements
  • Registers
  • Shift registers
  • Counters
  • State machines
  and more, if you are careful
[Figure: tree of the three design styles with the synthesizable subset marked.]

Combinational Logic Synthesis for Beginners
Simple rules for beginners
For combinational logic, use only concurrent statements:
• concurrent signal assignment (<=)
• conditional concurrent signal assignment (when-else)
• selected concurrent signal assignment (with-select-when)
• generate scheme for equations (for-generate)

Simple rules for beginners
For circuits composed of
- simple logic operations (logic gates)
- simple arithmetic operations (addition, subtraction, multiplication)
- shifts/rotations by a constant
use:
• concurrent signal assignment (<=)

Simple rules for beginners
For circuits composed of
- multiplexers
- decoders, encoders
- tri-state buffers
use:
• conditional concurrent signal assignment (when-else)
• selected concurrent signal assignment (with-select-when)
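The two assignment forms named above can be sketched on a multiplexer, a tri-state buffer and a decoder. All entity and port names here are hypothetical and chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mux_dec_example is
    port (
        d0, d1, d2, d3 : in  std_logic;
        sel            : in  std_logic_vector(1 downto 0);
        en             : in  std_logic;
        mux_out        : out std_logic;
        tri_out        : out std_logic;
        dec_out        : out std_logic_vector(3 downto 0)
    );
end entity mux_dec_example;

architecture dataflow of mux_dec_example is
begin
    -- 4-to-1 multiplexer: conditional concurrent signal assignment (when-else)
    mux_out <= d0 when sel = "00" else
               d1 when sel = "01" else
               d2 when sel = "10" else
               d3;

    -- Tri-state buffer: drives d0 when enabled, high impedance otherwise
    tri_out <= d0 when en = '1' else 'Z';

    -- 2-to-4 decoder: selected concurrent signal assignment (with-select-when)
    with sel select
        dec_out <= "0001" when "00",
                   "0010" when "01",
                   "0100" when "10",
                   "1000" when others;
end architecture dataflow;
```

The when-else form implies priority between its conditions, while with-select requires mutually exclusive choices that cover every value, hence the final `when others`.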

Left vs. right side of the assignment
Assignment forms: <=, <= when-else, with-select <=
Left side:
• Internal signals (defined in a given architecture)
• Ports of the mode
- out
- inout
- buffer
Right side: expressions including
• Internal signals (defined in a given architecture)
• Ports of the mode
- in
- inout
- buffer

Arithmetic operations
Synthesizable arithmetic operations:
• Addition: +
• Subtraction: -
• Comparisons: >, >=, <, <=
• Multiplication: *
• Division by a power of 2, e.g. /2**6 (equivalent to a right shift)
• Shifts by a constant: SHL, SHR

Arithmetic operations
The result of synthesis of an arithmetic operation is a combinational circuit without pipelining.
The exact internal architecture used (and thus the delay and area of the circuit) may depend on the timing constraints specified during synthesis (e.g., the requested maximum clock frequency).

Operations on Unsigned Numbers
For operations on unsigned numbers:
    USE ieee.std_logic_unsigned.all
and signals (inputs/outputs) of the type STD_LOGIC_VECTOR
or
    USE ieee.std_logic_arith.all
and signals (inputs/outputs) of the type UNSIGNED

Operations on Signed Numbers
For operations on signed numbers:
    USE ieee.std_logic_signed.all
and signals (inputs/outputs) of the type STD_LOGIC_VECTOR
or
    USE ieee.std_logic_arith.all
and signals (inputs/outputs) of the type SIGNED

Integer Types
Operations on signals (variables) of the integer types INTEGER and NATURAL, and their subtypes, such as
    TYPE day_of_month IS RANGE 0 TO 31;
are synthesizable in the range
    -(2^31 - 1) .. 2^31 - 1 for INTEGERs and their subtypes
    0 .. 2^31 - 1 for NATURALs and their subtypes
Operations on signals (variables) of the types INTEGER and NATURAL are less flexible and more difficult to control than operations on signals (variables) of the types STD_LOGIC_VECTOR, UNSIGNED and SIGNED, and thus beginners are advised to avoid them.

Addition of Signed Numbers (1)
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE ieee.std_logic_signed.all ;

ENTITY adder16 IS
    PORT ( Cin            : IN  STD_LOGIC ;
           X, Y           : IN  STD_LOGIC_VECTOR(15 DOWNTO 0) ;
           S              : OUT STD_LOGIC_VECTOR(15 DOWNTO 0) ;
           Cout, Overflow : OUT STD_LOGIC ) ;
END adder16 ;

ARCHITECTURE Behavior OF adder16 IS
    SIGNAL Sum : STD_LOGIC_VECTOR(16 DOWNTO 0) ;
BEGIN
    Sum <= ('0' & X) + Y + Cin ;
    S <= Sum(15 DOWNTO 0) ;
    Cout <= Sum(16) ;
    Overflow <= Sum(16) XOR X(15) XOR Y(15) XOR Sum(15) ;
END Behavior ;

Addition of Signed Numbers (2)
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE ieee.std_logic_arith.all ;

ENTITY adder16 IS
    PORT ( Cin            : IN  STD_LOGIC ;
           X, Y           : IN  SIGNED(15 DOWNTO 0) ;
           S              : OUT SIGNED(15 DOWNTO 0) ;
           Cout, Overflow : OUT STD_LOGIC ) ;
END adder16 ;

ARCHITECTURE Behavior OF adder16 IS
    SIGNAL Sum : SIGNED(16 DOWNTO 0) ;
BEGIN
    Sum <= ('0' & X) + Y + Cin ;
    S <= Sum(15 DOWNTO 0) ;
    Cout <= Sum(16) ;
    Overflow <= Sum(16) XOR X(15) XOR Y(15) XOR Sum(15) ;
END Behavior ;

Addition of Signed Numbers (3)
ENTITY adder16 IS
    PORT ( X, Y : IN  INTEGER RANGE -32768 TO 32767 ;
           S    : OUT INTEGER RANGE -32768 TO 32767 ) ;
END adder16 ;

ARCHITECTURE Behavior OF adder16 IS
BEGIN
    S <= X + Y ;
END Behavior ;

Addition of Unsigned Numbers
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE ieee.std_logic_unsigned.all ;

ENTITY adder16 IS
    PORT ( Cin  : IN  STD_LOGIC ;
           X, Y : IN  STD_LOGIC_VECTOR(15 DOWNTO 0) ;
           S    : OUT STD_LOGIC_VECTOR(15 DOWNTO 0) ;
           Cout : OUT STD_LOGIC ) ;
END adder16 ;

ARCHITECTURE Behavior OF adder16 IS
    SIGNAL Sum : STD_LOGIC_VECTOR(16 DOWNTO 0) ;
BEGIN
    Sum <= ('0' & X) + Y + Cin ;
    S <= Sum(15 DOWNTO 0) ;
    Cout <= Sum(16) ;
END Behavior ;
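For comparison, the same unsigned adder can be written with the IEEE-standardized `ieee.numeric_std` package instead of `std_logic_unsigned`. This variant is not part of the original slides; the entity name `adder16_numeric` and the helper signal `carry` are introduced here only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity adder16_numeric is
    port (
        Cin  : in  std_logic;
        X, Y : in  std_logic_vector(15 downto 0);
        S    : out std_logic_vector(15 downto 0);
        Cout : out std_logic
    );
end entity adder16_numeric;

architecture rtl of adder16_numeric is
    signal Sum   : unsigned(16 downto 0);
    signal carry : unsigned(0 downto 0);  -- Cin as a 1-bit unsigned operand
begin
    carry(0) <= Cin;
    -- Widen both operands to 17 bits so the carry out is kept in Sum(16).
    Sum  <= resize(unsigned(X), 17) + resize(unsigned(Y), 17) + carry;
    S    <= std_logic_vector(Sum(15 downto 0));
    Cout <= Sum(16);
end architecture rtl;
```

With `numeric_std` the interpretation of a vector is stated explicitly by the `unsigned(...)` conversion at each use, so signed and unsigned arithmetic can coexist in one architecture.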

Multiplication of signed and unsigned numbers
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;

entity multiply is
    port( a  : in  STD_LOGIC_VECTOR(15 downto 0);
          b  : in  STD_LOGIC_VECTOR(7 downto 0);
          cu : out STD_LOGIC_VECTOR(11 downto 0);
          cs : out STD_LOGIC_VECTOR(11 downto 0));
end multiply;

architecture dataflow of multiply is
    SIGNAL sa   : SIGNED(15 downto 0);
    SIGNAL sb   : SIGNED(7 downto 0);
    SIGNAL sres : SIGNED(23 downto 0);
    SIGNAL sc   : SIGNED(11 downto 0);
    SIGNAL ua   : UNSIGNED(15 downto 0);
    SIGNAL ub   : UNSIGNED(7 downto 0);
    SIGNAL ures : UNSIGNED(23 downto 0);
    SIGNAL uc   : UNSIGNED(11 downto 0);
begin
    -- signed multiplication
    sa   <= SIGNED(a);
    sb   <= SIGNED(b);
    sres <= sa * sb;
    sc   <= sres(11 downto 0);
    cs   <= STD_LOGIC_VECTOR(sc);
    -- unsigned multiplication
    ua   <= UNSIGNED(a);
    ub   <= UNSIGNED(b);
    ures <= ua * ub;
    uc   <= ures(11 downto 0);
    cu   <= STD_LOGIC_VECTOR(uc);
end dataflow;

Describing combinational logic using processes
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

ENTITY dec2to4 IS
    PORT ( w  : IN  STD_LOGIC_VECTOR(1 DOWNTO 0) ;
           En : IN  STD_LOGIC ;
           y  : OUT STD_LOGIC_VECTOR(0 TO 3) ) ;
END dec2to4 ;

ARCHITECTURE Behavior OF dec2to4 IS
BEGIN
    PROCESS ( w, En )
    BEGIN
        IF En = '1' THEN
            CASE w IS
                WHEN "00"   => y <= "1000" ;
                WHEN "01"   => y <= "0100" ;
                WHEN "10"   => y <= "0010" ;
                WHEN OTHERS => y <= "0001" ;
            END CASE ;
        ELSE
            y <= "0000" ;
        END IF ;
    END PROCESS ;
END Behavior ;

Describing combinational logic using processes
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

ENTITY seg7 IS
    PORT ( bcd  : IN  STD_LOGIC_VECTOR(3 DOWNTO 0) ;
           leds : OUT STD_LOGIC_VECTOR(1 TO 7) ) ;
END seg7 ;

ARCHITECTURE Behavior OF seg7 IS
BEGIN
    PROCESS ( bcd )
    BEGIN
        CASE bcd IS                    -- abcdefg
            WHEN "0000" => leds <= "1111110" ;
            WHEN "0001" => leds <= "0110000" ;
            WHEN "0010" => leds <= "1101101" ;
            WHEN "0011" => leds <= "1111001" ;
            WHEN "0100" => leds <= "0110011" ;
            WHEN "0101" => leds <= "1011011" ;
            WHEN "0110" => leds <= "1011111" ;
            WHEN "0111" => leds <= "1110000" ;
            WHEN "1000" => leds <= "1111111" ;
            WHEN "1001" => leds <= "1110011" ;
            WHEN OTHERS => leds <= "ZZZZZZZ" ;
        END CASE ;
    END PROCESS ;
END Behavior ;

Describing combinational logic using processes
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

ENTITY compare1 IS
    PORT ( A, B : IN  STD_LOGIC ;
           AeqB : OUT STD_LOGIC ) ;
END compare1 ;

ARCHITECTURE Behavior OF compare1 IS
BEGIN
    PROCESS ( A, B )
    BEGIN
        AeqB <= '0' ;
        IF A = B THEN
            AeqB <= '1' ;
        END IF ;
    END PROCESS ;
END Behavior ;

Incorrect code for combinational logic - Implied latch (1)
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

ENTITY implied IS
    PORT ( A, B : IN  STD_LOGIC ;
           AeqB : OUT STD_LOGIC ) ;
END implied ;

ARCHITECTURE Behavior OF implied IS
BEGIN
    PROCESS ( A, B )
    BEGIN
        IF A = B THEN
            AeqB <= '1' ;
        END IF ;
    END PROCESS ;
END Behavior ;

Incorrect code for combinational logic - Implied latch (2)
[Figure: circuit with inputs A and B and output AeqB.]

Describing combinational logic using processes
Rules that need to be followed:
1. All inputs to the combinational circuit should be included in the sensitivity list
2. No other signals should be included in the sensitivity list
3. None of the statements within the process should be sensitive to rising or falling edges
4. All possible cases need to be covered in the internal IF and CASE statements in order to avoid implied latches

Covering all cases in the IF statement
Using ELSE
    IF A = B THEN
        AeqB <= '1' ;
    ELSE
        AeqB <= '0' ;
    END IF ;
Using default values
    AeqB <= '0' ;
    IF A = B THEN
        AeqB <= '1' ;
    END IF ;

Covering all cases in the CASE statement
Using WHEN OTHERS
    CASE y IS
        WHEN S1 => Z <= "10";
        WHEN S2 => Z <= "01";
        WHEN OTHERS => Z <= "00";
    END CASE;

    CASE y IS
        WHEN S1 => Z <= "10";
        WHEN S2 => Z <= "01";
        WHEN S3 => Z <= "00";
        WHEN OTHERS => Z <= "ZZ";
    END CASE;
Using default values
    Z <= "00";
    CASE y IS
        WHEN S1 => Z <= "10";
        WHEN S2 => Z <= "10";
        WHEN OTHERS => NULL;
    END CASE;

Combinational Logic Synthesis for Advanced
Advanced VHDL for synthesis
For complex, generic, and/or regular circuits you may consider using PROCESSES with internal VARIABLES and FOR LOOPs
106
‫בס''ד‬
N-bit NAND
LIBRARY ieee;
USE ieee.std_logic_1164.all;
ENTITY NANDn IS
    GENERIC (n: INTEGER := 8);
    PORT ( X : IN STD_LOGIC_VECTOR(1 TO n);
           Y : OUT STD_LOGIC);
END NANDn;
N-bit NAND architecture using variables
ARCHITECTURE behavioral1 OF NANDn IS
BEGIN
    PROCESS (X)
        VARIABLE Tmp: STD_LOGIC;
    BEGIN
        Tmp := X(1);
        AND_bits: FOR i IN 2 TO n LOOP
            Tmp := Tmp AND X( i ) ;
        END LOOP AND_bits ;
        Y <= NOT Tmp ;
    END PROCESS;
END behavioral1 ;
Incorrect N-bit NAND architecture using signals
ARCHITECTURE behavioral2 OF NANDn IS
    SIGNAL Tmp: STD_LOGIC;
BEGIN
    PROCESS (X)
    BEGIN
        -- Incorrect: Tmp is a signal, so it is not updated until the process
        -- suspends; only the last scheduled assignment takes effect.
        Tmp <= X(1);
        AND_bits: FOR i IN 2 TO n LOOP
            Tmp <= Tmp AND X( i ) ;
        END LOOP AND_bits ;
        Y <= NOT Tmp ;
    END PROCESS;
END behavioral2 ;
Correct N-bit NAND architecture using signals
ARCHITECTURE dataflow1 OF NANDn IS
    SIGNAL Tmp: STD_LOGIC_VECTOR(1 TO n);
BEGIN
    Tmp(1) <= X(1);
    AND_bits: FOR i IN 2 TO n GENERATE
        Tmp(i) <= Tmp(i-1) AND X( i ) ;
    END GENERATE AND_bits ;
    Y <= NOT Tmp(n) ;
END dataflow1 ;
Parity generator entity
LIBRARY ieee;
USE ieee.std_logic_1164.all;
ENTITY oddParityLoop IS
    GENERIC ( width : INTEGER := 8 );
    PORT ( ad : IN STD_LOGIC_VECTOR (width - 1 DOWNTO 0);
           oddParity : OUT STD_LOGIC ) ;
END oddParityLoop ;
ARCHITECTURE dataflow OF oddParityLoop IS
    SIGNAL genXor: STD_LOGIC_VECTOR(width DOWNTO 0);
BEGIN
    genXor(0) <= '0';
    parTree: FOR i IN 1 TO width GENERATE
        genXor(i) <= genXor(i - 1) XOR ad(i - 1);
    END GENERATE;
    oddParity <= genXor(width) ;
END dataflow ;
Parity generator architecture using variables
ARCHITECTURE behavioral OF oddParityLoop IS
BEGIN
    PROCESS (ad)
        VARIABLE loopXor: STD_LOGIC;
    BEGIN
        loopXor := '0';
        FOR i IN 0 TO width - 1 LOOP
            loopXor := loopXor XOR ad( i ) ;
        END LOOP ;
        oddParity <= loopXor ;
    END PROCESS;
END behavioral ;
Sequential Logic Synthesis for Beginners
For Beginners
Use processes with very simple structure only to describe
- registers
- shift registers
- counters
- state machines.
Use examples discussed in class as a template.
Create generic entities for registers, shift registers, and counters, and instantiate the corresponding components in a higher-level circuit using GENERIC MAP and PORT MAP.
Supplement sequential components with combinational logic described using concurrent statements.
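The advice above can be sketched as follows: a generic register entity, instantiated in a higher-level circuit with GENERIC MAP and PORT MAP, and supplemented by a concurrent statement for the combinational part. The names regn, top, reg4, and Mix, as well as the XOR logic, are hypothetical and serve only as an illustration.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

-- Hypothetical generic N-bit register with enable.
ENTITY regn IS
    GENERIC ( N : INTEGER := 8 ) ;
    PORT ( D      : IN  STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
           Enable : IN  STD_LOGIC ;
           Clock  : IN  STD_LOGIC ;
           Q      : OUT STD_LOGIC_VECTOR(N-1 DOWNTO 0) ) ;
END regn ;

ARCHITECTURE Behavior OF regn IS
BEGIN
    PROCESS ( Clock )
    BEGIN
        IF RISING_EDGE(Clock) THEN
            IF Enable = '1' THEN
                Q <= D ;
            END IF ;
        END IF ;
    END PROCESS ;
END Behavior ;

LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

-- Hypothetical higher-level circuit using a 4-bit instance of regn.
ENTITY top IS
    PORT ( A, B  : IN  STD_LOGIC_VECTOR(3 DOWNTO 0) ;
           Load  : IN  STD_LOGIC ;
           Clock : IN  STD_LOGIC ;
           R     : OUT STD_LOGIC_VECTOR(3 DOWNTO 0) ) ;
END top ;

ARCHITECTURE Structure OF top IS
    COMPONENT regn
        GENERIC ( N : INTEGER := 8 ) ;
        PORT ( D      : IN  STD_LOGIC_VECTOR(N-1 DOWNTO 0) ;
               Enable : IN  STD_LOGIC ;
               Clock  : IN  STD_LOGIC ;
               Q      : OUT STD_LOGIC_VECTOR(N-1 DOWNTO 0) ) ;
    END COMPONENT ;
    SIGNAL Mix : STD_LOGIC_VECTOR(3 DOWNTO 0) ;
BEGIN
    -- Combinational logic described with a concurrent statement.
    Mix <= A XOR B ;

    -- Sequential component instantiated with GENERIC MAP and PORT MAP.
    reg4 : regn
        GENERIC MAP ( N => 4 )
        PORT MAP ( D => Mix, Enable => Load, Clock => Clock, Q => R ) ;
END Structure ;
```

Keeping the register in its own generic entity means the same description can be reused at any width by changing only the value passed through GENERIC MAP.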
For Intermediates
1. Use PROCESSes with IF and CASE statements only. Do not use LOOPs or VARIABLEs.
2. The sensitivity list of the PROCESS should include only signals that can by themselves change the outputs of the sequential circuit (typically, clock and asynchronous set or reset).
3. Do not use PROCESSes without a sensitivity list (they can be synthesizable, but make simulation inefficient).
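A minimal sketch of a sequential process that follows these rules is a D flip-flop with an asynchronous active-low reset. The entity name dff_ar and the port names are hypothetical.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

-- Hypothetical D flip-flop with asynchronous active-low reset.
ENTITY dff_ar IS
    PORT ( D, Clock, Resetn : IN  STD_LOGIC ;
           Q                : OUT STD_LOGIC ) ;
END dff_ar ;

ARCHITECTURE Behavior OF dff_ar IS
BEGIN
    -- Only the signals that can change Q by themselves are listed:
    -- the asynchronous reset and the clock. D is deliberately absent.
    PROCESS ( Resetn, Clock )
    BEGIN
        IF Resetn = '0' THEN
            Q <= '0' ;
        ELSIF RISING_EDGE(Clock) THEN
            Q <= D ;
        END IF ;
    END PROCESS ;
END Behavior ;
```

D is left out of the sensitivity list because a change on D alone never changes Q; it is sampled only on the rising clock edge.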
For Intermediates (2)
Given a single signal, assignments to this signal should be made within a single process only, in order to avoid possible conflicts in assigning values to this signal.
-- Conflict: y is driven by both processes
Process_1: PROCESS (a, b)
BEGIN
    y <= a AND b;
END PROCESS;
Process_2: PROCESS (a, b)
BEGIN
    y <= a OR b;
END PROCESS;
For Advanced
Describe the algorithm you are trying to implement in pseudocode.
Translate the pseudocode directly to VHDL using processes with IF, CASE, LOOP, and VARIABLES.
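As an illustration of this approach, the sketch below starts from a short piece of pseudocode that counts the '1' bits in a vector and translates it line by line into a process with a VARIABLE, a LOOP, and an IF. The entity name ones_count and its ports are hypothetical.

```vhdl
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;

-- Pseudocode being translated:
--   tmp := 0
--   for each bit of X: if the bit is 1 then tmp := tmp + 1
--   Count := tmp
ENTITY ones_count IS
    GENERIC ( n : INTEGER := 8 ) ;
    PORT ( X     : IN  STD_LOGIC_VECTOR(n-1 DOWNTO 0) ;
           Count : OUT INTEGER RANGE 0 TO n ) ;
END ones_count ;

ARCHITECTURE Behavior OF ones_count IS
BEGIN
    PROCESS ( X )
        -- A variable is updated immediately, so it can accumulate in the loop.
        VARIABLE Tmp : INTEGER RANGE 0 TO n ;
    BEGIN
        Tmp := 0 ;
        FOR i IN X'RANGE LOOP
            IF X(i) = '1' THEN
                Tmp := Tmp + 1 ;
            END IF ;
        END LOOP ;
        Count <= Tmp ;
    END PROCESS ;
END Behavior ;
```

The process is still combinational: X is the only input, it is in the sensitivity list, and Count is assigned on every pass through the process.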
Introduction to VHDL/Verilog
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
ENTITY SHREG IS
    PORT(SHIFT,CLK : IN STD_LOGIC;
         A         : IN STD_LOGIC;
         Y         : OUT STD_LOGIC);
END SHREG;
ARCHITECTURE RTL OF SHREG IS
    SIGNAL TMP : STD_LOGIC_VECTOR(7 DOWNTO 0) := "00000000";
BEGIN
    -- Clocked process: only CLK belongs in the sensitivity list
    PROCESS(CLK)
    BEGIN
        IF RISING_EDGE(CLK) THEN
            IF SHIFT = '1' THEN
                TMP <= TMP(6 DOWNTO 0) & A;
            END IF;
        END IF;
    END PROCESS;
    Y <= TMP(7);
END RTL;
module SHREG (shift,clk,a,y);
    input shift,clk,a;
    output y;
    reg [7:0] tmp;
    // Non-blocking assignment is used for the clocked register
    always @(posedge clk)
    begin
        if (shift)
            tmp <= {tmp[6:0],a};
    end
    assign y = tmp[7];
endmodule
Introduction to VHDL/Verilog
[Figure: state diagram with inputs reset and control, states ST0, ST1, ST2, ST3, and output values Y = 1, Y = 2, Y = 3, Y = 0]
Introduction to VHDL/Verilog
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
ENTITY FSM IS
    PORT(CLK,RST : IN STD_LOGIC;
         CONTROL : IN STD_LOGIC;
         Y       : OUT STD_LOGIC_VECTOR(1 DOWNTO 0));
END FSM;
ARCHITECTURE RTL OF FSM IS
    TYPE StateType IS (ST0,ST1,ST2,ST3);
    SIGNAL STATE : StateType;
BEGIN
    -- State register with asynchronous reset and next-state logic
    PROCESS(CLK,RST)
    BEGIN
        IF (RST = '1') THEN
            STATE <= ST0;
        ELSIF (CLK'EVENT AND CLK = '1') THEN
            CASE (STATE) IS
                WHEN ST0 => STATE <= ST1;
                WHEN ST1 =>
                    IF (CONTROL = '1') THEN
                        STATE <= ST2;
                    ELSE
                        STATE <= ST3;
                    END IF;
                WHEN ST2 => STATE <= ST3;
                WHEN ST3 => STATE <= ST0;
                WHEN OTHERS => NULL;
            END CASE;
        END IF;
    END PROCESS;
    -- Output logic: Y depends on the current state only
    WITH STATE SELECT
        Y <= "01" WHEN ST0,
             "10" WHEN ST1,
             "11" WHEN ST2,
             "00" WHEN ST3,
             "00" WHEN OTHERS;
END RTL;
module FSM (clk,rst,control,y);
    input clk,rst;
    input control;
    output [1:0] y;
    reg [1:0] y;
    parameter [1:0] ST0 = 0,ST1 = 1,ST2 = 2,ST3 = 3;
    reg [1:0] STATE;
    // State register: non-blocking assignments in the clocked block
    always @(posedge clk or posedge rst)
    begin
        if (rst)
            STATE <= ST0;
        else
            case(STATE)
                ST0: STATE <= ST1;
                ST1: if (control)
                         STATE <= ST2;
                     else
                         STATE <= ST3;
                ST2: STATE <= ST3;
                ST3: STATE <= ST0;
            endcase
    end
    // Output logic: combinational, so blocking assignments are used
    always @(STATE)
    begin
        case (STATE)
            ST0: y = 1;
            ST1: y = 2;
            ST2: y = 3;
            ST3: y = 0;
            default: y = 1;
        endcase
    end
endmodule
Overview of ALTERA FPDs.
Important points for design
– Design is based on synthesis together with a specialized place-and-route tool.
– Small local blocks can run quite fast.
– Large, complicated blocks have limited speed because of interconnect delays.
– General functional blocks optimized for a given FPGA architecture are often provided as macro blocks: counters, adders, multipliers, FIFOs, etc.
    – It is important to use such blocks in synthesis to obtain good performance.
– Dedicated timing estimator/calculator: LUTs and flip-flops have fixed delays; interconnect delay depends on length and number of routing switches.
– If the design gets close to full utilization, design time increases significantly (problems with place and route).
Getting started
Introduction to FPD.
The END
FPGA IP
1. Programmable logic vendors are adding hard and soft IP ("Intellectual Property") to their arrays, such as small RISC processors, DSPs, multipliers, digital filters, etc.
2. Xilinx Virtex-II Pro FPGA with hard PowerPC 405 core + >20K CLBs
3. Embedded 400 MHz, 600+ D-MIPS RISC core (32-bit Harvard architecture); 5-stage data path pipeline; hardware multiply and divide; 32 x 32-bit general-purpose registers
4. 16 KB 2-way set-associative instruction cache
5. 16 KB 2-way set-associative data cache, write back/write through
6. Implements PowerPC User Instruction Set Architecture (UISA)
7. Xilinx MicroBlaze 32-bit processor soft core
8. Altera Excalibur FPGA with hard ARM 922 32-bit RISC CPU
9. Altera NIOS 32-bit processor soft core