Transcript 09_Cyclone
FPLD Families
Morteza Saheb Zamani
Cyclone FPGA
• SRAM-based
Altera Cyclone Devices

                              Cyclone FPGA   Cyclone II FPGA   Cyclone III FPGA   Cyclone IV FPGA   Cyclone V FPGA
Year introduced               2002           2004              2007               2009              2011
Process technology            130 nm         90 nm             65 nm              60 nm             28 nm
Recommended for new designs   Yes            Yes               Yes                Yes               Yes
Cyclone V Architecture
[Figure: Cyclone V device architecture; labeled blocks include the Gigabit Transceiver Blocks]
LAB Structure
• LAB:
– 10 ALMs
• Logic implementation
– Local interconnect:
• Fast connection between ALMs
• MLAB (Memory LAB):
– Each ALM can be configured as a 32 x 2 dual-port SRAM.
ALM Structure
Cyclone V
Cyclone V Characteristics
On-Chip Memory
1. MLAB Blocks:
– Wide and shallow
– 640-bit blocks
– 10 ALMs:
• Each configurable as a 32 x 2-bit block
– Can be configured as shift registers and FIFO buffers.
2. M10K Blocks:
– 10 Kb blocks
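As a quick check on the MLAB figures above, here is the capacity worked out under the assumption that all 10 ALMs of an MLAB are used as 32 x 2 blocks. Read side by side, the ten blocks form one memory that is 32 words deep and 20 bits wide, which is why the MLAB is described as wide and shallow:

```latex
10 \times (32 \times 2\ \text{bits}) = 32 \times 20\ \text{bits} = 640\ \text{bits}
```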
Memory Resources
• For one chip type
Memory Resources
• Configurations in single/dual-port modes
M10K Port Modes
• Single-Port:
– Only one read or one write operation at a time
• Simple Dual-Port:
– Can simultaneously perform one read and one write operation to different locations
• True Dual-Port:
– Can perform any combination of two port operations:
• Two reads, two writes, or one read and one write at two different clock frequencies.
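The simple dual-port mode described above can be sketched in VHDL as a memory with one write port and one read port. This is a minimal sketch, not a vendor template: the entity name `sdp_ram`, the generics, and the 256 x 16 size (4096 bits, which fits in a 10 Kb block) are assumptions chosen for illustration. Whether a synthesis tool maps this description onto an M10K block depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simple dual-port RAM: one write port, one read port, one clock.
-- Name, generics, and sizes are illustrative assumptions.
entity sdp_ram is
  generic (
    ADDR_WIDTH : natural := 8;    -- 2**8 = 256 words
    DATA_WIDTH : natural := 16    -- 256 x 16 = 4096 bits
  );
  port (
    clk   : in  std_logic;
    we    : in  std_logic;                                   -- write enable
    waddr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);   -- write address
    wdata : in  std_logic_vector(DATA_WIDTH - 1 downto 0);   -- write data
    raddr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);   -- read address
    rdata : out std_logic_vector(DATA_WIDTH - 1 downto 0)    -- read data (registered)
  );
end entity sdp_ram;

architecture rtl of sdp_ram is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Write port
      if we = '1' then
        ram(to_integer(unsigned(waddr))) <= wdata;
      end if;
      -- Read port: reads the value stored before this clock edge
      rdata <= ram(to_integer(unsigned(raddr)));
    end if;
  end process;
end architecture rtl;
```

In this sketch a read and a write to the same address in the same cycle return the old data, which is one reason the slide restricts simultaneous access to different locations.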
Computational Blocks
• DSP Blocks:
– Multiplication
• 27 x 27
• 18 x 18
• 9x9
– Add/subtract, accumulation
• Efficient calculation of Σ x_i · y_i (sum of products)
– Constant storage:
• Constants need not be supplied through external ports at run time
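The multiply-and-accumulate operation listed above (Σ x_i · y_i) can be sketched as a registered multiply-accumulate unit. This is a minimal sketch under stated assumptions: the entity name `mac18`, the 18-bit operands, and the 48-bit accumulator width are illustrative choices, not figures taken from the slides.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical multiply-accumulate unit: adds x*y to acc on every clock.
-- Entity name and bit widths are assumptions made for this example.
entity mac18 is
  port (
    clk   : in  std_logic;
    clear : in  std_logic;             -- synchronous clear of the accumulator
    x, y  : in  signed(17 downto 0);   -- operands x_i and y_i
    acc   : out signed(47 downto 0)    -- running sum of products
  );
end entity mac18;

architecture rtl of mac18 is
  signal acc_r : signed(47 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if clear = '1' then
        acc_r <= (others => '0');
      else
        -- x * y is 36 bits wide; sign-extend it to the accumulator width
        acc_r <= acc_r + resize(x * y, acc_r'length);
      end if;
    end if;
  end process;

  acc <= acc_r;
end architecture rtl;
```

Asserting `clear` for one cycle and then presenting one (x_i, y_i) pair per clock leaves the sum of products in `acc`.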
DSP Block
FIR (Finite Impulse Response) Filter
y[n] = C_0·x[n] + C_1·x[n-1] + … + C_N·x[n-N]
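The FIR equation above can be sketched directly in VHDL for a small case, N = 3 (four taps). This is a minimal sketch: the entity name `fir4`, the 16-bit data width, and the coefficient values 1, 3, 3, 1 are assumptions for illustration only. Each product in the loop is the kind of multiply-add that a DSP block is meant to absorb, though the actual mapping is up to the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-tap FIR filter (N = 3).
-- Entity name, widths, and coefficient values are illustrative assumptions.
entity fir4 is
  port (
    clk : in  std_logic;
    x   : in  signed(15 downto 0);   -- current sample x[n]
    y   : out signed(33 downto 0)    -- filter output y[n], registered
  );
end entity fir4;

architecture rtl of fir4 is
  constant N : natural := 3;
  type coef_array  is array (0 to N) of signed(15 downto 0);
  type delay_array is array (1 to N) of signed(15 downto 0);

  -- Coefficients C_0 .. C_N stored as constants inside the design
  constant C : coef_array := (to_signed(1, 16), to_signed(3, 16),
                              to_signed(3, 16), to_signed(1, 16));

  -- d(i) holds the delayed sample x[n-i]
  signal d : delay_array := (others => (others => '0'));
begin
  process (clk)
    variable acc : signed(33 downto 0);
  begin
    if rising_edge(clk) then
      -- y[n] = C_0*x[n] + C_1*x[n-1] + ... + C_N*x[n-N]
      acc := resize(C(0) * x, acc'length);
      for i in 1 to N loop
        acc := acc + resize(C(i) * d(i), acc'length);
      end loop;
      y <= acc;

      -- Shift the delay line for the next sample
      for i in N downto 2 loop
        d(i) <= d(i - 1);
      end loop;
      d(1) <= x;
    end if;
  end process;
end architecture rtl;
```

The 34-bit output leaves room for the sum of four 32-bit products, so the sketch cannot overflow for any input.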
Cyclone V
Inferring Multipliers
• In Quartus II:
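• Use attribute syn_multstyle (the Synplify attribute name; Quartus II integrated synthesis calls it multstyle, with values "dsp" and "logic"):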
– lpm_mult:
• Multipliers implemented in DSP blocks
– logic:
• Multipliers implemented as LEs
Inferring Multipliers
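library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
-- The entity is not shown on the slide; the port list and widths below are assumed.
entity onereg is
  port (a, b : in unsigned(7 downto 0); c : in unsigned(15 downto 0);
        en : in std_logic; r : out unsigned(15 downto 0));
end onereg;

architecture beh of onereg is
  signal temp : unsigned(15 downto 0);
  attribute syn_multstyle : string;
  -- "logic" asks for the multiplier to be built from logic cells, not a DSP block
  attribute syn_multstyle of temp : signal is "logic";
begin
  temp <= a * b;                    -- 8 x 8 multiply gives a 16-bit product
  r <= temp when en = '1' else c;
end beh;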
Inferring Multipliers
• In XST:
• Use attribute mult_style:
– auto: (default)
• XST looks for the best implementation
– block:
• Multipliers implemented as DSP block
– lut:
• Multipliers implemented as LUTs
Inferring Multipliers
architecture beh of onereg is
  signal temp : std_logic_vector (15 downto 0);
  attribute mult_style : string;
  -- Syntax template: pick one alternative from each pair of braces
  attribute mult_style of temp : {signal|entity} is "{auto|block|lut|pipe_lut}";
begin
  temp <= a * b;
  …
end beh;
Hard Processor Core
• Some chips: Arm Cortex-A9
– Single- or dual-core processor with up to 925 MHz maximum frequency
– Hardened embedded peripherals
– Hardened PCIe protocol
– Hardened multiport memory controller, shared by the processor and FPGA logic; supports DDR2, DDR3, and LPDDR2 devices
– Level-1 cache: 32 KB
– Level-2 cache: 512 KB
IO Element

Device    Variable-precision DSP blocks   Fractional PLLs   Maximum user I/Os
5CGXC3    51                              4                 208
5CGXC4    70                              6                 336
5CGXC5    150                             6                 336
5CGXC7    156                             7                 480
5CGXC9    342                             8                 560
IO Element - OCT
• OCT (on-chip termination):
– For impedance matching
– Resistor value can be programmed
– No need for on-board termination
• Large space saving
IO Element – Open-Drain Output
• Open-drain output:
– Can produce hi-Z output
– Needs a pull-up resistor to generate logic ‘1’
• Programmable pull-up resistor avoids on-board pull-up
– Open-drain outputs can be used for wired AND
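The wired-AND behaviour of open-drain outputs can be illustrated with a small simulation model. This is a sketch for simulation only, under stated assumptions: the entity name `wired_and_model` and its ports are hypothetical, and the pull-up resistor is modelled as a weak 'H' driver on a resolved `std_logic` signal. Each driver either pulls the shared line low or releases it to hi-Z, so the line is high only when every driver releases it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation model of two open-drain drivers on one shared line.
-- Entity and signal names are assumptions made for this example.
entity wired_and_model is
  port (
    a, b : in  std_logic;   -- logic value each open-drain driver wants to send
    y    : out std_logic    -- logic value observed on the shared line
  );
end entity wired_and_model;

architecture sim of wired_and_model is
  signal od_line : std_logic;   -- shared line (resolved, multiple drivers)
begin
  -- Open-drain drivers: pull low for '0', release (hi-Z) otherwise
  od_line <= '0' when a = '0' else 'Z';
  od_line <= '0' when b = '0' else 'Z';

  -- Pull-up resistor modelled as a weak high
  od_line <= 'H';

  -- Convert the weak 'H' to '1'; for a, b in {'0','1'} this equals a AND b
  y <= to_X01(od_line);
end architecture sim;
```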
IO-Element – Slew Rate
• Slew rate:
– Maximum rate of change of output voltage per unit of time (V/s)
– max(dV_out(t)/dt)
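The slew-rate definition above can be written out with a worked number. The 2.5 V swing and 1 ns edge are illustrative assumptions, not device data, and the approximation holds only for an edge that is roughly linear:

```latex
SR = \max_t \left| \frac{dV_{out}(t)}{dt} \right| \approx \frac{\Delta V}{\Delta t}
   = \frac{2.5\ \text{V}}{1\ \text{ns}} = 2.5 \times 10^{9}\ \text{V/s}
```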
IO-Element – Slew Rate
• Slew rate:
– Designer can choose a fast or slow rate
• Fast: for high-performance systems
• Slow: for reliability in systems exposed to high noise
– At the cost of speed
Clock Management
• Fractional PLLs:
– To avoid multiple oscillators
– Programmable frequencies
– Jitter/skew removal
Transceiver Blocks
• 3 to 12 blocks
• 3.125 to 6.144 Gbps
Embedded Processor
• In some chips, hard processor core:
– Arm Cortex-A9
• Single core, dual core
• Up to 1.5 GHz
• 3750 MIPS
Configuration
• When powered up:
– Reads configuration bitstream from:
• An Altera-compatible EPROM
• A standard EPROM
• A RAM in a computer system
• An intelligent processor or controller
• Configuration time:
– 10 to 50 ms
Altera Stratix
Stratix Chips
• LUT-based FPGA
• Logic block structure, interconnection architecture, memory blocks, I/O blocks, and transceiver blocks:
– Very similar to Cyclone
Altera FPGAs
Stratix vs. Cyclone
Stratix Family:
• Higher bandwidth
• Higher logic density
• Higher performance
Cyclone Family:
• Lower cost
• Lower power consumption
Stratix V