mwscas2k2 - West Virginia University

Download Report

Transcript mwscas2k2 - West Virginia University

An Analog Turbo Decoder
for an (8,4) Product Code
Neiyer Correal and Joe Heck
Florida Communications Research Labs
Motorola
Plantation, FL
Matthew C. Valenti
Assistant Professor
Lane Dept. of Comp. Sci. & Elect. Eng.
West Virginia University
Morgantown, WV
[email protected]
copyright 2002
Motivation & Goals

Motivation

Iteratively decodable code are capable of nearShannon limit performance.
• e.g. turbo codes, LDPC codes, turbo product codes.

Digital hardware has its limitations
• Clock functions demand power and real estate.
• Turbo decoding uses nonlinear functions that are
awkward in digital but natural in analog.

Goal of this study
© 2002

To design and implement a simple iterative
decoder in analog.
• Simple: (8,4) product code created using a 2 by 2 array
of (3,2) single parity check codes.

Target throughput is 1 Gbps
Overview of Talk

Related work
 Single parity check (SPC) codes




Turbo product codes


© 2002

Encoder
Soft-input/Soft-output (SISO) decoder
Analog design of the SISO decoder
Encoder
Turbo Decoder
Comments & Conclusions
Related Work

Inspiration: Carver Mead (Cal Tech) 1980’s




Observation: nervous cells and CMOS transistors
operating in subthreshold mode have same
physics.
Adaptive analog systems with the robustness of
digital yet 2-orders of magnitude reduction in
power.
Primarily image processing applications.
Recent work: Turbo Decoding

J. Hagenauer (T.U. Munich)
© 2002
• Log-likelihood domain circuits.

Loeliger + Lustenberger (E.T.H. Zurich)
• Circuits operate on raw probabilities rather than LLRs.
Single Parity Check Codes

Generic (n,k) SPC code:




The (3,2) SPC code:


© 2002
The input to the SPC encoder is a stream of k data
bits: {u1, u2, …, uk}
A single parity bit is created by xor-ing the data:
up = u1u2uk
The n=k+1 bit output is the k input bits and the
parity bit: u = {u1, u2, …, uk ,up}

Let the two input bits be denoted uj and uk
The parity bit is uj,k = ujuk
The output is u = {uj, uk, uj,k }
Modulation and Channel Model

BPSK modulation




x = +1 if u = 0 and x = -1 if u = 1
Modulated code word: x = {xj, xk, xj,k }
Received code word: y = {yj, yk, yj,k }
Channel

characterized by conditional pdf: f(y|x)
• AWGN:
N 

f ( y | x) ~   x Ε s , 0 
2 

© 2002
• Rayleigh flat-fading
N0 

f ( y | x) ~   ax Es ,

2 

Scaling the Input

Input to the SISO decoder must be scaled:

In log-likelihood form:
 f ( y | x  1) 
E
  4ay s
L( y | x)  ln 
N0
 f ( y | x  1) 
SNR
© 2002

Decoder input is:
r={rj,rk,rj,k}={L(yj|xj),L(yk|xk),L(yj,k|xj,k)}
fade coefficient
(=1 for AWGN)
APP SISO Decoder
a priori
information
from other
decoders
L(xj)
scaled
rj
rj,k
channel
rk
observations
L(xk)
SISO
dec.
#1
L(uj|y)
Le(xj)
Le(xk)
L(uk|y)
a posteriori
estimates:
used to make
final bit decision
extrinsic
information
to other decoders
 P[ x  1] 

L( x)  ln 
P
[
x


1
]



Input-output relationships (BCRJ algorithm):

 rj ,k
Le ( x j )  2 tanh  tanh 
 2

© 2002
1

 rk  L( xk ) 
 tanh 

2

 extrinsic


 rj ,k 
 r  L( x j ) 
 tanh  j

Le ( xk )  2 tanh 1  tanh 
2
 2 



L( x j | y )  rj  L( x j )  Le ( x j )
a posteriori
L( xk | y )  rk  L( xk )  Le ( xk )
information:
exploits structure of code
LLR estimates
Box-Plus Operator

The heart of the decoder is a soft-xor
operation, also called “box-plus”
L(u j  u j ,k )  u j
© 2002

u j ,k

 L(u j ) 
 L(u j ,k ) 
 tanh 

 2 tanh  tanh 
 2 
 2 

1
This can be implemented in analog by a
modified Gilbert multiplier
Modified Gilbert Multiplier
Current
Mirror
Current
Mirror
I
I
Vcc
I1
I2
Vcc
V2
Vo
V1
© 2002
I EE
L(u j  u j ,k )  u j
u j ,k

 L(u j ) 
 L(u j ,k ) 
 tanh 

 2 tanh  tanh 
 2 
 2 

1
Computing Extrinsic Information
Vdda
Vgp
+
rk/VT
50 A
Vdda
+
Le(xj)/VT
+
L(xk)/VT
Vdda
+
R12
Q1
Q0
rj,k/VT
L(xk)
Q13
rk
rj,k
Le(xj)
Vbn
Q10
50 A
Computing a Posteriori LLR
Vdda
R8
R6
R7
+
L(xj|y)/VT
+
rj/VT
+
L(xj)/VT
Vbn
rj
L(xj)
Le(xj)
L(xj|y)
+
Le(xj)/VT
SISO Decoder for (3,2) SPC Code
L(xj)
rj
rj,k
rk
L(xk)

L(xj|y)
Le(xj)
Le(xk)
L(xk|y)
Components:

2 units for computing extrinsic information.
• 6 PMOS + 10 BJT transistors + 7 resistors each.

2 units for computing a priori LLR.
© 2002
• 9 BJT transistors + 9 resistors each.
Product Codes
u1

u3
u1,2

u4 
=
u1,3
u2,4
u3,4
A more powerful code can be created by
using multiple SPC codes.


© 2002

u2 
=


2 by 2 array of SPC codes can correct a single
error when using hard decision decoding.
However, we want a soft-decision decoder.
© 2002
Turbo Decoding of Product Codes
r1
r2
r1,2
Le(x1)
Le(x2)
r3
r4
r3,4
Le(x3)
Le(x4)
r1,3
r2,4
Le(x1)
Le(x3)
Le(x2)
Le(x4)
Turbo Decoder
r1
r1,2
r2
r3
© 2002
r4 r3,4
r1,3
r2,4
vertical
dec. #1
vertical
dec. #2
horiz.
dec.
#1
horiz.
dec.
#2
L(u1|y)
L(u2|y)
L(u3|y)
L(u4|y)
Vertical Decoder

The a priori LLR is computed using the output
of the horizontal decoding.
 Thus, the vertical decoder does not need the
three-input adders
L(xj)
rj
rj,k
rk
© 2002
L(xk)
Le(xj)
Le(xk)
Circuit Complexity

Two horizontal decoders, each with:



Two vertical decoders, each with:


2 units for computing extrinsic information.
Total complexity:



© 2002
2 units for computing extrinsic information.
2 units for computing a priori LLR.


8 units for computing extrinsic information.
4 units for computing a priori LLR.
48 PMOS transistors
116 BJT transistors
92 Resistors
Comments on Analog Decoding

Complicated codes can be decoded using the same
approach.


Cascade of analog cells.
Problems

Transistor mismatch.
• Equivalent to having additional noise at input.
• Current transistor tolerances have same impact as using 5-10
bit quantization in a digital implementation.

Thermal gradients
• Proper scaling will mitigate intercell gradients.
• However, intracell gradients will hurt performance.
© 2002

Decoding serial data in parallel
• Need to store analog values (e.g. sample/hold).
• Could work well with multicarrier modulation.

Design and test
• Spice-level simulations needed, but take too long.
Conclusion

Advantages of analog decoders:




Implementation issues may hinder performance


However, suboptimality in local cells is okay as long as
overall system performance is acceptable.
Extensions of this work



© 2002
No costly clock functions  smaller footprint/expense.
No concept of iteration  faster convergence.
easily implemented nonlinear functions.


(n,k) SPC code for k>2.
Product code with 3 dimensions or more.
Other types of codes: Hamming, BCH.
Turbo and LDPC codes.
Other iterative processing schemes, e.g. turbo-EQ