Fountain codes


D.J.C. MacKay
IEE Proceedings Communications, Vol. 152, No. 6, December 2005





Introduction
Fountain Codes
Intermission
LT Codes
Raptor Codes

Erasure Channel:
◦ Files are transmitted in multiple small packets.
◦ Each packet is either received without error or lost entirely.
◦ Example: the Internet.

How to deal with packet loss?
◦ Simple retransmission protocols:
 Acknowledgements for missing packets.
 Acknowledgements for received packets.
◦ Erasure Correcting Codes.

Why Erasure Correcting Codes?

Retransmission is wasteful when erasures are frequent:
 Acknowledging missing packets: the number of acknowledgements would be enormous.
 Acknowledging received packets: retransmissions would lead to multiple copies of the same packet.

Broadcast Channel with erasure:
[Figure: a server broadcasting to receivers A, C, D, E, F over erasure channels.]

Erasure Correcting Codes:

Block Code, such as (N, K) Reed-Solomon Code:

Any K of the N transmitted symbols are received, then the original K
source symbols can be recovered.

High complexity: O( K(N−K) log₂N )

Estimate the erasure probability f first, then choose the code rate R =
K/N before transmission.

Ex. If the loss-rate is 50%, set the code rate R = K/N = 1 − 50% = 1/2
( N = 2K )

Erasure Correcting Codes:

Block Code, such as (N, K) Reed-Solomon Code:

If f is larger than expected (decoder receives fewer than K symbols):

Ex. We thought the loss-rate was 50% and set the code rate R = 1/2 ( N = 2K );
however, the actual loss-rate is 66.7%, so the proper code rate R should be
lower: R = 1/3 ( N = 3K )

We would like a simple way to extend the code on the fly to create a
lower-rate (N’, K) code.

No way!

Fountain Codes are rateless:

The number of encoded packets generated can be determined
on the fly.

Fountain Codes can also have fantastically small encoding and
decoding complexities.

This depends on a careful choice of the Degree Distribution.

Balls–and–Bins Problem:
◦ Imagine that we throw N balls independently at random into
K bins. What is the probability that a given bin ends up with
no ball in it?
 The probability that one particular bin is empty after N balls
have been thrown:
( 1 − 1/K )^N ≈ e^( −N/K )
 The expected number of empty bins:
δ = K e^( −N/K )
,which roughly implies: the probability that every bin has at
least one ball is (1 − δ) only if:
N ≥ K ln( K/δ )
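The balls-and-bins estimate above is easy to check numerically. The following sketch (function and variable names are illustrative, not from the paper) compares the predicted number of empty bins δ = K e^(−N/K) against a simulation, with N chosen as K ln(K/δ):

```python
import math
import random

def empty_bin_stats(K, N, trials=2000, seed=0):
    """Empirically estimate the expected number of empty bins after
    throwing N balls uniformly at random into K bins."""
    rng = random.Random(seed)
    total_empty = 0
    for _ in range(trials):
        occupied = set(rng.randrange(K) for _ in range(N))
        total_empty += K - len(occupied)
    return total_empty / trials

K = 100
delta = 0.5
N = math.ceil(K * math.log(K / delta))  # N = K ln(K/delta) throws

predicted = K * math.exp(-N / K)        # expected empty bins: delta = K e^(-N/K)
observed = empty_bin_stats(K, N)
print(predicted, observed)
```

Both values come out close to δ = 0.5, confirming that roughly K ln(K/δ) throws are needed before every bin is covered.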

Luby Transform (LT) Codes:
◦ Encoding process:
 For the ith encoded packet, select its degree di from a carefully
chosen Degree Distribution (the Robust Soliton Distribution),
which assigns probability μd to each degree d = 1, 2, 3, …, k.
 Choose di distinct source packets uniformly at random.
 Perform XOR on the chosen packets.
◦ Decoding process:
 Decode degree-one encoded packets.
 Remove degree-one edges iteratively.
[Figure: example bipartite graph between source packets x1, …, x6 and
encoded packets y1, …, y5 (e.g. y2 = x2 ⊕ x5, y3 = x3 ⊕ x5 ⊕ x6),
shown alongside the degree-distribution table: degree d = 1, 2, 3, …, k
with probabilities μ1, μ2, μ3, …, μk.]
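A minimal sketch of the encode/decode cycle just described. The degree distribution here is a toy stand-in (not the Robust Soliton Distribution), and all function names are illustrative:

```python
import random

def lt_encode(source, num_packets, degree_dist, rng):
    """Encode: each packet is (set of source indices, XOR of those symbols)."""
    K = len(source)
    packets = []
    for _ in range(num_packets):
        d = degree_dist(rng)                 # sample a degree
        idxs = set(rng.sample(range(K), d))  # pick d distinct source packets
        value = 0
        for i in idxs:
            value ^= source[i]               # XOR the chosen packets
        packets.append((idxs, value))
    return packets

def lt_decode(packets, K):
    """Peeling decoder: resolve degree-one packets, then XOR each recovered
    symbol out of every other packet containing it; repeat until stuck."""
    packets = [(set(idxs), val) for idxs, val in packets]
    decoded = {}
    progress = True
    while progress:
        progress = False
        for idxs, val in packets:
            if len(idxs) == 1:               # degree-one packet: decode it
                (i,) = idxs
                if i not in decoded:
                    decoded[i] = val
                    progress = True
        for n in range(len(packets)):        # remove edges to decoded symbols
            idxs, val = packets[n]
            for i in list(idxs):
                if len(idxs) > 1 and i in decoded:
                    idxs.remove(i)
                    val ^= decoded[i]
            packets[n] = (idxs, val)
    return decoded

rng = random.Random(1)
source = [rng.randrange(256) for _ in range(30)]
toy_degree = lambda r: r.choice([1, 1, 2, 2, 2, 3, 3, 4, 8])  # toy distribution
received = lt_encode(source, 90, toy_degree, rng)
decoded = lt_decode(received, len(source))
print(f"recovered {len(decoded)} of {len(source)} source symbols")
```

With a generous 3× overhead and plenty of low-degree packets the peeling process almost always runs to completion; the next slides explain how the Robust Soliton Distribution achieves this with far less overhead.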

Designing the Degree Distribution:
◦ A few encoded packets must have high degree.
 To ensure that every source packet is connected to some
encoded packet.
◦ Many encoded packets must have low degree.
 So that the decoding process can get started, and keep going.
 Also, the total number of XOR operations involved in
encoding and decoding is kept small.
[Figure: bipartite graph of source packets x1, …, x5 and encoded
packets y1, …, y5.]

Some properties of the Degree Distribution:
◦ The complexity (of both encoding and decoding) scales
linearly with the number of edges in the graph.
◦ Key factor: the average degree of the encoded packets.

How small can the number of edges be?
◦ Recall: Balls–and–Bins Problem.
 Balls: edges.
 Bins: source packets.
How small can the number of edges be while still ensuring that
every source packet has at least one edge on it? (all bins have
a ball)
[Figure: bipartite graph of source packets x1, …, x5 and encoded
packets y1, …, y5.]

Some properties of the encoder:
◦ The encoder throws edges onto the source packets at random.
 The number of edges must be at least of order K lnK.
 Balls–and–Bins Problem: with N edges, the expected number
of uncovered source packets is δ = K e^( −N/K ),
which roughly implies: the probability that every source packet
has at least one edge is (1 − δ) only if N ≥ K ln( K/δ ).

For the decoder:
◦ If the decoder receives the optimal number, K, of encoded
packets, the average degree of each encoded packet must be at
least lnK.
 The number of edges must be at least of order K lnK.
 The complexity (both encoding and decoding) of an LT code
will therefore be at least K lnK.

Luby showed that this bound on complexity can
indeed be achieved by a careful choice of Degree
Distribution.

Ideally, to avoid redundancy:
◦ We would like exactly one check node to have degree one at
each iteration.

Ideal Soliton Distribution:
ρ(1) = 1/K
ρ(d) = 1/( d(d−1) )   for d = 2, 3, …, K

The expected average degree under this distribution
is roughly lnK.
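The Ideal Soliton Distribution is simple to write down and check directly. This sketch verifies that it sums to one and that its average degree is close to ln K:

```python
import math

def ideal_soliton(K):
    """Ideal Soliton Distribution: rho(1) = 1/K, rho(d) = 1/(d(d-1)) for d >= 2."""
    rho = {1: 1.0 / K}
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

K = 10000
rho = ideal_soliton(K)
total = sum(rho.values())                       # telescoping sum: exactly 1
avg_degree = sum(d * p for d, p in rho.items()) # ~ harmonic sum ~ ln K
print(total, avg_degree, math.log(K))
```

The average degree exceeds ln K only by roughly Euler's constant, matching the "roughly lnK" claim on the slide.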

In practice, this distribution works poorly:
◦ Fluctuations around the expected behaviour:
 Sometimes in the decoding process there will be no degree-one
check node.
◦ A few source packets will receive no connections at all.

Some small modifications fix these problems.
◦ Robust Soliton Distribution:
 More degree-one check nodes.
 A few more high-degree check nodes.

Robust Soliton Distribution:

Two extra parameters: c and δ

The expected number of degree-one check nodes (throughout
the decoding process) is about:
S = c ln( K/δ ) √K

δ: a bound on the probability that the decoding fails to run to
completion after K′ encoded packets have been received.

c: a free parameter smaller than 1.

Luby's key result.

Luby defines a positive function
τ(d) = S/(Kd)             for d = 1, 2, …, (K/S) − 1
τ(d) = (S/K) ln( S/δ )    for d = K/S
τ(d) = 0                  for d > K/S
, then adds the Ideal Soliton Distribution ρ to τ and normalizes
to obtain the Robust Soliton Distribution μ:
μ(d) = ( ρ(d) + τ(d) ) / Z
, where Z = Σd ( ρ(d) + τ(d) ).
※ Once the receiver has received K′ = KZ encoded packets, the
decoding can run to completion with probability at least 1 − δ.
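Putting ρ, τ and Z together gives a direct implementation. A sketch, with the spike placed at d = round(K/S); the parameters match the experiment quoted later in the talk (c = 0.03, δ = 0.5, K = 10,000):

```python
import math

def robust_soliton(K, c, delta):
    """Robust Soliton Distribution mu(d) = (rho(d) + tau(d)) / Z."""
    S = c * math.log(K / delta) * math.sqrt(K)  # expected degree-one checks
    cutoff = int(round(K / S))                  # the spike sits at d = K/S
    rho = {1: 1.0 / K}
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))
    tau = {d: S / (K * d) for d in range(1, cutoff)}
    tau[cutoff] = (S / K) * math.log(S / delta)
    Z = sum(rho.values()) + sum(tau.values())
    mu = {d: (rho.get(d, 0.0) + tau.get(d, 0.0)) / Z for d in range(1, K + 1)}
    return mu, Z

K, c, delta = 10000, 0.03, 0.5
mu, Z = robust_soliton(K, c, delta)
S = c * math.log(K / delta) * math.sqrt(K)
# the talk quotes S = 30, K/S = 337, Z = 1.03 for these parameters
print(round(S), round(K / S), round(Z, 2))
```

Running this reproduces the quoted values S ≈ 30, K/S ≈ 337 and Z ≈ 1.03, i.e. an overhead K′ = KZ of about 3%.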
※ [Figure: the spike of τ at d = K/S. The high-degree spike ensures
every source packet is likely to be connected to a check; the
small-degree part of τ ensures the decoding process gets started.]
※ [Figure: histograms of the number of encoded packets N required
in order to recover the source data, K = 10,000.]
※ [Figure: practical performance of LT codes. Three experimental
decodings are shown, each with an overhead of about 10%.
All codes created with c = 0.03, δ = 0.5 (S = 30, K/S = 337,
Z = 1.03), and K = 10,000.]

Complexity cost:
◦ LT Codes: O(K lnK), where K is the number of source packets.
 Average degree of the encoded packets: lnK
 Encoding and decoding complexity: lnK per encoded packet
◦ Raptor Codes: linear-time encoding and decoding.
 Concatenate a weakened LT Code with an outer code.
 Average degree of the weakened LT code ≈ 3

Weakened LT Code:
◦ Average degree of encoded packets ≈ 3
◦ A fraction of the source packets will receive no connections at all.

What fraction?
◦ Balls–and–Bins Problem: with N ≈ 3K edges thrown into K bins,
the fraction of empty bins is e^( −3 ) ≈ 5%, so about 5% of the
source packets are never connected.
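The 5% figure follows directly from the balls-and-bins estimate with 3K edges; a quick check against a simulation:

```python
import math
import random

K = 10000
avg_degree = 3
N_edges = avg_degree * K           # ~K encoded packets of average degree 3

# Predicted fraction of source packets with no edge (empty bins):
predicted = math.exp(-N_edges / K)  # e^-3, about 5%

# Quick simulation: throw 3K edges into K bins
rng = random.Random(0)
hit = [False] * K
for _ in range(N_edges):
    hit[rng.randrange(K)] = True
observed = hit.count(False) / K
print(predicted, observed)
```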

Shokrollahi’s trick:
※ Encoder:
◦ Find an outer code that can correct erasures if the erasure rate
is e^( −3 ) ≈ 5%, then pre-code the K source packets into
K′ ≈ K/( 1 − e^( −3 ) ) packets.
◦ Transmit this slightly enlarged data using a weakened LT Code.
※ Decoder:
◦ Once slightly more than K encoded packets have been received, it
can recover a fraction ( 1 − e^( −3 ) ) ≈ 95% of the pre-coded
packets (roughly K packets).
◦ Then use the outer code to recover all the original data.
※ [Figure: schematic diagram of a Raptor Code. K = 16 source
packets are pre-coded into K′ = 20 packets; the weakened LT code
transmits N = 18 encoded packets, covering 17 of the pre-coded
packets.]
※ [Figure: the idea of a weakened LT Code. LT codes created with
c = 0.03, δ = 0.5 and truncated at degree 8 (thus average
degree ≈ 3).]

Ideal Soliton Distribution:
ρ(1) = 1/K
ρ(d) = 1/( d(d−1) )   for d = 2, 3, …, K

The average degree is roughly lnK:

  Σd d·ρ(d) = 1·(1/K) + Σ(d=2..K) d · 1/( d(d−1) )
            = 1/K + Σ(d=2..K) 1/(d−1)
            = 1/K + Σ(d=1..K−1) 1/d
            ≈ ln( K )

Robust Soliton Distribution μ:
μ(d) = ( ρ(d) + τ(d) ) / Z , where Z = Σd ( ρ(d) + τ(d) )

The average degree is roughly lnK:

  Σd d·μ(d) = Σd d·( ρ(d) + τ(d) ) / Z
            = ( Σd d·ρ(d) + Σd d·τ(d) ) / Z

  Σd d·ρ(d) ≈ ln( K )

  Σd d·τ(d) = Σ(d=1..K/S−1) d · S/(Kd) + (K/S)·(S/K)·ln( S/δ )
            ≈ 1 + ln( S/δ )

  ⇒ Σd d·μ(d) ≈ ( ln( K ) + 1 + ln( S/δ ) ) / Z = O( ln( K/δ ) )
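The derivation above can be checked numerically with the same parameters used earlier (K = 10,000, c = 0.03, δ = 0.5). The closed form is only an approximation, so the two values agree roughly, not exactly:

```python
import math

# Rebuild the Robust Soliton terms, following the definitions in the text.
K, c, delta = 10000, 0.03, 0.5
S = c * math.log(K / delta) * math.sqrt(K)
cutoff = int(round(K / S))                    # spike at d = K/S

rho = {1: 1.0 / K}
for d in range(2, K + 1):
    rho[d] = 1.0 / (d * (d - 1))
tau = {d: S / (K * d) for d in range(1, cutoff)}
tau[cutoff] = (S / K) * math.log(S / delta)
Z = sum(rho.values()) + sum(tau.values())

# Exact average degree of mu versus the approximate closed form.
avg = sum(d * (rho.get(d, 0.0) + tau.get(d, 0.0)) for d in range(1, K + 1)) / Z
closed_form = (math.log(K) + 1 + math.log(S / delta)) / Z
print(avg, closed_form)
```

The gap between the two values is under one edge per packet (mostly Euler's constant from the harmonic sum), consistent with the O(ln(K/δ)) bound.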