Fountain codes
D.J.C. MacKay
IEE Proceedings Communications, Vol. 152, No. 6, December 2005
Introduction
Fountain Codes
Intermission
LT Codes
Raptor Codes
Erasure Channel:
◦ Files are transmitted in multiple small packets.
◦ Each packet is either received without error or lost entirely.
◦ The Internet is one example.
How to deal with packet loss?
◦ Simple retransmission protocols:
ACKs for missing packets.
ACKs for received packets.
◦ Erasure Correcting Codes.
Why Erasure Correcting Codes?
Retransmission is wasteful when erasures are frequent:
ACKs for missing packets: the amount of ACK traffic would be enormous.
ACKs for received packets: the server would end up sending multiple copies of many packets.
Broadcast channel with erasure:
[Figure: a server broadcasting to multiple receivers (A, C, D, E, F), each losing different packets.]
Erasure Correcting Codes:
Block codes, such as an (N, K) Reed-Solomon code:
If any K of the N transmitted symbols are received, then the original K source symbols can be recovered.
High complexity: O(K(N − K) log₂ N).
The erasure probability f must be estimated first, and the code rate R = K/N chosen before transmission.
Ex. If the loss rate is 50%, set the code rate R = K/N = 1 − 50% = 1/2 (N = 2K).
[Figure: transmission over a channel with loss rate 50%.]
Erasure Correcting Codes:
Block codes, such as an (N, K) Reed-Solomon code:
If f is larger than expected (the decoder receives fewer than K symbols):
Ex. We thought the loss rate was 50% and set the code rate R = 1/2 (N = 2K); however, the actual loss rate is 66.7%, so the proper code rate should be lower: R = 1/3 (N = 3K).
We would like a simple way to extend the code on the fly to create a lower-rate (N′, K) code, but with a fixed block code there is no way to do this.
[Figure: transmission over a channel with loss rate 66.7%.]
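The rate arithmetic above can be sketched in a few lines of Python; `rs_params` is a hypothetical helper name, not part of any Reed-Solomon library:

```python
import math

def rs_params(K, loss_rate):
    """Pick the smallest N so that the code rate R = K/N
    does not exceed the channel capacity 1 - loss_rate."""
    N = math.ceil(K / (1 - loss_rate))
    return N, K / N

# Loss rate 50%: R = 1/2, N = 2K.
print(rs_params(100, 0.5))      # (200, 0.5)
# Loss rate 66.7%: the proper rate is lower, N = 3K, R = 1/3.
print(rs_params(100, 2 / 3))    # N = 300, R ~ 1/3
```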
Fountain Codes are rateless:
The number of encoded packets generated can be determined
on the fly.
Fountain Codes can also have fantastically small encoding and
decoding complexities.
This depends on a careful choice of the Degree Distribution.
Balls–and–Bins Problem:
◦ Imagine that we throw N balls independently at random into K bins. What is the probability that one bin ends up with no balls in it?
[Figure: N balls thrown into K bins.]
Balls–and–Bins Problem:
◦ After N balls have been thrown, what is the probability that one bin has no ball in it?
The probability that one particular bin is empty after N balls have been thrown:
(1 − 1/K)^N ≈ e^(−N/K)
The expected number of empty bins: δ = K e^(−N/K),
which roughly implies that all bins have a ball with probability (1 − δ) only if:
N > K ln(K/δ)
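The two approximations above can be checked numerically; the variable names below are illustrative:

```python
import math

K = 1000   # bins (source packets)
N = 5000   # balls (edges) thrown

# Exact probability that one particular bin stays empty, and its approximation.
p_exact = (1 - 1 / K) ** N
p_approx = math.exp(-N / K)

# Expected number of empty bins: delta = K * e^(-N/K).
delta = K * p_approx

# To make delta small (all bins covered with probability ~ 1 - delta),
# we need N > K * ln(K / delta).
delta_target = 0.05
N_needed = K * math.log(K / delta_target)
print(p_exact, p_approx, delta, N_needed)
```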
Luby Transform (LT) Codes:
◦ Encoding process:
For the ith encoded packet, select a degree di from a carefully chosen degree distribution (the Robust Soliton Distribution), which assigns probability μd to each degree d = 1, 2, …, k.
Choose di source packets at random.
Perform XOR on the chosen packets.
◦ Decoding process:
Decode the degree-one encoded packets (each equals a source packet directly).
Remove the degree-one edges iteratively: XOR each recovered source packet into every encoded packet connected to it, and repeat.
[Figure: a bipartite graph between source packets x1, …, x6 and encoded packets y1, …, y5; e.g. y1 = x2 (degree 1), y2 = x2⊕x5 (degree 2), y3 = x3⊕x5⊕x6 (degree 3).]
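The encoding and peeling-decoding steps above can be sketched as a minimal Python example in which packets are modeled as small integers; `lt_encode` and `lt_decode` are hypothetical names, and a full implementation would draw each degree from the Robust Soliton Distribution:

```python
import random

def lt_encode(sources, degree):
    """One encoded packet: choose `degree` distinct source packets
    at random and XOR them together."""
    idx = frozenset(random.sample(range(len(sources)), degree))
    val = 0
    for i in idx:
        val ^= sources[i]
    return idx, val

def lt_decode(num_sources, packets):
    """Peeling decoder: repeatedly take a degree-one packet, recover
    that source packet, and XOR it out of every packet containing it."""
    packets = [[set(idx), val] for idx, val in packets]
    decoded = [None] * num_sources
    progress = True
    while progress:
        progress = False
        for p in packets:
            if len(p[0]) == 1:
                i = p[0].pop()
                if decoded[i] is None:
                    decoded[i] = p[1]
                    progress = True
                    # Remove the newly created degree-one edges.
                    for q in packets:
                        if i in q[0]:
                            q[0].discard(i)
                            q[1] ^= decoded[i]
    return decoded

# Tiny hand-built example with three source packets.
sources = [5, 7, 3]
packets = [(frozenset({0}), 5),            # degree 1
           (frozenset({0, 1}), 5 ^ 7),     # degree 2
           (frozenset({1, 2}), 7 ^ 3)]     # degree 2
print(lt_decode(3, packets))   # [5, 7, 3]
```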
Designing the Degree Distribution:
◦ A few encoded packets must have high degree.
To ensure that every source packet is connected to some encoded packet.
◦ Many encoded packets must have low degree.
So that the decoding process can get started, and keep going.
Also, the total number of XOR operations involved in encoding and decoding is kept small.
[Figure: a bipartite graph between source packets x1, …, x5 and encoded packets y1, …, y5.]
Some properties of the Degree Distribution:
◦ The complexity of both encoding and decoding scales linearly with the number of edges in the graph.
◦ Key factor: the average degree of the encoded packets.
How small can the number of edges be?
◦ Recall the Balls–and–Bins Problem:
Balls: the edges.
Bins: the source packets.
How few edges can ensure that every source packet has at least one edge on it (all bins have a ball)?
[Figure: a bipartite graph between source packets x1, …, x5 and encoded packets y1, …, y5.]
Some properties of the encoder:
◦ The encoder throws edges into the source packets at random.
The number of edges must be at least of order K ln K.
(Balls–and–Bins Problem: the expected number of empty bins is δ = K e^(−N/K), which roughly implies that all bins have a ball with probability (1 − δ) only if N > K ln(K/δ).)
For the decoder:
◦ If the decoder receives the optimal number of encoded packets, K, then the average degree of each encoded packet must be at least ln K.
The number of edges must therefore be at least of order K ln K.
The complexity (of both encoding and decoding) of an LT code must therefore be at least of order K ln K.
Luby showed that this bound on complexity can indeed be achieved by a careful choice of Degree Distribution.
Ideally, to avoid redundancy:
◦ We would like exactly one check node to have degree one at each iteration.
Ideal Soliton Distribution:
ρ(1) = 1/K; ρ(d) = 1/(d(d − 1)) for d = 2, 3, …, K.
The expected average degree under this distribution is roughly ln K.
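A minimal sketch of this distribution, with `ideal_soliton` as a hypothetical helper name; it checks that ρ sums to one and that the average degree is close to ln K:

```python
import math

def ideal_soliton(K):
    """rho(1) = 1/K; rho(d) = 1/(d*(d-1)) for d = 2..K."""
    rho = [0.0] * (K + 1)          # index 0 unused
    rho[1] = 1 / K
    for d in range(2, K + 1):
        rho[d] = 1 / (d * (d - 1))
    return rho

K = 10000
rho = ideal_soliton(K)
print(sum(rho))                                  # telescopes to exactly 1
avg = sum(d * rho[d] for d in range(1, K + 1))
print(avg, math.log(K))                          # average degree ~ ln K
```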
In practice, this distribution works poorly:
◦ Fluctuations around the expected behaviour:
Sometimes in the decoding process there will be no degree-one check node.
◦ A few source packets will receive no connections at all.
Some small modifications fix these problems.
◦ Robust Soliton Distribution:
More degree-one check nodes.
A few more high-degree check nodes.
Robust Soliton Distribution:
Two extra parameters: c and δ.
The expected number of degree-one check nodes throughout the decoding process is about:
S = c ln(K/δ) √K
δ: a bound on the decoding failure probability, i.e. the probability that decoding fails to run to completion after K′ encoded packets have been received (Luby's key result).
c: a free parameter smaller than 1.
Luby defines a positive function τ(d):
τ(d) = S/(Kd) for d = 1, 2, …, K/S − 1;
τ(d) = (S/K) ln(S/δ) for d = K/S;
τ(d) = 0 for d > K/S,
then adds the Ideal Soliton Distribution ρ to τ and normalizes to obtain the Robust Soliton Distribution μ:
μ(d) = (ρ(d) + τ(d)) / Z, where Z = Σd (ρ(d) + τ(d)).
※ A receiver that has received K′ = KZ encoded packets can run the decoding to completion with probability at least 1 − δ.
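The construction can be sketched directly from these definitions; `robust_soliton` is a hypothetical helper name, and the parameters c = 0.03, δ = 0.5, K = 10,000 are the ones quoted in the experiments later in the deck:

```python
import math

def robust_soliton(K, c, delta):
    """Build mu(d) = (rho(d) + tau(d)) / Z from the definitions above."""
    S = c * math.log(K / delta) * math.sqrt(K)
    d_star = int(round(K / S))               # the spike position K/S
    rho = [0.0] * (K + 1)
    rho[1] = 1 / K
    for d in range(2, K + 1):
        rho[d] = 1 / (d * (d - 1))
    tau = [0.0] * (K + 1)
    for d in range(1, d_star):
        tau[d] = S / (K * d)
    tau[d_star] = (S / K) * math.log(S / delta)
    Z = sum(rho) + sum(tau)
    mu = [(rho[d] + tau[d]) / Z for d in range(K + 1)]
    return mu, S, d_star, Z

mu, S, d_star, Z = robust_soliton(10000, 0.03, 0.5)
print(round(S), d_star, round(Z, 2))   # S ~ 30, K/S ~ 337, Z ~ 1.03
```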
[Figure: the distributions ρ(d) and τ(d). The spike of τ at d = K/S (high degree) ensures that every source packet is likely to be connected to a check; the extra mass of τ at small degrees ensures that the decoding process gets started.]
[Figure: histograms of the number of encoded packets N required to recover the source data, K = 10,000.]
[Figure: practical performance of LT codes; three experimental decodings are shown, each succeeding with an overhead of about 10%. All codes created with c = 0.03, δ = 0.5 (S = 30, K/S = 337, Z = 1.03), and K = 10,000.]
Complexity cost:
◦ LT Codes: O(K ln K), where K is the number of source packets.
Average degree of the encoded packets: ln K.
Encoding and decoding complexity: ln K per encoded packet.
◦ Raptor Codes: linear-time encoding and decoding.
Obtained by concatenating a weakened LT code with an outer code.
Average degree of the weakened LT code ≈ 3.
Weakened LT Code:
◦ Average degree of the encoded packets ≈ 3.
◦ A fraction of the source packets will receive no connections at all. What fraction?
◦ Balls–and–Bins Problem: the fraction of empty bins is e^(−3) ≈ 5%.
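The balls-and-bins estimate of that fraction can be checked in a couple of lines; the variable names are illustrative:

```python
import math

K = 10000                # source packets (bins)
avg_degree = 3           # average degree of the weakened LT code
N = K                    # roughly K encoded packets
edges = avg_degree * N   # balls thrown into the K bins

# Fraction of source packets with no connection = fraction of empty bins.
frac_uncovered = math.exp(-edges / K)
print(frac_uncovered)    # e^-3, about 5%
```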
Shokrollahi's trick:
※ Encoder:
◦ Find an outer code that can correct erasures if the erasure rate is about 5% (the fraction of packets the weakened LT code leaves unconnected), then pre-code the K source packets into K′ ≈ K/(1 − 5%) pre-coded packets.
◦ Transmit this slightly enlarged data using a weakened LT code.
※ Decoder:
◦ Once slightly more than K encoded packets have been received, the decoder can recover about 95% of the pre-coded packets (roughly K packets).
◦ Then use the outer code to recover all the original data.
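A rough numeric sketch of this trick, assuming (for illustration only) K = 10,000 and the ~5% uncovered fraction from the previous slide; the paper's actual outer-code parameters may differ:

```python
import math

K = 10000
delta_tilde = math.exp(-3)     # ~5%: fraction the weakened LT leaves uncovered

# Pre-code K source packets into K' packets so that recovering a
# (1 - delta_tilde) fraction of them still yields about K packets.
K_pre = math.ceil(K / (1 - delta_tilde))
recovered = (1 - delta_tilde) * K_pre   # ~K; the outer code fixes the rest
print(K_pre, round(recovered))
```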
[Figure: schematic diagram of a Raptor Code. K = 16 source packets are pre-coded into K′ = 20 packets; N = 18 encoded packets are produced by the weakened LT code, covering 17 of the pre-coded packets.]
[Figure: the idea of a weakened LT Code. LT codes created with c = 0.03, δ = 0.5 and truncated at degree 8 (thus average degree ≈ 3).]
Ideal Soliton Distribution:
ρ(1) = 1/K; ρ(d) = 1/(d(d − 1)) for d = 2, …, K.
The average degree is roughly ln K:
Σ_{d=1..K} d·ρ(d) = 1/K + Σ_{d=2..K} d · 1/(d(d − 1))
= 1/K + Σ_{d=2..K} 1/(d − 1) ≈ ln(K)
Robust Soliton Distribution μ:
μ(d) = (ρ(d) + τ(d)) / Z, where Z = Σd (ρ(d) + τ(d)).
The average degree is roughly ln K:
Σd d·(ρ(d) + τ(d)) = Σd d·ρ(d) + Σd d·τ(d)
≈ ln(K) + Σ_{d=1..K/S−1} S/K + (K/S)·(S/K)·ln(S/δ)
≈ ln(K) + 1 + ln(S/δ) = O(ln(K/δ))