Transcript HVT
Communication Theory
I. Frigyes
2009-10/II.
http://docs.mht.bme.hu/~frigyes/hirkelm
hirkelm01bEnglish
Topics
• (0. Math. introduction: stochastic processes, complex envelope)
• 1. Basics of decision and estimation theory
• 2. Transmission of digital signals over analog channels: noise effects
• 3. Transmission of digital signals over analog channels: dispersion effects
• 4. Transmission of analog signals – analog modulation methods (?)
• 5. Channel characterization: wireless channels, optical fibers
• 6. Basics of digital signal processing: sampling, quantization, signal representation
• 7. Theoretical limits in information transmission
• 8. Basics of coding theory
• 9. Correcting transmission errors: error-correcting coding; adaptive equalization
• 10. Spectral efficiency – efficient digital transmission methods
(0. Stochastic processes, the
complex envelope)
Stochastic processes
• Also called random waveforms.
• 3 different meanings:
• As a function of ξ (the realization index): a series of an infinite number of random variables ordered in time
• As a function of time t: a member of a
time-function family of irregular variation
• As a function of ξ and t: one member of a
family of time functions drawn at random
Stochastic processes
• example:
(Figure: three sample functions $f(t,\xi_1)$, $f(t,\xi_2)$, $f(t,\xi_3)$ drawn versus t, and the cross-sections $f(t_1,\xi)$, $f(t_2,\xi)$, $f(t_3,\xi)$ taken at fixed time instants.)
Stochastic processes: how to
characterize them?
• According to the third definition
• And with some probability distribution.
• As the number of random variables is
infinite: with their joint distribution (or
density)
• (not only infinite but continuum cardinality)
• Taking these into account:
Stochastic processes: how to
characterize them?
• (Say: density)
• First probability density of x(t): $p_x(X; t)$
• Second: joint density at $t_1, t_2$: $p_{x_1,x_2}(X_1, X_2; t_1, t_2)$
• nth: n-fold joint density: $p_{x_1,\dots,x_n}(X_1, \dots, X_n; t_1, \dots, t_n)$
• The stochastic process is completely characterized if there is a rule to compose the density of any order (even for $n \to \infty$).
• (We'll see processes depending on 2 parameters)
Stochastic processes: how to
characterize them?
• Comment: although, strictly speaking, the process (a function of t and ξ) and one sample function (a function of t belonging to, say, ξ16) are distinct, we'll not always make this distinction.
Stochastic processes: how to
characterize them?
• Example: semi-random binary signal:
• Values: ±1 ($P_0 = P_1 = 0.5$)
• Change: only at $t = kT$
• First density:
$p_x(X) = \tfrac{1}{2}\,\delta(X-1) + \tfrac{1}{2}\,\delta(X+1)$
• Second (joint density at $t_1, t_2$):
$p_{x_1,x_2}(X_1,X_2;t_1,t_2) = \tfrac{1}{2}\,\delta(X_1-1,\,X_2-1) + \tfrac{1}{2}\,\delta(X_1+1,\,X_2+1)$, if $t_1, t_2$ are in the same time slot;
$p_{x_1,x_2}(X_1,X_2;t_1,t_2) = \tfrac{1}{4}\,\delta(X_1-1,\,X_2-1) + \tfrac{1}{4}\,\delta(X_1-1,\,X_2+1) + \tfrac{1}{4}\,\delta(X_1+1,\,X_2-1) + \tfrac{1}{4}\,\delta(X_1+1,\,X_2+1)$, otherwise.
Continuing the example:
(Figure: support of the joint density. In the same time-slot the probability mass lies on the 45° diagonal, $X_1 = X_2 = \pm 1$; in two distinct time-slots it is spread equally over all four points $(\pm 1, \pm 1)$.)
Stochastic processes: the
Gaussian process
• A stoch. proc. is Gaussian if its n-th density is that of an n-dimensional vector random variable:
$p_{x(t_i)}(\mathbf{X}) = \frac{1}{\sqrt{(2\pi)^n \det \mathbf{K}}}\; e^{-\frac{1}{2}(\mathbf{X}-\mathbf{m})^T \mathbf{K}^{-1}(\mathbf{X}-\mathbf{m})}$
• m is the expected value vector, K the covariance matrix.
• The nth density can be produced if
$m_x(t) = E[x(t)]$ and $K_x(t_1,t_2) = E\{[x(t_1)-m_x(t_1)][x(t_2)-m_x(t_2)]\}$
are given.
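To make this concrete, here is a minimal numerical sketch (not part of the original slides): it evaluates the n-th order Gaussian density from a mean vector and covariance matrix, using an assumed exponential covariance model $K_x(t_1,t_2)=e^{-|t_1-t_2|}$ and zero mean; the sampling instants are also arbitrary assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

# assumed covariance model for the sketch: K_x(t1, t2) = exp(-|t1 - t2|), zero mean
t = np.array([0.0, 0.5, 1.0, 2.0])                 # the n = 4 sampling instants
K = np.exp(-np.abs(t[:, None] - t[None, :]))       # covariance matrix K
m = np.zeros_like(t)                               # mean vector m

X = np.array([0.2, -0.1, 0.3, 0.0])                # one possible value of (x(t1), ..., x(tn))
density = multivariate_normal(mean=m, cov=K).pdf(X)
print(f"n-th order Gaussian density at X: {density:.4f}")
```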
Stochastic processes: the
Gaussian process
• An interesting property of Gaussian
processes (more precisely: of Gaussian
variables):
$E[xyzw] = E[xy]\,E[zw] + E[xz]\,E[yw] + E[xw]\,E[zy]$
• These can be realizations of one process
at different times
Stochastic processes: stationary
processes
• A process is stationary if it does not
change (much) as time is passing
• E.g. the semirandom binary signal is
(almost) like that
• Phone: transmitting 300-3400 Hz is sufficient (always, for everybody). (What could we do if this didn't hold?)
• etc.
Stochastic processes: stationary
processes
• Precise definition of what "almost unchanged" means:
• A process is stationary (in the strict sense) if its distribution function of any order is invariant to any time shift τ, at any times:
$F_x(X_1,\dots,X_n;\,t_1, t_2,\dots,t_n) = F_x(X_1,\dots,X_n;\,t_1+\tau, t_2+\tau,\dots,t_n+\tau)$
• It is stationary in order n if the first n distributions are stationary.
• E.g.: the example seen above is first-order stationary.
• In general: if stationary in order n, it is also stationary in any order < n.
Stochastic processes: stationary
processes
• Comment: proving strict-sense stationarity is difficult.
• But: if a Gaussian process is second-order stationary (i.e. in this case: if K(t1,t2) does not change when time is shifted), it is strict-sense (i.e. any-order) stationary. Reason: if we know K(t1,t2), the nth density can be computed (for any n).
Stochastic processes:
stationarity in wide sense
• Wide sense stationary: if the correlation
function is unchanged if time is shifted (to
be defined)
• A few definitions:
• a process is called a Hilbert process if
$E[x(t)^2] < \infty$
• (That means: the instantaneous power is finite.)
Stochastic processes: wide
sense stationary processes
• (Auto)correlation function of a Hilbert process:
$R(t_1,t_2) = E[x(t_1)\,x(t_2)]$
• The process is wide sense stationary if
• the expected value is time-invariant and
• R depends only on τ=t2-t1 for any time and
any τ.
Stochastic processes: wide sense
– strict sense stationary processes
• If a process is strict-sense stationary then also
wide-sense
• If at least second order stationary: then also
wide sense.
$R(t_1,t_2) = E[x(t_1)\,x(t_2)] = \int\!\!\int X_{t_1} X_{t_2}\, p_{x_1 x_2}(X_{t_1}, X_{t_2};\,t_1,t_2)\, dX_{t_1}\, dX_{t_2}$
• If the process is (at least) second-order stationary, the joint density depends only on $\tau = t_2 - t_1$, i.e.:
$R(t_1,t_2) = \int\!\!\int X_{t_1} X_{t_2}\, p_{x_1 x_2}(X_{t_1}, X_{t_2};\,\tau)\, dX_{t_1}\, dX_{t_2} = R(t_1, t_1+\tau) = R(\tau)$
Stochastic processes: wide sense
– strict sense stationary processes
• Conversely: a wide-sense stationary process is in general not strict-sense stationary (of any order).
• Exception: the Gaussian process. If it is wide-sense stationary, it is also strict-sense stationary.
Stochastic processes: once
again on binary transmission
• As seen: only first order stationary (Ex=0)
• Correlation:
• if $t_1$ and $t_2$ are in the same time-slot:
$R(t_1,t_2) = E[x(t_1)\,x(t_2)] = 1$
• if in different ones:
$R(t_1,t_2) = E[x(t_1)]\,E[x(t_2)] = 0$
Stochastic processes: once
again on binary transmission
• The semi-random binary transmission can be transformed into a random one by introducing a dummy random variable e, distributed uniformly in (0,1):
$y(t) = x(t - eT)$
• like x:
$E[y(t)] = 0$
Stochastic processes: once
again on binary transmission
• Correlation:
• If $|t_1 - t_2| > T$: the two instants fall into different time slots for every value of e, so
$R_y(t_1,t_2) = E[y(t_1)]\,E[y(t_2)] = 0$
• If $|t_1 - t_2| \le T$:
$E[y(t_1)y(t_2)] = E\{E[y(t_1)y(t_2)\,|\,e]\}$, where $E[y(t_1)y(t_2)\,|\,e] = 1$ if the two instants fall into the same slot and 0 otherwise
• so
$R_y(\tau) = \Pr\{\text{same slot}\} = 1 - \frac{|\tau|}{T}, \quad |\tau| \le T$
Stochastic processes: once
again on binary transmission
• I.e.: $R_y(\tau)$ is the triangle function rising linearly from 0 at $\tau = -T$ to 1 at $\tau = 0$ and falling back to 0 at $\tau = T$ (figure).
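A short simulation can be used to check the triangle result. This sketch is not from the slides; the slot length, the sampling density and the number of slots are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n_bits = 100, 5_000                      # samples per slot (T = 1 slot), number of slots

bits = rng.choice([-1.0, 1.0], size=n_bits)  # equiprobable +/-1 values
x = np.repeat(bits, fs)                      # semi-random binary wave
shift = rng.integers(0, fs)                  # random timing offset e*T -> random binary wave
y = x[shift:shift + (n_bits - 1) * fs]

max_lag = 2 * fs
lags = np.arange(max_lag)
R_est = np.array([np.mean(y[:-max_lag] * y[k:k - max_lag]) for k in lags])
R_theory = np.maximum(1.0 - lags / fs, 0.0)  # 1 - |tau|/T for |tau| <= T, else 0

print(np.max(np.abs(R_est - R_theory)))      # small for a long observation
```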
Stochastic processes: other
type of stationarity
• Given two processes, x and y; they are jointly stationary if their joint distributions are all invariant to any time shift τ.
• Thus a complex process $z(t) = x(t) + j\,y(t)$ is stationary in the strict sense if x and y are jointly stationary.
• A process is periodic (or cyclostationary) if its distributions are invariant to time shifts of kT.
Stochastic processes: other
type of stationarity
• Cross-correlation:
$R_{x,y}(t_1,t_2) = E[x(t_1)\,y(t_2)]$
• Two processes are jointly stationary in the wide sense if their cross-correlation is invariant to any time shift, i.e. depends only on $\tau = t_2 - t_1$.
Stochastic processes: comment
on complex processes
• Appropriate definition of correlation for
these:
$R(t_1,t_2) = E[x(t_1)\,x^*(t_2)]$
• A complex process is stationary in the
wide sense if both real and imaginary
parts are wide sense stationary and they
are that jointly as well
Stochastic processes: continuity
• There are various definitions
• Mean square continuity:
$\lim_{\varepsilon \to 0} E\{[x(t+\varepsilon) - x(t)]^2\} = 0$
• That is valid if the correlation is continuous
Stochastic processes:
stochastic integral
• Let x(t) be a stoch. proc. It may be that the Riemann integral exists for all realizations:
$s = \int_a^b x(t)\,dt$
• Then s is a random variable (RV). But if not, we can define an RV converging (e.g. in mean square) to the approximating sum of the integral:
Stochastic processes:
stochastic integral
$s = \int_a^b x(t)\,dt \quad \text{if} \quad \lim_{\Delta t_i \to 0} E\Big[\Big(s - \sum_{i=1}^{n} x(t_i)\,\Delta t_i\Big)^2\Big] = 0$
• For this
$E[s] = \int_a^b E[x(t)]\,dt$
$\sigma_s^2 = \int_a^b\!\!\int_a^b \{R(t_1,t_2) - E[x(t_1)]\,E[x(t_2)]\}\,dt_1\,dt_2$
Stochastic processes:
stochastic integral - comment
• In $\sigma_s^2$ the integrand is the (auto)covariance function:
$C(t_1,t_2) \triangleq E[x(t_1)x(t_2)] - E[x(t_1)]\,E[x(t_2)]$
• This depends only on t1-t2=τ if x is
stationary (at least wide sense)
Stochastic processes: time
average
• The integral is needed – among others – to define the time average.
• The time average of a process is its DC component;
• the time average of its square is the mean power.
• definition:
$\langle x \rangle = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x(t)\,dt$
Stochastic processes: time
average
• In general this is a random variable. It would be nice if it were equal to the statistical average. This is really the case if
$\langle x \rangle = E[x(t)] \quad \text{and} \quad \sigma^2_{\langle x \rangle} = 0$
• Similarly we can define
$\langle R(\tau) \rangle = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} x(t)\,x(t+\tau)\,dt$
Stochastic processes: time
average
• This is in general also an RV. But it is equal to the correlation if
$\langle R_x(\tau) \rangle = R_x(\tau), \quad \text{with} \quad \sigma^2_{\langle R \rangle} = 0$
• If these equalities hold, the process is called ergodic.
• The process is mean square ergodic if
$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} R_x(\tau)\,d\tau = E[x(t)]^2$
Stochastic processes: spectral
density
• The spectral density of a process is, by definition, the Fourier transform of the correlation function:
$S_x(\omega) = \int_{-\infty}^{\infty} R_x(\tau)\,e^{-j\omega\tau}\,d\tau$
Stochastic processes: spectral
density
• A property:
$E[x(t)^2] = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_x(\omega)\,d\omega$
• as
$\frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,d\omega = f(u)\big|_{u=0}$, where $(\omega, u)$ are Fourier variable pairs,
and: $R(0) = E[x(t)^2]$
• Consequently this integral is > 0; (we'll see: $S_x(\omega) \ge 0$)
Spectral density and linear
transformation
• As known, for time functions the output is the convolution:
x(t) → FILTER h(t) → y(t)
$y(t) = x(t) * h(t) = \int x(u)\,h(t-u)\,du$
• h(t): impulse response
Spectral density and linear
transformation
• Comment: $h(t<0) \equiv 0$; (why?); and: $h(t) = F^{-1}[H(\omega)]$
• It is plausible: the same holds for stochastic processes.
• Based on that it can be shown that
$S_y(\omega) = |H(\omega)|^2\, S_x(\omega)$
• (And also $E[y(t)] = E[x(t)]\cdot H(0)$.)
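A minimal numerical check of $S_y(\omega) = |H(\omega)|^2 S_x(\omega)$, assuming an arbitrary 4th-order Butterworth low-pass as the linear system and an approximately white input; all parameters below are assumptions for the sketch, not values from the slides.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
fs = 1000.0                              # sample rate [Hz] (assumed)
x = rng.standard_normal(200_000)         # nearly white input process, S_x ~ constant

b, a = signal.butter(4, 100, fs=fs)      # assumed linear system h(t): 100 Hz low-pass
y = signal.lfilter(b, a, x)

f, Sx = signal.welch(x, fs=fs, nperseg=4096)     # estimated input spectral density
_, Sy = signal.welch(y, fs=fs, nperseg=4096)     # estimated output spectral density
_, H = signal.freqz(b, a, worN=f, fs=fs)         # frequency response at the same bins

# S_y(f) should closely match |H(f)|^2 * S_x(f)
err = np.max(np.abs(Sy - np.abs(H)**2 * Sx)) / np.max(Sy)
print(f"max relative deviation: {err:.3f}")
```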
Spectral density and linear
transformation
• Further: $S(\omega) \ge 0$ (at all frequencies).
• For: if not, there would be a domain $(\omega_1, \omega_2)$ where $S(\omega) < 0$. Passing x(t) through an ideal bandpass filter H(ω) selecting only this domain, the output would have $S_y(\omega) < 0$ there and zero elsewhere, so the integral of $S_y$ – the output power – would be negative, which is impossible.
Spectral density and linear
transformation
• $S(\omega)$ is the spectral density (in rad/s). As
$S_y(\omega) = S_x(\omega)\,|H(\omega)|^2,$
the power at the output of an ideal narrow bandpass filter H(ω) of bandwidth B [Hz] is
$P = \frac{1}{2\pi}\int S_x(\omega)\,|H(\omega)|^2\,d\omega \approx 2\,S_x(\omega)\,B_{Hz}$
(the factor 2 accounts for the two bands at $\pm\omega$).
Modulated signals – the complex
envelope
• In previous studies we've seen that in radio and optical transmission one parameter of a sinusoidal carrier is influenced by (e.g. made proportional to) the modulating signal.
• A general modulated signal:
$x(t) = \sqrt{2}\,A\,d(t)\,\cos[\omega_c t + \varphi(t)]$
Modulated signals – the
complex envelope
• Here d(t) and/or φ(t) carries the information – e.g. they are in linear relationship with the modulating signal.
• Another description method (quadrature form):
$x(t) = A\,a(t)\cos\omega_c t - A\,q(t)\sin\omega_c t$
• d, φ, a and q are real time functions – deterministic or realizations of a stoch. proc.
Modulated signals – the
complex envelope
• Their relationship:
$d(t) = \sqrt{\frac{a^2(t)+q^2(t)}{2}}; \qquad \varphi(t) = \arctan\frac{q(t)}{a(t)}$
$a(t) = \sqrt{2}\,d(t)\cos\varphi(t); \qquad q(t) = \sqrt{2}\,d(t)\sin\varphi(t)$
• As known, x(t) can also be written as:
$x(t) = \mathrm{Re}\{A[a(t) + j\,q(t)]\,e^{j\omega_c t}\}$
Modulated signals – the
complex envelope
• Here a+jq is the complex envelope. Question: when and how to apply it.
• To begin with: the Fourier transform of a real function is conjugate symmetric:
$X(\omega) \triangleq F[x(t)]; \qquad X(-\omega) = X^*(\omega)$
• But if so: $X(\omega > 0)$ describes the signal completely: knowing it we can form the ω < 0 part and retransform.
Modulated signals – the
complex envelope
• Thus instead of X(ω) we can take:
$\mathring{X}(\omega) \triangleq X(\omega) + \mathrm{sign}(\omega)\,X(\omega) = X(\omega) + j\,[-j\,\mathrm{sign}(\omega)\,X(\omega)]$
• By the way:
$\mathring{X}(\omega) = 2X(\omega)$ for $\omega > 0$; $\;0$ for $\omega < 0$
• $-j\,\mathrm{sign}(\omega)$ is the transfer function of the "Hilbert" filter.
• The relevant time function:
$\mathring{x}(t) = F^{-1}[\mathring{X}(\omega)] = x(t) + j\,F^{-1}[-j\,\mathrm{sign}(\omega)\,X(\omega)]$
Modulated signals – the
complex envelope
• We can write: $\mathring{x}(t) = x(t) + j\,\hat{x}(t)$, where the shown inverse Fourier transform is $F^{-1}[-j\,\mathrm{sign}\,\omega] = \frac{1}{\pi t}$.
• So
$\mathring{x}(t) = x(t) + \frac{j}{\pi}\int_{-\infty}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau = x(t) + j\,\mathcal{H}\{x(t)\} = x(t) + j\,\hat{x}(t)$
• The imaginary part is the so-called Hilbert transform of x(t).
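As an illustration of the analytic signal and the complex envelope (not from the slides), scipy.signal.hilbert returns $x + j\hat{x}$; dividing out the carrier then yields $a + jq$. The carrier, sample rate and component frequencies below are assumed values chosen to satisfy the narrow-band condition.

```python
import numpy as np
from scipy.signal import hilbert

fs, fc = 10_000.0, 1_000.0                       # sample rate and carrier [Hz] (assumed)
t = np.arange(0, 0.2, 1 / fs)

a = 1.0 + 0.5 * np.cos(2 * np.pi * 50 * t)       # slowly varying in-phase component a(t)
q = 0.3 * np.sin(2 * np.pi * 30 * t)             # slowly varying quadrature component q(t)
x = a * np.cos(2 * np.pi * fc * t) - q * np.sin(2 * np.pi * fc * t)

x_analytic = hilbert(x)                          # analytic signal  x + j*Hilbert{x}
x_tilde = x_analytic * np.exp(-2j * np.pi * fc * t)   # complex envelope  a + j q

# away from the edges the recovered envelope matches a(t) and q(t)
sl = slice(200, -200)
print(np.allclose(x_tilde.real[sl], a[sl], atol=0.02),
      np.allclose(x_tilde.imag[sl], q[sl], atol=0.02))
```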
Modulated signals – the
complex envelope
• The function $\mathring{x}(t)$ introduced now is the analytic signal assigned to x(t) (as it is an analytic function of the complex variable z = t + ju).
• An analytic signal can be assigned to any (baseband or modulated) function; the relationship between the time function and the analytic signal is $x(t) = \mathrm{Re}\,\mathring{x}(t)$.
Modulated signals – the
complex envelope
• It is applicable to modulated signals: the analytic signal of $\cos\omega_c t$ is $e^{j\omega_c t}$. Similarly, that of $\sin\omega_c t$ is $-j\,e^{j\omega_c t}$. So if the quadrature components a(t), q(t) of the modulated signal are
• band limited and
• their band-limiting frequency is $< \omega_c/2\pi$ (narrow band signal),
• then
$x(t) = a(t)\cos\omega_c t \;\Rightarrow\; \mathring{x}(t) = a(t)\,e^{j\omega_c t}$
or
$x(t) = a(t)\cos\omega_c t - q(t)\sin\omega_c t \;\Rightarrow\; \mathring{x}(t) = [a(t)+j\,q(t)]\,e^{j\omega_c t}$
• NB. Modulation is a linear operation in a, q: a frequency displacement.
Modulated signals – the
complex envelope
• Thus
$\tilde{x}(t) \triangleq a(t) + j\,q(t)$, the complex envelope,
uniquely determines the modulated signal. In the time domain:
$\mathring{x}(t) = \tilde{x}(t)\,e^{j\omega_c t}; \qquad x(t) = \mathrm{Re}\,\mathring{x}(t) = a(t)\cos\omega_c t - q(t)\sin\omega_c t$
• Comment: $\tilde{x}(t)$, according to its name, can be complex. (X(ω) is not conjugate symmetric around ωc.)
• Comment 2: if the bandwidth B > fc, $\mathring{x}(t)$ is not analytic and its real part does not define the modulated signal.
• Comment 3: a and q can be independent signals (QAM) or can be related (FM or PM).
Modulated signals – the
complex envelope
• In the frequency domain? For the analytic signal we have seen:
$\tilde{x}(t) = \mathring{x}(t)\,e^{-j\omega_c t} \quad \text{and so} \quad \tilde{X}(\omega) = \mathring{X}(\omega + \omega_c)$
• (Figure: the spectra X(ω), X̊(ω) and X̃(ω) sketched versus ω.)
Modulated signals – the complex envelope
• A linear transformation – a bandpass filter – acts on x̃(t) as a lowpass filter.
• If H(ω) is asymmetric: x̃(t) becomes complex – i.e. there is crosstalk between a(t) and q(t).
• (there was no sin component – now there is.)
• (Figure: M(ω), H(ω), X(ω)=F[m(t)cosωct], Y(ω), X̊(ω), Y̊(ω), X̃(ω), Ỹ(ω).)
Modulated signals – the complex envelope,
stochastic processes
• Analytic signal and complex envelope are
defined for deterministic signals
• It is possible for stochastic processes as
well
• No detailed discussion
• One point:
Modulated signals – the complex envelope; stochastic processes
• $x(t) = a(t)\cos\omega_c t - q(t)\sin\omega_c t$ is stationary ($R_x$ is independent of t) iff
$E[a(t)] = E[q(t)] = 0; \quad R_a(\tau) = R_q(\tau); \quad R_{aq}(\tau) = -R_{qa}(\tau)$
• since
$R_x(t,\tau) = \tfrac12\{[R_a(\tau)+R_q(\tau)]\cos\omega_c\tau + [R_{qa}(\tau)-R_{aq}(\tau)]\sin\omega_c\tau\} + \tfrac12\{[R_a(\tau)-R_q(\tau)]\cos[\omega_c(2t+\tau)] - [R_{aq}(\tau)+R_{qa}(\tau)]\sin[\omega_c(2t+\tau)]\}$
Narrow band (white) noise
• White noise is of course not narrow band
• Usually it can be made narrow band by a
(fictive) bandpass filter:
(Figure: white-noise spectral density $S_n(\omega) = N_0/2$ shaped by a bandpass filter H(ω) around $\pm\omega_c$.)
Narrow band (white) noise
properties
$\mathring{n}(t) = \tilde{n}(t)\,e^{j\omega_c t} = [n_c(t) + j\,n_s(t)]\,e^{j\omega_c t}$
$R_c(\tau) = R_s(\tau); \qquad R_{c,s}(\tau) = -R_{s,c}(\tau)$
$R_{\tilde{n}}(\tau) = R_c(\tau) + j\,R_{c,s}(\tau)$
$S_n(\omega) = \tfrac12\,[S_{\tilde{n}}(\omega - \omega_c) + S_{\tilde{n}}(-\omega - \omega_c)]$
1. Basics of decision theory and
estimation theory
Detection-estimation problems in
communications
• 1. Digital communication: one among known signals (known by the receiver) is transmitted – in the presence of noise.
DIGITAL SOURCE → Transmission channel → SINK
• E.g. (baseband binary communication):
$s_1(t) = U + n(t); \qquad s_0(t) = n(t)$
• Decide: which was sent?
• Decide: which was sent?
Detection-estimation problems in
communications
• 2. An (otherwise) known signal has unknown parameter(s) (their statistics are known). Same block schematic; example: noncoherent FSK
$s_1(t) = \sqrt{2}A\cos(\omega_{c1}t + \varphi_1) + n(t); \qquad s_2(t) = \sqrt{2}A\cos(\omega_{c2}t + \varphi_2) + n(t)$
$p_{\varphi_1}(\varphi) = p_{\varphi_2}(\varphi) = \frac{1}{2\pi},\ \varphi \in (-\pi,\pi); \quad 0 \text{ otherwise}$
Detection-estimation problems
in communications
• Other example: non-coherent FSK over a nonselective Rayleigh fading channel
$s_1(t) = \sqrt{2}A\cos(\omega_{c1}t + \varphi_1) + n(t); \qquad s_2(t) = \sqrt{2}A\cos(\omega_{c2}t + \varphi_2) + n(t)$
$p_{\varphi_1}(\varphi) = p_{\varphi_2}(\varphi) = \frac{1}{2\pi},\ \varphi \in (-\pi,\pi); \quad 0 \text{ otherwise}$
• and the amplitude A is Rayleigh distributed:
$p_A(A) = \frac{A}{\sigma^2}\,e^{-A^2/2\sigma^2}$
Detection-estimation problems
in communications
• 3. The signal shape undergoes random changes.
• Example: antipodal signal set in very fast fading
DIGITAL SOURCE → Transmission channel T(t) → SINK
$s_1(t) = s(t)\,T(t) + n(t); \qquad s_2(t) = -s(t)\,T(t) + n(t)$
Detection-estimation problems
in communications
• 4. Analog radio communications: one parameter of the carrier is proportional to the time-continuous modulating signal.
• E.g.: analog FM; estimate: m(t)
$s(t) = \sqrt{2}A\cos\big[\omega_c t + 2\pi\,\Delta F\!\textstyle\int m(t)\,dt\big] + n(t)$
• Or: digital transmission over frequency-selective fading. For the decision, h(t) must be known (i.e. estimated):
$r(t) = \int h(\tau)\,s_i(t-\tau)\,d\tau + n(t); \qquad i = 1,2,\dots,M$
Basics of decision theory
• Simplest example: simple binary transmission;
decision is based on N independent samples
• Model:
SOURCE → (H0 or H1) → CHANNEL (only statistics are known) → OBSERVATION SPACE (OS) → DECIDER (decision rule: H0? H1?) → Ĥ
• Comment: here the hat (ˆ) has nothing to do with the Hilbert transform.
Basics of decision theory
• Two hypotheses (H0 and H1)
• Observation: N samples → the OS is N-dimensional
• Observation vector: $\mathbf{r}^T = (r_1, r_2, \dots, r_N)$
• Decision: which was sent
• Results: 4 possibilities
• 1. H0 sent & Ĥ=H0 (correct)
• 2. H0 sent & Ĥ=H1 (erroneous)
• 3. H1 sent & Ĥ=H1 (correct)
• 4. H1 sent & Ĥ=H0 (erroneous)
Bayes decision
• Bayes decision:
• a.) the probabilities of sending H0 or H1 are a priori known:
$\Pr\{H_0\} \triangleq P_0; \qquad \Pr\{H_1\} \triangleq P_1$
• b.) each decision has some cost $C_{ik}$ (we decide in favor of i while k was sent)
• c.) it is sure: a false decision is more expensive than a correct one:
$C_{10} > C_{00}; \qquad C_{01} > C_{11}$
Bayes decision
• d.) decision rule: the average cost (the so-called risk, K) should be minimal
$K = C_{11}P_1\Pr\{\hat H = H_1|H_1\} + C_{01}P_1\Pr\{\hat H = H_0|H_1\} + C_{00}P_0\Pr\{\hat H = H_0|H_0\} + C_{10}P_0\Pr\{\hat H = H_1|H_0\}$
• (Figure: the observation space – the domain of r – is partitioned into Z1 (decide "H1") and Z0 (decide "H0"); two pdf-s, $p_{r|H_1}(R|H_1)$ and $p_{r|H_0}(R|H_0)$, correspond to each point.)
Bayes decision
• Question: how to partition the OS in order to get minimal K?
• For that, K in detail:
$K = C_{00}P_0\int_{Z_0} p_{r|H_0}(R|H_0)\,dR + C_{10}P_0\int_{Z_1} p_{r|H_0}(R|H_0)\,dR + C_{11}P_1\int_{Z_1} p_{r|H_1}(R|H_1)\,dR + C_{01}P_1\int_{Z_0} p_{r|H_1}(R|H_1)\,dR$
• As some decision is always taken: $Z_0 \cup Z_1 = OS$, and so
$\int_{Z_1} p\,dR = 1 - \int_{Z_0} p\,dR$
Bayes decision
• From that:
$K = P_0 C_{10} + P_1 C_{11} + \int_{Z_0}\big\{P_1(C_{01}-C_{11})\,p_{r|H_1}(R|H_1) - P_0(C_{10}-C_{00})\,p_{r|H_0}(R|H_0)\big\}\,dR$
• Terms 1 & 2 are constant
• And: both bracketed terms are > 0
• Thus: assign a point R to Z1 where the first term is larger, and to Z0 where the second is.
Bayes decision
$Z_1:\ P_1(C_{01}-C_{11})\,p_{r|H_1}(R|H_1) > P_0(C_{10}-C_{00})\,p_{r|H_0}(R|H_0)$
• And here we decide in favor of H1: $\hat H = H_1$
$Z_0:\ P_1(C_{01}-C_{11})\,p_{r|H_1}(R|H_1) < P_0(C_{10}-C_{00})\,p_{r|H_0}(R|H_0)$
• decision: $\hat H = H_0$
Bayes decision
• It can also be written: decide for H1 if
$\Lambda(R) \triangleq \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} > \frac{P_0(C_{10}-C_{00})}{P_1(C_{01}-C_{11})} \triangleq \eta$
• Otherwise for H0.
• Left-hand side: the likelihood ratio, Λ(R)
• Right-hand side: (from a certain aspect) the threshold, η
• Comment: Λ depends only on the realization of r (on: what did we measure?);
• η only on the a-priori probabilities and costs
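The likelihood-ratio rule can be written as a small function; the sketch below is not from the slides, and the Gaussian likelihoods of the example call only anticipate the next slide as an assumed illustration.

```python
import numpy as np
from scipy import stats

def bayes_decision(R, p1, p0, P1, P0, C01, C11, C10, C00):
    """Binary Bayes decision: returns 1 if H1 is decided, else 0.

    R      -- observation vector (N independent samples)
    p1, p0 -- per-sample likelihood functions p(r|H1), p(r|H0)
    """
    lam = np.prod(p1(R)) / np.prod(p0(R))            # likelihood ratio for independent samples
    eta = (P0 * (C10 - C00)) / (P1 * (C01 - C11))    # threshold
    return int(lam > eta)

# assumed example: H1 = constant m in Gaussian noise, H0 = noise only
m, sigma = 1.0, 1.0
R = np.array([0.8, 1.3, 0.4])
decision = bayes_decision(
    R,
    p1=lambda r: stats.norm.pdf(r, loc=m, scale=sigma),
    p0=lambda r: stats.norm.pdf(r, loc=0.0, scale=sigma),
    P1=0.5, P0=0.5, C01=1.0, C11=0.0, C10=1.0, C00=0.0)
print("decided H1" if decision else "decided H0")
```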
Example of Bayesian decision
• H1: constant voltage + Gaussian noise
• H0: Gaussian noise only (notation: φ(r; m_r, σ²) for a Gaussian pdf)
• Decision: based on N independent samples of r
• At sample #i:
$p_{r_i|H_1}(R_i) = \varphi(R_i; m, \sigma^2); \qquad p_{r_i|H_0}(R_i) = \varphi(R_i; 0, \sigma^2)$
• This results in
$p_{r|H_1}(R|H_1) = \prod_{i=1}^{N}\varphi(R_i; m, \sigma^2); \qquad p_{r|H_0}(R|H_0) = \prod_{i=1}^{N}\varphi(R_i; 0, \sigma^2)$
Example of Bayesian decision
$\Lambda(R) = \frac{\prod_{i=1}^{N}\exp\left[-\frac{(R_i - m)^2}{2\sigma^2}\right]}{\prod_{i=1}^{N}\exp\left[-\frac{R_i^2}{2\sigma^2}\right]}$
• its logarithm:
$\ln\Lambda(R) = \frac{m}{\sigma^2}\sum_{i=1}^{N}R_i - \frac{N m^2}{2\sigma^2}$
• resulting in
$\hat H = H_1:\ \sum_{i=1}^{N}R_i > \frac{\sigma^2}{m}\ln\eta + \frac{Nm}{2}; \qquad \hat H = H_0:\ \sum_{i=1}^{N}R_i < \frac{\sigma^2}{m}\ln\eta + \frac{Nm}{2}$
(the right-hand side is the threshold)
Comments to the example
• 1. The threshold contains known quantities only,
independent of the measured values
• 2. The result depends only on the sum of the r_i-s – we have to know only that; the so-called sufficient statistic l(R):
$l(R) = \sum_{i=1}^{N} R_i$
• 2.a As in this example: whatever the OS dimension, l(R) is always 1-dimensional,
• i.e. "1 coordinate" – the others are independent of the hypothesis
Thus the decision process
SOURCE → (H0 or H1) → CHANNEL (only statistics known) → OBSERVATION SPACE (OS) → l(R) → DECISION SPACE (DS) → DECIDER (decision rule) → Ĥ
Comments to the example
• 3. Special case: $C_{00}=C_{11}=0$ and $C_{01}=C_{10}=1$
$K = P_0\int_{Z_1} p_{r|H_0}(R|H_0)\,dR + P_1\int_{Z_0} p_{r|H_1}(R|H_1)\,dR$
• (i.e. the probability of erroneous decision)
• If $P_0 = P_1 = 0.5$, the threshold is $N\,m/2$
An other example, for home
• Similar but now the signal is not constant
but Gaussian noise with variance σS2
• I.e.
H1: $\prod_i \varphi(R_i;\,0,\,\sigma_S^2+\sigma^2)$
H0: $\prod_i \varphi(R_i;\,0,\,\sigma^2)$
• Questions: threshold; sufficient statistics
Third example - discrete
• Given two Poisson sources with different
expected values; which was sent?
• Remember: the Poisson distribution:
$\Pr\{n\} = \frac{m^n e^{-m}}{n!}$
• Hypotheses:
$H_1: \Pr\{n\} = \frac{m_1^n e^{-m_1}}{n!}; \qquad H_0: \Pr\{n\} = \frac{m_0^n e^{-m_0}}{n!}$
Third example - discrete
• Likelihood ratio:
$\Lambda(n) = \left(\frac{m_1}{m_0}\right)^n e^{-(m_1 - m_0)}$
• Decision rule ($m_1 > m_0$):
if $n > \frac{\ln\eta + (m_1 - m_0)}{\ln(m_1/m_0)}$: $H_1$; if $n < \frac{\ln\eta + (m_1 - m_0)}{\ln(m_1/m_0)}$: $H_0$
• For the sake of precision: if n is exactly equal to the threshold, decide $H_1$ or $H_0$ at random with probability 0.5 each ($\Pr(Q=0) = \Pr(Q=1) = 0.5$).
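A hedged sketch of the resulting rule (the function name and the example means m0 = 2, m1 = 6 are assumptions, not values from the slides):

```python
import numpy as np

def poisson_threshold(m0, m1, eta=1.0):
    """Decision threshold on the observed count n for two Poisson hypotheses.

    Decide H1 when n exceeds the returned value (assumes m1 > m0).
    """
    return (np.log(eta) + (m1 - m0)) / np.log(m1 / m0)

# assumed example: m0 = 2, m1 = 6 counts on average, equal priors, 0/1 costs (eta = 1)
thr = poisson_threshold(2.0, 6.0)
print(f"decide H1 for n > {thr:.2f}")
```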
Comment
• A possible situation: the a-priori probabilities are not known.
• A possible method then: compute the maximal K (as a function of the P_i) and choose the decision rule which minimizes that (the so-called minimax decision).
• (Note: this is not optimal at any particular P_i.)
• But we don't deal with this in detail.
Probability of erroneous decision
• For that: compute relevant integral
• In example 1 (with N=1): the Gaussian pdf should be integrated over the hatched domains (figure: the two conditional pdfs, centered at 0 and at m, with the threshold d₀ between them; $d_1 = m - d_0$).
• Threshold:
$d_0 = \frac{\sigma^2}{m}\ln\eta + \frac{m}{2}$
Probability of erroneous decision
• Thus:
$P_E = P_0\,P(1|0) + P_1\,P(0|1)$
$P(1|0) = \int_{d_0}^{\infty}\varphi(R;0,\sigma)\,dR = \frac12\,\mathrm{erfc}\!\left(\frac{d_0}{\sigma\sqrt{2}}\right)$
$P(0|1) = \int_{-\infty}^{d_0}\varphi(R;m,\sigma)\,dR = \frac12\,\mathrm{erfc}\!\left(\frac{m - d_0}{\sigma\sqrt{2}}\right)$
• If $\ln\eta = 0$: $d_0 = d_1 = m/2$ (threshold: the point of intersection)
• Comment: with N samples the sufficient statistic is $\sum R_i$, the threshold is $d_0 = \frac{\sigma^2}{m}\ln\eta + \frac{Nm}{2}$, and σ is replaced by $\sigma\sqrt{N}$ (and m by Nm) in the erfc arguments.
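These formulas can be evaluated directly; the sketch below (all parameters are assumed) computes $P_E$ for the N-sample case:

```python
import numpy as np
from scipy.special import erfc

def error_probability(m, sigma, P0=0.5, P1=0.5, eta=1.0, N=1):
    """P_E of the binary test (constant m vs. 0 in Gaussian noise, N samples)."""
    d0 = (sigma**2 / m) * np.log(eta) + N * m / 2   # threshold on the sum of samples
    s = sigma * np.sqrt(N)                           # std of the sum under either hypothesis
    P_1given0 = 0.5 * erfc(d0 / (s * np.sqrt(2)))
    P_0given1 = 0.5 * erfc((N * m - d0) / (s * np.sqrt(2)))
    return P0 * P_1given0 + P1 * P_0given1

print(error_probability(m=1.0, sigma=1.0, N=4))      # error probability for 4 samples
```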
Decision with more than 2
hypotheses
• M possible outcomes (e.g. non-binary digital communication – we'll see why)
• Like before: each decision has a cost
• Their average is the risk
• With the Bayes decision this is minimized
• Like before: Observation Space;
• the decision rule is a partitioning of the OS
Decision with more than 2 hypotheses
• Like before, the risk:
$K \triangleq \sum_{i=0}^{M-1}\sum_{j=0}^{M-1} C_{ij}\,P_j \int_{Z_i} p_{R|H_j}(R|H_j)\,dR$
• From that (with M = 3):
$K = P_0 C_{00} + P_1 C_{11} + P_2 C_{22}$
$\quad + \int_{Z_0}\{P_2(C_{02}-C_{22})\,p_{r|H_2}(R|H_2) + P_1(C_{01}-C_{11})\,p_{r|H_1}(R|H_1)\}\,dR$
$\quad + \int_{Z_1}\{P_2(C_{12}-C_{22})\,p_{r|H_2}(R|H_2) + P_0(C_{10}-C_{00})\,p_{r|H_0}(R|H_0)\}\,dR$
$\quad + \int_{Z_2}\{P_1(C_{21}-C_{11})\,p_{r|H_1}(R|H_1) + P_0(C_{20}-C_{00})\,p_{r|H_0}(R|H_0)\}\,dR$
Decision with more than two hypotheses
• Likelihood ratios:
$\Lambda_1(R) = \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)}; \qquad \Lambda_2(R) = \frac{p_{r|H_2}(R|H_2)}{p_{r|H_0}(R|H_0)}$
• Decision rule(s):
$\hat H = H_1$ or $H_2$: $P_1(C_{01}-C_{11})\Lambda_1(R) > P_0(C_{10}-C_{00}) + P_2(C_{12}-C_{02})\Lambda_2(R)$; $\hat H = H_0$ or $H_2$ if less;
$\hat H = H_2$ or $H_1$: $P_2(C_{02}-C_{22})\Lambda_2(R) > P_0(C_{20}-C_{00}) + P_1(C_{21}-C_{01})\Lambda_1(R)$; $\hat H = H_0$ or $H_1$ if less;
$\hat H = H_2$ or $H_0$: $P_2(C_{12}-C_{22})\Lambda_2(R) > P_0(C_{20}-C_{10}) + P_1(C_{21}-C_{11})\Lambda_1(R)$; $\hat H = H_1$ or $H_0$ if less.
Decision with more than two hypotheses (M = 3)
• This defines 3 straight lines in the 2-dimensional decision space spanned by Λ1(R) and Λ2(R). (Figure: the (Λ1, Λ2) plane partitioned into the regions H0, H1 and H2.)
Example: special case – error probability
• The average error probability is minimized:
$C_{ii} = 0; \qquad C_{i \ne j} = 1$
• Then the region boundaries in the (Λ1, Λ2) plane become the straight lines $\Lambda_1 = P_0/P_1$, $\Lambda_2 = P_0/P_2$ and $\Lambda_2 = (P_1/P_2)\Lambda_1$ (figure).
The previous, detailed
• With $C_{ii} = 0$, $C_{i \ne j} = 1$ the three rules become:
$\hat H = H_1$ or $H_2$: $P_1\Lambda_1(R) > P_0$; $\hat H = H_0$ or $H_2$ if less;
$\hat H = H_2$ or $H_1$: $P_2\Lambda_2(R) > P_0$; $\hat H = H_0$ or $H_1$ if less;
$\hat H = H_2$ or $H_0$: $P_2\Lambda_2(R) > P_1\Lambda_1(R)$; $\hat H = H_1$ or $H_0$ if less.
• (The same figure as before: region boundaries $\Lambda_1 = P_0/P_1$, $\Lambda_2 = P_0/P_2$ and $\Lambda_2 = (P_1/P_2)\Lambda_1$.)
Example –special case, error
probability: a-posteriori prob.
• Very important, based on the preceding: we have
$H_1$ or $H_2$: $P_1\,p_{r|H_1}(R|H_1) > P_0\,p_{r|H_0}(R|H_0)$
$H_2$ or $H_1$: $P_2\,p_{r|H_2}(R|H_2) > P_0\,p_{r|H_0}(R|H_0)$
$H_2$ or $H_0$: $P_2\,p_{r|H_2}(R|H_2) > P_1\,p_{r|H_1}(R|H_1)$
• If we divide each by $p_r(R)$ and apply the Bayes theorem
$\Pr\{a|b\} = \frac{\Pr\{b|a\}\,\Pr\{a\}}{\Pr\{b\}}$
we get:
$\Pr\{H_1|R\} > \Pr\{H_0|R\}; \quad \Pr\{H_2|R\} > \Pr\{H_0|R\}; \quad \Pr\{H_2|R\} > \Pr\{H_1|R\}$
• (these are a-posteriori probabilities)
Example –special case, error
probability: a-posteriori prob.
• I.e. we have to decide on the maximum a-posteriori probability.
• Rather plausible: probability of correct
decision is the highest if we decide on
what is the most probable
Bayes theorem (conditional probabilities)
$\Pr\{a|b\} \triangleq \frac{\Pr\{a,b\}}{\Pr\{b\}}; \qquad \Pr\{b|a\} \triangleq \frac{\Pr\{a,b\}}{\Pr\{a\}}; \qquad \Pr\{a|b\} = \frac{\Pr\{b|a\}\,\Pr\{a\}}{\Pr\{b\}}$
• For discrete variables (as above).
• Continuous:
$p(a|b) = \frac{p(b|a)\,p(a)}{p(b)}$
• a discrete, b continuous:
$\Pr\{a|b\} = \frac{p(b|a)\,\Pr\{a\}}{p(b)}$
Comments
• 1. The Observation Space is N dimensional (N
is number of the observations).
•
The Decision Space is M-1 dimensional (M is
number of the hypotheses).
• 2. Explicitly we dealt only with independent samples; the investigation is much more complicated if these are correlated.
• 3. We’ll see that in digital transmission the case
N>1 is often not very important
Basics of estimation theory:
parameter-estimation
• Task is to estimate unknown parameter(s)
of an analog or digital signal
• Examples: voltage measurement in noise:
$x = s; \qquad r = s + n$
• digital signal; phase measurement:
$s_i(t) = a_i(t)\cos\!\left(\omega_c t + \frac{2\pi i}{M} + \theta\right), \quad i = 0,1,\dots,M-1$ (θ, the carrier phase, is to be estimated)
Basics of estimation theory:
parameter-estimation
• Frequency estimation:
• synchronizing an inaccurate oscillator
• Power estimation of interfering signals
•
interference cancellation (via the
antenna or multiuser detection)
• SNR estimation
• etc
Basics of estimation theory:
parameter-estimation
• The parameter can be:
• i. a random variable (its pdf is assumed to be known), or
• ii. an unknown deterministic value
• (i. means: we have some a-priori knowledge; ii. means: we have no a-priori knowledge about its magnitude)
• Model:
PARAMETER SPACE (domain of the estimated parameter a, with pdf $p_a(A)$) → mapping to the OS → OBSERVATION SPACE (OS) → estimation rule → $\hat a(R)$, ESTIMATION SPACE
Example – details (estimation 1)
• We want to measure the voltage a
• We know that a is a random variable (a-priori knowledge)
• And that Gaussian noise φ(r; 0, σ_n²) is added
• The observable parameter is r = a + n
• Mapping of the parameter to the OS:
$p_{r|a}(R|A) = \frac{1}{\sqrt{2\pi\sigma_n^2}}\exp\!\left[-\frac{(R-A)^2}{2\sigma_n^2}\right]$
Parameter estimation –parameter
is a RV
• Similar principle: an estimation has some cost; its average is the risk; we want to minimize that.
• realization of the parameter: a
• observation vector: R
• estimated value: â(R)
• The cost is, in the general case, a 2-variable function: C(a, â)
• The error of the estimation is ε = a − â(R)
• Often the cost depends only on that: C = C(ε)
Parameter estimation –
parameter is a RV
• Examples:
$C(\varepsilon) = \varepsilon^2; \qquad C(\varepsilon) = |\varepsilon|; \qquad C(\varepsilon) = \begin{cases}0, & |\varepsilon| \le \Delta/2\\ 1, & |\varepsilon| > \Delta/2\end{cases}$
• The risk is
$K = E[C] = \int\!\!\int C[A, \hat a(R)]\,p_{a,r}(A,R)\,dR\,dA$
• The joint pdf can be written:
$p_{a,r}(A,R) = p_r(R)\,p_{a|r}(A|R)$
Parameter estimation –
parameter is a RV
• Applying this to the square cost function (subscript ms: mean square):
$K_{ms} = \int p_r(R)\left[\int \big(A - \hat a(R)\big)^2\,p_{a|r}(A|R)\,dA\right] dR$
• K = min (i.e. $dK/d\hat a = 0$) where the inner integral is minimal (as the outer weight is i. positive and ii. does not depend on A); this holds where:
Parameter estimation –
parameter is a RV
$\frac{d}{d\hat a}\int\big(A - \hat a(R)\big)^2\,p_{a|r}(A|R)\,dA = -2\int A\,p_{a|r}(A|R)\,dA + 2\,\hat a(R)\int p_{a|r}(A|R)\,dA = 0$
• The second integral = 1, thus
$\hat a_{ms}(R) = \int A\,p_{a|r}(A|R)\,dA$
• i.e. the a-posteriori expected value of a.
• (According to the previous definition: a-posteriori knowledge is what is gained from the measurement/investigation.)
Comment
• Coming back to the risk:
$K_{ms} = \int p_r(R)\,dR \int \big(A - \hat a(R)\big)^2\,p_{a|r}(A|R)\,dA$
• The inner integral is now the conditional variance, $\sigma_a^2(R)$. Thus
$K_{ms} = \int p_r(R)\,\sigma_a^2(R)\,dR = E[\sigma_a^2]$
Parameter estimation –
parameter is a RV
• Another cost function: 0 within a band Δ, 1 elsewhere. The risk is now
$K_{un} = \int p_r(R)\,dR\left[1 - \int_{\hat a(R)-\Delta/2}^{\hat a(R)+\Delta/2} p_{a|r}(A|R)\,dA\right]$
• (un: uniform)
• K = min if the result of the estimation is the maximum of the conditional pdf (if Δ is small): maximum a-posteriori – MAP estimation
Parameter estimation –
parameter is a RV
• Then the derivative of the log conditional pdf = 0:
$\left.\frac{\partial \ln p_{a|r}(A|R)}{\partial A}\right|_{A = \hat a(R)} = 0$
• (the so-called MAP equation)
• Applying the Bayes theorem, the log conditional pdf can be written:
$p_{a|r}(A|R) = \frac{p_{r|a}(R|A)\,p_a(A)}{p_r(R)}$
$\ln p_{a|r}(A|R) = \ln p_{r|a}(R|A) + \ln p_a(A) - \ln p_r(R)$
Parameter estimation –
parameter is a RV
• The first term is the statistical relationship between A and R
• The second is the a-priori knowledge
• The last term does not depend on A.
• Thus what has to be maximal is
$l(A) \triangleq \ln p_{r|a}(R|A) + \ln p_a(A)$
• And so the MAP equation is:
$\left.\left[\frac{\partial \ln p_{r|a}(R|A)}{\partial A} + \frac{\partial \ln p_a(A)}{\partial A}\right]\right|_{A = \hat a(R)} = 0$
Once again
• The Minimum Mean Square Error (MMSE) estimate is the mean of the a-posteriori pdf.
• The Maximum A-Posteriori (MAP) estimate is the maximum (mode) of the a-posteriori pdf.
Example (estimate-2)
• Again we have Gaussian a + n, but now N independent samples:
$p_a(A) = \frac{1}{\sqrt{2\pi\sigma_a^2}}\exp\!\left(-\frac{A^2}{2\sigma_a^2}\right)$
$p_{r|a}(R|A) = \prod_{i=1}^{N}\frac{1}{\sqrt{2\pi\sigma_n^2}}\exp\!\left[-\frac{(R_i - A)^2}{2\sigma_n^2}\right]$
• What we need for (any) estimate is
$p_{a|r}(A|R) = \frac{p_{r|a}(R|A)\,p_a(A)}{p_r(R)}$
Example (estimate-2)
• Note that p(R) is constant from the point of
view of the conditional pdf, thus its form is
irrelevant.
• Thus
$p_{a|r}(A|R) = k_1(R)\exp\!\left\{-\frac12\left[\sum_{i=1}^{N}\frac{(R_i - A)^2}{\sigma_n^2} + \frac{A^2}{\sigma_a^2}\right]\right\}$
Example (estimate-2)
• This is a Gaussian distribution and we need its expected value. For that the square must be completed in the exponent, resulting in
$p_{a|r}(A|R) = k_2(R)\exp\!\left[-\frac{(A - \mu)^2}{2\sigma_p^2}\right]$
with
$\mu = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_n^2/N}\cdot\frac{1}{N}\sum_{i=1}^{N}R_i, \qquad \sigma_p^2 = \frac{\sigma_a^2\,\sigma_n^2/N}{\sigma_a^2 + \sigma_n^2/N}$
Example (estimate-2)
• In a Gaussian pdf the mean equals the mode, thus
$\hat a_{ms}(R) = \hat a_{map}(R) = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_n^2/N}\cdot\frac{1}{N}\sum_{i=1}^{N}R_i$
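A small numerical sketch of this result (not from the slides; σ_a, σ_n and N below are assumed values): it draws a from its a-priori pdf, simulates the N observations and compares the MMSE/MAP estimate with the plain sample mean.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_a, sigma_n, N = 2.0, 1.0, 10           # a-priori std, noise std, number of samples

a = rng.normal(0.0, sigma_a)                 # the parameter, drawn from its a-priori pdf
R = a + rng.normal(0.0, sigma_n, size=N)     # N independent noisy observations

sample_mean = R.mean()
shrink = sigma_a**2 / (sigma_a**2 + sigma_n**2 / N)
a_hat = shrink * sample_mean                 # MMSE = MAP estimate in the Gaussian case

print(f"true a = {a:.3f}, sample mean = {sample_mean:.3f}, MMSE/MAP = {a_hat:.3f}")
```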
Example (estimate-3)
• a is now also Gaussian, φ(A; 0, σ_a²), but only s(A), a nonlinear function of it, can be observed (e.g. the phase of a carrier); noise is added:
$r_i = s(A) + n_i$
• The a-posteriori density function:
$p_{a|r}(A|R) = k(R)\exp\!\left\{-\frac12\left[\sum_{i=1}^{N}\frac{[R_i - s(A)]^2}{\sigma_n^2} + \frac{A^2}{\sigma_a^2}\right]\right\}$
Example (estimation-3)
• Remember the MAP equation:
$\left.\left[\frac{\partial \ln p_{r|a}(R|A)}{\partial A} + \frac{\partial \ln p_a(A)}{\partial A}\right]\right|_{A = \hat a(R)} = 0$
• Applying that to the preceding:
$\hat a_{MAP}(R) = \frac{\sigma_a^2}{\sigma_n^2}\sum_{i=1}^{N}\big[R_i - s(A)\big]\left.\frac{\partial s(A)}{\partial A}\right|_{A = \hat a_{MAP}(R)}$
Parameter estimation – the
parameter is a real constant
• In that case it is the measurement result that is an RV. E.g. in the case of a square cost function the risk is:
$K(A) = \int \big[\hat a(R) - A\big]^2\,p_{r|a}(R|A)\,dR$
• This would be minimal if â(R) = A. But that makes no sense: that is just the value we want to estimate.
• In that case – i.e. if we have no a-priori knowledge – the method can be:
• we search for a function of the observations – an estimator – which is "good" (has an average close to A and low variance).
Interlude: the exam topics so far
• 1. Stochastic processes: mainly the definitions of the concepts
• (stoch. proc.; probability densities/distributions, strict-sense stationarity; correlation function, wide-sense stationarity, time average, ergodicity, spectral density, linear transformation)
• 2. Modulated signals – the complex envelope
• (description of modulated signals, time functions, analytic signal (frequency-time), complex envelope, obtaining one from the other, filtering)
Interlude: the exam topics so far /2
• 3. Basics of decision theory
• (What kind of tasks; cost, risk, a-priori probability, Bayes decision; optimal partitioning of the observation space, threshold, sufficient statistics, M hypotheses (no details needed, only the result), a-posteriori probability)
• 4. Basics of estimation theory
• (Random-variable parameter, cost functions, min. mean square, max. a-posteriori; deterministic parameter, likelihood function, ML, MVU estimation, unbiasedness; Cramér-Rao; efficiency)
Parameter estimation – the parameter is
a real constant – criteria for the estimator
• The average of the – somehow chosen – estimator:
$E[\hat a(R)] \triangleq \int \hat a(R)\,p_{r|A}(R|A)\,dR = A + B(A)$
• If this = A: unbiased estimate; the estimation operation then amounts to finding this average.
• If the bias B is constant: it can be subtracted from the average (e.g. in optics: background radiation).
• If B = f(A): biased estimate.
Parameter estimation – the parameter is
a real constant – criteria for the estimator
• Error variance:
$\sigma_\varepsilon^2 = E\big\{\big[\hat a(R) - A - B(A)\big]^2\big\}$
• Best would be: unbiased and low variance;
• more precisely: unbiased and minimal variance for every A (MVU).
• Such an estimator may or may not exist.
• An often applied estimator: maximum likelihood (ML).
• Likelihood function: $p_{r|a}(R|A)$ regarded as a function of A.
• The estimate of A is then the maximum of the likelihood function.
A few details
• (Figure: pdfs of two estimators around the true value Θ; the estimator whose pdf is more concentrated – whose measuring result has lower variance – is certainly better.)
Max. likelihood (ML) estimation
• A necessary condition for the maximum of the log-likelihood:
$\left.\frac{\partial \ln p_{r|a}(R|A)}{\partial A}\right|_{A = \hat a(R)} = 0$
• Remember: in the case of an RV the MAP estimate was:
$\left.\left[\frac{\partial \ln p_{r|a}(R|A)}{\partial A} + \frac{\partial \ln p_a(A)}{\partial A}\right]\right|_{A = \hat a(R)} = 0$
• Thus: ML is the same without a-priori knowledge
Max. likelihood (ML) estimation
• For any unbiased estimate the variance cannot be lower than the Cramér-Rao lower bound (CRLB):
$\sigma^2\big[\hat a(R) - A\big] \ge \left( E\left\{\left[\frac{\partial \ln p_{r|a}(R|A)}{\partial A}\right]^2\right\} \right)^{-1}$
• or in another form:
$\sigma^2\big[\hat a(R) - A\big] \ge \left( -E\left\{\frac{\partial^2 \ln p_{r|a}(R|A)}{\partial A^2}\right\} \right)^{-1}$
Max. likelihood (ML) estimation
• Proof of the CRLB: via the Schwarz inequality.
• If the variance of an estimate equals the CRLB: efficient estimate.
• (Figure: an example where an MVU estimate exists but no (or no known) efficient estimate – var θ̂₁ > CRLB.)
Example 1 (estimation,
nonrandom)
• Voltage + noise, but the voltage is a real, nonrandom quantity.
• Maximum likelihood estimation:
$r_i = A + n_i,\quad i = 1,2,\dots,N; \qquad p_n(n) = \varphi(n; 0, \sigma_n)$
$\frac{\partial \ln p_{r|A}}{\partial A} = \frac{1}{\sigma_n^2}\sum_{i=1}^{N}(R_i - A) = 0$
$\hat a_{ML}(R) = \frac{1}{N}\sum_{i=1}^{N}R_i$
Example 1 (estimation,
nonrandom)
• Is it biased?
$E[\hat a_{ML}(R)] = \frac{1}{N}\sum_{i=1}^{N}E[R_i] = \frac{1}{N}\,N\,A = A$
• The expected value of â is the (a-priori not known) true value; i.e. it is unbiased.
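A quick simulation (assumed parameters, not from the slides) illustrating both statements: the sample-mean ML estimate is unbiased, and for this problem its variance reaches the CRLB, which is $\sigma_n^2/N$.

```python
import numpy as np

rng = np.random.default_rng(2)
A_true, sigma_n, N, trials = 1.5, 1.0, 20, 10_000

# ML estimate of A (the sample mean) over many independent experiments
R = A_true + rng.normal(0.0, sigma_n, size=(trials, N))
a_ml = R.mean(axis=1)

crlb = sigma_n**2 / N                        # Cramér-Rao lower bound for this problem
print(f"mean of estimates: {a_ml.mean():.4f} (close to {A_true}: unbiased)")
print(f"variance of estimates: {a_ml.var():.5f}, CRLB: {crlb:.5f}")
```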
Example 2 (estimation,
nonrandom): phase of a sinusoid
• What can be measured is a function of the quantity to be estimated.
• (Independent samples are taken at equal time instants.)
• The likelihood function to be maximized:
Example 2 (estimation,
nonrandom): phase of a sinusoid
• This is maximal if the function below is minimal:
• (setting its derivative) = 0
• The right-hand side tends to 0 for large N.
Example 2 (estimation,
nonrandom): phase of a sinusoid
• And then finally:
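The final formula did not survive in this transcript. As a hedged sketch, the standard large-N ML phase estimate for the model $r_i = \sqrt{2}A\cos(\omega_c t_i + \theta) + n_i$ (this model and all parameters below are assumptions, not taken from the slides) is $\hat\theta = -\arctan\big(\sum_i r_i\sin\omega_c t_i \,/\, \sum_i r_i\cos\omega_c t_i\big)$:

```python
import numpy as np

rng = np.random.default_rng(3)
fc, fs, N = 1000.0, 50_000.0, 400        # carrier, sample rate, number of samples (assumed)
A, sigma_n, theta_true = 1.0, 0.5, 0.7   # assumed amplitude, noise std, true phase

t = np.arange(N) / fs
r = np.sqrt(2) * A * np.cos(2 * np.pi * fc * t + theta_true) + rng.normal(0, sigma_n, N)

# correlate with the quadrature carriers; for large N the ML estimate is
# theta_hat = -atan2( sum r*sin(w t), sum r*cos(w t) )
I = np.sum(r * np.cos(2 * np.pi * fc * t))
Q = np.sum(r * np.sin(2 * np.pi * fc * t))
theta_hat = -np.arctan2(Q, I)

print(f"true phase: {theta_true:.3f} rad, ML estimate: {theta_hat:.3f} rad")
```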
2. Transmission of digital signals
over analog channels: effect of
noise
Introductory comments
• Theory of digital transmission is (at least
partly) application of decision theory
• Definition of digital signals/transmission:
• a finite number of signal shapes (M);
• each has the same finite duration (T);
• the receiver knows (a priori) the signal shapes (they are stored).
• So the task of the receiver is hypothesis
testing.
Introductory comments –
degrading effects
(Figure: block diagram of the degrading effects. The signal s(t) at carrier ωc passes through a NONLINEAR AMPLIFIER, a BANDPASS FILTER and a FADING CHANNEL; additive noise n(t), co-channel interference z0(t) (CCI, at ωc) and adjacent-channel interference z1(t), z2(t) (ACI, at ω1, ω2) are added; the sum passes a BANDPASS FILTER before the DECISION MAKER.)
Introductory comments
• Quality parameter: the error probability
• (I.e. the costs are: $C_{ii} = 0;\ C_{ik} = 1,\ i \ne k;\ i,k = 1,2,\dots,M$.)
• An erroneous decision may be caused by:
• additive noise
• linear distortion
• nonlinear distortion
• additive interference (CCI, ACI)
• false knowledge of a parameter,
• e.g. a synchronization error
Introductory comments
• Often it is not one signal of which the error
probability is of interest but of a group of
signals – e.g. of a frame.
• (A second quality parameter: erroneous recognition of T: the jitter.)
Transmission of single signals in
additive Gaussian noise
• Among the many sources of error now we
regard only this one
• Model to be investigated:
SOURCE {m_i}, P_i → SIGNAL GENERATOR → s_i(t) → + (additive noise n(t); TIMING (T)) → r(t) = s_i(t) + n(t) → DECISION MAKER → m̂ → SINK
Transmission of single signals in
additive Gaussian noise
• Specifications:
• the a-priori probabilities P_i are known;
• the support of the real time functions s_i(t) is (0, T);
• their energy is finite (E: the square integral of the time functions);
• the m_i ↔ s_i(t) relationship is one-to-one (i.e. there is no error in the transmitter).