Transcript Document
Tutorial 8, STAT1301 Fall 2010, 16NOV2010,
MB103@HKU
By Joseph Dong
Recall: A Partition on a Set
• Any exhaustive and pairwise disjoint collection of subsets of a given set forms a partition of that set.
• E.g.
  • $\{B, B^c\}$ forms a trivial partition of the presumed set $S = B \cup B^c$.
  • If $f: S \to T = \{1, 2, 3\}$, then the collection of pre-images of atoms of the range, $\{f^{-1}(1), f^{-1}(2), f^{-1}(3)\}$, forms a partition of the domain $S$.
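A minimal Python sketch of the pre-image construction may help make it concrete. The six-point set `S`, the map `f`, and the helper name `preimage_partition` are assumptions chosen only for illustration; they do not come from the tutorial.

```python
from collections import defaultdict

def preimage_partition(domain, f):
    """Group the points of `domain` by their image under `f`.

    Each group is the pre-image of one value in the range of `f`.
    """
    blocks = defaultdict(set)
    for point in domain:
        blocks[f(point)].add(point)
    return dict(blocks)

# Hypothetical example: a six-point set mapped onto {1, 2, 3}.
S = {0, 1, 2, 3, 4, 5}
f = lambda s: s % 3 + 1

blocks = preimage_partition(S, f)

# Exhaustive: the union of the blocks is S.
is_exhaustive = set().union(*blocks.values()) == S
# Disjoint: no point is counted twice, so block sizes add up to |S|.
is_disjoint = sum(len(b) for b in blocks.values()) == len(S)
```

Every point of the domain lands in exactly one block because a function assigns exactly one value to each point, which is why the pre-images are automatically exhaustive and disjoint.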
Recall: Conditioning on a Partition
• Shares the same idea with
  • Divide and conquer
  • Casewise enumeration
  • A tree diagram
• Formal language:
  • Goal: find the probability of the event $E$, $\mathbb{P}(E)$.
  • This is equivalent to finding the probability of its intersection with the sure event $\Omega$: $\mathbb{P}(E) \equiv \mathbb{P}(E \cap \Omega)$.
Recall: Conditioning on a Partition (cont'd)
• Formal language (continued)
  • Now break the sure event down into a number of manageable smaller pieces; together these pieces form a partition $\{A_i \mid i \in I\}$ of the sure event $\Omega$.
  • If we investigate all the events $E \cap A_i$, then we're done:
    $\mathbb{P}(E \cap \Omega) = \sum_{i \in I} \mathbb{P}(E \cap A_i)$
  • The core of the problem now becomes finding each $\mathbb{P}(E \cap A_i)$, and this is where the conditioning takes place:
    $\mathbb{P}(E \cap A_i) = \mathbb{P}(E \mid A_i) \cdot \mathbb{P}(A_i)$
  • This assumes that finding $\mathbb{P}(E \mid A_i)$ and $\mathbb{P}(A_i)$ is the more straightforward task.
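The divide-and-merge computation above can be sketched in a few lines of Python. The fair die, the event `E`, and the three-block partition are hypothetical choices made for illustration; exact fractions are used so that the two sides can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical setup: one roll of a fair six-sided die.
prob = {w: Fraction(1, 6) for w in range(1, 7)}

E = {2, 3, 5}                      # event "the roll is prime"
partition = [{1, 2}, {3, 4, 5}, {6}]  # an arbitrary partition of the sure event

def P(event):
    """Probability of an event (a set of outcomes)."""
    return sum(prob[w] for w in event)

def P_given(event, block):
    """Conditional probability P(event | block)."""
    return P(event & block) / P(block)

# Conquer each block, then merge: sum of P(E | A_i) * P(A_i).
total = sum(P_given(E, A) * P(A) for A in partition)
```

Here `total` and `P(E)` agree, whichever partition is chosen; the partition only changes how the work is split up.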
Recall: What does an R.V. do to its State Space?
• An r.v. cuts the state space into blocks. On each of these blocks, the r.v. sends all points there to a common atom in the sample space.
• An r.v. induces a partition on the state space.
• Conversely, given a partition on the state space, you can define a random variable on it that "conforms" to the partition by taking one value on each block.
[Figure labels: Random Variable; Partition on the state space]
Conditioning an Event on an R.V.
• Since an r.v. cuts the state space into a partition, conditioning on an r.v. is just conditioning on the partition it induces on the state space.
• The meaning of $\mathbb{P}(E \mid X)$ is now clearly illustrated on the right of the slide.
$\mathbb{P}(E \mid X)$ as a Random Variable
• It contains a random variable $X$ inside, making itself a function of $X$.
• It has a distribution and an expectation.
  • LOTUS (Law of the Unconscious Statistician)
  • Question: What's the meaning of its expected value?
• To fix its value, fix an $X$ value:
  • $\mathbb{P}(E \mid X = x_1)$, $\mathbb{P}(E \mid X \in \{x_1, x_2\})$
  • Every fixed value is now a conditional probability involving two events.
Exercise: Finding $\mathbb{P}(E)$ from $\mathbb{P}(E \mid X)$
• This is the prototypical problem of finding the probability of an event via the technique of conditioning on a random variable.
• Hint: Ponder the link between the Law of Total Probability and expectation.
• Ans: $\mathbb{P}(E) = \mathbb{E}[\mathbb{P}(E \mid X)]$
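As a small numerical illustration of the answer, the sketch below treats $\mathbb{P}(E \mid X)$ as a function of $X$ and averages it over the distribution of $X$. The distribution of `X` and the rule that the event occurs with probability `1/x` given `X = x` are made-up assumptions for the example.

```python
from fractions import Fraction

# Hypothetical conditioner: X is uniform on {1, 2, 3}.
p_X = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Hypothetical conditional probabilities: P(E | X = x) = 1/x.
# As x varies with X, this is the random variable P(E | X).
p_E_given_X = {x: Fraction(1, x) for x in p_X}

# Expected value of P(E | X): weight each value by P(X = x).
p_E = sum(p_E_given_X[x] * p_X[x] for x in p_X)
```

The sum is exactly the Law of Total Probability written over the blocks `{X = x}`, which is the link the hint points to.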
$\mathbb{P}(Y \mid X)$
• It involves two r.v.'s now.
• $\mathbb{P}(Y \mid X)$ is a function of the bivariate random vector $(X, Y)$.
• Fixing $X$ will give you back the conditional density of $Y$ given $X$ at the fixed position.
• Given $\mathbb{P}(Y \mid X)$:
  • Q1: How to find $\mathbb{P}(Y)$?
Conditional, Marginal, and Joint densities
• Differences among the 3 types of densities:
  • a conditional density $\mathbb{P}(Y \mid X)$
    • is normalized by the marginal probability $\mathbb{P}(X)$
    • is a point divided by a row sum/integral
    • is the density of $Y \mid X$
  • a joint density $\mathbb{P}(X, Y)$
    • is normalized by the entire joint space
    • is a point divided by the sum/integral over the entire space
    • is the density of $(X, Y)$
  • a marginal density $\mathbb{P}(X)$
    • is also normalized by the entire space
    • is a row sum divided by the sum/integral over the entire space
    • is the density of $X$
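The three normalizations can be seen side by side on a small table. The sketch below assumes a hypothetical 2-by-2 table of counts (rows indexed by the value of `X`, columns by the value of `Y`); the numbers are invented for illustration.

```python
from fractions import Fraction

# Hypothetical counts: counts[x][y] observations with X = x and Y = y.
counts = {0: {0: 1, 1: 3},
          1: {0: 4, 1: 2}}

# Sum over the entire space.
total = sum(sum(row.values()) for row in counts.values())

# Joint density: a point divided by the sum over the entire space.
joint = {x: {y: Fraction(c, total) for y, c in row.items()}
         for x, row in counts.items()}

# Marginal density of X: a row sum divided by the sum over the entire space.
marginal_X = {x: sum(row.values()) for x, row in joint.items()}

# Conditional density of Y given X = x: a point divided by its row sum.
conditional = {x: {y: p / marginal_X[x] for y, p in row.items()}
               for x, row in joint.items()}
```

Each row of `conditional` sums to 1, whereas the entries of `joint` sum to 1 only over the whole table.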
Handout Problem 1
Recall: What's the Expectation of a random variable
• First of all, the random variable has to be numerically valued. That's why expectation is also known as the "expected value" and is a numerical characteristic of the sample space (a subset of $\mathbb{R}$, or simply $\mathbb{R}$ itself with zero density assigned to the impossible points).
  $\mathbb{E}(X) = \int_{-\infty}^{+\infty} x f_X(x)\,dx$
• The expectation is both conceptually and technically equivalent to the location of the center of probability mass of the sample space.
• Expectation provides only partial information about the random variable because it eliminates randomness by giving you back only 1 representative point of the sample space.
For example,
• $E \mid X$ is a set-valued random variable.
  • Given $X = x$, it evaluates to the set $E \cap \{X = x\}$.
  • We cannot have an expected value defined for $E \mid X$.
  • Clarification: $E \mid X$ is not $\mathbb{P}(E \mid X)$. The latter is numerically valued, as we have previously established for its expected value: $\mathbb{E}[\mathbb{P}(E \mid X)] = \mathbb{P}(E)$.
  • More elaboration: On the set-theory layer, $E \mid X$ is not strictly different from the set-r.v. pair $(E, X)$. But on the probability-theory layer, $\mathbb{P}(E \mid X)$ is normalized by a different space than $\mathbb{P}(E, X)$ is.
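$\mathbb{E}(X \mid E)$
• $X \mid E$ is a numerically-valued random variable. We can compute its expected value.
• $\mathbb{E}(X \mid E)$ vs $\mathbb{E}(X)$: their sample spaces are different.
• Compute $\mathbb{E}(X \mid E)$ using $\mathbb{P}_E = \mathbb{P}(\cdot \mid E) = \frac{\mathbb{P}(\cdot, E)}{\mathbb{P}(E)}$:
  $\mathbb{E}(X \mid E) = \int_{\Omega} X(\omega)\,\frac{\mathbb{P}(d\omega, E)}{\mathbb{P}(E)} = \int_{-\infty}^{+\infty} x f_{X \mid E}(x)\,dx$
• Compute $\mathbb{E}(X)$ using $\mathbb{P}$:
  $\mathbb{E}(X) = \int_{\Omega} X(\omega)\,\mathbb{P}(d\omega) = \int_{-\infty}^{+\infty} x f_X(x)\,dx$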
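In the discrete case the two integrals above become sums, and the only difference is the measure used for weighting. The following sketch assumes a fair die with `X` equal to the face value and `E` the event "the roll is even"; these choices are illustrative and not taken from the handout.

```python
from fractions import Fraction

# Hypothetical setup: one roll of a fair die, X is the face value.
prob = {w: Fraction(1, 6) for w in range(1, 7)}
X = lambda w: w
E = {2, 4, 6}  # event "the roll is even"

p_E = sum(prob[w] for w in E)

# E(X): weight X by the original measure P over the whole space.
exp_X = sum(X(w) * prob[w] for w in prob)

# E(X | E): weight X by the renormalized measure P(., E) / P(E),
# which puts zero mass outside E.
exp_X_given_E = sum(X(w) * prob[w] / p_E for w in E)
```

Conditioning on `E` moves the center of mass from 7/2 to 4 because the mass outside `E` is discarded and the rest is rescaled to total 1.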
Warm-up exercise
• Handout Problem 2
$\mathbb{E}(Y \mid X)$: concepts
• First of all, this is a random variable: a function of $X$.
• Its randomness comes from the state space of $X$, but the mapping mechanism is worked out jointly by $X$ and $Y$.
• This expression is known as the conditional expectation of the conditionee $Y$ given the conditioner $X$.
• The expectation is taken with respect to $Y$.
  • To be precise, we should say w.r.t. $Y \mid X$.
  • There are multiple (or even a continuum of) sample spaces of $Y \mid X$, depending on which atom value $X$ takes. After fixing $X$ to an atom, or equivalently, to a block in the state space that has been partitioned by $X$, the expression $\mathbb{E}(Y \mid X = x_1)$ is just a constant.
  • The expectation eliminates the randomness of $Y$ given $X$.
$\mathbb{E}(Y \mid X)$ as an r.v.
• It uses the joint state space of $X$ and $Y$ as its own state space.
• It uses a degenerated version of the sample space of $Y$ as its own sample space.
  • The degeneration preserves the locus of the overall center of mass.
  • Each point in the degenerated space is a block center of mass.
"Degeneration preserves overall center of mass"
• $X$ cuts its own state space as well as the joint state space of it and $Y$.
• This partition of the joint state space will be mapped by $Y$ to a partition on its own sample space (a numerical set).
• Then the expression $\mathbb{E}(Y \mid X = x_1)$ represents the locus of the center of mass of the first block of the partition.
• $\mathbb{E}(Y \mid X)$ represents the totality of loci of these block centers of mass.
Exercise: Finding $\mathbb{E}(Y)$ from $\mathbb{E}(Y \mid X)$
• This is the prototypical problem of finding the expectation of a random variable via the technique of conditioning on another random variable.
• Ans. $\mathbb{E}(Y) = \mathbb{E}[\mathbb{E}(Y \mid X)]$
• In the divide-conquer-merge paradigm:
  • Divide is done by the conditioner $X$.
  • Conquer refers to the inner expectation carried out on each division.
  • Merge refers to the outer expectation that pieces the whole plate back together. This exercise addresses the merge step.
• Compare with the conditional probability, and ponder the link between them.
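The divide-conquer-merge steps can be traced on a small joint distribution. The sketch below assumes a hypothetical joint table for `(X, Y)` with `X` in {0, 1} and `Y` in {1, 2, 3}; the probabilities and the helper names are invented for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution: joint[(x, y)] = P(X = x, Y = y).
joint = {
    (0, 1): Fraction(1, 10), (0, 2): Fraction(2, 10), (0, 3): Fraction(1, 10),
    (1, 1): Fraction(3, 10), (1, 2): Fraction(1, 10), (1, 3): Fraction(2, 10),
}

def marginal_x(joint):
    """Divide: probability mass of each block {X = x}."""
    p = {}
    for (x, _), pr in joint.items():
        p[x] = p.get(x, 0) + pr
    return p

def cond_exp_y(joint, p_x):
    """Conquer: center of mass of Y within each block, E(Y | X = x)."""
    return {
        x: sum(y * pr for (xx, y), pr in joint.items() if xx == x) / p_x[x]
        for x in p_x
    }

p_x = marginal_x(joint)
block_means = cond_exp_y(joint, p_x)

# Merge: average the block centers of mass, weighted by block mass.
merged = sum(block_means[x] * p_x[x] for x in p_x)

# Direct computation of E(Y) for comparison.
direct = sum(y * pr for (_, y), pr in joint.items())
```

`merged` and `direct` coincide, which is the numerical content of "degeneration preserves the overall center of mass".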
Conditional Variance
• Finding variance by conditioning:
  $\mathbb{V}(Y) = \mathbb{E}[\mathbb{V}(Y \mid X)] + \mathbb{V}[\mathbb{E}(Y \mid X)]$
  • Pf.
• Unfortunately, the degeneration of the sample space of $Y$ does not preserve second moments.
• That's why there is the addendum $\mathbb{V}[\mathbb{E}(Y \mid X)]$ in the formula.
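One standard route to the variance formula on this slide is sketched below. It is not necessarily the argument used in the tutorial. It applies the identity $\mathbb{V}(Z) = \mathbb{E}(Z^2) - [\mathbb{E}(Z)]^2$ twice, once conditionally on $X$ and once to the random variable $\mathbb{E}(Y \mid X)$, together with $\mathbb{E}(Y) = \mathbb{E}[\mathbb{E}(Y \mid X)]$, and it assumes $\mathbb{E}(Y^2)$ is finite.

```latex
\begin{aligned}
\mathbb{E}[\mathbb{V}(Y \mid X)]
  &= \mathbb{E}[\mathbb{E}(Y^2 \mid X)] - \mathbb{E}\{[\mathbb{E}(Y \mid X)]^2\}
   = \mathbb{E}(Y^2) - \mathbb{E}\{[\mathbb{E}(Y \mid X)]^2\}, \\
\mathbb{V}[\mathbb{E}(Y \mid X)]
  &= \mathbb{E}\{[\mathbb{E}(Y \mid X)]^2\} - \{\mathbb{E}[\mathbb{E}(Y \mid X)]\}^2
   = \mathbb{E}\{[\mathbb{E}(Y \mid X)]^2\} - [\mathbb{E}(Y)]^2, \\
\mathbb{E}[\mathbb{V}(Y \mid X)] + \mathbb{V}[\mathbb{E}(Y \mid X)]
  &= \mathbb{E}(Y^2) - [\mathbb{E}(Y)]^2
   = \mathbb{V}(Y).
\end{aligned}
```

The middle term $\mathbb{E}\{[\mathbb{E}(Y \mid X)]^2\}$ cancels when the two lines are added, which leaves exactly the two second-moment pieces of $\mathbb{V}(Y)$.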
Summary: Conditional Expectation
The key observations are
• Obs1: To find the center of mass of a piece of material, you can divide it into a few blocks, find their centers of mass, and then find the center of mass of these block centers of mass (each weighted by the mass of its block). The initial division of the piece is quite arbitrary.
  • This fundamental law of physics supports the many nice properties of expectation in the calculus of probability.
• Obs2: A random variable partitions its state space into a collection of atom-valued blocks.
  • This suggests using a random variable as a general device to divide the piece mentioned in Obs1. Such a random variable is called the conditioner.
Linking $\mathbb{P}(E \mid X)$ to $\mathbb{E}(Y \mid X)$
• Trick: Use the indicator of the set $E$. The indicator is a Bernoulli random variable.
• Reason: $\mathbb{P}(E \mid X) \equiv \mathbb{E}(I_E \mid X)$
• Conclusion: The conditional probability of an event conditioned on a random variable (a partition) is, in disguise, a conditional expectation of the indicator of that event conditioned on the same random variable.
• All properties of conditional expectation therefore apply to conditional probability. For example, the Law of Total Probability is just $\mathbb{E}(Y) \equiv \mathbb{E}[\mathbb{E}(Y \mid X)]$ in disguise.
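The indicator trick can be checked directly on a finite space. The sketch below assumes a fair die, the event `E = {4, 5, 6}`, and a conditioner `X` equal to the parity of the roll; all three are hypothetical choices for illustration.

```python
from fractions import Fraction

# Hypothetical setup: one roll of a fair die.
prob = {w: Fraction(1, 6) for w in range(1, 7)}
E = {4, 5, 6}                           # event of interest
X = lambda w: w % 2                     # conditioner: parity of the roll
indicator = lambda w: 1 if w in E else 0  # Bernoulli r.v. I_E

def block(x):
    """Block of the state space on which X takes the value x."""
    return {w for w in prob if X(w) == x}

def P(event):
    return sum(prob[w] for w in event)

# Conditional probability of E on each block.
cond_prob = {x: P(E & block(x)) / P(block(x)) for x in (0, 1)}

# Conditional expectation of the indicator on each block.
cond_exp = {
    x: sum(indicator(w) * prob[w] for w in block(x)) / P(block(x))
    for x in (0, 1)
}

# Merging the block values recovers P(E): the Law of Total Probability.
merged = sum(cond_exp[x] * P(block(x)) for x in (0, 1))
```

`cond_prob` and `cond_exp` are the same table, and `merged` equals `P(E)`, so the Law of Total Probability is the tower property applied to the indicator.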
Choosing Conditioner
• The art of conditioning lies in the choice of the conditioner.
• Usually, if our unknown target is the r.v. $Y$, and we know that $Y$ is a known function of a known r.v. $X$, then it would be natural to use $X$ as the conditioner for $Y$, that is
  • Divide the state space of $Y$ by $X$
  • Conquer every $\mathbb{E}(Y \mid X)$
  • Merge them into $\mathbb{E}(Y)$
Exercises
• Handout problem 3
• Handout problem 4
• Handout problem 5
• Handout problem 6