Math Camp 2: Probability Theory

Transcript Math Camp 2: Probability Theory

Math Camp 2: Probability
Theory
Sasha Rakhlin
Introduction









-algebra
Measure
Lebesgue measure
Probability measure
Expectation and variance
Convergence
Convergence in probability and almost surely
Law of Large Numbers. Central Limit Theorem
Useful Probability Inequalities





Jensen’s inequality
Markov’s inequality
Chebyshev’s inequality
Cauchy-Schwarz inequality
Hoeffeding’s inequality
-algebra

Let  be a set. Then a -algebra  is a
nonempty collection of subsets of 
such that the following hold:




If E  ,  - E  
If Fi   i, i Fi  
Measure

A measure  is a function defined on a
-algebra  over a set  with values in
[0, ] s.t.



() = 0
(E) = i (Ei) if E = i Ei
(, , ) is called a measure space
Lebesgue measure

The Lebesgue measure  is the unique
complete translation-invariant measure
on a -algebra s.t. ([0,1]) = 1
Probability measure



Probability measure is a positive
measure  over (, ) s.t. () = 1
(, , ) is called a probability space
A random variable is a measurable function
X: R
Expectation and variance

If X is a random variable over a
probability space (, , ), the
expectation of X is defined as
E ( X )   Xd


The variance of X is
var( X )  E (( X  E ( X )) 2 )
Convergence


xn  x if   > 0  N s.t. |xn – x| <  for n
>N
P
X n 
X (Xn converges to X in
probability) if P( X n  X   )  0   > 0
Convergence in probability
and almost surely

Any event with probability 1 is said to
happen almost surely. A sequence of
real random variables Xn converges
almost surely to a random variable X iff
P ( lim X n  X )  1
n 

Convergence almost surely implies
convergence in probability
Law of Large Numbers.
Central Limit Theorem

Weak LLN: if X1, X2, … is an infinite
sequence of i.i.d. random variables with
P
 = E(X1) = E(X2) =…, X n 
 , that
is, lim P( X     )  0
Xn  
CLT: lim
P(
 z )  ( z ) where  is
n 
n

n
/ n
the cdf of N(0,1)
Jensen’s inequality

If  is a convex function, then
 ( E ( X ))  E ( ( X ))
Markov’s inequality

E( X )
If X  0 and t  0, Pr( X  t ) 
t
Chebyshev’s inequality


If X is random variable and t > 0,
var( X )
Pr(| X  E ( X ) | t ) 
t2
e.g.

Pr(| X  E ( X ) | 2 ) 
1
4
Cauchy-Schwarz inequality

If E(X2) and E(Y2) are finite,
E ( XY )  E ( X 2 ) E (Y 2 )
Hoeffding’s inequality

Let ai  Xi  bi for i = 1, …, n. Let Sn = 
Xi, then for any t > 0,
2 t 2
Pr( Sn  E ( Sn )  t )  2 i 1
n
( bi ai ) 2

Math Camp 2: Probability Theory

Transcript Math Camp 2: Probability Theory

Directory