The Central Limit Theorem
Paul Cornwell
March 31, 2011
1
Let X1, …, Xn be independent, identically distributed random variables with mean μ and finite, positive variance σ². The average of these variables is approximately normally distributed with mean μ and standard deviation σ/√n when n is large.
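In symbols, writing \bar{X}_n for the average of X_1, \dots, X_n, the statement above can be expressed as

\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \;\xrightarrow{d}\; N(0,1) \quad \text{as } n \to \infty,
\qquad \text{i.e., } \bar{X}_n \text{ is approximately } N\!\left(\mu,\ \sigma^{2}/n\right) \text{ for large } n.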
2
How large of a sample size is required for the
Central Limit Theorem (CLT) approximation
to be good?
What is a ‘good’ approximation?
3
Permits analysis of random variables even when the underlying distribution is unknown
Estimating parameters
Hypothesis Testing
Polling
4
Performing a hypothesis test to determine whether a set of data came from a normal distribution
Considerations
◦ Power: probability that a test will reject the null
hypothesis when it is false
◦ Ease of Use
5
Problems
◦ No test is best in every situation (there is no uniformly most powerful test)
◦ Some cannot test the composite hypothesis of normality (i.e., normality with unspecified mean and variance, not just the standard normal)
◦ Test results are sensitive to sample size; with enough data the null hypothesis will be rejected for even tiny departures from normality, as the sketch below illustrates
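A minimal sketch of that sample-size sensitivity (illustrative, not from the talk), using SciPy's D'Agostino–Pearson test (scipy.stats.normaltest) on draws from a t distribution with 30 degrees of freedom, which is only mildly non-normal; the distribution, sample sizes, and seed are arbitrary choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Mildly non-normal data: Student's t with 30 degrees of freedom.
for n in (100, 100_000):
    x = rng.standard_t(df=30, size=n)
    stat, p = stats.normaltest(x)               # D'Agostino-Pearson K^2 test
    print(f"n = {n:>6}: p-value = {p:.4f}")

# Typical outcome: the small sample is not rejected, while the very large
# one usually is -- the test picks up the tiny departure from normality
# once enough data are available.
```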
6
Symmetric
Unimodal
Bell-shaped
Continuous
7
Skewness: Measures the asymmetry of a
distribution.
◦ Defined as the third standardized moment
◦ Skew of normal distribution is 0
\gamma_1 = E\!\left[\left(\frac{X - \mu}{\sigma}\right)^{3}\right]
\approx \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^{3}}{(n-1)\,s^{3}}
8
Kurtosis: Measures peakedness or heaviness
of the tails.
◦ Defined as the fourth standardized moment
◦ Kurtosis of normal distribution is 3
\gamma_2 = E\!\left[\left(\frac{X - \mu}{\sigma}\right)^{4}\right]
\approx \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^{4}}{(n-1)\,s^{4}}
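A small sketch of these two estimators (illustrative; the talk's computations were done in R), implemented directly and checked against SciPy, whose scipy.stats.skew and scipy.stats.kurtosis use a slightly different biased-moment convention, so the numbers agree only approximately, and closely for large samples:

```python
import numpy as np
from scipy import stats

def skewness(x):
    """Sample skewness as defined above: sum((x_i - xbar)^3) / ((n - 1) * s^3)."""
    x = np.asarray(x, dtype=float)
    n, s = x.size, x.std(ddof=1)               # s = sample standard deviation
    return np.sum((x - x.mean()) ** 3) / ((n - 1) * s ** 3)

def kurtosis(x):
    """Sample kurtosis as defined above (the normal distribution gives about 3)."""
    x = np.asarray(x, dtype=float)
    n, s = x.size, x.std(ddof=1)
    return np.sum((x - x.mean()) ** 4) / ((n - 1) * s ** 4)

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=10_000)    # exponential: skewness 2, kurtosis 9
print(skewness(x), stats.skew(x))              # nearly equal for large samples
print(kurtosis(x), stats.kurtosis(x, fisher=False))
```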
9
Binomial distribution
Cumulative distribution function:
F(x; n, p) = \sum_{i=0}^{\lfloor x \rfloor} \binom{n}{i} p^{i} (1-p)^{\,n-i}
E[X] = np
\mathrm{Var}[X] = np(1-p)
10
| Parameters | Kurtosis | Skewness | % outside 1.96·sd | K-S distance | Mean | Std Dev |
| --- | --- | --- | --- | --- | --- | --- |
| n = 20, p = .2 | -.0014 (.25) | .3325 (1.5) | .0434 | .128 | 3.9999 | 1.786 |
| n = 25, p = .2 | .002 | .3013 | .0743 | .116 | 5.0007 | 2.002 |
| n = 30, p = .2 | .0235 | .2786 | .0363 | .106 | 5.997 | 2.188 |
| n = 50, p = .2 | .0106 | .209 | .0496 | .083 | 10.001 | 2.832 |
| n = 100, p = .2 | .005 | .149 | .05988 | .0574 | 19.997 | 4.0055 |

*from R
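The table values are reported as coming from R, but the script itself is not shown. A rough sketch of how one row could be produced (here in Python rather than R; the number of simulated draws, and the reading of "% outside 1.96·sd" and "K-S distance" as comparisons against a normal distribution with matching mean and standard deviation, are my assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# One table row: Binomial(n = 20, p = .2), compared with a normal distribution
# that has the same mean and standard deviation as the simulated draws.
n, p, reps = 20, 0.2, 100_000                   # 'reps' is an assumed simulation size
x = rng.binomial(n, p, size=reps).astype(float)

mean, sd = x.mean(), x.std(ddof=1)
skew = stats.skew(x)
kurt = stats.kurtosis(x)                        # excess kurtosis (normal -> 0)
tail = np.mean(np.abs(x - mean) > 1.96 * sd)    # "% outside 1.96*sd"
ks = stats.kstest(x, "norm", args=(mean, sd)).statistic   # K-S distance

print(f"kurtosis {kurt:.4f}  skewness {skew:.4f}  tail {tail:.4f}  "
      f"K-S {ks:.4f}  mean {mean:.4f}  sd {sd:.4f}")
```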
11
Uniform distribution
Cumulative distribution function:
F(x; a, b) = \frac{x - a}{b - a}, \quad a \le x \le b
E[X] = \frac{a + b}{2}
\mathrm{Var}[X] = \frac{(b - a)^{2}}{12}
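For reference (standard facts about the uniform distribution, not stated on the slide): its skewness is 0 and its excess kurtosis is −6/5 = −1.2, and for an average of n independent draws the excess kurtosis shrinks to −1.2/n, matching the values near −0.24 (n = 5) and −0.4 (n = 3) in the table that follows.

\text{Uniform}(a, b):\quad \gamma_1 = 0, \qquad \gamma_2 = -\frac{6}{5}, \qquad \gamma_2\!\left(\bar{X}_n\right) = -\frac{6}{5n}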
12
| Parameters | Kurtosis | Skewness | % outside 1.96·sd | K-S distance | Mean | Std Dev |
| --- | --- | --- | --- | --- | --- | --- |
| n = 5, (a,b) = (0,1) | -.236 (-1.2) | .004 (0) | .0477 | .0061 | .4998 | .1289 (.129) |
| n = 5, (a,b) = (0,50) | -.234 | 0 | .04785 | .0058 | 24.99 | 6.468 (6.455) |
| n = 5, (a,b) = (0,.1) | -.238 | -.0008 | .048 | .0060 | .0500 | .0129 (.0129) |
| n = 3, (a,b) = (0,50) | -.397 | -.001 | .0468 | .01 | 24.99 | 8.326 (8.333) |

*from R
13
Exponential distribution
Cumulative distribution function:
F(x; \lambda) = 1 - e^{-\lambda x}, \quad x \ge 0
E[X] = \frac{1}{\lambda}
\mathrm{Var}[X] = \frac{1}{\lambda^{2}}
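For reference (standard facts): the exponential distribution has skewness 2 and excess kurtosis 6, the values shown in parentheses in the first row of the next table; for an average of n draws these shrink to 2/√n and 6/n.

\text{Exponential}(\lambda):\quad \gamma_1 = 2, \qquad \gamma_2 = 6, \qquad \gamma_1\!\left(\bar{X}_n\right) = \frac{2}{\sqrt{n}}, \qquad \gamma_2\!\left(\bar{X}_n\right) = \frac{6}{n}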
14
| Parameters | Kurtosis | Skewness | % outside 1.96·sd | K-S distance | Mean | Std Dev |
| --- | --- | --- | --- | --- | --- | --- |
| n = 5, λ = 1 | 1.239 (6) | .904 (2) | .0434 | .0598 | .9995 | .4473 (.4472) |
| n = 10, λ = 1 | .597 | .630 | .045 | .042 | 1.0005 | .316 (.316) |
| n = 15, λ = 1 | .396 | .515 | .0464 | .034 | .9997 | .258 (.2581) |

*from R
15
Find minimum n values for more distributions
Refine the criteria for the quality of the approximation
Explore distributions with no mean (e.g., the Cauchy)
Classify distributions in order to give more general guidelines for minimum sample size
16
Paul Cornwell
May 2, 2011
17
Central Limit Theorem: averages of i.i.d. variables become approximately normally distributed as the sample size increases
The rate of convergence depends on the underlying distribution
What sample size is needed to produce a
good approximation from the CLT?
18
Real-life applications of the Central Limit
Theorem
What does kurtosis tell us about a
distribution?
What is the rationale for requiring np ≥ 5?
What about distributions with no mean?
19
The distribution of the total distance covered in a random walk tends toward a normal distribution
Hypothesis testing
Confidence intervals (polling; see the worked example below)
Signal processing, noise cancellation
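As a worked example of the polling item above (illustrative numbers, not from the talk): a poll of n = 1000 respondents with sample proportion \hat{p} = 0.5 gives an approximate 95% confidence interval of

\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.5 \pm 1.96\sqrt{\frac{0.25}{1000}} \approx 0.5 \pm 0.031,

a margin of error of about 3 percentage points, which rests on the CLT applied to the average of n Bernoulli responses.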
20
Measures the “peakedness” of a distribution
Higher kurtosis means a sharper peak and heavier tails
\gamma_2 = E\!\left[\left(\frac{X - \mu_x}{\sigma}\right)^{4}\right] - 3
For an average of n i.i.d. observations, the excess kurtosis shrinks to \gamma_2 / n
21
The traditional rule of thumb for the normal approximation to the binomial is np > 5 (or 10)
The skewness of the binomial distribution increases as p moves away from .5
Larger n is required for convergence for skewed distributions, as the worked comparison below shows
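The worked comparison, using the standard skewness formula for the binomial distribution (a known result, added here for illustration):

\gamma_1 = \frac{1 - 2p}{\sqrt{np(1-p)}}, \qquad
p = 0.5:\ \gamma_1 = 0, \qquad
p = 0.1:\ \gamma_1 = \frac{0.8}{\sqrt{0.09\,n}} \approx \frac{2.67}{\sqrt{n}},

so a criterion such as |\gamma_1| < 0.25 is met immediately when p = 0.5 but requires n ≈ 114 when p = 0.1, in line with the skewness column of the tables later in the talk.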
22
The Cauchy distribution has no moments (no mean, no variance)
The average of n i.i.d. Cauchy variables has the same Cauchy distribution as a single observation
The CLT does not apply
f(x) = \frac{1}{\pi (1 + x^{2})}
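A minimal simulation sketch (illustrative, not from the talk) of why averaging does not help here: the sample mean of n standard Cauchy draws has exactly the same standard Cauchy distribution as one draw, so its spread never shrinks; the replication count and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)

# Interquartile range (IQR) of the sample mean of n standard Cauchy draws.
# Under the CLT this would shrink like 1/sqrt(n); for the Cauchy it stays
# at the IQR of a single observation, which is 2.
for n in (1, 10, 1000):
    means = rng.standard_cauchy(size=(10_000, n)).mean(axis=1)
    q1, q3 = np.percentile(means, [25, 75])
    print(f"n = {n:>4}: IQR of the sample mean ~ {q3 - q1:.2f}")
```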
23
Beta distribution with α = β = 1/3
The distribution is symmetric and bimodal (U-shaped, with peaks at 0 and 1)
Convergence of the averages to the normal distribution is fast
24
Student's t distribution: a heavier-tailed, bell-shaped curve
Approaches the normal distribution as the degrees of freedom increase
25
Four statistics: K-S distance, tail probability, skewness, and kurtosis
Different thresholds for “adequate” and “superior” approximations
Both sets of thresholds are fairly conservative
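A sketch of how such minimum sample sizes could be searched for; the exact procedure and R code behind the tables are not shown, so the simulation size, the grid of n values, and the use of simulated sample means are assumptions here, while the “adequate” thresholds are the ones listed in the next table:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def four_stats(means):
    """K-S distance, tail probability, skewness and kurtosis of simulated sample means."""
    m, s = means.mean(), means.std(ddof=1)
    return {
        "kurtosis": abs(stats.kurtosis(means)),              # |excess kurtosis|
        "skewness": abs(stats.skew(means)),
        "tail": np.mean(np.abs(means - m) > 1.96 * s),       # should be near .05
        "ks": stats.kstest(means, "norm", args=(m, s)).statistic,
    }

def adequate(d):
    """The 'adequate' thresholds from the next table."""
    return (d["kurtosis"] < 0.5 and d["skewness"] < 0.25
            and 0.04 < d["tail"] < 0.06 and d["ks"] < 0.05)

# Example: smallest n for which averages of n Exponential(1) draws pass.
reps = 50_000                                   # number of simulated averages (assumed)
for n in range(1, 201):
    means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    if adequate(four_stats(means)):
        print("smallest adequate n for the exponential:", n)
        break
```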
26
Minimum sample size n meeting each “adequate” threshold (the max column is the largest of the four, i.e. the n needed to satisfy all of them):

| Distribution | ∣Kurtosis∣ < .5 | ∣Skewness∣ < .25 | Tail prob. .04 < x < .06 | K-S distance < .05 | max |
| --- | --- | --- | --- | --- | --- |
| Uniform | 3 | 1 | 2 | 2 | 3 |
| Beta (α=β=1/3) | 4 | 1 | 3 | 3 | 4 |
| Exponential | 12 | 64 | 5 | 8 | 64 |
| Binomial (p=.1) | 11 | 114 | 14 | 332 | 332 |
| Binomial (p=.5) | 4 | 1 | 12 | 68 | 68 |
| Student's t with 2.5 df | NA | NA | 13 | 20 | 20 |
| Student's t with 4.1 df | 120 | 1 | 1 | 2 | 120 |
27
Minimum sample size n meeting each “superior” threshold:

| Distribution | ∣Kurtosis∣ < .3 | ∣Skewness∣ < .15 | Tail prob. .04 < x < .06 | K-S distance < .02 | max |
| --- | --- | --- | --- | --- | --- |
| Uniform | 4 | 1 | 2 | 2 | 4 |
| Beta (α=β=1/3) | 6 | 1 | 3 | 4 | 6 |
| Exponential | 20 | 178 | 5 | 45 | 178 |
| Binomial (p=.1) | 18 | 317 | 14 | 1850 | 1850 |
| Binomial (p=.5) | 7 | 1 | 12 | 390 | 390 |
| Student's t with 2.5 df | NA | NA | 13 | 320 | 320 |
| Student's t with 4.1 df | 200 | 1 | 1 | 5 | 200 |
28
Skewness is difficult to shake
Tail probabilities are fairly accurate for small
sample sizes
The traditional recommendations are too small for many common distributions
29