Transcript Review

Review
Discrete Distributions
• Binomial distribution,
• Negative binomial distribution,
• Hypergeometric distribution,
• Poisson distribution.
Expected Value
• If X is a discrete rv and p(x) is the value
of its probability distribution at x, the
expected value of X is defined as
$$E(X) = \mu_X = \sum_x x \cdot p(x).$$
Example
• Toss a coin 4 times.
X = number of heads. What’s E(X) ?
• The pmf of X is
x:     0     1     2     3     4
p(x):  1/16  4/16  6/16  4/16  1/16
• So,
$$E(X) = 0\cdot\tfrac{1}{16} + 1\cdot\tfrac{4}{16} + 2\cdot\tfrac{6}{16} + 3\cdot\tfrac{4}{16} + 4\cdot\tfrac{1}{16} = 2.$$
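The coin-toss calculation can be checked mechanically. The sketch below is only an illustration: the helper name `expected_value` and the dictionary representation of the pmf are choices made here, not part of the slides. It uses exact fractions so no rounding is involved.

```python
from fractions import Fraction
from math import comb


def expected_value(pmf):
    """Return the sum of x * p(x) for a pmf given as a dict {x: p(x)}."""
    return sum(x * p for x, p in pmf.items())


# pmf of the number of heads in 4 fair coin tosses: C(4, x) / 2^4
coin_pmf = {x: Fraction(comb(4, x), 2**4) for x in range(5)}

total_probability = sum(coin_pmf.values())  # a valid pmf sums to 1
mean_heads = expected_value(coin_pmf)
print(total_probability, mean_heads)
```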
Example
• Let X be a Bernoulli rv with pmf
$$p(x) = \begin{cases} 1 - p, & x = 0 \\ p, & x = 1 \end{cases}$$
Then E(X) = 0·p(0) + 1·p(1) = p. So the
expected value of X is just the
probability that X takes on the value 1.
Example
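• X = number of children born up to and
including the first boy. With p the probability
that a child is a boy, the pmf of X is
$$p(x) = p(1-p)^{x-1}, \quad x = 1, 2, 3, \ldots$$
• Then
$$E(X) = \sum_{x=1}^{\infty} x \cdot p(x) = \sum_{x=1}^{\infty} x\,p\,(1-p)^{x-1} = p \sum_{x=1}^{\infty} \left[-\frac{d}{dp}(1-p)^{x}\right] = -p\,\frac{d}{dp}\left[\frac{1-p}{p}\right] = \frac{1}{p}.$$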
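As a rough numerical check of E(X) = 1/p, the series can be truncated and summed directly. This is a sketch only: the function name and the truncation point of 10,000 terms are arbitrary choices made here, and the neglected tail is negligible for the values of p tried.

```python
def geometric_mean_partial(p, terms=10_000):
    """Partial sum of x * p * (1 - p)**(x - 1) for x = 1, ..., terms."""
    return sum(x * p * (1 - p) ** (x - 1) for x in range(1, terms + 1))


# Compare the truncated series with the closed form 1/p.
for p in (0.2, 0.5, 0.8):
    print(p, geometric_mean_partial(p), 1 / p)
```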
Expected Value of a
Function of a RV
• If an rv X has a pmf p(x), then the
expected value of any function h(X) is
computed by
$$E[h(X)] = \mu_{h(X)} = \sum_x h(x) \cdot p(x).$$
• Special case: h(x) = a·x + b.
E(a X + b) = a·E(X) + b.
Why?
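One way to answer the question is to apply the definition of E[h(X)] with h(x) = a·x + b and use the fact that a pmf sums to 1. A short derivation for the discrete case:

```latex
\begin{aligned}
E(aX + b) &= \sum_x (a x + b)\, p(x) \\
          &= a \sum_x x\, p(x) + b \sum_x p(x) \\
          &= a\,E(X) + b, \qquad \text{since } \sum_x p(x) = 1 .
\end{aligned}
```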
Variance
• The expected value measures the center
of a probability distribution.
• Variance measures the variability of a
pmf.
Variance
• Let X have pmf p(x) and expected value
$\mu$. Then the variance of X, denoted by
$V(X)$ or $\sigma_X^2$, or just $\sigma^2$, is
$$V(X) = E[(X-\mu)^2] = \sum_x (x-\mu)^2 \cdot p(x).$$
• The standard deviation (SD) of X is
$$\sigma_X = \sqrt{V(X)}.$$
Example
• If X has pmf
x:     1    2    6    8
p(x):  .4   .1   .3   .2
then
$\mu = 1 \times .4 + 2 \times .1 + 6 \times .3 + 8 \times .2 = 4$,
$\sigma^2 = (1-4)^2 \times .4 + (2-4)^2 \times .1 + (6-4)^2 \times .3 + (8-4)^2 \times .2 = 8.4$,
and $\sigma = \sqrt{8.4} = 2.90$.
A Shortcut Formula
$$V(X) = \sigma^2 = E(X^2) - [E(X)]^2.$$
• Proof:
$$V(X) = E[(X-\mu)^2] = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - [E(X)]^2.$$
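The definition and the shortcut formula can be compared on the pmf from the preceding example (x = 1, 2, 6, 8 with probabilities .4, .1, .3, .2). This is a minimal sketch; the variable names are illustrative, and exact fractions are used so the two variances can be compared for equality.

```python
from fractions import Fraction
from math import sqrt

# pmf from the example: x = 1, 2, 6, 8 with p(x) = .4, .1, .3, .2
pmf = {
    1: Fraction(4, 10),
    2: Fraction(1, 10),
    6: Fraction(3, 10),
    8: Fraction(2, 10),
}

mean = sum(x * p for x, p in pmf.items())
# Definition: E[(X - mu)^2]
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())
# Shortcut: E(X^2) - [E(X)]^2
second_moment = sum(x**2 * p for x, p in pmf.items())
var_shortcut = second_moment - mean**2
sd = sqrt(var_shortcut)
print(mean, var_definition, var_shortcut, round(sd, 2))
```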
Rules of Variance
$$V(aX + b) = \sigma_{aX+b}^2 = a^2 \sigma_X^2, \qquad \sigma_{aX+b} = |a|\,\sigma_X.$$
• In particular,
$$\sigma_{aX}^2 = a^2 \sigma_X^2, \qquad \sigma_{aX} = |a|\,\sigma_X, \qquad \sigma_{X+b}^2 = \sigma_X^2.$$
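These rules follow from the definition of variance together with E(aX + b) = a·E(X) + b. A short derivation, writing $\mu = E(X)$:

```latex
\begin{aligned}
V(aX + b) &= E\big[(aX + b - (a\mu + b))^2\big] \\
          &= E\big[a^2 (X - \mu)^2\big] \\
          &= a^2\, E\big[(X - \mu)^2\big] = a^2 \sigma_X^2 .
\end{aligned}
```

Taking square roots gives $\sigma_{aX+b} = |a|\,\sigma_X$; the shift b drops out because it moves every value and the mean by the same amount.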
Moments
• The kth moment about the origin of an rv
X, denoted by $\mu_k'$, is the expected value
of $X^k$; symbolically,
$$\mu_k' = E(X^k) = \sum_x x^k \cdot p(x).$$
• The kth moment about the mean of an rv
X, denoted by $\mu_k$, is the expected value
of $(X - \mu)^k$; symbolically,
$$\mu_k = E[(X-\mu)^k] = \sum_x (x-\mu)^k \cdot p(x).$$
Special Cases
• The expectation, or the mean, is the 1st
moment about the origin:
$$\mu = \mu_1' = E(X) = \sum_x x \cdot p(x).$$
• The variance is the 2nd moment about
the mean:
$$\sigma^2 = \mu_2 = E[(X-\mu)^2] = \sum_x (x-\mu)^2 \cdot p(x).$$
The Binomial Distribution
Binomial Distribution
• The pmf of a binomial rv X with parameters n and p is
$$b(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n.$$
• For X ~ Bin(n,p), the cdf will be denoted
by
$$P(X \le x) = B(x; n, p) = \sum_{y=0}^{x} b(y; n, p).$$
Mean & Variance
If X ~ Bin(n, p), then
• E(X) = np,
• V(X) = npq (where q = 1 - p),
• $\sigma_X = \sqrt{npq}$.
Example(Cont)
• n = 5, p = 11/32. Then
• E(X) = n · p = 5 · 11/32 = 1.72.
• V(X) = n · p · q = 5 · 11/32 · 21/32 = 1.13.
• $\sigma_X = (1.13)^{1/2} = 1.06$.
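The formulas E(X) = np and V(X) = npq can be checked against the pmf directly for n = 5 and p = 11/32. The sketch below is illustrative; the function names `binom_pmf` and `binom_cdf` are chosen here to mirror b(x; n, p) and B(x; n, p).

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """B(x; n, p) = sum of b(y; n, p) for y = 0, ..., x."""
    return sum(binom_pmf(y, n, p) for y in range(0, x + 1))


n, p = 5, 11 / 32
# Mean and variance computed from the pmf, not from the closed forms.
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))
print(round(mean, 2), round(var, 2), round(sqrt(var), 2))
```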
Hypergeometric and
Negative Binomial
Distribution
Introduction
• The hypergeometric and negative
binomial distributions are both closely
related to the binomial distribution.
Introduction
• The negative binomial distribution arises
from fixing the number of S’s and letting
the number of trials be random.
• The hypergeometric distribution is the
exact probability model for sampling
without replacement from a finite
dichotomous (S,F) population.
Negative Binomial Dist’n
• The experiment consists of a sequence
of independent trials.
• Each trial results in either S or F.
• The probability of success, p, is
constant from trial to trial.
• Trials are performed until a total of s
successes have been observed, where s
is a prespecified positive integer.
Negative Binomial RV
• X = the number of F’s that precede the
sth success is called a negative
binomial rv.
• Possible values of X are 0, 1, 2, …
pmf
• It is convenient to count trials. Denote by nb(x; s, p)
the probability that the sth success occurs on trial x,
that is, after x - s failures. Then
$$nb(x; s, p) = \binom{x-1}{s-1} p^s (1-p)^{x-s}, \quad x = s, s+1, \ldots$$
• Why?
• Total # of trials = x. The last trial must be a
success. Among the first (x - 1) trials, there
are (s - 1) successes & (x - s) failures.
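The counting argument can be turned into a small function, reading x as the total number of trials as in the argument above. This is a sketch with illustrative values s = 3 and p = 0.4 (not from the slides). It checks that the probabilities sum to 1 and that the mean number of trials matches s/p, a standard result that the slides do not state.

```python
from math import comb


def nb_trials_pmf(x, s, p):
    """P(the s-th success occurs on trial x), for x = s, s + 1, ..."""
    if x < s:
        return 0.0
    # s - 1 successes among the first x - 1 trials, then a success on trial x
    return comb(x - 1, s - 1) * p**s * (1 - p) ** (x - s)


s, p = 3, 0.4  # illustrative values only
# Truncate the infinite sums at x = 399; the remaining tail is negligible here.
total = sum(nb_trials_pmf(x, s, p) for x in range(s, 400))
mean_trials = sum(x * nb_trials_pmf(x, s, p) for x in range(s, 400))
print(total, mean_trials, s / p)
```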
Review of Chapter 3
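• Hypergeometric distribution,
$$P(X = x) = h(x; n, S, N) = \frac{\binom{S}{x}\binom{N-S}{n-x}}{\binom{N}{n}}$$
for $\max(0, n - N + S) \le x \le \min(n, S)$, where N is the
population size, S the number of successes in the population,
and n the sample size.
• Poisson distribution.
$$p(x; \lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$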
Example
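• Suppose requests arrive as a Poisson process at a rate of
5 per hour (the rate implied by the calculations that follow).
What’s the probability that fewer than 3 requests
are received during a particular hour?
• $P(X < 3) = P(0) + P(1) + P(2) = e^{-5} + 5\,e^{-5} + 5^2 e^{-5}/2 = 0.125.$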
Example
• What’s the probability that exactly 10
requests are received during a particular
2-hour period?
• Rate = 2 × 5 = 10.
• $P(X = 10) = e^{-10}\,10^{10}/10! = 0.125.$
Example
• How many calls do they expect to get
during a 45-min period?
• E(X) = (3/4) · 5 = 3.75.
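The three Poisson calculations above can be reproduced with a few lines of code. This is a sketch under the assumption, inferred from "Rate = 2 × 5 = 10", that requests arrive at 5 per hour; the function and variable names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """p(x; lambda) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)


rate_per_hour = 5  # assumption inferred from the examples

# Fewer than 3 requests in one hour: P(0) + P(1) + P(2)
p_fewer_than_3 = sum(poisson_pmf(x, rate_per_hour) for x in range(3))
# Exactly 10 requests in two hours: the rate scales with the interval length
p_exactly_10_in_2h = poisson_pmf(10, 2 * rate_per_hour)
# Expected count in 45 minutes (3/4 of an hour)
expected_in_45_min = 0.75 * rate_per_hour
print(round(p_fewer_than_3, 3), round(p_exactly_10_in_2h, 3), expected_in_45_min)
```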
Continuous RVs
&
Probability Distributions
Continuous RV
• An rv X is continuous if its set of
possible values is an entire interval of
numbers.
Example:
• X = the pH of a random soil sample
• X = the weight of a randomly selected
person.
pdf
• Let X be a continuous rv. Then a
probability density function (pdf) of X is
a function f(x) such that for any two
numbers a and b with $a \le b$,
$$P(a \le X \le b) = \int_a^b f(x)\,dx.$$
• For f(x) to be a pdf, f(x) must satisfy:
$f(x) \ge 0$ for all x, and $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Example
• Waiting time at a bus station. A bus
arrives every 10 minutes. So the waiting
time is from 0 to 10. One possible pdf
for waiting time X is
$$f(x) = \begin{cases} 1/10, & 0 \le x \le 10 \\ 0, & \text{otherwise.} \end{cases}$$
• The probability of waiting between 3 and 5
minutes is:
$$P(3 \le X \le 5) = \int_3^5 0.1\,dx = 0.1\,x\,\Big|_{3}^{5} = 0.2.$$
Uniform Distribution
• A continuous rv X is said to have a
uniform distribution on the interval
[A, B] if the pdf of X is
$$f(x; A, B) = \begin{cases} 1/(B-A), & A \le x \le B \\ 0, & \text{otherwise.} \end{cases}$$
• Graphs of uniform distributions.
Probability at a Point
• When X is a discrete rv, each possible
value is assigned positive probability.
This is no longer true for continuous rv.
• If X is a continuous rv, then for any
number c, P(X = c) = 0. Consequently,
$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b).$
Example
• Let X = the “time headway” for two
randomly chosen consecutive cars on a
freeway during a period of heavy flow.
Suppose the pdf of X is given by:
$f(x) = 0.15\,e^{-0.15(x - 0.5)}, \quad x \ge 0.5.$
f(x) = 0 for x < .5 and f(x) decreases
exponentially fast as x increases from .5.
Example
• First, it is clear that $f(x) \ge 0$. Now we verify
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{.5}^{\infty} .15\,e^{-.15(x-.5)}\,dx = .15\,e^{.075}\int_{.5}^{\infty} e^{-.15x}\,dx = .15\,e^{.075}\cdot\frac{1}{.15}\,e^{-(.15)(.5)} = 1.$$
• The probability that headway time is at
most 5 seconds is
$$P(X \le 5) = \int_{-\infty}^{5} f(x)\,dx = \int_{.5}^{5} .15\,e^{-.15(x-.5)}\,dx = .15\,e^{.075}\int_{.5}^{5} e^{-.15x}\,dx$$
$$= .15\,e^{.075}\left(-\frac{1}{.15}\,e^{-.15x}\right)\Bigg|_{x=.5}^{x=5} = e^{.075}\left(-e^{-.75} + e^{-.075}\right) = .491.$$
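Both integrals can be sanity-checked numerically. The sketch below uses a simple midpoint rule; the helper `integrate`, the step count, and the upper limit of 200 standing in for infinity are choices made here for illustration.

```python
from math import exp


def headway_pdf(x):
    """f(x) = 0.15 * exp(-0.15 * (x - 0.5)) for x >= 0.5, and 0 otherwise."""
    return 0.15 * exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# The density beyond x = 200 is negligible, so 200 stands in for infinity.
total_area = integrate(headway_pdf, 0.5, 200.0)
p_at_most_5 = integrate(headway_pdf, 0.5, 5.0)
print(round(total_area, 4), round(p_at_most_5, 3))
```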
CDFs & Expected Values
cdf
• The cumulative distribution function
(cdf) F(x) for a continuous rv X is
defined for every number x by
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy.$$
• For each x, F(x) is the area under the
density curve to the left of x. It is the
probability that X takes a value
less than or equal to x.
Example
• Let X have a uniform distribution on the
interval [A, B]. Then
$$f(x; A, B) = \begin{cases} 1/(B-A), & A \le x \le B \\ 0, & \text{otherwise.} \end{cases}$$
• So, for x < A, F(x) = 0 and for $x \ge B$,
F(x) = 1. For $A \le x \le B$,
$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_{A}^{x} \frac{1}{B-A}\,dy = \frac{x-A}{B-A}.$$
Example
• The entire cdf is:
$$F(x) = \begin{cases} 0, & x < A \\ \dfrac{x-A}{B-A}, & A \le x < B \\ 1, & x \ge B. \end{cases}$$
• The graph of the cdf looks like:
Propositions
• Compute probabilities using F(x):
$P(a \le X \le b) = F(b) - F(a).$
• Obtaining pdf from cdf:
• If X is a continuous rv with cdf F(x)
differentiable at every point x, then the
pdf is $f(x) = F'(x)$.
Example
• For the uniform distribution on [A, B], the
cdf is
$$F(x) = \begin{cases} 0, & x < A \\ \dfrac{x-A}{B-A}, & A \le x < B \\ 1, & x \ge B. \end{cases}$$
• So, for example, if A < a < b < B, then
P(a < X < b) = F(b) - F(a) = (b - a)/(B - A).
• The pdf is
$f(x) = F'(x) = 1/(B-A)$ for A < x < B.
Expected Values
• The expected value (or mean) of a
continuous rv X with pdf f(x) is
$$\mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx.$$
• If X is a continuous rv with pdf f(x) and
h(X) is any function of X, then
$$\mu_{h(X)} = E(h(X)) = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx.$$
Example
• The pdf of the waiting time (in minutes)
at a checkout is given by
f(x) = x/8 for $0 \le x < 4$.
• What’s the probability of waiting less
than 3 min?
• What’s the expectation of the waiting
time?
Example
• What’s the probability of waiting less
than 3 min?
$$P(X < 3) = \int_0^3 \frac{x}{8}\,dx = \frac{x^2}{16}\Bigg|_{x=0}^{x=3} = \frac{9}{16} = .5625.$$
• What’s the expectation of the waiting
time?
$$E(X) = \int_0^4 x \cdot \frac{x}{8}\,dx = \frac{x^3}{24}\Bigg|_{x=0}^{x=4} = \frac{64}{24} = 2.667.$$
Variance & S.D.
• The variance of a continuous rv X with
pdf f(x) and mean $\mu$ is
$$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x-\mu)^2 \cdot f(x)\,dx = E[(X-\mu)^2].$$
• The standard deviation (S.D.) of X is
$$\sigma_X = \sqrt{V(X)}.$$
• $V(X) = E(X^2) - [E(X)]^2.$
Linear Transformation
• If h(X) = aX + b and $V(X) = \sigma^2$, then
$V(h(X)) = V(aX + b) = a^2\sigma^2$
and
$\sigma_{aX+b} = |a|\,\sigma.$
Example(Cont)
• The pdf of the waiting time at a checkout:
f(x) = x/8 for $0 \le x < 4$.
• Find the variance of the waiting time.
• $\mu = E(X) = 2.667.$
$$E(X^2) = \int_0^4 x^2 \cdot \frac{x}{8}\,dx = \frac{x^4}{32}\Bigg|_{x=0}^{x=4} = \frac{256}{32} = 8.$$
$$V(X) = E(X^2) - [E(X)]^2 = 8 - 2.667^2 = .889.$$
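The checkout example (probability, mean, and variance) can be verified numerically. As before, this is only a sketch: the midpoint-rule helper `integrate` and the step count are illustrative choices, not part of the slides.

```python
def waiting_pdf(x):
    """f(x) = x / 8 for 0 <= x < 4, and 0 otherwise."""
    return x / 8 if 0 <= x < 4 else 0.0


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


p_less_than_3 = integrate(waiting_pdf, 0, 3)                      # P(X < 3)
mean = integrate(lambda x: x * waiting_pdf(x), 0, 4)              # E(X)
second_moment = integrate(lambda x: x**2 * waiting_pdf(x), 0, 4)  # E(X^2)
variance = second_moment - mean**2                                # shortcut formula
print(round(p_less_than_3, 4), round(mean, 3), round(variance, 3))
```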
Normal Distribution
Introduction
• The normal distribution is the most
important distribution in all of
probability and statistics.
• Many numerical populations have
distributions that can be approximated
very well by a normal curve.
Example
• Scores of standardized tests,
• Measurements of intelligence & aptitude,
• Returns of a stock (or a portfolio),
• Measurement errors …
Definition
• A continuous rv X is said to have a
normal distribution with parameters $\mu$
and $\sigma^2$ if the pdf of X is
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty.$$
Remarks
• Notation: $X \sim N(\mu, \sigma^2)$.
• It’s clear that $f(x; \mu, \sigma^2) \ge 0$ and it can be
shown that
$$\int_{-\infty}^{\infty} f(x; \mu, \sigma^2)\,dx = 1.$$
• $E(X) = \mu$, and $V(X) = \sigma^2$.
Standard Normal Dist’n
• With $\mu = 0$ and $\sigma = 1$, the normal
distribution is called a standard normal
distribution.
• The pdf of a standard normal rv Z is
$$f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty.$$
• The cdf of Z is denoted by $\Phi(z)$.
Normal Probability Table
• Table A.3 on page 704 of the text tabulates
the standard normal probabilities (cdf). This
is one of the most useful statistical tables.
• Example: Using the table to compute:
* P(Z < 1.20),
* P(Z > 1.68) $(= 1 - P(Z \le 1.68))$,
* P(-1.96 < Z < 1) $(= P(Z < 1) - P(Z \le -1.96))$.
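When the table is not at hand, the standard normal cdf can be evaluated through the error function, using the identity Φ(z) = (1 + erf(z/√2))/2. The sketch below computes the three probabilities listed above; the function name `phi` and the variable names are illustrative.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cdf: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


p_less_120 = phi(1.20)             # P(Z < 1.20)
p_greater_168 = 1 - phi(1.68)      # P(Z > 1.68) = 1 - P(Z <= 1.68)
p_between = phi(1.0) - phi(-1.96)  # P(-1.96 < Z < 1)
print(round(p_less_120, 4), round(p_greater_168, 4), round(p_between, 4))
```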
Inverse Reading of
Table A.3
• Z denotes the (100)th percentile of the
standard normal distribution.
• The area under the standard normal curve
to the right of Z (tail probability) is .
• Find: Z.30, Z.90.
Standardization
• If Z ~ N(0, 1), then $X = \mu + \sigma Z \sim N(\mu, \sigma^2)$.
• Inversely, if $X \sim N(\mu, \sigma^2)$, then
$Z = (X - \mu)/\sigma \sim N(0, 1)$.
• The transformation
$$X \;\to\; Z = \frac{X - \mu}{\sigma}$$
is called standardization.
• $P(X \le x) = P[Z \le (x - \mu)/\sigma] = \Phi[(x - \mu)/\sigma].$
Standardization
• (100p)th percentile for $N(\mu, \sigma^2)$
= $\mu + \sigma \cdot$ [(100p)th percentile for N(0, 1)].
• So if $X \sim N(\mu, \sigma^2)$, then
$X_\alpha = \mu + \sigma \cdot Z_\alpha.$
Rule of Thumb
If X is (approximately) normal, then
• about 68% of the x's are within 1 SD of
the mean;
• about 95% of the x's are within 2 SDs of
the mean;
• about 99.7% of the x's are within 3 SDs
of the mean;
Example(Fish)
The lengths of fish in a certain fish
population follow a normal distribution
with $\mu = 54$ mm and $\sigma = 4.5$ mm.
• What percentage of the fish are between 50
and 60 mm long?
* Let $Z = (X - \mu)/\sigma$. Then
$z_1 = (50 - 54)/4.5 = -.89$, $z_2 = (60 - 54)/4.5 = 1.33$.
Use Table A.3: $P(50 \le X \le 60) = P(-.89 \le Z \le 1.33)$
= .9082 - .1867 = .7215.
Example(Fish)
• What percentage of the fish are more
than 48 mm long?
* z1 = (48 - 54)/4.5 = - 1.33.
P( X > 48) = 1- .0918 = .9082.
• What percentage of the fish are between
58 and 60 mm long?
* z1 = (58 - 54)/4.5 = 0.89, z2 = 1.33.
P(58 < X < 60) = .9082 - 0.8133 = .0949.
Example(Fish)
• What is the 70th percentile of the fish
length ? What is the 90th percentile?
* From Table A.3, Z.70 = 0.52.
So, X.70 = 54 + 4.5 ·0.52 = 56.3
* Similarly, Z.90 = 1.29. and
X.90 = 54 + 4.5 ·1.29 = 59.80.
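The fish calculations can be cross-checked without a table using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch with illustrative variable names. Because it does not round z to two decimals, its results differ slightly from the table-based values above (for example, the exact 90th percentile of Z is about 1.2816 rather than 1.29).

```python
from statistics import NormalDist

# Fish lengths: normal with mean 54 mm and SD 4.5 mm
fish = NormalDist(mu=54, sigma=4.5)

p_50_to_60 = fish.cdf(60) - fish.cdf(50)  # P(50 <= X <= 60)
p_over_48 = 1 - fish.cdf(48)              # P(X > 48)
p_58_to_60 = fish.cdf(60) - fish.cdf(58)  # P(58 < X < 60)
pct_70 = fish.inv_cdf(0.70)               # 70th percentile of length
pct_90 = fish.inv_cdf(0.90)               # 90th percentile of length

print(round(p_50_to_60, 4), round(p_over_48, 4), round(p_58_to_60, 4))
print(round(pct_70, 2), round(pct_90, 2))
```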
Example (Height)
Among American women aged 18 - 24,
10% are less than 61.2 inches tall; 80%
are between 61.2 and 67.4 inches and
10% are more than 67.4 inches. Assume
the height can be well approximated by
a normal distribution.
• Find the mean $\mu$ and the SD $\sigma$.
Example(Height)
* $Z_{.10} = -1.29$ and $Z_{.90} = 1.29$, so
$$\frac{61.2 - \mu}{\sigma} = -1.29, \qquad \frac{67.4 - \mu}{\sigma} = 1.29.$$
Solving for $\mu$ and $\sigma$, we have
$\sigma = (67.4 - 61.2)/(1.29 + 1.29) = 2.4$, and
$\mu = 64.3.$
Normal Approximation
• The normal distribution is often used to
approximate the distribution of discrete
populations.
• In particular, under certain conditions,
the normal distribution can be used as
an approximation to the binomial
distribution.
Normal Approximation
to Binomial Distribution
• For a binomial rv X, we have
$\mu_X = np$ and $\sigma_X = \sqrt{npq}$.
• When both np and nq are relatively
large, the normal distribution with the
same mean and SD is a very good
approximation to Bin(n, p).
Normal Approximation
to Binomial Distribution
• Let X ~ Bin(n, p). Then if $np \ge 5$ and
$nq \ge 5$, X has approximately a normal
distribution with $\mu_X = np$ and $\sigma_X = \sqrt{npq}$, and
$$P(X \le x) = B(x; n, p) \approx \Phi\left(\frac{x + .5 - np}{\sqrt{npq}}\right).$$
• This is the area under the normal curve
to the left of x + .5. The “+.5” is the correction
for discreteness, called the
continuity correction.
Example
• X ~ Bin(30, 0.3). Want: $P(6 \le X \le 10)$.
• Mean = 30 × .3 = 9, SD = $(30 \times .3 \times .7)^{1/2} = 2.51$.
• $P(6 \le X \le 10) = P(X \le 10) - P(X \le 5)$
$\approx \Phi((10 + .5 - 9)/2.51) - \Phi((5 + .5 - 9)/2.51)$
$= \Phi(.598) - \Phi(-1.394) \approx \Phi(.60) - \Phi(-1.39) = .7257 - .0823 = .6434.$
• Direct calculation yields
$P(6 \le X \le 10) = P(6) + \cdots + P(10) = .6538.$
• The two results agree to within about .01.
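The comparison in this example can be reproduced by computing both the exact binomial sum and the continuity-corrected normal approximation. This is a sketch with illustrative function names; it evaluates Φ without rounding z, so the approximation may differ from a table-based value in the last digit.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact B(x; n, p) = sum of C(n, y) p^y (1 - p)^(n - y) for y = 0..x."""
    return sum(comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(x + 1))


def normal_approx_cdf(x, n, p):
    """Continuity-corrected approximation Phi((x + 0.5 - np) / sqrt(npq))."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    return NormalDist().cdf((x + 0.5 - mean) / sd)


n, p = 30, 0.3
# P(6 <= X <= 10) = P(X <= 10) - P(X <= 5), computed both ways
exact = binom_cdf(10, n, p) - binom_cdf(5, n, p)
approx = normal_approx_cdf(10, n, p) - normal_approx_cdf(5, n, p)
print(round(exact, 4), round(approx, 4))
```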