Ch5. Probability Densities
Ch5. Probability Densities II
Dr. Deshi Ye
5.4 Other Probability Distributions
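Uniform distribution: all values in an interval are equally likely outcomes
\[ f(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \text{for } \alpha < x < \beta \\ 0 & \text{elsewhere} \end{cases} \]
Mean of uniform
\[ \mu = \int_{\alpha}^{\beta} x \cdot \frac{1}{\beta - \alpha} \, dx = \frac{\alpha + \beta}{2} \]
Variance of uniform
\[ \mu_2' = \int_{\alpha}^{\beta} x^2 \cdot \frac{1}{\beta - \alpha} \, dx = \frac{\alpha^2 + \alpha\beta + \beta^2}{3}, \qquad \sigma^2 = \mu_2' - \mu^2 = \frac{(\beta - \alpha)^2}{12} \]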
Ex.
Students believe that their final scores will fall between 80 and 100. Suppose that the final scores given by the instructors follow a uniform distribution on that interval.
What is the probability that a student gets a final score no less than 85?
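Solution
The density is constant between 80 and 100:
\[ f(x) = \frac{1}{100 - 80} = \frac{1}{20} = 0.05 \quad \text{for } 80 < x < 100 \]
[Figure: plot of f(x) against x, a rectangle of height 0.05 with x marked at 80, 85, and 100]
\[ P(85 \le X \le 100) = (\text{Base})(\text{Height}) = (100 - 85)(0.05) = 0.75 \]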
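The base-times-height calculation can be sketched in a few lines of Python. The function name `uniform_prob` and its arguments are illustrative choices, not part of the slides.

```python
def uniform_prob(lo, hi, alpha, beta):
    """P(lo <= X <= hi) for X uniform on (alpha, beta)."""
    if beta <= alpha:
        raise ValueError("beta must exceed alpha")
    # Clip the query interval to the support of the distribution.
    left = max(lo, alpha)
    right = min(hi, beta)
    base = max(0.0, right - left)      # width of the overlapping interval
    height = 1.0 / (beta - alpha)      # constant density on (alpha, beta)
    return base * height


# Final-score example: scores uniform on (80, 100), score no less than 85.
p = uniform_prob(85, 100, alpha=80, beta=100)
print(round(p, 4))  # 0.75
```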
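5.6 The Log-Normal Distribution
Log-normal distribution:
\[ f(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\beta}\, x^{-1} e^{-(\ln x - \alpha)^2 / (2\beta^2)} & \text{for } x > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases} \]
It has a long right-hand tail
[Figure: log-normal density with $\alpha = 0$, $\beta = 1$]
By letting $y = \ln x$,
\[ P(a < X < b) = \int_{a}^{b} \frac{1}{\sqrt{2\pi}\,\beta}\, x^{-1} e^{-(\ln x - \alpha)^2 / (2\beta^2)} \, dx = \int_{\ln a}^{\ln b} \frac{1}{\sqrt{2\pi}\,\beta}\, e^{-(y - \alpha)^2 / (2\beta^2)} \, dy \]
Hence
\[ P(a < X < b) = F\left(\frac{\ln b - \alpha}{\beta}\right) - F\left(\frac{\ln a - \alpha}{\beta}\right) \]
where F is the standard normal distribution function.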
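As a minimal sketch of the substitution y = ln x, the probability can be computed from the standard normal distribution function. The helper name `lognormal_prob` and the values a = 1, b = e, alpha = 0, beta = 1 are assumptions chosen for illustration.

```python
from math import e, log
from statistics import NormalDist


def lognormal_prob(a, b, alpha, beta):
    """P(a < X < b) when ln X is normal with mean alpha and std. dev. beta."""
    if not (0 < a < b) or beta <= 0:
        raise ValueError("need 0 < a < b and beta > 0")
    F = NormalDist().cdf  # standard normal distribution function
    return F((log(b) - alpha) / beta) - F((log(a) - alpha) / beta)


# With alpha = 0 and beta = 1 this is F(1) - F(0) for a = 1, b = e.
p = lognormal_prob(1.0, e, alpha=0.0, beta=1.0)
print(round(p, 4))  # 0.3413
```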
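Mean of Log-Normal
Mean and variance are
\[ \mu = e^{\alpha + \beta^2/2}, \qquad \sigma^2 = e^{2\alpha + \beta^2}\left(e^{\beta^2} - 1\right) \]
Proof (mean).
\[ \mu = \int_{0}^{\infty} x \cdot \frac{1}{\sqrt{2\pi}\,\beta}\, x^{-1} e^{-(\ln x - \alpha)^2 / (2\beta^2)} \, dx \]
Substituting $y = \ln x$ and completing the square in the exponent gives $\mu = e^{\alpha + \beta^2/2}$.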
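Gamma distribution
\[ f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta} & \text{for } x > 0,\ \alpha > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases} \]
where
\[ \Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x} \, dx = (\alpha - 1)\,\Gamma(\alpha - 1) \]
and $\Gamma(\alpha) = (\alpha - 1)!$ when $\alpha$ is a positive integer
Mean and variance
\[ \mu = \alpha\beta, \qquad \sigma^2 = \alpha\beta^2 \]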
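The gamma-function identities and the mean and variance formulas can be checked numerically. This is a small sketch using Python's `math.gamma`; the helper `gamma_mean_var` and the sample values are hypothetical.

```python
from math import factorial, gamma, isclose

# Gamma(n) = (n - 1)! for positive integers n.
for n in range(1, 8):
    assert isclose(gamma(n), factorial(n - 1))

# Recurrence Gamma(a) = (a - 1) * Gamma(a - 1), here for a non-integer a > 1.
a = 3.5
assert isclose(gamma(a), (a - 1) * gamma(a - 1))


def gamma_mean_var(alpha, beta):
    """Mean and variance of the gamma distribution with parameters alpha, beta."""
    return alpha * beta, alpha * beta ** 2


mean, var = gamma_mean_var(alpha=2, beta=3)
print(mean, var)  # 6 18
```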
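The Exponential Distribution
By letting $\alpha = 1$ in the Gamma distribution
\[ f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta} & \text{for } x > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Mean and variance
\[ \mu = \beta, \qquad \sigma^2 = \beta^2 \]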
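5.8 The Beta Distribution
When a random variable takes on values on the interval [0,1]
\[ f(x) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1} & \text{for } 0 < x < 1,\ \alpha > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Mean and variance
\[ \mu = \frac{\alpha}{\alpha + \beta}, \qquad \sigma^2 = \frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)} \]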
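A rough numerical check of the beta density: integrating it with a midpoint rule should give total probability 1 and reproduce the mean and variance formulas. The parameter values alpha = 2, beta = 3 and the helper names are assumptions for illustration.

```python
from math import gamma


def beta_pdf(x, a, b):
    """Beta density on (0, 1) with parameters a (alpha) and b (beta)."""
    const = gamma(a + b) / (gamma(a) * gamma(b))
    return const * x ** (a - 1) * (1 - x) ** (b - 1)


def midpoint_integral(g, n=20_000):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))


a, b = 2.0, 3.0
total = midpoint_integral(lambda x: beta_pdf(x, a, b))
mean = midpoint_integral(lambda x: x * beta_pdf(x, a, b))
var = midpoint_integral(lambda x: (x - mean) ** 2 * beta_pdf(x, a, b))

# Closed-form values from the formulas for the mean and variance.
mean_formula = a / (a + b)
var_formula = a * b / ((a + b) ** 2 * (a + b + 1))
print(round(total, 6), round(mean, 6), round(var, 6))
```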
Beta distribution
Is used extensively in Bayesian statistics
Models events that are constrained to take place within an interval defined by a minimum and a maximum value
Is used extensively in PERT, CPM, and project management
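5.9 Weibull Distribution
\[ f(x) = \begin{cases} \alpha\beta\, x^{\beta - 1} e^{-\alpha x^{\beta}} & \text{for } x > 0,\ \alpha > 0,\ \beta > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Mean and variance
\[ \mu = \alpha^{-1/\beta}\, \Gamma\left(1 + \frac{1}{\beta}\right), \qquad \sigma^2 = \alpha^{-2/\beta} \left\{ \Gamma\left(1 + \frac{2}{\beta}\right) - \left[ \Gamma\left(1 + \frac{1}{\beta}\right) \right]^2 \right\} \]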
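The Weibull mean and variance are easy to evaluate with `math.gamma`. The sketch below assumes the parametrization $f(x) = \alpha\beta x^{\beta-1} e^{-\alpha x^{\beta}}$; the function name and sample values are hypothetical. With $\beta = 1$ the density reduces to an exponential with mean $1/\alpha$, which gives a simple sanity check.

```python
from math import gamma


def weibull_mean_var(alpha, beta):
    """Mean and variance of the Weibull density alpha*beta*x**(beta-1)*exp(-alpha*x**beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    g1 = gamma(1 + 1 / beta)
    g2 = gamma(1 + 2 / beta)
    mean = alpha ** (-1 / beta) * g1
    var = alpha ** (-2 / beta) * (g2 - g1 ** 2)
    return mean, var


# beta = 1 is the exponential case: mean 1/alpha, variance 1/alpha**2.
mean, var = weibull_mean_var(alpha=0.5, beta=1.0)
print(mean, var)  # 2.0 4.0
```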
Weibull distribution
Is most commonly used in life data analysis
Models manufacturing and delivery times in industrial engineering
Models fading channels in wireless communication
5.10 Joint distribution
Experiments are conducted where two or more random variables are observed simultaneously in order to determine not only their individual behavior but also the degree of relationship between them.
Two discrete random variables
\[ f(x_1, x_2) = P(X_1 = x_1, X_2 = x_2) \]
The probability that X1 takes the value x1 and X2 takes the value x2
EX.
             x1 = 0   x1 = 1   x1 = 2
x2 = 0        0.1      0.4      0.1
x2 = 1        0.2      0.2      0
Marginal probability distributions
\[ f_1(x_1) = P(X_1 = x_1) = \sum_{\text{all } x_2} f(x_1, x_2) \]
EX.
             x1 = 0   x1 = 1   x1 = 2
x2 = 0        0.1      0.4      0.1
x2 = 1        0.2      0.2      0
f1(x1)        0.3      0.6      0.1
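Summing the joint table over one variable can be sketched as follows. The dictionary layout, keyed by (x1, x2), and the helper name `marginal` are illustrative choices.

```python
# Joint probabilities f(x1, x2) from the example table, keyed by (x1, x2).
joint = {
    (0, 0): 0.1, (1, 0): 0.4, (2, 0): 0.1,
    (0, 1): 0.2, (1, 1): 0.2, (2, 1): 0.0,
}


def marginal(joint, axis):
    """Sum the joint probabilities over the other variable (axis 0 -> X1, 1 -> X2)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


f1 = marginal(joint, 0)  # marginal of X1
f2 = marginal(joint, 1)  # marginal of X2
print({k: round(v, 2) for k, v in f1.items()})  # {0: 0.3, 1: 0.6, 2: 0.1}
print({k: round(v, 2) for k, v in f2.items()})  # {0: 0.6, 1: 0.4}
```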
Conditional probability distribution
The conditional probability of X1 given that X2 = x2
\[ f_1(x_1 \mid x_2) = \frac{f(x_1, x_2)}{f_2(x_2)} \quad \text{for all } x_1, \text{ provided } f_2(x_2) \ne 0 \]
If two random variables are independent
\[ f_1(x_1 \mid x_2) = f_1(x_1) \quad \text{for all } x_1 \text{ and } x_2 \]
\[ f(x_1, x_2) = f_1(x_1)\, f_2(x_2) \]
EX.
With reference to the previous example, find the conditional probability distribution of X1, given that X2 = 1. Are X1 and X2 independent?
Solution.
\[ f_1(0 \mid 1) = \frac{f(0, 1)}{f_2(1)} = \frac{0.2}{0.4} = 0.5 \]
\[ f_1(1 \mid 1) = \frac{f(1, 1)}{f_2(1)} = \frac{0.2}{0.4} = 0.5 \]
\[ f_1(2 \mid 1) = \frac{f(2, 1)}{f_2(1)} = \frac{0}{0.4} = 0 \]
\[ f_1(0 \mid 1) = 0.5 \ne 0.3 = f_1(0) \]
Hence, X1 and X2 are dependent
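The same conditional distribution and independence check can be sketched in Python. The table values come from the example; the variable names are illustrative.

```python
from math import isclose

# Joint probabilities f(x1, x2) from the example, keyed by (x1, x2).
joint = {
    (0, 0): 0.1, (1, 0): 0.4, (2, 0): 0.1,
    (0, 1): 0.2, (1, 1): 0.2, (2, 1): 0.0,
}
x1_values = (0, 1, 2)
x2_values = (0, 1)

# Marginal distributions of X1 and X2.
f1 = {a: sum(joint[(a, b)] for b in x2_values) for a in x1_values}
f2 = {b: sum(joint[(a, b)] for a in x1_values) for b in x2_values}

# Conditional distribution of X1 given X2 = 1.
cond_given_1 = {a: joint[(a, 1)] / f2[1] for a in x1_values}

# Independence requires f(x1, x2) = f1(x1) * f2(x2) for every cell.
independent = all(
    isclose(p, f1[a] * f2[b], abs_tol=1e-12) for (a, b), p in joint.items()
)
print({a: round(p, 2) for a, p in cond_given_1.items()})  # {0: 0.5, 1: 0.5, 2: 0.0}
print(independent)  # False
```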
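Continuous variables
If $X_1, X_2, \dots, X_k$ are k continuous random variables, we refer to $f(x_1, x_2, \dots, x_k)$ as the joint probability density of these random variables
\[ P(a_1 \le X_1 \le b_1, \dots, a_k \le X_k \le b_k) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_k}^{b_k} f(x_1, x_2, \dots, x_k) \, dx_k \cdots dx_2 \, dx_1 \]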
EX.
P179.
\[ f(x_1, x_2) = \begin{cases} 6 e^{-2x_1 - 3x_2} & \text{for } x_1 > 0,\ x_2 > 0 \\ 0 & \text{elsewhere} \end{cases} \]
Find the probability that the first random variable takes a value between 1 and 2 and the second random variable takes a value between 2 and 3
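One way to work this example: the density factors as $(2e^{-2x_1})(3e^{-3x_2})$, so the double integral over the rectangle is a product of two one-dimensional integrals. The sketch below evaluates that closed form and cross-checks it with a crude midpoint rule; both function names are hypothetical.

```python
from math import exp


def rect_prob(a1, b1, a2, b2):
    """P(a1 < X1 < b1, a2 < X2 < b2) for f = 6*exp(-2*x1 - 3*x2), x1, x2 > 0."""
    # Product of the two one-dimensional integrals of 2e^{-2x1} and 3e^{-3x2}.
    return (exp(-2 * a1) - exp(-2 * b1)) * (exp(-3 * a2) - exp(-3 * b2))


def rect_prob_numeric(a1, b1, a2, b2, n=200):
    """Midpoint-rule approximation of the same double integral."""
    h1, h2 = (b1 - a1) / n, (b2 - a2) / n
    total = 0.0
    for i in range(n):
        x1 = a1 + (i + 0.5) * h1
        for j in range(n):
            x2 = a2 + (j + 0.5) * h2
            total += 6 * exp(-2 * x1 - 3 * x2)
    return total * h1 * h2


p = rect_prob(1, 2, 2, 3)
p_check = rect_prob_numeric(1, 2, 2, 3)
print(f"{p:.6f} {p_check:.6f}")
```

The probability is small (roughly 0.0003) because most of the mass of this density lies close to the origin.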
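Marginal density
Marginal density of X1
\[ f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2) \, dx_2 \]
For the previous example
\[ f_1(x_1) = \int_{0}^{\infty} 6 e^{-2x_1 - 3x_2} \, dx_2 = 2 e^{-2x_1} \quad \text{for } x_1 > 0 \]
and $f_1(x_1) = 0$ elsewhere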
Distribution function
\[ F(x_1, x_2) = \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} f(s, t) \, dt \, ds \]
\[ F_1(x_1) = F(x_1, \infty) \]
\[ F_2(x_2) = F(\infty, x_2) \]
Independent
Two random variables are independent if and only if the following equation holds.
\[ f(x_1, x_2) = f_1(x_1)\, f_2(x_2) \quad \text{for all } (x_1, x_2) \]
Properties of Expectation
Consider a function g(x) of a single random variable X. For example: g(x) = 9x/5 + 32.
If X has probability density f(x), then the mean or expectation of g(X) is given by
\[ E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f(x) \, dx \]
Or, in the discrete case,
\[ E[g(X)] = \sum_{\text{all } x_i} g(x_i)\, f(x_i) \]
Properties of Expectation
If a and b are constants
\[ E[aX + b] = a\,E[X] + b \]
\[ D[aX + b] = a^2 D[X] \]
where D denotes the variance.
Proof. The properties hold in both the continuous and the discrete case
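A quick discrete check of both properties, using g(x) = 9x/5 + 32. The probability function below (values 0, 1, 2 with probabilities 0.3, 0.6, 0.1) is borrowed from the marginal distribution of the earlier joint-table example purely for illustration.

```python
from math import isclose

# Illustrative discrete distribution: x -> f(x).
pmf = {0: 0.3, 1: 0.6, 2: 0.1}


def expect(g, pmf):
    """E[g(X)] = sum of g(x_i) f(x_i) over all x_i."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(g, pmf):
    """D[g(X)] = E[(g(X) - E[g(X)])^2]."""
    m = expect(g, pmf)
    return expect(lambda x: (g(x) - m) ** 2, pmf)


a, b = 9 / 5, 32
mean_x = expect(lambda x: x, pmf)
var_x = variance(lambda x: x, pmf)
mean_g = expect(lambda x: a * x + b, pmf)
var_g = variance(lambda x: a * x + b, pmf)

print(isclose(mean_g, a * mean_x + b))  # True
print(isclose(var_g, a ** 2 * var_x))   # True
```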
Covariance
Covariance of X1 and X2: a measure of the degree of relationship between them
\[ E[(X_1 - \mu_1)(X_2 - \mu_2)] \]
Theorem. When X1 and X2 are independent, their covariance is 0
\[ E[(X_1 - \mu_1)(X_2 - \mu_2)] = 0 \]
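As an illustration, the covariance of the earlier joint-table example can be computed straight from the definition, and compared with a table built as the product of its marginals (which is independent by construction). The helper names are hypothetical.

```python
# Joint probabilities f(x1, x2) from the earlier example, keyed by (x1, x2).
joint = {
    (0, 0): 0.1, (1, 0): 0.4, (2, 0): 0.1,
    (0, 1): 0.2, (1, 1): 0.2, (2, 1): 0.0,
}


def expect(g, joint):
    """E[g(X1, X2)] for a discrete joint distribution."""
    return sum(g(x1, x2) * p for (x1, x2), p in joint.items())


def covariance(joint):
    """E[(X1 - mu1)(X2 - mu2)] computed from the definition."""
    mu1 = expect(lambda x1, x2: x1, joint)
    mu2 = expect(lambda x1, x2: x2, joint)
    return expect(lambda x1, x2: (x1 - mu1) * (x2 - mu2), joint)


cov = covariance(joint)

# An independent table: product of the marginals f1 and f2.
f1 = {0: 0.3, 1: 0.6, 2: 0.1}
f2 = {0: 0.6, 1: 0.4}
product = {(a, b): f1[a] * f2[b] for a in f1 for b in f2}
cov_indep = covariance(product)

print(round(cov, 4))             # -0.12
print(abs(cov_indep) < 1e-12)    # True
```

A zero covariance does not by itself imply independence; the theorem only runs in the direction stated.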
5.11 Checking Whether the Data Are Normal
Question: Does a data set appear to be generated by a normally distributed random variable?
Collect data: the last four digits of students' mobile phone numbers
Simple approach
A histogram can be checked for lack of symmetry
A single long tail certainly contradicts the assumption of a normal distribution
Normal scores plot
Also called Q-Q plot, normal quantile plot, normal order plot, or rankit plot.
Normal scores: an idealized sample from the standard normal distribution. It consists of the values of z that divide the axis into equal probability intervals. For example, for n = 4:
\[ m_1 = -z_{0.2} = -0.84 \]
\[ m_2 = -z_{0.4} = -0.25 \]
\[ m_3 = z_{0.4} = 0.25 \]
\[ m_4 = z_{0.2} = 0.84 \]
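The n = 4 scores can be reproduced with the inverse standard normal distribution function, cutting the axis into n + 1 intervals of equal probability. This is a minimal sketch; the function name `normal_scores` is an illustrative choice.

```python
from statistics import NormalDist


def normal_scores(n):
    """Values of z that split the axis into n + 1 equal-probability intervals."""
    inv = NormalDist().inv_cdf  # inverse standard normal distribution function
    return [inv(i / (n + 1)) for i in range(1, n + 1)]


scores = normal_scores(4)
print([round(m, 2) for m in scores])  # [-0.84, -0.25, 0.25, 0.84]
```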
Steps to construct a normal scores plot
1) Order the data from smallest to largest: $d_1 \le d_2 \le \dots \le d_n$
2) Obtain the normal scores $m_1, \dots, m_n$
3) Plot the i-th ordered observation $d_i$ versus the i-th normal score $m_i$, for all i; that is, plot the points $(d_i, m_i)$
Normal scores in Minitab
In Minitab, the normal scores are calculated in a slightly different way:
The i-th normal score is
\[ m_i = \Phi^{-1}\left(\frac{i - 3/8}{n + 1/4}\right) \]
where $\Phi^{-1}(x)$ is the inverse cumulative distribution function of the standard normal
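Putting the plotting steps together with this formula gives the points of a normal scores plot. The sketch below only builds the (d_i, m_i) pairs and leaves the plotting itself out; the data values and function names are made up for illustration.

```python
from statistics import NormalDist


def minitab_normal_scores(n):
    """i-th score = inverse normal cdf of (i - 3/8) / (n + 1/4), i = 1..n."""
    inv = NormalDist().inv_cdf
    return [inv((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)]


def normal_scores_points(data):
    """Pairs (d_i, m_i): ordered observations with their normal scores."""
    ordered = sorted(data)                        # step 1: order the data
    scores = minitab_normal_scores(len(ordered))  # step 2: normal scores
    return list(zip(ordered, scores))             # step 3: points to plot


# Hypothetical observations, not taken from the slides.
data = [5.2, 3.9, 4.4, 6.1, 4.8]
points = normal_scores_points(data)
for d, m in points:
    print(f"{d:4.1f}  {m:+.3f}")
```

The points can then be passed to any plotting tool; a roughly straight pattern supports the normality assumption.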
Property of Q-Q plot
If the data come from a normal distribution, the normal scores plot will resemble a straight line (a 45° line through the origin when the data are standardized).
5.12 Transforming Observations to Near Normality
When the histogram or normal scores plot indicates that the assumption of a normal distribution is invalid, transformations of the data can often improve the agreement with normality.
Make large values smaller: $-\dfrac{1}{x}$, $\ln x$, $x^{1/4}$, $\sqrt{x}$
Make large values larger: $x^2$, $x^3$
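To see the effect of a transformation that makes large values smaller, the sketch below compares the sample skewness of a small right-skewed data set before and after taking logarithms. The data are invented for illustration, and the moment-based skewness coefficient is just one convenient measure of asymmetry.

```python
from math import log


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical data with a single long right-hand tail.
data = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
raw_skew = skewness(data)
log_skew = skewness([log(v) for v in data])

# The log transform pulls in the large values, so the skewness drops.
print(round(raw_skew, 2), round(log_skew, 2))
```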
Simulation
Suppose we need to simulate values from the normal distribution with a specified $\mu$ and $\sigma^2$
From $z = \dfrac{x - \mu}{\sigma}$, the value $x = \mu + \sigma z$ can be calculated from the value of a standard normal variable z
1) z can be obtained from the value of a uniform variable u by numerically solving u = F(z)
2) Box-Muller-Marsaglia method: it starts with a pair of independent uniform variables u1 and u2, and produces two standard normal variables
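Box-Muller-Marsaglia
It starts with a pair of independent uniform variables u1 and u2 on (0, 1), and produces two standard normal variables
\[ z_1 = \sqrt{-2 \ln(u_2)}\, \cos(2\pi u_1) \]
\[ z_2 = \sqrt{-2 \ln(u_2)}\, \sin(2\pi u_1) \]
Then
\[ x_1 = \mu + \sigma z_1 \]
\[ x_2 = \mu + \sigma z_2 \]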
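A minimal Python sketch of these formulas follows. The function name `box_muller`, the seed, and the values mu = 10, sigma = 2 are assumptions for illustration; u2 is shifted into (0, 1] so that the logarithm is always defined.

```python
import random
from math import cos, log, pi, sin, sqrt


def box_muller(mu, sigma, rng=random):
    """Return two independent normal(mu, sigma^2) values from two uniforms."""
    u1 = rng.random()            # uniform on [0, 1)
    u2 = 1.0 - rng.random()      # uniform on (0, 1], avoids log(0)
    r = sqrt(-2.0 * log(u2))
    z1 = r * cos(2 * pi * u1)    # first standard normal value
    z2 = r * sin(2 * pi * u1)    # second standard normal value
    return mu + sigma * z1, mu + sigma * z2


# Simulate 40,000 values and compare sample moments with mu and sigma.
rng = random.Random(0)
sample = [x for _ in range(20_000) for x in box_muller(10.0, 2.0, rng)]
sample_mean = sum(sample) / len(sample)
sample_sd = sqrt(sum((x - sample_mean) ** 2 for x in sample) / (len(sample) - 1))
print(round(sample_mean, 1), round(sample_sd, 1))
```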
Simulation from exponential distribution
Suppose we wish to simulate an observation from the exponential distribution
\[ F(x) = 1 - e^{-0.3x}, \quad 0 < x \]
The computer would first produce the value u from the uniform distribution. Then, solving u = F(x) for x,
\[ x = -\frac{\ln(1 - u)}{0.3} \]
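The inverse-transform step can be sketched as below. The function name `simulate_exponential` and the seed are illustrative; the rate 0.3 is the one from the slide.

```python
import random
from math import log


def simulate_exponential(rate, rng=random):
    """Solve u = 1 - exp(-rate * x) for x, with u uniform on [0, 1)."""
    u = rng.random()
    return -log(1.0 - u) / rate   # 1 - u lies in (0, 1], so the log is defined


rng = random.Random(1)
draws = [simulate_exponential(0.3, rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)

# The theoretical mean of this distribution is 1 / 0.3, about 3.33.
print(round(sample_mean, 1))
```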
Population and sample
Population and Sample
Investigations of a physical phenomenon, production process, or manufactured unit share some common characteristics: relevant data must be collected.
Unit: the source of each measurement. A single entity, usually an object or person.
Population: the entire collection of units.
Population and sample
[Figure: diagram with the labels "Population" and "sample"]
Key terms
Population: all items of interest
Sample: portion of the population
Parameter: summary measure about the population
Statistic: summary measure about the sample
Examples
Population                                   Unit      Variables
All students currently enrolled in school    student   GPA; number of credits
All books in library                         book      Replacement cost
Sample
Statistical population: the set of all measurements corresponding to each unit in the entire population of units about which information is sought.
Sample: A sample from a statistical population is the subset of measurements that are actually collected in the course of an investigation.
Sample
Needs to be representative of the population
Needs to be large enough to contain sufficient information to answer the question about the population
Discussion
P10, Review Exercises 1.2
A radio-show host announced that she wanted to know which singer was the favorite among college students in your school. Listeners were asked to call and name their favorite singer.
Identify the population, in terms of preferences, and the sample.
Is the sample likely to be representative? Comment. Also describe how to obtain a sample that is likely to be more representative.