Probability Density Functions


Statistical Design Methods for Engineers
Lecture 3: Probability Density Functions
Objectives
1. Introduce the concept of probability, probability density functions (pdf), and cumulative distribution functions (CDF).
2. Discuss the central limit theorem and the origin of common pdfs: uniform, normal, Weibull.
3. Introduce the JFIT tool to generate pdfs from simulation or experimental data.
4. Use a fitted pdf to calculate a probability of non-compliance (PNC).
Probability of Events: Multiplication Rule
- Let A = an event of interest.
- Let B = another event of interest.
- Let C = event A occurring and event B occurring.
  - What is the probability of compound event C occurring?
  - P(C) = P(A and B) = P(A)P(B|A)
  - P(A and B) = P(A∩B) = the probability of A occurring multiplied by the probability of B occurring given that event A has occurred.
  - P(B|A) is called a conditional probability.
  - P(C) = P(B and A) = P(B)P(A|B)
  - If events A and B are independent of one another, then P(B|A) = P(B) and similarly P(A|B) = P(A), so P(C) = P(A)P(B).
  - If events A and B are mutually exclusive, then P(A|B) = 0 = P(B|A).
- Therefore P(A and B) = P(A)P(B|A) = P(B)P(A|B), which implies P(A|B) = P(A)P(B|A)/P(B) (Bayes' Theorem).
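A minimal Monte Carlo sketch (assuming Python with NumPy; the event probabilities are illustrative, not from the lecture) that checks the multiplication rule and Bayes' Theorem on a pair of dependent events:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Event A: a fair coin shows heads. Event B is made dependent on A.
a = rng.random(n) < 0.5
b = rng.random(n) < np.where(a, 0.7, 0.2)   # P(B|A) = 0.7, P(B|not A) = 0.2

p_a, p_b = a.mean(), b.mean()
p_ab = (a & b).mean()

print(p_ab, p_a * 0.7)              # multiplication rule: P(A and B) = P(A)P(B|A)
print(p_ab / p_b, p_a * 0.7 / p_b)  # Bayes: P(A|B) = P(A)P(B|A)/P(B), ~0.778
```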
Probability of Events: Addition Rule
- Let A = an event of interest.
- Let B = another event of interest.
- Let D = event A occurring or event B occurring.
  - What is the probability of compound event D occurring?
  - P(D) = P(A or B) = P(A) + P(B) - P(A and B)
  - P(A or B) is defined as an "inclusive OR", which means the probability of event A, or event B, or both events A and B.
  - If events A and B are independent of one another, then P(D) = P(A) + P(B) - P(A)P(B).
  - If events A and B are mutually exclusive, then P(D) = P(A) + P(B).

[Venn diagram: overlapping circles A and B with intersection A∩B]
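A quick sketch (NumPy; example probabilities assumed) verifying the independent-events form of the addition rule:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
a = rng.random(n) < 0.3          # independent events, P(A) = 0.3
b = rng.random(n) < 0.4          # P(B) = 0.4

print((a | b).mean())            # P(A or B), inclusive OR
print(0.3 + 0.4 - 0.3 * 0.4)     # P(A) + P(B) - P(A)P(B) = 0.58
```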
Example 1
- You have two shirts in the closet, say one red and the other blue.
- You have three pairs of pants in the closet, say one brown, one blue, and one green.
- What is the probability that, randomly picking a shirt and a pair of pants, one chooses the red shirt (event A) and the blue pants (event B)?
- P(A and B) = P(A)P(B) = (1/2)(1/3) = 1/6 (independent events)
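A brute-force enumeration sketch (plain Python) that reproduces the 1/6 by listing every equally likely outfit:

```python
from itertools import product

shirts = ["red", "blue"]
pants = ["brown", "blue", "green"]

outfits = list(product(shirts, pants))           # 6 equally likely outcomes
hits = [o for o in outfits if o == ("red", "blue")]
print(len(hits) / len(outfits))                  # 1/6 ~ 0.1667
```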
Example 2
- There are two parts in series in a circuit, and both parts have to operate in order for the circuit to operate (series reliability).

[Diagram: R1 and R2 in series]

- R1 = probability of part 1 operating; R2 = probability of part 2 operating.
- If the reliability of part 1 is independent of the reliability of part 2, then P(part 1 AND part 2 operating) = R1·R2.
Example 3
- What if the two parts are in parallel in the circuit and at least one of the two parts must operate for the circuit to operate (parallel reliability)?

[Diagram: R1 and R2 in parallel]

- P(part 1 operating OR part 2 operating) = R1 + R2 - R1·R2 if the reliability of part 1 is independent of the reliability of part 2.
  - How likely is this independence to be true?
  - Might the reliability of one of the parts depend on whether or not the other part is operating?
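A small sketch (plain Python; the function names are illustrative, not from the lecture) applying the two rules to independent part reliabilities:

```python
def series_reliability(r1: float, r2: float) -> float:
    """Both parts must operate: P(A and B) = R1*R2 for independent parts."""
    return r1 * r2

def parallel_reliability(r1: float, r2: float) -> float:
    """At least one part must operate: P(A or B) = R1 + R2 - R1*R2."""
    return r1 + r2 - r1 * r2

print(series_reliability(0.9, 0.95))    # 0.855 -- series is weaker than either part
print(parallel_reliability(0.9, 0.95))  # 0.995 -- parallel redundancy helps
```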
Defining a PDF
Probability Density Function, f(x)

- Definition: a mathematical function, f(x), such that f(x)dx = the probability of occurrence of a random variable X within the range x to x+dx, i.e.

f(x)\,dx = \Pr\{x < X < x + dx\}

[Figure: a typical probability density f(x); frequency of occurrence plotted against X.]
Properties of a PDF
- Probability of occurrence is the area under a pdf bounded by any specified lower and upper values of the random variable X.
- f(x) = frequency of occurrence per unit of x, defined for all values of x. Note that the value of f(x) can be > 1.
- Total area under f(x) = 1.0:

P(-\infty < X < \infty) = 1.0

P(x_L \le X \le x_U) = \int_{x_L}^{x_U} f(x)\,dx
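A minimal numerical sketch (NumPy; the exponential pdf is just an example choice) showing both properties, the area between limits and the total area of 1:

```python
import numpy as np

lam = 0.5
x = np.linspace(0.0, 60.0, 200_001)      # exp(-0.5*60) is negligible beyond here
f = lam * np.exp(-lam * x)               # an example pdf (exponential)
dx = x[1] - x[0]

print(f.sum() * dx)                      # total area under f(x): ~1.0
mask = (x >= 1.0) & (x <= 3.0)
print(f[mask].sum() * dx,                # P(1 <= X <= 3) by numerical integration
      np.exp(-lam * 1.0) - np.exp(-lam * 3.0))  # exact answer for comparison
```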
Properties of a CDF
Cumulative Distribution Function, F(x): a function, called a CDF, that returns the cumulative probability that a random variable, X, will have a value less than x. Maximum value = 1.0.

F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du \quad \text{for } -\infty < x < \infty

[Figure: the CDF rises monotonically from 0 to 1.0; the probability F(x) = P(-\infty < X \le x) is plotted against X.]
Expectation
- Expected value of a random variable X.
  - If X has only discrete values {x_i, i = 1, 2, …, n} and p_i is the probability of having the value X = x_i, the expected value of X, written E[X], is determined by

E[X] = \sum_{i=1}^{n} p_i x_i

  If p_i = 1/n for all i = 1, 2, …, n, i.e. all n possible values have the same probability of being chosen, then

E[X] = \frac{1}{n}x_1 + \frac{1}{n}x_2 + \cdots + \frac{1}{n}x_n = \frac{x_1 + x_2 + \cdots + x_n}{n} = \text{arithmetic average}

  - If X has a continuous distribution of possible values, then f(x)dx = the probability that X is between x and x+dx. The expected value of X is then given by weighting each value of X by its probability of occurrence f(x)dx:

E[X] = \int_{-\infty}^{\infty} x f(x)\,dx

  e.g. f(x) = \lambda e^{-\lambda x}: \quad E[X] = \int_{0}^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}
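A short sketch (NumPy) checking the exponential expectation E[X] = 1/λ both by numerical integration and by simulation:

```python
import numpy as np

lam = 2.0
x = np.linspace(0.0, 30.0, 100_001)
dx = x[1] - x[0]
e_numeric = (x * lam * np.exp(-lam * x)).sum() * dx   # integral of x f(x) dx

rng = np.random.default_rng(0)
e_sim = rng.exponential(1.0 / lam, 1_000_000).mean()  # sample mean of draws

print(e_numeric, e_sim, 1.0 / lam)                    # all ~0.5
```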
Measures of Central Tendency
- Mean

\text{Mean} = \mu_X = \int_{-\infty}^{+\infty} x f(x)\,dx, \qquad \text{Sample Mean} = \bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k

- Median, m (middle value)

F(\text{median}) = 0.5 = \int_{-\infty}^{m} f(y)\,dy = \Pr(X \le m)

- Mode (most probable value)

\left.\frac{df}{dx}\right|_{x=\text{mode}} = 0
Measures of Dispersion
- Variance

\sigma_X^2 = \int_{-\infty}^{+\infty} (x-\mu_X)^2 f(x)\,dx, \qquad \text{Sample Variance} = s_X^2 = \frac{1}{n-1}\sum_{k=1}^{n} (x_k-\bar{x})^2

- Standard Deviation

\text{Standard Deviation} = \sqrt{\text{Variance}} = \sigma_X, \qquad \text{Sample Stdev} = s_X = \sqrt{\frac{1}{n-1}\sum_{k=1}^{n} (x_k-\bar{x})^2}

- Range

\text{Range} = R = \max x - \min x

- Mean Absolute Deviation

\text{MAD} = \int_{-\infty}^{+\infty} |x-\mu_X|\, f(x)\,dx, \qquad \text{Sample MAD} = \frac{1}{n}\sum_{k=1}^{n} |x_k - \bar{x}|

[Figure: normal pdf with ±1σ, ±2σ, ±3σ marked about the mean.]
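A sketch (NumPy; the sample itself is illustrative) computing each dispersion measure from data, using the unbiased n-1 divisor shown above:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(20.0, 1.5, 500)                 # illustrative sample

xbar = x.mean()
s2 = ((x - xbar) ** 2).sum() / (len(x) - 1)    # sample variance (n-1 divisor)
s = np.sqrt(s2)                                # sample standard deviation
r = x.max() - x.min()                          # range
mad = np.abs(x - xbar).mean()                  # sample mean absolute deviation

print(xbar, s2, s, r, mad)
```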
Variance
- Discrete distribution

\mathrm{Var}[X] = \sum_{i=1}^{n} p_i \left(x_i - E[X]\right)^2 = \sum_{i=1}^{n} p_i x_i^2 - (E[X])^2 = E[X^2] - (E[X])^2

e.g. geometric distribution: f(x) = (1-p)^{x-1} p, \quad E[X] = \frac{1}{p}, \quad \mathrm{Var}[X] = \frac{1-p}{p^2}

- Continuous distribution

\mathrm{Var}[X] = \int (x - E[X])^2 f(x)\,dx = \int x^2 f(x)\,dx - (E[X])^2

e.g. exponential distribution: f(x) = \lambda e^{-\lambda x}, \quad \mathrm{Var}[X] = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}
Measure of Asymmetry
- Skewness (3rd central moment)

\text{Skewness} = \int_{-\infty}^{+\infty} (x-\mu_X)^3 f(x)\,dx = E[(x-\mu_X)^3]

\text{Coefficient of Skewness} = S_k = \frac{\text{Skewness}}{(\text{Standard Deviation})^3}

S_k = \frac{1}{\sigma^3}\int_{-\infty}^{+\infty} (x-\mu_X)^3 f(x)\,dx, \qquad \text{Sample } S_k = \frac{n}{(n-1)(n-2)\,s^3}\sum_{k=1}^{n} (x_k-\bar{x})^3

- S_k > 0: positive skewness, with Mean > Median > Mode.
Measure of “Peakedness”
- Kurtosis (4th central moment)

\text{Kurtosis} = \int_{-\infty}^{+\infty} (x-\mu_X)^4 f(x)\,dx = E[(x-\mu_X)^4]

\text{Coefficient of Kurtosis} = K_u = \frac{\text{Kurtosis}}{(\text{Standard Deviation})^4}

K_u = \frac{1}{\sigma^4}\int_{-\infty}^{+\infty} (x-\mu_X)^4 f(x)\,dx

\text{Sample } K_u = \frac{n^2-2n+3}{(n-1)(n-2)(n-3)}\sum_{k=1}^{n}\left(\frac{x_k-\bar{x}}{s}\right)^4 - \frac{3(2n-3)(n-1)}{n(n-2)(n-3)}

- K_u > 3: more peaked than the normal distribution.
- K_u < 3: less peaked than the normal distribution.
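A sketch (NumPy) implementing the sample coefficient formulas above; for a large normal sample, Sk should land near 0 and Ku near 3:

```python
import numpy as np

def sample_skew_kurt(x):
    """Bias-corrected sample skewness and kurtosis as defined on the slides."""
    n = len(x)
    s = x.std(ddof=1)                    # n-1 divisor, matching s above
    z = (x - x.mean()) / s
    sk = n / ((n - 1) * (n - 2)) * (z ** 3).sum()
    ku = ((n * n - 2 * n + 3) / ((n - 1) * (n - 2) * (n - 3)) * (z ** 4).sum()
          - 3 * (2 * n - 3) * (n - 1) / (n * (n - 2) * (n - 3)))
    return sk, ku

rng = np.random.default_rng(0)
print(sample_skew_kurt(rng.normal(size=100_000)))   # ~ (0, 3)
```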
Weibull distribution
Weibull probability distribution:

f(x) = \frac{\beta}{\eta}\left(\frac{x}{\eta}\right)^{\beta-1} e^{-(x/\eta)^{\beta}}

[Figure: Weibull pdf f(x) and CDF plotted against the random variable X from 0 to 100, with Mode = 7.6, Median = 20.1, Mean = 25, and stdev σ = 20 marked.]
Useful Properties
- The expected value of a sum of variables is the sum of the expected values:

E[x_1 + x_2 + \cdots + x_n] = E[x_1] + E[x_2] + \cdots + E[x_n]

- The variance of a constant, c, times a random variable x is c² times the variance of x:

\mathrm{Var}[x] = E[(x - E[x])^2] = E[x^2] - (E[x])^2

\mathrm{Var}[cx] = E[c^2 x^2] - (E[cx])^2 = c^2 E[x^2] - (cE[x])^2 = c^2\,\mathrm{Var}[x]

- The variance of a sum of two variables is the sum of the variances of each variable IF the variables are independent of one another:

\mathrm{Var}[x+y] = \mathrm{Var}[x] + \mathrm{Var}[y] + 2\,\mathrm{cov}[x,y]

\mathrm{cov}[x,y] = E[(x - E[x])(y - E[y])] = E[xy] - E[x]E[y]

If x is independent of y, then cov[x,y] = 0 (the converse is not true. Why?), so

\mathrm{Var}[x+y] = \mathrm{Var}[x] + \mathrm{Var}[y]
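A quick simulation sketch (NumPy) of the last property, comparing Var[x+y] against Var[x] + Var[y] for independent draws:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, 1_000_000)      # Var[x] = 4
y = rng.exponential(3.0, 1_000_000)      # Var[y] = 9, independent of x

print(np.var(x + y))                     # ~13
print(np.var(x) + np.var(y))             # ~13: variances add for independent x, y
print(np.var(3 * x), 9 * np.var(x))      # Var[cx] = c^2 Var[x]
```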
Types of Distributions
- Discrete Distributions
  - Bernoulli: f(x) = p^x (1-p)^{1-x}
  - Binomial: f(x) = \frac{n!}{(n-x)!\,x!}\, p^x (1-p)^{n-x}
  - Poisson: f(x) = \frac{\lambda^x}{x!}\, e^{-\lambda} *
- Continuous Distributions
  - Uniform: f(t) = 1/(b-a) for a < t < b, zero otherwise
  - Weibull: f(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} e^{-(t/\eta)^{\beta}}
    - Exponential: f(t) = \lambda e^{-\lambda t}
  - Logistic: f(z) = \frac{z}{b(1+z)^2}, where z = e^{-(x-a)/b} *
  - Rayleigh: f(r) = \frac{r}{\sigma^2}\, e^{-\frac{1}{2}(r/\sigma)^2} *
  - Normal: f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}
  - Lognormal: f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln x - \mu}{\sigma}\right)^2} *
- Central Limit Theorem
  - Law of Large Numbers (LLN)

*(not discussed here)
Another Way to Pick Distributions: Distributions can be determined using maximum entropy arguments.
- Entropy is a measure of disorder or statistical uncertainty.
- The density function which maximizes the entropy* expresses the largest uncertainty for a given set of constraints, e.g.
  - If no parameters of the data are known, f(x) is a Uniform distribution (maximum uncertainty says all values are equally probable).
  - If only the mean value of the data is known, f(x) is an Exponential distribution.
  - If the mean and variance are known, f(x) is a Normal distribution (a Maxwellian distribution if dealing with molecular velocities).
  - If the mean, variance, and range are known, f(x) is a Beta distribution.
  - If the mean occurrence rate between independent events is known (mean time between failures), f(x) is a Poisson distribution.

* Ref: Reliability Based Design in Civil Engineering, Milton E. Harr, Dover Publications, 1987, 1996.
Discrete Distributions: Bernoulli, Binomial
Bernoulli Trial & binomial distribution
- What if we perform a test whose outcome can either be a pass or a fail (binary outcome)? This is called a Bernoulli trial or test.
- Let p = probability of a failure.

Bernoulli distribution:

f(x) = P\{X = x\} = p^x (1-p)^{1-x}, \quad x = 0, 1

P\{X=1\} = p = \text{probability of failure in the test}, \qquad P\{X=0\} = 1-p = \text{probability of a passed test}

E[X] = p, \qquad \mathrm{Var}[X] = p(1-p)

Since many experiments are of the pass/fail variety, the basic building block for such statistical analyses is the Bernoulli distribution. The random variable is x, and it takes on a value of 1 for a failure and 0 for a pass.
Multiple Bernoulli Trials
- Perform, say, 3 Bernoulli trials and ask for the probability of 2 successes, where R = 1 - p and p = probability of failure.
  - P(s=2|3,R) = P(first trial succeeds and second succeeds and third fails, OR first succeeds and second fails and third succeeds, OR first fails and second succeeds and third succeeds)
  - R·R·(1-R) + R·(1-R)·R + (1-R)·R·R = 3R²(1-R)
  - General formula: P(s|n,R) = \binom{n}{s} R^s (1-R)^{n-s} (binomial distribution)
- General formulation of the binomial distribution:

f(x) = P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}, \qquad \text{where } \binom{n}{x} = \frac{n!}{(n-x)!\,x!}

\mu = E[x] = np

\sigma^2 = E[x^2] - (E[x])^2 = npq = np(1-p)

\sigma = \sqrt{np(1-p)} = \sqrt{npq}

\alpha_3 = \sqrt{\beta_1} = \frac{q-p}{\sqrt{npq}}, \qquad \alpha_4 = \beta_2 = 3 + \frac{1-6pq}{npq}
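A sketch (plain Python) evaluating the binomial pmf and reproducing the 3R²(1-R) case above:

```python
from math import comb

def binom_pmf(s: int, n: int, r: float) -> float:
    """P(s successes in n trials), each succeeding with probability r."""
    return comb(n, s) * r**s * (1 - r) ** (n - s)

r = 0.9
print(binom_pmf(2, 3, r))        # nCs * R^s * (1-R)^(n-s)
print(3 * r**2 * (1 - r))        # 3R^2(1-R): same value, 0.243
```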
Probability Mass Function, b(s|n,p)
[Figure: graph of f(x|n,p) for p = 0.02 and n = 20, 100, 500. As n grows, the pmf begins to look like a normal distribution with mean = n·p.]
Example with binomial distribution
- Suppose we performed a pass/fail test (Bernoulli trial) on a system: x = 1 if it fails, x = 0 if it passes. Perform this test on n systems; the resulting estimate of the probability of non-compliance is the sum of the x values divided by the number of trials, i.e. <PNC> = (x₁+x₂+…+xₙ)/n = # noncompliant / # tests.

Binomial:

\hat{p} = x/n = \text{estimate of PNC}

E[x] = np, \qquad \mathrm{Var}[x] = np(1-p)

E[x/n] = np/n = p

\mathrm{Var}[x/n] = \mathrm{Var}[x]/n^2 = np(1-p)/n^2 = p(1-p)/n

\mathrm{Stdev}[x/n] = \sqrt{p(1-p)/n} \approx \sqrt{\hat{p}(1-\hat{p})/n}

This information will be used later.
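A sketch (NumPy) of the PNC estimate and its standard deviation, simulating n pass/fail tests (the true non-compliance rate p_true is an assumed input, normally unknown):

```python
import numpy as np

rng = np.random.default_rng(0)
p_true, n = 0.05, 400
x = (rng.random(n) < p_true).astype(int)    # 1 = fail, 0 = pass

pnc_hat = x.sum() / n                       # <PNC> = # noncompliant / # tests
stdev_hat = np.sqrt(pnc_hat * (1 - pnc_hat) / n)

print(pnc_hat, stdev_hat)                   # e.g. ~0.05 +/- ~0.011
```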
Continuous Distributions: Plots for Common PDFs
[Figure: thumbnail plots of common continuous pdfs: Uniform, Normal, Log-Normal, Rayleigh, Weibull, Exponential, Gamma, Beta.]

See the following charts for details →
Uniform & Normal Distributions
Uniform distribution:

f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}

F(x) = \begin{cases} 0, & x < a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & x > b \end{cases}

\mu = \text{mean} = \frac{a+b}{2}, \qquad \sigma^2 = \text{variance} = \frac{(b-a)^2}{12}, \qquad \sigma = \text{std dev} = \frac{b-a}{2\sqrt{3}}

\alpha_3 = \sqrt{\beta_1} = 0 = \text{coef. of skewness}, \qquad \alpha_4 = \beta_2 = \frac{9}{5} = \text{coef. of kurtosis}

(Both the uniform and normal distributions are used in RAVE.)

Normal distribution (Gaussian distribution):

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}

\mu = \text{mean}, \qquad \sigma^2 = \text{variance}, \qquad \sigma = \text{std dev}

\alpha_3 = \sqrt{\beta_1} = 0 = \text{coef. of skewness}, \qquad \alpha_4 = \beta_2 = 3 = \text{coef. of kurtosis}

Standard Normal distribution (Unit Normal distribution): define z = (x - \mu)/\sigma

f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \quad -\infty < z < \infty

\mu = \text{mean} = 0, \qquad \sigma^2 = \text{variance} = 1, \qquad \sigma = \text{std dev} = 1

\alpha_3 = \sqrt{\beta_1} = 0 = \text{coef. of skewness}, \qquad \alpha_4 = \beta_2 = 3 = \text{coef. of kurtosis}
Central Limit Theorem and Normal Distribution
- The Central Limit Theorem says:
  - If X = x₁+x₂+x₃+…+xₙ and each variable x_j comes from its own distribution with mean μ_j and stdev σ_j, then the variable Z, defined below, obeys a unit normal distribution, signified by N(0,1):

Z = \frac{\sum_{j=1}^{n} x_j - \sum_{j=1}^{n} \mu_j}{\sqrt{\sum_{j=1}^{n} \sigma_j^2}} \sim N(0,1)

  - This is true as long as no single x_j value overwhelms the rest of the sum.
- What does this mean? Using a simpler form for Z (n draws from one distribution with mean μ and stdev σ):

Z = \frac{\frac{1}{n}\sum_{j=1}^{n} x_j - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)

The pdf of the mean value, x̄, of a variable is a normal distribution, independent of what distribution characterized the variable itself.

[Figure: normal curve of the sample mean with ±1σ/√n, ±2σ/√n, ±3σ/√n marked about the mean of x.]
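A CLT demo sketch (NumPy): means of uniform draws, standardized as above, land close to N(0,1) even though the underlying distribution is flat:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 30, 200_000
mu, sigma = 0.5, np.sqrt(1.0 / 12.0)         # mean/stdev of Uniform(0,1)

xbar = rng.random((trials, n)).mean(axis=1)  # 200k sample means of n=30 draws
z = (xbar - mu) / (sigma / np.sqrt(n))       # standardize each mean

print(z.mean(), z.std())                     # ~0, ~1
print((np.abs(z) < 1.96).mean())             # ~0.95, as N(0,1) predicts
```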
Generating PDFs from Data
- Commercial codes
  - Crystal Ball: http://www.decisioneering.com/
  - EasyFit version 4.3 (Professional): http://www.mathwave.com/
- Need some goodness-of-fit (gof) test:
  - Kolmogorov-Smirnov (K-S)
  - Anderson-Darling (A-D)
  - Cramer-von Mises (C-vM)
  - Chi squared (χ²)
  - others
- Codes developed by J.L. Alderman
  - JFIT (quick)
    - Uses the Johnson family of distributions.
    - 4-parameter distributions.
    - Parameters chosen to match the first 4 central moments of the data.
  - GAFIT (slower, but more accurate fit)
    - Uses the Johnson family.
    - Parameters chosen using Rapid Tool.
Methods Of PDF Fitting
[Screenshot: fitted distributions overlaid on a temperature histogram.]

The 'Crystal Ball' tool and the 'EasyFit' tool both fit the actual data to each of the various density functions in their respective libraries.
Methods Of PDF Fitting
[Screenshot: fitted distributions overlaid on a temperature histogram.]

The PDF that produces the best match, according to user-selectable criteria (Chi-Squared, Kolmogorov-Smirnov (KS), Anderson-Darling (AD)), is selected to represent the data.
N. L. Johnson Method of PDF Determination

The Johnson family of distributions is fitted in the Excel tool JFIT and the commercial tool EasyFit™ by matching the first four standardized central moments of the data to the expressions for the moments of the distributions. GAFIT uses a GA (genetic algorithm) for an optimized fit.
N. L. Johnson Method of PDF Determination

- The Johnson family of distributions is composed of four density functions (normal, lognormal, unbounded, and bounded).
- These members are all related to the unit normal distribution by a mathematical transformation of variables.
- Computations are simplified because they can be done using the unit normal distribution, f(Z) ~ N(0,1).
Calculation of Moments from Data
Standardized Central Moments

- 1st (noncentral): MEAN, symbol x̄. Excel function: AVERAGE(Range)*. Direct calculation (unbiased): \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
- 2nd (central): VARIANCE, symbol s². Excel function: VAR(Range)*. Direct calculation: s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2
- 3rd (central)/s³: SKEWNESS, symbol Sk. Excel function: SKEW(Range)*. Direct calculation: S_k = \frac{n\sum_{i=1}^{n}(x_i-\bar{x})^3}{(n-1)(n-2)\,s^3}
- 4th (central)/s⁴: KURTOSIS, symbol Ku. Excel function: KURT(Range)+3, or Ku* = \frac{n^2-2n+3}{n(n+1)}\cdot\text{KURT} + \frac{3(n-1)}{n+1}*. Direct calculation: K_u = \frac{(n^2-2n+3)\sum_{i=1}^{n}(x_i-\bar{x})^4}{(n-1)(n-2)(n-3)\,s^4} - \frac{3(2n-3)(n-1)}{n(n-2)(n-3)}

*Unbiased values
Regions of Johnson Distributions
[Figure: regions of the Johnson distributions in the plane of β₁ = (skewness)², the normalized third central moment squared (horizontal axis, 0 to 4), versus β₂ = kurtosis, the normalized fourth central moment (vertical axis). For any PDF, kurtosis > skew² + 1; the area below this line is the impossible (forbidden) region. The lognormal line (two-ordinate boundary) separates the unbounded (SU) region from the bounded (SB) region.]
JSB, Bounded Density Function
f(x) = \frac{\delta}{\lambda\sqrt{2\pi}\; y(1-y)}\, \exp\!\left[-\tfrac{1}{2}\left(\gamma + \delta \ln\frac{y}{1-y}\right)^{2}\right], \qquad y = \frac{x-\xi}{\lambda}, \quad \xi \le x \le \xi + \lambda

F(x) = \Phi\!\left(\gamma + \delta \ln\frac{y}{1-y}\right)

The Johnson bounded distribution can fit many different shapes because it has 4 adjustable parameters.
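A sketch (plain Python; the parameter names γ, δ, ξ, λ follow the standard Johnson SB convention, assumed here) of the bounded density and its CDF via the standard normal Φ:

```python
from math import erf, exp, log, pi, sqrt

def jsb_pdf(x, gamma, delta, xi, lam):
    """Johnson SB density on (xi, xi + lam)."""
    y = (x - xi) / lam
    if not 0.0 < y < 1.0:
        return 0.0
    g = gamma + delta * log(y / (1.0 - y))
    return delta / (lam * sqrt(2.0 * pi) * y * (1.0 - y)) * exp(-0.5 * g * g)

def jsb_cdf(x, gamma, delta, xi, lam):
    """Johnson SB CDF: Phi(gamma + delta * ln(y/(1-y)))."""
    y = min(max((x - xi) / lam, 1e-12), 1.0 - 1e-12)
    g = gamma + delta * log(y / (1.0 - y))
    return 0.5 * (1.0 + erf(g / sqrt(2.0)))

print(jsb_pdf(0.5, 0.0, 1.0, 0.0, 1.0))   # symmetric case, peak near mid-range
print(jsb_cdf(0.5, 0.0, 1.0, 0.0, 1.0))   # 0.5 by symmetry
```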
JSU Unbounded Density Function
f(x) = \frac{\delta}{\lambda\sqrt{2\pi}\,\sqrt{y^{2}+1}}\, \exp\!\left[-\tfrac{1}{2}\left(\gamma + \delta \ln\!\left(y + \sqrt{y^{2}+1}\right)\right)^{2}\right], \qquad y = \frac{x-\xi}{\lambda}

F(x) = \Phi\!\left(\gamma + \delta \ln\!\left(y + \sqrt{y^{2}+1}\right)\right)

Again, using 4 parameters allows for better fitting.
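The same sketch idea for the unbounded member (again assuming the standard parameter convention); note that ln(y + √(y²+1)) = asinh(y):

```python
from math import asinh, erf, exp, pi, sqrt

def jsu_pdf(x, gamma, delta, xi, lam):
    """Johnson SU density on the whole real line."""
    y = (x - xi) / lam
    g = gamma + delta * asinh(y)           # asinh(y) = ln(y + sqrt(y^2 + 1))
    return delta / (lam * sqrt(2.0 * pi) * sqrt(y * y + 1.0)) * exp(-0.5 * g * g)

def jsu_cdf(x, gamma, delta, xi, lam):
    y = (x - xi) / lam
    return 0.5 * (1.0 + erf((gamma + delta * asinh(y)) / sqrt(2.0)))

print(jsu_pdf(0.0, 0.0, 1.0, 0.0, 1.0))    # 1/sqrt(2*pi) at the center
print(jsu_cdf(0.0, 0.0, 1.0, 0.0, 1.0))    # 0.5
```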
JFIT Tool
Steps:
1. Enter data in either C6:C9 OR C13:C17, not both!
2. Set the plot range (D22:D24).
3. Enter limits or a target yield below the PDF chart.
4. Press Compute.

[Screenshot of the JFIT worksheet. Example inputs: first four central moments of the data, Mean = 4.00000, Variance = 1.00000, Skew = 0.18226, Kurtosis = 3.05910. Output parameters of the Johnson curve appear if data is inserted in C6:C9. Types: JSL = LogNormal, JSU = Unbounded, JSB = Bounded, JSN = Normal, JST = Two-Ordinate. Problem codes: 0 = None, 1 = Variance <= 0, 2 = Kurt < Skew^2 + 1, 3 = SB did not converge, 4 = Inconsistent input data, 5 = SU did not converge. The worksheet tabulates z, x, P(z), PDF(x), and CDF(x) over the plot range (zmin = -3, zmax = 3, 200 steps) and plots the fitted PDF and CDF. Example limits: LL = 2.1268, UL = 6.0452, Yield(%) = 95.00%.]

Notes: if you use the Excel function KURT to compute the kurtosis, you must add 3 to get the correct value for entry into cell C9. Press ENTER if you change the lower or upper limits or change the targeted yield. Right-click on or near the axis values to change the number formats.
JFIT Tool (Mode 1)
Insert values for the four moments into the top set of cells (C6:C9), then press Compute; the results appear in the lower set of cells.

[Screenshot: example moment inputs (Mean = 4.00000, Variance = 1.00000, Skew = 0.18226, Kurtosis = 3.05910), the Johnson output parameters (type JSL), and the resulting PDF plot.]
JFIT Tool (Mode 2)
22.4
Remove values for the four moments in the
top set of cells.
known
values for Johnson parameters
First Four CentralInsert
Moments
of The Data
Mean
in the lower set of cells.
0.2319
Variance
0.2088
Press compute and the pdf and CDF graphs
Skew
0.1856
Kurtosis
appear for that particular distribution 0.1625
Do NOT insert data in both C6:C9 and C13:C17
PDF(x)




Type
Probs?
Output Parameters of Johnson Curve if data inserted in C6:C9.
3.193742536 Types:
Problems:
PDF
0.2319
1.195191141 JSL=LogNormal
0 None
0.2088
1 Variance <= 0
45.23630054 JSU=Unbounded
0.1856
-0.776525532 JSB=Bounded
2 Kurt < Skew^2 + 1
0.1625
3 Sb did not converge
JSN=Normal
JSB
0.1394
4 Inconsistent Input Data
JST=Two-Ordinate
0.1162
5 SU did not converge
0.0931
PDF(x)

Var
sk
k
21.8
21.2
20
20.6
19.4
18.2
18.8
17
17.6
0
0.1394
0.1162
0.0931
0.0699
0.0468
0.0236
0.0005
-0.5239 1.5
0.0699
0.0468
0.0236
0.0005
-0.523 1.5297 3.5833 5.6370 7.6906 9.7442 11.797 13.851 15.905 17.958 20.012
9
9
5
1
8
4
Lecture 3: PDFs 40
JFit TABS (Bottom of JFit Screen)
- Johnson ST: Two-Ordinate Distributions
- Johnson SN: Normal Distributions
- Johnson SB: Bounded Distributions
- Johnson SU: Unbounded Distributions
- Johnson SL: Logarithmic Distributions
Johnson Type 1: Log Normal (JSL)
Johnson Type 2 Unbounded (JSU)
Johnson Type 3 Bounded (JSB)
Johnson Type 4 Normal (JSN)
Johnson Type 5
Johnson ST Fit: Two-Ordinate Distribution

[Figure: the histogram corresponding to g = 0, h = 0.80, e = 0.022, l = 0.078.]
Using JFIT To Estimate PNC or Yield
Acronyms and Abbreviations
- AD - Anderson-Darling
- CLT - Central Limit Theorem
- JFIT - Johnson Fitting Tool
- KS - Kolmogorov-Smirnov
- LSL - Lower Specification Limit
- PDF - Probability Density Function
- PNC - Probability of Non-Compliance
- USL - Upper Specification Limit
End of PDF Fitting Module (Lecture 3)

Back-up Slides
Distribution Fitting - Preliminary Steps
- This article covers several steps you should consider taking before you analyze your probability data and apply the analysis results.
  - Step 1 - Define The Goals Of Your Analysis
    - The very first step is to define what you are trying to achieve by analyzing your data. You should have a clear understanding of your goals, as this will help you throughout the entire data analysis process. Try answering the following questions:
      - What kind of information would you like to obtain?
      - How will you obtain the information you need?
      - How will you apply that information?
Example:
- Robert is the head of the Customer Support Department at a large company. In order to reduce customer service times and improve the customer experience, he would like to do the following:
- Determine the probability that a customer can be served in 5 minutes or less. To solve this problem, Robert needs to:
  - Perform distribution fitting to sample data (customer service times) for a selected period of time (e.g. last week).
  - Select the best-fitting distribution.
  - Calculate the probability using the cumulative distribution function of the selected distribution.
- If the probability is less than 95%, consider hiring additional customer support staff.
Step 2 - Prepare Data For Distribution Fitting
- Preparing your data for distribution fitting is one of the most important steps you should take, since the analysis results (and thus the decisions you make) depend on whether you correctly collect and specify the input data.
  - Data Format
    - Your data might come in one of the generally accepted formats, depending on the source of the data and how it was collected. You need to make sure the distribution fitting software you are using supports the data format you need; if it doesn't, you might need to convert your data to one of the supported formats.
    - The most commonly used format in probability data analysis is an unordered set of values obtained by observing some random process. The order of values in a data set is not important and does not affect the distribution fitting results. This is one of the fundamental differences between distribution fitting (and probability data analysis in general) and time series analysis, where each data value is connected to some time point at which it was observed.
  - Sample Size
    - The rule of thumb is: the more data you have, the better. In most cases, to get reliable distribution fitting results, you should have at least 75-100 data points available. Note that very large samples (tens of thousands of data points) might cause some computational problems when fitting distributions to data, and you might need to reduce the sample size by selecting a subset of your data. However, in many cases one has only 10 to 30 data points. Expect larger variances!
Step 3 - Decide Which Distributions To Fit
- Before fitting distributions to your data, you should decide which distributions are appropriate based on the additional information you have about the data. This can help narrow your choice to a limited number of distributions before you actually perform distribution fitting.
  - Data Domain - Continuous or Discrete?
    - The easiest part is to determine whether your data is continuous or discrete. If your data can take on real values (for example, 1.5 or -2.33), then you should consider continuous distributions only. On the other hand, if your data can take on integer values (1, 2, -5, etc.) only, then you might want to fit both continuous and discrete distributions.
    - The reason to use continuous distributions to analyze discrete data is that there is a large number of continuous distributions which frequently provide a much better fit than discrete distributions. However, if you are confident that your random data follows a certain discrete distribution, you might want to use that specific distribution rather than continuous models.
  - The Nature of Your Data
    - In most cases, you have not just raw data; you also have some additional information about the data and its properties, how the data was collected, etc. This information might be very useful for narrowing your choice to several probability distributions.
    - Example: if you are analyzing the sales data of a company, it should be clear that this kind of data cannot contain negative values (unless the company sells at a loss), and thus it wouldn't make much sense to fit distributions which can take on negative values (such as the Normal distribution) to your data.
    - In addition, some particular distributions are recommended for use in several specific industries. An obvious example of such an industry is reliability engineering, which makes great use of the Weibull distribution and several additional models (Exponential, Lognormal, Gamma) to perform the analysis of failure data. These distributions are widely used in many other industries, but in reliability engineering they are considered "standard".
Additional Distributions
Geometric distribution
- Probability that an event occurs on the kth trial.
  - Let p = probability of the event occurring in a single trial.
  - f(k) = probability of the event occurring on the kth attempt = (1-p)^{k-1} p. This is called the probability mass function, or pmf.
- Example: find the probability that a capacitor shorts due to puncturing of the dielectric after there have been k voltage spikes. If the probability of a puncture on the jth voltage spike is p_j, then the probability of puncture on the kth spike is
  - f(k) = (1-p_1)(1-p_2)\cdots(1-p_{k-1})\, p_k
- If all the p values are the same, then f(k) = (1-p)^{k-1} p, which is the geometric distribution probability mass function, or pmf.
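A sketch (plain Python; the function name and spike probabilities are illustrative) of the general non-identical-p form and its geometric special case:

```python
def puncture_pmf(k: int, p: list[float]) -> float:
    """P(first puncture on the kth spike) = (1-p1)...(1-p_{k-1}) * p_k."""
    prob = p[k - 1]
    for pj in p[: k - 1]:
        prob *= 1.0 - pj
    return prob

p = [0.1] * 10                      # identical spike probabilities
print(puncture_pmf(3, p))           # (1-0.1)^2 * 0.1 = 0.081
print((1 - 0.1) ** 2 * 0.1)         # geometric pmf (1-p)^(k-1) p: same value
```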
Hypergeometric distribution
Hypergeometric distribution (used in sampling without replacement):

f(x) = P(X = x \mid n, N, M) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}

\text{mean} = \mu = np, \qquad p = \frac{M}{N}

\sigma^2 = np(1-p)\left(\frac{N-n}{N-1}\right) = npq\left(\frac{N-n}{N-1}\right)

- N = population size
- M = # bad units in the population
- n = sample size drawn from N without replacement
- x = # failed units in the sample

Example: N = 600 missiles in inventory; M = 80 missiles were worked on by Operator #323, and they are believed to have been defectively assembled by this operator. If we take n = 20 missiles randomly from the inventory, then f(x) = the probability of x defective missiles in the sample; f(0) = 0.054.
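A sketch (plain Python) evaluating the hypergeometric pmf; it reproduces the slide's f(0) = 0.054 for the missile example:

```python
from math import comb

def hypergeom_pmf(x: int, n: int, N: int, M: int) -> float:
    """P(x bad units in a sample of n drawn w/o replacement from N with M bad)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# N=600 missiles, M=80 suspect, sample n=20
print(hypergeom_pmf(0, 20, 600, 80))   # ~0.054: chance the sample misses all 80
```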
Poisson Distribution
- Used extensively for situations involving counting of events.
  - e.g. the probability of k false alarms during a time period T = 60 seconds when the average alarm rate is λ (false alarms/sec).
  - The pmf is f(k) = (λT)^k e^{-λT}/k! = (60λ)^k e^{-60λ}/k!
  - E[k] = λT, σ = (λT)^{1/2}
- Also used in predicting the number of failed units over a time period T_W, given the mean time to failure MTTF = 1/λ.
  - f(k units failing in time T_W) = (λT_W)^k e^{-λT_W}/k!
- If you want a 90% probability of x or fewer failures in time span T_W, then find
  - F(x) = 0.90 = e^{-λT_W} + (λT_W)^1 e^{-λT_W}/1! + (λT_W)^2 e^{-λT_W}/2! + … + (λT_W)^x e^{-λT_W}/x! (solve numerically for T_W, the warranty period).
  - If you were considering the case of x = 0 failures during the warranty period, then you could set the warranty period T_W = -ln(0.9)/λ ≈ 0.105/λ years.
- If your warranty period is T_W = 1 year and the MTTF = 10 years, then you should expect only 1 - e^{-0.1·1} ≈ 9.5% of units produced to have one or more failures during that time.
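A sketch (plain Python) that solves F(x) = 0.90 for the warranty period T_W by bisection (λ and x are example inputs; the helper names are illustrative):

```python
from math import exp, factorial

def poisson_cdf(x: int, mu: float) -> float:
    """P(k <= x) for a Poisson with mean mu = lambda * T_W."""
    return sum(mu**k * exp(-mu) / factorial(k) for k in range(x + 1))

def warranty_period(lam: float, x: int, target: float = 0.90) -> float:
    """Largest T_W with P(x or fewer failures in T_W) >= target, via bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(x, lam * mid) >= target:
            lo = mid                  # still meets the 90% goal; push T_W out
        else:
            hi = mid
    return lo

lam = 0.1                             # failures/year, i.e. MTTF = 10 years
print(warranty_period(lam, 0))        # ~1.05 years, matching -ln(0.9)/lambda
```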
Uniform Discrete Distribution
Assume b > a are integers, giving n uniformly distributed points a, a+1, …, b.

f(x \mid a, b) = \frac{1}{b-a+1} \text{ for } a \le x \le b; \qquad 0 \text{ otherwise}

F(x) = 0 \text{ for } x < a; \qquad \frac{x-a+1}{b-a+1} \text{ for } a \le x \le b; \qquad 1 \text{ for } x \ge b

- There are n = b - a + 1 discrete uniformly distributed data points.
  - Mean = (a+b)/2
  - Var(X) = [(b-a+1)² - 1]/12
Continuous distributions
Logistic distribution:

f(x) = \frac{1}{b}\, \frac{e^{-\frac{x-a}{b}}}{\left(1 + e^{-\frac{x-a}{b}}\right)^{2}}, \qquad F(x) = \frac{1}{1 + e^{-\frac{x-a}{b}}}

\text{mean} = a = \text{median} = \text{mode}, \qquad \text{variance} = \frac{\pi^2 b^2}{3}, \qquad \alpha_3 = 0, \qquad \alpha_4 = 4.2

[Figure: bell-shaped logistic pdf plotted from -2 to 4.]

Gamma distribution (Erlang for η = integer):

f(x) = P(X = x) = \begin{cases} \dfrac{x^{\eta-1} e^{-x/\beta}}{\beta^{\eta}\,\Gamma(\eta)}, & x \ge 0 \\ 0, & x < 0 \end{cases}

\mu = \eta\beta = \text{mean}, \qquad \sigma^2 = \eta\beta^2 = \text{variance}, \qquad \alpha_3 = \frac{2}{\sqrt{\eta}}, \qquad \alpha_4 = 3 + \frac{6}{\eta}

The gamma is the distribution of the time up to having exactly η = k events occur. Used in queueing theory, reliability, etc.
Beta & Weibull distributions
=
 +
Beta distribution
 x 1 1  x 

f ( x)  
B( ,  )
0

B( ,  ) 
 1
,0  x  1
2 
,otherwise
( )( )
 beta function
(   )
xmode
2
1

 + 2  +  1
 1

 +  2
0
0.5
Weibull distribution


f (t )   


t 
 
 
 1
e
0
t 
 
 

,  ,   0, t  0
,otherwise
F (t )  P (T  t )  1  e
 t 
 
 

 1
   1  
 
  2
 1 
 2   2  1     2 1   
  
  
Exponential distribution
 e t ,   0, t  0
f (t )  
 0 ,otherwise
F (t )  P(T  t )  1  e t

1

2 
1
2
 3  1  2
4  2  9
Lecture 3: PDFs 60
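A sketch (NumPy) checking the Weibull mean and variance formulas above against simulated draws (β and η are example values):

```python
import numpy as np
from math import gamma

beta, eta = 1.5, 10.0
mu = eta * gamma(1.0 + 1.0 / beta)
var = eta**2 * (gamma(1.0 + 2.0 / beta) - gamma(1.0 + 1.0 / beta) ** 2)

rng = np.random.default_rng(0)
t = eta * rng.weibull(beta, 1_000_000)   # numpy's weibull is the eta=1 form

print(t.mean(), mu)                      # both ~9.03
print(t.var(), var)                      # both ~37.6
```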
Rayleigh & Triangle distributions
Rayleigh distribution:

f(x) = \begin{cases} \dfrac{x}{\sigma^2}\, e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2}, & \sigma > 0,\; x \ge 0 \\ 0, & \text{otherwise} \end{cases}

F(x) = P(X \le x) = 1 - e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2}

\mu = \sigma\sqrt{\frac{\pi}{2}}, \qquad \sigma_X^2 = \sigma^2\left(2 - \frac{\pi}{2}\right)

\alpha_3 = \frac{2\sqrt{\pi}\,(\pi-3)}{(4-\pi)^{3/2}} \approx 0.63, \qquad \alpha_4 = \beta_2 = \frac{32 - 3\pi^2}{(4-\pi)^2} \approx 3.245

Triangle distribution:

f(x) = \begin{cases} \dfrac{2(x-a)}{(b-a)(m-a)}, & a \le x \le m \\[4pt] \dfrac{2(b-x)}{(b-a)(b-m)}, & m \le x \le b \end{cases}

- a = lower limit, b = upper limit, m = most probable value (mode)

\mu = \frac{a+m+b}{3}, \qquad \sigma^2 = \frac{a^2 + m^2 + b^2 - ab - am - bm}{18}

\alpha_3 = \frac{\sqrt{2}\,(a+b-2m)(2a-b-m)(a-2b+m)}{5\,(a^2+b^2+m^2-ab-am-bm)^{3/2}}, \qquad \alpha_4 = \frac{324}{135} = 2.4
Lognormal and Normal distributions
Lognormal distribution (used to model mean time to repair for availability calculations):

f(x) = \begin{cases} \dfrac{1}{x\,\sigma_{\ln}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln}}{\sigma_{\ln}}\right)^2}, & 0 < x < \infty \\ 0, & \text{otherwise} \end{cases}

\mu = \text{mean} = e^{\mu_{\ln} + \sigma_{\ln}^2/2} = \text{MTTF}

\sigma^2 = \text{variance} = \left(e^{\sigma_{\ln}^2} - 1\right) e^{2\mu_{\ln} + \sigma_{\ln}^2}

m = \text{median} = e^{\mu_{\ln}}

Normal distribution (Gaussian distribution):

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}

\mu = \text{mean}, \qquad \sigma^2 = \text{variance}, \qquad \sigma = \text{std dev}, \qquad \alpha_3 = 0, \qquad \alpha_4 = 3

Many statistical process control calculations assume normal distributions, and incorrectly so!

Standard Normal (Unit Normal) distribution: any normal distribution can be scaled to the unit normal distribution; this is useful when using tables to look up values. Define z = (x - \mu)/\sigma:

f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \quad -\infty < z < \infty

\mu = \text{mean} = 0, \qquad \sigma^2 = \text{variance} = 1, \qquad \sigma = \text{std dev} = 1, \qquad \alpha_3 = 0, \qquad \alpha_4 = 3
- Mixed distribution, used to model bimodal processes:

f(x) = p\,g_1(x) + (1-p)\,g_2(x)

\mu = p\mu_1 + (1-p)\mu_2

\sigma^2 = p\left(\sigma_1^2 + \delta_1^2\right) + (1-p)\left(\sigma_2^2 + \delta_2^2\right)

\mu_3 = S = p\left(S_1 + 3\sigma_1^2\delta_1 + \delta_1^3\right) + (1-p)\left(S_2 + 3\sigma_2^2\delta_2 + \delta_2^3\right)

\mu_4 = K = p\left(K_1 + 4S_1\delta_1 + 6\sigma_1^2\delta_1^2 + \delta_1^4\right) + (1-p)\left(K_2 + 4S_2\delta_2 + 6\sigma_2^2\delta_2^2 + \delta_2^4\right)

- Definitions:

\mu = \int x f(x)\,dx, \qquad \mu_j = \int x\, g_j(x)\,dx

\mu_r = \int (x-\mu)^r f(x)\,dx, \qquad \mu_{rj} = \int (x-\mu_j)^r g_j(x)\,dx

\delta_j = \mu_j - \mu

\sigma_1^2 = \mu_{21} = \int (x-\mu_1)^2 g_1(x)\,dx, \qquad \sigma_2^2 = \mu_{22} = \int (x-\mu_2)^2 g_2(x)\,dx

S_j = \mu_{3j} = \int (x-\mu_j)^3 g_j(x)\,dx, \qquad K_j = \mu_{4j} = \int (x-\mu_j)^4 g_j(x)\,dx
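A closing sketch (NumPy; the two normal components are illustrative choices) that checks the mixture mean and variance formulas against a simulated bimodal sample:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.3
mu1, s1 = 0.0, 1.0
mu2, s2 = 6.0, 2.0

# Draw from the mixture f(x) = p*g1(x) + (1-p)*g2(x)
pick = rng.random(1_000_000) < p
x = np.where(pick, rng.normal(mu1, s1, 1_000_000), rng.normal(mu2, s2, 1_000_000))

mu = p * mu1 + (1 - p) * mu2
d1, d2 = mu1 - mu, mu2 - mu
var = p * (s1**2 + d1**2) + (1 - p) * (s2**2 + d2**2)

print(x.mean(), mu)      # both ~4.2
print(x.var(), var)      # both ~10.66
```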