Characterizing Variation and Distributions
IENG 486 - Lecture 05
Interpreting Variation Using
Distributions
7/16/2015
IENG 486 Statistical Quality &
Process Control
1
Assignment:
Reading:
Chapter 1: (1.1, 1.3 – 1.4.5)
Cursory – get Fig. 1.12, p. 34; Deming Management, 1.4.4 Liability
Chapter 2: (2.2 – 2.7)
Cursory – Define, Measure, Analyze, Improve, Control
Chapter 3: (3.1, 3.3.1, 3.4.1)
HW 1: Chapter 3 Exercises:
1, 3, 4 – using exam calculator
10 (use Normal Plots spreadsheet from Materials page)
43, 46, 47 (use Exam Tables from Materials page – Normal Dist.)
Distributions
Distributions quantify the probability of an event
Events near the mean are most likely to occur, events
further away are less likely to be observed
[Figure: Normal curve centered at μ = 37 with axis ticks at 30.4 (−3σ), 32.6 (−2σ), 34.8 (−σ), 37 (μ), 39.2 (+σ), 41.4 (+2σ), 43.6 (+3σ); the 35.0 ± 2.5 interval is marked]
Normal Distribution
[Figure: Normal Distribution; density f(x) versus x from −4 to 4, for mean 0 and std. dev. 1]
Notation: r.v. x ~ N(μ, σ)
This is read: “x is normally distributed with mean μ and standard deviation σ.”
Standard Normal Distribution
r.v. z ~ N(0, 1)
(z represents a Standard Normal r.v.)
Simple Interpretation of Standard
Deviation of Normal Distribution
P(μ − σ ≤ x ≤ μ + σ) = .6827
P(μ − 2σ ≤ x ≤ μ + 2σ) = .9545
P(μ − 3σ ≤ x ≤ μ + 3σ) = .9973
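The three coverage probabilities can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is an assumption made for illustration and is not part of the lecture.

```python
from statistics import NormalDist


def prob_within(k: float) -> float:
    """Probability that a Normal r.v. falls within k standard deviations of its mean."""
    std_normal = NormalDist()  # Standard Normal: mean 0, std. dev. 1
    # Area between -k and +k is the difference of the two cumulative areas.
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} std. dev.: {prob_within(k):.4f}")
```

Because the result depends only on k, the same probabilities apply to any Normal distribution, whatever its mean and standard deviation.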
Standard Normal Distribution
The Standard Normal Distribution has a mean (μ) of 0
and a variance (σ²) of 1 (thus, standard deviation is also 1)
Total area under the curve, Φ(z), from z = –∞ to z = +∞
is exactly 1
The curve is symmetric about the mean
Half of the total area lies on either side, so:
Φ(–z) = 1 – Φ(z)
[Figure: Standard Normal curve with the area Φ(z) and the point z marked]
Standard Normal Distribution
How likely is it that we would observe a data point
more than 2.57 standard deviations beyond the
mean?
The area under the curve beyond z = 2.57 is found by
using the table on pp. 693-694: look up the
cumulative area from –∞ to z = 2.57, then subtract that
cumulative area from 1.
Answer: 1 – .99492 = .00508, or about 5 times in 1000
[Figure: Standard Normal curve with the area Φ(z) and the point z marked]
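The table lookup can be reproduced in code. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the printed table; the names `std_normal` and `upper_tail` are assumptions chosen for this example.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def upper_tail(z: float) -> float:
    """Area under the Standard Normal curve to the right of z."""
    # cdf(z) is the cumulative area from minus infinity up to z (the table entry).
    return 1.0 - std_normal.cdf(z)


tail = upper_tail(2.57)
print(f"P(z > 2.57) = {tail:.5f}")

# By symmetry, the area below -2.57 is the same as the area above +2.57.
print(f"P(z < -2.57) = {std_normal.cdf(-2.57):.5f}")
```

This is the one-sided probability. If "beyond the mean" were taken to mean either direction, the two tails would be added together.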
What if the distribution isn’t a Standard
Normal Distribution?
If it is from any Normal Distribution, we can
express the difference from a sample point to
the population mean in units of the standard
deviation, and this converts it to a Standard
Normal Distribution.
Conversion formula is:
z = (x − μ) / σ
where:
x is the sample location point,
μ is the population mean, and
σ is the population standard deviation.
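As a minimal sketch, the conversion is a one-line function. The name `standardize` is an assumption, and the sample values reuse the mean of 37.0 and standard deviation of 2.20 from the lecture's process example.

```python
def standardize(x: float, mu: float, sigma: float) -> float:
    """Express the distance from x to the mean in units of the standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# A point one standard deviation above the mean converts to z of about +1.
z = standardize(39.2, mu=37.0, sigma=2.20)
print(f"z = {z:.3f}")
```

Once a point is expressed as z, its cumulative probability can be read from the Standard Normal table.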
What if the distribution isn’t even a
Normal Distribution?
The Central Limit Theorem allows us to take the sum
of several means, regardless of their distribution, and
approximate this sum using the Normal Distribution if
the number of observations is large enough.
Most assemblies are the result of adding together
components, so if we take the sum of the means for each
component as an estimate for the entire assembly, we
meet the CLT criteria.
If we take the mean of a sample from a distribution, we
meet the CLT criteria (think of how the mean is computed).
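A small simulation can illustrate the idea. This sketch draws from a Uniform(0, 1) distribution, which is clearly not Normal, and looks at the means of repeated samples. The sample size of 30, the 2000 repetitions, the seed, and the name `sample_means` are all assumptions chosen for the illustration.

```python
import random
from statistics import mean, stdev


def sample_means(n_obs: int, n_samples: int, seed: int = 486) -> list:
    """Return the means of n_samples samples, each of n_obs Uniform(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [mean([rng.uniform(0.0, 1.0) for _ in range(n_obs)])
            for _ in range(n_samples)]


means = sample_means(n_obs=30, n_samples=2000)
center = mean(means)
spread = stdev(means)

# For a Normal distribution about 68% of values lie within one std. dev. of the mean.
within_one = sum(abs(m - center) <= spread for m in means) / len(means)
print(f"mean of sample means: {center:.3f}")
print(f"fraction within one std. dev.: {within_one:.3f}")
```

If the CLT approximation is working, the fraction within one standard deviation should land near the .6827 figure for a Normal distribution, even though the individual observations are uniform.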
Example: Process Yield
Specifications are often set irrespective of process
distribution, but if we understand our process we can
estimate yield / defects.
Assume a specification calls for a value of 35.0 ± 2.5.
Assume the process has a distribution that is Normally
distributed, with a mean of 37.0 and a standard deviation of
2.20.
Estimate the proportion of the process output that will meet
specifications.
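One way to work this example is sketched below, using `statistics.NormalDist` in place of the exam tables. The variable names (`LSL`, `USL`, `yield_fraction`) are assumptions for this sketch. The specification limits follow from 35.0 ± 2.5.

```python
from statistics import NormalDist

TARGET, TOLERANCE = 35.0, 2.5
LSL = TARGET - TOLERANCE  # lower specification limit, 32.5
USL = TARGET + TOLERANCE  # upper specification limit, 37.5

process = NormalDist(mu=37.0, sigma=2.20)

# Convert each limit to a z-value, as would be done for a table lookup.
z_lower = (LSL - process.mean) / process.stdev
z_upper = (USL - process.mean) / process.stdev

# Yield is the area under the process curve between the two limits.
yield_fraction = process.cdf(USL) - process.cdf(LSL)

print(f"z at LSL = {z_lower:.2f}, z at USL = {z_upper:.2f}")
print(f"estimated yield = {yield_fraction:.3f}")
```

With these numbers the z-values are about −2.05 and +0.23, and the estimated yield comes out near 57%. Most of the loss is on the high side, because the process mean of 37.0 sits close to the upper limit of 37.5.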
Continuous & Discrete Distributions
Continuous: Probability of a range of outcomes is the area under the PDF (integration)
Discrete: Probability of a range of outcomes is the area under the PDF (sum discrete outcomes)
[Figures: continuous Normal curve with ticks from 30.4 (−3σ) to 43.6 (+3σ) around μ = 37 and the 35.0 ± 2.5 interval marked; discrete histogram on an axis from 30 to 42 with μ marked]
Discrete Distribution Example
Sum of two six-sided dice:
Outcomes range from 2 to 12.
Counting the possible ways to obtain each individual sum forms a histogram
What is the most frequently occurring sum that you could roll?
Most likely outcome is a sum of 7 (there are 6 ways to
obtain it)
What is the probability of obtaining the most likely sum in a
single roll of the dice?
6 / 36 = .167
What is the probability of obtaining a sum greater than 2 and
less than 11?
32 / 36 = .889
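The counts behind these answers can be enumerated directly. This is a minimal sketch; the names `ways`, `prob_between`, and `most_likely` are assumptions for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls produce each sum (the histogram).
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
total = sum(ways.values())


def prob_between(lo: int, hi: int) -> Fraction:
    """Probability that the sum falls in lo..hi inclusive, by summing discrete outcomes."""
    return Fraction(sum(ways[s] for s in range(lo, hi + 1)), total)


most_likely = max(ways, key=ways.get)

print(f"most likely sum: {most_likely} ({ways[most_likely]} ways)")
print(f"P(sum = {most_likely}) = {float(prob_between(most_likely, most_likely)):.3f}")
# "Greater than 2 and less than 11" means the sums 3 through 10.
print(f"P(2 < sum < 11) = {float(prob_between(3, 10)):.3f}")
```

Summing the histogram bars over a range is the discrete counterpart of integrating a continuous density over that range.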
How do we know what the distribution
is when all we have is a sample?
Theory – “CLT applies to measurements taken consisting of
many assemblies…”
Experience – “past use of a distribution has generated very
good results…”
“Testing” – combination of the above … in this case, anyway!
If we know the generating function for a distribution, we can
construct a grid (probability paper) that will allow us to observe
a straight line when sufficient data from that distribution are
plotted on the grid
Easiest grid to create is the Standard Normal Distribution …
because it is an easy transformation to “standard” parameters
Normal Probability Plots
Take raw data and count observations (n)
Set up a column of j values (1 to n)
Compute F(zj) for each j value
F(zj) = (j - 0.5)/n
Get zj value for each F(zj) in Standard Normal Table
Find table entry(F(zj)), then read index value (zj)
Set up a column of sorted, observed data
Sorted in increasing value
Plot zj values versus sorted data values
Approximate with a line sketched through the 25% and 75% points
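The steps above can be sketched in code. Here `NormalDist().inv_cdf` stands in for the reverse lookup in the Standard Normal table. The function name `normal_plot_points` and the sample data are hypothetical, made up only to show the mechanics.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Return (z_j, x_j) pairs for a Normal probability plot of the data."""
    xs = sorted(data)          # observed data, sorted in increasing value
    n = len(xs)                # number of observations
    std_normal = NormalDist()  # Standard Normal: mean 0, std. dev. 1
    points = []
    for j, x in enumerate(xs, start=1):
        f = (j - 0.5) / n              # F(z_j) = (j - 0.5) / n
        z = std_normal.inv_cdf(f)      # z_j whose cumulative area equals F(z_j)
        points.append((z, x))
    return points


# Hypothetical measurements, not taken from the lecture.
sample = [36.1, 38.4, 35.2, 37.0, 39.9, 34.3, 37.6]
for z, x in normal_plot_points(sample):
    print(f"z = {z:+.3f}   x = {x:.1f}")
```

Plotting the z column against the sorted data column gives the Normal probability plot. The Normal Plots spreadsheet mentioned in the assignment presumably performs the same calculation, though that is an assumption.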
Interpreting Normal Plots
Assess Equal-Variance and Normality assumptions
Data from a Normal sample should tend to fall along the line, so
if a “fat pencil” covers almost all of the points, then a normality
assumption is supported
The slope of the line reflects the variance of the sample, so
equal slopes support the equal variance assumption
Theoretically:
Sketched line should intercept the zj = 0 axis at the mean value
Practically:
Close is good enough for comparing means
Closer is better for comparing variances
If the slopes differ much for two samples, use a test that
assumes the variances are not the same
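The sketched line can also be approximated numerically. The sketch below assumes the line is drawn through the sample's 25% and 75% points, reads the value at z = 0 as an estimate of the mean, and takes the change in data value per unit of z as an estimate of the standard deviation. The function name `line_from_quartiles` and the use of `statistics.quantiles` for the quartiles are assumptions, not the course's prescribed method.

```python
from statistics import NormalDist, quantiles


def line_from_quartiles(data):
    """Return (value at z = 0, data units per unit z) for a line through the quartiles."""
    q1, _median, q3 = quantiles(data, n=4)  # 25%, 50%, and 75% points of the sample
    z75 = NormalDist().inv_cdf(0.75)        # z at the 75% point; the 25% point is -z75
    slope = (q3 - q1) / (2 * z75)           # reflects the spread (std. dev.)
    intercept = (q1 + q3) / 2               # where the line crosses z = 0 (mean)
    return intercept, slope


# Hypothetical measurements, not taken from the lecture.
sample = [36.1, 38.4, 35.2, 37.0, 39.9, 34.3, 37.6, 36.8, 33.9, 38.9]
center, spread = line_from_quartiles(sample)
print(f"estimated mean = {center:.2f}, estimated std. dev. = {spread:.2f}")
```

Comparing the `spread` value for two samples is a numeric version of checking whether the slopes on their plots look alike.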
Normal Probability Plots
Tools for constructing Normal Probability Plots:
(Normal) Probability Paper
In-class handout
Normal Plots Template
Materials Page on course website
Interpretation
Fat Pencil Test