File - freesixsigmasite.com

Analyze Phase
Hypothesis Testing Normal Data
Part 1
Welcome to Analyze
“X” Sifting
Inferential Statistics
Intro to Hypothesis Testing
Sample Size
Hypothesis Testing ND P1
Testing Means
Analyzing Results
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
LSS Green Belt v11.1 MT - Analyze Phase
2
© Open Source Six Sigma, LLC
Test of Means (t-tests)
t-tests are used:
– To compare a Mean against a target.
• e.g., the team made improvements and wants to compare
the Mean against a target to see if they met the target.
– To compare Means from two different samples.
• e.g., machine one to machine two.
• e.g., supplier one quality to supplier two quality.
– To compare paired data.
• Comparing the same part before and after a given process.
They don’t look the
same to me!
1 Sample t
A 1-sample t-test is used to compare an expected population Mean to a
target.
[Figure: distribution of the sample Mean, μsample, compared with the Target]
MINITABTM performs a one sample t-test or t-confidence interval for the
Mean.
Use 1-sample t to compute a confidence interval and perform a Hypothesis
Test of the Mean when the population Standard Deviation, σ, is unknown.
For a one or two-tailed 1-sample t:
– H0: μsample = μtarget
– Ha: μsample ≠, <, > μtarget
If P-value > 0.05 fail to reject Ho
If P-value < 0.05 reject Ho
1 Sample t-test Sample Size
[Figure: a population distribution with target T, and the sampling distributions of the Mean for n = 2 and n = 30. With n = 2 we cannot tell the difference between the sample and the target; with n = 30 we can tell the difference between the sample and the target.]
$$SE\ Mean = \frac{s}{\sqrt{n}}$$
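The effect of sample size on the Standard Error of the Mean can be sketched in a few lines of Python. The Standard Deviation of 1 is an assumed value chosen for illustration; it does not come from the slide.

```python
import math

def se_mean(s, n):
    """Standard Error of the Mean: s / sqrt(n)."""
    return s / math.sqrt(n)

s = 1.0  # assumed Standard Deviation, for illustration only
for n in (2, 30):
    print(f"n = {n:2d}  SE Mean = {se_mean(s, n):.4f}")
```

The larger sample gives a much narrower distribution of the Mean, which is why a difference from the target becomes visible at n = 30 but not at n = 2.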
Sample Size
Three fields must be filled in
and one left blank.
Sample Size
Power and Sample Size
1-Sample t Test
Testing Mean = null (versus not = null)
Calculating power for Mean = null + difference
Alpha = 0.05 Assumed Standard Deviation = 1

Sample Size  Power  Difference
10           0.9    1.15456
15           0.9    0.90087
20           0.9    0.76446
25           0.9    0.67590
30           0.9    0.61245
35           0.9    0.56408
40           0.9    0.52564

The various sample sizes show how much of a difference can be detected assuming a Standard Deviation = 1.
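For a rough feel of where these differences come from, the sketch below uses a normal approximation. This is an assumption made to keep the example in the Python standard library; it is not the exact calculation MINITAB performs, and it understates the detectable difference for small samples (about 1.03 versus 1.15456 at n = 10), moving closer to the table as n grows. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def approx_detectable_difference(n, alpha=0.05, power=0.9, sd=1.0):
    """Normal approximation to the two-sided detectable difference (1-sample)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_power = z.inv_cdf(power)           # power = 1 - beta
    return (z_alpha + z_power) * sd / math.sqrt(n)

for n in (10, 15, 20, 25, 30, 35, 40):
    print(n, round(approx_detectable_difference(n), 4))
```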
1-Sample t Example
1. Practical Problem:
• We are considering changing suppliers for a part we currently purchase
from a supplier that charges us a premium for the hardening process.
• The proposed new supplier has provided us with a sample of their
product. They have stated they can maintain a given characteristic of 5
on their product.
• We want to test the samples to determine if their claim is accurate.
2. Statistical Problem:
Ho: μN.S. = 5
Ha: μN.S. ≠ 5
3. 1-sample t-test (population Standard Deviation unknown,
comparing to target).
α = 0.05
β = 0.10
Example
4. Sample Size:
• Open the MINITAB™ worksheet: “Exh_Stat.MTW”.
• Use the C1 column: Values
– In this case, the new supplier sent 9 samples for evaluation.
– How much of a difference can be detected with this sample?
1-Sample t Example
This means we will be able to
detect a difference of only 1.24 if
the population has a Standard
Deviation of 1 unit.
MINITABTM Session Window
Power and Sample Size
1-Sample t Test
Testing Mean = null (versus not = null)
Calculating power for Mean = null + difference
Alpha = 0.05 Assumed Standard Deviation = 1

Sample Size  Power  Difference
9            0.9    1.23748
Example: Follow the Road Map
5. State Statistical Solution
Stat > Basic Statistics > Normality Test…
Are the data in the values column Normal?
[Figure: Normal Probability Plot of Values (Percent versus Values). Mean = 4.789, StDev = 0.2472, N = 9, AD = 0.327, P-Value = 0.442]
1-Sample t Example
Click “Graphs”
-Select all 3
Click “Options…”
- In CI enter ’95’
Histogram of Values
[Figure: Histogram of Values (with Ho and 95% t-confidence interval for the mean); Frequency versus Values]
Note our target Mean (represented by red Ho) is outside our population confidence boundaries, which tells us there is a significant difference between the population and target Mean.
Box Plot of Values
[Figure: Boxplot of Values (with Ho and 95% t-confidence interval for the mean)]
Individual Value Plot (Dot Plot)
[Figure: Individual Value Plot of Values (with Ho and 95% t-confidence interval for the mean)]
Session Window
One-Sample T: Values
Test of mu = 5 vs not = 5 (Ho: mu = 5; Ha: mu not = 5)

Variable  N  Mean     StDev    SE Mean  95% CI              T      P
Values    9  4.78889  0.24721  0.08240  (4.59887, 4.97891)  -2.56  0.034

$$s = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}} \qquad SE\ Mean = \frac{s}{\sqrt{n}}$$

T-Calc = (Observed – Expected) / SE Mean
T-Calc = (X-bar – Target) / Standard Error
T-Calc = (4.7889 – 5) / 0.0824 = –2.56
N – sample size
Mean – calculated mathematical average
StDev – calculated individual Standard Deviation (classical method)
SE Mean – calculated Standard Deviation of the distribution of the Means
95% CI – Confidence Interval that our population average will fall between 4.5989 and 4.9789
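The Session Window arithmetic can be checked by hand. The sketch below recomputes SE Mean and T-Calc from the N, Mean and StDev reported above; the variable names are illustrative only.

```python
import math

# Summary values reported in the Session Window
n, mean, stdev, target = 9, 4.78889, 0.24721, 5.0

se_mean = stdev / math.sqrt(n)        # Standard Error of the Mean
t_calc = (mean - target) / se_mean    # observed minus expected, over SE Mean

print(f"SE Mean = {se_mean:.5f}")
print(f"T-Calc  = {t_calc:.2f}")
```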
Evaluating the Results
Since the P-value of 0.034 is less than 0.05 reject the null hypothesis.
Based on the samples given there is a difference between the
average of the sample and the desired target.
Ho
X
6. State Practical Conclusions
The new supplier’s claim they can meet the target of 5 for the
hardness is not correct.
Manual Calculation of 1- Sample t
Let’s compare the manual calculations to what the
computer calculates.
– Calculate t-statistic from data:
$$t = \frac{\bar{X} - \text{Target}}{s/\sqrt{n}} = \frac{4.789 - 5.000}{0.2472/\sqrt{9}} = -2.56$$
– Determine critical t-value from t-table in reference section.
• When the alternative hypothesis has a not equal sign it is a two-sided test.
• Split the α in half and read from the 0.975 column in the t-table for n – 1 (9 – 1) degrees of freedom.
Manual Calculation of 1- Sample t
T - Distribution
degrees of
freedom   .600   .700   .800   .900   .950   .975    .990    .995
1         0.325  0.727  1.376  3.078  6.314  12.706  31.821  63.657
2         0.289  0.617  1.061  1.886  2.920  4.303   6.965   9.925
3         0.277  0.584  0.978  1.638  2.353  3.182   4.541   5.841
4         0.271  0.569  0.941  1.533  2.132  2.776   3.747   4.604
5         0.267  0.559  0.920  1.476  2.015  2.571   3.365   4.032
6         0.265  0.553  0.906  1.440  1.943  2.447   3.143   3.707
7         0.263  0.549  0.896  1.415  1.895  2.365   2.998   3.499
8         0.262  0.546  0.889  1.397  1.860  2.306   2.896   3.355
9         0.261  0.543  0.883  1.383  1.833  2.262   2.821   3.250
10        0.260  0.542  0.879  1.372  1.812  2.228   2.764   3.169

[Figure: t-distribution centered at 0 with Critical Regions of α/2 = .025 in each tail, beyond –2.306 and 2.306; the calculated t of –2.56 falls in the lower Critical Region.]
The data supports the alternative hypothesis that the estimate for the Mean of the population is not 5.0.
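The table lookup can be mirrored in code. The sketch below compares the calculated t with the critical value read from the table, then estimates the two-sided P-value by numerically integrating the t density with Simpson's rule. The integration approach is an assumption chosen to stay within the Python standard library; it is not how MINITAB computes the P-value, but the result should land close to the 0.034 reported in the Session Window.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 minus the area between -|t| and +|t|."""
    b = abs(t)
    h = b / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3           # area from 0 to |t|
    return 1 - 2 * half_area

t_calc, df, t_crit = -2.56, 8, 2.306    # t_crit: 0.975 column, 8 degrees of freedom
print("reject Ho" if abs(t_calc) > t_crit else "fail to reject Ho")
print(f"P-value = {two_sided_p(t_calc, df):.3f}")
```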
Confidence Intervals for Two-Sided t-test
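The formula for the confidence interval of a two-sided t-test is:
$$\bar{X} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$
or
$$\bar{X} \pm t_{crit}\,SE_{mean} = 4.7889 \pm 2.306 \times 0.0824$$
4.5989 to 4.9789
[Figure: confidence interval from 4.5989 to 4.9789 around X-bar = 4.7889, with Ho outside the interval]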
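As a quick check of the interval, the sketch below plugs the Session Window values and the table's critical t into the formula. The variable names are illustrative.

```python
import math

n, mean, stdev = 9, 4.78889, 0.24721   # from the Session Window
t_crit = 2.306                          # t-table, 0.975 column, n - 1 = 8 df
target = 5.0

se_mean = stdev / math.sqrt(n)
lower = mean - t_crit * se_mean
upper = mean + t_crit * se_mean

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("target inside CI" if lower <= target <= upper else "target outside CI")
```

The target of 5 sits just above the upper limit, which agrees with rejecting the null hypothesis at α = 0.05.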
1-Sample t Exercise
Exercise objective: Utilize what you have learned to
conduct and analyze a one sample t-test using
MINITABTM.
1. The last engineering estimation said we would
achieve a product with average results of 32 parts per
million (ppm).
2. We want to test if we are achieving this performance
level; we want to know if we are on target with 95%
confidence in our answer. Use worksheet
HYPOTTESTSTUD with data in column “ppm VOC”.
3. Are we on Target?
1-Sample t Exercise: Solution
Since we do not know the population Standard Deviation we
will use the 1 sample T test to determine if we are at Target.
1-Sample t Exercise: Solution
After selecting column C1 and
setting “Test mean” to 32.0,
click “Graphs…” and select
“Histogram of data” to get a
good visualization of the
analysis.
Depending on the test you are
running you may need to
select “Options…” to set your
desired Confidence Interval
and hypothesis. In this case
the MINITABTM defaults are
what we want.
1-Sample t Exercise: Solution
Because we used the option of “Graphs…” we get a nice visualization of the data in a histogram AND a plot of the null hypothesis relative to the confidence interval of the population Mean.
[Figure: Histogram of ppm VOC (with Ho and 95% t-confidence interval for the mean); Frequency versus ppm VOC]
Because the null hypothesis is within the confidence interval you know we will “fail to reject” the null hypothesis and accept the equipment is running at the target of 32.0.
1-Sample t Exercise: Solution
In MINITABTM’s Session Window (ctrl – M) you can see the P-value
of 0.201. Because it is above 0.05 we “fail to reject” the null
hypothesis so we accept the equipment is giving product at a target
of 32.0 ppm VOC.
Hypothesis Testing Roadmap
[Figure: Hypothesis Testing Roadmap for Normal data. Branches shown: 1 Sample Variance and 1 Sample t-test; Test of Equal Variance leading to Variance Equal or Variance Not Equal, each with 2 Sample T (two samples) or One Way ANOVA.]
2 Sample t-test
A 2-sample t-test is used to compare two Means.
Stat > Basic Statistics > 2-Sample t
MINITABTM performs an independent two-sample t-test to generate a
confidence interval.
Use 2-Sample t to perform a Hypothesis Test and compute a
confidence interval of the difference between two population Means
when the population Standard Deviations, σ’s, are unknown.
Two tailed test:
– H0: μ1 = μ2
– Ha: μ1 ≠ μ2
One tailed test:
– H0: μ1 = μ2
– Ha: μ1 > or < μ2
If P-value > 0.05 fail to reject Ho
If P-value < 0.05 reject Ho
[Figure: two distributions with Means μ1 and μ2]
Sample Size
Three fields must be filled in
and one left blank.
Sample Size
Power and Sample Size
2-Sample t Test
Testing Mean 1 = Mean 2 (versus not equal)
Calculating power for Mean 1 = Mean 2 + difference
Alpha = 0.05 Assumed Standard Deviation = 1

Sample Size  Power  Difference
10           0.9    1.53369
15           0.9    1.22644
20           0.9    1.05199
25           0.9    0.93576
30           0.9    0.85117
35           0.9    0.78605
40           0.9    0.73392
The sample size is for each group.

The various sample sizes show how much of a difference can be detected assuming the Standard Deviation = 1.
2-Sample t Example
1. Practical Problem:
• We have conducted a study in order to determine the effectiveness
of a new heating system. We have installed two different types of
dampers in homes (Damper = 1 and Damper = 2).
• We want to compare the BTU.In data from the two types of
dampers to determine if there is any difference between the two
products.
2. Statistical Problem:
Ho: μ1 = μ2
Ha: μ1 ≠ μ2
3. 2-Sample t-test (population Standard Deviations unknown).
α = 0.05
β = 0.10
No, not that kind of damper!
2 Sample t Example
4. Sample Size:
• Open the MINITABTM worksheet: “Furnace.MTW”
• Scroll through the data to see how the data is coded.
• In order to work with the data in the BTU.In column we will need
to unstack the data by damper type.
2 Sample t Example
Data > Unstack Columns…
2 Sample t Example
MINITABTM Session Window
Power and Sample Size
2-Sample t Test
Testing Mean 1 = Mean 2 (versus not =)
Calculating power for Mean 1 = Mean 2 + difference
Alpha = 0.05 Assumed Standard Deviation = 1

Sample Size  Power  Difference
40           0.9    0.733919
50           0.9    0.654752
The sample size is for each group.
Example: Follow the Roadmap…
5. State Statistical Solution
Normality Test – Is the Data Normal?
[Figure: Normal Probability Plot of BTU.In_1 (Percent versus BTU.In_1). Mean = 9.908, StDev = 3.020, N = 40, AD = 0.475, P-Value = 0.228]
Normality Test – Is the Data Normal?
[Figure: Normal Probability Plot of BTU.In_2 (Percent versus BTU.In_2). Mean = 10.14, StDev = 2.767, N = 50, AD = 0.190, P-Value = 0.895]
Test of equal Variance (Bartlett’s Test)
Test of Equal Variance
Test for Equal Variances for BTU.In
F-Test: Test Statistic = 1.19, P-Value = 0.558
Levene's Test: Test Statistic = 0.00, P-Value = 0.996
[Figure: 95% Bonferroni Confidence Intervals for StDevs by Damper (1, 2), and boxplots of BTU.In by Damper for Sample 1 and Sample 2]
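The F-Test statistic is the ratio of the two sample variances. As a sanity check, the sketch below recomputes it from the StDev and N values shown on the two Normality plots for BTU.In_1 and BTU.In_2; the variable names are illustrative, and the P-value itself is not recomputed here.

```python
s1, n1 = 3.020, 40   # StDev and N of BTU.In_1 (Damper 1)
s2, n2 = 2.767, 50   # StDev and N of BTU.In_2 (Damper 2)

f_stat = s1**2 / s2**2        # ratio of the two sample variances
df1, df2 = n1 - 1, n2 - 1     # numerator and denominator degrees of freedom

print(f"F = {f_stat:.2f} with ({df1}, {df2}) degrees of freedom")
```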
2 Sample t-test Equal Variance
Box Plot
[Figure: Boxplot of BTU.In by Damper]
5. State Statistical Conclusions: Fail to reject the null hypothesis.
6. State Practical Conclusions: There is no significant difference between the
dampers for BTU.In.
Minitab Session Window
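Two-Sample T-Test (Variances Equal)
Ho: μ1 = μ2
Ha: μ1 ≠ or < or > μ2
[Figure: MINITAB™ Session Window output annotated with the Number of Samples, Calculated Average and SE Mean for each damper; annotated values include –1.450 and 0.980 (95% CI limits for the difference) and T-Value = –0.38]
$$s = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}} \qquad SE\ Mean = \frac{s}{\sqrt{n}}$$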
Exercise
Exercise objective: Utilize what you have learned to
conduct and analyze a 2 sample t-test using
MINITABTM.
1. Billy Bob’s Pool Care has conducted a study on the
effectiveness of two chlorination distributors in a swimming
pool. (Distributor 1 & Distributor 2).
2. The up and coming Billy Bob Jr., looking to prove himself,
wants a comparison done on the Clor.Lev_Post data from
the two types of distributors in order to determine if there is
any difference between the two products.
3. With 95% confidence is there a significant difference
between the two distributors?
4. Use data within MINITABTM Worksheet “Billy Bobs Pool.mtw”
2 Sample T-Test: Solution
1. What do we want to know: With 95% confidence is there a
significant difference between the two distributors?
2. Statistical Problem:
Ho: μ1 = μ2
Ha: μ1 ≠ μ2
3. 2-Sample t-test (population
Standard Deviations unknown).
α = 0.05 β = 0.10
4. Now we need to look at the data to
determine the Sample Size but let’s
see how the data is formatted first.
2 Sample T-Test: Solution
Data > Unstack Columns…
• “Unstack the data in:” Select ‘Clor.Lev_Post’
• “Using subscripts in:” Select ‘Distributor’
2 Sample T-Test: Solution
• Clor.Lev_Post_1 = Distributor 1
• Clor.Lev_Post_2 = Distributor 2
2 Sample T-Test: Solution
• We want to determine what is the smallest difference that can be detected based on our data.
• Fill in the three areas and leave “Differences:” blank so that MINITAB™ will tell us the differences we need.
2 Sample T-Test: Solution
The smallest difference that can be detected is based on the
smallest sample size.
In this case: 0.7339, rounded to 0.734.
2 Sample T-Test: Solution
Follow the path: “Stat > Basic Statistics > Normality Test…”
2 Sample T-Test: Solution
Check Normality for ‘Clor.Lev_Post_1’
The result shows us a P-value of 0.304, which is greater than 0.05, so we treat our data as Normal.
2 Sample T-Test: Solution
Check Normality for ‘Clor.Lev_Post_2’
The result shows us a P-value of 0.941, which is greater than 0.05, so we treat this data as Normal also.
2 Sample T-Test: Solution
Test for Equal Variances
MINITAB™ Path: “Stat > ANOVA > Test for Equal Variances…”
2 Sample T-Test: Solution
For the “Response:” we select our stacked column
‘Clor.Lev_Post’
For our “Factors:” we select our stacked column ‘Distributor’
2 Sample T-Test: Solution
Look at the P-value of 0.113.
This tells us there is no statistically significant difference in the
variance in these two data sets.
What does this mean? We can finally run a 2 sample t-test with
Equal Variances.
2 Sample T-Test: Solution
For “Samples:” enter ‘Clor.Lev_Post’
For “Subscripts:” enter ‘Distributor’
2 Sample T-Test: Solution
Look at the Box Plot and Session Window.
There is NO significant difference between the distributors.
Hmm, we’re a lot alike!
Two-sample T for Clor.Lev_Post
Distributor  N   Mean   StDev  SE Mean
1            40  17.84  3.97   0.63
2            50  17.41  3.12   0.44
Difference = mu (1) - mu (2)
Estimate for difference: 0.436
95% CI for difference: (-1.049, 1.920)
T-Test of difference = 0 (vs not =): T-Value = 0.58 P-Value = 0.561 DF = 88
Both use Pooled StDev = 3.5209
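The pooled (equal variance) calculation behind this output can be sketched from the summary rows alone. The function name below is hypothetical, and because the inputs are the rounded values printed in the Session Window, the results differ slightly from MINITAB's in the last digits.

```python
import math

def pooled_t(n1, m1, s1, n2, m2, s2):
    """Equal-variance 2-sample t from summary statistics."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)  # pooled StDev
    se = sp * math.sqrt(1 / n1 + 1 / n2)                        # SE of the difference
    return (m1 - m2) / se, df, sp

t_value, df, sp = pooled_t(40, 17.84, 3.97, 50, 17.41, 3.12)
print(f"T-Value = {t_value:.2f}  DF = {df}  Pooled StDev = {sp:.2f}")
```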
Unequal Variance Example
Don’t just sit there… open it!
Normality Test
Run a Normality Test…
Let’s compare the data in Sample one and Sample three columns.
[Figure: Normal Probability Plot of Sample 1. Mean = 4.853, StDev = 1.020, N = 100, AD = 0.374, P-Value = 0.411]
[Figure: Normal Probability Plot of Sample 3. Mean = 4.852, StDev = 3.134, N = 100, AD = 0.274, P-Value = 0.658]
Our data sets are Normally Distributed.
Test for Equal Variance
Stat>ANOVA>Test of Equal Variance
Test for Equal Variances for Stacked
F-Test: Test Statistic = 0.11, P-Value = 0.000
Levene's Test: Test Statistic = 67.07, P-Value = 0.000
[Figure: 95% Bonferroni Confidence Intervals for StDevs by C4 (Standard Deviation of Samples), and boxplots of Stacked by C4 (Medians of Samples)]
We use F-Test Statistic because our data is Normally Distributed.
P-value is less than 0.05 so our variances are not equal.
2-Sample t-test Unequal Variance
UNCHECK “Assume equal variances” box.
2-Sample t-test Unequal Variance
[Figure: Boxplot of Stacked by C4, with markers indicating the Sample Means]
[Figure: Individual Value Plot of Stacked vs C4, with markers indicating the Sample Means]
2-Sample t-test Unequal Variance
Two-Sample T-Test
(Variances Not Equal)
Ho: μ1 = μ2 (P-value > 0.05)
Ha: μ1 ≠ or < or > μ2 (P-value < 0.05)
Stat>Basic Stats> 2 sample T (Deselect Assume Equal Variance)
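When the equal variance assumption is dropped, the standard approach is Welch's t-test, which keeps the two variances separate and adjusts the degrees of freedom. The sketch below applies it to the summary values shown on the Normality plots for Sample 1 and Sample 3. The slide's own Session Window output is not reproduced in this transcript, so treat the printed numbers as an illustration of the method rather than MINITAB's output; the function name is hypothetical.

```python
import math

def welch_t(n1, m1, s1, n2, m2, s2):
    """Unequal-variance (Welch) 2-sample t from summary statistics."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t_value, df = welch_t(100, 4.853, 1.020, 100, 4.852, 3.134)
print(f"T-Value = {t_value:.3f}  DF = {df:.1f}")
```

Note that the degrees of freedom come out well below the pooled value of 198, which is the price paid for very unequal variances.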
Paired t-test
• A Paired t-test is used to compare the Means of two measurements from the same samples, generally used as a before and after test.
Stat > Basic Statistics > Paired t
• MINITAB™ performs a paired t-test. This is appropriate for testing the difference between two Means when the data are paired and the paired differences follow a Normal Distribution.
• Use the Paired t command to compute a confidence interval and perform a Hypothesis Test of the difference between population Means when observations are paired. A paired t-procedure matches responses that are dependent or related in a pair-wise manner.
• This matching allows you to account for variability between the pairs, usually resulting in a smaller error term, thus increasing the sensitivity of the Hypothesis Test or confidence interval.
– Ho: μδ = μo
– Ha: μδ ≠ μo
Where μδ is the population Mean of the differences and μo is the hypothesized Mean of the differences, typically zero.
[Figure: two distributions with Means μbefore and μafter, separated by delta (δ)]
Example
1. Practical Problem:
• We are interested in changing the sole material for a popular
brand of shoes for children.
• In order to account for variation in activity of children wearing the
shoes each child will wear one shoe of each type of sole
material. The sole material will be randomly assigned to either
the left or right shoe.
2. Statistical Problem:
Ho: μδ = 0
Ha: μδ ≠ 0
3. Paired t-test (comparing data that must remain paired).
α = 0.05 β = 0.10
Just checking your
souls, er…soles!
Example
4. Sample Size:
• How much of a difference can be detected with 10 samples?
Open Minitab Worksheet “EXH_STAT DELTA.MTW”
Paired t-test Example
Now that’s a
tee test!
MINITABTM Session Window
Power and Sample Size
1-Sample t Test
Testing Mean = null (versus not = null)
Calculating power for Mean = null + difference
Alpha = 0.05 Assumed Standard Deviation = 1

Sample Size  Power  Difference
10           0.9    1.15456
This means we will be able to detect a difference of only 1.15 if the Standard Deviation is equal to 1.
Paired t-test Example
5. State Statistical Solution
Calc > Calculator
We need to calculate the difference between the two distributions. We are concerned with the delta; is the Ho outside the t-calc (confidence interval)?
Check this box so Minitab™ will recalculate as new data is entered.
Analyzing the Delta
Following the Hypothesis Test roadmap we first test the AB Delta distribution for Normality.
[Figure: Normal Probability Plot of AB Delta (Percent versus AB Delta). Mean = 0.41, StDev = 0.3872, N = 10, AD = 0.261, P-Value = 0.622]
1-Sample t
Stat > Basic Statistics > 1-Sample t-test…
Since there is only one column,
AB Delta, we do not test for
Equal Variance per the
Hypothesis Testing roadmap.
Check this data for statistical
significance in its departure
from our expected value of
zero.
Box Plot
MINITAB™ Session Window
[Figure: Box Plot of AB Delta]
One-Sample T: AB Delta
Test of mu = 0 vs not = 0
Variable  N   Mean      StDev     SE Mean   95% CI                T     P
AB Delta  10  0.410000  0.387155  0.122429  (0.133046, 0.686954)  3.35  0.009
5. State Statistical Conclusions: Reject the null hypothesis
6. State Practical Conclusions: We are 95% confident there is a
difference in wear rates between the two materials.
Paired T-Test
Another way to analyze this data is to use the paired t-test
command.
Stat>Basic Statistics>Paired T-test
Click on “Graphs…” and
select the graphs you would
like to generate.
Paired T-Test
The P-value from this Paired T-Test tells us the difference in materials is statistically significant.
[Figure: Boxplot of Differences (with Ho and 95% t-confidence interval for the mean)]
Paired T-Test and CI: Mat-A, Mat-B
Paired T for Mat-A - Mat-B
            N   Mean       StDev     SE Mean
Mat-A       10  10.6300    2.4513    0.7752
Mat-B       10  11.0400    2.5185    0.7964
Difference  10  -0.410000  0.387155  0.122429
95% CI for Mean difference: (-0.686954, -0.133046)
T-Test of Mean difference = 0 (vs not = 0): T-Value = -3.35 P-Value = 0.009
Paired T-Test
The wrong way to analyze this data is to use a 2-sample t-test:
MINITABTM Session Window
Two-sample T for Mat-A vs Mat-B
       N   Mean   StDev  SE Mean
Mat-A  10  10.63  2.45   0.78
Mat-B  10  11.04  2.52   0.80
Difference = mu (Mat-A) - mu (Mat-B)
Estimate for difference: -0.410000
95% CI for difference: (-2.744924, 1.924924)
T-Test of difference = 0 (vs not =): T-Value = -0.37 P-Value = 0.716 DF = 18
Both use Pooled StDev = 2.4851
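The gap between the two analyses comes entirely from the error term. The sketch below recomputes both T-Values from the summary figures in the two Session Windows: the paired test divides by the small StDev of the differences, while the 2-sample test divides by a pooled StDev that is dominated by child-to-child variation. Variable names are illustrative.

```python
import math

n = 10
mean_diff = -0.41

# Paired analysis: uses the StDev of the column of differences
sd_diff = 0.387155
t_paired = mean_diff / (sd_diff / math.sqrt(n))

# 2-sample analysis of the same data: pools the StDev of each material
sd_a, sd_b = 2.4513, 2.5185
sp = math.sqrt((sd_a**2 + sd_b**2) / 2)          # pooled StDev for equal n
t_two_sample = mean_diff / (sp * math.sqrt(2 / n))

print(f"paired   T = {t_paired:.2f}")
print(f"2-sample T = {t_two_sample:.2f}")
```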
Paired t-test Exercise
Exercise objective: Utilize what you have learned to conduct
and analyze a paired t-test using MINITAB™.
1. A corrugated packaging company produces material that uses
creases to make boxes easier to fold. It is a Critical to Quality
characteristic to have a predictable Relative Crease Strength. The
quality manager is having her lab test some samples labeled 1-11.
Then those same samples are being sent to her colleague at
another facility who will report their measurements on those same
1-11 samples.
2. The US quality manager wants to know with 95% confidence what
the average difference is between the lab located in Texas and the
lab located in Mexico when measuring Relative Crease Strength.
3. Use the data in columns “Texas” & “Mexico” in
“HypoTestStud.mtw” to determine the answer to the quality
manager’s question.
Paired t-test Exercise: Solution
Calc > Calculator…
Because the two labs agreed to report measurement results for exactly the same parts, and the results were put in the correct corresponding row, we are able to do a paired t-test.
The first thing we must do is create a new column with the difference between the two test results.
Paired t-test Exercise: Solution
We must confirm the differences (now in a new calculated column) are from a Normal Distribution. This was confirmed with the Anderson-Darling Normality Test by doing a graphical summary under Basic Statistics.
Summary for TX_MX-Diff
Anderson-Darling Normality Test
A-Squared     0.45
P-Value       0.222
Mean          0.22727
StDev         0.37971
Variance      0.14418
Skewness      -0.833133
Kurtosis      -0.233638
N             11
Minimum       -0.50000
1st Quartile  -0.10000
Median        0.40000
3rd Quartile  0.50000
Maximum       0.70000
95% Confidence Interval for Mean:    -0.02782  0.48237
95% Confidence Interval for Median:  -0.11644  0.50822
95% Confidence Interval for StDev:   0.26531   0.66637
[Figure: histogram of TX_MX-Diff with 95% Confidence Intervals for the Mean and Median]
Paired t-test Exercise: Solution
As we have seen before this 1 Sample T analysis is found with:
Stat>Basic Stat>1-sample T
Paired t-test Exercise: Solution
Even though the Mean difference is 0.23 we have a 95% confidence interval that includes zero, so we “fail to reject” the 1-sample t-test’s null hypothesis. We cannot conclude the two labs have a difference in lab results.
[Figure: Histogram of TX_MX-Diff (with Ho and 95% t-confidence interval for the mean); Frequency versus TX_MX-Diff]
The P-value is greater than 0.05 so we do not have the 95% confidence we wanted to confirm a difference in the lab Means. This confidence interval could be reduced with more samples taken next time and analyzed by both labs.
Summary
At this point you should be able to:
• Determine appropriate sample sizes for testing Means
• Conduct various Hypothesis Tests for Means
• Properly analyze results