Hypothesis Testing
Why do we need it?
– simply, we are looking for a statistical measure that will allow us to conclude
there is truly a difference between the data from two samples.
Mathematically, we infer this with a stated degree of confidence in our decision.
------------------------------------------------------------------------------
What does it prove?
– helps us determine whether observed differences are:
statistically significant
or
due to chance (random or common cause variation)
Test a Hypothesis – (Ho and Ha)
Null Hypothesis, (Ho)
• comes from the word nullify (to negate)
• associated with distribution of chance events
• typically, the null hypothesis is: “2 samples are the same, except for variation
caused by chance”
------------------------------------------------------------------------------
Alternate Hypothesis, (Ha)
• used as an alternative to the null hypothesis
• identifies a distribution of events that is not a chance distribution
• typically, the alternative hypothesis is: “2 samples are fundamentally different”
Null Hypothesis and Risk
Alpha Risk - finding a difference when one doesn’t really exist [A FALSE REJECT]
[ probability of rejecting Ho when it is actually true ] usually 5% or less
i.e. – Jury decided GUILTY verdict when person was really INNOCENT
- Rejecting a good part on the assembly line (aka: producer’s risk)
-------------------------------------------------------------------------------
Beta Risk - NOT finding a difference when there is one [A FALSE ACCEPT]
[ probability of failing to reject Ho when a real difference exists ] usually 10% or less
i.e. - taxi driver thought the corner was safe when it was really dangerous
- Accepting a defective part from the assembly line (aka: consumer’s risk)
NOTE:
Statistically, the P value is the probability that the observed difference occurred by “chance only”
(if Ho is true (no difference), then a “high” p-value, > .05, is expected)
Alpha Beta Risk
(GENERAL RULES)
Hypothesis testing; Tests “NULL” hypothesis [Ho = NO difference ]
against an alternative hypothesis [Ha = groups (data) are different ]
---------------------------------------------------------------------------------
If p value < .05 (reject Ho and conclude Ha) … the groups truly ARE different
If p value > .05 (cannot reject Ho) … no difference has been demonstrated
-----------------------------------------------------------------------------------
Why use it? To detect differences that may be important to the
business. Is a minor difference in averages due to random variation, or does it
reflect a true difference? We want to see the impact of our intervention.
Statistical Difference vs. Practical Importance
You Decide …
• If there are large amounts of data, or the variation within the data is very small,
hypothesis tests can detect very small differences between samples
• While the samples are statistically different, the differences may not mean
much in the PRACTICAL world.
• DOES IT MAKE BUSINESS SENSE?
• DOES IT PASS THE COMMON SENSE TEST?
What are the Data Assumptions?
• If data is continuous, we assume the underlying distribution is Normal.
• You may need to transform non-Normal data (e.g., cycle times) – see the sketch after this section.
When comparing groups from different populations we assume:
• independent samples
• achieved through random sampling
• samples are representative (unbiased) of the population
When comparing groups from different processes we assume:
• each process is stable
• there are no special causes or shifts over time
• samples are representative of the process (unbiased)
Also note:
Pre-test and Post-test would violate independence. By knowing one, there is a possibility
to predict the other. Any repeated measures of the same individuals would also violate
independence.
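The notes above mention transforming non-Normal data such as cycle times. The following is not part of the original course material, just a minimal sketch of one common transform (Box-Cox), assuming Python with numpy/scipy and using invented, skewed data:

# Box-Cox transform sketch for non-Normal, strictly positive data (e.g. cycle times).
# Illustrative only: the data below are randomly generated, not from the course.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
cycle_times = rng.lognormal(mean=1.0, sigma=0.6, size=100)   # right-skewed sample data

transformed, lam = stats.boxcox(cycle_times)   # lambda chosen by maximum likelihood
print(f"Box-Cox lambda: {lam:.2f}")

# Quick normality check before and after (p > .05 suggests Normality is plausible)
w_raw, p_raw = stats.shapiro(cycle_times)
w_tr, p_tr = stats.shapiro(transformed)
print(f"raw p-value: {p_raw:.4f}   transformed p-value: {p_tr:.4f}")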
Two Sample T Test
“Are the means of these two normally distributed groups really different from each other?”
If the P value is .05 or less, it is usually accepted that the groups are different.
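The course runs this test in Minitab (Stat > Basic Stat > 2 Sample T). As a rough, non-authoritative equivalent, here is a minimal Python sketch assuming scipy, with invented data:

# Two-sample t-test sketch with invented data. Assumes Python with numpy/scipy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=10.0, scale=1.0, size=30)   # illustrative sample A
group_b = rng.normal(loc=10.8, scale=1.0, size=30)   # illustrative sample B

t_stat, p_value = stats.ttest_ind(group_a, group_b)   # equal variances assumed by default
print(f"p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject Ho: the group means appear to be different.")
else:
    print("Fail to reject Ho: no difference demonstrated.")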
One Way ANOVA
Similar to the two sample T test, except that it can handle more than two groups. Again, the
groups must be normal. We also have the added requirement that the variances (and the
standard deviations) of all the groups are approximately equal. A P value of .05 or less
indicates that the mean of at least one group is different from the rest.
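Again, Minitab is the tool used in the notes (Stat > ANOVA > One-Way); a minimal scipy sketch of the same idea, with invented data for three groups:

# One-way ANOVA sketch for three groups of invented data. Assumes numpy/scipy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
g1 = rng.normal(10.0, 1.0, 25)
g2 = rng.normal(10.2, 1.0, 25)
g3 = rng.normal(11.5, 1.0, 25)   # deliberately shifted group

f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"p = {p_value:.4f}")   # p < .05 suggests at least one mean differs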
Mann-Whitney
Similar to, and less powerful than Two Sample T, but does not require normally distributed
data.
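A minimal scipy sketch of a Mann-Whitney test on invented, skewed data (not from the original notes):

# Mann-Whitney U sketch for two non-Normal samples of invented data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
g1 = rng.exponential(scale=1.0, size=30)   # skewed data
g2 = rng.exponential(scale=1.6, size=30)

u_stat, p_value = stats.mannwhitneyu(g1, g2, alternative="two-sided")
print(f"p = {p_value:.4f}")   # p < .05 suggests the groups differ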
Homogeneity of Variance
“Are the variances (and hence the standard deviations) of these groups of data equal?”
Often used preparatory to ANOVA. If the P value of Levene’s test is .05 or less, the
variances are assumed to be unequal.
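A minimal scipy sketch of Levene’s test on invented data (the notes use Minitab’s Stat > ANOVA > Homogeneity of Variance for this):

# Levene's test sketch: are the group variances equal? Invented data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
g1 = rng.normal(10, 1.0, 30)
g2 = rng.normal(10, 1.0, 30)
g3 = rng.normal(10, 2.5, 30)   # deliberately wider spread

w_stat, p_value = stats.levene(g1, g2, g3)
print(f"p = {p_value:.4f}")    # p < .05 suggests unequal variances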
Normality Test
“Is this data normally distributed?” If the P value of the Anderson-Darling test is .05 or
less, the data is presumed to be not normal. For small groups of data, the “fat pencil test” is
more meaningful.
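A minimal scipy sketch of an Anderson-Darling check on invented data; note that scipy reports a test statistic and critical values rather than a p-value, so the decision compares the statistic to the 5% critical value:

# Anderson-Darling normality sketch with invented data. Assumes numpy/scipy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(50, 5, 40)

result = stats.anderson(data, dist="norm")
crit_5pct = result.critical_values[list(result.significance_level).index(5.0)]
print(f"A-D statistic = {result.statistic:.3f}, 5% critical value = {crit_5pct:.3f}")
# statistic > critical value at 5%  ->  treat the data as not Normal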
Multi-Vari Study
A passive examination of the process as it runs in its normal state. By noting the state of key
input variables, and the simultaneous state of output variables, useful correlations can often be
found. Sometimes a Multi-Vari study will reveal the sources of problems. In other cases, the
outputs of a Multi-Vari study become the inputs to a designed experiment. Outputs are often
shown as Main Effects Plots and/or Boxplots.
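The notes stop at describing the study; purely as a rough illustration (pandas/matplotlib, with invented column names and data), here is a group summary and boxplot of one output by one input factor:

# Multi-Vari style look at an output by one input factor. All data invented.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "machine": np.repeat(["A", "B", "C"], 30),               # hypothetical input variable
    "thickness": np.concatenate([rng.normal(5.0, 0.1, 30),
                                 rng.normal(5.1, 0.1, 30),
                                 rng.normal(5.0, 0.3, 30)]), # hypothetical output variable
})

print(df.groupby("machine")["thickness"].agg(["mean", "std"]))  # quick main-effects view
df.boxplot(column="thickness", by="machine")
plt.show()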
Chi Square Test
Used with count data, arranged in a matrix of rows and columns. For example, TREATED and
UNTREATED columns, and LIVED and DIED rows, in a 2x2 matrix. Counts entered into
each cell are the number of people in each category. P value of the Chi Square test indicates
whether or not the rows and columns are statistically independent, i.e., does ‘treatment’ or the
lack of it influence survival?
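A minimal scipy sketch of the 2x2 example above; the counts are invented for illustration:

# Chi-square test sketch for a 2x2 TREATED/UNTREATED vs LIVED/DIED table.
# Counts are invented, not real data. Assumes numpy/scipy.
import numpy as np
from scipy import stats

#                 LIVED  DIED
table = np.array([[60,    10],    # TREATED
                  [45,    25]])   # UNTREATED

chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(f"p = {p_value:.4f}")   # p < .05 suggests treatment and survival are NOT independent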
Regression
Regression is used with interval/ratio/variable inputs and outputs. It answers the questions,
“Are the inputs and outputs linearly correlated?” and “If they are linearly correlated, what is the
formula that connects them?”. One output is an equation of the form Y=mX+b, where Y is the
output variable, m is the slope of the line, X is the input variable, and b is a constant (the Y
intercept). Another output is an R² value. An R² of 86% says that 86% of the observed
variation is explained by the straight-line model, and 14% is not.
Regression with more than one input variable is called Multiple Linear Regression.
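A minimal scipy sketch of simple linear regression on invented data, returning the slope, intercept and R² described above:

# Simple linear regression sketch: Y = mX + b and R^2. Invented data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, 50)   # true line plus noise

res = stats.linregress(x, y)
print(f"Y = {res.slope:.2f} * X + {res.intercept:.2f}")
print(f"R^2 = {res.rvalue**2:.2%}")   # share of variation explained by the line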
For all tests:
p>0.05 Fail to Reject Ho (Null)
p<0.05 Reject Ho
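A tiny Python helper (not from the notes) that captures this rule, assuming the usual alpha of 0.05:

# Decision rule used for all the tests above: compare the p-value to alpha.
def decide(p_value: float, alpha: float = 0.05) -> str:
    return "Reject Ho" if p_value < alpha else "Fail to reject Ho"

print(decide(0.03))   # Reject Ho
print(decide(0.42))   # Fail to reject Ho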
Hypothesis Testing “Roadmap”

ATTRIBUTE DATA
• Contingency Table / Chi-Square Test
Ho: The two factors are INDEPENDENT
Ha: The two factors are DEPENDENT
Minitab: Stat > Tables > Chi-Square Test

CONTINUOUS DATA
• First, run a Normality Test
Ho: Data is normal
Ha: Data is NOT normal
Minitab: Stat > Basic Stat > Normality Test (use Anderson-Darling)

Normal data
• One sample – One-Sample T-Test
Ho: μ1 = μ target
Ha: μ1 ≠ μ target
Minitab: Stat > Basic Stat > 1 Sample T
• Two or more samples – first check equal variances with Bartlett’s Test
Ho: σ1 = σ2 = σ3 …
Ha: At least one is different
Minitab: Stat > ANOVA > Homog of Variance
• Two samples, variances equal – Two-Sample T-Test (Variances Equal)
Ho: μ1 = μ2
Ha: μ1 ≠ μ2
Minitab: Stat > Basic Stat > 2 Sample T (check box for equal variance)
• Two samples, variances not equal – Two-Sample T-Test (Variances Not Equal)
Ho: μ1 = μ2
Ha: μ1 ≠ μ2
Minitab: Stat > Basic Stat > 2 Sample T (check box for unequal variance)
• Three or more samples (equal variances) – One-Way ANOVA
Ho: μ1 = μ2 = μ3 …
Ha: At least one is different
Minitab: Stat > ANOVA > One-Way

Non-Normal data
• One sample – 1-Sample Sign or 1-Sample Wilcoxon
Ho: M1 = M target (medians)
Ha: M1 ≠ M target
Minitab: Stat > Nonparametric > 1 Sample Sign (OR) Stat > Nonparametric > 1 Sample Wilcoxon
• Two or more samples – first check equal variances with Levene’s Test
Ho: σ1 = σ2 = σ3 …
Ha: At least one is different
Minitab: Stat > ANOVA > Homog of Variance
• Two or more samples – Mann-Whitney / Kruskal-Wallis / Mood’s Median / Friedman
Ho: M1 = M2 = M3 … (medians)
Ha: At least one is different
Minitab: Stat > Nonparametric > Mann-Whitney (OR) Kruskal-Wallis (OR) Mood’s Median (OR) Friedman