Today’s lesson
• Case study involving a quality control (QC)
application using the material that we have
covered to date.
• In-class demonstration of how to use SPSS
to get the numbers out.
• Discussion of meaning of results.
Today’s lesson
• Confidence interval for the mean of a
normal distribution using standard normal
and t-distribution.
• On Thursday, we finish Chapter Eleven
with one sample t and z tests and a review
(yet again) of the structure of tests of
statistical hypotheses.
Case Study
• QC application in which the objective is to
specify a probability model for the
“manufacturing” process.
• That is, specify the A part in an A vs. B
comparison.
• See Finch et al., Statistics in Medicine,
1999, pages 1279-1289.
Context
• Application is the filtering of red blood cells to
remove white blood cells.
• Your white blood cells in your blood are
good.
• My white blood cells in the unit of blood
that I donated to be transfused to you are
bad for you.
– Transfusion reaction
– Possible infection vector
Practical Context
• Health regulations are that a unit of filtered
blood cannot contain more than so many
“residual white blood cells” (RWBC).
– US standard is 5 × 10^6 RWBC in a unit.
– European standard is 1 × 10^6 RWBC in a unit.
Practical Context
• Measured RWBC is the product of three
numbers:
– Constant scaling factor
– volume of blood donated in the unit
– Nageotte count, which is the number of white
blood cells observed in a small volume of
sampled filtered blood.
First Question
• What should be the dependent variable
monitored in the QC application?
• Answer is to use the Nageotte count rather
than the RWBC.
– American standard is then a Nageotte count of
167.
– European standard is then a Nageotte count of
33.
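The conversion from RWBC limits to Nageotte limits can be checked with a little arithmetic. This is a minimal sketch that uses only the four numbers quoted on the slides. The name `implied_multiplier` is introduced here for illustration, and the slides do not state the actual scaling factor or unit volume.

```python
# RWBC = (constant scaling factor) x (unit volume) x (Nageotte count).
# Dividing each RWBC limit by its Nageotte limit recovers the combined
# multiplier (scaling factor x volume) that the conversion implies.
limits = {
    "US": (5e6, 167),
    "European": (1e6, 33),
}

def implied_multiplier(rwbc_limit, nageotte_limit):
    """Hypothetical helper: RWBC per counted cell implied by a pair of limits."""
    return rwbc_limit / nageotte_limit

multipliers = {name: implied_multiplier(*pair) for name, pair in limits.items()}
for name, m in multipliers.items():
    print(f"{name}: about {m:,.0f} RWBC per counted cell")
```

Both limits imply a multiplier of roughly 30,000, so the two Nageotte thresholds are consistent with a single fixed conversion.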
Components of Variance
• This uses Fisher’s fundamental idea of
finding “components of variance.”
• Specifically, the variance of RWBC has a
non-quality-related component that comes from
the variation of the volume of blood in the
donated unit.
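A small simulation can illustrate the point. Everything in this sketch is hypothetical: the scaling factor, the volume distribution, and the count distribution are made up, and only the structure RWBC = constant × volume × count comes from the slides.

```python
import random
import statistics

random.seed(1)          # fixed seed so the sketch is repeatable
SCALE = 60.0            # hypothetical constant scaling factor
n_units = 10_000

# Hypothetical inputs: overdispersed counts and roughly 10% volume variation.
counts = [int(random.expovariate(1 / 3)) for _ in range(n_units)]
volumes = [random.gauss(500.0, 50.0) for _ in range(n_units)]
rwbc = [SCALE * v * c for v, c in zip(volumes, counts)]

def cv(values):
    """Coefficient of variation: standard deviation divided by mean."""
    return statistics.pstdev(values) / statistics.fmean(values)

cv_count = cv(counts)
cv_rwbc = cv(rwbc)
print(f"CV of Nageotte count: {cv_count:.3f}")
print(f"CV of RWBC:           {cv_rwbc:.3f}")
```

The relative variation of RWBC comes out larger than that of the count alone. The extra part is due to donation volume, which says nothing about how well the filter worked.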
Second Question
• What actually happened in the QC process
when the manufacturer’s staff did the work?
• That is, use the descriptive options in a
statistical package to describe the data.
Getting numbers out
• Enter SPSS
• Access correct file (CAREFUL,
CAREFUL, CAREFUL!!!)
• Statistics menu
• Descriptive submenu
• Frequencies option
Third Question
• Then, specify a probabilistic model that fits
the data reasonably well so that predictions
can be made.
• The Nageotte count is measured on a ratio
scale.
• Nageotte count is discrete, not continuous.
• Variance is very much larger than the mean.
– Hence the focus on the negative binomial
distribution (NBD).
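The variance-versus-mean check can be sketched as follows. The counts below are made up for illustration and are not the study data. The method-of-moments formulas assume the NBD parameterization with mean r(1-p)/p and variance r(1-p)/p^2.

```python
import statistics

# Hypothetical Nageotte counts, made up for illustration only.
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 0, 0, 7, 0, 1, 0, 0, 12, 0, 2]

mean = statistics.fmean(counts)
variance = statistics.variance(counts)   # unbiased sample variance

print(f"mean = {mean:.2f}, variance = {variance:.2f}")

if variance > mean:
    # Method-of-moments estimates for a negative binomial with
    # mean r(1-p)/p and variance r(1-p)/p^2.
    p_hat = mean / variance
    r_hat = mean * mean / (variance - mean)
    print(f"overdispersed: p_hat = {p_hat:.3f}, r_hat = {r_hat:.3f}")
else:
    print("variance does not exceed the mean; a Poisson model may suffice")
```

A Poisson model forces the variance to equal the mean, so a variance far above the mean is the signal that points toward the NBD.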
Fourth Question
• ASS-U-ME a negative binomial
distribution.
• How well does it fit the data observed?
• Use a goodness of fit test (we won’t cover
this in detail until after your first exam).
Observed and Expected Nageotte
Counts
i            Oi      Ei
0            283     292.2
1            54      30.5
2            18      15.6
3            15      10.1
4            0       7.3
5            4       5.5
6 or more    12      24.7
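One common way to summarize such a table is Pearson's chi-square statistic. The slides do not say which goodness-of-fit statistic was used, so treat this as a sketch under that assumption. The degrees of freedom below also assume that two NBD parameters were estimated from the data.

```python
observed = [283, 54, 18, 15, 0, 4, 12]
expected = [292.2, 30.5, 15.6, 10.1, 7.3, 5.5, 24.7]

# Each cell contributes (O - E)^2 / E to Pearson's chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# Assumed degrees of freedom: 7 cells - 1 - 2 estimated NBD parameters.
df = len(observed) - 1 - 2

labels = ["0", "1", "2", "3", "4", "5", "6 or more"]
for label, c in zip(labels, contributions):
    print(f"{label:>9}: {c:6.2f}")
print(f"chi-square = {chi_square:.2f} on {df} degrees of freedom (assumed)")
```

The statistic comes to about 35.4, with the largest contribution from the cell i = 1 (54 observed against 30.5 expected). Under the assumed 4 degrees of freedom the 0.01 critical value is about 13.28, so the fit would be rejected.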
Interpretation
• One observation violated the American rule,
and two violated the European rule.
• NBD model does not predict maximum
observed value of 205.
• NBD model does not fit well but captures
the rough order of variation (up to a
Nageotte count of 33).
• Choose a nonparametric test procedure
because null distribution is not obvious.
My Most Common Three
Mistakes in Making Predictions
• Eliminate “outliers” from the historical data
that I am using to make my prediction.
• Predict a ratio scale variable without
anticipating that the variance of the variable
will increase when the mean increases.
• ASS-U-ME independence of observations
when predicting a time series with
autocorrelation.
Fifth Question
• How big a sample is necessary to
determine whether a user of this product is
“in control”?
• Simulation study suggested that under
optimistic conditions 20 is a minimal
sample size but that 80 may be required.
• Client’s practice has evolved to use about
50.
Chapter Eleven: Testing a
Hypothesis about a Single Mean
• Definition of Student’s t Distribution
• Using tables of Student’s t distribution.
• Using the observed significance level from
Student’s t.
• Using the confidence limit from Student’s t.
Historical Background of
Student’s t
• Origin is quality control in the brewing
industry (Guinness).
• How can statistical procedures be applied
with very small samples?
• Nature of the advance is to describe the null
distribution of the statistic that is actually
used.
Definition of the Student’s t
distribution
• The pdf is also a bell-shaped curve.
• Continuous distribution, unimodal,
symmetric, with a slower fall-off of probability
for values far from the mean than the normal.
• Appendix C table (546-547) gives two sided
tail probability by degrees of freedom.
• Most statistics texts give percentile points.
Basic numeric facts of Student’s t
percentiles
• Student t 95-th percentiles are larger than
the 95-th standard normal percentile.
• The same holds for any percentile above
the 50th.
• The difference becomes larger as the
“degrees of freedom” become smaller.
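These facts can be checked numerically without tables. This is a minimal sketch using only the Python standard library. The helpers `t_cdf` and `t_percentile` are written here for illustration (Simpson's rule plus bisection) and are not a library API.

```python
import math
from statistics import NormalDist

def t_cdf(x, df, steps=2000):
    """Student's t CDF via Simpson's rule on [0, x], using symmetry about 0."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: const * (1 + u * u / df) ** (-(df + 1) / 2)
    h = x / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return 0.5 + total * h / 3

def t_percentile(p, df):
    """Invert t_cdf by bisection; the bracket is ample for df >= 3."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z95 = NormalDist().inv_cdf(0.95)
t95 = {df: t_percentile(0.95, df) for df in (3, 10, 30, 100)}
for df, value in t95.items():
    print(f"df = {df:3d}: t 95th percentile = {value:.3f}")
print(f"standard normal 95th percentile = {z95:.3f}")
```

Each t percentile exceeds the normal one, and the gap shrinks as the degrees of freedom grow.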
Example One-Sample Z test
Problem
• Test the null hypothesis H0: E(Y)=500
against the alternative hypothesis H1: E(Y)<500
at level of significance 0.01, using the sample
mean of a random sample of four observations.
ASS-U-ME Y is normal with known standard
deviation 100. The sample mean has value 360.
Solution
• Determine the side of the test, here left-sided.
• Determine the standard error of the statistic,
here 100/4^0.5 = 50.
• Determine the critical value of the test
statistic.
– In original form, 500-2.326(50)=383.7
– In standard unit form, -2.326.
Solution Continued
• Compare the statistic to the critical value:
– In original units, the observed mean of 360 is to
the left of the critical value of 383.7.
– In standard-score form, the z value of the mean
is (statistic-hypothesized expected value)/se of
statistic=(360-500)/50=-2.8, to the left of the
critical value -2.326.
• Make decision: reject H0 at the 0.01 level of
significance.
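The steps of the one-sample z test can be reproduced in a few lines. This is a sketch using the problem's numbers and Python's standard-library `NormalDist`. The variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 500.0, 100.0, 4, 360.0, 0.01

se = sigma / math.sqrt(n)               # standard error: 100 / 2 = 50
z_crit = NormalDist().inv_cdf(alpha)    # left-tail critical value, about -2.326
x_crit = mu0 + z_crit * se              # critical value in original units
z = (xbar - mu0) / se                   # observed z: (360 - 500) / 50 = -2.8
p_value = NormalDist().cdf(z)           # left-tail observed significance level

reject = z < z_crit
print(f"se = {se:.1f}, z = {z:.2f}, critical z = {z_crit:.3f}")
print(f"critical value in original units = {x_crit:.1f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```

The comparison in standard units (z below the critical z) and the comparison in original units (360 below the critical mean) always give the same decision.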
Example One-Sample T test
Problem
• Test the null hypothesis H0: E(Y)=500 with
level of significance 0.01 against the
alternative hypothesis H1: E(Y)<500. ASS-U-ME Y is normal with unknown standard
deviation. The mean of a random sample of
four observations has value 360, and the
unbiased estimate of the variance is 6400.
Note that the corresponding estimate of the
standard deviation is 80.
Solution
• Determine the side of the test, here left-sided.
• Determine the estimated standard error of
the statistic, here 80/4^0.5 = 40.
• Student’s contribution: determine the
degrees of freedom. For a one-sample t-test,
it is number of observations minus one, here
4-1=3.
Solution Continued
• Determine the critical value of the test
statistic. Don’t forget Student’s stretch of
the critical value.
– In original form, 500 - 4.541(40) = 318.36,
where 4.541 is the t value with 3 degrees of
freedom that replaces the normal value 2.326.
– In standard unit form, -4.541.
Solution Continued
• Compare the statistic to the critical value:
– In original units, the observed mean of 360 is to
the right of the critical value of 318.36.
– In standard-score form, the t value of the mean
is (statistic-hypothesized expected value)/se of
statistic=(360-500)/40=-3.5, to the right of the
critical value -4.541.
• Make decision: accept H0 at the 0.01 level
of significance.
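The t-test steps can be sketched the same way. The critical value -4.541 is taken from the slides rather than computed. The helper `t3_cdf` uses the closed-form CDF that holds only for exactly 3 degrees of freedom and is included here, as an illustration, to get the observed significance level.

```python
import math
from statistics import NormalDist

mu0, n, xbar, var_hat, alpha = 500.0, 4, 360.0, 6400.0, 0.01

s = math.sqrt(var_hat)          # estimated standard deviation: 80
se = s / math.sqrt(n)           # estimated standard error: 40
df = n - 1                      # degrees of freedom: 3
t = (xbar - mu0) / se           # observed t: (360 - 500) / 40 = -3.5

T_CRIT = -4.541                 # left-tail 0.01 critical value for 3 df, from the slides

def t3_cdf(x):
    """Closed-form CDF of Student's t with exactly 3 degrees of freedom."""
    u = x / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi

p_value = t3_cdf(t)                       # left-tail observed significance level
reject_t = t < T_CRIT                     # decision using Student's t
z_crit = NormalDist().inv_cdf(alpha)      # about -2.326
reject_if_z_used = t < z_crit             # what the normal cutoff would have said

print(f"se = {se:.1f}, df = {df}, t = {t:.2f}, p-value = {p_value:.4f}")
print(f"reject with t cutoff: {reject_t}; with normal cutoff: {reject_if_z_used}")
```

The same data that would be rejected against the normal cutoff of -2.326 are not rejected against the stretched cutoff of -4.541, which is the practical effect of estimating the standard deviation from only four observations.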
Major points covered
• Review of material using a case study that
applies descriptive statistics.
• Introduction (review) of Student’s t.
• The one sample standard normal test.
• The one sample Student’s t test.
To come
• Finish Chapter 11 with one sample
confidence intervals.
• Begin Chapter 12, the paired t-test.