Overview of Statistical Inference


QM 2113 -- Fall 2003
Statistics for Decision Making
Basic Inference
Instructor: John Seydel, Ph.D.
Student Objectives
Use sample data to generate and interpret interval estimates of
population parameters
Apply the margin of error concept in determining the quality of
parameter estimates
Work with the Student’s t distribution when developing inferences
for quantitative data
Determine the required number of sample observations for
achieving a desired precision in estimating population/process
parameters
Compare and contrast the two types of statistical inference
Define hypothesis testing and summarize the basic process
Discuss errors that can be made in statistical inferences
Use sample data to test claims about population parameters
Sampling Distributions (Review)
Data Type      Parameter   Estimator   StdError          Estimated StdError
Quantitative   μ           x-bar       σ/√n              s/√n
Qualitative    π           p           √[π(1 − π)/n]     √[p(1 − p)/n]

Note: these estimators are approximately normally
distributed; i.e., their sampling distributions are
approximately normal
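As a small illustration of the standard error column, the two estimated standard errors could be computed as follows. This is only a sketch: the function names and the sample numbers are made up for illustration and do not come from the course text.

```python
import math

def std_error_mean(s, n):
    """Estimated standard error of x-bar: s / sqrt(n)."""
    return s / math.sqrt(n)

def std_error_proportion(p, n):
    """Estimated standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical sample results, for illustration only
se_mean = std_error_mean(s=12.0, n=36)        # 12 / 6
se_prop = std_error_proportion(p=0.5, n=100)  # sqrt(0.25 / 100)
print(se_mean, se_prop)
```

Note that both standard errors shrink as n grows, which is why larger samples give tighter estimates.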
Review of Simple Inference:
Estimation
Recall: there are only 2 types of inference


Estimation (confidence intervals)
Hypothesis testing
Confidence intervals


Parameter ≈ Point Estimate ± Margin of Error
Margin of error is based upon confidence level
Margin of error = z-score ∙ standard error
Example, for confidence of 95% (z ≈ 2):
Quantitative data: 2 ∙ (s/√n)
Qualitative data: 2 ∙ √[p(1 − p)/n]
Example (Exercise 7-4)
Estimation, the Procedure
Determine parameter needing to be estimated
Gather data
Calculate appropriate sample statistics


Quantitative data: x-bar & s
Qualitative data: p
Determine the margin of error



Appropriate z-score
Calculate the standard error
Calculate: Z ∙ StdErr
Put it all together:


Parameter ≈ Estimator ± Margin of Error
Estimate: (Estimator – Margin) to (Estimator + Margin)
Interpret/apply results
If appropriate, gather additional data*
Note: no sketch!
* More on this phase a little later
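The estimation procedure above can be sketched in Python roughly as follows. This is an illustrative sketch, not the course's own method: the function names and the survey numbers are hypothetical, and the z-score is taken from the standard library's NormalDist rather than from a printed table.

```python
import math
from statistics import NormalDist, mean, stdev

def z_for_confidence(confidence):
    """Two-sided z-score for a confidence level, e.g. 0.95 gives about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def interval_for_mean(data, confidence=0.95):
    """Quantitative data: x-bar plus or minus z * (s / sqrt(n))."""
    n = len(data)
    x_bar, s = mean(data), stdev(data)
    margin = z_for_confidence(confidence) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def interval_for_proportion(successes, n, confidence=0.95):
    """Qualitative data: p plus or minus z * sqrt(p(1 - p) / n)."""
    p = successes / n
    margin = z_for_confidence(confidence) * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical survey: 240 of 400 respondents answer "yes"
low, high = interval_for_proportion(successes=240, n=400)
print(f"95% interval for the proportion: {low:.3f} to {high:.3f}")
```

The slides round the 95% z-score to 2; the sketch uses the more precise value, so hand calculations with z = 2 will give slightly wider intervals.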
Interpreting Interval Estimates
Strictly:
Of all the samples that could be taken from this population, __% of
them will result in _____s that are within _____ of the overall
population _____.
Practically:


We are __% confident that the overall population _____ is equal to
_____, give or take _____.
We can be __% confident that the overall _____ is between _____
and _____.
How we typically express findings in the popular
press:
The survey indicates that the overall _____ of the population is
_____. (Margin of error on these findings is _____.)
Here’s a good way to look at interval estimates:
The margin of error provides an indication of how well the sample
statistic estimates the population/process parameter of interest
Now, apply to previous examples
Addressing the Quality of the
Estimators
Again, note that
The margin of error provides an indication of how
well the sample statistic estimates the
population/process parameter of interest
Suppose the margin of error is too large; now what?



Forget the whole thing?
Make up what you want?
...?
Of course not!
Go out and get more data


How much is enough?
Do we just do this over and over again until our precision is
sufficient (i.e., margin of error is small enough)?
Actually, there’s a way to deal with this . . .
Determining Necessary Sample
Sizes
Deriving the needed equations




Write a formula for the margin of error
Plug in known, required, or estimated values
Solve for n
Generally requires some sort of pilot sample
Demonstration


Quantitative data . . . (Equation 7-8)
Qualitative data . . . (Equation 7-13)
The resulting formulae are simple
Applications: Exercises 7-21 and 7-36
Rules of thumb for minimum sample sizes (if normal
distribution is to be applied)


Quantitative data: n>30
Qualitative data: np ≥ 5 and n(1-p) ≥ 5
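Solving the margin of error formula for n, as outlined above, might look like this in Python. It is a sketch under stated assumptions: the function names are hypothetical, s is assumed to come from a pilot sample, e is the desired margin of error, and results are rounded up because a fractional observation cannot be collected.

```python
import math
from statistics import NormalDist

def n_for_mean(s, e, confidence=0.95):
    """Quantitative data: n = (z * s / e)^2, rounded up."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * s / e) ** 2)

def n_for_proportion(e, p=0.5, confidence=0.95):
    """Qualitative data: n = z^2 * p(1 - p) / e^2, rounded up.

    The default p = 0.5 is the worst case (largest required n).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

# Hypothetical requirements, for illustration only
print(n_for_mean(s=15, e=2))     # pilot s = 15, want margin of 2
print(n_for_proportion(e=0.03))  # want margin of 3 percentage points
```

Halving the desired margin of error roughly quadruples the required sample size, since e appears squared in the denominator.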
The Student’s t Distribution
Not just a sample size issue
Used
Always with quantitative data whenever the population standard deviation (σ) is unknown
Never, ever with qualitative data!
However, it’s very close to the normal once sample
sizes get sufficiently large
Use special tables


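Work opposite of normal tables (which have probabilities on the inside): t values are inside, probabilities along the margin
Involve a third parameter: degrees of freedom (i.e.,
adjusted sample size; n − 1 when estimating a single average)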
We now call that standard score value t instead of z,
but it refers to the same thing
Examples (Exercise 7-3)
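To illustrate how the t table replaces the z-score, here is a minimal sketch of a 95% interval for an average using t. Python's standard library has no t distribution, so the sketch assumes a tiny lookup table holding a few two-tailed 95% critical values copied from a standard t table; the table name, function name, and sample data are all hypothetical.

```python
import math
from statistics import mean, stdev

# Two-tailed 95% critical values of t for selected degrees of freedom
# (a small excerpt from a standard t table; other df are not covered here)
T_95 = {4: 2.776, 9: 2.262, 14: 2.145, 19: 2.093, 24: 2.064, 29: 2.045}

def t_interval_95(data):
    """95% interval for the population average: x-bar +/- t * (s / sqrt(n))."""
    n = len(data)
    df = n - 1  # degrees of freedom for a single average
    if df not in T_95:
        raise ValueError(f"no table entry for df = {df}")
    margin = T_95[df] * stdev(data) / math.sqrt(n)
    return mean(data) - margin, mean(data) + margin

# Made-up sample of five observations
low, high = t_interval_95([10, 12, 9, 11, 13])
print(f"{low:.3f} to {high:.3f}")
```

With only five observations the t value (2.776) is noticeably larger than the z value (1.96); by df = 29 it has already dropped to 2.045, which shows how quickly t approaches the normal.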
Intermission: Some Excel Stuff
Chart formatting



Main title: 12 point
Axis titles: 10 point
Axis labels: 8 point
Printing


Use preview
Work with setup options
Portrait/landscape
Fit to page
Gridlines, row/column labels

Set print area if needed
The Other Kind of Inference:
Hypothesis Testing
Recall that there are only 2 types of inference


Estimation (confidence intervals)
Hypothesis testing
Hypothesis testing



Starts with a hypothesis (i.e., claim, assumption,
standard, etc.) about a population parameter (μ, π,
β1, distribution, . . . )
Sample results are compared with the hypothesis
Based upon how likely the observed results are,
given the hypothesis, a conclusion is made
Hypothesis Testing
Start by defining hypotheses

Null (H0):
What we’ll believe until proven otherwise
We state this first if we’re seeing if something’s changed
Alternate (HA):
Opposite of H0
If we’re trying to prove something, we state it as HA and
start with this, not the null
Then state willingness to make a wrong
conclusion (α)
Draw a sketch of the sampling distribution
Determine the decision rule (DR)
Gather data and compare results to DR
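The steps above can be sketched for a two-tailed test about an average. This is an illustration only: the function name and the numbers are hypothetical, and it assumes a sample large enough that z is an acceptable stand-in for t.

```python
import math
from statistics import NormalDist

def z_test_mean(x_bar, s, n, mu_0, alpha=0.05):
    """Two-tailed test of H0: mu = mu_0 against HA: mu != mu_0."""
    # Decision rule first: reject H0 if |z| exceeds the critical value
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Then compare the sample result with the decision rule
    z_obs = (x_bar - mu_0) / (s / math.sqrt(n))
    return z_obs, z_crit, abs(z_obs) > z_crit

# Hypothetical sample: n = 64, x-bar = 103, s = 12; claim is mu = 100
z_obs, z_crit, reject = z_test_mean(x_bar=103.0, s=12.0, n=64, mu_0=100.0)
print(f"z = {z_obs:.2f}, critical z = {z_crit:.2f}, reject H0: {reject}")
```

The critical value is computed before the observed statistic on purpose, mirroring the order of the steps: the decision rule depends only on α, never on the data.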
Errors in Hypothesis Testing
Type I: rejecting a true H0
Type II: accepting (failing to reject) a false H0
Probabilities
α = P(Type I)
β = P(Type II)
Power = P(Rejecting false H0) = 1 − β (i.e., no error when H0 is false)
Controlling risks


Decision rule controls α
Sample size controls β
Worst error:


Type III (solving the wrong problem)!
Hence, be sure H0 and HA are correct
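To see how sample size controls β, here is a hedged sketch for an upper-tailed test of an average. It assumes the population standard deviation is known and that a specific true average is of interest; both assumptions, the function name, and the numbers are purely illustrative.

```python
import math
from statistics import NormalDist

def beta_and_power(mu_0, mu_true, sigma, n, alpha=0.05):
    """Upper-tailed test of H0: mu = mu_0 vs HA: mu > mu_0.

    Returns (beta, power) when the true average is mu_true.
    """
    se = sigma / math.sqrt(n)
    # Decision rule: reject H0 if x-bar exceeds this critical value
    x_crit = mu_0 + NormalDist().inv_cdf(1 - alpha) * se
    # Type II error: x-bar lands below the critical value although H0 is false
    beta = NormalDist(mu_true, se).cdf(x_crit)
    return beta, 1 - beta

# Same alpha, same true average, two different sample sizes
for n in (25, 100):
    beta, power = beta_and_power(mu_0=100, mu_true=103, sigma=15, n=n)
    print(f"n = {n}: beta = {beta:.3f}, power = {power:.3f}")
```

Holding α fixed, the larger sample gives a smaller β and therefore more power, which is the sense in which the decision rule controls α while the sample size controls β.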
Hypothesis Testing Examples
Quantitative data (from text): 3, 4, 5
Qualitative data


We haven’t discussed this, but it works the
same!
Text: 28, 29, 30
Now, about p-values


Just another way to express the DR
Note: three types of DR
Summary of Objectives
Use sample data to generate and interpret interval estimates of
population parameters
Apply the margin of error concept in determining the quality of
parameter estimates
Work with the Student’s t distribution when developing
inferences for quantitative data
Determine the required number of sample observations for
achieving a desired precision in estimating population/process
parameters
Compare and contrast the two types of statistical inference
Define hypothesis testing and summarize the basic process
Discuss errors that can be made in statistical inferences
Use sample data to test claims about population parameters
Appendix
Sampling
[Diagram: Population (Parameter) and Sample (Statistic)]
Sample Size Formulae
Quantitative data (inferences concerning the
average):
n = Z²σ²/e² ≈ Z²s²/e²
Qualitative data (inferences concerning a
proportion)
Based on prior estimates: n = Z²π(1 − π)/e² ≈ Z²p(1 − p)/e²
Worst-case scenario: n = Z²(0.5)(1 − 0.5)/e² = 0.25 ∙ Z²/e²
Inferences: Using the Normal
Table in Reverse
For inference, we usually start with a
probability (i.e., a confidence level or error
probability)
Then we need to determine the z-score
(sometimes called a t-score) associated with
that probability
Finally, we determine the average or
proportion that is z (or t) standard errors
away from the base average
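In software, using the normal table in reverse corresponds to an inverse cumulative distribution lookup. A minimal sketch with Python's standard library follows; the base average and standard error are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Steps 1 and 2: start with a probability, find the z-score
z_two_sided = std_normal.inv_cdf(0.975)  # 95% confidence, 2.5% in each tail
z_one_tail = std_normal.inv_cdf(0.95)    # 5% error probability in one tail

# Step 3: find the value that is z standard errors from the base average
base_average, std_error = 50.0, 2.0      # hypothetical numbers
upper_cutoff = base_average + z_one_tail * std_error

print(round(z_two_sided, 3), round(z_one_tail, 3), round(upper_cutoff, 2))
```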
Stating the Decision Rule
First, note that no analysis should take place
before DR is in place!
Can state any of three ways



Critical value of observed statistic (x-bar or p-hat)
Critical value of test statistic (z)
Critical value of likelihood of observed result (p-value)
Generally, test statistics are used when results
are generated manually and p-values are used
when results are determined via computer
Always indicate on sketch of distribution
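The three ways of stating the decision rule always lead to the same conclusion. The following sketch shows this for an upper-tailed test of an average; the function name and sample values are hypothetical, and z is assumed to be appropriate (large sample).

```python
import math
from statistics import NormalDist

def three_decision_rules(x_bar, s, n, mu_0, alpha=0.05):
    """Upper-tailed test of H0: mu <= mu_0; returns the reject decision three ways."""
    se = s / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # critical value of the test statistic
    x_crit = mu_0 + z_crit * se               # critical value of the observed statistic
    z_obs = (x_bar - mu_0) / se
    p_value = 1 - NormalDist().cdf(z_obs)     # likelihood of a result at least this extreme
    return {
        "observed statistic": x_bar > x_crit,
        "test statistic": z_obs > z_crit,
        "p-value": p_value < alpha,
    }

# Hypothetical sample: n = 40, x-bar = 52.5, s = 8; H0 claims mu <= 50
decisions = three_decision_rules(x_bar=52.5, s=8.0, n=40, mu_0=50.0)
print(decisions)
```

All three entries agree because each rule is the same comparison expressed on a different scale: original units, standard errors, or probability.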