File - Professor Fell
Chapter 8
Parameter Estimates and Hypothesis Testing
Estimating the Population Standard Deviation
• The SD and the mean of a population are estimates because we don’t have all the scores (this is why it’s called “inferential statistics”: we are estimating)
• Estimating σ: the sample SD tends to underestimate σ
-this is due to sampling error
-since the sample SD underestimates σ, it is called a biased estimator
• To correct for the biased estimator, we subtract 1 from N (we divide by N – 1 instead of N)
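As a rough illustration of the N versus N – 1 correction, the sketch below computes the SD both ways. The function names and the sample scores are hypothetical and chosen only for this example.

```python
import math

def sd_biased(scores):
    """Descriptive SD: divides the sum of squared deviations by N."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / n)

def sd_corrected(scores):
    """Estimate of sigma: divides by N - 1 instead of N."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, mean = 5

print(sd_biased(scores))     # 2.0
print(sd_corrected(scores))  # about 2.138, slightly larger than the biased value
```

Dividing by the smaller number (N – 1) always gives a slightly larger SD, which offsets the tendency to underestimate σ.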
Estimating the Standard Error of the Mean
• Estimated Standard Error of the Mean: allows us to predict what the standard deviation of an entire distribution of means would be if we had measured the whole population.
-i.e., the standard deviation of the sampling distribution of means
-σM symbolizes the standard deviation of an entire distribution of means
-We can estimate σM from a single sample
*when we do this, it is called the Estimated Standard Error of the Mean
*symbolized as SEM
• Formula:
Note: the standard error of the mean can’t be a negative number, in the same way an SD can’t!
• Simplest raw score formula:
Standard Error of the Mean Worksheet
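The estimated standard error of the mean can be sketched from raw scores as follows. This is a minimal sketch and not necessarily the exact formula used on the worksheet. It assumes the usual definition: the corrected SD (N – 1 in the denominator) divided by √N. From raw scores this gives the same number as dividing the uncorrected SD (N in the denominator) by √(N – 1). The function names and scores are hypothetical.

```python
import math

def estimated_sem(scores):
    """Estimated standard error of the mean from a single sample."""
    n = len(scores)
    mean = sum(scores) / n
    sum_sq = sum((x - mean) ** 2 for x in scores)
    s = math.sqrt(sum_sq / (n - 1))      # corrected SD (N - 1)
    return s / math.sqrt(n)

def estimated_sem_alt(scores):
    """Same quantity via the uncorrected SD divided by sqrt(N - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    sum_sq = sum((x - mean) ** 2 for x in scores)
    sd = math.sqrt(sum_sq / n)           # uncorrected SD (N)
    return sd / math.sqrt(n - 1)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
sem = estimated_sem(scores)
print(round(sem, 4))  # about 0.7559
```

Because the result is built from a square root of squared deviations, it cannot come out negative.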
Estimating the Population Mean
• Point Estimate: the sample mean (M) is used to estimate μ
-the most precise (best) estimate
• Confidence Intervals: a range of values is estimated within which μ is assumed to be contained
-The goal is to bracket μ between specific high and low sample means.
EX: Let’s say we want to predict the average temperature for the next 4th of July. We could predict that the temp will fall between 95 & 105 degrees.
Q: What would be the problem with saying the temp will be between 0 & 120 degrees?
A: We wouldn’t be able to plan a picnic with that confidence interval because it isn’t precise at all!
**A .99 confidence interval isn’t as precise as a .95 interval because it gives a bigger range that the mean could fall in
Confidence Intervals
• Using the normal curve to calculate confidence intervals (when σ is known):
-A probability value can be calculated that indicates the degree of confidence we might have that μ is really in this interval
-Typically, we like to be at least 95% sure that μ falls within our confidence interval.
.95 confidence interval = M ± 1.96 σM
**At .95, any value that falls beyond z = ±1.96 leads to the conclusion that the mean is not part of the known population
.99 confidence interval = M ± 2.58 σM
**At .99, any value that falls beyond z = ±2.58 leads to the conclusion that the mean is not part of the known population
Note: Usually we don’t know σ, so we can’t calculate confidence intervals this way
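A minimal sketch of the normal-curve interval, assuming σ is known and σM = σ / √N. The sample values (M = 103, σ = 15, N = 36) and the function name are hypothetical. The z cutoffs come from Python's standard-library `statistics.NormalDist` and match the 1.96 and 2.58 above after rounding.

```python
import math
from statistics import NormalDist

def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """Interval M +/- z * sigma_M when the population sigma is known."""
    sigma_m = sigma / math.sqrt(n)                       # standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 or 2.58
    return mean - z * sigma_m, mean + z * sigma_m

low95, high95 = z_confidence_interval(103, 15, 36, confidence=0.95)
low99, high99 = z_confidence_interval(103, 15, 36, confidence=0.99)

print(round(low95, 2), round(high95, 2))  # 98.1 107.9
print(round(low99, 2), round(high99, 2))  # 96.56 109.44
```

The .99 interval is wider than the .95 interval: more confidence, less precision.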
Standard Error of the Mean Homework Due Next Class
t-distribution
• Use the t-distribution to calculate confidence intervals
-when σ is unknown or you have a small sample size (N = 30 or less)
*usually it is unknown & with sampling, we often have a small N!
• The smaller the sample size, the less certain we are of the normality of the entire sampling distribution.
• Therefore, we use a t-distribution, which is a family of distributions, each of which deviates from normality depending on sample size.
• t-distributions are distinguished by their degrees of freedom, which are based on sample size.
-df = N – 1
-as the df increases, the t-distribution becomes more like the normal distribution
• The critical values on the t-distribution are at the .05 (95% confidence level) and .01 (99% confidence level) levels (see Table C)
William Sealy Gosset: developer of the t-distribution
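To see how the t-distribution approaches the normal curve as df increases, the sketch below lists a few two-tailed .05 critical values. They are hardcoded from a standard t table and are assumed, not verified, to match the course's Table C. Each is compared with the normal-curve value of 1.960.

```python
# Two-tailed .05 critical values of t from a standard t table
# (assumed to agree with the course's Table C).
T_CRIT_05 = {1: 12.706, 5: 2.571, 10: 2.228, 20: 2.086,
             29: 2.045, 60: 2.000, 120: 1.980}
Z_CRIT_05 = 1.960  # normal-curve cutoff for the same confidence level

for df in sorted(T_CRIT_05):
    t = T_CRIT_05[df]
    print(f"df={df:>3}  t={t:.3f}  excess over z={t - Z_CRIT_05:.3f}")
```

The excess over 1.960 shrinks steadily as df grows, which is why small samples need wider intervals.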
Calculating Confidence Intervals
• Use the t-distribution to determine critical values
-NOTE: Critical values of t should use 3 digits after the decimal (as they appear in Tables C & D).
Step 1: Calculate df
df = N – 1
Step 2: Find ±t.05 or ±t.01 (as instructed)
look up the value in Table C at either the .05 or the .01 level using the df
Step 3: Complete this formula
*Note: the answer will be 2 numbers (a range)
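The three steps can be sketched as below, assuming the Step 3 formula is the usual interval M ± t × SEM. The sample values (N = 30, M = 50, SEM = 2.0) and the function name are hypothetical. The critical value 2.045 is the standard two-tailed .05 table value for df = 29.

```python
def t_confidence_interval(mean, sem, t_crit):
    """Return (low, high) for the interval M +/- t * SEM."""
    margin = t_crit * sem
    return mean - margin, mean + margin

n = 30
df = n - 1          # Step 1: degrees of freedom
t_05 = 2.045        # Step 2: table value at the .05 level for df = 29
low, high = t_confidence_interval(mean=50.0, sem=2.0, t_crit=t_05)  # Step 3

print(df, round(low, 2), round(high, 2))  # 29 45.91 54.09
```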
One Sample t Tests
• We can use t-tests to answer research questions
• t-tests answer statistical questions such as:
1) Is the difference between the sample mean & μ statistically significant?
2) What is the probability that a sample mean could deviate from μ by the amount that it does?
3) Is the sample from this population or not?
• Example Research Problem: A researcher theorizes that the population mean among college students taking the new Social Conformity Test is a “neutral” 100. Scores higher than 100 represent more conformity than average and scores lower than 100 represent less conformity than average. A random sample of 30 students was selected and found to have a mean of 103 with a standard deviation of 10.83.
-Q: Using the t-test, what statistical questions could we ask about this problem?
One Sample t Tests
• We must test the Null Hypothesis (Ho), the hypothesis of no difference
Ho: μ1 = μ2. There is no significant difference between the sample mean and the population mean.
OR
The sample is from the population.
Ha: μ1 ≠ μ2. The alternate hypothesis (the hypothesis of difference) says that the sample mean deviates enough from μ that we can conclude the sample is NOT from the population in question.
***Note: μ1 is our sample mean or our “point estimate”
-i.e., it’s written as a μ because, theoretically, it estimates our population mean
Calculating One Sample t Tests
Step 1: Determine the number of degrees of freedom
df = N – 1
Step 2: Calculate a t-value for our sample mean so we can see its relation to μ
EX: using the previous example, N = 30, M = 103, SD = 10.83
Step 3: Look at the Table C handout under the calculated df at the .05 level & compare it with the calculated t-value.
-If your calculated t-value is equal to or greater than the table value, then you reject the null hypothesis.
-We can NOT reject our null because the calculated t-value is smaller than the table value of 2.045 (df = 29)
**we conclude that there is no significant difference between the sample mean & μ
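The worked example can be checked with a short sketch, assuming the usual one-sample formula t = (M – μ) / SEM. The transcript does not say whether the reported SD of 10.83 was computed with N or with N – 1, so both conventions are shown. The variable and function names are hypothetical.

```python
import math

def one_sample_t(sample_mean, mu, sem):
    """t = (M - mu) / SEM."""
    return (sample_mean - mu) / sem

n, m, sd, mu = 30, 103, 10.83, 100
df = n - 1
t_table = 2.045  # table value at the .05 level for df = 29 (from the example)

# Assumption A: SD was computed with N, so SEM = SD / sqrt(N - 1)
t_a = one_sample_t(m, mu, sd / math.sqrt(n - 1))
# Assumption B: SD was computed with N - 1, so SEM = SD / sqrt(N)
t_b = one_sample_t(m, mu, sd / math.sqrt(n))

for label, t in (("A", t_a), ("B", t_b)):
    decision = "reject Ho" if abs(t) >= t_table else "fail to reject Ho"
    print(f"assumption {label}: t = {t:.2f}, df = {df}, {decision}")
```

Under either assumption the calculated t (about 1.49 or 1.52) is below 2.045, so the decision not to reject Ho is the same.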
One-sample t test Worksheet
One-sample t test Homework due Next Class
Statistical Significance
• In order to reject Ho, the t-value must fall within the .05 or .01 critical areas
• Using the previous example, let’s say we did reject the null hypothesis & accept the alternate hypothesis
• We must now say how unlikely our null hypothesis is
-Instead of saying: “It is highly unlikely that there is not a difference between the sample mean and population mean” OR “it is highly unlikely that our sample mean is from our population”
-We would say: “there is a significant difference between the population mean and the sample mean” or “the means have been found to be significantly different”
-i.e., if the sample mean really were from the population, a difference this large would be very, very unlikely
• Well, how low is very low?
-when a difference this large would occur less than 5% (or 1%) of the time if the null hypothesis were true
-i.e., “If Ho were true, there is less than a 5% chance of seeing a difference this large”
• Significance levels are called alpha levels and are represented by the alpha symbol (α) or p-values (p < .05 or p < .01)
Type I Error
• We use strict levels of significance to reduce the probability of committing a Type I Error (aka Alpha Error)
-Alpha Error: we reject the null (Ho) but we should have accepted it
*i.e., saying there is a difference between the groups when there isn’t
• The probability of making a Type I error is equal to alpha (.05 or .01)
-p < .05 (or α < .05) means that the probability of making a mistake in rejecting Ho is less than 5 in 100
*i.e., at the .05 significance level (.95 confidence), we are willing to make a mistake in rejecting Ho 5% of the time
-p < .01 (or α < .01) means that the probability of making a mistake in rejecting Ho is less than 1 in 100
*i.e., at the .01 significance level (.99 confidence), we are willing to make a mistake in rejecting Ho 1% of the time
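The claim that the Type I error rate equals alpha can be checked with a small simulation. To stay within the standard library, this sketch swaps in a z test with a known σ rather than a t test. The population values (μ = 100, σ = 15), the function name, and the seed are hypothetical. Every sample is drawn from the null population, so every rejection is a Type I error.

```python
import random
from statistics import NormalDist

def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Fraction of true-null samples that a two-tailed z test rejects."""
    rng = random.Random(seed)
    mu, sigma = 100.0, 15.0                      # hypothetical population
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    sigma_m = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / sigma_m
        if abs(z) >= z_crit:
            rejections += 1                      # Ho is true, so this is a Type I error
    return rejections / trials

rate = type_one_error_rate()
print(rate)  # should land close to 0.05
```

Rerunning with `alpha=0.01` should give a rate near 0.01, which is the sense in which a stricter level reduces Type I errors.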
Two Tail (Nondirectional)
Sample two-tail question: Is there a difference between groups?
• Use Table C for critical values
• It doesn’t matter if the t-value is positive or negative
One Tail (Directional)
Sample one-tail question: Does one group perform better (or worse), or score higher (or lower), than another? Is one drug more (or less) likely to be effective?
• Use Table D for critical values
• It DOES matter if the t-value is positive or negative, and which sign counts will depend on the Ha
• If you hypothesize a positive t-value & get a negative one or vice versa, you must accept Ho (even if it’s larger than the table value)
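The two decision rules can be sketched as one function. The function name and its parameters are hypothetical. The critical values are standard table values for df = 29 at the .05 level (2.045 two-tailed, 1.699 one-tailed) and are assumed to match Tables C and D.

```python
def reject_null(t_value, t_table, tails="two", predicted_sign=1):
    """Return True if Ho is rejected under the stated rule.

    tails="two": only the size of t matters.
    tails="one": t must also have the sign predicted by Ha
                 (predicted_sign is +1 or -1).
    """
    if tails == "two":
        return abs(t_value) >= t_table
    if t_value * predicted_sign <= 0:
        return False  # wrong direction: keep Ho no matter how large t is
    return abs(t_value) >= t_table

print(reject_null(-2.50, 2.045, tails="two"))                    # True
print(reject_null(1.80, 1.699, tails="one", predicted_sign=1))   # True
print(reject_null(-2.50, 1.699, tails="one", predicted_sign=1))  # False
```

The last call shows the directional rule: a large t in the wrong direction does not lead to rejecting Ho.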