ITED 434
Quality Assurance
Statistics Overview: From HyperStat Online
Textbook
http://davidmlane.com/hyperstat/index.html
by David Lane, Ph.D. Rice University
Class Objectives






Learn about the standard normal distribution
Discuss descriptive and inferential statistics
Learn how to calculate proportions under
the normal curve.
Discuss sampling distributions
Learn how to calculate sample size from a
normal distribution
Discuss two approaches to hypothesis testing:
– Classical method
– P-value method
28-Oct-03
ITED 434 - J. Wixson
2
Standard normal distribution


The standard normal distribution is a normal
distribution with a mean of 0 and a standard deviation
of 1. Normal distributions can be transformed to
standard normal distributions by the formula:
z = (X - μ) / σ
where X is a score from the original normal distribution, μ is
the mean of the original normal distribution, and σ is
the standard deviation of the original normal distribution.
Standard normal distribution

A z score always reflects the number of standard deviations
above or below the mean a particular score is.

For instance, if a person scored a 70 on a test with a mean of 50
and a standard deviation of 10, then they scored 2 standard
deviations above the mean. Converting the test scores to z
scores, an X of 70 would be:
z = (70 - 50) / 10 = 2
So, a z score of 2 means the original score was 2 standard
deviations above the mean. Note that the z distribution will only
be a normal distribution if the original distribution (X) is normal.
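The conversion above can be sketched in a few lines of Python. This is a minimal illustration; the function name `z_score` is made up for this example and is not part of the course material.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# The test-score example from the slide: X = 70, mean = 50, standard deviation = 10.
print(z_score(70, 50, 10))  # 2.0
```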
Applying the formula
Applying the formula will always produce a transformed variable
with a mean of zero and a standard deviation of one. However, the
shape of the distribution will not be affected by the transformation. If
X is not normal then the transformed distribution will not be normal
either. One important use of the standard normal distribution is for
converting between scores from a normal distribution and percentile
ranks.
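As a rough check of the claim that the transformation always yields a mean of zero and a standard deviation of one without changing the shape, the sketch below standardizes a small, deliberately skewed data set. The scores are invented for illustration, and the population formula (`pstdev`) is an assumption about which standard deviation is meant.

```python
from statistics import mean, pstdev

scores = [2, 3, 3, 4, 10]          # made-up, skewed data (not from the slides)
m, s = mean(scores), pstdev(scores)

# Apply z = (X - mean) / sd to every score.
z = [(x - m) / s for x in scores]

print(round(mean(z), 10), round(pstdev(z), 10))
# The order and relative spacing of the scores are unchanged, so the
# transformed distribution is just as skewed as the original.
print([round(v, 2) for v in z])
```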
Areas under portions of the standard
normal distribution are shown to the
right. About .68 (.34 + .34) of the
distribution is between -1 and 1, while
about .95 of the distribution is
between -2 and 2.
Area under a portion of the
normal curve - Example 1
If a test is normally distributed with a mean of
60 and a standard deviation of 10, what
proportion of the scores are above 85?
A score of 85 is (85 - 60) / 10 = 2.5 standard deviations above
the mean. From the z table, .9938 of the scores are less than or
equal to a score 2.5 standard deviations above the mean. It follows
that only 1 - .9938 = .0062 of the scores are above a score 2.5
standard deviations above the mean. Therefore, only .0062 of the
scores are above 85.
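The table lookup in Example 1 can be reproduced with Python's standard library, as a sketch that swaps the printed z table for `statistics.NormalDist` (available in Python 3.8 and later). The variable names are chosen for this example only.

```python
from statistics import NormalDist

test = NormalDist(mu=60, sigma=10)      # the test described in Example 1

z = (85 - test.mean) / test.stdev       # standard deviations above the mean
below = test.cdf(85)                    # proportion of scores at or below 85
above = 1 - below                       # proportion of scores above 85

print(f"z = {z}, below = {below:.4f}, above = {above:.4f}")
```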
Example 2
Suppose you wanted to know the proportion of students receiving
scores between 70 and 80 on the same test (mean of 60, standard
deviation of 10). The approach is to figure out the proportion of
students scoring below 80 and the proportion below 70. The
difference between the two proportions is the proportion scoring
between 70 and 80.
First, calculate the proportion below 80. Since 80 is 20 points
above the mean and the standard deviation is 10, 80 is 2 standard
deviations above the mean. The z table is used to determine
that .9772 of the scores are below a score 2 standard deviations
above the mean.
Example 2 Cont’d
Next, calculate the proportion below 70. Since 70 is 10 points
above the mean and the standard deviation is 10, 70 is 1 standard
deviation above the mean: z = (70 - 60) / 10 = 1. The z table
shows that .8413 of the scores are below a score 1 standard
deviation above the mean.
To calculate the proportion between 70 and 80, subtract the
proportion below 70 from the proportion below 80.
That is, .9772 - .8413 = .1359.
Therefore, only 13.59% of the scores are between 70 and 80.
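Example 2 amounts to subtracting two cumulative proportions. A minimal sketch, again using `statistics.NormalDist` in place of the z table and assuming the same test as Example 1 (mean 60, standard deviation 10):

```python
from statistics import NormalDist

test = NormalDist(mu=60, sigma=10)

below_80 = test.cdf(80)            # z = 2, about .9772
below_70 = test.cdf(70)            # z = 1, about .8413
between = below_80 - below_70      # proportion scoring between 70 and 80

print(f"{below_80:.4f} - {below_70:.4f} = {between:.4f}")
```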
Example 3
Assume a test is normally distributed with a mean of 100
and a standard deviation of 15. What proportion of the
scores would be between 85 and 105?
The solution to this problem is similar to the solution to
the last one. The first step is to calculate the proportion of
scores below 85.
Next, calculate the proportion of scores below 105.
Finally, subtract the first result from the second to find
the proportion scoring between 85 and 105.
Example 3
Begin by calculating the proportion
below 85. 85 is one standard deviation
below the mean:
z = (85 - 100) / 15 = -1
Using the z table with the value of
-1 for z, the area below -1 (or 85 in
terms of the raw scores) is .1587.
Do the same for 105:
z = (105 - 100) / 15 = .333
Example 3
The z table shows that the
proportion scoring below a z of .333
(105 in raw scores) is .6304.
The difference is .6304 - .1587
= .4717. So .4717 of the scores
are between 85 and 105.
Go to: http://davidmlane.com/hyperstat/z_table.html for
Z table.
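As an alternative to the printed table, Example 3 can be checked in software. The sketch below uses `statistics.NormalDist`; because it does not round z to table precision, its result may differ from the table-based answer in the last decimal place or two.

```python
from statistics import NormalDist

test = NormalDist(mu=100, sigma=15)     # the test described in Example 3

z_85 = (85 - 100) / 15                  # -1.0
z_105 = (105 - 100) / 15                # about .333

below_85 = test.cdf(85)
below_105 = test.cdf(105)
between = below_105 - below_85          # proportion scoring between 85 and 105

print(f"z = {z_85:.3f} and {z_105:.3f}")
print(f"{below_105:.4f} - {below_85:.4f} = {between:.4f}")
```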
Sampling Distributions
Sampling Distributions
If you compute the mean of a sample of 10 numbers, the
value you obtain will not equal the population mean
exactly; by chance it will be a little bit higher or a little
bit lower.
If you sampled sets of 10 numbers over and over again
(computing the mean for each set), you would find that
some sample means come much closer to the population
mean than others. Some would be higher than the
population mean and some would be lower.
Imagine sampling 10 numbers and computing the mean
over and over again, say about 1,000 times, and then
constructing a relative frequency distribution of those
1,000 means.
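The thought experiment above is easy to simulate. The following is a sketch under stated assumptions: the population (10,000 roughly normal scores with mean 50 and standard deviation 10), the seed, and the one-point bin width are all invented for illustration.

```python
import random
from collections import Counter
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of scores.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = mean(population)

# Draw 1,000 samples of 10 scores each and keep each sample's mean.
sample_means = [mean(random.sample(population, 10)) for _ in range(1000)]

# Relative frequency distribution of the means, binned to the nearest point.
counts = Counter(round(m) for m in sample_means)
relative_freq = {b: c / len(sample_means) for b, c in sorted(counts.items())}

higher = sum(m > population_mean for m in sample_means)
print(f"population mean: {population_mean:.2f}")
print(f"sample means above it: {higher}, at or below it: {len(sample_means) - higher}")
for b, freq in relative_freq.items():
    print(f"{b:>3} {freq:.3f}")
```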
[Figures: relative frequency distributions of the sample means for 5, 10, 15, 20, 100, 1,000, and 10,000 samples]
Sampling Distributions
The distribution of means is a very good approximation
to the sampling distribution of the mean.
The sampling distribution of the mean is a theoretical
distribution that is approached as the number of samples
in the relative frequency distribution increases.
With 1,000 samples, the relative frequency distribution
is quite close; with 10,000 it is even closer.
As the number of samples approaches infinity, the
relative frequency distribution approaches the sampling
distribution.
Sampling Distributions
The sampling distribution of the mean for a sample size of
10 was just an example; there is a different sampling
distribution for other sample sizes.
Also, keep in mind that the relative frequency distribution
approaches a sampling distribution as the number of
samples increases, not as the sample size increases, since
there is a different sampling distribution for each sample
size.
Sampling Distributions
A sampling distribution can also be defined as the
relative frequency distribution that would be
obtained if all possible samples of a particular
sample size were taken.
For example, the sampling distribution of the mean
for a sample size of 10 would be constructed by
computing the mean for each of the possible ways
in which 10 scores could be sampled from the
population and creating a relative frequency
distribution of these means.
Although these two definitions may seem different,
they are actually the same: Both procedures
produce exactly the same sampling distribution.
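For a very small population, the "all possible samples" definition can be carried out directly. The sketch below assumes a hypothetical population of three scores, a sample size of 2, and sampling with replacement; none of these values come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical population
n = 2                    # sample size

# Every possible ordered sample of size n, drawn with replacement (9 in all).
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Relative frequency of each possible sample mean.
sampling_dist = {
    m: Fraction(c, len(means)) for m, c in sorted(Counter(means).items())
}

for m, p in sampling_dist.items():
    print(f"mean {float(m):.1f}: {p}")
```

With only nine possible samples the whole sampling distribution fits on the screen; for realistic populations the number of possible samples is far too large, which is why repeated random sampling is used to approximate it.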
Sampling Distributions
Statistics other than the mean have sampling
distributions too. The sampling distribution of the
median is the distribution that would result if the
median instead of the mean were computed in each
sample.
Students often define "sampling distribution" as the
sampling distribution of the mean. That is a serious
mistake.
Sampling distributions are very important since
almost all inferential statistics are based on sampling
distributions.
Sampling Distribution of the mean
The sampling distribution of the mean is a very important
distribution. In later chapters you will see that it is used to
construct confidence intervals for the mean and for significance
testing.
Given a population with a mean of μ and a standard deviation of
σ, the sampling distribution of the mean has a mean of μ and a
standard deviation of σ / √N, where N is the sample size.
The standard deviation of the sampling distribution of the mean is
called the standard error of the mean. It is designated by the
symbol σ_M.
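The standard error formula, σ / √N, is simple enough to tabulate. A minimal sketch, assuming a hypothetical population standard deviation of 10; the function name `standard_error` is chosen for this example.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of N."""
    return sigma / math.sqrt(n)


sigma = 10  # hypothetical population standard deviation
for n in (1, 4, 25, 100):
    print(f"N = {n:>3}: standard error = {standard_error(sigma, n)}")
```

The printed values shrink as N grows, while the mean of the sampling distribution stays at μ.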
Sampling Distribution of the mean
Note that the spread of the sampling distribution of the mean
decreases as the sample size increases.
An example of the effect of sample size is shown above.
Notice that the mean of the distribution is not affected by
sample size.
Spread
A variable's spread is the degree to which scores on the variable
differ from each other.
If every score on the variable were
about equal, the variable would have
very little spread.
There are many measures of spread.
The distributions on the right side of
this page have the same mean but
differ in spread: The distribution on
the bottom is more spread out.
Variability and dispersion are
synonyms for spread.
Standard Error in Relation to Sample Size
Notice that the graph is consistent
with the formulas. If σ_M = 10 for a
sample size of 1, then σ_M should
be equal to 10 / √25 = 2 for a sample
size of 25. When s is used as an
estimate of σ, the estimated
standard error of the mean is
s / √N.
The standard error of the mean is
used in the computation of
confidence intervals and
significance tests for the mean.
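When σ is unknown, the estimate s / √N can be computed from the sample itself. The sketch below uses made-up data and assumes s is the sample standard deviation with N - 1 in the denominator (`statistics.stdev`); the slides do not say which formula for s is intended.

```python
import math
from statistics import stdev


def estimated_standard_error(sample):
    """s / sqrt(N), where s is the sample standard deviation (N - 1 denominator)."""
    return stdev(sample) / math.sqrt(len(sample))


sample = [4, 8, 6, 5, 7]   # made-up scores
print(round(estimated_standard_error(sample), 4))
```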
Figure 11.3: Width of confidence interval versus number of tests (N).
SEE TABLE 11.1
Summary of confidence limit formulas
SEE TABLE 10.6
Summary of common probability distributions.
Central Limit Theorem
The central limit theorem states that given a distribution
with a mean μ and variance σ², the sampling distribution
of the mean approaches a normal distribution with a
mean μ and a variance σ²/N as N, the sample size,
increases.
Go to Central Limit Demonstration:
http://oak.cats.ohiou.edu/~wallacd1/ssample.html
Central Limit Theorem

The central limit theorem also implies that the larger the
sample size, the more normal the sampling distribution of the
mean will be.

Thus, the sampling distribution of the mean will become
increasingly normal in shape as the sample size increases.

For a sufficiently large sample size, the sampling distribution
of the mean will be approximately normal regardless of the shape
of the population distribution.

Whether the population distribution is normal, positively or
negatively skewed, unimodal or bimodal in shape, the
sampling distribution of the mean will approach a normal
shape.
Central Limit Theorem (Cont’d)

In the following example we start out with a uniform
distribution. The sampling distribution of the mean,
however, will contain variability in the mean values
we obtain from sample to sample. Thus, the sampling
distribution of the mean will have a normal shape,
even though the population distribution does not.
Notice that because we are taking a sample of values
from all parts of the population, the mean of the
samples will be close to the center of the population
distribution.
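The uniform-distribution example can be simulated as a rough check. This is a sketch under stated assumptions: a uniform population on [0, 1] (so μ = 0.5 and σ² = 1/12), a sample size of 30, and 5,000 samples, all chosen for illustration.

```python
import random
from statistics import mean, pstdev

random.seed(1)  # fixed seed so the sketch is repeatable

N = 30               # sample size (hypothetical choice)
NUM_SAMPLES = 5000   # number of samples drawn (hypothetical choice)

# Each entry is the mean of one sample of N values from a uniform population.
sample_means = [
    mean([random.uniform(0, 1) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

center = mean(sample_means)
spread = pstdev(sample_means)
expected_spread = (1 / 12 / N) ** 0.5   # sigma / sqrt(N)

# In a normal distribution about .68 of the values lie within one
# standard deviation of the mean.
within_one = sum(abs(m - 0.5) <= expected_spread for m in sample_means) / NUM_SAMPLES

print(f"mean of sample means: {center:.4f} (population mean 0.5)")
print(f"SD of sample means:   {spread:.4f} (sigma / sqrt(N) = {expected_spread:.4f})")
print(f"share within one standard error: {within_one:.3f}")
```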
Hypothesis Testing
Classical Approach

The Classical Approach to hypothesis testing is to
compare a test statistic and a critical value. It is best
used for distributions which give areas and require
you to look up the critical value (like the Student's t
distribution) rather than distributions which have you
look up a test statistic to find an area (like the normal
distribution).

The Classical Approach also has three different
decision rules, depending on whether it is a left tail,
right tail, or two tail test.

One problem with the Classical Approach is that if a
different level of significance is desired, a different
critical value must be read from the table.
Why not accept the null hypothesis?

A null hypothesis is not accepted just because it is
not rejected.

Data not sufficient to show convincingly that a
difference between means is not zero do not prove
that the difference is zero.

No experiment can distinguish between the case of
no difference between means and an extremely
small difference between means.

If data are consistent with the null hypothesis, they
are also consistent with other similar hypotheses.
Left Tailed Test
H1: parameter < value
Notice the inequality points to the left
Decision Rule: Reject H0 if t.s. < c.v.
Right Tailed Test
H1: parameter > value
Notice the inequality points to the right
Decision Rule: Reject H0 if t.s. > c.v.
Two Tailed Test
H1: parameter ≠ value
Another way to write not equal is < or >
Notice the inequality points to both sides
Decision Rule: Reject H0 if t.s. < c.v. (left) or
t.s. > c.v. (right)
The decision rule can be summarized as follows:
Reject H0 if the test statistic falls in the critical region
(Reject H0 if the test statistic is more extreme than the critical value)
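The three decision rules can be written as one small function. The sketch below assumes a z test, so that critical values come from the standard normal distribution via `statistics.NormalDist.inv_cdf`; for a t test the critical value would be read from a t table instead. The function name and the `tail` labels are invented for this example.

```python
from statistics import NormalDist


def reject_h0(test_stat, alpha, tail):
    """Classical approach for a z test: compare the test statistic with the critical value."""
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return test_stat < std.inv_cdf(alpha)
    if tail == "right":
        return test_stat > std.inv_cdf(1 - alpha)
    if tail == "two":
        cv = std.inv_cdf(1 - alpha / 2)   # alpha is split between the two tails
        return test_stat < -cv or test_stat > cv
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(reject_h0(2.0, 0.05, "two"))   # critical values are about -1.96 and 1.96
print(reject_h0(2.0, 0.01, "two"))   # critical values are about -2.576 and 2.576
```

The two calls show the drawback noted earlier: changing the level of significance means finding a new critical value.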
P-Value Approach

The P-Value Approach, short for Probability Value, approaches
hypothesis testing in a different manner. Instead of comparing
z-scores or t-scores as in the classical approach, you're
comparing probabilities, or areas.

The level of significance (alpha) is the area in the critical region.
That is, the area in the tails to the right or left of the critical
values.

The p-value is the area to the right or left of the test statistic. If it
is a two tail test, then look up the probability in one tail and
double it.

If the test statistic is in the critical region, then the p-value will be
less than the level of significance. It does not matter whether it is
a left tail, right tail, or two tail test. This rule always holds.

Reject the null hypothesis if the p-value is less than the
level of significance.
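The p-value calculation described above can be sketched for a z test statistic, using `statistics.NormalDist` for the areas. The function name and the `tail` labels are invented for this example.

```python
from statistics import NormalDist


def p_value(z, tail):
    """Area beyond the test statistic under the standard normal curve."""
    std = NormalDist()
    if tail == "left":
        return std.cdf(z)
    if tail == "right":
        return 1 - std.cdf(z)
    if tail == "two":
        return 2 * (1 - std.cdf(abs(z)))   # one-tail area, doubled
    raise ValueError("tail must be 'left', 'right', or 'two'")


alpha = 0.05
p = p_value(2.0, "two")
print(f"p = {p:.4f}, reject H0: {p < alpha}")
```

The same comparison, p against alpha, is used for left tail, right tail, and two tail tests.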
P-Value Approach (Cont’d)

You will fail to reject the null hypothesis if the p-value is greater
than or equal to the level of significance.

The p-value approach is best suited for the normal distribution
when doing calculations by hand. However, many statistical
packages will give the p-value but not the critical value. This is
because it is easier for a computer or calculator to find the
probability than it is to find the critical value.

Another benefit of the p-value is that the statistician immediately
knows at what level the testing becomes significant. That is, a
p-value of 0.06 would lead to rejecting the null hypothesis at a 0.10
level of significance, but would fail to reject it at a 0.05 level
of significance.
Warning: Do not decide on the level of significance after
calculating the test statistic and finding the p-value.
P-Value Approach (Cont’d)
Any proportion equivalent to the following statement is
correct:
The test statistic is to the p-value as the critical
value is to the level of significance, and the test is
known as a “significance test.”
The null hypothesis is rejected if p is at or below the
significance level; it is not rejected if p is above the
significance level.
The degree to which p ends up being above or below
the significance level does not matter.
Hypothesis Testing

Hypothesis testing is a method of inferential statistics.

Researchers very frequently put forward a null
hypothesis in the hope that they can discredit it.

Data are then collected and the viability of the null
hypothesis is determined in light of the data.

If the data are very different from what would be
expected under the assumption that the null hypothesis
is true, then the null hypothesis is rejected.

If the data are not greatly at variance with what would
be expected under the assumption that the null
hypothesis is true, then the null hypothesis is not
rejected.
Hypothesis Testing

Note: Failure to reject the null hypothesis is not the
same thing as accepting the null hypothesis.
Steps to Hypothesis Testing
1. The first step in hypothesis testing is to specify the null hypothesis
(H0) and the alternative hypothesis (H1). If the research concerns
whether one method of presenting pictorial stimuli leads to better
recognition than another, the null hypothesis would most likely be
that there is no difference between methods (H0: µ1 - µ2 = 0). The
alternative hypothesis would be H1: µ1 ≠ µ2. If the research
concerned the correlation between grades and SAT scores, the null
hypothesis would most likely be that there is no correlation (H0:
ρ = 0). The alternative hypothesis would be H1: ρ ≠ 0.
2. The next step is to select a significance level. Typically the .05 or the
.01 level is used.
3. The third step is to calculate a statistic analogous to the parameter
specified by the null hypothesis. If the null hypothesis were defined by
the parameter µ1 - µ2, then the statistic M1 - M2 would be computed.
Steps to Hypothesis Testing
4. The fourth step is to calculate the probability value (often
called the p value), which is the probability of obtaining a
statistic as different or more different from the parameter
specified in the null hypothesis as the statistic computed from
the data. The calculations are made assuming that the null
hypothesis is true.
5. The probability value computed in Step 4 is compared with the
significance level chosen in Step 2. If the probability is less
than or equal to the significance level, then the null hypothesis
is rejected; if the probability is greater than the significance
level, then the null hypothesis is not rejected. When the null
hypothesis is rejected, the outcome is said to be "statistically
significant"; when the null hypothesis is not rejected, the
outcome is said to be "not statistically significant."
Steps to Hypothesis Testing

6. If the outcome is statistically significant, then the null hypothesis
is rejected in favor of the alternative hypothesis. If the rejected null
hypothesis were that µ1 - µ2 = 0, then the alternative hypothesis
would be that µ1 ≠ µ2. If M1 were greater than M2, then the
researcher would naturally conclude that µ1 > µ2.

7. The final step is to describe the result and the statistical
conclusion in an understandable way. Be sure to present the
descriptive statistics as well as whether the effect was significant
or not. For example, a significant difference between a group that
received a drug and a control group might be described as follows:
– Subjects in the drug group scored significantly higher (M = 23)
than did subjects in the control group (M = 17), t(18) = 2.4, p =
0.027.
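Steps 4 and 5 can be illustrated with the reported result t(18) = 2.4. Python's standard library has no t distribution, so the sketch below integrates the Student's t density numerically with Simpson's rule; this is a stand-in for a t table or a statistics package, not the method used in the course, and the function names are invented for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: total area beyond -|t| and +|t| (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total    # area between -|t| and +|t|
    return 1 - central


alpha = 0.05                          # significance level chosen in Step 2
p = two_tailed_p(2.4, 18)             # Step 4
print(f"p = {p:.3f}")
print("reject H0" if p <= alpha else "do not reject H0")   # Step 5
```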
Steps to Hypothesis Testing

The statement that "t(18) =2.4" has to do with how the
probability value (p) was calculated. A small minority of
researchers might object to two aspects of this wording.

First, some believe that the significance level rather than the
probability level should be reported. The argument for reporting
the probability value is presented in another section.

Second, since the alternative hypothesis was stated as µ1 ≠ µ2,
some might argue that it can only be concluded that the
population means differ and not that the population mean for the
drug group is higher than the population mean for the control
group.
Steps to Hypothesis Testing

This argument is misguided. Intuitively, there are strong reasons
for inferring that the direction of the difference in the population
is the same as the difference in the sample. There is also a
more formal argument. A non-significant effect might be
described as follows:
– Although subjects in the drug group scored higher (M = 23)
than did subjects in the control group (M = 20), the
difference between means was not significant, t(18) = 1.4, p
= .179.

It would not have been correct to say that there was no
difference between the performance of the two groups. There
was a difference. It is just that the difference was not large
enough to rule out chance as an explanation of the difference. It
would also have been incorrect to imply that there is no
difference in the population. Be sure not to accept the null
hypothesis.