Normal Curve and Hypothesis Testing

Lecture 16:
UNIVARIATE STATISTICS, THE NORMAL
CURVE AND INTRO TO HYPOTHESIS TESTING
Assn 2 Comments
Using Central Tendencies in Recoding
 “collapsing” variables
Dispersion
 Range
Difference between the highest value and the lowest value.
 Standard Deviation
A statistic that describes how tightly the values are clustered around the mean.
 Variance
A measure of how much spread a distribution has. Computed as the average squared deviation of each value from the mean.
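As a rough illustration of the three measures above, here is a minimal Python sketch. The scores are made-up data, and the variance divides by n (the "average squared deviation" wording); a sample formula that divides by n - 1 would give a slightly larger value.

```python
import statistics

# Hypothetical scores, chosen only so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value.
value_range = max(scores) - min(scores)

# Variance: average squared deviation of each value from the mean.
mean = statistics.fmean(scores)
variance = sum((x - mean) ** 2 for x in scores) / len(scores)

# Standard deviation: square root of the variance.
std_dev = variance ** 0.5

print(value_range, mean, variance, std_dev)  # 7 5.0 4.0 2.0
```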
Properties of Standard Deviation
 Variance is just the square of the S.D. (or, the S.D. is the square root of the variance)
 If a constant is added to all scores, it has no impact on the S.D.
 If all scores are multiplied by a constant, the dispersion changes: the S.D. is multiplied by the absolute value of the constant, and the variance by its square
S = standard deviation
X = individual score
M = mean of all scores
n = sample size (number of scores)
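The formula that this legend belongs to did not survive in the transcript. Assuming it is the usual S = sqrt( sum of (X - M)^2 / n ), the properties listed on this slide can be checked with a short Python sketch. The data and the constants 10 and 3 are hypothetical, and `pstdev`/`pvariance` are the divide-by-n versions (the properties hold equally for the n - 1 versions).

```python
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd = statistics.pstdev(scores)            # divide-by-n standard deviation
var = statistics.pvariance(scores)        # divide-by-n variance

# Adding a constant shifts every score but leaves the spread unchanged.
shifted = [x + 10 for x in scores]
sd_shifted = statistics.pstdev(shifted)

# Multiplying by a constant stretches the spread.
scaled = [x * 3 for x in scores]
sd_scaled = statistics.pstdev(scaled)
var_scaled = statistics.pvariance(scaled)

print(sd, sd_shifted)        # same S.D. before and after adding 10
print(sd_scaled, sd * 3)     # S.D. is multiplied by 3
print(var_scaled, var * 9)   # variance is multiplied by 3 squared
```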
Why Variance Matters…
 In many ways, this is the purpose of many statistical
tests: explaining the variance in a dependent
variable through one or more independent variables.
Common Data Representations
 Histograms (hist)
 Simple graphs of the frequency of groups of scores.
 Stem-and-Leaf Displays (stem)
 Another way of displaying dispersion, particularly useful
when you do not have large amounts of data.
 Box Plots (graph box)
 Yet another way of displaying dispersion. The box spans the 25th to 75th percentiles, the line within the box shows the median, and the “whiskers” show the range of values (min and max, apart from any extreme values plotted as separate points)
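The commands in parentheses above are Stata commands and are not reproduced here. As a language-neutral illustration, the following Python sketch computes the numbers a box plot draws (minimum, 25th percentile, median, 75th percentile, maximum). The data are hypothetical, and the "inclusive" quantile method is one of several conventions; other software may return slightly different quartiles.

```python
import statistics

# Hypothetical scores, already free of extreme outliers.
scores = [1, 3, 4, 6, 7, 8, 10, 12, 15]

# Quartiles: the box runs from q1 to q3, with a line at the median.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# With no outliers, the whiskers reach the minimum and maximum.
low, high = min(scores), max(scores)

# Interquartile range: the height of the box.
iqr = q3 - q1

print(low, q1, median, q3, high, iqr)  # 1 4.0 7.0 10.0 15 6.0
```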
Issues with Normal Distributions
 Skewness
 Kurtosis
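The slide only names these two issues. One common way to quantify them, assumed here rather than taken from the lecture, is through standardized moments: skewness is the average cubed z-score (0 for a symmetric distribution), and kurtosis is the average z-score raised to the fourth power (3 for a normal distribution). The helper names and data below are hypothetical.

```python
import statistics

def skewness(data):
    """Average cubed z-score; positive means a longer right tail."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

def kurtosis(data):
    """Average fourth-power z-score; a normal curve has a value of 3."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 4 for x in data) / len(data)

symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric scores
right_skewed = [1, 1, 2, 2, 3, 10]   # hypothetical scores with a long right tail

print(skewness(symmetric), skewness(right_skewed))
print(kurtosis(symmetric))
```

Some packages report "excess" kurtosis (this value minus 3), so it is worth checking which convention a given output uses.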
Estimation and Hypothesis Tests: The Normal
Distribution
 A key assumption for many variables (or
specifically, their scores/values) is that they are
normally distributed.
 In large part, this is because the most common
statistics (chi-square, t, F test) rest on this
assumption.
Hypothesis Testing and the ‘normal’ Curve
Logic of Hypothesis Testing
 Null Hypothesis:
 H0: μ1 = μc
 μ1 is the intervention population mean
 μc is the control population mean
 In English…
 “There is no significant difference between the intervention population mean and the control population mean”
 Alternative Hypotheses:
 H1: μ1 < μc
 H1: μ1 > μc
 H1: μ1 ≠ μc
Conventions in Stating Hypotheses
 Three basic approaches to using variables in hypotheses:
 Compare groups on an independent variable to see the impact on a dependent variable.
 Relate one or more independent variables to a dependent variable.
 Describe responses to the independent, mediating, or dependent variable.
One and Two-Tailed Tests: Defining Critical Regions
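The figure for this slide is not in the transcript. As a sketch of the idea, the critical region is the set of z values extreme enough to reject H0, and where it sits depends on which alternative hypothesis is being tested. The example below assumes a significance level of .05 and uses Python's standard normal distribution; the function name is hypothetical.

```python
from statistics import NormalDist

alpha = 0.05                 # assumed significance level
std_normal = NormalDist()    # standard normal: mean 0, S.D. 1

# One-tailed tests put all of alpha in a single tail.
upper_cutoff = std_normal.inv_cdf(1 - alpha)   # for H1: mu1 > muc
lower_cutoff = std_normal.inv_cdf(alpha)       # for H1: mu1 < muc

# A two-tailed test splits alpha between both tails.
two_tail_cutoff = std_normal.inv_cdf(1 - alpha / 2)   # for H1: mu1 != muc

def in_critical_region(z, tail):
    """Return True if z falls in the rejection region for the given test."""
    if tail == "upper":
        return z > upper_cutoff
    if tail == "lower":
        return z < lower_cutoff
    return abs(z) > two_tail_cutoff

print(round(upper_cutoff, 3), round(lower_cutoff, 3), round(two_tail_cutoff, 3))
print(in_critical_region(1.67, "upper"), in_critical_region(1.67, "two"))
```

The same z value can fall inside the critical region of a one-tailed test but outside that of a two-tailed test, which is why the direction of the hypothesis should be stated before looking at the data.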
The z-score
 Infinitely many normal distributions are possible, one for each combination of mean and variance, but all are related to a single distribution (the standard normal, with mean 0 and S.D. 1).
 Standardizing a group of scores
changes the scale to one of
standard deviation units.
z = (Y − μ) / σ
 Allows for comparisons with
scores that were originally on a
different scale.
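A small sketch of that comparison in Python: two scores from tests with different scales are converted to standard deviation units and then compared directly. The function name, test names, means and S.D.s are all hypothetical.

```python
def z_score(y, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (y - mean) / sd

# Hypothetical test A: mean 70, S.D. 5. A score of 80 is 2 S.D. above the mean.
z_a = z_score(80, mean=70, sd=5)

# Hypothetical test B: mean 500, S.D. 100. A score of 650 is 1.5 S.D. above the mean.
z_b = z_score(650, mean=500, sd=100)

# The raw scores (80 vs. 650) are not comparable, but the z-scores are.
print(z_a, z_b, z_a > z_b)  # 2.0 1.5 True
```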
z-scores (continued)
 Tells us where a score is located within a distribution: specifically, how many standard deviation units the score is above or below the mean.
 Properties
 The mean of a set of z-scores is zero (why?)
 The variance (and therefore standard deviation) of a set of z-scores is 1.
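Both properties can be checked numerically. Deviations from the mean always sum to zero, so the z-scores average to zero, and dividing by the S.D. rescales the spread to 1. The sketch below uses hypothetical scores and the divide-by-n standard deviation.

```python
import statistics

# Hypothetical raw scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)

# Standardize every score.
z_scores = [(x - mean) / sd for x in scores]

z_mean = statistics.fmean(z_scores)
z_variance = statistics.pvariance(z_scores)

print(z_scores)
print(z_mean, z_variance)
```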
Area under the normal curve
 Example: you have a variable x with a mean of 500 and an S.D. of 15. How common is a score of 525?

Z = (525 − 500) / 15 ≈ 1.67

If we look up the z-statistic of 1.67 in a z-score table, we find that the proportion of scores less than our value is .9525.

Or, a score of 525 exceeds .9525 of the population, leaving only .0475 above it (p < .05).
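The table lookup can be reproduced in Python as a check. This sketch assumes x is exactly normal with the stated mean and S.D.; because it uses the unrounded z (about 1.667 rather than 1.67), the proportion comes out near .952 instead of the tabled .9525.

```python
from statistics import NormalDist

# The distribution from the example: mean 500, S.D. 15.
x_dist = NormalDist(mu=500, sigma=15)

# z-score for a score of 525.
z = (525 - x_dist.mean) / x_dist.stdev

# Proportion of scores below 525 (area under the curve to the left).
below = x_dist.cdf(525)

# Proportion of scores above 525 (the upper-tail area).
above = 1 - below

print(round(z, 2), round(below, 4), round(above, 4))
```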

Z-score table
z = (Y − μ) / σ
When to use z-score and why?
Advantages?
Disadvantages?
Sampling Distributions, N, and Expected Values
 Sampling Distribution
 The probability distribution of the sample means
 As ‘N’ increases, the sample means vary less around the population mean (their spread shrinks in proportion to 1/√N), so we can make better judgments about the possible population mean.
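A simulation makes this concrete. The sketch below assumes a normal population with mean 500 and S.D. 15 (borrowed from the earlier example), draws many samples of each size, and compares the spread of the sample means with the theoretical value of S.D. / sqrt(N). The function name, sample sizes, number of draws and random seed are arbitrary choices.

```python
import random
import statistics

random.seed(16)  # fixed seed so the run is repeatable
population_mean, population_sd = 500, 15

def sample_means(n, draws=2000):
    """Draw `draws` samples of size n and return the mean of each one."""
    means = []
    for _ in range(draws):
        sample = [random.gauss(population_mean, population_sd) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

spread_by_n = {}
for n in (4, 25, 100):
    means = sample_means(n)
    spread_by_n[n] = statistics.stdev(means)   # spread of the sample means
    expected = population_sd / n ** 0.5        # theoretical standard error
    print(n, round(statistics.fmean(means), 1), round(spread_by_n[n], 2), expected)
```

The average of the sample means stays close to 500 at every sample size; what changes with N is how tightly the sample means cluster around it.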
 Expected Values
 For continuously distributed variables, the expected
value is the mean.