Transcript 2010 - VAM Resource Center
Hypothesis testing
Intermediate Food Security Analysis Training
Rome, July 2010
Hypothesis testing
Hypothesis testing involves:
1. defining research questions, and
2. assessing whether changes in an independent variable are associated with changes in the dependent variable by conducting a statistical test
Dependent and independent variables
Dependent variables are the outcome variables
Independent variables are the predictive/explanatory variables
Examples…
Research question: Is educational level of the mother
related to birthweight?
What are the dependent and independent variables?
Research question: Is access to roads related to
educational level of mothers?
And now?
Test statistics
To test hypotheses, we rely on test statistics…
Test statistics are simply the result of a particular
statistical test
The most common include:
1. T-tests calculate t-statistics
2. ANOVAs calculate F-statistics
3. Correlations calculate the Pearson correlation coefficient
Significant test statistic
Is the relationship observed by chance, or because there actually is a relationship between the variables?
This probability is referred to as a p-value and is expressed as a decimal (e.g. p = 0.05)
If the probability of obtaining the value of our test statistic by chance is less than 5%, then we generally accept the experimental hypothesis as true: there is an effect in the population
Ex: if p = 0.1, what does this mean? Do we accept the experimental hypothesis?
This probability is also referred to as the significance level (Sig.)
Hypothesis testing Part 1:
Continuous variables
Topics to be covered in this
presentation
T-test
One way analysis of variance (ANOVA)
Correlation
By the end of this session, the participant should be able to run and interpret each of these tests in SPSS.
Hypothesis testing…
WFP tests a variety of hypotheses…
Some of the most common include:
1. Looking at differences between groups of people (comparisons of means)
Ex. Do different livelihood groups have different levels of food consumption?
2. Looking at the relationship between two variables…
Ex. Is asset wealth associated with food consumption?
How to assess differences in two
means statistically
T-tests
T-test
A test using the t-statistic that establishes whether two
means differ significantly.
Independent means t-test:
It is used in situations in which there are two experimental
conditions and different participants have been used in each
condition.
Dependent or paired means t-test:
This test is used when there are two experimental conditions and the same participants took part in both conditions of the experiment.
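The paired version can also be run directly from SPSS syntax. A minimal sketch, assuming two hypothetical variables score_before and score_after measured on the same participants:

    * Paired (dependent) means t-test; variable names are hypothetical.
    T-TEST PAIRS=score_before WITH score_after (PAIRED).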
T-test: assumptions
Independent t-tests work well if:
the variables are continuous
the groups to compare are composed of different people
within each group, the variable’s values are normally distributed
there is the same level of homogeneity (variance) in the two groups.
Normal distribution
Normal distributions are perfectly symmetrical around the mean (here, the mean is equal to zero)
Values close to the mean (zero)
have higher frequency.
Values very far from the mean
are less likely to occur (lower
frequency)
Variance
Variance measures how similar cases are on a specific variable (their level of homogeneity)
V = (sum of all the squared distances from the mean) / N
Variance is low → cases are very similar to the mean of the distribution (and to
each other). The group of cases is therefore homogeneous (on this variable)
Variance is high → cases tend to be very far from the mean (and different from
each other). The group of cases is therefore heterogeneous (on this variable)
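A quick illustrative calculation (numbers invented for the example): for the values 2, 4 and 6 the mean is 4, so V = ((2-4)² + (4-4)² + (6-4)²) / 3 = (4 + 0 + 4) / 3 ≈ 2.7.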
Homogeneity of Variance
The t-test works well if the two groups have the same homogeneity (variance) on the variable. If one group is very homogeneous and the other is not, the t-test fails.
The independent t-test
The independent t-test compares two means when those means come from different groups of people.
To conduct an independent t-test in SPSS
1. Click on the “Analyze” drop-down menu
2. Click on “Compare Means”
3. Click on “Independent-Samples T-Test…”
4. Move the independent and dependent variables into the proper boxes
5. Click “OK”
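The same test can also be run from SPSS syntax. A minimal sketch, assuming the grouping variable is coded 1 and 2, with hypothetical names beneficiary and csi standing in for the variables shown in the output on the following slides:

    * Independent means t-test of coping strategies index (csi) by beneficiary status.
    * Variable names are hypothetical; adjust them to your dataset.
    T-TEST GROUPS=beneficiary(1 2)
      /VARIABLES=csi.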
T-test: SPSS procedure
Drag the variables into the proper boxes, and define the values for the independent variable
One note of caution about independent
t-tests
It is important to ensure that the assumption of homogeneity of
variance is met:
To do so:
Look at the column labelled Levene’s Test for Equality of Variance.
If the Sig. value is less than .05 then the assumption of homogeneity
of variance has been broken and you should look at the row in the
table labelled Equal variances not assumed.
If the Sig. value of Levene’s test is bigger than .05 then you should
look at the row in the table labelled Equal variances assumed.
T-test: SPSS output
Group Statistics (coping strategies index, by beneficiary household as per CP records)
Group 1: N = 581, Mean = 40.9019, Std. Deviation = 30.70829, Std. Error Mean = 1.27399
Group 2: N = 568, Mean = 42.3750, Std. Deviation = 32.38332, Std. Error Mean = 1.35877
Look at the Levene’s Test …
If the Sig. value of the test is less than .05, groups have different variance. Read
the row “Equal variances not assumed”
If the Sig. value of test is bigger than .05, read the row labelled “Equal variances
assumed”
Independent Samples Test (coping strategies index)

Levene's Test for Equality of Variances: F = .004, Sig. = .950

t-test for Equality of Means:
                              t      df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI of the Difference
Equal variances assumed       -.791  1147      .429              -1.47311          1.86149                 -5.12542 to 2.17921
Equal variances not assumed   -.791  1140.469  .429              -1.47311          1.86261                 -5.12764 to 2.18143
What do we do if we want to statistically compare differences among three means?
Analysis of variance
(ANOVA)
Analysis of Variance (ANOVA)
The ANOVA test tells us whether there are any differences among the means, but not which means differ.
ANOVAs are similar to t-tests and in fact an ANOVA
conducted to compare two means will give the same
answer as a t-test.
Calculating an ANOVA
ANOVA formulas: calculating an ANOVA by hand is complicated, and knowing the formulas is not necessary…
Instead, we will rely on SPSS to calculate ANOVAs…
Example of One-Way ANOVAs
Research question: Do mean child malnutrition (GAM) rates differ according to mother’s educational level (none, primary, or secondary/higher)?
Report: WAZNEW (weight-for-age z-score)

Mother's education level   Mean      N      Std. Deviation
No education               -1.3147   736    1.32604
Primary                    -1.0176   3247   1.21521
Secondary                  -.5525    907    1.25238
Higher                     -.1921    172    1.33764
Total                      -.9494    5062   1.27035

ANOVA: WAZNEW

                 Sum of Squares   df     Mean Square   F        Sig.
Between Groups   354.567          3      118.189       76.507   .000
Within Groups    7812.148         5057   1.545
Total            8166.715         5060
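As a check on where the F-statistic comes from: F = Mean Square Between / Mean Square Within = 118.189 / 1.545 ≈ 76.5 (SPSS reports 76.507 because it divides the unrounded mean squares), and Sig. = .000 means p < .001, well below the .05 threshold.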
To calculate one-way ANOVAs in SPSS
In SPSS, one-way ANOVAs are run using the following steps:
1. Click on the “Analyze” drop-down menu
2. Click on “Compare Means”
3. Click on “One-Way ANOVA…”
4. Move the independent and dependent variables into the proper boxes
5. Click “OK”
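The equivalent SPSS syntax is a single command. A minimal sketch with hypothetical variable names (fcs as the dependent variable, livelihood_group as the factor):

    * One-way ANOVA with group descriptives; variable names are hypothetical.
    ONEWAY fcs BY livelihood_group
      /STATISTICS DESCRIPTIVES.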
ANOVA: SPSS procedure
1. Analyze → Compare Means → One-Way ANOVA
2. Drag the independent and dependent variables into the proper boxes
3. Ask for the descriptives
4. Click “OK”
ANOVA: SPSS output
Along with the mean for each group, ANOVA produces the
F-statistic. It tells us if there are differences between the
means. It does not tell which means are different.
Look at the F value and at the Sig. level
ANOVA: coping strategies index

                 Sum of Squares   df     Mean Square   F       Sig.
Between Groups   25600.110        10     2560.011      2.609   .004
Within Groups    1116564          1138   981.163
Total            1142164          1148
Determining where differences exist
In addition to determining that differences exist among the
means, you may want to know which means differ.
There is one type of test we will use for comparing means:
Post hoc tests are run after the experiment has been
conducted (if you don’t have specific hypothesis).
ANOVA post hoc tests
Once you have determined that differences exist among the
means, post hoc range tests and pairwise multiple
comparisons can determine which means differ.
Tukey’s post hoc test is amongst the most popular and is adequate for our purposes… so we will focus on this test…
To calculate Tukey’s test in SPSS
In SPSS, Tukey’s post hoc tests are run using the following steps:
1. Click on the “Analyze” drop-down menu
2. Click on “Compare Means”
3. Click on “One-Way ANOVA…”
4. Move the independent and dependent variables into the proper boxes
5. Click on “Post Hoc…”
6. Check the box beside “Tukey”
7. Click “Continue”
8. Click “OK”
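In syntax, Tukey’s test is just a POSTHOC subcommand added to the ONEWAY command. A minimal sketch with hypothetical variable names (csi by asset_wealth, mirroring the kind of output shown on the next slides):

    * One-way ANOVA with Tukey HSD post hoc comparisons; variable names are hypothetical.
    ONEWAY csi BY asset_wealth
      /STATISTICS DESCRIPTIVES
      /POSTHOC=TUKEY ALPHA(0.05).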
Determining where differences exist
in SPSS
Once you have determined that differences exist among the means → you
may want to know which means differ…
Different types of tests exist for pairwise multiple comparisons
Pairwise comparisons: SPSS output
Once you have decided which post-hoc test is appropriate
Look at the column “mean difference” to know the difference between
each pair
Look at the column Sig.: if the value is less than .05, then the two means in that pair are significantly different
Multiple Comparisons
Dependent Variable: coping strategies index
Tukey HSD

(I) asset wealth   (J) asset wealth   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower   95% CI Upper
asset poor         asset medium        8.5403*                1.6796       .000    4.599          12.481
asset poor         asset rich          22.5906*               2.7341       .000    16.175         29.006
asset medium       asset poor         -8.5403*                1.6796       .000   -12.481         -4.599
asset medium       asset rich          14.0503*               2.5873       .000    7.979          20.121
asset rich         asset poor         -22.5906*               2.7341       .000   -29.006        -16.175
asset rich         asset medium       -14.0503*               2.5873       .000   -20.121         -7.979

*. The mean difference is significant at the .05 level.
Now what if we would like to measure
how well two variables are associated
with one another?
Correlations
Correlations
T-tests and ANOVAs measure differences between means
Correlations explain the strength of the linear relationship
between two variables…
Pearson correlation coefficients (r) are the test statistics
used to statistically measure correlations
Types of correlations
Positive correlations: Two variables are positively correlated if increases (or decreases) in one variable result in increases (or decreases) in the other variable.
Negative correlations: Two variables are negatively correlated if one increases (or decreases) and the other decreases (or increases).
No correlations: Two variables are not correlated if there is no linear
relationship between them.
-1 --------------------------- 0 --------------------------- 1
Strong negative correlation     No correlation     Strong positive correlation
Illustrating types of correlations
Perfect positive correlation: test statistic = 1
Positive correlation: test statistic > 0 and < 1
Perfect negative correlation: test statistic = -1
Negative correlation: test statistic < 0 and > -1
Example for the Kenya Data
Correlation between children’s weight and height…
[Scatter plot: Height of child (y-axis) against Weight of child (x-axis); cases weighted by CHWEIGHT]
Is this a positive or negative correlation?
In what range would the test statistic fall?
To calculate a Pearson’s correlation
coefficient in SPSS
In SPSS, correlations are run using the following steps:
1. Click on the “Analyze” drop-down menu
2. Click on “Correlate”
3. Click on “Bivariate…”
4. Move the variables whose correlation you want to assess into the box on the right
5. Click “OK”
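The equivalent SPSS syntax, sketched here for the wealth and FCS variables used in the example on the next slide:

    * Pearson correlation between wealth and FCS (two-tailed test, significant correlations flagged).
    CORRELATIONS
      /VARIABLES=wealth FCS
      /PRINT=TWOTAIL NOSIG.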
Example in SPSS…
Correlations

                               wealth    FCS
wealth   Pearson Correlation   1         .932**
         Sig. (2-tailed)                 .000
         N                     10        10
FCS      Pearson Correlation   .932**    1
         Sig. (2-tailed)       .000
         N                     10        10

**. Correlation is significant at the 0.01 level (2-tailed).

Using SPSS we get Pearson’s correlation (0.932)
1. Let’s refresh briefly: what does a correlation of 0.932 mean?
2. What does ** mean?
Summary
Check out page 171 of the CFSVA manual for an overview of these tests