
HYPOTHESIS TESTING
All of the figures in this PowerPoint are from:
Statistics without Math – Magnusson & Mourao
The concept of accepting a hypothesis through the rejection
of a “null” hypothesis is largely credited to Karl Popper.
A null hypothesis is a statement about how the world would
be if our conjecture is wrong.
Difference in means –
Observed: 3.8 − 7.7 = −3.9
Null: 7.7 − 7.0 = 0.7
Enough to reject the null hypothesis?
We can assure ourselves that the difference between our observed distribution and that of the null is not accidental by generating a large number of predictions based on the null hypothesis.
We could repeat this exercise many times (say 100), then determine the percentage of predicted outcomes that have, in this case, a difference between the means as big as or bigger than our observed value of −3.9. If fewer than 5% do, we reject the null hypothesis.
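A minimal sketch of this resampling logic in Python (the group values here are invented for illustration; the slide's observed difference of −3.9 comes from the book's data, which are not reproduced):

import numpy as np

rng = np.random.default_rng(42)

# Hypothetical samples from two groups (invented values for illustration)
group_a = np.array([3.1, 4.2, 3.8, 4.5, 3.3, 4.0])
group_b = np.array([7.9, 7.2, 8.1, 7.5, 7.7, 7.8])

observed = group_a.mean() - group_b.mean()

pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
n_sims = 10_000

# Shuffle the group labels many times and record the difference in means
# we would expect if the null hypothesis (no group effect) were true.
null_diffs = np.empty(n_sims)
for i in range(n_sims):
    shuffled = rng.permutation(pooled)
    null_diffs[i] = shuffled[:n_a].mean() - shuffled[n_a:].mean()

# Fraction of null outcomes at least as extreme as the observed difference
p_value = np.mean(np.abs(null_diffs) >= abs(observed))
print(f"observed difference = {observed:.2f}, permutation p = {p_value:.4f}")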
William S. Gosset standardized this process by dividing the difference in means by a measure of its variability (the standard error) to derive the t statistic (Student, 1908), now the basis of Student's t-test.
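In practice the comparison is usually run with an off-the-shelf implementation; a sketch with scipy, reusing the invented groups from the previous example:

import numpy as np
from scipy import stats

# Same hypothetical groups as above (invented values for illustration)
group_a = np.array([3.1, 4.2, 3.8, 4.5, 3.3, 4.0])
group_b = np.array([7.9, 7.2, 8.1, 7.5, 7.7, 7.8])

# Student's t-test, with equal variances assumed as in the 1908 formulation
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")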
It is conventional to represent the distributions of data horizontally,
rather than vertically.
Assuming the data are normally distributed, we can use the
theoretical characteristics of the distribution as the basis for
testing the differences between the means.
This is, basically, what mathematicians/statisticians do.
They don’t physically sample their null populations.
Type I Error – falsely rejecting the null hypothesis and
deciding that a phenomenon exists when it does not.
Type II Error – accepting the null hypothesis when it is false. Its probability is generally inversely related to the probability of making a Type I error.
We avoid Type I errors by setting the bar high for rejecting the null hypothesis (< 5%). This is very important to the progress of science – we don't want to build knowledge on faulty previous work.
The ability of a statistical test to reject the null hypothesis
when it is indeed false is called the “power of the test”.
Now let's look at an example where we compare more than 2 groups: streams with carnivorous fish, streams without fish, and streams with herbivorous fish.
When we do pair-wise comparisons using t-tests, we compound Type I errors, as the quick calculation below shows.
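With k independent tests at alpha = 0.05, the chance of at least one false positive is 1 − (1 − 0.05)^k:

alpha = 0.05
for k in (1, 3, 10):  # e.g. 3 groups already need 3 pairwise t-tests
    familywise = 1 - (1 - alpha) ** k
    print(f"{k} tests: P(at least one Type I error) = {familywise:.3f}")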
Sir Ronald Fisher used the comparison of variances.
If the variability (e.g. range) is similar in each category and the means differ, then the total variability will be greater than the variability within any one category:
Vi < VT, so VR = VT / Vi > 1.
With no difference among means, Vi = VT and VR = 1.
Fisher used the ratio of the variances, now known as the F-statistic
Computer programs create the null distribution based on the mean variability (variance). If the variance is grossly dissimilar among groups, the null variance will be underestimated and we may be led to commit a Type I error in Fisher's test (a.k.a. analysis of variance, or ANOVA).
It is very important to visually inspect one’s data (and test for
similarity of variances – e.g. Levene’s test).
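A sketch of both checks with scipy (the stream densities below are invented for illustration):

import numpy as np
from scipy import stats

# Invented densities for the three stream types (illustration only)
carnivorous = np.array([12.1, 10.8, 13.0, 11.5, 12.4])
no_fish     = np.array([ 8.9,  9.5,  8.1,  9.8,  8.6])
herbivorous = np.array([ 6.2,  5.5,  6.9,  5.8,  6.4])

# Levene's test: are the variances similar enough for ANOVA?
w_stat, p_levene = stats.levene(carnivorous, no_fish, herbivorous)
print(f"Levene: W = {w_stat:.2f}, p = {p_levene:.3f}")

# One-way ANOVA (Fisher's test) comparing the three means
f_stat, p_anova = stats.f_oneway(carnivorous, no_fish, herbivorous)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4g}")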
The variability between the means of the two groups is due to
the factor (in this case fish). The difference between this and the
total variability is the residual variability (not attributed to any
particular cause).
When we do this with variance, it’s called partitioning variance.
F is the ratio of the factor mean square to the residual mean square. Mean squares are sums of squares divided by their df, and are analogous to variances. Apart from a few constants:
F = (σ²Factor + σ²Residual) / σ²Residual
When variance due to the factor is zero (null hypothesis is
correct), F = 1.
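A sketch of the partitioning by hand, on invented data, showing that F is the factor mean square over the residual mean square:

import numpy as np
from scipy import stats

groups = [np.array([3.1, 4.2, 3.8, 4.5]),   # invented data, two groups
          np.array([7.9, 7.2, 8.1, 7.5])]
grand = np.concatenate(groups)
grand_mean = grand.mean()

# Factor SS: spread of the group means around the grand mean
ss_factor = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Residual SS: spread of observations around their own group mean
ss_residual = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_factor = len(groups) - 1
df_residual = len(grand) - len(groups)

ms_factor = ss_factor / df_factor       # mean square = SS / df
ms_residual = ss_residual / df_residual

F = ms_factor / ms_residual
p = stats.f.sf(F, df_factor, df_residual)
print(f"F = {F:.2f}, p = {p:.4g}")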
You can run into problems when you categorize a continuous phenomenon.
Plotting the results of the “narrow” sampling regime at two temperatures:
John estimates a probability of 0.78 that the null hypothesis is correct.
Mary estimates a probability of 0.035 that the null hypothesis is
correct, so she rejects the null hypothesis.
V_Residual + V_Factor + V_Levels = V_Total
Plotting the results of the “wide” sampling regime at two temperatures:
John now rejects the null hypothesis (P = 0.013), and
Mary accepts the null hypothesis (P = 0.22).
V_Residual + V_Factor + V_Levels + V_Width = V_Total
We might expect a direct relationship between the
number of trees and the area of a given reserve.
Here, our plot shows how closely the data conform to our
hypothetical relationship (Y = a + bX, where a = 0 and b = 1).
With insect activity and temperature, we may not know
the true relationship and we draw a line that represents
what the relationship may be based upon the distribution
of the data.
We can fit the line to the data based on minimizing the distance of the points to the line (A), using the minimum area of the triangles formed by the horizontal and vertical lines from the points to the line (B), or by minimizing the sum of the squared vertical distances of the points to the line (least squares regression) (C).
The same basic concepts apply to regression analysis
as in ANOVA.
If the residual variation approaches the total variation, we
assume that there is no effect of the measured variable, i.e.
V_Factor = 0 (null hypothesis is correct).
If we had studied a smaller temperature range, we would have decreased the variation due to the factor while maintaining the residual variation, and we would lose our ability to detect an effect.
[Figure: partial-regression plots relating monkeys to trees and monkeys to shrubs]
Vertical lines represent the variability not explained by the linear model. We can use these residuals to calculate the partial regressions.
Monkeys = −0.667 × trees
Monkeys = 1.667 × shrubs
Monkeys = 0.33 + (−0.667 × trees) + (1.667 × shrubs)
[equation for the multiple regression]
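A sketch of how such coefficients are obtained by least squares (the counts below are invented, so they will not reproduce the slide's coefficients):

import numpy as np

# Invented counts for illustration; the slide's coefficients come from
# the book's data, which are not reproduced here.
trees   = np.array([10.0, 12.0, 15.0, 18.0, 20.0, 25.0])
shrubs  = np.array([ 5.0,  8.0,  6.0, 10.0, 12.0, 11.0])
monkeys = np.array([ 4.0,  9.0,  2.0,  7.0,  9.0,  3.0])

# Design matrix with an intercept column; least squares solves for
# Monkeys = b0 + (b1 * trees) + (b2 * shrubs)
X = np.column_stack([np.ones_like(trees), trees, shrubs])
coefs, *_ = np.linalg.lstsq(X, monkeys, rcond=None)
b0, b1, b2 = coefs
print(f"Monkeys = {b0:.3f} + ({b1:.3f} * trees) + ({b2:.3f} * shrubs)")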
THINKING ABOUT YOUR DATA, BIOLOGICALLY
Computer-Generated “Phantom” Variables
In this theoretical example (in which the data were randomly
generated), a graduate student is appalled that none of the
factors his advisor suggested are significant.
So, he tests for all possible interactions. He finds a significant interaction (p = 0.001) between snags and herbivorous fish. There are numerous biological explanations for an interaction of snags and herbivorous fish on crayfish density, enough to allow for an in-depth discussion in the thesis.
The problem is, this example is based on random data. With 25 possible effects/interactions in an ANOVA, we would expect one to be "significant" at the 0.05 level by chance alone.
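A sketch of the trap: run 25 tests on pure noise and see how many come out "significant" (the simulation setup is an assumption for illustration):

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_tests, n_per_group = 25, 15   # 25 effects/interactions, as in the example
false_positives = 0

# Every "effect" here is pure noise: both groups come from the same
# distribution, so any p < 0.05 is a Type I error.
for _ in range(n_tests):
    a = rng.normal(size=n_per_group)
    b = rng.normal(size=n_per_group)
    if stats.ttest_ind(a, b).pvalue < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_tests} random 'effects' were significant")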
Data for 30 lakes:
Pollution = heavy metal conc. in ppb
Fish = mean # per gill-net per h
Phytoplankton = chlorophyll conc.
Crayfish = # per trap-hour
The problem is that the data are all on different scales. Dividing each variable by its s.d. effectively puts all of them in units of s.d. – e.g. an increase of one s.d. in pollution leads to a decrease of so many s.d.'s in the number of crayfish.
These are called standardized estimates of the parameters. We can use these standardized coefficients in "path analysis".
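A sketch of the standardization step on invented lake data (the distributions below are assumptions standing in for real measurements). Centering on the mean as well as dividing by the s.d. is the usual convention, and is why the intercept in the regression below comes out as 0.0:

import numpy as np

def standardize(x):
    """Center on the mean and divide by the s.d., so the variable is
    expressed in standard-deviation units."""
    return (x - x.mean()) / x.std(ddof=1)

# Invented data for 30 lakes (illustration only)
rng = np.random.default_rng(7)
pollution     = rng.gamma(2.0, 5.0, size=30)           # ppb
fish          = rng.poisson(6, size=30).astype(float)  # per gill-net per h
phytoplankton = rng.gamma(3.0, 2.0, size=30)           # chlorophyll conc.
crayfish      = rng.gamma(2.5, 1.5, size=30)           # per trap-hour

X = np.column_stack([standardize(v) for v in
                     (pollution, fish, phytoplankton)])
y = standardize(crayfish)

# Standardized partial regression coefficients (path coefficients)
coefs, *_ = np.linalg.lstsq(np.column_stack([np.ones(30), X]), y,
                            rcond=None)
print("standardized coefficients:", np.round(coefs[1:], 2))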
Run multiple regression on standardized variables:
Crayfish = 0.0 - 0.16 x pollution – 0.39 x fish + 0.55 x phytoplankton
The effect of pollution is not significant (p = 0.53), that of fish is questionable (p = 0.07), and there is a strong effect of phytoplankton (p = 0.01).
This is counterintuitive, because a simple regression indicates a
significant (p=0.03) positive effect of pollution on crayfish (slope =
0.41). This is because Fig. 10.2 only shows “direct effects” and
doesn’t represent the system biologically.
Need to look at “indirect effects” as well. Get standardized
regressions for each direct effect by simple regression.
Calculate indirect effects by multiplying path coefficients along paths:
To get the overall effect of pollution, we add the direct and indirect effects:
(-0.16)+(0.26)+(0.31) = 0.41, which is the simple regression value.
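The same bookkeeping in a few lines of Python, using the slide's values (the routes attached to each indirect effect are inferred from the figure, not stated in the text):

# Direct effect of pollution on crayfish, plus the two indirect effects
# (each indirect effect is the product of the path coefficients along it)
direct = -0.16
indirect = [0.26, 0.31]   # e.g. via fish and via phytoplankton (inferred)
total = direct + sum(indirect)
print(f"total effect = {total:.2f}")  # 0.41, the simple-regression slope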
Predicting the mass of a tree from its diameter is not a linear function,
but may conform to a power function of the form:
Biomass = a × Diameter^b + e1
Ignoring the error term, if we take the logs of both sides, we
transform the equation into a linear one that can be treated with
ordinary least-squares methods:
log10(biomass) = log10(a) + b log10(diameter) + e2
log10(biomass) = -0.775 + 2.778 log10(diameter)
Taking the antilog of the equation yields:
Biomass = 0.168 × Diameter^2.778
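A sketch of the whole procedure on invented diameter/biomass pairs generated around the fitted curve (the noise model is an assumption for illustration):

import numpy as np
from scipy import stats

# Invented diameter (cm) / biomass (kg) pairs, roughly following
# Biomass = 0.168 * Diameter ** 2.778 with multiplicative noise
rng = np.random.default_rng(3)
diameter = np.array([5.0, 8.0, 12.0, 20.0, 30.0, 45.0])
biomass = 0.168 * diameter ** 2.778 * rng.lognormal(0.0, 0.1, size=6)

# Fit the linearized model: log10(biomass) = log10(a) + b * log10(diameter)
fit = stats.linregress(np.log10(diameter), np.log10(biomass))
a = 10 ** fit.intercept          # back-transform the intercept
print(f"Biomass = {a:.3f} * Diameter ** {fit.slope:.3f}")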