Test of Significance
Transcript
No criminal on the run
The concept of test of significance
FETP India
Competency to be gained
from this lecture
Formulate and test null hypotheses
Key issues
• Null and alternate hypotheses
• Type I and Type II errors
• Statistical testing
What is the question at hand?
• Estimating a quantity?
• Testing a hypothesis?
Hypotheses
Taking into account the sampling
variation in decision-making
• Studies are carried out on samples of subjects,
not on an entire population
• There is sampling variation
• Allowance must be made for sampling
variation when making a decision
Hypotheses
Rationalizing decision-making
• Research studies test hypotheses
Experiment and data collection
• Hypotheses are tested on the basis of
inference from available data
• Considering a difference as significant may
be subjective
• The concept of statistical significance is a
decision-making tool to make a subjective
decision objective
Hypotheses
A man is brought to court
accused of a crime
• The judge needs to start from the hypothesis
that the person is innocent
• The evidence is brought in:
Fingerprints
Pictures
Hypotheses
Assessing whether the evidence is caused
by chance or not
• The judge assesses whether the evidence
could be due to chance
• If the probability that the evidence is caused
by chance is high:
The judge accepts the hypothesis of innocence
• If the probability that the evidence is caused
by chance is low:
The judge rejects the hypothesis of innocence
Hypotheses
Hypotheses formulated by
epidemiologists
• Ho: Null hypothesis (=“innocence”)
The difference observed is caused by chance, or
sampling variation
• H1: Alternate hypothesis
The probability that the difference observed is
caused by chance alone is low
Hypotheses
From sampling distribution
to hypothesis testing
• Epidemiologists decide on a critical / rejection
region
That decision is arbitrary
• If the value falls in the extreme (rejection)
region, the null hypothesis is rejected
Hypotheses
Type I and type II errors
• Type I
Rejection error, also called alpha error
Rejecting the null hypothesis when it is true
Punishing an innocent person
Particularly unacceptable to society
Must be minimized
• Type II
Acceptance error, also called beta error
Accepting the null hypothesis when it is false
Releasing a guilty person
Errors
Balancing the risk of errors
• If the judge wants to always avoid type I
error, he can release everyone
He will always commit the type II error
• If the judge wants to always avoid type II
error, he can charge everyone
He will always commit the type I error
• To balance the risk of errors, we will fix one
error and try to minimize the other
Errors
Which error is more important?

                                            Hypertension         HIV
Effective drugs already available?          Many                 Few
Concluding that new treatment is
better when it is not (type I error)        Unfortunate          Not so unfortunate
Concluding that new treatment is no
better when it is better (type II error)    Not so unfortunate   Very unfortunate
Which error is more important?              Type I               Type II
Errors
Examples of errors
• An example where type II error is important
If a new drug becomes available for HIV, since few
effective drugs are available, we must minimize the
risk of rejecting a drug that would work
• An example where type I error is important
If a new drug becomes available for hypertension,
since lots of anti-hypertensives are already
available, we cannot take a risk and can only
accept a drug that is demonstrably better
Errors
Behind errors are the right decisions
• 1-alpha
Probability of accepting the null hypothesis when
it is the right decision
• 1-beta
Probability of rejecting the null hypothesis when
it is the right decision
Also called statistical power
Errors
Alpha and beta error
Truth \ Decision   Accept Ho                   Reject Ho
Ho is true         Good decision (1 - alpha)   Alpha error
Ho is false        Beta error                  Good decision (1 - beta)
Errors
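To see the alpha error in action, here is a minimal Python sketch (assuming numpy and scipy are installed, which the lecture does not specify): when Ho is true, a test at the 5% level rejects, i.e., commits a type I error, in roughly 5% of repeated samples.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05            # chosen type I error rate
n, trials = 100, 10_000
rejections = 0

for _ in range(trials):
    # Ho is true: the sample really comes from a population with mean 65
    sample = rng.normal(loc=65, scale=10, size=n)
    z = (sample.mean() - 65) / (10 / np.sqrt(n))
    p = 2 * stats.norm.sf(abs(z))   # two-sided P-value
    if p < alpha:
        rejections += 1             # a type I error

print(f"Empirical alpha: {rejections / trials:.3f}")   # close to 0.05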
Sampling fluctuation in samples of 100
subjects for height measurement
• Even when statistically sound sampling
techniques are employed, the mean in samples
of 100 will not necessarily be 65”
[Figure: a population of 10,000 with mean height 65” and s.d. 10”; repeated samples of 100 yield means such as 63”, 64”, 65”, 66”, 67”; sampling error of the mean = 1]
• There is variation from sample to sample
• This must be taken into account when
interpreting differences
• This method is called a significance test
Testing
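A short simulation (a sketch assuming numpy is available; the seed and printed values are illustrative) reproduces this fluctuation: means of samples of 100 scatter around 65” with a standard error of about 1”.

import numpy as np

rng = np.random.default_rng(1)
# Population of 10,000 with mean height 65" and s.d. 10"
population = rng.normal(loc=65, scale=10, size=10_000)

# Means of five samples of 100 are not all exactly 65"
means = [rng.choice(population, size=100, replace=False).mean() for _ in range(5)]
print([round(m, 1) for m in means])               # values near 63"-67"
print(round(population.std() / np.sqrt(100), 2))  # sampling error of the mean, about 1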
Magnitude of allowance
• Consider an expected difference of 0%
Observed differences of 1%, 2%, 3%: not large
Observed differences of 20%, 30%: large; not
willing to consider the true difference as 0%
• WHY?
If the true difference is 0%, the chance
(probability) of getting a difference exceeding
20% is very small (see the sketch after this slide)
Testing
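The sketch below makes the "WHY" concrete under stated assumptions (two proportions, each estimated from an assumed sample of 100; these numbers are not from the lecture): a 20% difference then lies almost 3 standard errors away from 0%, so its chance probability is well under 1%.

import numpy as np
from scipy import stats

n = 100                            # assumed size of each sample
se = np.sqrt(2 * 0.5 * 0.5 / n)    # largest possible s.e. of a difference of two proportions
z = 0.20 / se                      # a 20% difference expressed in s.e. units

print(round(se, 3), round(z, 2))       # ~0.071, ~2.83
print(round(2 * stats.norm.sf(z), 4))  # probability of such a difference by chance: ~0.005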
Decision rule
• Formulate a decision rule based on the
probability of getting the observed
difference under the null hypothesis (Ho)
• Assuming Ho is true, compute the probability
of obtaining the observed difference
• If the probability is low:
Reject Ho
• Else, accept Ho
Testing
Choosing a rejection level
• The definition of low probability is subjective
• Conventionally:
Low probability = 5% (P=0.05)
If P < 0.05, the observed difference is
(statistically) ‘significant’
If P < 0.01, the result is sometimes termed ‘highly significant’
• Computation of P-values:
Statistical exercise
Depends on the nature of data and design of the study
• Necessary condition: Probability sample
No test of significance on convenience or quota samples
Testing
Concept of test of significance
[Figure: from a population of 10,000, a random sample of size 100 is drawn; the sample mean height is 68”]
• Question:
Could the population mean be 65” ?
• Hypothesis:
Population mean = 65”
• Question:
What is the probability of obtaining a sample
mean of 68” from this population when
sample size = 100 ?
• If this probability is small (e.g. < 5%):
Reject the hypothesis
• If not, accept the hypothesis
Testing
Test of significance:
Computation of probability
• Observed mean = 68”
• Postulated mean = 65”
• Standard deviation = 10”
• Sample size = 100
• Sampling error (s.e.) of mean = 10 / √100 = 1
• Compute:
(Observed mean - Postulated mean) / s.e. of mean = (68 - 65) / 1 = 3
• Critical value for significance at 5% level = 1.96
• Since 3 > 1.96, the difference is statistically significant
• Exact probability = 0.0027, i.e., 0.27%
Testing
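The same computation expressed in Python (a minimal sketch, assuming scipy is available):

import math
from scipy import stats

observed_mean, postulated_mean = 68, 65
sd, n = 10, 100

se = sd / math.sqrt(n)                      # sampling error of the mean = 1
z = (observed_mean - postulated_mean) / se  # (68 - 65) / 1 = 3
p = 2 * stats.norm.sf(z)                    # two-sided exact probability

print(z, round(p, 4))   # 3.0 0.0027: since 3 > 1.96, significant at the 5% level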
What if the distribution is not normal?
• Transform the data (e.g., drug
concentration, cell counts) to some other
scale to obtain a normal distribution
e.g., logarithm, square root
• If not feasible, and provided the sample size
exceeds 30, make use of the result that the
sample mean is approximately normally distributed
Testing
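A brief sketch of the transformation idea (the lognormal data here are simulated for illustration; they are not from the lecture):

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Skewed measurements, such as drug concentrations
concentration = rng.lognormal(mean=2.0, sigma=0.8, size=200)

print(round(stats.skew(concentration), 2))          # strongly positive: not normal
print(round(stats.skew(np.log(concentration)), 2))  # near 0: roughly normal on the log scale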
Estimating the sample size
• The epidemiologist examines the willingness
to commit:
Alpha error
Beta error
• Sample size calculation is the step at which
these decisions are made
Testing
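A hedged sketch of how the two error rates enter a sample size calculation, here for a one-sample, two-sided test of a mean (the s.d. and the smallest difference worth detecting are assumptions for illustration):

import math
from scipy import stats

alpha, beta = 0.05, 0.20    # accepted type I and type II error rates (power = 0.80)
sigma, delta = 10.0, 3.0    # assumed s.d. and smallest difference worth detecting

z_alpha = stats.norm.ppf(1 - alpha / 2)   # 1.96
z_beta = stats.norm.ppf(1 - beta)         # 0.84
n = ((z_alpha + z_beta) * sigma / delta) ** 2

print(math.ceil(n))   # about 88 subjects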
Interpretation of significance
• “Significant” does not necessarily mean that
the observed difference is REAL or
IMPORTANT
• “Significant” only means that it is unlikely
(<5%) that the difference is due to chance
• Trivial differences can be statistically
significant if they are based on large
numbers
Testing
Interpretation of non-significance
• “Non - significant” does not necessarily mean that
there is no real difference
• “Non - significant” means only that the observed
difference could easily be due to chance
Probability of at least 5%
• There could be a real or important difference but
due to inadequate sample size we might have
obtained a non-significant result
Testing
Significance does not systematically
mean causation: Potential explanations
for a significant association
• Chance: Addressed by the significance test
• Bias
• Confounding factor
• Causation: Consider after the first three have
been ruled out; test for causality criteria
Testing
The choice of a one-sided test depends
upon the alternate hypothesis
• Use a one-sided test when the alternate
hypothesis is in one direction
• Quote the actual P-values instead of stating
just p < 0.05 or p < 0.01
Testing
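Using the z = 3 height example from earlier, a quick Python comparison of one-sided and two-sided P-values:

from scipy import stats

z = 3.0
p_one_sided = stats.norm.sf(z)       # H1: population mean > 65" only
p_two_sided = 2 * stats.norm.sf(z)   # H1: population mean differs in either direction

print(round(p_one_sided, 5), round(p_two_sided, 5))   # 0.00135 0.0027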
Quick checklist for statistical testing
A statistical test is indeed needed
The test used is appropriate for the data and the study design
The test is calculated correctly
The interpretation of the test is appropriate
Testing
Key messages
• Under the null hypothesis, differences
observed are caused by chance alone
• Type I error consists of rejecting the null
hypothesis when it is true, while type II error
consists of accepting the null hypothesis
when it is false
• Statistical tests estimate the probability that
a difference observed may be caused by
chance alone