Transcript Ch10

10
Hypothesis Testing
Statistical hypothesis testing
• The expression level of a gene in a given condition is measured
several times. A mean x of these measurements is calculated. From
many previous experiments, it is known that the mean expression
level of the given gene in normal conditions is μ. How can you decide
which genes are significantly regulated in a microarray experiment?
For instance, one can apply an arbitrary cutoff such as a threshold of
at least twofold up- or down-regulation.
One can formulate the following hypotheses:
1. The gene is up-regulated in the condition under study: x > μ
2. The gene is down-regulated in the condition under study: x < μ
3. The gene is unchanged in the condition under study: x = μ
4. Something has gone awry during the lab experiments and the gene's
measurements are completely off; the mean of the measurements may
be higher or lower than normal: x ≠ μ.
Statistical hypothesis testing
When a hypothesis test is viewed as a decision procedure, two types of error are
possible, depending on which hypothesis, H0 or H1, is actually true. If a test rejects
H0 (and accepts H1) when H0 is true, it is called a type I error (rejection error). Its
probability is denoted α:
α = P(reject H0 | H0 is true)
If a test fails to reject H0 when H1 is true, it is called a type II error (acceptance
error). Its probability is denoted β:
β = P(do not reject H0 | H0 is false)
The following shows the results of the different decisions.
Decision            H0 is true          H0 is false
Do not reject H0    Correct decision    Type II error
Reject H0           Type I error        Correct decision
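The type I error rate can be checked with a small simulation. The sketch below is an illustration only, not part of the original slides: the population values (mean 100, standard deviation 10, sample size 4), the seed, the number of trials and the function name are all assumptions chosen for the demonstration. It draws samples for which H0 is true and counts how often a lower one-tail Z test at the 5% level rejects H0 anyway.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(mu=100.0, sigma=10.0, n=4, alpha=0.05,
                      trials=20000, seed=0):
    """Estimate how often a lower one-tail Z test rejects H0 when H0 is true.

    All default values are illustrative assumptions, not data from the slides.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # about -1.645 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # Samples come from the H0 population, so every rejection is a type I error.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / (sigma / sqrt(n))
        if z < z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"empirical type I error rate: {rate:.3f}")
```

The estimated rate should come out close to the chosen significance level, which is what "wrong in about 1 in 20 cases when H0 is true" means in practice.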
Statistical hypothesis testing
• The next step is to generate two hypotheses. The two hypotheses must be
mutually exclusive and all inclusive.
• Mutually exclusive means that the two hypotheses cannot both be true at the same time
• All inclusive means that their union has to cover all possibilities
• Expression ratios are converted into probability values to test the hypothesis
that particular genes are significantly regulated
• The null hypothesis H0 states that there is no difference in signal intensity across the
conditions being tested
• The other hypothesis (called the alternate or research hypothesis) is named H1. If
we believe that the gene is up-regulated, the research hypothesis will be H1:
x > μ. The null hypothesis has to be mutually exclusive and also has to
include all other possibilities; therefore, the null hypothesis will be H0: x ≤ μ.
• One assigns a p-value for testing the hypothesis. The p-value is the
probability, if H0 is true, of obtaining a measurement at least as extreme as
the one observed just by chance.
• The probability of rejecting the null hypothesis when it is true is the
significance level α, which is typically set at 0.05; in other words, we
accept that in 1 out of 20 cases where H0 is true our conclusion will be wrong.
Statistical hypothesis testing
• Single sample, test of hypothesis (TOH)
• x = sample mean, s = sample standard deviation, μ = population mean, σ = population standard deviation, n = sample size
Type of test              TOH   Known parameters   Unknown parameters   Statistic
Normal distribution test  x     μ, σ, x, s         none                 Z = (x − μ)/(σ/√n)
t-test                    x     μ, x, s            σ                    t = (x − μ)/(s/√n)
Chi-square                s     σ, x, s            μ                    χ² = (n − 1)s²/σ²
Statistical hypothesis testing
One-tail testing
• The alternative hypothesis specifies that the parameter is
greater than the value specified under H0, e.g. H1: μ > 15.
Such a hypothesis is called an upper one-tail test.
Example
• The expression level of a gene is measured 4 times in a
given condition. The 4 measurements are used to calculate
a mean expression level of x = 90. It is known from the
literature that the mean expression level of the given gene,
measured with the same technology in normal conditions,
is μ = 100 and the standard deviation is σ = 10. We expect
the gene to be down-regulated in the condition under
study and we would like to test whether the data support
this assumption.
• The alternative hypothesis H1 is "the gene is down-regulated", i.e.
H1: x < μ; the null hypothesis is therefore H0: x ≥ μ
• This is an example of a one-tail hypothesis in which we
expect the values to be in one particular tail of the
distribution.
Statistical hypothesis testing
• From the central limit theorem, the means of samples are
distributed approximately as a normal distribution.
• Sample size n = 4, sample mean x = 90
• Population standard deviation σ = 10
• Assume a significance level of 5%
• The null hypothesis is rejected if the computed p-value is
lower than the significance level (0.05)
• We can calculate the value of Z as
Z = (x − μ)/(σ/√n) = (90 − 100)/(10/√4) = −2
The probability of obtaining such a value, or a more extreme one, just by chance, i.e. the p-value, is:
p(Z < −2) = 0.02275
The computed p-value is lower than our significance threshold (0.02275 < 0.05),
therefore we reject the null hypothesis. In other words, we accept the alternate
hypothesis. We state that "the gene is down-regulated at the 5% significance
level".
This will be understood by the knowledgeable reader as a conclusion that is
wrong in 5% of the cases or fewer.
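The one-tail calculation above can be reproduced without a printed table. This is a minimal sketch using Python's standard library; the function name `z_test_lower_tail` is a hypothetical helper introduced here, and the numbers are those of the example (sample mean 90, population mean 100, population standard deviation 10, 4 measurements).

```python
from math import sqrt
from statistics import NormalDist

def z_test_lower_tail(sample_mean, mu, sigma, n):
    """Return (Z, p) for a lower one-tail test with known population s.d."""
    z = (sample_mean - mu) / (sigma / sqrt(n))
    p = NormalDist().cdf(z)  # area under the standard normal curve left of z
    return z, p

z, p = z_test_lower_tail(sample_mean=90, mu=100, sigma=10, n=4)
print(f"Z = {z:.2f}, p = {p:.5f}")
print("reject H0" if p < 0.05 else "do not reject H0")
```

`NormalDist().cdf` plays the role of the normal distribution table: it returns the cumulative area from the left-hand side.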
Normal distribution table: area under the standard normal curve, accumulated from the left-hand side, with Z = 0 and Z = 2 marked (Excel: NORMDIST, NORMSDIST).
Statistical hypothesis testing
Two-tail testing
• A novel gene has just been discovered. A
large number of expression experiments
measured the mean expression level of
this gene as 100 with a standard deviation
of 10. Subsequently, the same gene is
measured 4 times in 4 cancer patients.
The mean of these 4 measurements is 109.
Can we conclude that this gene is
differentially expressed in cancer?
• We do not know whether the gene will be up-regulated or down-regulated.
• Null hypothesis H0: X = 100
• Alternative hypothesis H1: X ≠ 100
• At a significance level of 5% → 2.5% for
the left tail and 2.5% for the right tail
• Z = (109 – 100)/(10/√4) = (9/10) × 2 = 1.8
• p(Z ≥ 1.8) = 1 – p(Z ≤ 1.8) = 1 –
0.9641 = 0.0359 > 0.025 → the tail probability is higher than the 2.5% allowed in each tail
(equivalently, the two-tail p-value 2 × 0.0359 = 0.0718 > 0.05),
so we cannot reject the null hypothesis
Figure: distribution of X under H0, with 2.5% rejection regions in the left and right tails.
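The two-tail example can be checked in the same way. The sketch below is illustrative; `z_test_two_tail` is a hypothetical helper name, and the inputs are the values of the cancer example (sample mean 109, population mean 100, standard deviation 10, 4 measurements). It returns both the single-tail probability compared against 2.5% and the two-tail p-value compared against 5%.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tail(sample_mean, mu, sigma, n):
    """Return (Z, one-tail probability, two-tail p-value)."""
    z = (sample_mean - mu) / (sigma / sqrt(n))
    one_tail = 1.0 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return z, one_tail, 2.0 * one_tail

z, one_tail, two_tail = z_test_two_tail(109, 100, 10, 4)
print(f"Z = {z:.2f}, one-tail = {one_tail:.4f}, two-tail p = {two_tail:.4f}")
print("reject H0" if two_tail < 0.05 else "do not reject H0")
```

Comparing the one-tail area with 0.025 and comparing the two-tail p-value with 0.05 are the same decision rule, so both lead to "do not reject H0" here.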
Tests involving the mean – the t distribution
• Hypothesis testing
• Parametric testing – where the data are known or assumed to follow a
certain probability distribution (e.g. normal distribution)
• Non-parametric testing – where no a priori knowledge is available and
no such assumptions are made.
• The t distribution test, or Student's t distribution test, is a parametric test.
It was developed by William S. Gosset, a 32-year-old research
chemist employed by the famous Irish brewery Guinness.
Tests involving the mean – the t distribution
• Tests involving a single sample may focus on the mean of the sample
(t-test, where the variance of the population is not known) and the
variance (χ²-test). The following hypotheses may be formulated if the
testing regards the mean of the sample:
1. H0: μ = c, H1: μ ≠ c
2. H0: μ ≥ c, H1: μ < c
3. H0: μ ≤ c, H1: μ > c
• The first pair of hypotheses corresponds to a two-tail test in which no a
priori knowledge is available, while the second and the third
correspond to one-tail tests in which the measured value c is
expected to be higher and lower than the population mean μ,
respectively.
Tests involving the mean – the t distribution
• The expression level of a gene is known to have a mean expression level of
18 in the normal human population. The following expression values have
been obtained in five measurements: 21, 18, 23, 20, 18. Are these data consistent
with the published mean of 18 at a 5% significance level?
• Population s.d. σ is not known → t-test; calculate the sample s.d. s to estimate σ
• H0: x = μ = 18, H1: x ≠ μ = 18 → two-tail test
• Calculate the t-test statistic
t = (x − μ)/(s/√n) = (20 − 18)/(2.12/√5) = 2.11
Remember to use n − 1 when calculating the sample standard deviation s.
Tests involving the mean – the t distribution
The t-distribution is symmetric.
Degrees of freedom: ν = n − 1 = 5 − 1 = 4. Using a table of the t-distribution with four degrees of
freedom, the two-tail p-value associated with this test statistic is found to be slightly above 0.1,
since t = 2.11 lies just below the 10% two-tail critical value of 2.132. The 5% two-tail test
corresponds to a critical value of 2.776. Since the p-value is greater than 0.05
(t-value = 2.11 < critical value = 2.776), the evidence is not
strong enough to reject the null hypothesis of mean 18 → do not reject H0.
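The single-sample t statistic for the five measurements can be recomputed as follows. This is a sketch under stated assumptions: the variable names are introduced here for illustration, and the critical value 2.776 is copied from the t-table lookup quoted in the text rather than computed, because Python's standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

values = [21, 18, 23, 20, 18]  # the five measurements from the example
mu = 18                        # published population mean
T_CRIT_TWO_TAIL = 2.776        # 5% two-tail critical value for 4 d.o.f. (from the table)

n = len(values)
x_bar = mean(values)
s = stdev(values)              # sample s.d., uses the n - 1 denominator
t = (x_bar - mu) / (s / sqrt(n))

reject = abs(t) > T_CRIT_TWO_TAIL
print(f"mean = {x_bar}, s = {s:.2f}, t = {t:.2f}")
print("reject H0" if reject else "do not reject H0")
```

`statistics.stdev` already divides by n − 1, which matches the reminder about the sample standard deviation; `statistics.pstdev` would divide by n and give a different t.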
The t-distribution table
• Tabulated as cumulative probability starting from the left-hand side; two-tail columns for α = 0.10 and 0.05
• Excel: TINV gives the two-tail critical value, e.g. TINV with p = 0.05 and 3 degrees of freedom returns 3.182
Tests involving the mean – the t distribution
The expression level of a gene is known to have a mean expression level
of 225 in the normal human population. Expression values have
been obtained in sixteen measurements, for which the sample mean
and s.d. are found to be 241.5 and 98.7259 respectively. Are these data
higher than the published mean at a 5% significance level?
• This is a right-hand (upper) one-tail test, because H1 places the mean above 225
• Null hypothesis H0: x ≤ μ = 225
• Alternative hypothesis H1: x > μ = 225
• t-score = (241.5 − 225)/[98.7259/√16] = 0.6685
• Degrees of freedom = 15
• The 5% level corresponds to a critical value t0.05(15) of 1.753
• The t-score is less than the critical value, i.e. 0.6685 < 1.753.
• Based on the critical value, we cannot reject the null hypothesis.
• The gene expression data are not significantly higher than the published mean of
225 at a 5% significance level
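When only summary statistics are available, the t-score follows directly from the sample mean, sample standard deviation and sample size. The sketch below is illustrative: `t_score` is a hypothetical helper name, and the critical value 1.753 is taken from the text (t-table, 15 degrees of freedom) rather than computed.

```python
from math import sqrt

def t_score(sample_mean, mu, s, n):
    """Single-sample t statistic computed from summary statistics."""
    return (sample_mean - mu) / (s / sqrt(n))

T_CRIT_ONE_TAIL = 1.753  # 5% one-tail critical value for 15 d.o.f. (from the table)

t = t_score(sample_mean=241.5, mu=225, s=98.7259, n=16)
reject = t > T_CRIT_ONE_TAIL  # upper one-tail test: reject only for large positive t
print(f"t = {t:.4f}")
print("reject H0" if reject else "do not reject H0")
```

For an upper one-tail test only large positive t values count as evidence against H0, which is why the comparison uses `t` rather than `abs(t)`.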
Evaluate the significance of the following
gene expression differences – t test
The expected ratio (Exp./ref.) is 1; is gene A, B or C up-regulated?
Evaluate the significance of the following gene expression
differences – t test
• Expected average ratio = 1; H0: measured mean ≤ 1, H1: measured mean > 1
• Right-hand one-tail test
• t-score = (average − 1)/(s/√n)
• The t-scores for genes 1 and 3 (16.37 and 6.71) exceed the critical value t0.05(4) = 2.132, so their
p-values are less than 0.05 (reject H0); this is not the case for gene 2. It is concluded that the level
of expression is increased only in genes 1 and 3.
Tests involving the variance – the chi-square distribution
The expression level of a gene is known to have a variance σ² = 5000 in the normal human
population. The same gene is measured 26 times and found to have a sample variance s² = 9200. Is there evidence
that the new measurements differ from the population at a 2% significance level?
• Unknown population mean → χ² test
• Null hypothesis H0: s² = σ² = 5000, that is, the newly measured variance is not different from the
population variance σ²
• Alternative hypothesis H1: s² ≠ σ² = 5000 (two-tail test)
• The new variable (score) is
χ² = (n − 1)s²/σ²
• This variable has the interesting property that if all possible samples of size n are drawn from a normal
population with variance σ² and the quantity (n − 1)s²/σ² is computed for each such sample, these values will
always form the same distribution. This sampling distribution is called a χ² (chi-square) distribution.
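Figure: χ² distribution for the two-tail test. H0 is rejected in the left tail (below the point with right-hand tail probability p = 0.99) and in the right tail (above the point with right-hand tail probability p = 0.01), and accepted in between.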
Tests involving the variance – the chi-square distribution
• If the sample standard deviation s is close to the population standard deviation σ, the value of
χ² will be close to n − 1 (the degrees of freedom)
• If the sample standard deviation s is very different from the population standard deviation σ, the
value of χ² will be very different from n − 1
• Use the χ² distribution to solve the above problem:
χ² = (n − 1)s²/σ² = (26 − 1) × 9200/5000 = 46
• http://commons.bcit.ca/math/faculty/david_sabo/apples/math2441/section8/onevariance/chisqtable/chisqtable.htm
• Assuming a 2% significance level, the critical values are χ²0.01(25) = 44.314 and χ²0.99(25) =
11.524 (the subscripts are right-hand tail probabilities)
• The rejection regions are χ² ≤ 11.524 or χ² ≥ 44.314
• Since 46 > 44.314 → reject the null hypothesis
• The measured variance is different from the population variance at a 2% significance level
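The variance test above reduces to one statistic and two table lookups. The following is a minimal sketch; the function name `chi_square_variance_stat` is an assumption made for illustration, and the two critical values are copied from the text (χ² table, 25 degrees of freedom) because the standard library does not provide the χ² distribution.

```python
def chi_square_variance_stat(sample_var, population_var, n):
    """Chi-square statistic (n - 1) * s^2 / sigma^2 for a single-sample variance test."""
    return (n - 1) * sample_var / population_var

CHI2_LOWER = 11.524  # right-hand tail probability 0.99, 25 d.o.f. (from the table)
CHI2_UPPER = 44.314  # right-hand tail probability 0.01, 25 d.o.f. (from the table)

chi2 = chi_square_variance_stat(sample_var=9200, population_var=5000, n=26)
reject = chi2 <= CHI2_LOWER or chi2 >= CHI2_UPPER  # two-tail rejection regions
print(f"chi-square = {chi2:.1f}")
print("reject H0" if reject else "do not reject H0")
```

If the sample variance equalled the population variance, the statistic would equal n − 1 = 25, which lies between the two critical values.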
The chi-square distribution table: tabulated by right-hand tail probability α; Excel CHIINV also uses the right-hand tail.