Central Limit Theorem - Creighton University


Hypothesis Testing
J.D. Bramble, Ph.D.
Creighton University Medical Center
Med 483 -- Fall 2006
Hypothesis Testing


Hypothesis testing, or statistical testing, is an
“inference-making” procedure.
Used to answer questions such as:
“Is the sample mean equal to a specified value,
i.e., the population mean?”
If an observed difference between the
experimental and control groups exists, is it
real?
J.D. Bramble, Ph.D.
MED 483 – Fall 2006
Is the Difference Real ?

Sources of differences:





Non-random error or systematic error
Random error or experimental error
Treatment -- i.e., the experimental intervention
Non-random error is minimized by
maximizing internal validity
Statistical procedures help to deal with
random error.
Hypothesis testing




The goal of hypothesis testing is to reject the
null hypothesis--Ho.
Ho is never accepted -- “fail to reject”.
Hypothesis testing does not prove a hypothesis.
Hypothesis testing indicates only whether the
hypothesis is supported by the data.
Hypothesis testing


Based on a concept of “proof” by contradiction.
Is composed of the following steps:







Null hypothesis -- Ho
Alternative or research hypothesis -- Ha
Rejection region
Test statistic
Calculate results
Decision
Conclusion
BP’s of Medical Students

Concerned about the stress level of their
medical students, CUMC records the BPs of 100
medical students. The mean was 132.4 mmHg
and the standard deviation 14.0 mmHg.
Assume that blood pressure is normally
distributed with μ = 128.1 and σ = 17. Do the
data suggest that CUMC medical students
exhibit higher blood pressures than the general
population?
Step 1 – Write the Hypotheses

The direction of the hypotheses

Non-directional: a two-tailed test, non-directional statement (≠)
Directional: a one-tailed test, directional statement (< or >)

Write the alternative first, then the appropriate null

For this example we are interested in an increase

Ho: μ ≤ 128.1 (i.e., μ is less than or equal to 128.1 mmHg systolic)

Ha: μ > 128.1 (i.e., μ is greater than 128.1 mmHg systolic)
Step 2 – Setting the level of
significance & rejection region


The rejection region is set by the level of
significance -- known as the alpha level (α)
α is the probability of incorrectly rejecting Ho
Usually small, to avoid rejecting Ho when it is true:
0.05, 0.01, or 0.001
The critical value is identified using the
appropriate distribution table.
Step 2 – Setting the level of
significance & rejection region
The rejection region for directional hypotheses at α = 0.05,
using a standard normal distribution (i.e., the z-distribution)
Step 2 – Setting the level of
significance & rejection region
The rejection region for non-directional hypotheses at α = 0.05,
using a standard normal distribution (i.e., the z-distribution)
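The critical values that bound these rejection regions can be computed instead of read from a z-table. A minimal Python sketch, assuming the standard library's statistics.NormalDist (Python 3.8+); the helper name z_critical is illustrative and not part of the course material:

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tailed: bool = False) -> float:
    """Positive critical value of the standard normal distribution."""
    # A two-tailed test splits alpha evenly between both tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


z_one_tailed = z_critical(0.05)                   # directional hypotheses
z_two_tailed = z_critical(0.05, two_tailed=True)  # non-directional hypotheses
print(f"one-tailed: {z_one_tailed:.3f}, two-tailed: {z_two_tailed:.3f}")
```

For α = 0.05 this gives roughly 1.645 for the one-tailed case and 1.960 for the two-tailed case.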
Step 3 -- The test statistic



Statistical tests are based largely on the specification of
the data and the problem being analyzed.
Each test has its own assumptions and requirements
that must be met.
Once the test is determined, the test statistic can be
computed.
z, t, F, chi-square, etc.
The test statistic is compared to the critical value to
determine whether the null hypothesis is rejected.
Step 3 -- The test statistic




For this example, we are testing a sample mean
against the population mean or standard.
The z-test is appropriate when we are testing a sample
mean against the population mean and the value of σ is
known or, alternatively, when the sample size is large.
The z-statistic is:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
x̄ = the sample mean; μ = the population mean; σ = the
population standard deviation; and n = sample size
Step 4 – Calculate the statistic

For our example, recall:
n = 100; x̄ = 132.4; μ = 128.1; σ = 17
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{132.4 - 128.1}{17 / \sqrt{100}} = \frac{4.3}{1.7} = 2.53$$

Thus, the z-statistic or zstat = 2.53
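The same arithmetic as a short Python sketch. The function name z_statistic is hypothetical; the inputs are the values from the blood pressure example:

```python
import math


def z_statistic(sample_mean: float, pop_mean: float, pop_sd: float, n: int) -> float:
    """One-sample z statistic: (sample mean - population mean) / standard error."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


# Values from the example: n = 100, sample mean 132.4, mu = 128.1, sigma = 17
z_stat = z_statistic(132.4, 128.1, 17.0, 100)
print(round(z_stat, 2))
```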
Step 5 – Making the decision

Only two decisions to choose from:
Reject Ho
Fail to reject Ho
Compare zstat to zcrit
Reject Ho if |zstat| > |zcrit|
A visual diagram of the distribution helps in
making the correct decision.
Step 5 – Making the decision


Recall zstat = 2.53 and zcrit = 1.645
Since zstat > zcrit we reject Ho
[Figure: standard normal curve; fail-to-reject region below z = 1.645, rejection region (α) above it]
Step 5 – Making the decision




The probability of finding a value at least that
extreme, if Ho is true, is less than 0.05
The actual p-value cannot be obtained from
the table.
Could be reported as p < 0.05
p-values are usually reported in the
literature as exact values except when very
small--then reported as p < 0.001
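Where a printed table stops short of an exact p-value, software can supply one. A sketch, assuming Python's statistics.NormalDist and the zstat = 2.53 from the blood pressure example:

```python
from statistics import NormalDist

z_stat = 2.53  # test statistic from the blood pressure example

# One-tailed p-value: area of the standard normal curve above z_stat.
p_one_tailed = 1 - NormalDist().cdf(z_stat)

# A non-directional test would count both tails.
p_two_tailed = 2 * p_one_tailed

print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")
```

The one-tailed value comes out near 0.006, consistent with reporting p < 0.05.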
Step 6 -- Conclusions



What does this finding mean?!
This is where we interpret what the
mysterious black box “spit” out at us.
The decision to reject Ho means that, on
average, CUMC medical students have
statistically significantly higher blood pressures
than the rest of the population.
P-values

How sure are you that there is, in fact, a difference
between populations?

Recall, the difference between samples may just be a
coincidence.

P-values tell you how rare such a coincidence would be.

A p-value is the probability of getting a difference as
big as or bigger than the one found if H0 is really correct
P-values


Statistical significance vs. importance
Borderline p-values
p = 0.049 vs. p = 0.050 vs. p = 0.051
Extremely significant results
p = 0.004 vs. p = 0.000004
Non-significant results do not prove H0
the data are not strong enough to persuade one
to reject H0
Errors in Hypothesis Testing

Two types of errors
Type I and Type II
Type I errors are committed when we reject
the null hypothesis when it is true.
The probability of committing this error is α
Type II errors are committed when we fail
to reject the null hypothesis when it is false.
Type I Errors




The Type I error rate (α) is stated a priori.
It is the rate at which researchers are
willing to accept rejecting Ho when it is true.
If α = 0.05, then a true Ho is rejected
5% of the time.
When Ho is true, you have a 95% chance of not
making a mistake regarding Ho.
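What α = 0.05 means in practice can be checked by simulation: draw many samples from a population in which Ho is true and count how often a one-tailed z-test rejects. This is an illustrative sketch only. The population values reuse the blood pressure example; the trial count, seed, and function name are arbitrary assumptions.

```python
import random
from statistics import NormalDist, fmean


def simulated_type_i_rate(trials=2000, n=100, mu=128.1, sigma=17.0,
                          alpha=0.05, seed=1):
    """Fraction of samples, drawn with Ho true, that a one-tailed z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # Ho is true by construction: the sample comes from N(mu, sigma).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z_stat = (fmean(sample) - mu) / standard_error
        if z_stat > z_crit:
            rejections += 1
    return rejections / trials


rate = simulated_type_i_rate()
print(f"simulated Type I error rate: {rate:.3f}")
```

The simulated rate should land close to α, with some scatter because the number of trials is finite.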
Type II Errors and Power




Type II errors are not typically discussed in
the literature.
The concept of power is related to β, the
probability of a Type II error: power = 1 - β
Power is the probability of rejecting Ho
when it is false (and so favoring Ha).
Power is the capability of the study to detect
a true difference.
Summary of error types

Relationships between types of errors and
decisions
                                   True state of nature
Decision                           H0 is true         H0 is false
Fail to reject H0 (“Accept” H0)    Correct            Type II
Reject H0                          Type I (α)         Correct
Relationship of Type I and II
Errors and Power

Type I errors are similar to “false-positives”.

Type II errors are similar to “false-negatives”.

Power is similar to the sensitivity of a
diagnostic test.

The ability to detect the presence of a disease
when present.
The Power of a Test


The probability that a test will produce a
significant difference at a given α
Depends on the following:
the true difference between the populations
the sample size
the significance level
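How these three factors drive power can be sketched for the simplest case, a one-tailed, one-sample z-test, where power = 1 - Φ(zcrit - δ√n/σ). The true difference δ = 4.3 mmHg used below is a purely illustrative assumption (it borrows the observed difference from the blood pressure example), and the function name is hypothetical:

```python
from statistics import NormalDist


def z_test_power(true_diff: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-tailed, one-sample z-test for a given true difference."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centered at true_diff / SE.
    shift = true_diff * n ** 0.5 / sigma
    return 1 - std_normal.cdf(z_crit - shift)


power_n25 = z_test_power(4.3, 17.0, 25)
power_n100 = z_test_power(4.3, 17.0, 100)
power_strict_alpha = z_test_power(4.3, 17.0, 100, alpha=0.01)
print(f"n=25: {power_n25:.2f}, n=100: {power_n100:.2f}, "
      f"n=100 at alpha=0.01: {power_strict_alpha:.2f}")
```

Under these assumptions power rises with sample size and falls when α is made stricter.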
Power Analysis




Power analysis is the process of determining
the power of a study to detect an existing
difference.
Determine how large a sample needs to be to
detect a difference of a specific magnitude.
Done a priori.
Low power may result in missing the presence
of reasonable differences that may exist
Power Analysis


In general, as sample size increases, the
power to detect an actual difference also
increases.
When no significant difference is detected,
the sample size should be examined.

Difference may exist but the sample size was
too small to detect it.
Power Analysis: Example

Comparing three methods of managing fever in
neurologic patients
Method                            n    Mean       Std Dev
Acetaminophen only                7    110 min    60.57
Hypothermia with acetaminophen    7    100 min    20.7
Acetaminophen with sponging       7    144 min    60.59
A one way ANOVA of the time required to reduce temperature
was performed. The analysis revealed no significant difference
among the methods (p<0.27).
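The F statistic behind this ANOVA can be reconstructed from the summary table alone. A sketch, assuming only the reported group sizes, means, and standard deviations (the raw data are not given, and the function name is hypothetical):

```python
def one_way_anova_f(ns, means, sds):
    """F statistic and degrees of freedom from per-group summary statistics."""
    total_n = sum(ns)
    k = len(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # Between-group variability: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-group variability: pooled from the group standard deviations.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between, df_within = k - 1, total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df1, df2 = one_way_anova_f(
    ns=[7, 7, 7],
    means=[110, 100, 144],
    sds=[60.57, 20.7, 60.59],
)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

The result, F(2, 18) of about 1.44, is well below the tabled 5% critical value of roughly 3.55, which agrees with the non-significant finding.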
Power Analysis: Example


Should we believe these results? Is there
really no significant difference between the
three methods?
Results from the power analysis:


power = 0.2842
Thus this test has a poor chance of detecting a
difference even though one may exist.
Estimating Sample Size

Determine sample size in advance


Enough subjects / Too many subjects
How large a sample size do I need to obtain
a significantly meaningful result?



How much variability (i.e., standard deviation)?
Level of confidence or significance level?
How much difference?
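These three inputs combine into a sample size formula. For the simplest case, a one-tailed, one-sample z-test, n = ((z for α + z for power) × σ / difference)². The sketch below assumes that case; the 4.3 mmHg difference and σ = 17 are borrowed from the blood pressure example purely for illustration, and the function name is hypothetical:

```python
import math
from statistics import NormalDist


def sample_size_one_sample(diff: float, sigma: float,
                           alpha: float = 0.05, power: float = 0.80) -> int:
    """Smallest n for a one-tailed, one-sample z-test to reach the target power."""
    z = NormalDist().inv_cdf
    n_exact = ((z(1 - alpha) + z(power)) * sigma / diff) ** 2
    return math.ceil(n_exact)  # round up: a fraction of a subject is not possible


n_80 = sample_size_one_sample(diff=4.3, sigma=17.0)              # 80% power
n_90 = sample_size_one_sample(diff=4.3, sigma=17.0, power=0.90)  # 90% power
print(n_80, n_90)
```

More variability, a smaller difference to detect, a stricter α, or a higher target power each push the required n upward.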
Comparing Two Sample Means
J.D. Bramble, Ph.D.
Creighton University Medical Center
Med 483 -- Fall 2006
Objectives





To know when to use the t-test
To be able to write the appropriate
hypotheses
To be able to use Excel to analyze data with
the t-test
To be able to interpret the results of a t-test
To know when to use a paired and an
independent t-test
Requirements or Assumptions





Measures are independent
Population is approximately normal
Homogeneity of variance.
Independent variable is discrete/categorical
Dependent variable is continuous
Why Use the T-test


Because the requirement of “knowing” the
population standard deviation often cannot be
met; at best we have a rough estimate.
Produces a more conservative statistic


for smaller samples
when less information is available about the
population
Types of t-tests

Paired


Used when the data are not independent (i.e.,
repeated measures)
Independent or two-sample

Used to compare two samples
Characteristics




The Student’s t distribution is used to determine the
critical values of t
There is a different t-distribution for every
sample size.
A particular t-distribution is specified by the
degrees of freedom.
Symmetric like the Normal distribution.
Paired t-test: when and why





Dependent groups
Same subjects vs. Matched pairs.
Subjects are compared against themselves.
Comparing measures of paired or matched
subjects is stronger than comparing measures
of different subjects.
Data are collected on the same subjects before and after.
Paired t-test: assumptions






The pairs are randomly selected or at least
representative of the population of interest.
Any matching occurs prior to data collection.
Each subject (e.g., pair) is selected independently.
The distribution of the differences must follow a
Normal distribution.
The same number of subjects in each group
Both before and after measures exist
Paired t-test

A study is designed to determine if a new
memory drug increases test scores. Five
subjects took a memory test before the
administration of the drug (T1) and then again
after a week of taking the drug (T2).
Researchers are interested to see if the new
drug increased one’s memory.
Example data
Patient    Test-1    Test-2    Diff
A          120       115       -5
B          80        95        15
C          90        105       15
D          110       120       10
E          95        100       5
Paired t-test: steps






Set up null and alternative hypotheses
Determine significance level, α, and tcrit
For each subject calculate the change in the
variable.
Calculate the mean and SE of these differences
and then the t-statistic
Compare tstat to tcrit and p-value to alpha
Make appropriate decisions and conclusions
Paired t-test: hypotheses



H0: μdiff ≤ 0    Ha: μdiff > 0    reject H0 if tcalc > tcrit
H0: μdiff ≥ 0    Ha: μdiff < 0    reject H0 if tcalc < -tcrit
H0: μdiff = 0    Ha: μdiff ≠ 0    reject H0 if |tcalc| > tcrit
Rejection Region, Significance
Level, and Critical Value




The rejection region is where Ho is rejected.
Determined by the level of significance.
For a one-tailed test, when α = 0.05, the critical
value divides the distribution in two, with 5% of
the distribution above the critical value.
This value can be found in the t-table with a
known α and sample size.
Finding the Critical Value

Significance level: α = 0.05

Sample size = 5

Degrees of freedom (df)


for a paired t-test = n – 1
Use the table to find t(0.05, 4)
Paired t-test: the test statistic

The one-sample t-statistic, applied to the differences, is:
$$t = \frac{\bar{x}_{diff} - \mu_{diff}}{s / \sqrt{n}}$$
The statistic follows a t-distribution with n-1 degrees of
freedom
Data
Student    Test-1    Test-2    Diff
A          120       115       -5
B          80        95        15
C          90        105       15
D          110       120       10
E          95        100       5
Calculating the Test Statistic
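$$\bar{x}_{diff} = \frac{-5 + 15 + 15 + 10 + 5}{5} = \frac{40}{5} = 8$$
$$SE_{diff} = \frac{s}{\sqrt{n}} = \frac{8.37}{\sqrt{5}} = 3.74$$
$$t = \frac{\bar{x}_{diff} - \mu_{diff}}{s / \sqrt{n}} = \frac{8 - 0}{3.74} = 2.14$$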
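The paired calculation can be reproduced from the raw scores. A sketch using only Python's standard library; the variable names are illustrative, and the differences are taken as Test-2 minus Test-1, as in the data table:

```python
import math
from statistics import mean, stdev

test_1 = [120, 80, 90, 110, 95]    # before the drug (T1)
test_2 = [115, 95, 105, 120, 100]  # after a week on the drug (T2)

# Each subject is compared against themselves.
diffs = [after - before for before, after in zip(test_1, test_2)]

mean_diff = mean(diffs)
se_diff = stdev(diffs) / math.sqrt(len(diffs))  # sample SD / sqrt(n)
t_stat = (mean_diff - 0) / se_diff              # hypothesized mean difference is 0
df = len(diffs) - 1

print(f"mean diff = {mean_diff}, SE = {se_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```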
Summary


H0: μdiff ≤ 0
Ha: μdiff > 0

significance level: α = 0.05
critical value: t(0.05, 4) = 2.132

test statistic: t = 2.14

Decision and Conclusion

Decision: tcrit = 2.132 and tstat = 2.14; thus,
since tstat > tcrit H0 is rejected.

Conclusion: the mean memory test scores
increased after taking the new memory
drug.
Independent samples

In many cases we may not be able to have paired
data. For example:




A paired study design may not be appropriate
A quasi-experimental design may not have paired data
Instead of paired data we now have independent
data.
In these cases, we use an independent t-test to
analyze the data.
Independent samples: assumptions

When dealing with independent data we
have to make an additional assumption

That not only the means of the samples
follow a Normal distribution, but so do
the variances of the samples.

Also, we assume that the variances, σ1² and
σ2², are equal
Independent samples: assumptions



This assumption implies that we are
examining populations that are believed to
have the same spread.
Some suggest the use of a formal test to
determine if σ1² = σ2²
Accepting the assumption of equal variances,
we now estimate the common unknown
variance -- σ²
Independent t-test: example

Researchers compared patients taking
placebo to those taking the new
antihypertensive drug. Is there a significant
difference in BPs between the groups?
Control    Drug
125        120
90         105
95         110
105        115
100        105
Independent t-test: steps






Set up null and alternative hypotheses
Determine significance level, α, and tcrit
Compute sp
Compute tstat
Compare tstat to tcrit and p-value to alpha
Make appropriate decisions and conclusions
Two sample t-test: hypotheses


Ho: μ1 ≤ μ2 or μ1 - μ2 ≤ 0
Ha: μ1 > μ2 or μ1 - μ2 > 0
reject Ho if tcalc > tcrit

Ho: μ1 ≥ μ2 or μ1 - μ2 ≥ 0
Ha: μ1 < μ2 or μ1 - μ2 < 0
reject Ho if tcalc < -tcrit

Ho: μ1 = μ2 or μ1 - μ2 = 0
Ha: μ1 ≠ μ2 or μ1 - μ2 ≠ 0
reject Ho if |tcalc| > tcrit
Independent t-test: Critical Value

Significance level: α = 0.05

Sample size = 10

Degrees of freedom (df)


for an independent t-test assuming equal
variances df = n – 2
Use the table to find t(0.05, 8)
Pooled variance: when σ1² = σ2²

The common value is estimated by the pooled
sample variance -- sp².
It is a weighted average of s1² and s2²; its
square root is the pooled standard deviation, sp.
$$s_p^2 = \frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}$$
Pooled Variance: when σ1² = σ2²
$$s_p^2 = \frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2} = \frac{13.51^2 (5 - 1) + 6.52^2 (5 - 1)}{5 + 5 - 2} = 112.5$$
The test statistic when σ1² = σ2²

The two-sample t-statistic is:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$$
The statistic follows a t-distribution with n-2 degrees of freedom, where n = n1 + n2
Independent t-test: calculating
the test statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} = \frac{103 - 111}{\sqrt{\frac{112.5}{5} + \frac{112.5}{5}}} = -1.193$$
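Both the pooled variance and the two-sample statistic can be recomputed from the raw blood pressures. A sketch using only the standard library and assuming equal variances, as the slides do; the variable names are illustrative:

```python
import math
from statistics import mean, variance

control = [125, 90, 95, 105, 100]  # placebo group
drug = [120, 105, 110, 115, 105]   # antihypertensive group

n1, n2 = len(control), len(drug)

# Pooled variance: sample variances weighted by their degrees of freedom.
pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(drug)) / (n1 + n2 - 2)

standard_error = math.sqrt(pooled_var / n1 + pooled_var / n2)
t_stat = (mean(control) - mean(drug)) / standard_error
df = n1 + n2 - 2

print(f"pooled variance = {pooled_var:.1f}, t = {t_stat:.3f}, df = {df}")
```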
Summary


H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0

significance level: α = 0.05
critical value: t(0.05, 8) = 2.306

test statistic: t = -1.193

Independent t-test: example

Decision: since |-1.193 | < 2.306 (tstat < tcrit ), we
fail to reject H0

Conclusion: the data do not support the claim that
patients taking the antihypertensive drug had
significantly different blood pressure than patients
taking the placebo. Said another way, there is not
enough evidence to conclude the new drug is
significantly different from the placebo.
Chi-square
Testing the independence
between two categorical variables
J.D. Bramble, Ph.D.
Creighton University Medical Center
Med 483 -- Fall 2006
Objectives




Indicate the kinds of data and circumstances that
call for a chi-square test
Compute the expected value for a chi-square
contingency table
Compute the chi-square statistic and the
appropriate degrees of freedom
Indicate the type of hypotheses that can be tested
Using the chi-square test

ANOVA and t-tests require data that are
quantitative -- ratio or interval level

Often data are recorded as frequency, categorical,
or qualitative information
This type of data is nominal and is usually
classified in a contingency table.


The use of the chi-square test helps us determine
if there is an association between the variables
Using the chi-square test

Data must be independent

The chi-square compares the observed frequency
with the expected frequency

The expected frequencies are calculated based on
the hypothesis that there is no relationship

The test helps to determine if the deviations
between what is observed and what is expected
(O-E) are significant
Chi-square example

To examine the explanation that the hereditary condition
sickle cell trait offers some protection against malaria
infection, 543 African children were checked for the trait
and malaria. Do the data provide evidence in favor of this
explanation?

                     Malaria
Sickle-cell trait    Heavy Infection    Noninfected or Lightly Infected    Total
Yes                  36                 100                                136
No                   152                255                                407
Total                188                355                                543
Step 1: hypotheses




H0: there is no relationship between sickle-cell
trait and heavy malaria infection.
Ha: there is a relationship between sickle-cell trait
and heavy malaria infection.
We first need to compute expected values.
Expected values are generated based on the null
hypothesis.
Step 2: Critical value



The critical value for the rejection region is
found with regard to α and the appropriate
degrees of freedom.
The significance level is α = 0.05
The degrees of freedom = (c-1)(r-1)
where
c = the number of columns
r = the number of rows
The critical value χ²(α, df) is obtained from the
appropriate table
Step 3: The test statistic

To get the χ² test statistic we must calculate the
expected observations assuming there is no relationship
between the two variables.

If there is no relationship between the variables the two
variables are independent of each other and we can
calculate the expected cells using the rules of probability.

Expected cells are what you would expect, assuming Ho
is true
Computing Expected Cells

Expected cells are found by multiplying the
row total (Rn) by the column total (Cn) and
dividing by the grand total (T).
                     Malaria
Sickle-cell trait    Heavy Infection    Noninfected or Lightly Infected    Total
Yes                  (R1*C1)/T          (R1*C2)/T                          R1
No                   (R2*C1)/T          (R2*C2)/T                          R2
Total                C1                 C2                                 T
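The row-total times column-total rule translates directly into code. A minimal sketch for a table of any size; the function name expected_counts is hypothetical, and the observed counts are those of the sickle-cell example:

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: sickle-cell trait yes / no.
# Columns: heavy infection / noninfected or lightly infected.
observed = [[36, 100],
            [152, 255]]

expected = expected_counts(observed)
for row in expected:
    print([round(cell, 2) for cell in row])
```

The expected counts keep the same row and column totals as the observed table.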
Expected cells
Observed
Sickle-cell trait    Heavy Infection    Noninfected or Lightly Infected    Total
Yes                  36                 100                                136
No                   152                255                                407
Total                188                355                                543

Expected
Sickle-cell trait    Heavy Infection    Noninfected or Lightly Infected    Total
Yes                  47.09              88.91                              136
No                   140.91             266.09                             407
Total                188                355                                543

χ² statistic


The chi-square statistic is obtained by
summing, over all cells, the squared deviation between the
observed and expected values divided by the expected value.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O = observed cells
E = expected cells
χ² statistic

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
(36-47.09)²/47.09 + (100-88.91)²/88.91 +
(152-140.91)²/140.91 + (255-266.09)²/266.09
= 2.61 + 1.38 + 0.87 + 0.46 = 5.32
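The sum can be checked in a few lines of Python. This sketch uses the unrounded expected counts reported in the Excel output later in the deck, so it gives about 5.33 rather than the hand-rounded 5.32. The p-value line relies on an identity that holds only for 1 degree of freedom (a χ² variable with df = 1 is a squared standard normal), so it is not a general-purpose routine:

```python
import math

observed = [36, 100, 152, 255]
# Unrounded expected counts, as listed in the Excel output for this example.
expected = [47.08655617, 88.91344383, 140.9134438, 266.0865562]

# Sum of (O - E)^2 / E over all four cells.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper-tail probability for df = 1 ONLY: P(Z^2 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq:.2f}, p = {p_value:.4f}")
```

The p-value of about 0.021 agrees with the Excel result for the same table.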
Chi-square example: decision

Degrees of freedom = (2-1)(2-1) = 1

From the table: χ²crit = 3.841

Since χ²stat (5.32) > χ²crit (3.84) we reject H0

Conclude that there is a significant association
between sickle-cell trait and malaria infection.
The table indicates that the trait appears to be
protective: fewer heavy infections were observed among
children with the trait (36) than expected (47.09)
Chi-square limitations

The closeness of the approximation, or the
accuracy of the test, depends on the
frequencies in the various cells.
Thus, the basic rule that must be followed
is: the expected frequencies must not be too
small.
“Small” varies with the type of χ² test.
“Small” rule

The accepted general rule:
No expected frequency should be < 1 and not
more than 20% of the cells should have
expected frequencies < 5

If the rule is violated, results are not valid.
Merging or collapsing cells may fix the problem
If frequencies are too small, other tests should be
used (e.g., Fisher’s exact test)


Using Excel
Chi-square test

Observed    No               Yes              Total
Male        xxxx             xxxx             =sum(B5:C5)
Female      xxxx             xxxx             =sum(B6:C6)
Total       =sum(B5:B6)      =sum(C5:C6)      =sum(B7:C7)

Expected    No               Yes              Total
Male        =D5*B7/D7        =D5*C7/D7        =sum(B11:C11)
Female      =D6*B7/D7        =D6*C7/D7        =sum(B12:C12)
Total       =sum(B11:B12)    =sum(C11:C12)    =sum(B13:C13)

pvalue      =chitest(B5:C6, B11:C12)
Using Excel
Observed    Heavy Infection    None or Light Infection    Total
Yes         36                 100                        136
No          152                255                        407
Total       188                355                        543

Expected    Heavy Infection    None or Light Infection    Total
Yes         47.08655617        88.91344383                136
No          140.9134438        266.0865562                407
Total       188                355                        543

pvalue      0.02099889
Fisher’s Exact Test
a      b      a+b
c      d      c+d
a+c    b+d    n

$$p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{n!\,a!\,b!\,c!\,d!}$$
Must compute this probability for the observed table and for all
more extreme outcomes with the same fixed margins, then sum them
The decision is made by comparing the summed probability to the
significance level
Fisher’s Example

12 lab rats are randomly assigned to two
equal-sized groups. The experimental group
is administered a proposed carcinogenic
agent. The rats are observed for the
development of tumors.
Example Cont’d
         Tumor    No Tumor    Total
Exp      4        2           6
Cntrl    1        5           6
Total    5        7           12

Is the likelihood of developing a tumor the same for
both groups?
H0: the groups and tumor development are
independent
H1: the two variables are not independent
Example cont’d



Compute p(4 tumors) in the experimental
group: p = (6!6!5!7!)/(12!4!2!1!5!) = 0.1136
Compute p(5 tumors):
p = (6!6!5!7!)/(12!5!1!0!6!) = 0.0076
Since 0.1136 + 0.0076 = 0.1212 is > 0.05, H0 cannot
be rejected; thus we cannot conclude that group and
tumor development are related.
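The two table probabilities can be verified with binomial coefficients, which are algebraically equivalent to the factorial formula. A sketch assuming Python 3.8+ for math.comb; the function name table_probability is hypothetical:

```python
from math import comb


def table_probability(a: int, b: int, c: int, d: int) -> float:
    """Probability of one 2x2 table [[a, b], [c, d]] given its fixed margins."""
    n = a + b + c + d
    # Same value as (a+b)!(c+d)!(a+c)!(b+d)! / (n! a! b! c! d!)
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


p_observed = table_probability(4, 2, 1, 5)      # 4 tumors in the experimental group
p_more_extreme = table_probability(5, 1, 0, 6)  # 5 tumors in the experimental group
p_one_tailed = p_observed + p_more_extreme

print(f"{p_observed:.4f} + {p_more_extreme:.4f} = {p_one_tailed:.4f}")
```

With the margins fixed, 5 tumors in the experimental group is the only outcome more extreme than the observed 4, so the one-tailed p-value is the sum of these two terms.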