Examples of General Linear Models


Experimental design and statistical analyses of data
Lesson 1: General linear models and design of experiments
Examples of General Linear Models (GLM)
Simple linear regression:
Ex: Depth at which a white disc is no longer visible in a lake
$y = \beta_0 + \beta_1 x + \varepsilon$
y = depth at disappearance (dependent variable)
x = nitrogen concentration of water (independent variable)
$\beta_0$ = intercept
$\beta_1$ = slope
The residual $\varepsilon$ expresses the deviation between the model and the actual observation
[Figure: depth (m) against N/volume water]
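As a minimal sketch of how the intercept and slope of a simple linear regression can be estimated by least squares, the Python snippet below uses NumPy. The data values are invented for illustration (they are not taken from the lake example), and all variable names are assumptions.

```python
import numpy as np

# Hypothetical data (not from the slides): nitrogen concentration x, disc depth y.
x = np.array([1.0, 2.0, 4.0, 6.0, 8.0])
y = np.array([8.5, 8.0, 6.0, 4.5, 3.0])

# Design matrix: a column of ones (intercept) and a column holding x (slope).
X = np.column_stack([np.ones_like(x), x])

# Least-squares estimates of beta0 (intercept) and beta1 (slope).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0, beta1 = beta

# Residuals: deviation between each observation and the fitted model.
residuals = y - X @ beta
print(f"intercept = {beta0:.3f}, slope = {beta1:.3f}")
```

Because the model contains an intercept, the residuals of the least-squares fit sum to zero (up to rounding).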
Polynomial regression:
Ex:
y = depth at disappearance
x = nitrogen concentration of water
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
[Figure: depth (m) against N/volume water]
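A polynomial regression is still linear in its parameters, because $x$ and $x^2$ simply enter as two separate columns of the design matrix. The sketch below illustrates this with noise-free data generated from assumed coefficients (9, -1.5 and 0.1 are invented, not from the slides).

```python
import numpy as np

# Hypothetical, noise-free data generated from y = 9 - 1.5 x + 0.1 x^2.
x = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
y = 9.0 - 1.5 * x + 0.1 * x**2

# The model is linear in the parameters: 1, x and x^2 are three columns.
X = np.column_stack([np.ones_like(x), x, x**2])

# Least squares recovers the three coefficients used to generate the data.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 3))
```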
Multiple regression:
Ex:
y = depth at disappearance
$x_1$ = Concentration of N
$x_2$ = Concentration of P
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
[Figure: depth against concentration of N and concentration of P]
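Analysis of variance (ANOVA):
Ex:
y = depth at disappearance
$x_1$ = Blue disc
$x_2$ = Green disc
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
White disc: $x_1 = 0$; $x_2 = 0$
Blue disc: $x_1 = 1$; $x_2 = 0$
Green disc: $x_1 = 0$; $x_2 = 1$
[Figure: depth against disc color (White, Blue, Green)]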
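The ANOVA model can be written as a regression on indicator (dummy) variables. The sketch below builds such a design matrix for three disc colors, with white as the reference level, and shows that the estimates are the white-disc mean and the differences of the other means from it. The depth values are invented for illustration.

```python
import numpy as np

# Hypothetical depths (m) for three disc colors; the numbers are invented.
depth = {
    "white": [6.0, 7.0, 8.0],
    "blue": [4.0, 5.0, 6.0],
    "green": [3.0, 3.5, 4.0],
}

rows, y = [], []
for color, values in depth.items():
    for v in values:
        x1 = 1.0 if color == "blue" else 0.0   # indicator for a blue disc
        x2 = 1.0 if color == "green" else 0.0  # indicator for a green disc
        rows.append([1.0, x1, x2])
        y.append(v)

X, y = np.array(rows), np.array(y)

# beta[0] = mean(white); beta[1] = mean(blue) - mean(white);
# beta[2] = mean(green) - mean(white).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 3))
```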
Analysis of covariance (ANCOVA):
Ex:
y = depth at disappearance
$x_1$ = Blue disc
$x_2$ = Green disc
$x_3$ = Concentration of N
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_3 + \beta_5 x_2 x_3 + \varepsilon$
[Figure: depth against concentration of N]
Nested analysis of variance:
Ex:
y = depth at disappearance
$\alpha_i$ = effect of the ith lake
$\beta_{(i)j}$ = effect of the jth measurement in the ith lake
$y = \mu + \alpha_i + \beta_{(i)j} + \varepsilon$
What is not a general linear model?
y = β0(1+β1x)
y = β0+cos(β1+β2x)
Other topics covered by this course:
• Multivariate analysis of variance (MANOVA)
• Repeated measurements
• Logistic regression
Experimental designs
Examples
Randomized design
• Effects of p treatments (e.g. drugs) are compared
• Total number of experimental units (persons) is n
• Treatment i is administered to $n_i$ units
• Allocation of treatments among units is random
Example of randomized design
• 4 drugs (called A, B, C, and D) are tested (i.e. p = 4)
• 12 persons are available (i.e. n = 12)
• Each treatment is given to 3 persons (i.e. $n_i = 3$ for i = 1, 2, ..., p), so the design is balanced
• Persons are allocated randomly among treatments
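A balanced random allocation like the one described above can be sketched in a few lines of Python. The function name `allocate`, the person labels 1 to 12 and the seed are assumptions made for illustration.

```python
import random

def allocate(persons, treatments, per_treatment, seed=None):
    """Randomly assign persons to treatments with equal group sizes."""
    persons = list(persons)
    if len(persons) != len(treatments) * per_treatment:
        raise ValueError("a balanced design needs n = p * n_i persons")
    rng = random.Random(seed)
    rng.shuffle(persons)
    # Consecutive slices of the shuffled list form the treatment groups.
    return {
        t: persons[i * per_treatment:(i + 1) * per_treatment]
        for i, t in enumerate(treatments)
    }

# 12 persons, 4 drugs, 3 persons per drug (seed fixed only for reproducibility).
groups = allocate(range(1, 13), ["A", "B", "C", "D"], per_treatment=3, seed=1)
for drug, members in groups.items():
    print(drug, sorted(members))
```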
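Drugs       A             B             C             D
            $y_{1A}$      $y_{1B}$      $y_{1C}$      $y_{1D}$
            $y_{2A}$      $y_{2B}$      $y_{2C}$      $y_{2D}$
            $y_{3A}$      $y_{3B}$      $y_{3C}$      $y_{3D}$
Mean        $\bar{y}_A$   $\bar{y}_B$   $\bar{y}_C$   $\bar{y}_D$
Note! The observations in a row come from different persons.
$\bar{y}_A = \sum_j y_{jA} / n_A$
$\bar{y}_B = \sum_j y_{jB} / n_B$
$\bar{y}_C = \sum_j y_{jC} / n_C$
$\bar{y}_D = \sum_j y_{jD} / n_D$
Total: $\bar{y} = \sum_{i,j} y_{ij} / n$
$y_A = \bar{y}_A + \varepsilon$
$y_B = \bar{y}_B + \varepsilon$
$y_C = \bar{y}_C + \varepsilon$
$y_D = \bar{y}_D + \varepsilon$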
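Drug A ($x_1 = x_2 = x_3 = 0$): $y_A = \bar{y}_A + \varepsilon = \beta_0 + \varepsilon$
Drug B ($x_1 = 1$): $y_B = \bar{y}_B + \varepsilon = \beta_0 + \beta_1 + \varepsilon$
Drug C ($x_2 = 1$): $y_C = \bar{y}_C + \varepsilon = \beta_0 + \beta_2 + \varepsilon$
Drug D ($x_3 = 1$): $y_D = \bar{y}_D + \varepsilon = \beta_0 + \beta_3 + \varepsilon$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
$\bar{y}_A = \beta_0$
$\bar{y}_B = \beta_0 + \beta_1 \Rightarrow \beta_1 = \bar{y}_B - \bar{y}_A$
$\bar{y}_C = \beta_0 + \beta_2 \Rightarrow \beta_2 = \bar{y}_C - \bar{y}_A$
$\bar{y}_D = \beta_0 + \beta_3 \Rightarrow \beta_3 = \bar{y}_D - \bar{y}_A$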
Source                                       Degrees of freedom
Estimate of $\beta_0$                        1
Treatments ($\beta_1, \beta_2, \beta_3$)     p - 1 = 3
Residuals                                    n - p = 8
Total                                        n = 12
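To make the partition of degrees of freedom concrete, the following sketch computes the treatment and residual sums of squares and the F ratio for a balanced randomized design with p = 4 and n = 12. The response values are invented for illustration and the variable names are assumptions.

```python
import numpy as np

# Hypothetical responses for 4 drugs with 3 persons each (invented numbers).
data = {
    "A": [5.0, 6.0, 7.0],
    "B": [7.0, 8.0, 9.0],
    "C": [4.0, 5.0, 6.0],
    "D": [8.0, 9.0, 10.0],
}

groups = [np.array(v) for v in data.values()]
n = sum(len(g) for g in groups)   # total number of experimental units
p = len(groups)                   # number of treatments
grand_mean = np.concatenate(groups).mean()

# Sum of squares between treatments and within treatments (residual).
ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_resid = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Degrees of freedom: 1 (intercept) + (p - 1) + (n - p) = n.
df_treat, df_resid = p - 1, n - p
F = (ss_treat / df_treat) / (ss_resid / df_resid)
print(df_treat, df_resid, F)
```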
Randomized block design
• Every treatment is applied to each experimental unit (block)
• The order of treatments within each block is allocated at random
Blocks (b = 3), treatments (p = 4):
Block 1:    B    A    D    C
Block 2:    C    B    A    D
Block 3:    B    D    A    C
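A layout of this kind, in which every block receives all treatments in an independently randomized order, could be generated as sketched below. The function name `randomized_blocks` and the seed are assumptions for illustration.

```python
import random

def randomized_blocks(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments for each block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # independent random order within each block
        layout.append(order)
    return layout

# 3 blocks (persons), 4 treatments (drugs); seed fixed for reproducibility.
layout = randomized_blocks("ABCD", n_blocks=3, seed=2)
for block, order in enumerate(layout, start=1):
    print(f"Block {block}: {' '.join(order)}")
```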
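                        Treatments
Persons     A             B             C             D             Average
1           $y_{1A}$      $y_{1B}$      $y_{1C}$      $y_{1D}$      $\bar{y}_1$
2           $y_{2A}$      $y_{2B}$      $y_{2C}$      $y_{2D}$      $\bar{y}_2$
3           $y_{3A}$      $y_{3B}$      $y_{3C}$      $y_{3D}$      $\bar{y}_3$
Average     $\bar{y}_A$   $\bar{y}_B$   $\bar{y}_C$   $\bar{y}_D$   $\bar{y}$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$
Blocks (b-1): $\beta_1 x_1 + \beta_2 x_2$
Treatments (p-1): $\beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$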
Randomized block design
Source                    Degrees of freedom
Estimate of $\beta_0$     1
Blocks (persons)          b - 1 = 2
Treatments (drugs)        p - 1 = 3
Residuals                 n - [(b-1) + (p-1) + 1] = 6
Total                     n = 12
Double block design (Latin square)
                 Person
Sequence    1    2    3    4
1           B    D    A    C
2           A    C    D    B
3           C    A    B    D
4           D    B    C    A
Rows (a = 4)
Columns (b = 4)
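The defining property of a Latin square is that each treatment occurs exactly once in every row and once in every column. The sketch below checks this property for the 4 x 4 layout, entered with rows as sequences and columns as persons; that reading of the slide layout, and the function name `is_latin_square`, are assumptions.

```python
def is_latin_square(square):
    """Check that every symbol occurs exactly once per row and per column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok

# Rows = sequences 1-4, columns = persons 1-4 (assumed reading of the slide).
square = ["BDAC", "ACDB", "CABD", "DBCA"]
print(is_latin_square(square))
```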
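$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9 + \varepsilon$
Sequence (a-1): $\beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
Persons (b-1): $\beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6$
Drugs (p-1): $\beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9$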
Latin-square design
Source                    Degrees of freedom
Estimate of $\beta_0$     1
Rows (sequences)          a - 1 = 3
Blocks (persons)          b - 1 = 3
Treatments (drugs)        p - 1 = 3
Residuals                 n - [3(p-1) + 1] = 6
Total                     n = $p^2$ = 16
Factorial designs
• Are used when the combined effects of two or more factors are investigated concurrently.
• As an example, assume that factor A is a drug and factor B is the way the drug is administered
• Factor A occurs in three different levels (called drug A1, A2 and A3)
• Factor B occurs in four different levels (called B1, B2, B3 and B4)
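Factorial designs
                        Factor B
Factor A    B1                    B2                    B3                    B4                    Average
A1          $y_{11}$              $y_{12}$              $y_{13}$              $y_{14}$              $\bar{y}_{1\cdot}$
A2          $y_{21}$              $y_{22}$              $y_{23}$              $y_{24}$              $\bar{y}_{2\cdot}$
A3          $y_{31}$              $y_{32}$              $y_{33}$              $y_{34}$              $\bar{y}_{3\cdot}$
Average     $\bar{y}_{\cdot 1}$   $\bar{y}_{\cdot 2}$   $\bar{y}_{\cdot 3}$   $\bar{y}_{\cdot 4}$   $\bar{y}$
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$
Effect of A: $\beta_1 x_1 + \beta_2 x_2$
Effect of B: $\beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$
No interaction between A and B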
Factorial experiment with no interaction
• Survival time at 15 °C and 50% RH: 17 days
• Survival time at 25 °C and 50% RH: 8 days
• Survival time at 15 °C and 80% RH: 19 days
• What is the expected survival time at 25 °C and 80% RH?
• An increase in temperature from 15 °C to 25 °C at 50% RH decreases survival time by 9 days
• An increase in RH from 50% to 80% at 15 °C increases survival time by 2 days
• An increase in temperature from 15 °C to 25 °C together with an increase in RH from 50% to 80% is expected to change survival time by -9 + 2 = -7 days, giving an expected survival time of 17 - 7 = 10 days
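The additive (no interaction) reasoning above can be written out as a small calculation. In this sketch the three observed cells come from the example, while the names `beta0`, `beta1` and `beta2` are labels chosen for illustration: the baseline cell, the temperature effect and the humidity effect.

```python
# Observed survival times (days); keys are (temperature in deg C, RH in %).
observed = {(15, 50): 17, (25, 50): 8, (15, 80): 19}

beta0 = observed[(15, 50)]                       # baseline: 15 deg C, 50% RH
beta1 = observed[(25, 50)] - observed[(15, 50)]  # effect of 15 -> 25 deg C
beta2 = observed[(15, 80)] - observed[(15, 50)]  # effect of 50% -> 80% RH

# Without interaction the two effects simply add up.
expected = beta0 + beta1 + beta2
print(f"Expected survival at 25 deg C and 80% RH: {expected} days")
```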
[Figure: factorial experiment with no interaction; survival time (days) against temperature (°C), with one line for 50% RH and one for 80% RH]
Factorial experiment with no interaction
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
[Figure: survival time (days) against temperature (°C), with $\beta_0$, $\beta_1$ and $\beta_2$ marked on the plot]
Factorial experiment with interaction
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
[Figure: survival time (days) against temperature (°C), with $\beta_0$, $\beta_1$, $\beta_2$ and $\beta_3$ marked on the plot]
Factorial designs
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_1 x_3 + \beta_7 x_1 x_4 + \beta_8 x_1 x_5 + \beta_9 x_2 x_3 + \beta_{10} x_2 x_4 + \beta_{11} x_2 x_5 + \varepsilon$
Effect of A: $\beta_1 x_1 + \beta_2 x_2$
Effect of B: $\beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$
Interactions between A and B: $\beta_6 x_1 x_3 + \beta_7 x_1 x_4 + \beta_8 x_1 x_5 + \beta_9 x_2 x_3 + \beta_{10} x_2 x_4 + \beta_{11} x_2 x_5$
Two-way factorial design with interaction, but without replication
Source                          Degrees of freedom
Estimate of $\beta_0$           1
Factor A (drug)                 a - 1 = 2
Factor B (administration)       b - 1 = 3
Interactions between A and B    (a-1)(b-1) = 6
Residuals                       n - ab = 0
Total                           n = ab = 12
Two-way factorial design without replication
Source                          Degrees of freedom
Estimate of $\beta_0$           1
Factor A (drug)                 a - 1 = 2
Factor B (administration)       b - 1 = 3
Residuals                       n - a - b + 1 = 6
Total                           n = ab = 12
Without replication it is necessary to assume no interaction between factors!
Two-way factorial design with replications
Source                          Degrees of freedom
Estimate of $\beta_0$           1
Factor A (drug)                 a - 1
Factor B (administration)       b - 1
Interactions between A and B    (a-1)(b-1)
Residuals                       ab(r-1)
Total                           n = rab
Two-way factorial design with interaction (r = 2)
Source                          Degrees of freedom
Estimate of $\beta_0$           1
Factor A (drug)                 a - 1 = 2
Factor B (administration)       b - 1 = 3
Interactions between A and B    (a-1)(b-1) = 6
Residuals                       ab(r-1) = 12
Total                           n = rab = 24
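The degrees-of-freedom tables for the two-way design all follow one bookkeeping rule: the residual gets whatever is left of n after the intercept, main effects and (optionally) interactions. A minimal sketch, with the function name `two_way_df` chosen for illustration:

```python
def two_way_df(a, b, r, interaction=True):
    """Degrees of freedom for a two-way factorial design with r replicates."""
    n = r * a * b
    df = {"intercept": 1, "A": a - 1, "B": b - 1}
    if interaction:
        df["AxB"] = (a - 1) * (b - 1)
    # The residual takes what is left after all model terms are counted.
    df["residual"] = n - sum(df.values())
    df["total"] = n
    return df

print(two_way_df(3, 4, r=2))                     # with replication
print(two_way_df(3, 4, r=1))                     # interaction, no replication
print(two_way_df(3, 4, r=1, interaction=False))  # no interaction assumed
```

With a = 3, b = 4 and r = 1, including the interaction leaves zero residual degrees of freedom, which is why no interaction must be assumed when there is no replication.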
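Three-way factorial design
Factor A: $x_1, x_2$
Factor B: $x_3, \dots, x_7$
Factor C: $x_8, x_9, x_{10}$
$y_{ijk} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10}$
(10 main effects)
$+ \beta_{11} x_1 x_3 + \beta_{12} x_1 x_4 + \beta_{13} x_1 x_5 + \beta_{14} x_1 x_6 + \beta_{15} x_1 x_7 + \beta_{16} x_1 x_8 + \beta_{17} x_1 x_9 + \beta_{18} x_1 x_{10} + \beta_{19} x_2 x_3 + \beta_{20} x_2 x_4 + \dots$
(31 two-way interactions)
$+ \beta_{42} x_1 x_3 x_8 + \beta_{43} x_1 x_3 x_9 + \beta_{44} x_1 x_3 x_{10} + \beta_{45} x_1 x_4 x_8 + \beta_{46} x_1 x_4 x_9 + \dots + \beta_{69} x_2 x_7 x_8 + \beta_{70} x_2 x_7 x_9 + \beta_{71} x_2 x_7 x_{10} + \varepsilon$
(30 three-way interactions)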
Three-way factorial design
Source                             Degrees of freedom
Estimate of $\beta_0$              1
Factor A                           a - 1 = 2
Factor B                           b - 1 = 5
Factor C                           c - 1 = 3
Interactions between A and B       (a-1)(b-1) = 10
Interactions between A and C       (a-1)(c-1) = 6
Interactions between B and C       (b-1)(c-1) = 15
Interactions between A, B and C    (a-1)(b-1)(c-1) = 30
Residuals                          abc(r-1) = 0
Total                              n = rabc = 72
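The counts of main-effect, two-way and three-way terms follow directly from the number of dummy variables per factor (levels minus one). The sketch below reproduces them for a = 3, b = 6 and c = 4, the level counts implied by the degrees of freedom in the table; the helper name `n_terms` is an assumption.

```python
from itertools import combinations
from math import prod

# Number of levels per factor, as implied by a-1 = 2, b-1 = 5, c-1 = 3.
levels = {"A": 3, "B": 6, "C": 4}
dummies = {factor: k - 1 for factor, k in levels.items()}

def n_terms(order):
    """Number of model terms involving exactly `order` distinct factors."""
    return sum(
        prod(dummies[f] for f in combo)
        for combo in combinations(dummies, order)
    )

main, two_way, three_way = n_terms(1), n_terms(2), n_terms(3)
total_parameters = 1 + main + two_way + three_way  # including the intercept
print(main, two_way, three_way, total_parameters)
```

With a single replicate (r = 1) the 72 parameters use up all n = 72 observations, leaving no residual degrees of freedom.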
Why should more than two levels of a factor be used in a factorial design?
Two levels of a factor
[Figure: survival time (days) against temperature (°C)]
Three levels, factor qualitative
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
[Figure: survival time (days) against temperature (°C) at the levels Low, Medium and High, with $\beta_0$, $\beta_1$ and $\beta_2$ marked on the plot]
Three levels, factor quantitative
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
[Figure: survival time (days) against temperature (°C)]
Why should not many levels of each factor be used in a factorial design?
Because each additional level of a factor increases the number of experimental units to be used.
For example, a five-factor experiment with four levels per factor yields $4^5 = 1024$ different combinations.
If not all combinations are applied in an experiment, the design is partially factorial.
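The growth in the number of treatment combinations can be checked by enumerating them. This sketch lists every combination of five factors with four levels each; the level labels 0 to 3 are placeholders.

```python
from itertools import product

levels_per_factor = 4
n_factors = 5

# Every combination of one level per factor in a full factorial design.
all_combinations = list(product(range(levels_per_factor), repeat=n_factors))
print(len(all_combinations))  # equals levels_per_factor ** n_factors
```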