ASHG2010GeneralMAXtest2 - Statistical Genetics, Kyoto
General MAX test for complicated categorical phenotypes and genotypes
ASHG
Washington D.C., USA
2010/11/02-06
Ryo Yamada, Takahisa Kawaguchi
Kyoto Univ. Kyoto, Japan
2 phenotypes (Case, Control) x 3 genotypes (MM, Mm, mm)
Multiple genetic models
Dominant
Recessive
Additive
6 cells in a 2x3 table are placed as 6 vectors on a plane
Tables with the same Pearson’s chi-sq value draw an elliptic contour.
Tables with the same chi-sq value for a 1-df test on a 2x3 table draw parallel lines as a contour.
The surface normal of these parallel lines represents the 1-df test.
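The two statistics compared on this slide can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names and the example counts are hypothetical, and the 1-df statistic is written in the usual trend-test form with variance factor N rather than N-1.

```python
def pearson_chisq(table):
    """Pearson's chi-square for an R x C table given as a list of rows."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    chisq = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            chisq += (obs - exp) ** 2 / exp
    return chisq


def one_df_chisq(table, scores):
    """1-df chi-square for a 2 x 3 table (cases, controls) and genotype scores."""
    cases, controls = table
    col_tot = [a + b for a, b in zip(cases, controls)]
    n = sum(col_tot)
    p_case = sum(cases) / n
    # Observed score sum in cases minus its expectation under independence.
    t = sum(s * (o - p_case * c) for s, o, c in zip(scores, cases, col_tot))
    mean_s = sum(s * c for s, c in zip(scores, col_tot)) / n
    var_s = sum(c * (s - mean_s) ** 2 for s, c in zip(scores, col_tot))
    return t ** 2 / (p_case * (1 - p_case) * var_s)


# Hypothetical counts, columns ordered MM, Mm, mm (not taken from the slides).
table = [[30, 50, 20], [20, 50, 30]]
chi2_2df = pearson_chisq(table)
chi2_dom = one_df_chisq(table, (1, 1, 0))    # dominant scores
chi2_add = one_df_chisq(table, (1, 0.5, 0))  # additive scores
```

For any scores the 1-df value cannot exceed Pearson's value, which is the algebraic counterpart of the parallel lines being tangent to a smaller ellipse.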
Spherization
Ellipse → Circle: easy handling
Parallel lines and the surface normal of a 1-df test rotate
The relation between Pearson’s chi-sq and 1-df chi-sq becomes simple
Figure labels: test vector; points a and b; tangent point to the smaller circle
• In the circular (spherized) coordinates, the radius to the tangent point is perpendicular to the contour line of the 1-df test.
• In the coordinates with the ellipse, the radius is NOT perpendicular to the contour line at the tangent point.
Surface normals of the three genetic models in the “spherized coordinates”
Test expression in table form: dom, rec, add
MAX3 test and MAX test
• Two sets of parallel lines with arcs make the test contours for the MAX test
Figure labels: MAX3, MAX, arc
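A rough sketch of the two statistics named here, under stated assumptions: MAX3 is taken as the largest of the dominant, additive and recessive 1-df chi-squares, and MAX is assumed to mean the maximum over all genotype scores (1, t, 0) with 0 ≤ t ≤ 1, approximated on a grid. Function names and counts are hypothetical.

```python
def score_chisq(cases, controls, scores):
    """1-df chi-square for genotype scores (trend-test form, variance factor N)."""
    totals = [a + b for a, b in zip(cases, controls)]
    n = sum(totals)
    p = sum(cases) / n
    t = sum(s * (a - p * c) for s, a, c in zip(scores, cases, totals))
    mean = sum(s * c for s, c in zip(scores, totals)) / n
    var = p * (1 - p) * sum(c * (s - mean) ** 2 for s, c in zip(scores, totals))
    return t * t / var


def max3(cases, controls):
    """Return (model name, statistic) of the best of the three genetic models."""
    models = {"dom": (1, 1, 0), "add": (1, 0.5, 0), "rec": (1, 0, 0)}
    stats = {name: score_chisq(cases, controls, s) for name, s in models.items()}
    best = max(stats, key=stats.get)
    return best, stats[best]


def max_continuous(cases, controls, steps=1000):
    """Grid approximation of the maximum over scores (1, t, 0), 0 <= t <= 1."""
    return max(score_chisq(cases, controls, (1, k / steps, 0))
               for k in range(steps + 1))


# Hypothetical counts, columns ordered MM, Mm, mm.
cases, controls = [40, 45, 15], [20, 50, 30]
best_model, max3_stat = max3(cases, controls)
max_stat = max_continuous(cases, controls)
```

Because the grid contains t = 0, 0.5 and 1, the continuous value is never smaller than MAX3; it is bounded above by Pearson's chi-sq, which it reaches when the observed table points into the arc between the recessive and dominant normals.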
Complex categorical phenotypes

            Disease (criteria)     Genotype
Category    R1  R2  R3  R4      MM     Mm     mm   total
C1          +   +   +   -      200   1260   1470    2930
C2          +   +   -   +      180    840    980    2000
C3          +   -   +   +       90    420    490    1000
C4          -   +   +   +       90    420    490    1000
C5          +   +   +   +      270   1260   1470    3000
total                          830   4200   4900    9930

• Example. A disease is defined as:
• A disease is diagnosed when 3 or more out of 4 criteria are met.
• 5x3 table
Complex categorical phenotypes
Example. A disease with ordered stages:
• 5x3 table

Phenotype   Genotype
Stage        MM     Mm     mm   total
0           360   1680   2050    4090
1           270   1260   1470    3000
2           135    630    735    1500
3            90    420    490    1000
4            60    210    245     515
total       915   4200   4990   10105

#> O
#     [,1] [,2] [,3]
#[1,]   20   40   20
#[2,] 1000 2000  500
#[3,] 1000 2000  500
#[4,]  100  120  120
#[5,]   50   90   30
#>
Ts<-MaxTables(O)
Ts<-matrix(c(1,1,0,1,1,0,1,1,0,0,0,0,1,1,0,
1,1,0,1,1,0,0,0,0,1,1,0,1,1,0,
1,1,0,0,0,0,1,1,0,1,1,0,1,1,0,
0,0,0,1,1,0,1,1,0,1,1,0,1,1,0,
1,0,0,1,0,0,1,0,0,0,0,0,1,0,0,
1,0,0,1,0,0,0,0,0,1,0,0,1,0,0,
1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,
0,0,0,1,0,0,1,0,0,1,0,0,1,0,0),
ncol=N*M,byrow=TRUE)

Green: df = 1
Blue: observed table
Left: ordinary scale; right: log scale
• Bottom: stage test
> gmtOut$PowOut[1]
[1] 1.988130e-25
> gmtOutc$PowOut[1]
[1] 0
Max chi-sq = 12.745, corrected P = 0.0029
The same table with MaxVectors:
> gmtOutd$PowOut[1]
[1] 0.003005323
How to generalize the MAX test, defined for 2x3 tables, to NxM tables?
• Space
– df : 2 → (N-1)(M-1)
• 1-df tests
– Their expression in an NxM table should be defined.
– Their geometric counterparts are surface normals in df-space.
Discrete MAX test and continuous MAX test
• Discrete MAX test
– The model consists of the set of surface normals.
• Continuous MAX test
– The model is the area that the surface normals demarcate.
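The discrete case can be sketched for a general NxM table as below. This is an illustration under assumptions, not the authors' implementation: each 1-df test is given as an NxM weight table, the weights are double-centered with the marginal proportions, and the statistic is the squared weighted deviation divided by its large-sample variance. The function names are hypothetical; the counts are the 5x3 criteria table from the earlier slide.

```python
def one_df_chisq(table, weights):
    """1-df chi-square of an N x M table along the direction given by `weights`."""
    n = sum(map(sum, table))
    p = [sum(r) / n for r in table]        # row proportions
    q = [sum(c) / n for c in zip(*table)]  # column proportions
    row_mean = [sum(qj * w for qj, w in zip(q, wr)) for wr in weights]
    col_mean = [sum(pi * w for pi, w in zip(p, wc)) for wc in zip(*weights)]
    grand = sum(pi * rm for pi, rm in zip(p, row_mean))
    num = var = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = n * p[i] * q[j]
            # Double-centering removes the part of the weights explained by margins.
            w = weights[i][j] - row_mean[i] - col_mean[j] + grand
            num += w * (obs - exp)
            var += exp * w * w
    # var is zero only for weights of the form a_i + b_j, which define no test.
    return num * num / var


def discrete_max(table, weight_tables):
    """Discrete MAX statistic: the largest 1-df chi-square over the given tests."""
    return max(one_df_chisq(table, w) for w in weight_tables)


# 5x3 criteria table from the slides (rows C1..C5, columns MM, Mm, mm).
observed = [[200, 1260, 1470], [180, 840, 980], [90, 420, 490],
            [90, 420, 490], [270, 1260, 1470]]
# Two example tests: criterion R4 (not met by C1) with dominant or recessive weights.
r4_dom = [[0, 0, 0]] + [[1, 1, 0]] * 4
r4_rec = [[0, 0, 0]] + [[1, 0, 0]] * 4
stat = discrete_max(observed, [r4_dom, r4_rec])
```

For a 2x3 table with weights placed on the case row only, this reduces to the usual score-based 1-df chi-square.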
Ex. df=3
• The tips of the green triangles are the surface normals for the discrete model.
• The green triangles on the surface are the area of the continuous model.
• Black dots: observed tables
• Red arcs: the shortest paths from the observed tables to the model
• In the “discrete MAX test”, the arcs concentrate into the tips.
• In the “continuous MAX test”, the arcs reach the edges or the tips of the model area.
Figure panels: discrete MAX test, continuous MAX test
How to construct the df-dimensional expression
K categories are expressed as a (K-1)-simplex or a complete graph on K vertices
3 categories in a triangle, 4 categories in a tetrahedron, and so on
NxM vectors can be placed in df-dimensional space
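One convenient construction, offered as a sketch and not necessarily the coordinates used on the slides: take the K standard basis vectors minus their centroid as simplex vertices, and give each cell of the NxM table the tensor product of its row vertex and its column vertex. The vectors are written with N*M coordinates but span only (N-1)(M-1) dimensions. Function names are hypothetical.

```python
from itertools import product
from math import dist, isclose


def simplex_vertices(k):
    """K points with equal pairwise distances: basis vectors minus their centroid."""
    return [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)]
            for i in range(k)]


def cell_vectors(n, m):
    """One vector per cell (i, j): tensor product of row and column vertices."""
    rows, cols = simplex_vertices(n), simplex_vertices(m)
    return {(i, j): [a * b for a in rows[i] for b in cols[j]]
            for i, j in product(range(n), range(m))}


triangle = simplex_vertices(3)  # 3 categories as a triangle
cells = cell_vectors(2, 3)      # 6 vectors for a 2x3 table

# All sides of the triangle have the same length.
sides = [dist(triangle[a], triangle[b]) for a, b in ((0, 1), (0, 2), (1, 2))]
print(all(isclose(s, sides[0]) for s in sides))
```

The cell vectors of any row, and of any column, sum to the zero vector, which mirrors the fixed margins of the table.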
Pearson’s chi-sq values draw ellipsoidal contours, which can be spherized
• Expected values determine the shape of the ellipsoid
Spherization
Tables on a contour line have the same statistic value
Spherization = Eigenvalue decomposition
Spherization-based P-value estimation for the general MAX test fits well with the permutation method
Black: permutation
Red: sphere method
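The permutation side of this comparison can be sketched for the 2x3 MAX3 case as follows; the sphere method itself is not reproduced here. Case/control labels are shuffled with the genotype counts fixed, and the P-value is the share of permuted statistics at least as large as the observed one, with the common +1 correction. Function names, counts, the number of permutations and the seed are all hypothetical choices.

```python
import random

MODELS = ((1, 1, 0), (1, 0.5, 0), (1, 0, 0))  # dominant, additive, recessive


def max3(cases, controls):
    """Largest 1-df chi-square over the three genetic models."""
    totals = [a + b for a, b in zip(cases, controls)]
    n = sum(totals)
    p = sum(cases) / n
    best = 0.0
    for scores in MODELS:
        t = sum(s * (a - p * c) for s, a, c in zip(scores, cases, totals))
        mean = sum(s * c for s, c in zip(scores, totals)) / n
        var = p * (1 - p) * sum(c * (s - mean) ** 2 for s, c in zip(scores, totals))
        best = max(best, t * t / var)
    return best


def permutation_p(cases, controls, n_perm=999, seed=0):
    """Permutation P-value of MAX3 with both margins of the table held fixed."""
    rng = random.Random(seed)
    totals = [a + b for a, b in zip(cases, controls)]
    genotypes = [g for g in range(3) for _ in range(totals[g])]
    n_case = sum(cases)
    observed = max3(cases, controls)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(genotypes)
        perm_cases = [0, 0, 0]
        for g in genotypes[:n_case]:  # the first n_case individuals become cases
            perm_cases[g] += 1
        perm_controls = [totals[g] - perm_cases[g] for g in range(3)]
        if max3(perm_cases, perm_controls) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical counts, columns ordered MM, Mm, mm.
p_value = permutation_p([40, 45, 15], [20, 50, 30])
```

The smallest P-value this sketch can return is 1/(n_perm + 1), so very small P-values need many permutations, which is where an analytic method becomes attractive.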
(N-1)(M-1) component test matrices of the MAX test for NxM tables
R code and a web-based calculator of the method presented for 2x3 tables are available at:
http://www.genome.med.kyoto-u.ac.jp/wiki_tokyo/index.php/Estimate_of_Pvalue_of_MAX_for_2x3_tables
Comments and questions are welcome → [email protected]
Collaborators
– Graduate School of Medicine, Kyoto University, Kyoto, Japan
• Takahisa Kawaguchi
• Katsura Hirosawa
• Meiko Takahashi
• Fumihiko Matsuda
– Lab for Autoimmune Diseases, CGM, RIKEN, Yokohama, Japan
• Yukinori Okada
• Yuta Kochi
• Akari Suzuki
• Kazuhiko Yamamoto