Examples of GO Modeling


WIIFM: examples of functional modeling
GO Workshop
3-6 August 2010
Key points
Modeling is subordinate to the biological questions/hypotheses.
Together, the Gene Ontology and canonical genetic networks/pathways provide the central and complementary foundation for modeling functional genomics data.
Annotation follows information, and information changes daily: STEP 1 in analyzing functional genomics data is re-annotating your dataset.
Examples of how we do functional modeling of genomics datasets.
Reference
A
ALBU_CHICK Serum albumin precursor (Alpha-livetin) (Allergen Gal d 5)
APA1_CHICK Apolipoprotein A-I precursor (Apo-AI)
FIBA_CHICK Fibrinogen alpha/alpha-E chain precursor [Contains: Fibrinopep
Mol_id: 1; Molecule: Ovotransferrin; Chain: Null; Synonym: Conalbumin; Hete
PB2 protein [Influenza A virus (A/chicken/Taiwan/7-5/99(H6N1))]
C Chain C, Crystal Structure Of Native Chicken Fibrinogen
I50711 complement C3 precursor - chicken
TTHY_CHICK Transthyretin precursor (Prealbumin) (TBPA)
TIM2_CHICK Metalloproteinase inhibitor 2 precursor (TIMP-2) (Tissue inhibito
AAA6469
MYH9_CHICK Myosin heavy chain, nonmuscle (Cellular myosin heavy chain
S19188 myosin-V - chicken
FIBB_CHICK Fibrinogen beta chain precursor [Contains: Fibrinopeptide B]
A Chain A, Crystal Structure Of Wild Type Turkey Delta 1 Crystallin (Eye Le
type I polyketide synthase AVES 2 [Streptomyces avermitilis MA-4680]
Hyperion protein, 419 kD isoform [Gallus gallus]
vitronectin [Gallus gallus]
CA36_CHICK Collagen alpha 3(VI) chain precursor
paired-type homeobox Atx [Gallus gallus]
I51298 transforming protein sno-N - chicken
TP2A_CHICK DNA topoisomerase II, alpha isozyme
ITA6_CHICK Integrin alpha-6 precursor (VLA-6)
glucose regulated thiol oxidoreductase protein precursor [Gallus gallus]
spectrin alpha chain [Gallus gallus]
ATP-binding cassette transporter 1 [Gallus gallus]
cone-type transducin alpha subunit [Gallus gallus]
condensin complex subunit [Gallus gallus]
BA2B_CHICK Bromodomain adjacent to zinc finger domain 2B (Extracellular
ryanodine receptor type 3 [Gallus gallus]
type I polyketide synthase AVES 4 [Streptomyces avermitilis MA-4680]
structural muscle protein titin [Gallus gallus]
breast cancer susceptibility protein [Gallus gallus]
FAS_CHICK Fatty acid synthase [Includes: EC 2.3.1.38; EC 2.3.1.39; EC 2.
What is the Gene Ontology?
“a controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing”
• the de facto standard for functional annotation
• assigns functions to gene products at different levels, depending on how much is known about a gene product
• is used for a diverse range of species
• structured to be queried at different levels, e.g.:
    • find all the chicken gene products in the genome that are involved in signal transduction
    • zoom in on all the receptor tyrosine kinases
• human-readable GO function has a digital tag to allow computational analysis of large datasets
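Querying "at different levels" can be pictured as walking the ontology graph: a broad term collects every gene product annotated to it or to any of its more specific descendants, while a narrow term returns only a subset. The sketch below is a minimal illustration under stated assumptions: the four-term graph is hand-written (it is not a GO release), the term names are simplified, and geneA, geneB and geneC are hypothetical gene products.

```python
from collections import defaultdict

# Toy fragment of a GO-like DAG: child term -> list of parent terms.
# Hand-written for illustration only; not taken from a GO release.
PARENTS = {
    "signal transduction": [],
    "cell surface receptor signaling": ["signal transduction"],
    "receptor tyrosine kinase signaling": ["cell surface receptor signaling"],
    "intracellular signal transduction": ["signal transduction"],
}

# Hypothetical gene products and the terms they are annotated to.
ANNOTATIONS = {
    "geneA": {"receptor tyrosine kinase signaling"},
    "geneB": {"intracellular signal transduction"},
    "geneC": {"cell surface receptor signaling"},
}


def descendants(term, parents=PARENTS):
    """Return the term plus every term reachable below it in the DAG."""
    children = defaultdict(set)
    for child, term_parents in parents.items():
        for parent in term_parents:
            children[parent].add(child)
    found, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def gene_products_for(term):
    """Gene products annotated to the term or to any of its descendants."""
    terms = descendants(term)
    return sorted(g for g, annotated in ANNOTATIONS.items() if annotated & terms)


print(gene_products_for("signal transduction"))
print(gene_products_for("receptor tyrosine kinase signaling"))
```

The broad query returns all three hypothetical gene products; the narrow query "zooms in" on the one annotated to the most specific term.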

COMPUTATIONALLY AMENABLE ENCYCLOPEDIA OF GENE FUNCTIONS AND THEIR RELATIONSHIPS
Ontologies: GO Cellular Component, GO Biological Process, GO Molecular Function, BRENDA
Canonical and other Networks: Pathway Studio 5.0, Ingenuity Pathway Analyses, Cytoscape, Interactome Databases
Functional Understanding
Use GO for:
1. Determining which classes of gene products are over-represented or under-represented.
2. Grouping gene products.
3. Relating a protein’s location to its function.
4. Focusing on particular biological pathways and functions (hypothesis-testing).
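Over-representation (item 1) is commonly assessed with a hypergeometric tail probability: given how many gene products in the whole population carry a GO term, how surprising is the number seen in the experimental list? The slides do not say which test was used, so the following is only a minimal sketch of that common approach; the function name and the small numbers in the usage line are illustrative assumptions.

```python
from math import comb


def enrichment_p_value(k, sample_size, annotated_in_population, population_size):
    """Probability of seeing k or more annotated gene products in a sample
    drawn without replacement (upper tail of the hypergeometric distribution)."""
    total = comb(population_size, sample_size)
    upper = min(sample_size, annotated_in_population)
    tail = sum(
        comb(annotated_in_population, i)
        * comb(population_size - annotated_in_population, sample_size - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Illustrative numbers: 10 gene products in total, 4 carry the term,
# and 2 of the 3 gene products in our list carry it.
print(enrichment_p_value(2, 3, 4, 10))
```

Under-representation can be assessed the same way using the lower tail. With many GO terms tested at once, a multiple-testing correction would normally be applied as well.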
[Figure: two charts of counts by YEAR; vertical axes "No." (0 to 25000) and "No. x 10^6" (0 to 18); year ticks '00 to '09 and 70 to 05.]
Membrane proteins grouped by GO BP
[Figure: one chart each for B-cells and Stroma. Categories: cell cycle/cell proliferation, cell adhesion, cell growth, apoptosis, immune response, ion/proton transport, cell migration, cell-cell signaling, function unknown, development, endocytosis, proteolysis and peptidolysis, signal transduction, protein modification.]
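Grouping gene products by GO Biological Process amounts to counting annotations per category and expressing each count as a share of the total. The sketch below shows that bookkeeping only; the protein names and their category assignments are hypothetical and are not the data behind the slide.

```python
from collections import Counter

# Hypothetical membrane proteins and one GO BP category each (illustration only).
membrane_proteins = {
    "B-cells": {
        "protA": "immune response",
        "protB": "signal transduction",
        "protC": "immune response",
    },
    "Stroma": {
        "protD": "cell adhesion",
        "protE": "signal transduction",
    },
}


def proportions_by_process(annotations):
    """Fraction of proteins falling into each GO BP category."""
    counts = Counter(annotations.values())
    total = sum(counts.values())
    return {process: n / total for process, n in counts.items()}


for cell_type, annotations in membrane_proteins.items():
    print(cell_type, proportions_by_process(annotations))
```

In practice a protein often carries several BP terms, so a real tally would need a rule for multiple annotations (count each, or pick one).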
LOCATION DETERMINES FUNCTION
GO is the “encyclopedia” of gene functions captured, coded and put into a directed acyclic graph (DAG) structure.
In other words, by collecting all of the known data about gene product biological processes, molecular functions and cell locations, GO has become the master “cheat-sheet” for our total knowledge of the genetic basis of phenotype.
Because every GO annotation term has a unique digital code, we can use computers to mine the GO DAGs for granular functional information.
Instead of ploughing through thousands of papers at the library, making notes, and then deciding what the differential gene expression from your microarray experiment means as a net effect, the aim is for GO to capture all the biological information so that it can be retrieved, compiled with your quantitative gene product expression data, and summarized as a net effect.
Many people use “GO Slims”, which capture only high-level terms; these are more often than not poorly informative and not suitable for hypothesis-testing.
“GO Slim”
In contrast, we need to use the deep, granular, information-rich data suitable for hypothesis-testing.
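The loss of information in a slim can be seen by mapping granular terms up to their nearest slim ancestor: terms that distinguish competing hypotheses collapse into a single high-level bucket. The sketch below is illustrative only. The parent map is a hand-written, simplified fragment (not a GO release), and the one-term slim is an assumption made for the example.

```python
# Simplified child -> parent fragment, hand-written for illustration.
PARENT = {
    "T-helper 1 cell differentiation": "T cell differentiation",
    "T-helper 2 cell differentiation": "T cell differentiation",
    "regulatory T cell differentiation": "T cell differentiation",
    "T cell differentiation": "immune system process",
}

# Assumed slim for this sketch: a single high-level term.
SLIM_TERMS = {"immune system process"}


def to_slim(term):
    """Walk up the parent chain until a slim term is reached.
    Raises KeyError if the term has no slim ancestor in this toy fragment."""
    while term not in SLIM_TERMS:
        term = PARENT[term]
    return term


granular = [
    "T-helper 1 cell differentiation",
    "T-helper 2 cell differentiation",
    "regulatory T cell differentiation",
]
for term in granular:
    print(term, "->", to_slim(term))
```

All three granular terms map to the same slim term, so a slim-level summary could not separate a Th-1 from a Th-2 or T-reg response.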
Shyamesh Kumar BVSc
The critical time point in MD lymphomagenesis
[Figure: mean total lesion score (0 to 18) versus days post infection (0 to 100), by genotype: Susceptible (L72) and Resistant (L61). Non-MHC associated resistance and susceptibility. Image panels labeled α-CD30 mAb and α-CD8 mAb.]
Burgess et al, Vet Pathol 38:2, 2001
2008, 57: 1253-1262.
Hypothesis
At the critical time point of 21 dpi, MD-resistant genotypes have a T-helper (Th)-1 microenvironment (consistent with CTL activity), but MD-susceptible genotypes have a T-reg or Th-2 microenvironment (antagonistic to CTL).
Infection of chickens (L61 & L72); kill and post-mortem at 21 dpi; sample tissues.
Whole Tissue: RNA extraction, then Duplex QPCR.
Cryosections: Laser Capture Microdissection (LCM), RNA extraction, then Duplex QPCR.
Whole tissue mRNA expression
[Figure: bar chart of 40 – mean Ct value (0 to 25) per mRNA, L6 (R) versus L7 (S); asterisks flag some bars.]
Microscopic lesion mRNA expression
[Figure: bar chart of 40 – mean Ct value (0 to 25) per mRNA, L6 (R) versus L7 (S); asterisks flag some bars. mRNA labels: IL-4, IL-12, IL-18, TGFβ, GPR-83, SMAD-7, CTLA-4.]
CYTOKINES AND T HELPER CELL DIFFERENTIATION
[Diagram: an APC and a naive CD4+ T cell differentiating into Th-1, Th-2 or T reg, shown for L6 Whole, L7 Whole and L7 Micro. Labels: Smad 7, IL 4, IL 10, IL 12, IL 18, IFN γ, TGFβ, CTL, Macrophage, NK Cell, "Inflammatory?", "Th-1, Th-2, T-reg ?".]
Gene Ontology based hypothesis testing
QPCR data
Relative mRNA expression data
Gene Ontology annotation
Biological Process Modeling &
Hypothesis testing
Step I. GO-based Phenotype Scoring.

Gene product | Th1 | Th2 | Treg | Inflammation
IL-2         | 1   | ND  | 1    | -1
IL-4         | -1  | 1   | 1    | ND
IL-6         |     | 1   | -1   | 1
IL-8         | ND  | ND  | 1    | 1
IL-10        | -1  | 1   | 1    | 0
IL-12        | 1   | -1  | ND   | ND
IL-13        | -1  | 1   | ND   | ND
IL-18        | 1   | 1   | 1    | 1
IFN-γ        | 1   | -1  | 1    | 1
TGF-β        | -1  | 0   | 1    | -1
CTLA-4       | -1  | -1  | 1    | -1
GPR-83       | -1  | -1  | 1    | -1
SMAD-7       | 1   | 1   | -1   | 1

ND = No data

Step II. Multiply by quantitative data for each gene product.

Step III. Inclusion of quantitative data to the phenotype scoring table and calculation of net effect.

Gene product | Th1   | Th2   | Treg  | Inflammation
IL-2         | 1.58  |       | 1.58  | -1.58
IL-4         | 0.00  | 0.00  | 0.00  | 0.00
IL-6         | 0.00  | -1.20 | 1.20  | -1.20
IL-8         | 0.00  | 0.00  | 1.18  | 1.18
IL-10        | 0.00  | 0.00  | 0.00  | 0.00
IL-12        | 0.00  | 0.00  | 0.00  | 0.00
IL-13        | 1.51  | -1.51 | 0.00  | 0.00
IL-18        | 0.91  | 0.91  | 0.91  | 0.91
IFN-γ        | 0.00  | 0.00  | 0.00  | 0.00
TGF-β        | -1.71 | 0.00  | 1.71  | -1.71
CTLA-4       | -1.89 | -1.89 | 1.89  | -1.89
GPR-83       | -1.69 | -1.69 | 1.69  | -1.69
SMAD-7       | 0.00  | 0.00  | 0.00  | 0.00
Net Effect   | -1.29 | -5.38 | 10.15 | -5.98
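The three steps (score each gene product against each phenotype, multiply by its quantitative value, sum each column) can be sketched in a few lines. This is a minimal illustration, not the authors' software. The scores are the Step I values; the per-gene quantitative values are back-calculated from the Step III rows (an assumption, since the slide does not list them separately); and the one Step I cell that does not survive in the transcript (IL-6, Th1) is treated as no data, which matches the 0.00 shown for it in Step III.

```python
ND = None  # "No data": contributes nothing to the net effect

PHENOTYPES = ("Th1", "Th2", "Treg", "Inflammation")

# Step I: GO-based phenotype scores per gene product.
# IL-6/Th1 is missing from the transcript and is assumed to be ND here.
SCORES = {
    "IL-2":   (1, ND, 1, -1),
    "IL-4":   (-1, 1, 1, ND),
    "IL-6":   (ND, 1, -1, 1),
    "IL-8":   (ND, ND, 1, 1),
    "IL-10":  (-1, 1, 1, 0),
    "IL-12":  (1, -1, ND, ND),
    "IL-13":  (-1, 1, ND, ND),
    "IL-18":  (1, 1, 1, 1),
    "IFN-g":  (1, -1, 1, 1),
    "TGF-b":  (-1, 0, 1, -1),
    "CTLA-4": (-1, -1, 1, -1),
    "GPR-83": (-1, -1, 1, -1),
    "SMAD-7": (1, 1, -1, 1),
}

# Step II input: quantitative value per gene product, back-calculated
# from the Step III table (assumption; not listed on the slide).
EXPRESSION = {
    "IL-2": 1.58, "IL-4": 0.0, "IL-6": -1.20, "IL-8": 1.18, "IL-10": 0.0,
    "IL-12": 0.0, "IL-13": -1.51, "IL-18": 0.91, "IFN-g": 0.0,
    "TGF-b": 1.71, "CTLA-4": 1.89, "GPR-83": 1.69, "SMAD-7": 0.0,
}


def net_effect(scores, expression):
    """Steps II and III: weight each score by expression, then sum per phenotype."""
    totals = dict.fromkeys(PHENOTYPES, 0.0)
    for gene, row in scores.items():
        for phenotype, score in zip(PHENOTYPES, row):
            if score is not ND:
                totals[phenotype] += score * expression[gene]
    return totals


for phenotype, value in net_effect(SCORES, EXPRESSION).items():
    print(f"{phenotype}: {value:.2f}")
```

The column sums agree with the Net Effect row of the table to within rounding of the two-decimal inputs.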
Whole Tissue
[Figure: bar chart of Net Effect (-40 to 120) for Th-1, Th-2, T-reg and Inflammation, L7 (S) versus L6 (R).]
Microscopic lesions
[Figure: bar chart of Net Effect (-20 to 60) for Th-1, Th-2, T-reg and Inflammation, L6 (R) versus L7 (S); image scale bar 5mm.]
Phenotype
[Diagram: phenotype summary for L6 Resistant, L6 (R) Whole lymphoma and L7 Susceptible. Labels: Pro T-reg, Pro Th-1, Anti Th-1, Pro Th-2, Anti Th-2, Pro CTL, Anti CTL.]
Translation to clinical research
Bindu Nanduri
Pig
Total mRNA and protein expression were measured from quadruplicate samples of control, electroscalpel and harmonic scalpel-treated tissue.
Differentially-expressed mRNAs and proteins were identified using Monte-Carlo resampling (1).
Using network and pathway analysis as well as Gene Ontology-based hypothesis testing, differences in specific physiological processes between electroscalpel and harmonic scalpel-treated tissue were quantified and reported as net effects.
(1) Nanduri, B., P. Shah, M. Ramkumar, E. A. Allen, E. Swiatlo, S. C. Burgess*, and M. L. Lawrence*. 2008. Quantitative analysis of Streptococcus pneumoniae TIGR4 response to in vitro iron restriction by 2-D LC ESI MS/MS. Proteomics 8, 2104-14.
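Monte-Carlo resampling for differential expression can be illustrated with a permutation test: the group labels are shuffled many times, and the observed difference in means is compared with the differences that arise by chance. The exact procedure of the cited paper is not described in these slides, so the sketch below is a generic version under that assumption; the function name, the number of resamples, and the example measurements are all illustrative.

```python
import random


def monte_carlo_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.
    Generic Monte-Carlo resampling sketch; not the cited paper's exact method."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = abs(mean_diff(pooled))
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random reassignment of samples to groups
        # Small tolerance guards against floating-point summation order.
        if abs(mean_diff(pooled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical quadruplicate measurements for one gene product.
control = [1.0, 1.1, 0.9, 1.0]
treated = [3.0, 3.2, 2.9, 3.1]
print(monte_carlo_p_value(control, treated))
```

With four samples per group there are only 70 distinct ways to split the pooled values, so the smallest attainable two-sided p-value is about 2/70; this limits how strongly any single gene product can be called in a quadruplicate design.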
Proportional distribution of mRNA functions differentially-expressed by Electro and Harmonic Scalpel
[Figure: one chart each for Electroscalpel and Harmonic Scalpel, divided by HYPOTHESIS TERMS: Immunity (primarily innate), Inflammation, Wound healing, Lipid metabolism, Response to thermal injury, Angiogenesis.]
Total differentially-expressed mRNAs: 4302
Total differentially-expressed mRNAs: 1960
Net functional distribution of differentially-expressed mRNAs
[Figure: bar chart of Relative bias (axis ticks from 35 through 0 to 5) between Electroscalpel and Harmonic scalpel for: Sensory response to pain; Angiogenesis; Response to thermal injury; Lipid metabolism; Wound healing; Classical inflammation (heat, redness, swelling, pain, loss of function); Immunity (primarily innate).]
Proportional distribution of protein functions differentially-expressed by Electro and Harmonic Scalpel
[Figure: one chart each for Electroscalpel and Harmonic scalpel, divided by HYPOTHESIS TERMS: Immunity (primarily innate), Inflammation, Wound healing, Lipid metabolism, Response to thermal injury, Angiogenesis, Hemorrhage.]
Total differentially-expressed proteins: 509
Total differentially-expressed proteins: 433
Net functional distribution of differentially-expressed proteins
[Figure: bar chart of Relative bias (axis ticks from 8 through 0 to 6) between Harmonic Scalpel and Electroscalpel for: Hemorrhage; Sensory response to pain; Angiogenesis; Response to thermal injury; Lipid metabolism; Wound healing; Classical inflammation (heat, redness, swelling, pain, loss of function); Immunity (primarily innate).]
www.agbase.msstate.edu