Gene Ontology-based hypothesis testing
Examples of functional
modeling.
Iowa State Workshop
11 June 2009
All tools and materials from this workshop
are available online at the AgBase database
Educational Resources link.
For continuing support and assistance
please contact:
[email protected]
This workshop is supported by USDA CSREES grant number MISV-329140.
"Today’s challenge is to realise
greater knowledge and
understanding from the data-rich
opportunities provided by modern
high-throughput genomic
technology."
Professor Andrew Cossins,
Consortium for Post-Genome Science, Chairman.
Systems Biology Workflow
Nanduri & McCarthy CAB reviews, 2008
Key points
Modeling is subordinate to the biological questions/hypotheses.
Together the Gene Ontology and canonical genetic networks/pathways
provide the central and complementary foundation for modeling functional
genomics data.
Annotation follows information and information changes daily: STEP 1 in
analyzing functional genomics data is re-annotating your dataset.
Examples of how we do functional modeling of genomics datasets.
Who uses GO? http://www.ebi.ac.uk/GOA/users.html
Reference
A
ALBU_CHICK Serum albumin precursor (Alpha-livetin) (Allergen Gal d 5)
APA1_CHICK Apolipoprotein A-I precursor (Apo-AI)
FIBA_CHICK Fibrinogen alpha/alpha-E chain precursor [Contains: Fibrinopep
Mol_id: 1; Molecule: Ovotransferrin; Chain: Null; Synonym: Conalbumin; Hete
PB2 protein [Influenza A virus (A/chicken/Taiwan/7-5/99(H6N1))]
C Chain C, Crystal Structure Of Native Chicken Fibrinogen
I50711 complement C3 precursor - chicken
TTHY_CHICK Transthyretin precursor (Prealbumin) (TBPA)
TIM2_CHICK Metalloproteinase inhibitor 2 precursor (TIMP-2) (Tissue inhibito
AAA6469
MYH9_CHICK Myosin heavy chain, nonmuscle (Cellular myosin heavy chain
S19188 myosin-V - chicken
FIBB_CHICK Fibrinogen beta chain precursor [Contains: Fibrinopeptide B]
A Chain A, Crystal Structure Of Wild Type Turkey Delta 1 Crystallin (Eye Le
type I polyketide synthase AVES 2 [Streptomyces avermitilis MA-4680]
Hyperion protein, 419 kD isoform [Gallus gallus]
vitronectin [Gallus gallus]
CA36_CHICK Collagen alpha 3(VI) chain precursor
paired-type homeobox Atx [Gallus gallus]
I51298 transforming protein sno-N - chicken
TP2A_CHICK DNA topoisomerase II, alpha isozyme
ITA6_CHICK Integrin alpha-6 precursor (VLA-6)
glucose regulated thiol oxidoreductase protein precursor [Gallus gallus]
spectrin alpha chain [Gallus gallus]
ATP-binding cassette transporter 1 [Gallus gallus]
cone-type transducin alpha subunit [Gallus gallus]
condensin complex subunit [Gallus gallus]
BA2B_CHICK Bromodomain adjacent to zinc finger domain 2B (Extracellular
ryanodine receptor type 3 [Gallus gallus]
type I polyketide synthase AVES 4 [Streptomyces avermitilis MA-4680]
structural muscle protein titin [Gallus gallus]
breast cancer susceptibility protein [Gallus gallus]
FAS_CHICK Fatty acid synthase [Includes: EC 2.3.1.38; EC 2.3.1.39; EC 2.
What is the Gene Ontology?
“a controlled vocabulary that can be applied to all organisms
even as knowledge of gene and protein roles in cells is
accumulating and changing”
the de facto standard for functional annotation
assign functions to gene products at different levels, depending
on how much is known about a gene product
is used for a diverse range of species
structured to be queried at different levels, eg:
find all the chicken gene products in the genome that are
involved in signal transduction
zoom in on all the receptor tyrosine kinases
human readable GO function has a digital tag to allow
computational analysis of large datasets
COMPUTATIONALLY AMENABLE ENCYCLOPEDIA OF
GENE FUNCTIONS AND THEIR RELATIONSHIPS
Use GO for…
1. Determining which classes of gene products are over-represented or under-represented.
2. Grouping gene products.
3. Relating a protein’s location to its function.
4. Focusing on particular biological pathways and functions (hypothesis-testing).
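The first use listed above, finding over- or under-represented classes of gene products, is commonly framed as a hypergeometric test. The slides do not say which test was used, so the following is only a minimal sketch under that assumption; the function names and the example counts are hypothetical.

```python
from math import comb

def upper_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated products in a
    list of n, drawn without replacement from N products of which K
    carry the GO term (over-representation)."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total

def lower_tail(k, n, K, N):
    """P(X <= k): the matching test for under-representation."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(0, min(k, n, K) + 1))
    return hits / total

# Hypothetical counts: 20 products on the array, 5 annotated to the term,
# 4 differentially expressed, 3 of those annotated to the term.
p_over = upper_tail(3, 4, 5, 20)
p_under = lower_tail(3, 4, 5, 20)
print(p_over, p_under)
```

A small p_over suggests the term is enriched in the list. With many GO terms tested at once, a multiple-testing correction would normally be applied on top of this.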
Membrane proteins grouped by GO BP:
[Figure: panels B-cells and Stroma. Categories: cell cycle/cell proliferation; cell adhesion; cell growth; apoptosis; immune response; ion/proton transport; cell migration; cell-cell signaling; function unknown; development; endocytosis; proteolysis and peptidolysis; signal transduction; protein modification.]
LOCATION DETERMINES FUNCTION
GO is the “encyclopedia” of gene functions
captured, coded and put into a directed acyclic
graph (DAG) structure.
In other words, by collecting all of
the known data about gene
product biological processes,
molecular functions and cell
locations, GO has become the
master “cheat-sheet” for our
total knowledge of the genetic
basis of phenotype.
Because every GO annotation
term has a unique digital code,
we can use computers to mine the
GO DAGs for granular functional
information.
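Mining the DAG at different levels can be sketched with a toy graph. Everything in this example is illustrative: the term names stand in for real GO identifiers (which look like GO:0007165), the parent links are a simplified assumption, and geneA to geneC are hypothetical gene products.

```python
from collections import deque

# Toy DAG: each term maps to its parent terms (child -> parents).
PARENTS = {
    "signal transduction": ["biological process"],
    "cell surface receptor signaling": ["signal transduction"],
    "receptor tyrosine kinase signaling": ["cell surface receptor signaling"],
    "apoptosis": ["biological process"],
}

# Hypothetical annotations: gene product -> most specific terms known.
ANNOTATIONS = {
    "geneA": {"receptor tyrosine kinase signaling"},
    "geneB": {"cell surface receptor signaling"},
    "geneC": {"apoptosis"},
}

def ancestors(term):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque(PARENTS.get(term, []))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(PARENTS.get(parent, []))
    return seen

def products_under(term):
    """Gene products annotated to the term or to any of its descendants."""
    hits = []
    for product, terms in ANNOTATIONS.items():
        if any(t == term or term in ancestors(t) for t in terms):
            hits.append(product)
    return sorted(hits)

print(products_under("signal transduction"))                 # broad query
print(products_under("receptor tyrosine kinase signaling"))  # zoomed in
```

The broad query picks up products annotated anywhere below the term, while the narrow query returns only the deeply annotated product, which is the behaviour that high-level summaries lose.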
Instead of having to plough through thousands of papers at the library, make notes,
and then decide what the differential gene expression from your microarray experiment
means as a net effect, the aim is for GO to have all the biological information
captured, so that it can be retrieved, compiled with your quantitative gene product
expression data, and used to provide a net effect.
Many people use “GO Slims”, which capture only high-level terms that are, more
often than not, poorly informative and not suitable for hypothesis-testing.
“GO Slim”
In contrast, we need to use the deep,
granular, information-rich data suitable for
hypothesis-testing.
Shyamesh Kumar BVSc
The critical time point in MD lymphomagenesis
Non-MHC associated resistance and susceptibility
[Figure: mean total lesion score (0 to 18) against days post infection (0 to 100), by genotype: Susceptible (L72) and Resistant (L61). Burgess et al, Vet Pathol 38:2, 2001]
[Image panels: a-CD30 mab; a-CD8 mab]
2008, 57: 1253-1262.
Hypothesis
At the critical time point of 21 dpi, MD-resistant
genotypes have a T-helper (Th)-1 microenvironment
(consistent with CTL activity), but MD-susceptible
genotypes have a T-reg or Th-2 microenvironment
(antagonistic to CTL).
Infection of chickens (L61 & L72), kill and post-mortem at 21 dpi and sample tissues.
Whole Tissue: RNA extraction, then Duplex QPCR.
Cryosections: Laser Capture Microdissection (LCM), RNA extraction, then Duplex QPCR.
Whole tissue mRNA expression
[Figure: L6 (R) vs L7 (S); y-axis: 40 – mean Ct value (0 to 25); x-axis: mRNA; some bars marked *.]
Microscopic lesion mRNA expression
[Figure: L6 (R) vs L7 (S); y-axis: 40 – mean Ct value (0 to 25); x-axis: mRNA; some bars marked *.]
mRNAs shown: IL-4, IL-12, IL-18, TGFβ, GPR-83, SMAD-7, CTLA-4
CYTOKINES AND T HELPER CELL DIFFERENTIATION
[Diagram: a naive CD4+ T cell and an APC, with differentiation to Th-1, T reg or Th-2. Panels: L6 Whole; L7 Whole; L7 Micro. Labels: Smad 7, IL 4, IL 12, IL 18, IL10, IFN γ, TGFβ, CTL, Macrophage, NK Cell, "Th-1, Th-2, T-reg ?", "Inflammatory?".]
Gene Ontology based hypothesis testing
QPCR data
Relative mRNA expression data
Gene Ontology annotation
Biological Process Modeling &
Hypothesis testing
Step I. GO-based Phenotype Scoring.

Gene product   Th1     Th2     Treg    Inflammation
IL-2           1       ND      1       -1
IL-4           -1      1       1       ND
IL-6                   1       -1      1
IL-8           ND      ND      1       1
IL-10          -1      1       1       0
IL-12          1       -1      ND      ND
IL-13          -1      1       ND      ND
IL-18          1       1       1       1
IFN-g          1       -1      1       1
TGF-b          -1      0       1       -1
CTLA-4         -1      -1      1       -1
GPR-83         -1      -1      1       -1
SMAD-7         1       1       -1      1

Step II. Multiply by quantitative data for each gene product.

Step III. Inclusion of quantitative data to the phenotype scoring table
and calculation of net effect.

Gene product   Th1     Th2     Treg    Inflammation
IL-2           1.58    0.00    1.58    -1.58
IL-4           0.00    0.00    0.00    0.00
IL-6                   -1.20   1.20    -1.20
IL-8           0.00    0.00    1.18    1.18
IL-10          0.00    0.00    0.00    0.00
IL-12          0.00    0.00    0.00    0.00
IL-13          1.51    -1.51   0.00    0.00
IL-18          0.91    0.91    0.91    0.91
IFN-g          0.00    0.00    0.00    0.00
TGF-b          -1.71   0.00    1.71    -1.71
CTLA-4         -1.89   -1.89   1.89    -1.89
GPR-83         -1.69   -1.69   1.69    -1.69
SMAD-7         0.00    0.00    0.00    0.00
Net Effect     -1.29   -5.38   10.15   -5.98

ND = No data
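Steps I to III amount to multiplying each gene product's phenotype score by its quantitative expression value and summing down each phenotype column. A minimal sketch follows, using five of the gene products from the slide. The expression values are the ones implied by the Step III numbers, and treating ND as contributing nothing to the sum is an assumption; the names SCORES, EXPRESSION and net_effect are hypothetical.

```python
PHENOTYPES = ("Th1", "Th2", "Treg", "Inflammation")

# Step I: GO-based scores (+1 promotes, -1 opposes, 0 neutral, None = ND).
SCORES = {
    "IL-2":   (1, None, 1, -1),
    "IL-13":  (-1, 1, None, None),
    "IL-18":  (1, 1, 1, 1),
    "TGF-b":  (-1, 0, 1, -1),
    "CTLA-4": (-1, -1, 1, -1),
}

# Quantitative data per gene product (values implied by the Step III table).
EXPRESSION = {"IL-2": 1.58, "IL-13": -1.51, "IL-18": 0.91,
              "TGF-b": 1.71, "CTLA-4": 1.89}

def net_effect(scores, expression):
    """Steps II and III: score x expression, summed per phenotype."""
    totals = dict.fromkeys(PHENOTYPES, 0.0)
    for product, row in scores.items():
        for phenotype, score in zip(PHENOTYPES, row):
            if score is not None:  # assumption: ND cells add nothing
                totals[phenotype] += score * expression[product]
    return totals

for phenotype, value in net_effect(SCORES, EXPRESSION).items():
    print(f"{phenotype:<13}{value:6.2f}")
```

Because only a subset of rows is used, the totals differ from the Net Effect row on the slide; adding the remaining gene products to both dictionaries extends the same calculation.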
Whole Tissue
[Figure: Net Effect (-40 to 120) for Th-1, Th-2, T-reg and Inflammation, L7 (S) vs L6 (R).]
Microscopic lesions
[Figure: Net Effect (-20 to 60) for Th-1, Th-2, T-reg and Inflammation, L6 (R) vs L7 (S).]
Phenotype
[Diagram: panels L6 Resistant, L6 (R) Whole lymphoma and L7 Susceptible. Labels: Pro T-reg, Pro Th-1, Anti Th-1, Pro Th-2, Anti Th-2, Pro CTL, Anti CTL.]
Translation to clinical research: Pig
Bindu Nanduri
Global mRNA and protein expression were measured
from quadruplicate samples of control, X- and Y-treated tissue.
Differentially-expressed mRNAs and proteins were identified from
Affymetrix microarray data and DDF shotgun proteomics using
Monte-Carlo resampling*.
* Nanduri, B., P. Shah, M. Ramkumar, E. A. Allen, E. Swaitlo, S. C. Burgess*, and M. L. Lawrence*. 2008.
Quantitative analysis of Streptococcus pneumoniae TIGR4 response to in vitro iron restriction by 2-D LC ESI
MS/MS. Proteomics 8, 2104-14.
Using network and pathway analysis as well as Gene Ontology-based
hypothesis testing, differences in specific physiological
processes between X- and Y-treated tissues were quantified and
reported as net effects.
Proportional distribution of mRNA functions
differentially-expressed by X- and Y-treated tissues
Treatment Y
Treatment X
immunity (primarily innate)
inflammation
Wound healing
Lipid metabolism
response to thermal injury
angiogenesis
Total differentially-expressed
mRNAs: 4302
Total differentially-expressed
mRNAs: 1960
Net functional distribution of differentially-expressed mRNAs:
X- vs. Y-Treatment
[Figure: relative bias towards X or Y (axis: Relative bias, 0 to 35 towards one treatment and 0 to 5 towards the other) for each function: sensory response to pain; angiogenesis; response to thermal injury; lipid metabolism; wound healing; classical inflammation (heat, redness, swelling, pain, loss of function); immunity (primarily innate).]
Proportional distribution of protein functions
differentially-expressed by X- and Y-treated tissues
Treatment Y
Treatment X
immunity (primarily innate)
inflammation
Wound Healing
Lipid metabolism
response to Thermal Injury
Angiogenesis
hemorrhage
Total differentially-expressed
proteins: 509
Total differentially-expressed
proteins: 433
Net functional distribution of differentially-expressed Proteins:
X- vs. Y-Treatment
[Figure: relative bias towards Treatment X or Treatment Y (axis: Relative bias, 0 to 8 towards one treatment and 0 to 6 towards the other) for each function: hemorrhage; sensory response to pain; angiogenesis; response to thermal injury; lipid metabolism; wound healing; classical inflammation (heat, redness, swelling, pain, loss of function); immunity (primarily innate).]