Iterative literature searching
Overview
Introduction
Biological network data
Text mining
Gene Ontology
Expression data basics
Expression, text mining, and GO
Modules and complexes
Domains and conclusion
Scenario
Ran a set of expression experiments to
study a given disease state.
Need to put the results into a
functional context.
Atherosclerosis
Most common fatal disease in the U.S.,
and not well-understood.
Microarray analysis
Analyzed 51 artery segments from the hearts of
22 heart transplant patients.
Classified segments by their disease pathology.
Will assess the differences between Type I (moderate) and
Type V (severe) atherosclerosis.
Performed microarray analysis of each segment.
Agilent expression array with probesets for 13,000 human
genes.
SAM microarray statistic
For each gene i, contrasted expression in Type I and
Type V lesions with SAM (Proc Natl Acad Sci USA 98:
5116-21, 2001).
High positive SAM score: gene expressed more highly in
Type V lesions.
Large negative SAM score: gene expressed more highly in
Type I lesions.
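The sign convention above can be illustrated with a minimal sketch of a SAM-style relative difference for a single gene, d = (mean of Type V - mean of Type I) / (s + s0). The function name and the toy expression values are hypothetical. SAM itself selects the constant s0 from the data and assesses significance by permutation; neither step is reproduced here, so s0 is simply passed in by the caller.

```python
import math

def sam_d_score(type_v, type_i, s0):
    """SAM-style relative difference for one gene.

    type_v, type_i: expression values for the gene in Type V and Type I segments.
    s0: small positive constant added to the denominator (assumed given here;
        SAM chooses it from the data). s + s0 must not be zero.
    """
    n_v, n_i = len(type_v), len(type_i)
    mean_v = sum(type_v) / n_v
    mean_i = sum(type_i) / n_i
    # Pooled sum of squared deviations across both groups.
    ss = (sum((x - mean_v) ** 2 for x in type_v)
          + sum((x - mean_i) ** 2 for x in type_i))
    # Gene-specific scatter, as in a two-sample t-statistic.
    s = math.sqrt((1 / n_v + 1 / n_i) * ss / (n_v + n_i - 2))
    return (mean_v - mean_i) / (s + s0)

# Toy values (not from the study): higher in Type V gives a positive score.
d_up = sam_d_score([3.0, 4.0, 5.0], [1.0, 2.0, 3.0], s0=0.0)
d_down = sam_d_score([1.0, 2.0, 3.0], [3.0, 4.0, 5.0], s0=0.0)
```

Swapping the two groups flips the sign, which is why a large negative score indicates higher expression in Type I lesions.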
Analysis pipeline
1. Biomarker identification
For formal studies,
use machine learning
methods
For exploratory work,
select several genes
with extreme SAM
scores.
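For the exploratory route, picking genes with extreme SAM scores might look like the sketch below. The function name, the gene labels, and the cutoff of two genes per end are all hypothetical; the slides do not say how many genes to select.

```python
def select_extreme_genes(sam_scores, n_each):
    """Return (up_in_type_v, up_in_type_i): the n_each genes at each end
    of the SAM score ranking."""
    ranked = sorted(sam_scores, key=sam_scores.get)  # ascending by score
    up_in_type_i = ranked[:n_each]                   # most negative scores
    # Most positive scores, highest first; the max() guard keeps n_each=0 safe.
    up_in_type_v = ranked[max(0, len(ranked) - n_each):][::-1]
    return up_in_type_v, up_in_type_i

# Placeholder gene labels and scores, for illustration only.
scores = {"GENE_A": 4.2, "GENE_B": -3.1, "GENE_C": 0.3,
          "GENE_D": -5.0, "GENE_E": 2.7}
up_v, up_i = select_extreme_genes(scores, n_each=2)
```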
Analysis pipeline, continued
2. Biomarker association
Basic question: for this context, what is
common among the biomarker genes?
Approaches
Exhaustive reading
GO analysis
Literature searching
Pros and cons of this approach
Pro: associations are specific to this disease context
Pro: identifies relevant literature
Con: might not find associations on all of your biomarkers
Con: might find associations on other genes
Iterative literature searching
Perform an initial search
Color the network by SAM d-score
Identify any new “responsive” genes
Add to biomarker list
Repeat
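The loop above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool used in the session: search_literature is a hypothetical callable standing in for whatever text-mining search builds the network, and "responsive" is assumed to mean an absolute SAM score at or above a caller-chosen threshold.

```python
def iterative_search(seed_genes, sam_scores, search_literature,
                     threshold, max_rounds=5):
    """Grow a biomarker list by repeated literature searching.

    search_literature(genes) is assumed to return the genes that appear
    in the literature network built from the current biomarker list.
    """
    biomarkers = set(seed_genes)
    for _ in range(max_rounds):
        network_genes = set(search_literature(biomarkers))
        # New "responsive" genes: in the network, not yet biomarkers,
        # and with an extreme SAM score.
        new_genes = {g for g in network_genes - biomarkers
                     if abs(sam_scores.get(g, 0.0)) >= threshold}
        if not new_genes:
            break  # nothing new turned up, so stop iterating
        biomarkers |= new_genes
    return biomarkers

# Toy co-mention table and scores (placeholders, not real data).
co_mentions = {"A": ["B", "C"], "B": ["D"], "C": [], "D": ["A"]}
toy_scores = {"A": 4.0, "B": 3.0, "C": 0.2, "D": -3.5}

def toy_search(genes):
    return {partner for g in genes for partner in co_mentions.get(g, [])}

result = iterative_search({"A"}, toy_scores, toy_search, threshold=2.0)
```

In the toy run, gene D is reached only in the second round, through B. That is the kind of gene a single search from the seed list would miss.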
Discussion topic
Why not use all genes with extreme SAM
scores as biomarkers? Why iterate?
Once you have a good network:
1. Use BiNGO to identify the enriched GO terms
2. Look at the genes corresponding to selected enriched terms
3. Check the literature search sentences for those genes
4. Choose one or two sentences, look at the abstracts.
5. Iterate if desired (or go to lunch)
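To give a feel for what "enriched" means in step 1, here is a minimal sketch of a hypergeometric over-representation test, one common way to score GO term enrichment. Whether it matches the BiNGO settings used in this session is an assumption, and multiple-testing correction is left out. The function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_annotated carry the GO term."""
    upper = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(comb(n_annotated, k)
               * comb(n_universe - n_annotated, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Toy numbers: 10 genes in total, 4 annotated with the term,
# 3 genes in the network, 2 of which carry the term.
p = hypergeom_enrichment_p(10, 4, 3, 2)
```

A small p-value suggests the term appears among the network genes more often than chance would predict. For the array described earlier, the universe would be on the order of the 13,000 genes with probesets.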
Final points
No right or wrong answers, only
plausible or novel hypotheses.
You can take any approach you wish.
“If it was easy, everyone would be
doing it”.