Overview of Microbial Pathway and Genome Databases
Overview
SRI International
Bioinformatics
Survey of other databases and web sites that integrate hundreds of microbial genomes with pathway information
Most of these resources are described in publications that can be found via PubMed
Differences among the resources include:
Genomes included
What other information is integrated with the genome data
Value-added computational processing applied to each genome
Query, visualization, and analysis tools available at each site
Overall Comparison to BioCyc
Many of the other databases contain more genomes than BioCyc
This will change in 2011 as BioCyc transitions to RefSeq as its genome source
BioCyc Tier 1 and Tier 2 databases are more highly curated than other databases
BioCyc has more extensive query, visualization, and analysis tools than other sites
The BioCyc desktop version can be installed locally and allows editing of PGDBs
Some other sites re-annotate the genomes, which may or may not improve data quality
Microbial Genome Resources
CMR – Comprehensive Microbial Resource
Entrez Genomes
IMG – Integrated Microbial Genomes
KEGG – Kyoto Encyclopedia of Genes and Genomes
PATRIC
SEED/NMPDR
UMBBD – Univ of Minnesota Biocatalysis/Biodegradation Database
CMR – Comprehensive Microbial Resource
J. Craig Venter Institute
http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi
~700 genomes
Genome data only, no pathways
Genome browser, gene pages
Many comparative operations
Will be discontinued later in 2010
Entrez Genomes
National Center for Biotechnology Information
http://www.ncbi.nlm.nih.gov/sites/genome
Web portal to GenBank genomes
Genome browser, gene pages
IMG – Integrated Microbial Genomes
Joint Genome Institute
http://img.jgi.doe.gov/cgi-bin/pub/main.cgi
1,911 microbial genomes (approximately half are draft quality)
Genome browser, gene pages
Many comparative operations
Genome context analyses available
PATRIC
Virginia Bioinformatics Institute
http://patric.vbi.vt.edu/
Genome browser, gene pages
KEGG pathways
SEED / NMPDR
Argonne National Laboratory
http://www.nmpdr.org/FIG/wiki/view.cgi
782 microbial genomes
Funding ended in 2009
Unique features:
Subsystems
Essential genes
Comparative genomics tools
Community annotation
UMBBD
University of Minnesota
http://umbbd.msi.umn.edu/
Database of ~150 microbial biodegradation pathways
Does not include full microbial genomes
KEGG – Kyoto Encyclopedia of Genes and Genomes
Kyoto University
http://www.genome.ad.jp/kegg/
1,382 organisms
KEGG reannotates each genome
Static reference pathway maps are colored with the genes present in each organism
Comparison with KEGG
KEGG vs MetaCyc: Reference pathway collections
KEGG maps are not pathways (Nuc Acids Res 34:3687, 2006)
KEGG maps contain multiple biological pathways
Two genes chosen at random from a BioCyc pathway are more likely to be related according to genome context methods than two genes chosen from a KEGG map
KEGG maps are composites of pathways from many organisms; they do not identify which specific pathways were elucidated in which organisms
KEGG has no literature citations, no comments, less enzyme detail
KEGG assigns half as many reactions to pathways as MetaCyc
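The genome-context comparison above can be illustrated with a minimal Python sketch. It is not the method of the cited paper: it enumerates every same-pathway gene pair instead of sampling pairs at random, and the function name, gene IDs, pathway groupings, and context links are all made-up assumptions chosen only to show why a composite map tends to score lower than its component pathways.

```python
from itertools import combinations


def related_pair_fraction(pathways, related_pairs):
    """Fraction of same-pathway gene pairs linked by a genome-context method.

    pathways: dict mapping a pathway (or map) ID to a set of gene IDs.
    related_pairs: iterable of 2-tuples of gene IDs judged related.
    """
    linked = {frozenset(pair) for pair in related_pairs}
    total = 0
    hits = 0
    for genes in pathways.values():
        # Enumerate every unordered pair of genes within one pathway.
        for a, b in combinations(sorted(genes), 2):
            total += 1
            if frozenset((a, b)) in linked:
                hits += 1
    return hits / total if total else 0.0


# Hypothetical toy data, not taken from BioCyc or KEGG.
context_links = [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]

# Two small, biologically distinct pathways.
small_pathways = {"P1": {"g1", "g2", "g3"}, "P2": {"g4", "g5"}}

# One composite map that merges both pathways.
composite_map = {"M1": {"g1", "g2", "g3", "g4", "g5"}}

small_score = related_pair_fraction(small_pathways, context_links)
composite_score = related_pair_fraction(composite_map, context_links)

print(f"small pathways: {small_score:.2f}")
print(f"composite map:  {composite_score:.2f}")
```

In this toy case the merged map adds many gene pairs that span unrelated pathways, so a smaller share of its pairs is linked by genome context.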
KEGG vs organism-specific PGDBs
KEGG does not curate or customize pathway networks for each organism
Highly curated PGDBs now exist for important organisms such as E. coli, yeast, mouse, Arabidopsis
Comparison of Pathway Tools to KEGG
Inference tools
KEGG does not predict presence or absence of pathways
KEGG lacks pathway hole filler, operon predictor
Curation tools
KEGG does not distribute curation tools
No ability to customize pathways to the organism
Pathway Tools schema much more comprehensive
Visualization and analysis
KEGG does not perform automatic pathway layout
KEGG metabolic-map diagram extremely limited
No comparative pathway analysis