Overview of Microbial Pathway and Genome Databases
SRI International Bioinformatics

Overview
 Survey of other databases / web sites that integrate hundreds of microbial genomes and pathway information
 Most of these resources are described in publications that can be found via PubMed
 Differences among the resources include:
   Genomes included
   What other information is integrated with the genome data
   Value-added computational processing applied to each genome
   Query, visualization, and analysis tools available at each site
Overall Comparison to BioCyc
 Many of the other databases contain more genomes than BioCyc
   This will change in 2011 as BioCyc transitions to RefSeq as its genome source
 BioCyc Tier 1 and Tier 2 databases are more highly curated than other databases
 BioCyc has more extensive query, visualization, and analysis tools than other sites
 The BioCyc desktop version can be installed locally and allows editing of PGDBs
 Some other sites re-annotate the genomes, which may or may not improve data quality
Microbial Genome Resources
 CMR – Comprehensive Microbial Resource
 Entrez
 IMG – Integrated Microbial Genomes
 KEGG – Kyoto Encyclopedia of Genes and Genomes
 PATRIC
 SEED/NMPDR
 UMBBD – Univ of Minnesota Biocatalysis Biodegradation Database
CMR – Comprehensive Microbial Resource
J. Craig Venter Institute
 http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi
 ~700 genomes
 Genome data only, no pathways
 Genome browser, gene pages
 Many comparative operations
 Will be discontinued later in 2010
Entrez Genomes
National Center for Biotechnology Information
 http://www.ncbi.nlm.nih.gov/sites/genome
 Web portal to GenBank genomes
 Genome browser, gene pages
IMG – Integrated Microbial Genomes
Joint Genome Institute
 http://img.jgi.doe.gov/cgi-bin/pub/main.cgi
 1,911 microbial genomes (approximately half are draft quality)
 Genome browser, gene pages
 Many comparative operations
 Genome context analyses available
PATRIC
Virginia Bioinformatics Institute
 http://patric.vbi.vt.edu/
 Genome browser, gene pages
 KEGG pathways
SEED / NMPDR
Argonne National Laboratory
 http://www.nmpdr.org/FIG/wiki/view.cgi
 782
microbial genomes
 Funding ended in 2009
 Unique features:
 Systems
 Essential genes
 Comparative genomics tools
 Community annotation
UMBBD
University of Minnesota
 http://umbbd.msi.umn.edu/
 Database of ~150 microbial biodegradation pathways
 Does not include full microbial genomes
KEGG – Kyoto Encyclopedia of Genes and Genomes
Kyoto University
 http://www.genome.ad.jp/kegg/
 1,382 organisms
 KEGG reannotates each genome
 Static reference pathway maps are colored with the genes present in each organism
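
As a rough illustration of what coloring a static reference map means, the sketch below treats a map as a fixed list of steps and marks the steps for which an organism has a gene. The map layout is identical for every organism; only the coloring differs. The step identifiers, the colors, and the function name are hypothetical and are not taken from KEGG's data or software.

```python
# Hypothetical reference map: a fixed list of step identifiers.
# These identifiers are made up for illustration only.
REFERENCE_MAP = ["step1", "step2", "step3", "step4"]


def color_reference_map(reference_steps, organism_steps):
    """Return {step: color}; steps the organism has a gene for are 'green'."""
    present = set(organism_steps)
    return {
        step: ("green" if step in present else "white")
        for step in reference_steps
    }


if __name__ == "__main__":
    # Hypothetical organism with genes for two of the four steps.
    organism = {"step1", "step3"}
    for step, color in color_reference_map(REFERENCE_MAP, organism).items():
        print(step, color)
```

Note that nothing in this lookup removes or rearranges steps for a given organism, which is the sense in which the map stays a static reference.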
Comparison with KEGG

 KEGG vs MetaCyc: Reference pathway collections
   KEGG maps are not pathways (Nuc Acids Res 34:3687 2006)
     KEGG maps contain multiple biological pathways
     Two genes chosen at random from a BioCyc pathway are more likely to be related according to genome context methods than two genes chosen from a KEGG pathway
   KEGG maps are composites of pathways in many organisms; they do not identify which specific pathways were elucidated in which organisms
   KEGG has no literature citations, no comments, less enzyme detail
   KEGG assigns half as many reactions to pathways as MetaCyc
 KEGG vs organism-specific PGDBs
   KEGG does not curate or customize pathway networks for each organism
   Highly curated PGDBs now exist for important organisms such as E. coli, yeast, mouse, Arabidopsis
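
The genome-context comparison above can be sketched as follows. This is a minimal illustration, not the method of the cited paper: it assumes a precomputed set of gene pairs linked by some genome context method, and it enumerates all same-pathway gene pairs instead of sampling them at random. The gene and pathway names are hypothetical.

```python
from itertools import combinations


def related_pair_fraction(pathways, related_pairs):
    """Fraction of same-pathway gene pairs linked by genome context.

    pathways: dict mapping pathway name -> iterable of gene ids
    related_pairs: set of frozenset({gene_a, gene_b}) links
                   (hypothetical output of a genome context method)
    """
    total = 0
    related = 0
    for genes in pathways.values():
        # Enumerate every unordered pair of distinct genes in the pathway.
        for a, b in combinations(sorted(set(genes)), 2):
            total += 1
            if frozenset((a, b)) in related_pairs:
                related += 1
    return related / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical genome-context links between genes.
    links = {frozenset(("g1", "g2")), frozenset(("g3", "g4"))}
    # Two narrowly defined pathways versus one composite map of the same genes.
    narrow = {"pathwayA": ["g1", "g2"], "pathwayB": ["g3", "g4"]}
    composite = {"map1": ["g1", "g2", "g3", "g4"]}
    print(related_pair_fraction(narrow, links))
    print(related_pair_fraction(composite, links))
```

In this toy data the composite map scores lower because it pairs genes from unrelated pathways, which is the effect the comparison describes. Pairs are pooled across pathways here, so large pathways carry more weight; choosing a pathway first and then a pair would be a different, equally reasonable weighting.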
Comparison of Pathway Tools to KEGG
 Inference tools
   KEGG does not predict presence or absence of pathways
   KEGG lacks a pathway hole filler and an operon predictor
 Curation tools
   KEGG does not distribute curation tools
   No ability to customize pathways to the organism
   Pathway Tools schema is much more comprehensive
 Visualization and analysis
   KEGG does not perform automatic pathway layout
   KEGG metabolic-map diagram is extremely limited
   No comparative pathway analysis