
CalbiCyc, Metabolic Pathways at
the Candida Genome Database
Martha Arnaud
[email protected]
Outline
• Accessing data in the Candida Genome Database (CGD)
• Gene information in CGD: the Locus Summary page
• Biochemical pathways at CGD
• Pathway Prediction and Curation
• Our favorite PTools customizations / configuration options
Introduction to CGD
SGD-like resource for
Candida albicans
CGD started in 2004
All CGD data are freely available to
the public
Share codebase, tools, and website
organization with SGD
Manual curation of scientific literature
Pathways are just one of the types of
data we provide, and they
account for a modest fraction of
our site usage
Accessing data in CGD
Quick Search:
The main entrance point
Search CGD by gene
name or keyword
Fields searched:
- Gene Names
- Gene Descriptions
- Gene Ontology terms, synonyms, IDs
- People (colleagues, authors)
- PubMed ID
- S. cerevisiae Ortholog or Best Hit
- Biochemical pathways
Additional tools for
accessing data in CGD
Advanced Search:
Search for genes by
properties
Full-text literature search
Uses Textpresso
Developed by
Wormbase, Caltech
Sequence-related
searches and tools
BLAST
Pattern Match
Restriction Map
Primers
Genome Browser
(GMOD’s
GBrowse)
Community Resources
Search for colleagues
Browse Candida Labs
Bulk Data Downloads
Browse list of
downloadable files
Downloads directory
on our web site
Gene information
in CGD
CGD focuses on
gene-based
information
Basic gene
information is found
on the “Locus
Summary Page”
(LSP)
Quick Search is the
easiest way to find
the LSP for a gene
Locus Summary Page
LSP
summarizes
gene
information
A “hub” that
links out to
more details
Locus Summary Page
Gene names, aliases
Gene description
Mutant phenotypes
Gene Ontology
Chromosomal location
Sequence retrieval
Sequence analysis
Genome browser
Orthologs
Pathways
Biochemical Pathways at CGD
http://pathway.candidagenome.org/
Search…
List…
Browse…
PTools in “web mode”
Biochemical Pathways at CGD
• Zoom
• Link out to SGD
• Curated
pathway
summary
comments
• References
PTools Prediction of pathways for CGD
• PathoLogic pathway database construction: January - April 2007
• Pathways released on our public web site: March 2008
Curation of pathways at CGD
Two-part curation approach:
Step 1. Triage
– Literature searches, assemble citation list
– Decide to keep or delete pathway
– Kept 181, deleted 227, added 15
Step 2. More intensive curation
– Pathway modifications
– Pathway comments
Current statistics: 156 pathways
107 with second-stage curation complete
14 triage and S. cerevisiae comments from SGD
35 triage only (references, but no free-text description)
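As a quick consistency check on the figures above, here is a minimal sketch that tallies them. It assumes that the "kept" and "deleted" counts together cover every predicted pathway examined during triage; the slides do not state this explicitly, and the dictionary labels are shorthand for the slide text.

```python
# Figures copied from the slides; labels are shorthand for the slide text.
triage = {"kept": 181, "deleted": 227, "added": 15}
current = {
    "second-stage curation complete": 107,
    "triage plus S. cerevisiae comments from SGD": 14,
    "triage only": 35,
}

# Assumption: kept + deleted covers all predicted pathways examined in triage.
reviewed = triage["kept"] + triage["deleted"]
total = sum(current.values())

print(f"Predicted pathways triaged: {reviewed}")
print(f"Fraction kept at triage: {triage['kept'] / reviewed:.1%}")
print(f"Current pathways: {total}")
for label, count in current.items():
    print(f"  {label}: {count} ({count / total:.1%})")
```

The three current categories sum to the stated total of 156 pathways.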
Curation of pathways, continued
Curation notes
Curation challenge: Steep learning curve for the curation tools.
The tools are quite different from those used for our usual gene-centric curation, and the process is distinct, so curators need to “switch gears” for pathway curation.
We found that it was easier to make progress by making a focused “project” out of pathway curation.
Our favorite configurable PTools options
• Multiple routes to customize the function and appearance of the tools:
– Menu options
– Parameters passed upon PTools web server startup
– PTools “init” file
– Style sheet
– Custom scripts
Some options that we find to be useful:
PathoLogic: Specify Reference PGDB
SGD had some recent
curation that was not yet
included in MetaCyc
PathoLogic allowed
us to include the
new information
in the prediction set!
Useful customization options, continued:
Pathway Hole Filler can be run without
using the operon-related data types
Issue command at the lisp prompt before you start the hole filler:
(update-nodes '(SSCORE-NODE EVALS-NODE ALN-NODE RANK-NODE))
Useful customization options, continued:
Gene links within pathway displays link to
CGD Locus Summary pages
Use -gene-link-db CGD argument when starting web server.
Defined link template in the CGD database frame.
Useful customization options, continued:
Add custom header and footer
Integrated appearance, navigation with the rest of our site
Define header and footer in: /ptools-local/html
Useful customization options, continued:
Customized color
of buttons and boxes
on the interface
Use “CGD Gold”
not “EcoCyc Orange”
aic-export/htdocs/style.css
Relabel “Quick Search”
on the interface
(because we already
have a Quick Search,
with different
functionality)
/ptools-local/ptools-init.dat
One more useful customization:
Custom-format pathway files, updated weekly
• Suzanne Paley sent us a Lisp script that we run as a cron job to
regenerate the flat-files weekly
• We then process the flat files to generate an always-current
custom-format file, requested by a CGD user
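The flat-file processing step might look something like the sketch below. It assumes the flat files use the attribute-value layout (lines of the form `ATTRIBUTE - value`, records ended by `//`, `#` for comments, and `/` for continuation lines). The tab-delimited output, the chosen columns, the function names, and the sample records are all hypothetical. The actual custom format requested by the CGD user is not described in these slides.

```python
import csv
import io


def parse_flat_file(text):
    """Parse attribute-value records: 'ATTR - value' lines, '//' ends a record."""
    records, current, last_attr = [], {}, None
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        if line.strip() == "//":
            if current:
                records.append(current)
            current, last_attr = {}, None
        elif line.startswith("/") and last_attr:
            # continuation line: append to the most recent value
            current[last_attr][-1] += " " + line[1:].strip()
        elif " - " in line:
            attr, value = line.split(" - ", 1)
            last_attr = attr.strip()
            current.setdefault(last_attr, []).append(value.strip())
    if current:
        records.append(current)
    return records


def to_custom_format(records, columns=("UNIQUE-ID", "COMMON-NAME", "REACTION-LIST")):
    """Write one tab-delimited row per record (hypothetical output layout)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for rec in records:
        # repeated attributes are joined with '|' in a single cell
        writer.writerow(["|".join(rec.get(col, [])) for col in columns])
    return out.getvalue()


# Toy sample for illustration only; these are not real CGD pathway records.
SAMPLE = """\
# toy sample
UNIQUE-ID - PWY-EXAMPLE-1
COMMON-NAME - example pathway one
REACTION-LIST - RXN-A
REACTION-LIST - RXN-B
//
UNIQUE-ID - PWY-EXAMPLE-2
COMMON-NAME - example pathway two
//
"""

records = parse_flat_file(SAMPLE)
table = to_custom_format(records)
print(table)
```

A script along these lines could run from the same weekly cron job, immediately after the flat files are regenerated, so that the derived file stays in step with them.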
Some advice and encouragement
on the pathway generation process
• The process can be very “fiddly.” Hang in there!
• Helpful user support: [email protected]
• Do not be afraid to ask for help! The process can be
complicated, but an active dialogue with the helpful PTools
support team makes it all possible
The Team
Gavin Sherlock, PI
Martha Arnaud, Curator
Maria Costanzo, Curator
Diane Inglis, Curator
Marek Skrzypek, Curator
Gail Binkley, Database Administrator
Stuart Miyasato, Systems Administrator
Prachi Shah, Scientific Programmer
For help in getting CGD Pathways up and running
THANKS
• Peter Karp
• Suzanne Paley
• Michelle Green
• Joe Dale
• Ron Caspi
• SGD
Contact us: [email protected]