Tripal: a Construction Toolkit for Online Genome Databases
Stephen P. Ficklin1, Lacey-Anne Sanderson2, Chun-Huai Cheng1, Margaret Staton3,
Sook Jung1, Taein Lee1, Il-Hyung Cho4, Kirstin E. Bett2, Dorrie Main1
1 Washington State University, Pullman, WA, USA
2 University of Saskatchewan, Saskatoon, SK, Canada
3 Clemson University Genomics Institute, SC, USA
4 Saginaw Valley State University, University Center, MI, USA
contact: [email protected]
Brief Overview
Tripal is a platform that simplifies the construction of online genomic databases. As new sequencing technologies and downstream analyses
produce more data, the need for online visualization and data mining keeps growing. The need for skilled web developers and IT
professionals, coupled with the complexities of project and data management, creates an obstacle for many research communities and individual
labs. Additionally, online genomic databases are expected to provide access to raw data, analysis results, data-mining tools, and cross
references to larger or companion databases, and to include outreach content and perhaps social networking capabilities. Tripal is intended to
reduce these complexities by coupling the strengths of GMOD Chado, a relational database schema for biological data, and Drupal, a popular
Content Management System (CMS).
Tripal provides a web interface that includes a Chado installer and data loaders for ontologies (controlled vocabularies), GFF files, and FASTA
files. Web pages are automatically generated for organisms, genomic features, biological libraries, and stock collections. Web pages can be
enriched with analysis results from BLAST, KAAS/KEGG, InterProScan, and Gene Ontology (GO). Tripal can be used “as is” but also
allows for complete customization. PHP-based template files are provided for all data types so that each community can customize pages
as it requires. The Tripal API provides a uniform set of variables and functions for accessing data within the
Chado database.
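As a rough illustration of the kind of lookup that sits behind an automatically generated feature listing, the sketch below queries a heavily simplified stand-in for three Chado tables (organism, cvterm, feature). It is not the Tripal API, which is PHP-based. The in-memory SQLite database, the reduced column set, the helper names, and every row of data are assumptions made up for this example. Real Chado runs on PostgreSQL and defines more columns and constraints.

```python
import sqlite3

# Simplified stand-ins for three Chado tables. Only a few columns are kept;
# this is an illustrative assumption, not the full Chado schema.
SCHEMA = """
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    genus TEXT, species TEXT, common_name TEXT
);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    organism_id INTEGER REFERENCES organism(organism_id),
    type_id INTEGER REFERENCES cvterm(cvterm_id),
    name TEXT, uniquename TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding made-up demonstration rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO organism VALUES (1, 'Citrus', 'sinensis', 'sweet orange')"
    )
    conn.executemany("INSERT INTO cvterm VALUES (?, ?)", [(1, "gene"), (2, "mRNA")])
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 1, "demo_gene_1", "demo_gene_1"),  # hypothetical feature
            (2, 1, 2, "demo_mrna_1", "demo_mrna_1"),  # hypothetical feature
        ],
    )
    return conn


def features_for_organism(conn, genus, species, feature_type):
    """Return unique names of features of one type for one organism."""
    rows = conn.execute(
        """
        SELECT f.uniquename
        FROM feature f
        JOIN organism o ON o.organism_id = f.organism_id
        JOIN cvterm t ON t.cvterm_id = f.type_id
        WHERE o.genus = ? AND o.species = ? AND t.name = ?
        ORDER BY f.uniquename
        """,
        (genus, species, feature_type),
    )
    return [row[0] for row in rows]
```

Calling `features_for_organism(build_demo_db(), "Citrus", "sinensis", "mRNA")` returns the single made-up mRNA record. The feature type is resolved through the controlled-vocabulary table instead of being stored as free text, which is the reason ontologies are loaded before features.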
Tripal currently supports visualization of only a subset of the Chado schema, but further development is underway. Meanwhile,
others can use the Tripal API to develop their own extensions. Those extensions can in turn be made available for anyone to use. These
custom extensions, the Tripal package, and support resources such as an active mailing list can be found on the Tripal website
(http://tripal.sourceforge.net). Tripal is in use for several genome websites, including the Citrus Genome Database, the Cacao
Genome Database, Pulse Crops Genomics and Breeding, the Hardwood Genomics Project, and others.
Sites Using Tripal
Fagaceae Genomics Web
Citrus Genome Database
Pulse Crop Genomics & Breeding
Marine Genomics Project
Cacao Genome Database
Cool Season Food Legume Genome
Genome Database for Vaccinium
Hardwood Genomics Project
Organism Pages. Site administrators can easily create pages with unique content for each
organism.
Sites Migrating to Tripal
Genome Database for Rosaceae
Cotton Genome Database
Resources
Mailing List
Tutorials
Demo Site
Online Documentation
Feature Pages. Genomic features from whole genome assemblies, unigene assemblies or
other analyses can be added to the database using web-based loaders. Pages such as the one above
can be made available for each feature. This example from the Citrus Genome Database is
for an mRNA sequence. A structural view of the gene from GBrowse is shown, and
additional information about this gene is available through the right-hand resources sidebar.
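To make concrete what a feature loader consumes, here is a minimal sketch that parses one line of a GFF3 file into named fields. It relies only on the published GFF3 layout of nine tab-separated columns with `key=value` attributes separated by semicolons. The function name and the sample line in the usage note are assumptions for illustration, and this is not the code Tripal's loader uses.

```python
from urllib.parse import unquote

# The nine columns defined by the GFF3 format, in order.
GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line):
    """Return a dict for one GFF3 feature line, or None for comments and blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None  # directives such as ##gff-version and blank lines

    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")

    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based and inclusive in GFF3.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    # Column 9 holds semicolon-separated key=value pairs, URL-escaped.
    attributes = {}
    if record["attributes"] != ".":
        for pair in record["attributes"].split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record
```

For a made-up line such as `"scaffold_1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1"`, the parser returns a record whose `type` is `mRNA` and whose `Parent` attribute links it to its gene. A loader could use that parent link to relate the mRNA to the gene it belongs to.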
http://tripal.sourceforge.net/
Reference & Acknowledgements
Stock/Germplasm Pages. Stock and germplasm collections can be managed and
viewed using Tripal. The screenshot above, taken from the KnowPulse website,
contains details about a specific stock, including properties, synonyms, genotypes
and more.
Analysis Pages. All data imported through Tripal have an
“analysis” page describing how the data were
obtained.
Stephen P. Ficklin, Lacey-Anne Sanderson, Chun-Huai Cheng, Margaret Staton, Taein Lee, Il-Hyung Cho, Sook Jung, Kirstin
E. Bett, Dorrie Main. Tripal: a Construction Toolkit for Online Genome Databases. Database, Sept 2011. Vol 2011.
Tripal is funded indirectly through various agencies and groups. We are extremely thankful for this support. For a listing
of these funding agencies please see the Tripal paper referenced above. The GMOD group provides logistical support.
Functional Data. Tripal supports import and display of BLAST, InterProScan and KEGG results. All are loaded through common file formats such as XML. Tripal easily supports loading of results generated through Blast2GO.
Functional Data. Reports for Gene Ontology (GO) annotations can be displayed. Users can select from all available analyses used to map GO terms.
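As an illustration of loading analysis results from XML, the sketch below reads a tiny BLAST XML document and lists each hit with its best e-value. The element names follow NCBI BLAST's XML output. The sample document, its accession and description, and the function name are made up for this example, and the code is not Tripal's importer.

```python
import xml.etree.ElementTree as ET

# A made-up, heavily trimmed document using NCBI BLAST XML element names.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>demo_mrna_1</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_def>hypothetical protein (made-up example)</Hit_def>
          <Hit_accession>DEMO0001</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-50</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def best_hits(xml_text):
    """Return (query, hit accession, lowest e-value) for every hit."""
    root = ET.fromstring(xml_text)
    results = []
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        for hit in iteration.iter("Hit"):
            # A hit can have several HSPs; keep the most significant one.
            evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
            best = min(evalues) if evalues else None
            results.append((query, hit.findtext("Hit_accession"), best))
    return results
```

Running `best_hits(SAMPLE_XML)` yields one tuple per hit, which is roughly the information a results table on a feature page would need: the query, the matched record, and a significance score.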