The Horizontal Gene Transfer Database
Jeramy Brewster, Edward Simpson and Spandana Kommalapati
Mentor: Mark Goebl, PhD
Purpose
• Horizontal Gene Transfers (HGTs) are significant events that can have a dramatic
impact on the physiology of an organism
• Currently there are no robust resources for cataloguing and exploring HGTs,
although researchers are using complex evidence-based methods to create
links between genetic elements and donors
• Articles identifying new transfers are published regularly, but without a
central repository this information is difficult to organize
• Creating a network of HGTs would allow researchers to investigate the effects
of newly acquired genes on existing pathways, among other useful applications
• Tracking HGTs is also a way to demonstrate pivotal moments in the
evolutionary history of organisms and can provide insight into the gene-centric
view of evolution.
Objectives
• We developed a central repository for Horizontal Gene Transfers that holds
records for HGT events and integrates associated useful external resources for:
• Taxonomy associations (donor/recipient)
• Gene-gene interactions
• Metabolic pathways
• Proximal genetic regions
• This database is a proof of concept focused on Saccharomyces cerevisiae; its
schema can be adapted to multiple organisms
• It is attractive, easy to navigate and presents data in a concise and effective
format
• The schema incorporates YeastCyc (biological metabolic function), BioGrid
(gene-gene interactions), NCBI Taxonomy, and the UCSC Genome Browser (gene
proximity and some biological representation) in addition to links to other
databases such as PubMed, SGD and NCBI Gene
Technologies
• The database and website are hosted on a free public hosting service
(2freehosting.com) at http://556db.ciki.me/hgt/
• The database uses a MySQL DBMS for imported data
• NCBI Taxonomy, BioGrid and YeastCyc sources are processed by downloading CSV
text files and importing them into SQL tables with the phpMyAdmin GUI provided by
2freehosting.com. Record counts and a random sample of records are reviewed to
ensure complete and valid data imports
• UCSC Genome Browser data are related to the imported tables using the open SQL
access provided by UCSC
• HTML5 and PHP are used to build a dynamic front-end for the database that generates
MySQL queries across all data assets
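The import check described above (compare record counts, then review a random sample) can be sketched as follows. This is a minimal illustration, not the project's actual procedure: the poster used phpMyAdmin with MySQL, whereas this sketch uses Python's built-in sqlite3 module as a stand-in, and the table name `pathway_gene`, the two-column layout, and the helper names are all assumptions.

```python
import csv
import io
import random
import sqlite3

# Hypothetical CSV extract; a real pathway export has more columns.
CSV_TEXT = """gene_symbol,pathway
GAL1,galactose degradation
GAL7,galactose degradation
GAL10,galactose degradation
"""

def import_csv(conn, csv_text):
    """Load CSV rows into a table and return the parsed source rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute("CREATE TABLE pathway_gene (gene_symbol TEXT, pathway TEXT)")
    conn.executemany(
        "INSERT INTO pathway_gene VALUES (:gene_symbol, :pathway)", rows
    )
    conn.commit()
    return rows

def validate_import(conn, source_rows, sample_size=2, seed=0):
    """Compare record counts, then spot-check a random sample of source rows."""
    (count,) = conn.execute("SELECT COUNT(*) FROM pathway_gene").fetchone()
    if count != len(source_rows):
        return False
    rng = random.Random(seed)
    for row in rng.sample(source_rows, min(sample_size, len(source_rows))):
        hit = conn.execute(
            "SELECT 1 FROM pathway_gene WHERE gene_symbol = ? AND pathway = ?",
            (row["gene_symbol"], row["pathway"]),
        ).fetchone()
        if hit is None:
            return False
    return True

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
source_rows = import_csv(conn, CSV_TEXT)
import_ok = validate_import(conn, source_rows)
print("import valid:", import_ok)
```

Automating the count and sample check keeps the review repeatable each time a source file is refreshed.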
Strategies
• This database uses a Relational Database Model
• HGTs are manually curated from the literature, recording both donor and recipient organisms
• Manual curation entails:
• All genes must be represented with their official gene symbol and ID from NCBI Gene and the UCSC
ORF name
• Genes need to be associated with IDs from the NCBI Taxonomy database
• Multiple records with the same gene symbol can exist but only one record per
organism
• Transfer records will contain the donor, recipient and the PMID of the article where
the transfer was identified
• If more than one gene transfer is identified in an article, one record is created per
gene
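The curation rules above map naturally onto relational constraints. The sketch below is one possible reading, not the poster's actual schema: table and column names are assumptions, SQLite stands in for MySQL, the uniqueness of a gene within an article is an interpretation of the "one record per gene" rule, and the gene symbols, ORF names, gene IDs, and PMID are placeholders rather than curated data. Only the two NCBI Taxonomy IDs (4932 for Saccharomyces cerevisiae, 562 for Escherichia coli) are real identifiers.

```python
import sqlite3

SCHEMA = """
CREATE TABLE gene (
    gene_pk      INTEGER PRIMARY KEY,
    symbol       TEXT NOT NULL,      -- official gene symbol
    ncbi_gene_id INTEGER NOT NULL,   -- NCBI Gene ID
    orf_name     TEXT,               -- UCSC ORF name
    taxon_id     INTEGER NOT NULL,   -- NCBI Taxonomy ID
    UNIQUE (symbol, taxon_id)        -- same symbol allowed, but once per organism
);
CREATE TABLE transfer (
    transfer_pk     INTEGER PRIMARY KEY,
    gene_pk         INTEGER NOT NULL REFERENCES gene (gene_pk),
    donor_taxon     INTEGER NOT NULL,
    recipient_taxon INTEGER NOT NULL,
    pmid            INTEGER NOT NULL, -- article where the transfer was identified
    UNIQUE (gene_pk, pmid)            -- assumed: one record per gene per article
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

def add_gene(conn, symbol, ncbi_gene_id, orf_name, taxon_id):
    """Insert a gene record and return its primary key."""
    cur = conn.execute(
        "INSERT INTO gene (symbol, ncbi_gene_id, orf_name, taxon_id) "
        "VALUES (?, ?, ?, ?)",
        (symbol, ncbi_gene_id, orf_name, taxon_id),
    )
    return cur.lastrowid

# Placeholder records: the same symbol in two different organisms is accepted.
yeast_gene = add_gene(conn, "GENE1", 1, "ORF0001", 4932)
other_gene = add_gene(conn, "GENE1", 2, None, 562)

# A second GENE1 record for the same organism violates the UNIQUE constraint.
try:
    add_gene(conn, "GENE1", 3, "ORF0002", 4932)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# One transfer record per gene, carrying donor, recipient and a placeholder PMID.
conn.execute(
    "INSERT INTO transfer (gene_pk, donor_taxon, recipient_taxon, pmid) "
    "VALUES (?, ?, ?, ?)",
    (yeast_gene, 562, 4932, 12345678),
)
print("duplicate rejected:", duplicate_rejected)
```

Pushing the rules into constraints means a curation mistake fails at insert time instead of surfacing later in query results.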
Schema
Database Overview
Site Layout
Curation 1: Managing Genes
Curation 2: Managing Transfers
Query Strategies
Allow scientists to refine searches for hypothesis generation, starting at a higher level
of organization and becoming more granular
• Search by Pathways (could a gene be affecting a given pathway's function?)
• Show HGT genes in pathway
• Show HGT gene-gene interactions
• Show all genes in same chromosome area
• Show biological attributes of associated genes and outlinks for more information
• Search by Taxonomy (is the function of a gene in the donor species the same as its
function in the recipient species?)
• Show genes from taxonomy
• Show gene functions
• Show gene-gene interactions
• Search by HGT Gene (what are the relationships and function of a specific gene?)
• Show genes that have been curated
• Show gene pathways (see Pathways above for how search is structured)
• Show gene-gene interactions
• Show all genes in same chromosome area
• Show biological attributes of associated genes and outlinks for more information
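The drill-down from pathway to HGT genes to gene-gene interactions can be sketched with two joins. This is an illustrative outline only: the table names, column names, and every record below (`HGT1`, `NAT1`, `NAT2`, `pathway X`) are hypothetical placeholders, and SQLite stands in for the MySQL queries the PHP front-end would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE hgt_gene (symbol TEXT PRIMARY KEY, donor_taxon INTEGER);
CREATE TABLE pathway_gene (pathway TEXT, symbol TEXT);
CREATE TABLE interaction (symbol_a TEXT, symbol_b TEXT);

-- Placeholder data, not curated records.
INSERT INTO hgt_gene VALUES ('HGT1', 562);
INSERT INTO pathway_gene VALUES
    ('pathway X', 'HGT1'), ('pathway X', 'NAT1'), ('pathway Y', 'NAT2');
INSERT INTO interaction VALUES
    ('HGT1', 'NAT2'), ('NAT1', 'HGT1'), ('NAT1', 'NAT2');
""")

def hgt_genes_in_pathway(conn, pathway):
    """Step 1: which members of the pathway are curated HGT genes?"""
    rows = conn.execute(
        """SELECT h.symbol
           FROM pathway_gene AS p
           JOIN hgt_gene AS h ON h.symbol = p.symbol
           WHERE p.pathway = ?
           ORDER BY h.symbol""",
        (pathway,),
    )
    return [r[0] for r in rows]

def interaction_partners(conn, symbol):
    """Step 2: interaction partners of a gene, whichever column it appears in."""
    rows = conn.execute(
        """SELECT symbol_b FROM interaction WHERE symbol_a = ?
           UNION
           SELECT symbol_a FROM interaction WHERE symbol_b = ?
           ORDER BY 1""",
        (symbol, symbol),
    )
    return [r[0] for r in rows]

hits = hgt_genes_in_pathway(conn, "pathway X")
partners = {gene: interaction_partners(conn, gene) for gene in hits}
print(hits, partners)
```

Keeping each step as its own parameterized query mirrors the layered search: the result of one level becomes the input to the next, more granular level.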
Example: Search by Pathways
Example: Search by Taxonomy
Example: Gene Search
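For a gene search, the "show all genes in same chromosome area" step amounts to an interval query around the gene's coordinates. The sketch below assumes a local table of gene positions; the table name, column names, window size, and all records are hypothetical, and SQLite stands in for the MySQL tables described earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
CREATE TABLE gene_position (
    name TEXT, chrom TEXT, tx_start INTEGER, tx_end INTEGER
);
-- Placeholder coordinates, not real annotations.
INSERT INTO gene_position VALUES
    ('ORF_A', 'chrI',  1000, 2000),
    ('ORF_B', 'chrI',  2500, 3500),
    ('ORF_C', 'chrI',  9000, 9900),
    ('ORF_D', 'chrII', 1200, 2200);
""")

def neighbors(conn, name, window=1000):
    """Return genes on the same chromosome that overlap the query gene's
    span extended by `window` bases on each side."""
    row = conn.execute(
        "SELECT chrom, tx_start, tx_end FROM gene_position WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        return []
    chrom, start, end = row
    rows = conn.execute(
        """SELECT name FROM gene_position
           WHERE chrom = ? AND name <> ?
             AND tx_start <= ? AND tx_end >= ?
           ORDER BY tx_start""",
        (chrom, name, end + window, start - window),
    )
    return [r[0] for r in rows]

print(neighbors(conn, "ORF_A"))
```

The overlap test (a gene starts before the window ends and ends after the window starts) also catches genes that only partly fall inside the region.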
Resources
Richards TA, Leonard G, Soanes DM, Talbot NJ. Gene transfer into the fungi. Fungal
Biology Reviews, 25(2), 2011, Pages 98-110, ISSN 1749-4613,
http://dx.doi.org/10.1016/j.fbr.2011.04.003.
Fitzpatrick DA. Horizontal gene transfer in fungi. FEMS Microbiology Letters, 329, 2012,
Pages 1-8, doi: 10.1111/j.1574-6968.2011.02465.x. Epub 2011 Dec 15.
Slot JC, Rokas A. Multiple GAL pathway gene clusters evolved independently
and by different mechanisms in fungi. Proceedings of the National Academy of
Sciences, 107(22), 2010, Pages 10136-10141, doi: 10.1073/pnas.0914418107.