wFleaBase
Daphnia Genome Database
from Common Components
Daphnia Genomic Consortium
Meeting, Sept. 2003
Don Gilbert
http://iubio.bio.indiana.edu/daphnia
A Replicable Genome infOrmation System (Argos)
http://eugenes.org/argos | flybase.net/flybase-ng
common/
 java/ ; perl/ -- program libraries and packages
 servers/ -- major programs (BLAST, MySQL/PostgreSQL, others)
 systems/ -- OS executables of programs
daphnia/ -- implemented organism genome systems
eugenes/
flybase/
docs/ & install/ -- Argos instructions and usage
template/ -- structure for new projects
ROOT/ -- common directory of installed projects
Argos features
Common genome tool set
 Share benefits of “best of breed” genome tools
 Common parts are tested & maintained by others
 Minimal IT expertise (no compiles or system management)
 Choice of tools (existing or new genome DBs use the parts desired)
Flexible project packages
 Project needs specify tool set (compare EnsEMBL where all use one set)
 Own look’n’feel web pages, contents, functions
 Security for protected and public sections
Easy replication to any Unix computer
 ‘Live’ database system replication using rsync
 Keep remote servers up-to-date every day
 Local cluster/grid for high-volume traffic
 Works on common workstations, laptops
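The daily rsync replication described above could be driven by a small script. The following is a minimal sketch, not the Argos tooling itself: the host name, module path, and local directory are hypothetical placeholders, and only the standard rsync options -a (archive), -z (compress), --delete, and --dry-run are used.

```python
import shlex


def build_rsync_command(source, destination, dry_run=False):
    """Return the rsync argument list that mirrors source into destination."""
    # -a preserves permissions, times and symlinks; -z compresses in transit;
    # --delete removes local files that no longer exist on the master copy.
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory contents,
    # not the directory itself.
    cmd += [source.rstrip("/") + "/", destination.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical master server and local mirror path (assumptions).
    cmd = build_rsync_command(
        "rsync://argos.example.org/daphnia",
        "/bio/argos/daphnia",
        dry_run=True,
    )
    # Print the command instead of running it; a real mirror would pass
    # the list to subprocess.run(cmd, check=True), for example from cron.
    print(shlex.join(cmd))
```

Scheduling the script once a day (for example from cron) would match the "up-to-date every day" goal; the schedule itself is left to the site.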
Argos common parts
Java common library, Ant builds, XML Tools,
Web Services (Axis), Lucene for “Google”-like searches
Perl common library of BioPerl, GBrowse, others
Servers include
Apache, Tomcat web servers
MySQL, PostgreSQL databases
BLAST (NCBI)
Systems compiled for
apple-powerpc-darwin, intel-linux, sun-sparc-solaris
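Because the systems are precompiled per platform, an installer needs to pick the matching directory. The sketch below shows one way this could be done; the mapping from Python's platform values to the three directory names is an assumption made for illustration, not the actual Argos selection logic.

```python
import platform

# Directory names taken from the slide; the lookup keys are assumptions.
SYSTEM_DIRS = {
    ("darwin", "powerpc"): "apple-powerpc-darwin",
    ("linux", "intel"): "intel-linux",
    ("sunos", "sparc"): "sun-sparc-solaris",
}


def cpu_family(machine):
    """Reduce a machine string such as 'i686' to a coarse CPU family."""
    m = machine.lower()
    if m in ("i386", "i486", "i586", "i686", "x86_64", "amd64"):
        return "intel"
    if m.startswith("power") or m.startswith("ppc"):
        return "powerpc"
    if m.startswith("sun4") or m.startswith("sparc"):
        return "sparc"
    return m


def system_dir(system=None, machine=None):
    """Return the precompiled systems directory name, or None if unsupported."""
    system = (system or platform.system()).lower()
    machine = machine or platform.machine()
    return SYSTEM_DIRS.get((system, cpu_family(machine)))


if __name__ == "__main__":
    # Reports the directory for the current host, or None if no build exists.
    print(system_dir())
```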
wFleaBase structure
Cgi-bin -- Web programs (Perl)
Common -- Link to common, shared tools
Conf -- Site configurations for web, data
Data -- Bulk data & FTP site folder
Dbs -- Project databases: blast, lucene, mysql
Indices -- Database indices
Lib -- Program libraries
Web -- Web structure and documents
 Genomics, Sequences, Maps, Literature, Stocks, Docs, other
 includes Public and Protected (project member only) parts
Webapps -- Web programs (Java)
 includes Search system, Secure web and editing
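The project layout above can be sketched as a small script that creates the folders and links the shared tools. This is an illustration only: the lowercase directory names, the function name create_project, and the use of a symbolic link for the common folder are assumptions, and the real Argos template/ may be set up differently.

```python
import os
import tempfile

# Project folders from the slide, written in lowercase (an assumption).
PROJECT_DIRS = ["cgi-bin", "conf", "data", "dbs", "indices", "lib", "web", "webapps"]


def create_project(root, name, common_dir):
    """Create an empty project skeleton under root and link in the common tools."""
    project = os.path.join(root, name)
    for sub in PROJECT_DIRS:
        os.makedirs(os.path.join(project, sub), exist_ok=True)
    # "Common -- Link to common, shared tools": modeled here as a symlink.
    link = os.path.join(project, "common")
    if not os.path.lexists(link):
        os.symlink(common_dir, link)
    return project


if __name__ == "__main__":
    # Build a throwaway skeleton in a temporary directory and list it.
    with tempfile.TemporaryDirectory() as root:
        common = os.path.join(root, "common")
        os.makedirs(common)
        project = create_project(root, "daphnia", common)
        print(sorted(os.listdir(project)))
```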
Search wFleaBase
BLAST wFleaBase
Edit wFleaBase
Where to put Daphnia Genome?
Database needs
 Automated annotation and curated updates
 Search and retrieve data subsets
Choices
 EnsEMBL - working now; Gramene & others use
 GMOD:Chado - in development (FlyBase, WormBase, ChlamyGenome, TIGR, others will use)
 Other choices?
Generic Model Organism Database Construction Set
www.gmod.org
 Genome+ Database (more than annotations)
 Genome visualization tools
 Genome annotation pipeline planned
 Literature curation and Gene Ontology tools
 Component system (pick and choose)
 Developing - more complete in 2004
EnsEMBL Genome Database
www.ensembl.org
 Genome annotation database
 Genome visualization tools
 Genome annotation pipeline
 Comprehensive system (all or none)
 Production - usable now
From Shawn Hoon, Fugu Informatics Group
wFleaBase issues
• Basic web system ready for genome data?
• Start with EnsEMBL for management; move to GMOD:Chado if better choice?
• Add GMOD GBrowse; Apollo Editor with genome
• Add “Self-service” database features for?
• Easy management by scientists
• Genome data; stocks; research literature
• Add evolutionary, ecological, environmental data
Prototype at http://iubio.bio.indiana.edu/daphnia/
GBrowse Maps
Apollo Annotator