VisiGene - A Virtual Microscope and Database for In Situ Images

VisiGene - A Virtual Microscope and Database for In Situ Images at genome.ucsc.edu
Galt Barber, Donna Karolchik, David Haussler, Jim Kent
VisiGene displays images from in situ RNA hybridization, reporter genes, and other techniques that show where a gene, enhancer, or promoter is active in an organism. VisiGene currently contains ~100,000 images from several high-throughput gene projects, along with images from the literature as curated by the model organism databases.

The controls for VisiGene are simple: a text box for search terms, a scrolling list of thumbnails of the images that match the search terms, and a large region that serves as a virtual microscope for the selected image. Clicking on a region zooms to the next level of magnification, centered on that region, and dragging the image with the mouse scrolls through it. VisiGene transmits only the data for the part of the image being viewed, at the scale at which it is being viewed, so the response time is quite fast. Underneath the image is a caption that contains a link to the paper associated with the image; hyperlinks to the UCSC Genome Browser pages for the genes; the age, sex, and genotype of the organism; and, when available, human-curated information on the anatomical structures in which the gene is active.

The search terms include gene names and symbols, authors, date of publication, organisms, developmental stages, and anatomical structures. The Genome Browser and Gene Sorter contain tracks and columns that link into VisiGene. Current image sets include mouse transcription factors from the Mahoney Lab, adult mouse brain images from the Allen Brain Atlas, mouse head and brain images from the GENSAT project, whole-mount Xenopus laevis images from the Japanese National Institute for Basic Biology, and images from the mouse literature curated by the GXD group of MGI. We are grateful to all who have contributed images to VisiGene so far, and are actively searching for additional image sets.

Database structure
Indentation shows parent/child relationship between tables. Key fields used to join tables are underlined. In general a key field named xyz links into the id field of the xyz table.
table fields
submissionSource id,name,acknowledgement,setUrl,itemUrl,abUrl
  submissionSet id,name,contributors,year,publication,pubUrl,journal,copyright,submissionSource
    journal id,name,url
    copyright id,notice
    imageFile id,fileName,priority,imageWidth,imageHeight,submissionSet,submitId,caption
      caption id,caption
      image id,submissionSet,imageFile,imagePos,paneLabel,sectionSet,sectionIx,specimen,preparation
        specimen id,name,taxon,genotype,bodyPart,sex,age,minAge,maxAge,notes
          bodyPart id,name
          sex id,name
          genotype id,taxon,strain,alleles
            strain id,taxon,name
            genotypeAllele genotype,allele
              allele id,gene,name
                gene id,name,locusLink,refSeq,genbank,uniProt,taxon
        preparation id,fixation,embedding,permeablization,sliceType,notes
          fixation id,description
          embedding id,description
          permeablization id,description
          sliceType id,name
        imageProbe image,probe,probeColor
          probe id,gene,antibody,probeType,fPrimer,rPrimer,seq,bac
            gene id,name,locusLink,refSeq,genbank,uniProt,taxon
            antibody id,name,description,taxon
            probeType id,name
            bac id,name
          probeColor id,name
          expressionLevel imageProbe,bodyPart,level,cellType,cellSubtype,expressionPattern
            bodyPart id,name
            cellType id,name
            cellSubtype id,name
            expressionPattern id,name
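
The key-field convention described for the schema (a field named xyz links into the id field of the xyz table) can be illustrated with a small join. The sketch below is only an illustration, not the VisiGene code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, creates a reduced subset of the columns listed above, and fills the tables with made-up rows. The function name files_for_gene and the sample gene and file names are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL database (assumption).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE gene       (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE probe      (id INTEGER PRIMARY KEY, gene INTEGER);
CREATE TABLE imageFile  (id INTEGER PRIMARY KEY, fileName TEXT);
CREATE TABLE image      (id INTEGER PRIMARY KEY, imageFile INTEGER);
CREATE TABLE imageProbe (image INTEGER, probe INTEGER);

-- Made-up sample rows, for illustration only.
INSERT INTO gene       VALUES (1, 'Pax6'), (2, 'Sox2');
INSERT INTO probe      VALUES (1, 1), (2, 2);
INSERT INTO imageFile  VALUES (1, 'a.jpg'), (2, 'b.jpg');
INSERT INTO image      VALUES (1, 1), (2, 2);
INSERT INTO imageProbe VALUES (1, 1), (2, 1), (2, 2);
""")


def files_for_gene(conn, gene_name):
    """Return file names of images whose probes target the named gene.

    Each join follows the convention: a field named xyz joins to xyz.id.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT imageFile.fileName
        FROM gene
        JOIN probe      ON probe.gene = gene.id
        JOIN imageProbe ON imageProbe.probe = probe.id
        JOIN image      ON image.id = imageProbe.image
        JOIN imageFile  ON imageFile.id = image.imageFile
        WHERE gene.name = ?
        ORDER BY imageFile.fileName
        """,
        (gene_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(files_for_gene(db, "Pax6"))
```

Because imageProbe is a link table holding only key fields, one image can carry several probes and one probe can appear in several images.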
[Figure: the full resolution image shown with its 1/2x and 1/4x reductions]
Full-sized images are shrunk to 1/2, 1/4, 1/8, 1/16, 1/32, and 1/64 of their original size. The images at each scale are cut into 512x512 tiles. This processing happens off-line on our computer cluster. JavaScript code in “bigImage.html” requests just those tiles needed to show the current window. bigImage.html is independent of the database and could easily be used to deliver other high-resolution imagery over the web.
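
The tile selection described above can be sketched in a few lines. This is a hedged illustration in Python rather than the JavaScript in bigImage.html: the function names, the (column, row) tile addressing, and the choice to round sizes up when shrinking are all assumptions made for the example. Only the 512-pixel tile size and the shrink factors come from the poster.

```python
import math

TILE = 512                                 # tile edge in pixels, from the poster
SHRINK_FACTORS = [1, 2, 4, 8, 16, 32, 64]  # full size plus the six reductions


def level_size(full_w, full_h, shrink):
    """Pixel size of the image after shrinking (rounding up is an assumption)."""
    return math.ceil(full_w / shrink), math.ceil(full_h / shrink)


def tile_grid(full_w, full_h, shrink):
    """Number of tile columns and rows needed to cover one scale."""
    w, h = level_size(full_w, full_h, shrink)
    return math.ceil(w / TILE), math.ceil(h / TILE)


def visible_tiles(full_w, full_h, shrink, view_x, view_y, view_w, view_h):
    """List the (col, row) tiles that overlap a viewport.

    The viewport is given in pixels of the shrunken image at this scale.
    """
    cols, rows = tile_grid(full_w, full_h, shrink)
    first_col = max(0, view_x // TILE)
    first_row = max(0, view_y // TILE)
    last_col = min(cols - 1, (view_x + view_w - 1) // TILE)
    last_row = min(rows - 1, (view_y + view_h - 1) // TILE)
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# A hypothetical 8000x6000 image viewed at 1/4 scale through an 800x600 window.
print(tile_grid(8000, 6000, 4))                         # (4, 3)
print(visible_tiles(8000, 6000, 4, 600, 400, 800, 600))
```

In this example the 1/4-scale image is covered by 12 tiles, but the 800x600 window overlaps only 4 of them, which is why fetching tiles on demand keeps the response time short.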
File naming scheme
[Diagram: VisiGene data flow, from contributed image sets to the web browser]
Image sources:
Japanese NIBB: 3 CDs JPEGs, EST Seq
JAX/MGI: SQL Database, http JPEGs, Gene names
Mahoney Lab: Excel Spreadsheet, Laptop JPEGs, PCR Primers
NCBI Gensat: XML Dump, http JPEGs, BAC Seq.
Allen Brain: Excel Spreadsheet, Ext HD JPEG 2000, Clone Seq
Source-specific loaders:
vgLoadJax, 978 lines of C
vgLoadMahoney, 724 lines of C
vgLoadGensat, 301 lines of C
vgLoadJax, 204 lines of C
vgLoadJax, 253 lines of C
Directory containing 3 files per submission:
submission.ra
imageInfo.tab
caption.txt
Directories of full sized images
Loading and preparation programs:
visiGeneLoad, 1332 lines of C
vgGetText, 290 lines of C
vgPrepImage, 832 lines of C
Resulting data:
~1,000,000 row MySQL Database
Free text gene-aware index
~4,000,000 512x512 JPEG image tiles
Web delivery:
hgVisiGene, Web CGI script, 3988 lines of C
bigImage.html, JavaScript + HTML, 1098 lines
Your web browser
Acknowledgements
Imagery and Caption Data:
Paul Gray and the Mahoney Lab
Martin Ringwald, Susan McClatchy, Janan Eppig, and the Gene Expression folks at MGI/Jackson Labs
Michael DiCuccio at NCBI and the GENSAT project
Naoto Ueno and the Japanese National Institute for Basic Biology
Susan Sunkin and the Allen Brain Institute
Software Tools:
MySQL
ImageMagick
ER Mapper (for JPEG 2000 libraries)
GNU Compiler Collection & Linux
Funding:
VisiGene was developed as a skunk works under NHGRI grant 1P41HG02371
Special thanks to the Quality Assurance Group at genome.ucsc.edu for all their help in making VisiGene a robust web
application.