Tomato genome annotation

Tomato genome annotation/
transcriptomics workshop
Rome, March 17-19, 2010
Main Conclusions
Participants:
Stephane Rombauts, Univ of Ghent
Ioannis Filippis, Imperial College
Asis Hallab, MPIZ
Manuel Spannagl, MIPS
Heidrun Gundlach, MIPS
Zhangjun Fei, Boyce Thompson Institute
Mohammed Zouine, Univ of Toulouse
Fabio Fuligni, Univ di Milano
Arnaud Bovy, Univ. of Wageningen
R Klein Lankhorst, Univ. of Wageningen
Roeland van Ham, Univ. of Wageningen
Erwin Datema, Univ of Wageningen
Maria Luisa Chiusano, Univ di Napoli
Walter Sanseverino, Univ di Napoli
Nunzio D'Agostino, Univ. di Napoli
Alessandra Traini, Univ. di Napoli
Miriam di Filippo, Univ. di Napoli
Mara Ercolano, Univ. di Napoli
Marco Pietrella, ENEA
Gianfranco Diretto, ENEA
Giovanni Giuliano, ENEA
Conclusions, Annotation
Annotation is good enough for gene family/pathway captains to start manual
annotation now.
All data will revolve around the iTAG annotation; synchronization of SGN, MIPS
and CAB is essential.
Only tweaking will be needed to adapt main conclusions to final annotation
Manual annotation will be both hypothesis-driven and data-driven.
Hypothesis-driven: Gene families/pathways important for tomato biology
(carotenoid genes, ethylene receptors, etc)
Data-driven: Focus will be on genes/gene families showing:
Unexpected expansion/reduction
Fruit-specific expression
Specific to tomato (by OrthoMCL)
Annotation jamboree will be organized on an island after paper submission
Conclusions, Transcriptomics
Illumina and 454 RNAseq data are being aligned on genome.
Gene models supported by Illumina data have been provided to Affy for chip
design
Protocol has been worked out for polishing 454 data before submission
Evidence for alternative splicing
Inclusion of tissue-specific RNAseq data is important for the biological
conclusions of the paper
Cluster of companion papers under production
Inclusion of peptide data on GBrowse is important and innovative
Task division
BLAST Arabidopsis proteins that are significantly longer than their tomato
counterparts against the scaffolds: Rombauts
Study genome duplication: Rombauts (Paterson)
Building pseudomolecules, physical mapping: Sato, Arizona, Keygene
Feed pseudomolecules to assembly, assembly correction: van Ham
Chromosome heatmaps/Repetitive element annotation/FISH pictures:
Chiusano, Gundlach, De Jong
OrthoMCL clusters: Spannagl
Human-readable descriptions, dendrograms: Hallab
Enrichment analysis: Not assigned
Collection of RNAseq data: Giuliano
Alignment of 454, Illumina and SOLiD data: Filippis, Rombauts, Alioto
Generation of alternative splicing annotation: Rombauts, Fei
Inclusion of peptide data on genome browser: Zouine, SGN, MIPS
Inclusion of BAC ends on genome browser: Datema
Align old array probes on genome annotation: Fei
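As an illustration of the first task in this list, the selection of Arabidopsis proteins that are much longer than their tomato counterparts could be sketched as below. This is a minimal sketch under stated assumptions: the input is a list of already paired proteins with known lengths, the function name and the identifiers are hypothetical, and the 1.5-fold cutoff is an illustrative value, not one agreed at the workshop.

```python
def flag_truncation_candidates(pairs, min_ratio=1.5):
    """Return pairs where the Arabidopsis protein is much longer than the tomato one.

    pairs: iterable of (arabidopsis_id, arabidopsis_len, tomato_id, tomato_len).
    min_ratio: illustrative length-ratio cutoff (an assumption, not a workshop value).
    """
    flagged = []
    for ath_id, ath_len, sly_id, sly_len in pairs:
        # Skip zero-length entries to avoid division by zero.
        if sly_len > 0 and ath_len / sly_len >= min_ratio:
            flagged.append((ath_id, sly_id, round(ath_len / sly_len, 2)))
    return flagged


# Hypothetical identifiers and lengths, for illustration only.
example_pairs = [
    ("ath_protein_1", 600, "sly_protein_1", 300),
    ("ath_protein_2", 310, "sly_protein_2", 300),
]
candidates = flag_truncation_candidates(example_pairs)
```

The flagged Arabidopsis proteins would then be the queries aligned against the scaffolds, for example with a translated search, to check whether the tomato gene model is truncated.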
Data-driven manual annotation
Look for expanded/contracted orthoMCL groups.
Look for tomato-specific orthoMCL groups.
Verify if they are supported by expression data
Look for fruit-expressed genes
Identify appropriate gene family/pathway captains
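The first two steps above could be sketched roughly as follows. The sketch assumes the clusters are available in the usual OrthoMCL groups layout (one group per line, `group_id: taxon|gene taxon|gene ...`); the taxon codes (`sly`, `ath`, `osa`, `vvi`), the gene names and the 2-fold threshold are hypothetical choices made for illustration, not values from the workshop.

```python
from collections import Counter

# Hypothetical taxon codes for tomato, Arabidopsis, rice and grape.
TAXA = ("sly", "ath", "osa", "vvi")


def parse_groups(lines):
    """Parse OrthoMCL-style lines into {group_id: Counter of genes per taxon}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        groups[group_id.strip()] = Counter(
            member.split("|", 1)[0] for member in members.split()
        )
    return groups


def classify(counts, focal="sly", fold=2.0):
    """Label a group by comparing the focal taxon with the mean of the others."""
    focal_n = counts.get(focal, 0)
    others = [counts.get(taxon, 0) for taxon in TAXA if taxon != focal]
    if focal_n > 0 and sum(others) == 0:
        return "tomato-specific"
    mean_other = sum(others) / len(others)
    if focal_n >= fold * mean_other:
        return "expanded"
    if focal_n * fold <= mean_other:
        return "contracted"
    return "unchanged"


# Made-up groups, for illustration only.
example_lines = [
    "G1: sly|a sly|b sly|c sly|d ath|x osa|y vvi|z",
    "G2: sly|e sly|f",
    "G3: sly|g ath|h ath|i ath|j osa|k osa|l osa|m vvi|n vvi|o",
    "G4: sly|p ath|q osa|r vvi|s",
]
labels = {gid: classify(c) for gid, c in parse_groups(example_lines).items()}
```

Groups labelled as expanded, contracted or tomato-specific would then be checked against the expression data before a captain is assigned.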
Hypothesis-driven manual annotation
Gene family/pathway captains:
Ethylene biosynthesis/Signal transduction: Giovannoni
ARFs: Bouzayen
Carotenoid biosynthesis, photomorphogenesis: Giuliano
Ascorbate: Botella
Transcription factors: Chiusano, Fei, Khurana
R genes: Ercolano, others?
Cytochrome P450s: Bishop
Repeats: Gundlach
miRNA prediction: Rombauts
Manual annotation details
Information that will be provided to gene family/pathway captains:
Access to GBrowse
Table with automated iTAG annotation, including human-readable gene
descriptions
Scaffold information
OrthoMCL cluster information (tomato-Arabidopsis-rice-grape), including
dendrograms
Gene expression, including tissue-specific data
Information they will have to provide:
Biological story, including interesting mutants, with a focus on fruit biology.
Synteny and gene duplication history using OrthoMCL and dendrogram
information
Input phenotypic data/gene name/protein name on table and on Bogas
webpage
Pathway picture in PathVisio format
Dendrogram of the protein family, in standard format
Expression heatmap in standard format