EUROPEAN MULTIMEDIA BIOINFORMATICS EDUCATIONAL RESOURCE
a new tutorial on sequence analysis and bio computing
Viorica Ghita*, Valérie Ledent*, Robert Herzog*, Terri Attwood#, Ioannis Selimas#, Marc Brugman$
*Belgian EMBnet Node – BEN. Laboratoire de Bioinformatique. Université Libre de Bruxelles. Campus de la Plaine – Bât. NO. Bd du Triomphe. 1050 Bruxelles. #UMBER, the University
of Manchester Specialist Node of EMBnet, School of Biological Sciences, Oxford Road, M13 9PL, Manchester. $University of Amsterdam, Mauritskade 61, 1092 AD Amsterdam, The
Netherlands
EMBER is a new tutorial on sequence analysis and bio computing developed
by several EMBnet teams within an EC framework. The course can be used by
independent learners or as teaching material in academic courses, and is
structured in chapters of gradually increasing difficulty. Each chapter has
several sections: AIM, INFO (presenting the theoretical aspects of the subjects
tackled), INSTRUCTIONS (presenting practical online exercises), QUIZ and
REFERENCES.
Figure 1. EMBER presentation page: shown here is chapter 3 of the tutorial, which contains a detailed presentation of the most important secondary
databases: PROSITE, eMOTIF, PRINTS, BLOCKS, Pfam and InterPro.
The information presented is supported by multiple web links, illustrative animations and practical exercises.
The tutorial is addressed to a wide variety of researchers (Master and PhD students,
post-docs, junior and senior researchers) from all Molecular Biology and
Bioinformatics departments, covering broad analysis areas such as:
• DNA analysis: DNA translation (chapter 1); similarity searches (chapter 2); multiple
alignments (chapter 4); restriction mapping (chapter 13); determination of gene structure
through intron/exon prediction (chapter 10); inference of protein coding sequence through
open reading frame (ORF) analysis (chapter 10).
• Protein analysis: retrieving protein sequences from databases (chapter 1); classifying
proteins into families (chapter 3); searching primary and secondary protein databases
(chapter 3); finding the best alignment between two or more proteins (chapter 4);
computing amino-acid composition, molecular weight, isoelectric point, and other
parameters (chapter 5); computing hydrophobicity/hydrophilicity profiles and locating
membrane-spanning segments (chapter 5); predicting elements of secondary structure
(chapter 5); visualizing the protein structure in 3D (chapter 6); predicting a protein 3D
structure from its sequence (chapters 7 and 8); finding evolutionary relationships between
proteins (chapter 12).
• Genome analysis: analysing genomic sequences; locating genes in a genome;
displaying genomes; parsing a eukaryotic genome sequence with GENSCAN (chapter 10), etc.
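As an illustration of two of the DNA analyses listed above (translation and ORF analysis), here is a minimal Python sketch. It is not taken from the tutorial, which relies on web tools. The function names `translate` and `find_orfs` are hypothetical, only the three forward reading frames are scanned, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_codons=1):
    """Return (start, end, protein) for each ATG...stop stretch.

    Only the three forward frames are scanned (an assumption of this sketch).
    Coordinates are 0-based, and `end` is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                protein = translate(dna[start:i])
                if len(protein) >= min_codons:
                    orfs.append((start, i + 3, protein))
                start = None  # look for the next ORF in this frame
    return orfs
```

For example, `translate("ATGGTGCACCTGACTCCTGAGGAGAAGTCT")` gives the first ten residues encoded by the human β-globin gene, and `find_orfs("CCATGAAATAGGG")` reports a single two-codon ORF in the third frame.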
The tutorial presents a wide variety of tools and websites for multiple types of
analysis: similarity search tools (BLAST, PSI-BLAST); protein family analysis
through database searches (PROSITE, eMOTIF, BLOCKS, PRINTS, Pfam); multiple
alignment tools (Clustal, DIALIGN, T-COFFEE, CINEMA, Jalview); physicochemical
parameters and profile prediction (ProtParam and ProtScale); transmembrane helix
prediction (MEMSAT, TMpred); secondary structure prediction (Jpred,
NNPREDICT); 3D prediction, comparison and visualisation (RasMol, QuickPDB, Cn3D); homology modelling (SWISS-MODEL, Geno3D); fold recognition (GenTHREADER,
3D-PSSM); phylogenetic analysis (PHYLIP); sequence retrieval (SRS), etc.
Figure 2. The tutorial presents the most important tools for multiple sequence alignment,
rich information about manual and automatic multiple alignment tools, exercises and links
to various software and alignment databases (chapter 4).
Figure 3. Chapter 5 of the tutorial presents tools for computing physicochemical
parameters (molecular weight, theoretical pI, amino acid composition, atomic composition,
extinction coefficient, hydropathy, chain flexibility, solvent-accessible surface area, etc.),
software tools for predicting the transmembrane topology of proteins, and some
secondary structure prediction software.
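Two of the chapter 5 computations, amino acid composition and a hydropathy profile, can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used by ProtParam or ProtScale. It assumes the Kyte-Doolittle hydropathy scale and a plain unweighted sliding window, and the function names and the default window size are hypothetical choices.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(protein):
    """Return the fraction of each residue type present in the sequence."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def hydropathy_profile(protein, window=9):
    """Average the hydropathy values over a sliding window.

    Returns one value per window position, i.e. len(protein) - window + 1
    values. Raises KeyError on non-standard residues.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

Stretches where the profile stays strongly positive over a window of about 19 residues are the usual candidates for membrane-spanning segments. Dedicated predictors such as those named in the tutorial use more elaborate methods.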
Figure 4. Chapter 6, “Fold classification”, gives a detailed presentation of the Protein Data Bank, the principal repository of biological
macromolecule structures, and of some structure classification resources (CATH, SCOP, EC->PDB).
It also covers the visualisation and comparison of protein 3D
structures with various molecular structure viewers: RasMol, QuickPDB, Deep View, Cn3D.
Figure 5. Different protein structure viewers presented in the tutorial, displaying
the ubiquitin-like signalling protein Nedd8 (PDB ID: 1NND): (A) Deep View, (B)
RasMol, (C) QuickPDB and (D) Cn3D. (A) illustrates classical ball-and-stick mode,
(B) cartoon mode, (C) a wireframe α-carbon trace, with a small section of the
structure highlighted in blue, and (D) a hybrid display with amino acid chains in
cartoon mode and non-amino acid atoms in space-filling mode.
Figure 6. In the “Sickle cell haemoglobin” case study chapter, users can compare
sickle cell and normal β globin sequences to reveal the nature of the sickle cell
mutation. The exercise integrates several database searches and multiple tools (SRS,
CLUSTALW, restriction mapping), as well as an advanced, script-driven RasMol session
to visualise the mutant haemoglobin and the interaction between the mutant β chains and
other amino acid side chains in the vicinity of the mutated Val6 residue.
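The comparison at the heart of this case study can be sketched in Python as follows. This is only an illustration of the idea, not the tutorial's SRS/CLUSTALW workflow. The sequences are the first codons of the β-globin coding region after the start codon and the matching mature-protein residues. The use of the DdeI recognition site (CTNAG) as the restriction-mapping example, and the helper names, are assumptions of this sketch.

```python
import re

# First 27 coding bases of beta-globin after the ATG start codon,
# and the corresponding N-terminal residues of the mature protein.
NORMAL_DNA = "GTGCACCTGACTCCTGAGGAGAAGTCT"
SICKLE_DNA = "GTGCACCTGACTCCTGTGGAGAAGTCT"
NORMAL_PROTEIN = "VHLTPEEKS"
SICKLE_PROTEIN = "VHLTPVEKS"

# DdeI recognises CTNAG, where N is any base.
DDEI_SITE = re.compile("CT[ACGT]AG")


def point_differences(a, b):
    """List (1-based position, residue in a, residue in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (no gaps handled)")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def count_sites(dna, pattern=DDEI_SITE):
    """Count non-overlapping matches of a restriction site pattern."""
    return len(pattern.findall(dna.upper()))


if __name__ == "__main__":
    print(point_differences(NORMAL_DNA, SICKLE_DNA))          # single A -> T change
    print(point_differences(NORMAL_PROTEIN, SICKLE_PROTEIN))  # Glu6 -> Val6
    print(count_sites(NORMAL_DNA), count_sites(SICKLE_DNA))   # site lost in mutant
```

The single base change turns codon 6 from GAG (Glu) into GTG (Val) and removes the restriction site that overlaps it, which is why a restriction map can distinguish the two alleles.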
Figure 7. The “Human Genome” case study chapter proposes a complex analysis that applies advanced
bioinformatics tools to a concrete research problem. Using a genomic fragment of human
chromosome 6, students are invited to find potential genes in this fragment with the GeneMark and
GENSCAN software. They can then compare the results and assess their reliability using GeneQuiz,
an integrated system for large-scale biological sequence analysis, and the current database annotation of the
Human Genome Project in Ensembl.
In the RasMol representation of the sickle cell haemoglobin (Figure 6), the two central mutant β chains are highlighted as white and orange wireframes.
The porphyrin prosthetic groups (blue) and the mutant Val6
residues (red) are represented as space-filling models. Highlighted in yellow are the side chains in the
vicinity of Val6 at the interface of the two haemoglobin molecules.
EMBER EMBnet teams: University of Manchester (United Kingdom), Swiss Institute of Bioinformatics (Switzerland), University of Nijmegen (The Netherlands), University of the Western Cape (South Africa), European Bioinformatics Institute
(United Kingdom), Instituto Gulbenkian de Ciência (Portugal), Université Libre de Bruxelles (Belgium), Canada Institute for Marine Biosciences (Canada), Research Institute for Genetic Engineering and Biotechnology (Turkey), Expert Center for
Taxonomic Identification (The Netherlands).
The project coordinator is Professor Terri Attwood from the University of Manchester. The principal authors include Ioannis Selimas from the Manchester group and Marc Brugman from the Expert Center for Taxonomic Identification.