The 8 Queens Problem (and how to solve it)
Software Project
MassAnalyst
Roeland Luitwieler
Marnix Kammer
April 24, 2006
Overview
Introduction
System requirements
Our solution: Spectre
Progress so far
Conclusion
2
Introduction
Project initiator
Scientific background
The need for software tools
3
Project initiator
Dr. ir. Bas van Breukelen
Department of Biomolecular Mass Spectrometry
Utrecht University
WENT building
Expert in:
Bioinformatics
Proteomics
4
Scientific background:
Proteomics
Our body consists of cells
Cell function and structure are provided by proteins
Proteomics
Main research areas:
Identification of proteins
Interaction of proteins
Comparison of protein levels
5
Protein identification
How to identify proteins?
Identity defined by their structure
Protein structure
Protein: sequence of peptides
Peptide: sequence of amino acids
20 common types
Consist of different atoms – have different masses
Too small to see… but not to weigh
Mass Spectrometry!
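To illustrate why weighing is enough to tell amino acids apart, here is a minimal sketch that sums standard monoisotopic residue masses into a peptide mass. The function name `peptide_mass` is hypothetical, and the sketch assumes an unmodified, uncharged peptide.

```python
# Monoisotopic residue masses (in daltons) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water molecule accounts for the two free ends of the chain


def peptide_mass(sequence: str) -> float:
    """Return the monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
```

Note that leucine (L) and isoleucine (I) have the same mass, so mass alone cannot distinguish those two.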
6
Mass Spectrometry (MS)
Technique using a mass spectrometer
Input: sample of peptides
Proteins have been split chemically
Provides, among other things, better accuracy and efficiency
Most head / tail subsequences are present
Output: mass spectrum
Frequencies of particles of certain masses
Full peptide sequence can be derived
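The idea that a full sequence can be derived from head subsequences can be sketched as follows: if the masses of all head fragments are known, the gap between consecutive masses identifies one residue. This is a simplified illustration, not the method used by any particular tool. It ignores noise, charge and ion types, uses a small subset of residue masses, and the names `prefix_masses` and `read_sequence` are hypothetical.

```python
# Subset of monoisotopic residue masses (daltons), enough for the example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "V": 99.06841, "K": 128.09496,
}


def prefix_masses(sequence):
    """Cumulative residue mass of every head subsequence of the peptide."""
    total, ladder = 0.0, []
    for aa in sequence:
        total += RESIDUE_MASS[aa]
        ladder.append(total)
    return ladder


def read_sequence(ladder, tolerance=0.01):
    """Recover residues from the gaps between consecutive ladder masses.

    A gap that matches no single residue is reported as '?'.
    """
    residues, previous = [], 0.0
    for mass in sorted(ladder):
        gap = mass - previous
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - gap) <= tolerance]
        residues.append(matches[0] if len(matches) == 1 else "?")
        previous = mass
    return "".join(residues)


ladder = prefix_masses("GAVSK")
recovered = read_sequence(ladder)
```

When a fragment is missing from the ladder, the corresponding gap spans two residues and shows up as `?`, which is one reason real spectra are searched against a database instead.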
7
8
Mass Spectrometry (MS)
How does it work?
Ionize particles
Now particles have an electrical charge
Accelerate them in an electric field
Deflect them in a magnetic field
Deflection depends on mass (F = m a)
Measure how far they have been deflected
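The steps above can be turned into a small calculation for an idealized instrument. Assuming an ion of mass m and charge q is accelerated through a voltage V (q V = ½ m v²) and then bent by a uniform magnetic field B (q v B = m v² / r), the radius of its path is r = m v / (q B). The function name and parameters below are hypothetical, and real instruments differ in many details.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def deflection_radius(mass_da, charge, voltage, field):
    """Radius (metres) of an ion's circular path in a uniform magnetic field.

    mass_da: ion mass in daltons
    charge:  number of elementary charges on the ion
    voltage: accelerating voltage in volts
    field:   magnetic flux density in tesla
    """
    m = mass_da * DALTON
    q = charge * ELEMENTARY_CHARGE
    speed = math.sqrt(2 * q * voltage / m)  # from q V = 1/2 m v^2
    return m * speed / (q * field)          # from q v B = m v^2 / r
```

For the same charge, voltage and field, the radius grows with the square root of the mass, so heavier particles are deflected less.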
9
Mass Spectrometry (MS)
Improvements for better analysis (1)
Use chromatography
Spreads input over time: more details
Output: a sequence of MS spectra
10
Mass Spectrometry (MS)
Improvements for better analysis (2)
Use “recursive” mass spectrometry
Called MS/MS (or MS2 or tandem MS)
Take part of the sample that produces a peak
Usually concerns one certain peptide
Output: MS spectra with related MS/MS spectra
11
Mass Spectrometry (MS)
Improvements for better analysis (3)
Use bioinformatics
All output is translated to mzXML
A database is searched on MS/MS spectra
Input: raw MS data
Output: pepXML: peptide information
Tools are used to e.g. display the data
Lots of redundant / boring work is taken care of!
12
Bioinformatics: what can be done?
Remember the Proteomics research areas:
Identification of proteins
Interaction of proteins
Comparison of protein levels
Most research: vary one aspect at a time
Requires interactive display of data
Zooming, “stacking”, cross sections, etc.
But not just display of data
Filtering, “warping”, peak detection, etc.
13
Bioinformatics: existing tools
Tools exist, but…
Lots of different tools to do different things
Functionality not always as desired
They also lack functionality
Not easily extendable
Example: Pep3D
Nice visualization, but
Only one sample at a time, only a single view
Solution: develop new software
14
System requirements
Load raw spectrometry data
Visualize the data
Manipulate and analyze the data interactively
Export data
Extendibility
Use in open community
Open source
15
Loading data
mzXML: raw spectrometry data
MS spectra
Embedded MS/MS spectra
pepXML: database of matches with peptides
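As a rough sketch of what loading MS spectra with embedded MS/MS spectra might involve, the example below reads a heavily simplified mzXML-like fragment with Python's standard library. It assumes peaks stored as base64-encoded, big-endian 32-bit (m/z, intensity) pairs and omits the XML namespace and most attributes of real files. The helper names are hypothetical and this is not Spectre's loader.

```python
import base64
import struct
import xml.etree.ElementTree as ET


def encode_peaks(pairs):
    """Pack (m/z, intensity) pairs as big-endian 32-bit floats, base64-encoded."""
    flat = [value for pair in pairs for value in pair]
    return base64.b64encode(struct.pack(f">{len(flat)}f", *flat)).decode("ascii")


def decode_peaks(text):
    """Inverse of encode_peaks: return a list of (m/z, intensity) tuples."""
    raw = base64.b64decode(text)
    values = struct.unpack(f">{len(raw) // 4}f", raw)
    return list(zip(values[0::2], values[1::2]))


def read_scans(element, parent=None):
    """Yield one dict per scan; nested (MS/MS) scans remember their parent scan."""
    for scan in element.findall("scan"):
        number = int(scan.get("num"))
        yield {
            "num": number,
            "ms_level": int(scan.get("msLevel")),
            "parent": parent,
            "peaks": decode_peaks(scan.findtext("peaks", default="")),
        }
        yield from read_scans(scan, parent=number)


# A tiny hand-made document standing in for a real mzXML file.
ms1 = encode_peaks([(400.5, 1000.0), (500.25, 250.0)])
ms2 = encode_peaks([(175.0, 80.0)])
SAMPLE = f"""<msRun>
  <scan num="1" msLevel="1" retentionTime="PT60S">
    <peaks precision="32" byteOrder="network" pairOrder="m/z-int">{ms1}</peaks>
    <scan num="2" msLevel="2">
      <precursorMz>400.5</precursorMz>
      <peaks precision="32" byteOrder="network" pairOrder="m/z-int">{ms2}</peaks>
    </scan>
  </scan>
</msRun>"""

scans = list(read_scans(ET.fromstring(SAMPLE)))
```

Keeping the parent scan number on each MS/MS scan preserves the link between an MS peak and the spectrum derived from it.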
16
Visualizing the data
List of loaded samples
MS spectrum
Cross sections of the MS spectrum
MS/MS spectra
Peptide information
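One plausible reading of a cross section, assumed here for illustration, is the intensity over time within a narrow m/z window of a chromatography run. The sketch below represents a run as a list of (retention time, peaks) pairs; the function name `cross_section` and this data layout are hypothetical.

```python
def cross_section(spectra, mz_center, mz_width):
    """Summed intensity per spectrum inside an m/z window.

    spectra: list of (retention_time, [(mz, intensity), ...]) in time order
    Returns a list of (retention_time, summed_intensity).
    """
    low = mz_center - mz_width / 2
    high = mz_center + mz_width / 2
    return [
        (time, sum(intensity for mz, intensity in peaks if low <= mz <= high))
        for time, peaks in spectra
    ]


run = [
    (60.0, [(400.5, 10.0), (500.0, 5.0)]),
    (61.0, [(400.6, 30.0)]),
    (62.0, [(500.0, 7.0)]),
]
trace = cross_section(run, mz_center=400.5, mz_width=0.5)
```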
17
Manipulating and analyzing the data
Stacking: toggle samples on/off
Warping
Zooming
Peak detection
More analysis, like ratio calculation
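Peak detection and ratio calculation can be illustrated with a deliberately simple sketch: a peak is taken to be a local maximum at or above a threshold, and a ratio compares the heights of two peaks. The slides do not describe the algorithm actually planned for Spectre, so the function names and the criterion below are assumptions.

```python
def detect_peaks(intensities, threshold):
    """Indices of points higher than both neighbours and at least `threshold`."""
    return [
        i
        for i in range(1, len(intensities) - 1)
        if intensities[i] >= threshold
        and intensities[i] > intensities[i - 1]
        and intensities[i] > intensities[i + 1]
    ]


def peak_ratio(intensities, first, second):
    """Ratio of the intensities at two peak indices (a 'peak pair')."""
    return intensities[first] / intensities[second]


signal = [0, 2, 9, 3, 1, 6, 12, 4, 0]
peaks = detect_peaks(signal, threshold=5)
ratio = peak_ratio(signal, peaks[0], peaks[1])
```

Real data would usually be smoothed first, and peak area is often preferred over peak height.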
18
Export data
Lists of peak pairs
Modified pepXML (i.e. with ratios)
Images of spectra
Modified samples
19
Our solution: Spectre
20
Opening a workspace
21
Loading a workspace
22
After the workspace is loaded
23
Working in normal mode
24
Zoom on selection mode
25
After zooming in
26
Zoom on click mode
27
After zooming in
28
The structure of Spectre
Graph: MS spectra, cross sections, MS/MS spectra
Workspace: a collection of samples and settings
Sample: internal data structure for one sample
GUI: the user interface
Processor: the main link between parts of the program
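The parts listed above could be sketched as classes along the following lines. The slides do not state Spectre's implementation language or any method names, so everything here (Python, the field names, `open_workspace`, `visible_samples`) is a hypothetical illustration of the division of responsibilities.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Internal data for one sample."""
    name: str
    spectra: list = field(default_factory=list)
    visible: bool = True  # lets the user toggle a sample on or off


@dataclass
class Workspace:
    """A collection of samples and settings."""
    samples: list = field(default_factory=list)
    settings: dict = field(default_factory=dict)


class Processor:
    """Link between the user interface, the workspaces and the graphs."""

    def __init__(self):
        self.workspaces = []

    def open_workspace(self) -> Workspace:
        workspace = Workspace()
        self.workspaces.append(workspace)
        return workspace

    def visible_samples(self, workspace: Workspace) -> list:
        """Samples that a graph should currently draw."""
        return [sample for sample in workspace.samples if sample.visible]


processor = Processor()
workspace = processor.open_workspace()
workspace.samples.append(Sample("control"))
workspace.samples.append(Sample("treated", visible=False))
shown = [sample.name for sample in processor.visible_samples(workspace)]
```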
29
The structure of Spectre
[Class diagram relating GUI, Processor, Workspace, Graph and Sample; multiplicities shown: 1, 1, *, 4, *]
30
Systematic approach to the problem
Phased development
Three versions
Lots of diagrams
Application of courses MSO, PM
HCI team and data layer team
Later on: data visualization team
Extreme Programming
31
Progress so far
First version is due in week 18
Functionality:
Loading raw data
Visualization and user interface
Basic interaction with zooming etc.
Complete internal data structures
Export of images
Missing link between mzXML and pepXML!
32
Further planning
Version 2 – week 23
Warping
Peak detection / analysis
Export of calculated data
Version 3 – week 27
Ratio calculation
Modification of samples
33
After completion of the project
Web site
Open source
Further maintenance
Extendable
34
Conclusion
Spectre: a modular and extendable program
A combination of many different requirements
Phased addition of features
Any questions?
35
The data structure
[Class diagram: Sample, MzTable, SampleParser, SampleWriter, MzNode, MzParser, PepParser, …; multiplicities shown: 1, *]
36