The 8 Queens Problem (and how to solve it)


Software Project
MassAnalyst
Roeland Luitwieler
Marnix Kammer
April 24, 2006
Overview
Introduction
System requirements
Our solution: Spectre
Progress so far
Conclusion
2
Introduction
Project initiator
Scientific background
The need for software tools
3
Project initiator
Dr. ir. Bas van Breukelen
Department of Biomolecular Mass Spectrometry
Utrecht University
WENT building

Expert in:
Bioinformatics
Proteomics

4
Scientific background:
Proteomics
Our body consists of cells
Cell function and structure are provided by proteins
Proteomics
Main research areas:
Identification of proteins
Interaction of proteins
Comparison of protein levels
5
Protein identification
How to identify proteins?
Identity defined by their structure
Protein structure
Protein: sequence of peptides
Peptide: sequence of amino acids
Amino acids: 20 common types
Amino acids consist of different atoms, so they have different masses
Too small to see… but not to weigh
Mass Spectrometry!
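As a rough illustration of why weighing works: a peptide's mass is the sum of the masses of its amino acid residues plus one water molecule. The sketch below uses the standard monoisotopic residue masses; the function name peptide_mass is a hypothetical helper, not something from the project.

```python
# Monoisotopic residue masses in daltons (standard reference values,
# rounded to five decimals). One-letter amino acid codes.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free chain ends


def peptide_mass(sequence):
    """Return the neutral monoisotopic mass of a peptide (hypothetical helper)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


if __name__ == "__main__":
    print(round(peptide_mass("GA"), 5))
```

Note that leucine (L) and isoleucine (I) have the same mass, so mass alone cannot tell those two apart.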
6
Mass Spectrometry (MS)
Technique using a mass spectrometer
Input: sample of peptides
Proteins have been split chemically
Splitting provides, among other things, more accuracy and efficiency
Most head / tail subsequences are present
Output: mass spectrum
Frequencies of particles of certain masses
Full peptide sequence can be derived
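A minimal sketch of how a sequence could be read off when the head subsequences are present: the mass difference between consecutive head fragments equals the mass of one amino acid residue. This is an idealized assumption (no noise, no missing fragments, no charge or ion-type offsets), the mass table is deliberately limited to five residues, and the helper names are hypothetical.

```python
# Small residue mass table (monoisotopic, daltons); a subset for illustration.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "V": 99.06841, "K": 128.09496,
}


def prefix_masses(sequence):
    """Masses of all head subsequences: first residue, first two, and so on."""
    masses, total = [], 0.0
    for aa in sequence:
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses


def read_sequence(masses, tolerance=0.01):
    """Recover the sequence from the gaps between sorted head-fragment masses."""
    sequence, previous = [], 0.0
    for mass in sorted(masses):
        delta = mass - previous
        match = next(
            (aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tolerance),
            None,
        )
        if match is None:
            raise ValueError(f"no residue matches mass difference {delta:.4f}")
        sequence.append(match)
        previous = mass
    return "".join(sequence)


if __name__ == "__main__":
    print(read_sequence(prefix_masses("GAVSK")))
```

Real spectra are much messier, which is why database search is used in practice rather than reading gaps directly.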
7
8
Mass Spectrometry (MS)
How does it work?
Ionize particles
Now particles have an electrical charge
Accelerate them in an electric field
Deflect them in a magnetic field
Deflection depends on mass (F = m a)
Measure how far they have been deflected
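The dependence of deflection on mass can be worked out for an idealized magnetic-sector instrument. The slide only states F = m a; the symbols below (charge q, accelerating voltage U, field strength B, speed v, path radius r) are assumptions introduced for this sketch.

```latex
\begin{align}
qU &= \tfrac{1}{2} m v^2
  && \text{(energy gained in the electric field)} \\
q v B &= \frac{m v^2}{r}
  && \text{(magnetic force supplies the centripetal force, } F = m a\text{)} \\
r &= \frac{m v}{q B} = \frac{1}{B}\sqrt{\frac{2 m U}{q}} \\
\frac{m}{q} &= \frac{B^2 r^2}{2U}
\end{align}
```

So for fixed U and B, heavier particles of the same charge follow a wider curve, and measuring r gives the mass-to-charge ratio.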
9
Mass Spectrometry (MS)
Improvements for better analysis (1)

Use chromatography
Spreads input over time: more details
Output: a sequence of MS spectra

10
Mass Spectrometry (MS)
Improvements for better analysis (2)

Use “recursive” mass spectrometry
Called MS/MS (or MS2 or tandem MS)
Take the part of the sample that produces a peak and run MS on it again
Usually corresponds to one particular peptide
Output: MS spectra with related MS/MS spectra
11
Mass Spectrometry (MS)
Improvements for better analysis (3)

Use bioinformatics
All output is translated to mzXML
A database is searched using the MS/MS spectra
Input: raw MS data
Output: pepXML (peptide information)
Tools are used to, e.g., display the data
Lots of repetitive / boring work is taken care of!
12
Bioinformatics: what can be done?
Remember the proteomics research areas:
Identification of proteins
Interaction of proteins
Comparison of protein levels
Most research: vary one aspect at a time
Requires interactive display of data
Zooming, “stacking”, cross sections, etc.
But not just display of data
Filtering, “warping”, peak detection, etc.
13
Bioinformatics: existing tools
Tools exist, but…
Lots of different tools to do different things
Functionality not always as desired
They also lack functionality
Not easily extendable
Example: Pep3D
Nice visualization, but
Only one sample at a time, only a single view
Solution: develop new software
14
System requirements
Load raw spectrometry data
Visualize the data
Manipulate and analyze the data interactively
Export data
Extensibility
Use in open community
Open source
15
Loading data
mzXML: raw spectrometry data
MS spectra
Embedded MS/MS spectra
pepXML: database of matches with peptides
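A minimal sketch of reading MS scans and their embedded MS/MS scans from mzXML-like input, using Python's standard xml.etree.ElementTree. The inline document is a simplified, hypothetical fragment: real mzXML files declare an XML namespace and carry base64-encoded peak lists, both omitted here. The function name load_scans and the returned dictionary layout are assumptions.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an mzXML file (no namespace, no peak data).
SAMPLE = """<mzXML>
  <msRun>
    <scan num="1" msLevel="1" retentionTime="PT10.0S">
      <scan num="2" msLevel="2" retentionTime="PT10.5S">
        <precursorMz>445.12</precursorMz>
      </scan>
    </scan>
    <scan num="3" msLevel="1" retentionTime="PT12.0S"/>
  </msRun>
</mzXML>"""


def load_scans(xml_text):
    """Return each MS scan with the MS/MS scans nested directly inside it."""
    root = ET.fromstring(xml_text)
    result = []
    for scan in root.iter("scan"):
        if scan.get("msLevel") != "1":
            continue  # nested MS/MS scans are collected via their parent
        msms = [
            {
                "num": int(child.get("num")),
                "precursor_mz": float(child.findtext("precursorMz")),
            }
            for child in scan.findall("scan")
        ]
        result.append({"num": int(scan.get("num")), "msms": msms})
    return result


if __name__ == "__main__":
    for entry in load_scans(SAMPLE):
        print(entry)
```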
16
Visualizing the data
List of loaded samples
MS spectrum
Cross sections of the MS spectrum
MS/MS spectra
Peptide information
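One possible reading of a cross section, sketched under assumptions: with chromatography the data is a sequence of spectra over time, so fixing an m/z value and collecting the intensity near it in every spectrum gives an intensity-over-time trace. The data layout (a list of retention time and peak-list pairs) and the function name are hypothetical.

```python
def cross_section_at_mz(run, target_mz, tolerance=0.5):
    """Sum the intensity within `tolerance` of `target_mz` in every spectrum.

    `run` is a list of (retention_time, peaks) pairs, where `peaks` is a
    list of (mz, intensity) pairs. Returns (retention_time, intensity) pairs.
    """
    trace = []
    for retention_time, peaks in run:
        total = sum(
            (intensity for mz, intensity in peaks
             if abs(mz - target_mz) <= tolerance),
            0.0,
        )
        trace.append((retention_time, total))
    return trace


if __name__ == "__main__":
    run = [
        (10.0, [(400.0, 5.0), (445.1, 20.0)]),
        (12.0, [(445.2, 35.0), (600.0, 1.0)]),
        (14.0, [(400.1, 7.0)]),
    ]
    print(cross_section_at_mz(run, 445.0))
```

The complementary cross section, fixing a time and looking along m/z, is simply one spectrum from the sequence.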
17
Manipulating and analyzing the data
Stacking: toggle samples on/off
Warping
Zooming
Peak detection
More analysis, like ratio calculation
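Peak detection can be done in many ways; the slides do not say which method is planned. As a minimal sketch, assuming a spectrum is just a list of intensities, a peak can be taken to be a local maximum above a noise threshold. The function name and the threshold parameter are hypothetical.

```python
def detect_peaks(intensities, threshold=0.0):
    """Return indices of local maxima whose intensity exceeds `threshold`.

    A point counts as a peak if it is strictly higher than its left
    neighbour and at least as high as its right neighbour, so a flat top
    is reported once, at its left edge.
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        value = intensities[i]
        if (value > threshold
                and value > intensities[i - 1]
                and value >= intensities[i + 1]):
            peaks.append(i)
    return peaks


if __name__ == "__main__":
    spectrum = [0, 2, 9, 3, 1, 4, 8, 8, 2, 0]
    print(detect_peaks(spectrum, threshold=3))
```

A practical implementation would likely smooth the signal first and work on the two-dimensional time and m/z data rather than on a single spectrum.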
18
Export data
Lists of peak pairs
Modified pepXML (i.e. with ratios)
Images of spectra
Modified samples
19
Our solution: Spectre
20
Opening a workspace
21
Loading a workspace
22
After the workspace is loaded
23
Working in normal mode
24
Zoom on selection mode
25
After zooming in
26
Zoom on click mode
27
After zooming in
28
The structure of Spectre
Graph: MS spectra, cross sections, MS/MS spectra
Workspace: a collection of samples and settings
Sample: internal data structure for one sample
GUI: the user interface
Processor: the main link between parts of the program
29
The structure of Spectre
[Class diagram relating GUI, Processor, Workspace, Graph and Sample; multiplicity labels as extracted: 1, 1, *, 4, *]
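To make the roles of these parts concrete, here is a minimal sketch with the Processor acting as the link between the GUI, the graphs and the workspace. Only the five class names come from the slides; every field, method and relationship detail below is an assumption for illustration, not the actual Spectre design.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    name: str
    visible: bool = True  # assumed flag for "stacking" (toggle on/off)


@dataclass
class Workspace:
    samples: list = field(default_factory=list)   # any number of samples
    settings: dict = field(default_factory=dict)


@dataclass
class Graph:
    kind: str  # e.g. "MS spectrum", "cross section", "MS/MS spectrum"

    def render(self, workspace):
        # Placeholder: a real graph would draw; here we list what is shown.
        return [s.name for s in workspace.samples if s.visible]


class Processor:
    """Assumed mediator: the GUI talks to it, it updates data and graphs."""

    def __init__(self, workspace, graphs):
        self.workspace = workspace
        self.graphs = graphs

    def toggle_sample(self, name):
        for sample in self.workspace.samples:
            if sample.name == name:
                sample.visible = not sample.visible
        return {g.kind: g.render(self.workspace) for g in self.graphs}


class GUI:
    def __init__(self, processor):
        self.processor = processor  # one GUI linked to one Processor


if __name__ == "__main__":
    workspace = Workspace(samples=[Sample("control"), Sample("treated")])
    gui = GUI(Processor(workspace, [Graph("MS spectrum")]))
    print(gui.processor.toggle_sample("control"))
```

Routing every change through one object keeps the GUI and the data structures unaware of each other, which fits the stated goal of a modular, extendable program.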
30
Systematic approach to the problem
Phased development
Three versions
Lots of diagrams
Applying material from the MSO and PM courses
HCI team and data layer team
Later on: data visualization team
Extreme Programming
31
Progress so far
First version is due in week 18
Functionality:
Loading raw data
Visualization and user interface
Basic interaction with zooming etc.
Complete internal data structures
Export of images
Missing link between mzXML and pepXML!
32
Further planning
Version 2 – week 23
Warping
Peak detection / analysis
Export of calculated data

Version 3 – week 27
Ratio calculation
Modification of samples

33
After completion of the project
Web site
Open source
Further maintenance
Extendable

34
Conclusion
Spectre: a modular and extendable program
A combination of many different requirements
Phased addition of features
Any questions?
35
The data structure
[Class diagram relating Sample, MzTable, SampleParser, SampleWriter, MzNode, MzParser, PepParser, …; multiplicity labels as extracted: 1, *]
36