MCGH Analyzer

Hans A. Kestler
André Müller
08-10-2004
Data processing steps
• Scan the DNA chips (normal and switched)
– 2 channels (Cy5 and Cy3)
• Compute the mean/median over the pixels
• Further processing with the MCGH software
MCGH software
• Background reduction
– calculate background-corrected intensities
• Quality control of the spots
– reject spots not fitting the quality criteria
• Accumulate spots to clones
• Check test
– reject clones not fitting the visual options
• Select control clones
• Reduce control clones
• Main calculation loop
Overview
Background reduction
Background reduction to get intensities
1. No reduction
2. Fixed reduction
3. Local reduction
4. Global reduction
5. Local + Fixed reduction
6. Global + Fixed reduction
Compute log ratios
• log( IntCy3 / IntCy5 )
• log( IntCy5 / IntCy3 )
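The six reduction modes and the two log ratios could be sketched as follows. This is only an illustration: the slides do not define the modes, so the reading used here (fixed = a constant offset, local = a per-spot background, global = a chip-wide background), the base-2 logarithm, and the names `reduce_background` and `log_ratio` are all assumptions.

```python
import math

def reduce_background(foreground, local_bg, global_bg, fixed, mode):
    """Return a background-corrected intensity for one spot and one channel.

    Assumed meaning of the modes (not stated on the slides):
    fixed = constant offset, local = per-spot background,
    global = chip-wide background.
    """
    if mode == "none":
        return foreground
    if mode == "fixed":
        return foreground - fixed
    if mode == "local":
        return foreground - local_bg
    if mode == "global":
        return foreground - global_bg
    if mode == "local+fixed":
        return foreground - local_bg - fixed
    if mode == "global+fixed":
        return foreground - global_bg - fixed
    raise ValueError(f"unknown mode: {mode}")

def log_ratio(int_cy3, int_cy5, switched=False):
    """log2(IntCy3 / IntCy5), or log2(IntCy5 / IntCy3) when switched.

    Returns None when an intensity is not positive, because the
    logarithm is undefined there. Base 2 is an assumption.
    """
    num, den = (int_cy5, int_cy3) if switched else (int_cy3, int_cy5)
    if num <= 0 or den <= 0:
        return None
    return math.log2(num / den)
```

Which of the two ratios belongs to the normal chip and which to the switched chip is not stated on the slide; the `switched` flag above simply selects between them.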
Quality control
Reject spots with
• flags marked by the scanning software (bad, not found, absent, normal ...)
• a background intensity brighter than the foreground (new!)
• Min/Max reduction:
– Reject the n smallest ratios
– Reject the n largest ratios
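A minimal sketch of these three rejection rules, under stated assumptions: each spot is a dictionary with the hypothetical keys `flag`, `fg`, `bg` and `ratio`, and the Min/Max reduction is applied to whatever list of spots is passed in (the slides do not say whether it acts per clone or per chip).

```python
def quality_control(spots, bad_flags, n_min=0, n_max=0):
    """Return the spots that survive the quality criteria, sorted by ratio.

    spots: list of dicts with keys 'flag', 'fg', 'bg', 'ratio'
           (a hypothetical layout chosen for this example).
    bad_flags: set of scanner flags that cause rejection.
    n_min / n_max: number of smallest / largest ratios to reject.
    """
    # Rules 1 and 2: drop flagged spots and spots whose background
    # is brighter than the foreground.
    kept = [s for s in spots
            if s["flag"] not in bad_flags and s["bg"] <= s["fg"]]
    # Rule 3: Min/Max reduction on the remaining ratios.
    kept.sort(key=lambda s: s["ratio"])
    upper = len(kept) - n_max
    return kept[n_min:upper] if upper > n_min else []
```

For example, calling `quality_control(spots, {"bad", "not found"}, n_min=1, n_max=1)` would drop flagged spots first and then trim one ratio from each end.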
Spots to clones
Accumulate the non-rejected spot values
• Mean
• Standard deviation
• Median
over
• Intensities (Cy3, Cy5)
• log ratios
New feature:
Reject clones with fewer than SpotLowerBound valid spots.
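The accumulation step might look like the sketch below for the log ratios of one clone. The function name `accumulate_clone` and the use of the sample standard deviation (n - 1 in the denominator) are assumptions; the slides name only the statistics and the SpotLowerBound rule.

```python
from statistics import mean, median, stdev

def accumulate_clone(values, spot_lower_bound):
    """Summarise the valid spot values of one clone.

    values: log ratios (or intensities) of the non-rejected spots.
    Returns None when the clone has fewer than spot_lower_bound
    valid spots, i.e. when the clone is rejected.
    """
    n = len(values)
    # A clone without any valid spot cannot be summarised at all.
    if n < max(spot_lower_bound, 1):
        return None
    return {
        "n": n,
        "mean": mean(values),
        "median": median(values),
        # The sample standard deviation needs at least two values.
        "sd": stdev(values) if n > 1 else 0.0,
    }
```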
Check test
Reject clones if at least one of these conditions holds:
1. Me(di)an background intensity > Background upper bound
2. Me(di)an Cy3 intensity < Me(di)an Cy3 background intensity × Intensity lower bound
3. Standard deviation Cy3 Ratio > Ratio SD upper bound
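The three conditions can be combined into one predicate, sketched here with hypothetical argument names. The per-clone summary values are assumed to have been computed beforehand as means or medians.

```python
def fails_check_test(bg, cy3, cy3_bg, ratio_sd,
                     background_upper_bound,
                     intensity_lower_bound,
                     ratio_sd_upper_bound):
    """Return True if the clone must be rejected by the check test.

    bg, cy3, cy3_bg: me(di)an background, Cy3 and Cy3 background
    intensities of the clone; ratio_sd: standard deviation of its ratios.
    """
    return (bg > background_upper_bound                  # condition 1
            or cy3 < cy3_bg * intensity_lower_bound      # condition 2
            or ratio_sd > ratio_sd_upper_bound)          # condition 3
```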
Select control clones
Only non-rejected clones will be selected as control clones.
• Manual selection
– Select clones with id = '91' or 'k' or 'K' or '?91' as control clones
• Automatic selection
– No [AutoBand]: [CutoffPercentage] clones from the middle band
– [AutoBand]: Select band around the median
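One possible reading of the three selection modes is sketched below. The slides do not define the "middle band" or the width of the band around the median, so the interpretation here (the central [CutoffPercentage] percent of clones ordered by ratio, and a hypothetical `half_width` parameter for [AutoBand]) is an assumption, as are all function names.

```python
from statistics import median

# Clone ids that mark control clones in manual selection (from the slide).
CONTROL_IDS = {"91", "k", "K", "?91"}

def select_manual(clones):
    """clones: list of (clone_id, ratio) pairs of non-rejected clones."""
    return [c for c in clones if c[0] in CONTROL_IDS]

def select_middle_percentage(clones, cutoff_percentage):
    """Keep the central cutoff_percentage percent of clones, ordered by ratio."""
    ordered = sorted(clones, key=lambda c: c[1])
    keep = round(len(ordered) * cutoff_percentage / 100)
    start = (len(ordered) - keep) // 2
    return ordered[start:start + keep]

def select_band(clones, half_width):
    """Keep clones whose ratio lies within half_width of the median ratio."""
    m = median(r for _, r in clones)
    return [c for c in clones if abs(c[1] - m) <= half_width]
```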
Reduce control clones
Some of the control clones will be rejected ...
• [Cutoff Percentage]
• Without [Cutoff Band]
– Reject the n smallest ratios
– Reject the n largest ratios
• [Cutoff Band]
– Reject band around the median
Main calculation loop
1. Calculate control means (the mean/median over all control clones/spots)
2. Normalize ratios (subtract the control mean from the ratio)
3. Calculate tolerance value T
$$T = t_{2n-2,\,1-\alpha} \sqrt{\frac{2 s^2}{n}}$$
s: standard deviation of the ratios of the observed clone
n: the number of valid spots in this clone
t: value of the t-statistic
α: significance level
4. [ Force T-Test ]
Reject clones with T > [ Force T Value ]
5. [ C Check ]
Replace tolerance values with possibly greater values.
6. Find the clone with maximum tolerance and reject it if its tolerance value T is > [ Force T Value ]
7. Perform [ T Test ] and evaluate the result value.
Everything has to be recalculated if a control clone is rejected.
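Steps 2 and 3 can be illustrated with a small sketch. It reads the tolerance as T = t · sqrt(2 s² / n), with t the quantile of the t-distribution with 2n - 2 degrees of freedom at level 1 - α. The critical value is passed in as `t_crit` rather than computed, so the example needs no statistics library; both function names are hypothetical.

```python
import math

def normalize(ratios, control_mean):
    """Step 2: subtract the control mean from every ratio."""
    return [r - control_mean for r in ratios]

def tolerance(s, n, t_crit):
    """Step 3: tolerance value T = t_crit * sqrt(2 * s^2 / n).

    s: standard deviation of the ratios of the observed clone
    n: number of valid spots in this clone
    t_crit: t quantile with 2n - 2 degrees of freedom at level 1 - alpha,
            supplied by the caller (an assumption of this sketch)
    """
    return t_crit * math.sqrt(2.0 * s * s / n)
```

With s = 0.5, n = 2 and a critical value of 2.0, for instance, the square root evaluates to 0.5 and the tolerance to 1.0.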
The C Check
The clone tolerance values are now recalculated according to the following scheme:
$m_c$: mean over control spot ratios
$s_c^2$: variance over control spot ratios
$n_c$: number of control spots
$m_r$: mean over ratios of the current clone
$s_r^2$: variance over ratios of the current clone
$n_r$: number of valid spots in this clone
$$T_{new} = t_{n_c+n_r-2,\,1-\alpha} \sqrt{\frac{(n_r-1)\,s_r^2 + (n_c-1)\,s_c^2}{n_c+n_r-2} \cdot \frac{n_c+n_r}{n_c\,n_r}} + \left| m_r - m_c \right|$$
If the new tolerance value is greater than the old one, T is replaced by the new value.
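A sketch of the C Check, assuming the recalculated tolerance is the t quantile times the pooled standard error of the difference between clone and control means, plus the absolute mean difference. The critical value `t_crit` (n_c + n_r - 2 degrees of freedom) is supplied by the caller, and the function names are hypothetical.

```python
import math

def c_check_tolerance(m_c, var_c, n_c, m_r, var_r, n_r, t_crit):
    """Recalculated tolerance T_new for one clone against the controls.

    m_c, var_c, n_c: mean, variance and count of the control spot ratios
    m_r, var_r, n_r: mean, variance and count of the clone's spot ratios
    """
    pooled_var = ((n_r - 1) * var_r + (n_c - 1) * var_c) / (n_c + n_r - 2)
    se = math.sqrt(pooled_var * (n_c + n_r) / (n_c * n_r))
    return t_crit * se + abs(m_r - m_c)

def updated_tolerance(t_old, t_new):
    """Keep the old tolerance unless the new one is greater."""
    return max(t_old, t_new)
```

When both groups have the same size n and the same variance s², the pooled term reduces to sqrt(2 s² / n), which matches the tolerance used in the main calculation loop.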
The T Test
If [ Force T ] is set, the value $\hat{T}$ will be set to the [ Force T Value ];
otherwise it is the greatest tolerance value found in the clones.
$m_c$: mean over control spot ratios
$s_c^2$: variance over control spot ratios
$n_c$: number of control spots
$m_r$: mean over ratios of the current clone
$s_r^2$: variance over ratios of the current clone
$n_r$: number of valid spots in this clone
$$z_1 = \frac{m_c - \hat{T} - m_r}{\sqrt{\dfrac{(n_r-1)\,s_r^2 + (n_c-1)\,s_c^2}{n_c+n_r-2} \cdot \dfrac{n_c+n_r}{n_c\,n_r}}}$$
$$z_2 = \frac{m_c + \hat{T} - m_r}{\sqrt{\dfrac{(n_r-1)\,s_r^2 + (n_c-1)\,s_c^2}{n_c+n_r-2} \cdot \dfrac{n_c+n_r}{n_c\,n_r}}}$$
The T Test (2)
Calculation of the result value R
• [ T Test ]
$$R = 0 \iff (m_c - m_r < \hat{T}) \wedge (m_c - m_r > -\hat{T}) \wedge (z_1 < -t_{n_c+n_r-2,\,1-\alpha}) \wedge (z_2 > t_{n_c+n_r-2,\,1-\alpha})$$
Otherwise:
$$R = 1 \iff m_c < m_r$$
$$R = -1 \iff m_c > m_r$$
• No [ T Test ]: thresholding
$$R = 0 \iff (m_c - m_r > t^-) \wedge (m_c - m_r < t^+)$$
$$R = 1 \iff m_c - m_r \le t^-$$
$$R = -1 \iff m_c - m_r \ge t^+$$
$t^-$: negative threshold value
$t^+$: positive threshold value
In this routine the test T > [ Force T Value ] will be performed repeatedly
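The result value can be sketched as two one-sided tests around the control mean, followed by a sign decision. Several points are assumptions rather than statements from the slides: that R = 1 marks a clone mean above the control mean and R = -1 one below it, that ties fall to -1, and that the critical value `t_crit` (n_c + n_r - 2 degrees of freedom) is supplied by the caller. All function names are hypothetical.

```python
import math

def pooled_se(var_c, n_c, var_r, n_r):
    """Pooled standard error of the difference between clone and control means."""
    pooled_var = ((n_r - 1) * var_r + (n_c - 1) * var_c) / (n_c + n_r - 2)
    return math.sqrt(pooled_var * (n_c + n_r) / (n_c * n_r))

def t_test_result(m_c, var_c, n_c, m_r, var_r, n_r, t_hat, t_crit):
    """Return 0 (balanced), 1 or -1 for one clone using the [ T Test ]."""
    se = pooled_se(var_c, n_c, var_r, n_r)
    if se == 0.0:
        # No spread at all: fall back to the plain tolerance comparison.
        balanced = abs(m_c - m_r) < t_hat
    else:
        z1 = (m_c - t_hat - m_r) / se
        z2 = (m_c + t_hat - m_r) / se
        # For t_crit >= 0 these two conditions already imply |m_c - m_r| < t_hat.
        balanced = z1 < -t_crit and z2 > t_crit
    if balanced:
        return 0
    return 1 if m_c < m_r else -1  # a tie falls to -1, an arbitrary choice

def threshold_result(m_c, m_r, t_neg, t_pos):
    """Return 0, 1 or -1 by plain thresholding (no [ T Test ])."""
    d = m_c - m_r
    if d <= t_neg:
        return 1
    if d >= t_pos:
        return -1
    return 0
```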
NCBI Clone Database
• Integration of the NCBI “component” database
• Automatic mapping of clone IDs to accession numbers, genomic clone locations, and clone status information according to an up-to-date database
• Direct import of the NCBI file format
Database-generated Information
• Accession-Number
• Start-Base
• End-Base
• Clone-State
Batch Processing
• One or more file pairs can be added to a session
• All computations are performed simultaneously on the included datasets
Diagram functions
• Ratio profiles of multiple clone sets can be shown in one diagram
Ideogram Browser 1
• Independent portable Java application
• Automation from MCGH-Analyzer with JNI
• Generation of ideogram drawings from the NCBI map database
• Direct representation of gain and loss markers of multiple clone sets
• Scalable and scrollable graphs
Ideogram Browser 2
Software Structure 1
• Excel as a convenient platform with a widely known user interface for
– Table representation
– Diagram drawing
– User interaction
• Windows DLL written in C++ for high performance, using COM automation
• Platform-independent Java application for visualizing ideograms (can be docked to the DLL via JNI)
Software Structure 2
Future Features
• Copy number estimation
– Global thresholds
– Adaptive (local) thresholds
• Wavelets
• Adaptive weights smoothing
• NCBI database online update
• Interface to the R platform