Time Series Modelling: Water Quality Applications
The challenge of bioinformatics
Chris Glasbey
Biomathematics & Statistics Scotland
Talk plan
1. DNA
2. mRNA
3. Protein
4. Genetic networks
1. DNA
Frank Wright et al
BioSS
1. DNA
TOPALi
2. mRNA
Prepare cDNA targets
Label with fluorescent dyes
Combine equal amounts
Hybridise for 5-12 hours
Scanning
2. mRNA
• The scanner’s PMT setting is one of the sources of contamination.
• The scanner’s setting has to be raised to a certain level to make the weakly expressed genes visible.
• This may cause highly expressed genes to have censored expression values (at 2^16 − 1 = 65535).
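To make the censoring concrete, here is a minimal Python sketch of a 16-bit scanner that clips recorded intensities at 2^16 − 1 = 65535. The gains and spot signals are made-up numbers for illustration, not data from the talk, and the function name `scan` is hypothetical.

```python
# Illustrative only: the signals and gains below are hypothetical values.
SATURATION = 2**16 - 1  # largest value a 16-bit scanner can record (65535)


def scan(true_signal, gain):
    """Return (recorded intensity, censored flag) for one spot."""
    raw = true_signal * gain
    recorded = min(round(raw), SATURATION)  # readings above the cap are clipped
    return recorded, raw > SATURATION


spots = {"weak gene": 40, "medium gene": 2_000, "strong gene": 30_000}

for gain in (1.0, 4.0):
    print(f"gain = {gain}")
    for name, signal in spots.items():
        value, censored = scan(signal, gain)
        flag = " (censored)" if censored else ""
        print(f"  {name}: {value}{flag}")
```

Raising the gain lifts the weak spot well above zero, but the strong spot then hits the cap and its true value is lost.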
2. mRNA
[Figure: censored spot (intensity scale 0 to 65535) and imputed values]
With GTI (Edinburgh)
2. mRNA
Multiple scans
[Figure: scans 1 to 4 intensity data plotted against scan-1 intensity data]
[Figure: observed pixel mean / beta plotted against estimated gene expression, for scans 1 to 4]
Mizan Khondoker
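One way to see why multiple scans help is the sketch below. It assumes a deliberately simple model: each recorded intensity is the gene expression multiplied by a scan factor beta, capped at 65535. That model is only suggested by the axis label "Observed pixel mean / beta"; it is a simplified stand-in, not the estimation method used in the talk. The data are noise-free, made-up numbers, and the function names are hypothetical.

```python
SATURATION = 2**16 - 1  # 16-bit cap (65535)


def estimate_betas(scans):
    """Scale of each scan relative to scan 1, using spots unsaturated in both.

    Assumes at least one spot is unsaturated in both scans.
    """
    ref = scans[0]
    betas = []
    for scan in scans:
        pairs = [(r, s) for r, s in zip(ref, scan)
                 if r < SATURATION and s < SATURATION]
        # Least-squares slope through the origin: s is roughly beta * r
        betas.append(sum(r * s for r, s in pairs) / sum(r * r for r, _ in pairs))
    return betas


def estimate_expression(scans, betas):
    """Average the rescaled intensities of each spot, skipping censored readings."""
    estimates = []
    for readings in zip(*scans):
        usable = [y / b for y, b in zip(readings, betas) if y < SATURATION]
        estimates.append(sum(usable) / len(usable) if usable else None)
    return estimates


# Hypothetical data: 4 spots, 4 scans at increasing scanner settings.
# The last spot saturates in scans 3 and 4.
scans = [
    [500, 5000, 20000, 40000],
    [750, 7500, 30000, 60000],
    [1000, 10000, 40000, SATURATION],
    [1500, 15000, 60000, SATURATION],
]

betas = estimate_betas(scans)
estimates = estimate_expression(scans, betas)
print("betas:", betas)
print("estimated expression:", estimates)
```

The weak spots benefit from the higher settings, while the strongest spot is estimated only from the scans in which it is not censored.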
2. mRNA
Jim McNicol
3. Proteins
Electrophoresis gel
Lars Pedersen
DTU, Denmark
3. Proteins
Protein separation by
1. pH
2. Mol. Wt.
3. Proteins
How to compare gels 1 and 2?
[Figure: gel 1 and gel 2]
3. Proteins
WARP
John Gustafsson, Chalmers University, Sweden
3. Proteins
Two gels superimposed (in different colours)
3. Proteins
Statistical Design
3 complete reps of 15 treatment combinations (3 ecotypes by 5 heavy metals).
Maximum of 1400 protein spots per gel
Statistical Analyses
Filter data – remove spots with low intensity values and low quality scores (leaving ~290 spots)
Individual proteins – ANOVA, main effects and interactions
[Figure: plot with a log-scale vertical axis from 1E-16 to 1 and a horizontal axis from 1 to 276]
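The per-protein ANOVA step can be sketched for a single spot as follows. This is a minimal illustration of a balanced two-factor ANOVA with interaction, matching the stated design (3 ecotypes by 5 heavy metals, 3 reps). The intensities are simulated, and the function name `two_way_anova` and the effect sizes are assumptions, not taken from the talk.

```python
import random
from statistics import mean


def two_way_anova(y):
    """F statistics for a balanced two-factor design with interaction.

    y[i][j] is the list of replicate values for level i of factor A
    and level j of factor B (same number of replicates in every cell).
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    grand = mean(v for row in y for cell in row for v in cell)
    cell_mean = [[mean(cell) for cell in row] for row in y]
    a_mean = [mean(cell_mean[i]) for i in range(a)]
    b_mean = [mean(cell_mean[i][j] for i in range(a)) for j in range(b)]

    # Sums of squares for main effects, interaction and residual
    ss_a = b * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum((cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_e = sum((v - cell_mean[i][j]) ** 2
               for i in range(a) for j in range(b) for v in y[i][j])

    ms_e = ss_e / (a * b * (n - 1))  # residual mean square
    return {
        "A": (ss_a / (a - 1)) / ms_e,
        "B": (ss_b / (b - 1)) / ms_e,
        "A:B": (ss_ab / ((a - 1) * (b - 1))) / ms_e,
    }


# Simulated spot: A = ecotype (3 levels), B = heavy metal (5 levels), 3 reps.
# Only an ecotype effect is built in; this is hypothetical data.
random.seed(0)
y = [[[10 + 2 * i + random.gauss(0, 0.5) for _ in range(3)]
      for j in range(5)] for i in range(3)]

f_stats = two_way_anova(y)
for effect, f in f_stats.items():
    print(f"{effect}: F = {f:.2f}")
```

A p-value for each effect would then come from the upper tail of the F distribution with the matching degrees of freedom, and the calculation would be repeated for every retained spot.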
3. Proteins
Principal Components Analysis
Identify groups of proteins that are affected in a consistent manner by treatments
[Figure: loadings plotted against protein identity]
Jim McNicol
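The idea behind the loadings plot can be illustrated with a small sketch: proteins that respond to treatments in the same way receive loadings of similar sign and size on the first principal component. The code below uses plain power iteration on the covariance matrix only to stay dependency-free; in practice a linear algebra library would be used. The data matrix and the function name `first_pc_loadings` are hypothetical.

```python
import math


def first_pc_loadings(rows, iterations=200):
    """Loadings of the first principal component via power iteration.

    rows[s][k] is the intensity of protein k in sample s.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    centred = [[r[k] - means[k] for k in range(p)] for r in rows]
    # Sample covariance matrix between proteins (columns)
    cov = [[sum(c[j] * c[k] for c in centred) / (n - 1) for k in range(p)]
           for j in range(p)]
    # Repeatedly multiply by the covariance matrix and renormalise.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical data: 5 samples by 3 proteins. Proteins 1 and 2 rise
# together across samples; protein 3 barely changes.
data = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.8, 0.5],
    [5.0, 10.1, 0.5],
]

loadings = first_pc_loadings(data)
print([round(x, 3) for x in loadings])
```

Proteins 1 and 2 get large loadings of the same sign and protein 3 a loading near zero, which is the kind of grouping the loadings plot is meant to reveal.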
4. Genetic networks
Is it possible to infer the network from gene expression data such as these?
Dirk Husmeier
4. Genetic networks
Bayesian network
4. Genetic networks
[Figure panels: truth, inferred]
“I genuinely believe that we are living through the greatest intellectual moment in human history.” (Matt Ridley, Genome, 1999)
“Grand Unified Systems Biology”