Integration II
Prediction
Kernel-based data integration
• SVMs and the kernel “trick”
• Multiple-kernel learning
• Applications
– Protein function prediction
– Clinical prognosis
SVMs
These are expression measurements from two genes for two populations (cancer types).
The goal is to define a cancer-type classifier...
One type of classifier is a "hyperplane" that separates measurements from the two cancer types.
E.g.: a one-dimensional hyperplane, or a two-dimensional hyperplane.
[Noble, Nat. Biotechnology, 2006]
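As a minimal sketch of this setup (assuming synthetic two-gene measurements and scikit-learn, neither of which come from the slides), a linear SVM fits exactly such a separating hyperplane:

# Minimal sketch: a linear SVM learns a separating hyperplane for
# two-gene expression measurements from two (synthetic) cancer types.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic expression values for gene1/gene2; labels 0/1 are the two cancer types.
X_type0 = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(30, 2))
X_type1 = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(30, 2))
X = np.vstack([X_type0, X_type1])
y = np.array([0] * 30 + [1] * 30)

clf = SVC(kernel="linear").fit(X, y)
# The hyperplane is w . x + b = 0 (a line in this two-gene space).
print("w =", clf.coef_[0], "b =", clf.intercept_[0])
print("predicted type for a new sample:", clf.predict([[3.0, 3.5]])[0])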
SVMs
Suppose that the measurements are separable: there exists a hyperplane that separates the two types.
Then there are an infinite number of separating hyperplanes.
Which one to use?
The maximum-margin hyperplane.
Equivalently: the minimizer of ||w||² subject to y_i (w · x_i + b) ≥ 1 for all samples i.
[Noble, Nat. Biotechnology, 2006]
SVMs
Which hyperplane to use?
In reality: the minimizer of a trade-off between
1. classification error (the loss term), and
2. margin size (the penalty term, ||w||²).
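A hedged illustration of this trade-off, assuming synthetic overlapping data and scikit-learn (not part of the slides): the SVC parameter C weights the classification error against the margin-size penalty.

# Sketch of the loss/margin trade-off: in scikit-learn's SVC the parameter C
# weights the classification error; small C favors a wider margin, large C
# favors fewer training errors. (Synthetic, slightly overlapping data assumed.)
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal([2, 2], 1.0, (50, 2)), rng.normal([4, 4], 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2.0 / np.linalg.norm(clf.coef_)   # width of the margin
    err = 1.0 - clf.score(X, y)                # training error
    print(f"C={C:>6}: margin width={margin:.2f}, training error={err:.2f}")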
SVMs
The primal problem:
  minimize (1/2)||w||² + C Σ_i ξ_i
  subject to y_i (w · x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0 for all i.
The dual problem:
  maximize Σ_i α_i − (1/2) Σ_{i,j} α_i α_j y_i y_j K(x_i, x_j)
  subject to 0 ≤ α_i ≤ C and Σ_i α_i y_i = 0.
SVMs
What is K?
The kernel matrix:
each entry is an inner product between two samples;
one interpretation: sample similarity.
For the SVM, the measurements are completely described by K.
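A small sketch of this point, assuming NumPy and scikit-learn: the Gram matrix of inner products is enough to train and apply the SVM via a precomputed kernel.

# Sketch: the kernel (Gram) matrix K holds sample inner products, and the SVM
# can be trained from K alone via a "precomputed" kernel.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 5))               # 20 samples, 5 measurements each
y = np.array([0] * 10 + [1] * 10)          # binary labels

K = X @ X.T                                # K[i, j] = <x_i, x_j>: one interpretation, similarity
clf = SVC(kernel="precomputed").fit(K, y)  # the original vectors are no longer needed

K_test = X[:5] @ X.T                       # similarities of 5 "new" samples to the training samples
print(clf.predict(K_test))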
SVMs
Implication:
non-linearity is obtained by appropriately defining the kernel matrix K.
E.g. a quadratic kernel such as K(x, z) = (x · z + 1)².
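To illustrate the kernel trick with a quadratic kernel of that form (the explicit feature map phi below is only for demonstration, not something one would compute in practice):

# Sketch: a quadratic kernel k(x, z) = (x.z + 1)^2 equals an ordinary inner
# product in an expanded quadratic feature space, so a linear separator on
# phi(x) is a non-linear separator on x. (Two-dimensional inputs assumed.)
import numpy as np

def quadratic_kernel(x, z):
    return (np.dot(x, z) + 1.0) ** 2

def phi(x):
    # Explicit quadratic feature map for 2-d input (x1, x2).
    x1, x2 = x
    return np.array([x1**2, x2**2,
                     np.sqrt(2) * x1 * x2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     1.0])

x = np.array([0.5, -1.0])
z = np.array([2.0, 0.3])
print(quadratic_kernel(x, z))      # kernel value computed directly
print(np.dot(phi(x), phi(z)))      # identical value via the explicit feature map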
SVMs
Another implication:
there is no need for measurement vectors; all that is required is the similarity between samples.
E.g. string kernels for comparing sequences.
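A toy sketch of a string kernel, here a simple k-mer "spectrum" kernel (an illustrative choice, not necessarily the kernel used in the slides): the SVM only needs pairwise similarities between sequences, never an explicit vector per protein.

# Toy 3-mer spectrum kernel: similarity = number of shared k-mers (counted with multiplicity).
from collections import Counter

def spectrum_kernel(s, t, k=3):
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(cs[kmer] * ct[kmer] for kmer in cs)

seqs = ["MKTAYIAKQR", "MKTAYLAKQR", "GAVLIPFYWS"]
K = [[spectrum_kernel(a, b) for b in seqs] for a in seqs]
for row in K:
    print(row)   # a kernel matrix usable with SVC(kernel="precomputed")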
Protein Structure Prediction
[Figure: protein sequence, sequence similarity, protein structure]
Kernel-based data fusion
Core idea: use different kernels for different genomic data sources.
A linear combination of kernel matrices is itself a kernel (provided the combination weights are non-negative).
Kernel-based data fusion
Kernel to use in prediction: K = Σ_m μ_m K_m, with weights μ_m ≥ 0.
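A minimal sketch of this fusion, assuming two hypothetical data sources and fixed weights (scikit-learn's precomputed-kernel SVC stands in for the SVM):

# Sketch: a non-negative linear combination of kernel matrices is itself a
# valid kernel, so kernels from different data sources can be fused before
# training a single SVM. (Hypothetical sources and fixed weights assumed.)
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X_expr = rng.normal(size=(40, 100))        # e.g. expression measurements
X_cnv = rng.normal(size=(40, 30))          # e.g. copy-number measurements
y = np.array([0] * 20 + [1] * 20)

K_expr = X_expr @ X_expr.T                 # one linear kernel per data source
K_cnv = X_cnv @ X_cnv.T

mu = np.array([0.7, 0.3])                  # non-negative combination weights
K_fused = mu[0] * K_expr + mu[1] * K_cnv   # still symmetric positive semi-definite

clf = SVC(kernel="precomputed").fit(K_fused, y)
print("training accuracy:", clf.score(K_fused, y))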
Kernel-based data fusion
In general, the task is to estimate the SVM function along with the coefficients of the kernel-matrix combination.
This is a well-studied type of optimization problem (a semi-definite program).
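The sketch below does not solve that semi-definite program; it substitutes a simple kernel-target-alignment heuristic (a hypothetical stand-in, not the method in the slides) purely to illustrate the idea of estimating the combination coefficients before training the SVM.

# Heuristic stand-in for weight estimation: score each source kernel by its
# alignment with the label outer product, then fuse with normalized weights.
import numpy as np
from sklearn.svm import SVC

def alignment(K, y):
    # Frobenius alignment between a kernel matrix and the label outer product.
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

rng = np.random.default_rng(4)
X1, X2 = rng.normal(size=(40, 50)), rng.normal(size=(40, 20))
y = np.array([-1] * 20 + [1] * 20)

kernels = [X1 @ X1.T, X2 @ X2.T]
mu = np.array([max(alignment(K, y), 0.0) for K in kernels])
mu = mu / mu.sum()                               # non-negative, normalized weights
K_fused = sum(m * K for m, K in zip(mu, kernels))

clf = SVC(kernel="precomputed").fit(K_fused, y)
print("weights:", mu, "training accuracy:", clf.score(K_fused, y))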
Kernel-based data fusion
Same idea applied to cancer classification from expression and proteomic data
Kernel-based data fusion
• Prostate cancer dataset
– 55 samples
– Expression from microarray
– Copy number variants
• Outcomes predicted:
– Grade, stage, metastasis, recurrence
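A hypothetical sketch of this evaluation setup (synthetic placeholder data, not the actual prostate cancer dataset; fusion weights fixed rather than learned): build one kernel per data source, fuse them, and cross-validate prediction of a binary clinical outcome.

# Hypothetical evaluation sketch: fused precomputed kernel + cross-validation.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n = 55                                       # sample count matching the slide
X_expr = rng.normal(size=(n, 200))           # stand-in expression profiles
X_cnv = rng.normal(size=(n, 80))             # stand-in copy-number profiles
y = np.array([0] * 28 + [1] * 27)            # e.g. recurrence vs. no recurrence

K_fused = 0.5 * (X_expr @ X_expr.T) + 0.5 * (X_cnv @ X_cnv.T)
clf = SVC(kernel="precomputed")
scores = cross_val_score(clf, K_fused, y, cv=5)   # kernel rows/columns are split per fold
print("cross-validated accuracy:", scores.mean())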