Supervised learning for medical imaging analysis and diagnosis

Supervised learning for medical
imaging analysis and diagnosis:
segmentation and detection in 3D
Le Lu
Siemens Corporate Research
Computer-Aided Medical Imaging
Diagnosis

Ultimate Goal

Semantic understanding of functions of human
body via medical imaging modalities

Quantitative measurement and diagnosis for more
accurate, better-performing healthcare

“Human-machine” collaborative system; CAD as a
second-reader
Computer-Aided Medical Imaging
Diagnosis

“Historical” heuristic approach
“Natural”
Mainstream, useful, can be limited …

Statistical learning approach
Supervised (discriminative boosting, SVM, …)
Generative (density model, …)
Hybrid, …

Exploit learned anatomical domain knowledge
Two samples of work

Representation + Computation

Accurate Polyp Segmentation for 3D CT Colonography Using
Multi-Staged Probabilistic Binary Learning and Compositional
Model [1], Le Lu, et al., CVPR'2008: IEEE Conf. on Computer
Vision and Pattern Recognition, June 2008, Anchorage, USA.

Simultaneous Detection and Registration for Ileo-Cecal Valve
Detection in 3D CT Colonography [2], Le Lu, et al., ECCV'2008:
European Conf. on Computer Vision, October 2008, Marseille,
France.

[1] Clinic talk at New Era of Virtual Colonoscopy meeting at MICCAI’08
[2] Clinic evaluation and talk at RSNA’07
Previous work



J. Yao, M. Miller, M. Franaszek and R. Summers, Colonic polyp
segmentation in CT Colonography-based on fuzzy clustering and
deformable models, IEEE Trans. on Medical Imaging, 23(11):1344-1352, 2004.

A. Jerebko, S. Lakare, P. Cathier, S. Periaswamy, L. Bogoni,
Symmetric Curvature Patterns for Colonic Polyp Detection, MICCAI
(2) 2006: 169-176.

R. Summers, J. Yao, C. Johnson, CT Colonography with Computer-Aided
Detection: Automated Recognition of Ileo-cecal Valve to
Reduce Number of False-Positive Detections, Radiology, 233:266-272, 2004.
Building blocks

Learner: Probabilistic Boosting Tree


Z. Tu, Probabilistic boosting-tree: Learning discriminative
methods for classification, recognition, and clustering, Int’l Conf.
Computer Vision, 2005.
Features:


Multiscale steerable features: axis-pattern, box-pattern in 3D (ICCV’07, CVPR’08, ECCV’08)
Curve-parsing features: boundary (or bi-partition)
learning in 1D (CVPR’08)
Colon CAD system
What’s a polyp (in textbook)?

Copyright by ….
Polyps in 3D/2D pictures
Our polyp segmentation system




makes use of a three-stage binary classification
framework and a hierarchical, compositional shape
representation
integrates low- and mid-level contextual information
for discriminative learning
shows a superior polyp segmentation reliability rate of
98.2% (i.e., errors ≤ 3 mm), compared with about
75% to 80% for previous work
remains robust when tested with disturbances (thanks
to the compositional shape model)
Flow-chart
Step 0: CAD-input
Step 1: polyp tip finding
1. 3D point detector (with probability output)
2. Grouping by C-C
3. Geometric centroid on surface
Probabilistic spatial prior
Step 1.5: marching cubes & polar coordinates
Step 2: polyp interior-exterior
detection
Output of step 2
Step 3: polyp boundary detection
Step 3.5: smooth & measurement

Smoothness: Gaussian, Viterbi-like Dynamic
Programming, Loopy belief-propagation
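
The Viterbi-like smoothing option can be sketched as follows. This is a minimal illustration, not the system's implementation: it assumes the boundary detector outputs a probability for each candidate radial position on each polar ray, and it picks one position per ray by dynamic programming. The function name smooth_boundary, the penalty weight and the toy probabilities are all hypothetical, and the rays are treated as an open chain rather than a closed contour.

```python
import math

def smooth_boundary(prob, penalty=1.0):
    """Pick one boundary index per ray by Viterbi-style dynamic programming.

    prob[k][m] is the assumed boundary probability at radial position m on ray k.
    The path score is the sum of log-probabilities minus
    penalty * |index jump| between adjacent rays.
    """
    n_rays, n_pos = len(prob), len(prob[0])
    eps = 1e-12  # avoids log(0)
    score = [math.log(p + eps) for p in prob[0]]
    back = []
    for k in range(1, n_rays):
        new_score, pointers = [], []
        for m in range(n_pos):
            # Best predecessor on the previous ray for position m.
            best_prev = max(range(n_pos),
                            key=lambda j: score[j] - penalty * abs(m - j))
            new_score.append(score[best_prev] - penalty * abs(m - best_prev)
                             + math.log(prob[k][m] + eps))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace the best path back from the last ray to the first.
    m = max(range(n_pos), key=lambda j: score[j])
    path = [m]
    for pointers in reversed(back):
        m = pointers[m]
        path.append(m)
    return path[::-1]

# Toy input: the middle ray has a spurious peak at position 3.
toy_prob = [[0.10, 0.80, 0.05, 0.05],
            [0.05, 0.30, 0.05, 0.60],
            [0.10, 0.80, 0.05, 0.05]]
smoothed = smooth_boundary(toy_prob, penalty=1.0)
unsmoothed = smooth_boundary(toy_prob, penalty=0.0)
```

With the penalty switched on, the outlier on the middle ray is overruled by its neighbours; with penalty 0 each ray simply keeps its own most probable position.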
Flow-chart
Experiments-1: accuracy

Five-fold cross-validation: Training (left, 221 polyps)
versus Testing (right, 54 polyps)
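
The five-fold protocol can be sketched as below. With 221 + 54 = 275 polyps, an even split gives 220 training and 55 testing polyps per fold, close to the 221/54 split shown. How polyps were actually assigned to folds (for example, grouped per patient) is not stated on the slide, so the random assignment and the name k_fold_indices are assumptions.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 275 = 221 + 54 polyps, as on the slide.
splits = list(k_fold_indices(275, k=5))
```

Every polyp appears in exactly one test fold, so each one is evaluated by a model that never saw it during training.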
Experiments-2: comparison

Left [Jerebko06]
Right [without stacked learning]
Experiments-3: comparison
Experiments-4: Robustness

See table 1 for numerical results
Experiments-4: Robustness
Discussion on stacked learning

Stacked generalization: a classifier combination method that
learns a linear or non-linear function of multiple classifier
outputs

D. H. Wolpert, Stacked generalization, Neural Networks,
5(2): 241-259, 1992.

Our stacked learning learns a new (hopefully easier)
task from the structured outputs of another classifier (i.e.,
supervised embedding)
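
The data flow of this kind of stacking can be sketched on a 1D profile. This is a toy illustration under stated assumptions, not the CVPR'08 method: stage_one is a hypothetical logistic classifier on intensity, the second stage is a simple threshold (decision stump) standing in for a trained PBT, and the profiles are made-up data. The point is only that the second classifier is trained on features computed from the first classifier's outputs, not on the raw data.

```python
import math

def stage_one(intensity, center=0.5, sharpness=10.0):
    """Hypothetical first classifier: probability of 'polyp interior' per sample."""
    return 1.0 / (1.0 + math.exp(-sharpness * (intensity - center)))

def bipartition_feature(probs, i):
    """Mean stage-one output before index i minus the mean from i onward."""
    inside, outside = probs[:i], probs[i:]
    return sum(inside) / len(inside) - sum(outside) / len(outside)

def fit_stump(features, labels):
    """Second classifier: feature threshold with the fewest training errors."""
    best_thr, best_err = None, None
    for thr in sorted(set(features)):
        err = sum((f >= thr) != y for f, y in zip(features, labels))
        if best_err is None or err < best_err:
            best_thr, best_err = thr, err
    return best_thr

# Made-up intensity profiles, each with its annotated boundary index.
training = [([0.9, 0.8, 0.9, 0.2, 0.1, 0.1], 3),
            ([0.8, 0.9, 0.1, 0.2, 0.1, 0.2], 2)]

features, labels = [], []
for profile, boundary in training:
    probs = [stage_one(v) for v in profile]
    for i in range(1, len(probs)):
        features.append(bipartition_feature(probs, i))
        labels.append(i == boundary)

threshold = fit_stump(features, labels)

def detect_boundary(profile):
    """Indices that the stacked (second-stage) classifier accepts as boundary."""
    probs = [stage_one(v) for v in profile]
    return [i for i in range(1, len(probs))
            if bipartition_feature(probs, i) >= threshold]

found = detect_boundary([0.9, 0.9, 0.8, 0.9, 0.1, 0.2])
```

The second stage sees a single bi-partition feature per candidate index, which is a much simpler problem than locating the boundary from raw intensities.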
Summary





Our multi-staged probabilistic learning framework decomposes a
complex learning task into a sequence of more trainable subtasks.
Local-to-global scaled 3D data evidence is gradually
integrated into this learning process to achieve robustness.
Hierarchical, stacked learning did improve over direct, multi-part
polyp profile learning.
Our compositional model tackles the “curse of
dimensionality”, which makes statistical learning practically more
feasible when applied to a highly complex 3D medical imaging
problem.
Robustness of polyp measurement w.r.t. multi-clicks is achieved,
thanks to shared curve learning patterns among different polyps.
What’s Ileo-cecal Valve?
• The ileo-cecal valve (ICV) can present
with bumpy, polyp-like substructures
• Importance: a CAD system
can mistakenly detect those
bumps, resulting in polyp
false positives (FPs), up to
15~20%
• Previous approach: Summers
et al. 2004, Radiology;
technique not fully automatic
Why difficult?

ICV exhibits large within-class variations in both its internal
shape/appearance and external spatial configuration.

ICV is a relatively small (compared with heart, liver, even
kidney) and deformable human organ which opens and closes as a
valve (connecting colon and small intestine).

ICV size and shape are sensitive to patient weight and/or
whether the ICV is diseased.

ICV position and orientation also vary, because it is part of the colon, which
is highly deformable.
Looking for an easier job?

Is there an easier job preceding the final task? More
importantly, how can it make the final task easier and more
solvable (data bootstrapping, back tracing; searching
range, …)?

An intuitive example, “surface-aided object localization”,
or “rotation-invariant face detection”?

Overall: computationally less expensive! (Easier)

Local step: trainable!! (via classifier ROC analysis)
Global solution: back-traceable!!! (via training data bootstrapping)

Brief review of our solution

A general 3D object detection algorithm based on the proposed
incremental parameter learning in full 3D space

Prior learning using domain specific knowledge for
efficiency

Prior learning in the same framework (or, spirit) of
incremental parameter learning

T -> S -> R
System
Incremental Parameter Learning for
3D object localization

Analogy to twenty-questions [Geman & Jedynak], but
simpler

Equivalent to exhaustive search in {T,S,R} if we can train
a perfect classifier (100% recall at 0% false positive rate)
at each step.

Trade explicit, exhaustive searching for parameter
estimation with implicit within-class variation modeling
using data-driven clustering inside supervised classifier
training (especially at early learning stage).

PBT, cluster based tree, multiplicative kernels, …
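
The T -> S -> R search can be sketched as a staged beam search. This is a toy under stated assumptions: translation, scale and rotation are single numbers rather than 3D parameters, and the three scoring functions are hypothetical stand-ins for the trained classifier at each stage. It keeps a few hypotheses per stage and counts classifier evaluations against exhaustive search.

```python
def incremental_search(t_cands, s_cands, r_cands,
                       score_t, score_ts, score_tsr, keep=3):
    """Estimate (T, S, R) stage by stage, keeping `keep` hypotheses per stage."""
    top_t = sorted(t_cands, key=score_t, reverse=True)[:keep]
    ts = [(t, s) for t in top_t for s in s_cands]
    top_ts = sorted(ts, key=lambda h: score_ts(*h), reverse=True)[:keep]
    tsr = [(t, s, r) for (t, s) in top_ts for r in r_cands]
    best = max(tsr, key=lambda h: score_tsr(*h))
    n_evals = len(t_cands) + len(ts) + len(tsr)
    return best, n_evals

# Hypothetical ground truth and stage scores (higher is better).
truth = (7, 2, 30)

def score_t(t):
    return -abs(t - truth[0])

def score_ts(t, s):
    return score_t(t) - abs(s - truth[1])

def score_tsr(t, s, r):
    return score_ts(t, s) - abs(r - truth[2]) / 10.0

t_cands = list(range(20))          # candidate positions
s_cands = [1, 2, 3, 4]             # candidate scales
r_cands = list(range(0, 90, 10))   # candidate angles in degrees

best, n_evals = incremental_search(t_cands, s_cands, r_cands,
                                   score_t, score_ts, score_tsr, keep=3)
exhaustive_evals = len(t_cands) * len(s_cands) * len(r_cands)
```

In this toy setting the staged search needs 59 evaluations against 720 for the exhaustive scan, and still recovers the true parameters because the correct hypothesis survives every stage.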
Robustness for non-perfect classifier

Keeping multiple hypotheses relaxes the requirement for
training/detection accuracy (sequential MC)


Cluster based sampling or Non-Maximum Suppression for
multiple object detection
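
Non-maximum suppression over scored 3D detections can be sketched as below. This is a generic greedy variant, assuming each detection is a (score, position) pair and that a fixed minimum distance separates distinct objects; the function name, the distance threshold and the sample detections are hypothetical.

```python
import math

def non_max_suppression(detections, min_dist):
    """Greedy NMS: keep the best-scoring detections at least min_dist apart.

    detections: list of (score, (x, y, z)) tuples.
    """
    kept = []
    for score, pos in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(math.dist(pos, kept_pos) >= min_dist for _, kept_pos in kept):
            kept.append((score, pos))
    return kept

# Two clusters of hypotheses; one detection should survive per cluster.
hypotheses = [(0.9, (10, 10, 10)), (0.8, (11, 10, 10)),
              (0.7, (40, 40, 40)), (0.6, (41, 41, 40))]
kept = non_max_suppression(hypotheses, min_dist=5.0)
```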
Detection Accuracy:


Decreasing distances from the positive-class decision boundary
to the ground-truth (annotation)
Decreasing distance margins between positive and negative
class decision boundaries over stages
Training ROCs
Experiments
Learning-based Component for
Suppression of False Positives Located
on the Ileo-Cecal Valve [1]: Evaluation of
Performance on 802 CTC Volumes
L. Bogoni, A. Barbu, S. Lakare, M. Dundar, M. Wolf, L. Lu
Computer-Aided Diagnosis and Knowledge Solutions
Siemens Medical Solutions USA, Inc.
RSNA 2007, Chicago, USA
[1] Research/product prototype, not commercially available
Training Data

Cases with clean prep





116 volumes
8 sites
Siemens, GE, Toshiba MDCT
4, 16 and 64 slice scanners
116 ileo-cecal valves were box annotated and
then used for training
Results – Standalone System

Tested on 116 training cases



Detection Rate: 98.3% (114 out of 116)
1 false positive
Tested on 142 unseen clean cases


Detection Rate: 93.7% (133 out of 142)
5 false positives

None of the false positives is a polyp

Running time is 4~10 seconds/volume
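
The detection rates above follow directly from the counts; a short check, using only the numbers on this slide (the helper name detection_rate is illustrative):

```python
def detection_rate(detected, total):
    """Detection rate in percent, rounded to one decimal place."""
    return round(100.0 * detected / total, 1)

training_rate = detection_rate(114, 116)   # 116 training cases
unseen_rate = detection_rate(133, 142)     # 142 unseen clean cases
unseen_fp_per_case = 5 / 142               # false positives per unseen case
```

This reproduces 98.3% and 93.7%, and gives roughly 0.035 false positives per unseen case.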
Detection Results (Clean)
Detection Results (Tagged)
Results – Polyp FP Reduction


412 Test Cases total (data are independent!)
Clean preparation





211 patients, 407 volumes
10 sites
Siemens, GE, Toshiba MDCT
4, 16 and 64 slice scanners
Tagged preparation (combinations of iodine & barium)





201 patients, 395 volumes
4 sites
Siemens and GE MDCT
16 and 64 slice scanners
No E-cleansing needed!
Integration into CAD Prototype*
Processing
Flow:
Input Data
Candidate Generation
Feature Computation
Classification
* Work in Progress, not available commercially
CAD marks
Integration into CAD Prototype
ICV Detector
as Post-Filter:
Input Data
Candidate Generation
Feature Computation
Classification
ICV Suppression
CAD marks
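
The post-filter step can be sketched as a geometric test on the final CAD marks. This is a simplified illustration: it assumes the ICV detector returns an axis-aligned bounding box (the detector described earlier also estimates orientation), and the names inside_box and suppress_icv_marks as well as the coordinates are hypothetical.

```python
def inside_box(point, box_min, box_max):
    """True if the 3D point lies inside the axis-aligned box (inclusive)."""
    return all(lo <= c <= hi for c, lo, hi in zip(point, box_min, box_max))

def suppress_icv_marks(cad_marks, icv_box):
    """Drop CAD marks that fall inside the detected ICV box.

    icv_box is (box_min, box_max), or None when no ICV was detected.
    """
    if icv_box is None:
        return list(cad_marks)
    box_min, box_max = icv_box
    return [m for m in cad_marks if not inside_box(m, box_min, box_max)]

# Hypothetical marks (voxel coordinates) and a hypothetical detected ICV box.
marks = [(12, 40, 33), (105, 98, 61), (110, 102, 64)]
icv = ((100, 90, 55), (120, 110, 70))
remaining = suppress_icv_marks(marks, icv)
```

A purely geometric filter like this will also drop a true polyp that sits inside the box, which is the failure mode a post-filter has to be evaluated for.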
Integrated Results – Post Filter

Clean cases
Per Patient FP count reduced from 3.92 to 3.72 (5.5%)
Per Volume FP count reduced from 2.04 to 1.92 (5.9%)

Tagged cases
Per Patient FP count reduced from 6.2 to 5.78 (6.8%)
Per Volume FP count reduced from 3.15 to 2.94 (6.7%)

One polyp out of 124 polyps was mislabeled as ICV
(close to ICV)

S. Kim, et al. Two- versus Three-dimensional Colon Evaluation with Recently
Developed Virtual Dissection Software for CT Colonography, Radiology
2007; 244: 852-864.
Integration into CAD Prototype
ICV detector
integrated
at feature
stage:
Input Data
Candidate Generation
Feature Computation
ICV Suppression
Classification
CAD marks
Integrated Results – FC Stage

The same performance of polyp FP reduction is
maintained.

No polyp out of 124 polyps was labeled as ICV.


The previously lost polyp was preserved when combining the
output of the ICV detector with additional features
An N-box ICV model was later proposed in ECCV’08,
which increases the mean overlap ratio from 74.9% to
88.2% and surprisingly removes 30.2% more polyp FPs
without losing true polyps (N=2).
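
An overlap ratio between a detected box and an annotated box can be computed as sketched below. The slide does not define the ratio, so this sketch assumes intersection-over-union of two axis-aligned 3D boxes; the exact definition used in the ECCV'08 paper may differ, and the box coordinates are made up.

```python
def box_volume(box):
    """Volume of an axis-aligned box given as (min_corner, max_corner)."""
    volume = 1.0
    for lo, hi in zip(box[0], box[1]):
        volume *= max(0.0, hi - lo)
    return volume

def overlap_ratio(box_a, box_b):
    """Intersection-over-union of two axis-aligned 3D boxes (assumed definition)."""
    inter = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(box_a[0], box_a[1], box_b[0], box_b[1]):
        side = min(hi_a, hi_b) - max(lo_a, lo_b)
        if side <= 0:
            return 0.0
        inter *= side
    union = box_volume(box_a) + box_volume(box_b) - inter
    return inter / union

annotated = ((0, 0, 0), (10, 10, 10))
detected = ((0, 0, 0), (10, 10, 8))
ratio = overlap_ratio(annotated, detected)
```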
Conclusion


Explicit ICV anatomical knowledge can be learned by
the system to improve CAD performance
Approach generalizes well in both clean and tagged CT
volumes


Benefits




CAD marks on ICV are suppressed both in clean and tagged preparation
Modest reduction in false positives (especially nuisance FPs)
Can potentially reduce interpretation time for Radiologists
No detriment to system sensitivity
Can potentially increase acceptance of CAD systems by
avoiding obvious false positives
Acknowledgement

Dr Adrian Barbu for technical collaboration
Other coauthors for discussion, clinical and
system support

Questions?

