Developing outcome prediction
models for acute intracerebral
hemorrhage patients: evaluation of a
Support Vector Machine based method
A. Jakab1, L. Lánczi1, L. Csiba2, I. Széll2, P. Molnár3, E. Berényi1
University of Debrecen Medical School and Health Science Center
Faculty of Medicine
1: Department of Biomedical Laboratory and Imaging Science,
2: Department of Neurology,
3: Institute of Pathology
1
Introduction
Intracranial hemorrhages, clinical scales
PATHOLOGY
Ischemic stroke
primary intracerebral hemorrhage (ICH)
subarachnoid hemorrhage (SAH)
undetermined stroke
DIAGNOSTIC IMAGING
(nonenhanced CT scan)
CLINICAL RECORDS
Size of the hematoma
Location of the hematoma
Expansion rate of the hematoma
Mass effect
Time from onset to examination
Patient age
GCS
BP, electrolytes, etc.
Adaptation of a prognostic / predictive model
(scoring system)
Assessment of clinical outcome (30-day
mortality) or therapy decisions
2
Introduction
Technical advancement and challenges
ROLE OF NEUROIMAGING
Challenge: How to measure hematoma volume?
Answer 1: ABC/2 Method
Answer 2: Manual image segmentation
Answer 3: Automatic image segmentation,
parcellation (location, ventricular extension, etc.)
ABC/2: quick, accurate in most cases (ellipsoid
method)
Image segmentation: advantageous in complex
geometry, intraventricular component
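The ABC/2 estimate mentioned above can be sketched in a few lines. It approximates the hematoma as an ellipsoid, whose exact volume is πABC/6, roughly ABC/2. The function name and the example measurements are illustrative assumptions, not values from the study.

```python
def abc2_volume_cm3(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Ellipsoid approximation of hematoma volume (ABC/2).

    a_cm: largest hematoma diameter on the slice with the largest bleed
    b_cm: largest diameter perpendicular to A on the same slice
    c_cm: vertical extent (number of slices showing hematoma x slice thickness)
    """
    if min(a_cm, b_cm, c_cm) < 0:
        raise ValueError("diameters must be non-negative")
    return a_cm * b_cm * c_cm / 2.0


# Hypothetical measurements, for illustration only
volume = abc2_volume_cm3(4.0, 3.0, 5.0)
print(volume)  # 30.0
```

The ellipsoid assumption is what breaks down for irregular or intraventricular bleeds, which is where segmentation-based volumetry is expected to help.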
3
Introduction
Clinical scales and DSS
„Classical clinical scales”
Data used:
• quick assessment of
neuroimaging findings
• patient data, basic lab findings
• a limited number of variables
Evaluation of clinical outcome
• adding the scores or simple
equations
• logistic regression functions
ICH Score
(Hemphill et al., 2001)
GCS, patient age, volume,
infratentorial, intraventricular
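As an illustration of "adding the scores", a minimal sketch of the ICH Score follows. The point values are those commonly cited from Hemphill et al. (2001); they are not stated on the slide and should be checked against the original publication before any use. The function and parameter names are assumptions.

```python
def ich_score(gcs: int, age_years: int, volume_cm3: float,
              infratentorial: bool, intraventricular: bool) -> int:
    """Sum of the five ICH Score components (range 0 to 6)."""
    if not 3 <= gcs <= 15:
        raise ValueError("GCS must be between 3 and 15")
    # GCS 3-4: 2 points, 5-12: 1 point, 13-15: 0 points
    score = 2 if gcs <= 4 else 1 if gcs <= 12 else 0
    score += 1 if volume_cm3 >= 30 else 0    # hematoma volume >= 30 cm3
    score += 1 if intraventricular else 0    # intraventricular extension
    score += 1 if infratentorial else 0      # infratentorial origin
    score += 1 if age_years >= 80 else 0     # age >= 80 years
    return score


# Hypothetical patient, for illustration only
print(ich_score(gcs=10, age_years=70, volume_cm3=35.0,
                infratentorial=False, intraventricular=True))  # 3
```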
Decision support systems (DSS)
using computer-aided diagnosis
(CAD)
Data used:
• CAD: computer-aided definition of
imaging findings, image segmentation, ROI
analysis
• patient data, lab results
• many variables (no limit)
Evaluation:
• logistic regression functions
• more complex, „non-linear” methods:
• Artificial neural networks
• Bayesian classifier
• Nearest-neighbor rule
• Support Vector Machines
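For reference, the "logistic regression function" used by both approaches maps a weighted sum of variables to a probability. The sketch below shows only the form of the function; the coefficients, intercept and feature choice are hypothetical and not fitted to any data.

```python
import math


def logistic_probability(features, coefficients, intercept):
    """Predicted probability of the outcome: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for (age in years, normalized hematoma volume)
p = logistic_probability([70.0, 0.03], [0.02, 40.0], -3.0)
print(round(p, 3))
```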
4
Introduction
Support vector machines
„...a set of related supervised learning methods which analyze data and recognize
patterns, used for statistical classification and regression analysis”
• Fitting n-dimensional hyperplanes to examples
in feature space
• Complex, but reproducible mathematical
function! (unlike neural networks)
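As a sketch of what "fitting a hyperplane" means in the simplest (linear, soft-margin) case: the symbols below are standard textbook notation introduced here for illustration, not taken from the slides. Each of the m training examples is a feature vector x_i with outcome label y_i in {-1, +1}, w and b define the hyperplane, and C trades margin width against training errors ξ_i.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m}\xi_i \\
\text{subject to}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\quad i = 1,\dots,m \\
\text{prediction:}\quad & \hat{y}(x) = \operatorname{sign}\!\left(w^{\top}x + b\right)
\end{aligned}
```

This training problem is convex, so the same data and settings lead to the same solution, which is one way to read the "reproducible" remark above.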
5
Objectives
AIM of our study WAS:
• To use semi-automatic image
segmentation for determining useful
neuroradiological parameters
• To use many clinical parameters AND
neuroradiological data to assess 30-day
ICH mortality
• To assess the feasibility of Support
Vector Machines in the selection of
variables and creation of a prognostic
model
• To compare efficiency with the results
using „conventional” classifier
methods (logistic regression analysis)
AIM of our study WAS NOT:
• To validate the feasibility of
image segmentation methods
or compare to the efficiency of
ABC/2 method
• To evaluate the clinical and
neuroradiological factors of
ICH mortality
• To introduce a new
commercial Diagnostic
Support System approach or
product
6
Methods
Patient database, image segmentation
Patient population: 125 consecutive patients, Department of Neurology (ICU)
Neuroimaging: Acute, non-enhanced CT scans (two devices: GE CT-e Dual and
GE Lightspeed16, GE Medical Systems, Milwaukee, WI, USA).
Image segmentation, CT volumetry:
• Intracranial space (cm³)
• Parenchymal and ventricular extension of hemorrhage (cm³)
Slicer 3D software, semi-automatic segmentation.
Normalize variables to intracranial space.
• Effacement of prepontine cistern (mm)
ImageJ software, manual measurement
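The normalization step can be sketched as follows. The slides do not state the exact formula, so a simple ratio of hematoma volume to intracranial volume is assumed here; the function name and example volumes are also illustrative.

```python
def normalize_to_intracranial(volume_cm3: float, intracranial_cm3: float) -> float:
    """Express a segmented volume as a fraction of the intracranial space.

    Assumption: normalization is a plain ratio; the source does not specify it.
    """
    if intracranial_cm3 <= 0:
        raise ValueError("intracranial volume must be positive")
    if volume_cm3 < 0:
        raise ValueError("volume must be non-negative")
    return volume_cm3 / intracranial_cm3


# Hypothetical volumes, for illustration only
parenchymal = normalize_to_intracranial(45.0, 1500.0)
print(round(parenchymal, 4))  # 0.03
```

A ratio of this kind makes hematoma size comparable between patients with different head sizes.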
7
Methods
Statistical workup, SVM application
Radiological variables
Intracranial space
Normalized parenchymal hematoma vol.
Normalized intraventricular hematoma vol.
Prepontine cistern effacement
VARIABLE SELECTION
Clinical variables
Age, sex, onset, lab findings (Na, K),
RR (BPsyst, BPdiast), pulse,
history of IHD, mRS,
etc.
VARIABLE SELECTION
TRAINING
Support Vector Machine
Classifier
TESTING
Software, classifier training: WEKA
Free, open-source environment for data mining applications
http://www.cs.waikato.ac.nz/ml/weka/
Support Vector Machine algorithm: LibSVM
Clinical outcomes of
training dataset
Validation on
testing dataset
8
Results
Assessment of clinical outcome
Questions to evaluate:
1. Assessment of ICH mortality with the SVM method without
prospective evaluation (Training success %)
2. Test the method on a different patient population
(Testing accuracy %)
3. Calculate the sensitivity, specificity, AUC, error rate
4. Which clinical variables are the most important, i.e. what
happens if many clinical variables are included in the model?
5. Is the SVM method more accurate than the logistic
regression model?
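The measures named in question 3 can be computed from test-set predictions as sketched below. The label convention (1 = died within 30 days, 0 = survived), the function names and the example data are assumptions for illustration; they are not the study's results.

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and error rate; label 1 = died within 30 days."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes must be present")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "error_rate": (fp + fn) / len(pairs),
    }


def auc(y_true, scores):
    """AUC as the chance that a random positive case outranks a random negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels, predictions and classifier scores
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.1, 0.2, 0.3, 0.35, 0.6]
print(binary_metrics(y_true, y_pred))
print(auc(y_true, scores))
```

Running the same two functions on the SVM and on the logistic regression predictions gives a like-for-like comparison for question 5.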
8
Discussion
1. Semi-automatic segmentation of acute, ICU CT images could
determine useful volumetric data
2. In our experimental evaluation (prospective testing: 75% of patients as
training, 25% as test), the SVM-based model correctly prognosticated
poor outcome (30-day mortality) in 90.3% of the test cases, and the method
had higher sensitivity than logistic regression.
3. To achieve feasible results, all neuroimaging variables were used, plus
clinical parameters.
4. The „model” was saved and could be used for further prospective
analysis
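The 75% / 25% division described in point 2 can be sketched as a simple hold-out split. The slides do not say how patients were assigned to the two groups; random shuffling, the fixed seed and the function name are assumptions made for this example.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle the records reproducibly and hold out a fraction for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)      # fixed seed so the split can be repeated
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder patient identifiers for a cohort of 125
patients = list(range(125))
train, test = train_test_split(patients)
print(len(train), len(test))  # 94 31
```

Keeping the test group untouched during variable selection and training is what allows the reported test accuracy to be read as an estimate for new patients.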
10
Further plans
To integrate and automate these functions
• Automation: segmentation, clinical data
mining
• Integration: „internal” database with previous
outcomes, continuous refinement of model.
• All-in-one software packages are needed
• Health technology assessment for the
benefits of a decision support system using
these algorithms
Thank you for your attention!