#### Transcript VBM with Unified Segmentation

Voxel-Based Morphometry with Unified Segmentation
Ged Ridgway, Centre for Medical Image Computing, University College London
Thanks to: John Ashburner and the FIL Methods Group.

Preprocessing in SPM
• Realignment
– With non-linear unwarping for EPI fMRI
• Slice-time correction
• Coregistration
• Normalisation
• Segmentation
• Smoothing
SPM8b’s unified tissue segmentation and spatial normalisation procedure
But first, an introduction to Computational Neuroanatomy

Aims of computational neuroanatomy
• Many interesting and clinically important questions might relate to the shape or local size of regions of the brain
• For example, whether (and where) local patterns of brain morphometry help to:
– Distinguish schizophrenics from healthy controls
– Understand plasticity, e.g. when learning new skills
– Explain the changes seen in development and aging
– Differentiate degenerative disease from healthy aging
– Evaluate subjects on drug treatments versus placebo

Alzheimer’s Disease example
[Figure panels: Baseline image (standard clinical MRI, 1.5T T1 SPGR, 1x1x1.5mm voxels); Repeat image (12 month follow-up, rigidly registered); Subtraction image]

SPM for group fMRI
[Diagram, repeated per subject: fMRI time-series, Preprocessing, Stat. modelling, Results query, “Contrast” image; combined in Group-wise statistics (spm T)]

SPM for structural MRI
[Diagram, repeated per subject: High-res T1 MRI, ?; combined via ? in Group-wise statistics]

The need for tissue segmentation
• High-resolution MRI reveals fine structural detail in the brain, but not all of it is reliable or interesting
– Noise, intensity-inhomogeneity, vasculature, …
• MR intensity is usually not quantitatively meaningful (in the same way that e.g. CT is)
– fMRI time-series allow signal changes to be analysed statistically, compared to baseline or global values
• Regional volumes of the three main tissue types (gray matter, white matter and CSF) are well-defined and potentially very interesting

Examples of segmentation
[Figures: GM and WM segmentations overlaid on original images; structural image, GM and WM segments, and brainmask (sum of GM and WM)]

Segmentation – basic approach
• Intensities are modelled by a Gaussian Mixture Model (AKA Mixture Of Gaussians)
• With a specified number of components
• Parameterised by means, variances and mixing proportions (prior probabilities for components)

Non-Gaussian Intensity Distributions
• Multiple MoG components per tissue class allow non-Gaussian distributions to be modelled
– E.g. accounting for partial volume effects
– Or possibility of deep GM differing from cortical GM

Tissue Probability Maps
• Tissue probability maps (TPMs) can be used to provide a spatially varying prior distribution, which is tuned by the mixing proportions
– These TPMs come from the segmented images of many subjects, done by the ICBM project

Class priors
• The probability of class k at voxel i, given weights γ, is then:

$$P(c_i = k \mid \boldsymbol{\gamma}) = \frac{\gamma_k b_{ik}}{\sum_{j=1}^{K} \gamma_j b_{ij}}$$

• Where $b_{ij}$ is the value of the jth TPM at voxel i.
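The class-prior formula above can be sketched for a single voxel as follows. This is a minimal illustration only, not SPM code: the function name `class_priors` and the TPM values and weights are hypothetical, chosen just to show how the mixing weights γ rescale the TPM values before normalisation.

```python
def class_priors(tpm_values, gamma):
    """P(c_i = k | gamma) at one voxel: gamma_k * b_ik / sum_j gamma_j * b_ij."""
    weighted = [g * b for g, b in zip(gamma, tpm_values)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("all weighted TPM values are zero at this voxel")
    return [w / total for w in weighted]


# Hypothetical TPM values b_i1..b_i3 (GM, WM, CSF) at a single voxel.
b_i = [0.6, 0.3, 0.1]

# Equal weights leave the TPM prior unchanged (up to rounding).
print(class_priors(b_i, [1.0, 1.0, 1.0]))

# Doubling the WM weight shifts prior mass towards WM.
print(class_priors(b_i, [1.0, 2.0, 1.0]))
```

Because of the normalisation, only the relative sizes of the weights matter: multiplying every γ by the same constant gives the same priors.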
Aligning the tissue probability maps
• Initially affine-registered using a multidimensional form of mutual information
• Iteratively warped to improve the fit of the unified segmentation model to the data
– Familiar DCT-basis function concept, as used in normalisation

MRI Bias Correction
• MR images are corrupted by smoothly varying intensity inhomogeneity caused by magnetic field imperfections and subject-field interactions
– Would make intensity distribution spatially variable
• A smooth intensity correction can be modelled by a linear combination of DCT basis functions

Summary of the unified model
• SPM8b implements a generative model
– Principled Bayesian probabilistic formulation
• Combines deformable tissue probability maps with Gaussian mixture model segmentation
– The inverse of the transformation that aligns the TPMs can be used to normalise the original image
• Bias correction is included within the model

Segmentation clean-up
• Results may contain some non-brain tissue (dura, scalp, etc.)
• This can be removed automatically using simple morphological filtering operations
– Erosion
– Conditional dilation
[Figure: lower segmentations have been cleaned up]

Limitations of the current model
• Assumes that the brain consists of only GM and WM, with some CSF around it.
– No model for lesions (stroke, tumours, etc.)
• Prior probability model is based on relatively young and healthy brains
– Less appropriate for subjects outside this population
• Needs reasonable quality images to work with
– No severe artefacts
– Good separation of intensities
– Good initial alignment with TPMs...

Extensions (possible or prototype)
• Multispectral modelling, generalising the per-class parameters $\mu_k$, $\sigma_k$ to several channels
– (New Segment Toolbox)
• Deeper Bayesian philosophy
– E.g. priors over means and variances
– Marginalisation of nuisance variables
– Model comparison
• Groupwise model (enormous!)
• Combination with DARTEL (see later and new seg tbx)
• More tissue priors e.g. deep grey, meninges, etc.
• Imaging physics
– See Fischl et al. 2004, as cited in A&F introduction

Voxel-Based Morphometry
• In essence VBM is Statistical Parametric Mapping of segmented tissue density
• The exact interpretation of gray matter concentration or density is complicated, and depends on the preprocessing steps used
– It is not interpretable as neuronal packing density or other cytoarchitectonic tissue properties, though changes in these microscopic properties may lead to macro- or mesoscopic VBM-detectable differences

A brief history of VBM
• A Voxel-Based Method for the Statistical Analysis of Gray and White Matter Density… Wright, McGuire, Poline, Travere, Murray, Frith, Frackowiak and Friston. NeuroImage 2(4), 1995 (!)
– Rigid reorientation (by eye), semi-automatic scalp editing and segmentation, 8mm smoothing, SPM statistics, global covars.
• Voxel-Based Morphometry – The Methods. Ashburner and Friston. NeuroImage 11(6 pt.1), 2000
– Non-linear spatial normalisation, automatic segmentation
– Thorough consideration of assumptions and confounds

A brief history of VBM (continued)
• A Voxel-Based Morphometric Study of Ageing… Good, Johnsrude, Ashburner, Henson and Friston. NeuroImage 14(1), 2001
– Optimised GM-normalisation (“a half-baked procedure”), modulation of segments with Jacobian determinants
• Unified Segmentation. Ashburner and Friston. NeuroImage 26(3), 2005
– Principled generative model for segmentation using deformable priors
• A Fast Diffeomorphic Image Registration Algorithm. Ashburner. NeuroImage 38(1), 2007
– Large deformation normalisation to average shape templates
• …

VBM overview
• Unified segmentation and spatial normalisation
• Optional modulation with Jacobian determinant
• Optional computation of tissue totals/globals
• Gaussian smoothing
• Voxel-wise statistical analysis

VBM in pictures
[Figure sequence, built up over several slides: Segment, Normalise, Modulate (?), Smooth, Voxel-wise statistics]

$$Y = \begin{pmatrix} a_{1xyz} \\ a_{2xyz} \\ \vdots \\ a_{Nxyz} \end{pmatrix} = X \beta_{xyz} + e_{xyz}, \qquad e_{xyz} \sim N(0, \sigma^2_{xyz} V), \qquad X = \begin{pmatrix} 1 & 0 \\ \vdots & \vdots \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 1 \end{pmatrix}$$

VBM Subtleties
• Whether to modulate
• Adjusting for total GM or Intracranial Volume
• How much to smooth
• Limitations of linear correlation
• Statistical validity

Modulation
• Multiplication of the warped (normalised) tissue intensities so that their regional or global volume is preserved
– Can detect differences in completely registered areas
• Otherwise, we preserve concentrations, and are detecting mesoscopic effects that remain after approximate registration has removed the macroscopic effects
– Flexible (not necessarily “perfect”) registration may not leave any such differences
[Figure panels: Native (intensity = tissue density); Modulated; Unmodulated]

“Globals” for VBM
• Shape is really a multivariate concept
– Dependencies among volumes in different regions
• SPM is mass univariate
– Combining voxel-wise information with “global” integrated tissue volume provides a compromise
– Using either ANCOVA or proportional scaling
Figures from: Voxel-based morphometry of the human brain… Mechelli, Price, Friston and Ashburner. Current Medical Imaging Reviews 1(2), 2005.
Above: (ii) is globally thicker, but locally thinner than (i); either of these effects may be of interest to us.
Below: The two “cortices” on the right both have equal volume…

Total Intracranial Volume (TIV/ICV)
• “Global” integrated tissue volume may be correlated with interesting regional effects
– Correcting for globals in this case may overly reduce sensitivity to local differences
– Total intracranial volume integrates GM, WM and CSF, or attempts to measure the skull-volume directly
• Not sensitive to global reduction of GM+WM (cancelled out by CSF expansion – skull is fixed!)
– Correcting for TIV in VBM statistics may give more powerful and/or more interpretable results

Smoothing
• The analysis will be most sensitive to effects that match the shape and size of the kernel
• The data will be more Gaussian and closer to a continuous random field for larger kernels
• Results will be rough and noise-like if too little smoothing is used
• Too much will lead to distributed, indistinct blobs

Smoothing
• Between 7 and 14mm is probably best
– (lower is okay with better registration, e.g. DARTEL)
• The results below show two fairly extreme choices: 5mm on the left, and 16mm on the right
[Figure: results at 5mm (left) and 16mm (right)]

Nonlinearity
Caution may be needed when looking for linear relationships between grey matter concentrations and some covariate of interest.
[Figure panels: Circles of uniformly increasing area; Smoothed; Plot of intensity at circle centres versus area]

VBM’s statistical validity
• Residuals are not normally distributed
– Little impact on uncorrected statistics for experiments comparing reasonably sized groups
– Probably invalid for experiments that compare single subjects or tiny groups with a larger control group
• Need to use nonparametric tests that make fewer assumptions, e.g. permutation testing with SnPM

VBM’s statistical validity
• Correction for multiple comparisons
– RFT correction based on peak heights should be OK
• Correction using cluster extents is problematic
– SPM usually assumes that the smoothness of the residuals is spatially stationary
• VBM residuals have spatially varying smoothness
• Bigger blobs expected in smoother regions
– Toolboxes are now available for non-stationary cluster-based correction
• http://www.fmri.wfubmc.edu/cms/NS-General

VBM’s statistical validity
• False discovery rate
– Less conservative than FWE
– Popular in morphometric work
• (almost universal for cortical thickness in FS)
– Recently questioned…
• Topological FDR in SPM8
– See release notes for details and paper

Variations on VBM
• “All modulation, no gray matter”
– Jacobian determinant “Tensor” Based Morphometry
– Davatzikos et al. (1996) JCAT 20:88-97
• Deformation field morphometry
– Cao and Worsley (1999) Ann Stat 27:925-942
– Ashburner et al (1998) Hum Brain Mapp 6:348-357
• Other variations on TBM
– Chung et al (2001) NeuroImage 14:595-606

Deformation and shape change
Figures from Ashburner and Friston, “Morphometry”, Ch.6 of Human Brain Function, 2nd Edition, Academic Press

Deformation fields and Jacobians
[Figure panels: Original; Warped; Template; Deformation vector field; Jacobian Matrix]
Determinant of Jacobian Matrix encodes voxel’s volume change

Longitudinal VBM
• Intra-subject registration over time much more accurate than inter-subject normalisation
• Imprecise inter-subject normalisation
– Spatial smoothing required
• Different methods have been developed to reduce the danger of expansion and contraction cancelling out…

Longitudinal VBM variations
• Voxel Compression mapping separates expansion and contraction before smoothing
– Scahill et al (2002) PNAS 99:4703-4707
• Longitudinal VBM multiplies longitudinal volume change with baseline or average grey matter density
– Chételat et al (2005) NeuroImage 27:934-946

Longitudinal VBM variations
[Figure panels: Early; Late; Warped early; Difference; Relative volumes; Early CSF; Late CSF; Late CSF - Early CSF; CSF “modulated” by relative volume; Late CSF - modulated CSF; Smoothed]

Nonrigid registration developments
• Large deformation concept
– Regularise velocity not displacement
• (syrup instead of elastic)
• Leads to concept of geodesic
– Provides a metric for distance between shapes
– Geodesic or Riemannian average = mean shape
• If velocity is assumed constant, computation is fast
– Ashburner (2007) NeuroImage 38:95-113
– DARTEL toolbox in SPM8b
• Currently initialised from unified seg_sn.mat files

DARTEL exponentiates a velocity flow field to get a deformation field
[Figure: Velocity flow field]

Example geodesic shape average
[Figure panels: Average on Riemannian manifold; Linear Average (Not on Riemannian manifold)]

DARTEL average template evolution
[Figure: Grey matter average of 452 subjects – affine; Iterations; 471 subjects – DARTEL]

Questioning Intersubject normalisation
• Registration algorithms might find very different correspondences to human experts
– Crum et al. (2003) NeuroImage 20:1425-1437
• Higher dimensional warping improves image similarity but not necessarily landmark correspondence
– Hellier et al. (2003) IEEE TMI 22:1120-1130

Questioning Intersubject normalisation
• Subjects can have fundamentally different sulcal/gyral morphological variants
– Caulo et al. (2007) Am. J. Neuroradiol. 28:1480-85
• Sulcal landmarks don’t always match underlying cytoarchitectonics
– Amunts, et al. (2007) NeuroImage 37(4):1061-5

Intersubject normalisation opportunities
• High-field high-resolution MR may have potential to image cytoarchitecture
• Will registration be better or worse at higher resolution?
– More information to use
– More severe discrepancies?
– Need rougher deformations
– Non-diffeomorphic?
[Figure: 4.7T FSE, De Vita et al (2003) Br J Radiol 76:631-7]

Intersubject normalisation opportunities
• Regions of interest for fMRI can be defined from functional localisers or orthogonal SPM contrasts
– No obvious equivalent for single-subject structural MR
• Potential to include diffusion-weighted MRI information in registration?
– Zhang et al. (2006) Med. Image Analysis 10:764-785

Summary of key points
• VBM performs voxel-wise statistical analysis on smoothed (modulated) normalised segments
• SPM8b performs segmentation and spatial normalisation in a unified generative model
• Intersubject correspondence is imperfect
– Smoothing alleviates this problem to some extent
• Also improves statistical validity
• Some current research is focussed on more sophisticated registration models

Unified segmentation in detail
An alternative explanation to the paper and to John’s slides from London ‘07
http://www.fil.ion.ucl.ac.uk/spm/course/slides07/Image_registration.ppt

Unified segmentation from the GMM upwards… The standard Gaussian mixture model
Voxel i, class k:

$$p(y_i \mid c_i = k) = N(y_i \mid \mu_k, \sigma_k)$$

$$p(y_i, c_i = k) = \gamma_k \, N(y_i \mid \mu_k, \sigma_k)$$

$$p(y_i) = \sum_k \gamma_k \, N(y_i \mid \mu_k, \sigma_k)$$

Assumes independence (but spatial priors later...):

$$p(\mathbf{y}) = \prod_i \sum_k \gamma_k \, N(y_i \mid \mu_k, \sigma_k) = p(\mathbf{y} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})$$

Could solve with EM. (1-5)

Unified segmentation from the GMM upwards… Spatially modify mean and variance with bias field

$$\mu_k \rightarrow \frac{\mu_k}{\rho_i(\boldsymbol{\beta})}, \qquad \sigma_k \rightarrow \frac{\sigma_k}{\rho_i(\boldsymbol{\beta})}$$

Note spatial dependence (on voxel i); β are the coefficients for a linear combination of DCT basis functions.

$$p(\mathbf{y} \mid \boldsymbol{\beta}) = \prod_i \sum_k \gamma_k \, N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\boldsymbol{\beta})}, \frac{\sigma_k}{\rho_i(\boldsymbol{\beta})}\right)$$

(10)

Unified segmentation from the GMM upwards… Anatomical priors through mixing coefficients
Basic idea:

$$\gamma_k \rightarrow \gamma_{ik} = b_{ik}$$

Implementation:

$$\gamma_k \rightarrow \frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}}$$

Note spatial dependence (on voxel i). Prespecified: $b_{ik}$; estimated: $\gamma_k$.

$$p(\mathbf{y} \mid \boldsymbol{\beta}) = \prod_i \sum_k \frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}} \, N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\boldsymbol{\beta})}, \frac{\sigma_k}{\rho_i(\boldsymbol{\beta})}\right)$$

(12)

Unified segmentation from the GMM upwards… Aside: MRF Priors (A&F, Gaser’s VBM5 toolbox)

$$\frac{\gamma_k b_{ik}}{\sum_j \gamma_j b_{ij}} \rightarrow \frac{b_{ik} \exp\!\left(\sum_{m=1}^{K} \gamma_{km} r_{mi}\right)}{\sum_j b_{ij} \exp\!\left(\sum_{m=1}^{K} \gamma_{jm} r_{mi}\right)}$$

where $r_{mi}$ is the probable number of neighbours in class m, for voxel i. (45)

Unified segmentation from the GMM upwards… Spatially deformable priors (inverse of normalisation)
Simple idea! Optimisation is tricky…

$$b_{ik} \rightarrow b_{ik}(\boldsymbol{\alpha})$$

Prior for voxel i depends on some general transformation model, parameterised by α. SPM8b’s model is affine + DCT warp, with ~1000 DCT basis functions. (13)

$$p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}) = \prod_i \sum_k \frac{\gamma_k b_{ik}(\boldsymbol{\alpha})}{\sum_j \gamma_j b_{ij}(\boldsymbol{\alpha})} \, N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\boldsymbol{\beta})}, \frac{\sigma_k}{\rho_i(\boldsymbol{\beta})}\right)$$

$$p(\mathbf{y}) \rightarrow p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})$$

(14, pretty much)

Unified segmentation from the GMM upwards… Objective function so far…

$$\log p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = \sum_i \log p(y_i \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = \sum_i \log \left[ \sum_k \frac{\gamma_k b_{ik}(\boldsymbol{\alpha})}{\sum_j \gamma_j b_{ij}(\boldsymbol{\alpha})} \, N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\boldsymbol{\beta})}, \frac{\sigma_k}{\rho_i(\boldsymbol{\beta})}\right) \right]$$

(14, I think...)
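The objective above can be evaluated on toy data as in the following minimal sketch. It is an illustration only, not SPM’s implementation: the names `normal_pdf` and `log_likelihood` are hypothetical, σ is treated as a standard deviation, the bias values ρ_i(β) are passed in directly rather than built from DCT basis functions, and the TPM values b_ik(α) are assumed to have been warped and sampled already.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a univariate Gaussian with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(y, rho, b, gamma, mu, sigma):
    """Sum over voxels of log sum_k prior_ik * N(y_i | mu_k / rho_i, sigma_k / rho_i).

    y[i]    : intensity at voxel i
    rho[i]  : bias field value at voxel i (assumed precomputed)
    b[i][k] : warped TPM value for class k at voxel i (assumed precomputed)
    """
    total = 0.0
    for y_i, rho_i, b_i in zip(y, rho, b):
        norm = sum(g * b_ij for g, b_ij in zip(gamma, b_i))
        p_i = sum(
            (g * b_ik / norm) * normal_pdf(y_i, m / rho_i, s / rho_i)
            for g, b_ik, m, s in zip(gamma, b_i, mu, sigma)
        )
        total += math.log(p_i)
    return total


# Hypothetical two-voxel, two-class example with no bias (rho = 1 everywhere).
y = [1.0, 3.0]
rho = [1.0, 1.0]
b = [[0.9, 0.1], [0.1, 0.9]]
gamma = [1.0, 1.0]
sigma = [0.5, 0.5]

ll_good = log_likelihood(y, rho, b, gamma, [1.0, 3.0], sigma)  # means match the data
ll_bad = log_likelihood(y, rho, b, gamma, [3.0, 1.0], sigma)   # means swapped
print(ll_good, ll_bad)
```

With the class means matched to the intensities that the priors favour, the log-likelihood is higher than with the means swapped, which is the behaviour the optimisation relies on.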
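Unified segmentation from the GMM upwards… Objective function with regularisation

$$p(\mathbf{y}) \rightarrow p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma})$$

$$p(\mathbf{y}, \boldsymbol{\alpha}, \boldsymbol{\beta} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) \, p(\boldsymbol{\alpha}) \, p(\boldsymbol{\beta})$$

$$p(\boldsymbol{\alpha}) = N(0, C_\alpha), \qquad p(\boldsymbol{\beta}) = N(0, C_\beta)$$

$C_\alpha$: $\boldsymbol{\alpha}^T C_\alpha^{-1} \boldsymbol{\alpha}$ gives the deformation’s bending energy. Assumes priors independent.

$$F = \log p(\mathbf{y}, \boldsymbol{\alpha}, \boldsymbol{\beta} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) = \log p(\mathbf{y} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\sigma}, \boldsymbol{\gamma}) + \log p(\boldsymbol{\alpha}) + \log p(\boldsymbol{\beta})$$

(15,16)

Unified segmentation from the GMM upwards… Optimisation approach
Maximising:

$$F = \sum_i \log \left[ \sum_k \frac{\gamma_k b_{ik}(\boldsymbol{\alpha})}{\sum_j \gamma_j b_{ij}(\boldsymbol{\alpha})} \, N\!\left(y_i \,\middle|\, \frac{\mu_k}{\rho_i(\boldsymbol{\beta})}, \frac{\sigma_k}{\rho_i(\boldsymbol{\beta})}\right) \right] + \log p(\boldsymbol{\alpha}) + \log p(\boldsymbol{\beta})$$

with respect to {α, β, μ, σ, γ} is very difficult… Iterated Conditional Modes is used: this alternately optimises certain sets of parameters, while keeping the rest fixed at their current best solution.

Unified segmentation from the GMM upwards… Optimisation approach
• EM used for mixture parameters
• Levenberg-Marquardt (LM) used for bias and warping parameters
– Note unified segmentation model with Gaussian assumptions has a “least-squares like” log(objective), making it ideal for Gauss-Newton or LM optimisation
• Local opt, so starting estimates must be good
– May need to manually reorient troublesome scans

Unified segmentation from the GMM upwards… Optimisation approach
• Repeat until convergence (E is taken here as −F, so minimising E maximises F)…
– Hold γ, μ, σ² and α constant, and minimise E w.r.t. β
• Levenberg-Marquardt strategy, using dE/dβ and d²E/dβ²
– Hold γ, μ, σ² and β constant, and minimise E w.r.t. α
• Levenberg-Marquardt strategy, using dE/dα and d²E/dα²
– Hold α and β constant, and minimise E w.r.t. γ, μ and σ²
• Expectation Maximisation
[Figure from C. Gaser. Note ICM steps]

Results of the Generative model
Key flaw: lack of neighbourhood correlation (“whiteness” of noise)
Motivates (H)MRF priors, which should encourage contiguous tissue classes
(Note, MRF prior is not equivalent to smoothing each resultant tissue segment, but differences in eventual SPMs may be minor…)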
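The Expectation Maximisation step of the alternating scheme above (updating γ, μ and σ with α and β held fixed) can be sketched for the simplest case. This is a hedged illustration of EM for a plain one-dimensional Gaussian mixture, with no tissue probability maps and no bias field, so it is not the update used inside unified segmentation; the function names, toy intensities and starting values are all hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a univariate Gaussian with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_step(y, gamma, mu, sigma):
    """One EM update of mixing proportions, means and standard deviations."""
    n_classes = len(gamma)
    # E-step: responsibilities q[i][k] = P(c_i = k | y_i) under current parameters.
    q = []
    for y_i in y:
        joint = [gamma[k] * normal_pdf(y_i, mu[k], sigma[k]) for k in range(n_classes)]
        total = sum(joint)
        q.append([j / total for j in joint])
    # M-step: responsibility-weighted proportions, means and variances.
    new_gamma, new_mu, new_sigma = [], [], []
    for k in range(n_classes):
        n_k = sum(q_i[k] for q_i in q)
        m_k = sum(q_i[k] * y_i for q_i, y_i in zip(q, y)) / n_k
        v_k = sum(q_i[k] * (y_i - m_k) ** 2 for q_i, y_i in zip(q, y)) / n_k
        new_gamma.append(n_k / len(y))
        new_mu.append(m_k)
        # Small floor (an assumption of this sketch) keeps the sd away from zero.
        new_sigma.append(math.sqrt(v_k) + 1e-6)
    return new_gamma, new_mu, new_sigma


# Hypothetical intensities drawn from two well-separated "tissue classes".
y = [0.9, 1.0, 1.1, 1.05, 0.95, 2.9, 3.0, 3.1, 3.05, 2.95]
gamma, mu, sigma = [0.5, 0.5], [0.5, 3.5], [1.0, 1.0]
for _ in range(50):
    gamma, mu, sigma = em_step(y, gamma, mu, sigma)
print(gamma, mu, sigma)
```

As on the slides, this is a local optimiser: the result depends on reasonable starting estimates, and in the full model the same step would alternate with the Levenberg-Marquardt updates for β and α.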