# 125
Expression profiling of peripheral blood cells for early
detection of breast cancer
J. Aarøe1 • T. Lindahl2 • S. Sæbø3 • P. Skaane4 • S. Myhre1 • T. Reiersen1 • A. Lönneborg2 • A-L. Børresen-Dale1 • P. Sharma2
1 Department of Genetics, The Norwegian Radium Hospital, N-0310 Oslo, Norway.
2 DiaGenic ASA, Oslo, Norway.
3 IKBM, University of Life Sciences, 1432 Ås, Norway.
4 Department of Surgery, Ullevål University Hospital, Oslo, Norway.
Introduction
Early detection of breast cancer is key to successful treatment and patient survival. The existing
methods for detecting breast cancer in asymptomatic patients have limitations, and there is a need to
develop more accurate and convenient methods. In a recently published study (Sharma et al.
2005), we demonstrated the potential use of gene expression profiling in peripheral blood cells
(PBC) for early detection of breast cancer. However, that study was based on a limited sample size
and on in-house manufactured macroarrays. The aim of the present study was to
investigate whether the earlier findings could be reproduced using a larger sample size
and a commercially available microarray platform.
Materials and methods
Whole blood was collected in PAX tubes from 64 females diagnosed with breast cancer and 76
females with no reported sign of the disease. Total RNA was extracted and gene expression
analyses were conducted using high-density oligonucleotide arrays (Agilent Technologies)
containing 22,000 probes. Expression data were analyzed using several statistical approaches:
Partial Least Squares Regression (PLSR) was used for model building, while a novel approach
combining double and triple cross-validation (CV) was used to identify stable and relevant
predictive genes and to estimate their prediction efficiency. A number of studies have reported
overoptimistic accuracy levels due to improper validation in which selection bias was not taken into
account (Ambroise and McLachlan, 2002). To avoid such selection bias and to obtain
unbiased estimates of accuracy, a triple CV approach was required, since the gene selection
procedure was based on an inner double CV routine (Figure 1). Based on the selected predictors,
pathway analysis was conducted (PathwayAssist, Ariadne Genomics).
Results and Discussion
We identified a set of 58 genes that correctly predicted the diagnostic class in 75% ± 7% of the
samples, including several cases of early stage cancers (stage 0 and stage I). In addition to some
gene families reported earlier, such as ribosomal genes, the identified predictive genes also
included some novel gene families. Pathway analysis identified a number of pathways based on
the identified genes. These pathways are currently being investigated more thoroughly to reveal
possible tumor-blood interactions.
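The poster does not state what the ± 7% denotes. One reading that is numerically consistent with the reported figures, offered here only as an assumption, is a 95% normal-approximation interval for a proportion estimated on all n = 64 + 76 = 140 samples:

\[
\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
= 0.75 \pm 1.96\sqrt{\frac{0.75 \times 0.25}{140}}
\approx 0.75 \pm 0.07
\]

Under this assumption the interval would run from roughly 68% to 82%. Other interpretations, such as a standard deviation across cross-validation repeats, cannot be ruled out from the text.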
In this study, factors known to significantly affect data quality, such as manufacturing lot,
hybridization and labeling time, were randomized, as we expected less noise when using
commercial arrays and protocols. Although the results show that diagnostic information relating to
early stage breast cancer can still be mined from PBC, it is important that any test intended for
breast cancer diagnosis has high prediction accuracy. We have recently conducted a pilot study
using Applied Biosystems 44K microarrays, following a design that allows efficient normalization of
the factors affecting data quality. The results show a significant improvement in prediction
accuracy. We are now reanalyzing 130 blood samples using ABI microarrays in a carefully
designed experiment.
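The contrast between improper and proper validation described under Materials and methods (Figure 1) can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it swaps PLSR for a nearest-centroid classifier, uses pure-noise data, and implements a simplified two-level scheme (outer leave-one-out, inner segments for stable gene selection) rather than the triple CV used in the study. All function names, sample sizes and parameters are hypothetical choices made for illustration.

```python
import random


def noise_data(n, p, seed):
    """Pure-noise expression matrix with alternating class labels (no real signal)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [i % 2 for i in range(n)]
    return X, y


def select_genes(X, y, rows, k):
    """Rank genes by absolute difference in class means over `rows`; return the top k."""
    pos = [i for i in rows if y[i] == 1]
    neg = [i for i in rows if y[i] == 0]

    def score(g):
        mean_pos = sum(X[i][g] for i in pos) / len(pos)
        mean_neg = sum(X[i][g] for i in neg) / len(neg)
        return abs(mean_pos - mean_neg)

    return sorted(range(len(X[0])), key=score, reverse=True)[:k]


def stable_genes(X, y, rows, k, n_segments=5):
    """Inner loop: keep only genes that reach the top-k list in every inner segment."""
    counts = {}
    for s in range(n_segments):
        kept = [r for j, r in enumerate(rows) if j % n_segments != s]
        for g in select_genes(X, y, kept, k):
            counts[g] = counts.get(g, 0) + 1
    stable = [g for g, c in counts.items() if c == n_segments]
    return stable or select_genes(X, y, rows, k)  # fall back if no gene is stable


def predict(X, y, train, genes, test_row):
    """Nearest-centroid classification of one sample using the selected genes."""
    def centroid(label):
        rows = [i for i in train if y[i] == label]
        return [sum(X[i][g] for i in rows) / len(rows) for g in genes]

    x = [X[test_row][g] for g in genes]
    d1 = sum((a - b) ** 2 for a, b in zip(x, centroid(1)))
    d0 = sum((a - b) ** 2 for a, b in zip(x, centroid(0)))
    return 1 if d1 < d0 else 0


def loocv_accuracy(X, y, k, proper):
    """Leave-one-out accuracy with gene selection inside (proper) or outside (improper) the loop."""
    all_rows = list(range(len(X)))
    # Improper: genes are chosen once using class labels of ALL samples, test sample included.
    leaked = stable_genes(X, y, all_rows, k)
    correct = 0
    for t in all_rows:
        train = [i for i in all_rows if i != t]
        # Proper: genes are re-selected from the training samples only.
        genes = stable_genes(X, y, train, k) if proper else leaked
        correct += predict(X, y, train, genes, t) == y[t]
    return correct / len(X)


X, y = noise_data(n=30, p=300, seed=0)
improper = loocv_accuracy(X, y, k=10, proper=False)
proper = loocv_accuracy(X, y, k=10, proper=True)
print(f"improper validation accuracy: {improper:.2f}")
print(f"proper validation accuracy:   {proper:.2f}")
```

Because the data contain no class signal, any accuracy well above chance in the improper arm comes from selection bias alone, which is the effect Ambroise and McLachlan (2002) describe. The proper arm repeats gene selection for every held-out sample, which costs more computation but keeps the test sample out of the selection step.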
Conclusions
• This larger study supports our previous finding that breast cancer affects gene expression patterns
in PBC during early stages of disease development.
• A blood-based gene expression test can potentially be developed for early detection of breast
cancer.
Ongoing and future studies
• Validate the relevancy of identified predictive genes by TaqMan RT-PCR.
• Try to understand the underlying biology causing the gene expression changes in blood cells
of breast cancer patients.
• Develop a simple, objective and accurate gene expression based diagnostic tool for early
detection of breast cancer.
Figure 1
[Figure 1: Improper versus proper validation. Improper validation: variable selection uses class info on all samples (full variable set), followed by cross-validation, model selection, and classification of test samples. Proper validation: variable selection uses class info on training samples only (jackknife significant genes; stable and common genes selected across segments), followed by cross-validation, model selection, and classification of the held-out test sample; repeated for all 140 samples.]
References
1. Sharma et al. (2005). Breast Cancer Research 7(5): R634-644.
2. Ambroise C, McLachlan GJ (2002). Proc. Natl. Acad. Sci. USA 99: 6562-6566.
Acknowledgement
The present study was supported by The National Programme for Research in Functional Genomics in Norway (FUGE) in the Research
Council of Norway.
The 97th AACR Annual Meeting, Washington DC, USA, 1-5 April 2006