
Chip-Assisted Analysis of Epithelial Transporter Proteins
Pascale Anderle, ISREC Lausanne
Overview
1. Introduction
   1. Transporters in the context of the whole genome
   2. Classification of transporters
   3. Introduction to microarray technology
   4. Overview of various microarray platforms
2. Strategies to select transporter genes and example studies
   1. First example: Custom Array
      1. Evaluation of transporter and channel genes in the intestine
      2. Use of Hidden Markov Models
      3. Summary
   2. Second example: Affymetrix Platform
      1. Genomic profiling of membrane transporters in the intestine
      2. Gene Ontology Project
      3. Importance of annotation
      4. Isrec Ontologizer
      5. Conclusions
   3. Acknowledgment
Transporters in the Context of the Whole Genome
Venter et al., Science 2001
Membrane Transporter Proteins: Classification
Membrane Transport Proteins: Specific Carriers and Selective Channels
Primary Active Transport
• ATP-powered pumps
• ATPases: P-type, F-type and ABC-type ATPases (ABC transporters)
• Energy is derived from the hydrolysis of ATP to ADP, which liberates energy from the high-energy phosphate bond.
Facilitated Diffusion
• Uniporters, e.g. GluT1-5
• Transport of substances across the membrane by means of uniporters. Transport runs from an area of higher concentration to one of lower concentration. It is passive transport, powered by the potential energy of a concentration gradient, and does not require the expenditure of metabolic energy.
Secondary Active Transport
• Symporters, e.g. hPept1
• Antiporters, e.g. SLC18A1*
• Uses energy from another source: a secondary diffusion gradient set up across the membrane using another ion. Because this secondary diffusion gradient is initially established using an ion pump, as in primary active transport, the energy is ultimately derived from the same source, ATP hydrolysis.
*Monoamine transporter, carrier of doxorubicin
http://tcdb.ucsd.edu/tcdb
http://lab.digibench.net/transporter/
Introduction to Microarray Technology
Probes: Oligomers, PCR products
Spotting: Photolithography, Printing
Physical support: Glass slide, nylon membrane
Sample preparation and hybridization:
• cRNA vs. cDNA
• Single-labeling vs. dual-labeling
• Fluorescence vs. radioactivity
Affymetrix: Short oligo chip, single labeling
cDNA chip: Oligos or PCR products, dual-labeling
Different Microarray Platforms
Definition of biological questions
Experimental design
Custom array
PCR products
Oligomers
Commercial array
Short oligos: Affymetrix
Long oligos: Agilent
Chip preparation
Probe design
Probe preparation
Printing
Sample preparation
cRNA/cDNA Labeling
Hybridization
Scanning
Data Acquisition and Data Analysis
Evaluation of Transporter and Channel Genes in the Intestine
Goals:
• Caco-2 cells: Differentiated cells vs. undifferentiated cells
• Small intestinal and colonic tissues vs. Caco-2 cells
Anderle et al., Pharm Res 2003
Probe Design for Custom Array
HMM-based search:
Keywords, seed sequences → Search Pfam HMM db → HMM models → Run hmmsearch against GenPept db (Transporters: 670, Channels: 263) → Filter genes (human only, set cut-off, eliminate redundant genes) (Transporters: 316, Channels: 151) → Multiple alignment and selection of representative genes → Run Pick70
EST-based search for putative new genes:
Protein seed sequence → Converged PSI-Blast → Core protein family → Blast human EST db → EST nucleotide sequence → Remove vector and characterized ESTs → Assemble contigs → 236 contigs and singlets → Putative new genes (Contigs: 156) → Run Pick70
Pick70 settings: Tm = 70, Palindrome, Uniqueness = 15 bp
Further probes on the array:
• Positive Controls: 9
• Negative Controls: 3
• Controls (diff. Oligos): 9
• RGS: 75
• FGF/RGF-like: 7
• ADAM family: 18
Brown et al. AAPS PharmSci. 2003
Anderle et al. Pharm Res. 2003
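The Pick70 settings on this slide (Tm = 70, palindrome check, uniqueness over 15 bp) can be illustrated with a minimal sketch. This is not the Pick70 program itself: the function names, the salt-adjusted Tm approximation, the 0.05 M sodium default and the 8-base self-complementarity window are assumptions chosen only for illustration.

```python
import math

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def melting_temp(oligo, sodium_molar=0.05):
    """Rough Tm for long oligos (assumed approximation, not Pick70's):
    81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N."""
    n = len(oligo)
    gc_percent = 100.0 * sum(base in "GC" for base in oligo) / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 675.0 / n

def kmers(seq, k):
    """Set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def has_self_complement(oligo, k):
    """True if some k-mer of the oligo also occurs in its reverse complement."""
    return bool(kmers(oligo, k) & kmers(reverse_complement(oligo), k))

def pick_oligo(target, others, length=70, tm_target=70.0, unique_k=15, stem_k=8):
    """Return (start, oligo) for the window whose Tm is closest to tm_target,
    skipping windows that share a unique_k-mer with any other sequence or that
    contain a self-complementary stretch; None if no window qualifies."""
    background = set()
    for seq in others:
        background |= kmers(seq, unique_k)
    best, best_diff = None, None
    for start in range(len(target) - length + 1):
        oligo = target[start:start + length]
        if kmers(oligo, unique_k) & background:
            continue  # not unique over unique_k bases
        if has_self_complement(oligo, stem_k):
            continue  # palindrome / hairpin risk
        diff = abs(melting_temp(oligo) - tm_target)
        if best_diff is None or diff < best_diff:
            best, best_diff = (start, oligo), diff
    return best
```

With these assumed defaults, a 70-mer at 50 % GC content comes out at roughly 70.8, which is why the sodium term is kept in the approximation.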
Differentiation of Caco-2 cells
[Figure: M vs. A plots (M-values against A-values) over time, for the comparisons 5 days vs. 5 days, 5 days vs. 1 week, 5 days vs. 2 weeks and 5 days vs. 3 weeks]
Summary
Differentiation of Caco-2 cells:
• During differentiation, the expression pattern changes
• Up- and down-regulation usually < 2-fold
• Significant changes from 5 days to 1 week and from 1 week to 2 weeks
• No significant changes between 2 weeks and 3 weeks
• Genes in general related to ion homeostasis
• No major differences between flasks and filters (except GLUT3)
• Typical small intestinal transporters not especially up-regulated in differentiated cells
Comparison Tissue vs. Caco-2 Cells:
• Changes more pronounced between tissue and cell line than between undifferentiated and differentiated cells
• Tissue vs. Caco-2 cells: more ratios > 2-fold
• No trend observed that the step from undifferentiated to differentiated cells corresponds to a shift from colon-like to small intestinal-like cells
Genomic Profiling of Membrane Transporters in the Intestine
Objective:
Identification of putative segment-specific and non-segment-specific drug carriers
Study Design:
Triplicates → 3 Pools of 10 mice
Segments: Duodenum, Jejunum, Ileum, Colon
Chips: 12 x Mu74Av2, 12 x Mu74Bv2, 12 x Mu74Cv2
Gene Ontology Project
[Figure: GO output for ABCB1, showing GO terms (GO:X, GO:Y, GO:Z) at hierarchy levels L2 to L4 of the three ontologies: Cellular Component, Molecular Function, Biological Process]
Two pragmatic purposes of ontology:
1. Facilitate communication between people and organizations
2. Improve interoperability between systems
Ontologies are structured vocabularies in the form of directed acyclic graphs (DAGs) that represent a network in which each term may be a “child” of one or more “parents”.
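To make the DAG idea concrete, here is a minimal sketch of a term with more than one parent and of how a term can be traced up to a chosen hierarchical level. The miniature ontology, the term "toy drug carrier" and all function names are hypothetical; only the convention that the root sits at depth 1 follows the "Depth 2 / Depth 3" labels used later in the talk.

```python
from collections import deque

# Hypothetical miniature ontology: term -> list of its parent terms.
PARENTS = {
    "molecular_function": [],
    "binding": ["molecular_function"],
    "transporter activity": ["molecular_function"],
    "carrier activity": ["transporter activity"],
    "toy drug carrier": ["carrier activity", "binding"],  # two parents
}

def ancestors(term, parents):
    """All terms reachable by following parent links upward (term excluded)."""
    seen = set()
    queue = deque(parents[term])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents[current])
    return seen

def depths(term, parents):
    """Every depth at which a term occurs; a root has depth 1.
    A term with parents at different depths has several depths."""
    if not parents[term]:
        return {1}
    return {d + 1 for p in parents[term] for d in depths(p, parents)}

def ancestors_at_depth(term, parents, depth):
    """The term itself or its ancestors that occur at the requested depth."""
    candidates = ancestors(term, parents) | {term}
    return {t for t in candidates if depth in depths(t, parents)}
```

In this toy graph, "toy drug carrier" is reached at depth 3 (through "binding") and at depth 4 (through "carrier activity"), which is why classification at a fixed level has to consider all paths to the root.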
Annotation
Affymetrix
Representative Sequence
Representative sequence
Consensus sequence
BLAT against assembly
sequence from UCSC
Comparison with UG DB
NetAffx
Unigene
Ensembl DB
Probes
Tagger
Exact mapping to UG and RefSeq DB
Exact mapping to temp cDNA DB
SIB annotation
4 quality levels
EnsMart DB
Representative Sequence: Chosen during chip design as the sequence that is best associated with the transcribed region being interrogated.
BLAT threshold: Only records with match / Qsize >= 75% and score >= 0.70 are kept, where score = (match - mismatch - gap# x 5 - gap_size x 2) / Qsize. If a record has several mapping locations with score > 0.70, the highest-scoring one is chosen; if several mapping locations share the same highest score, all of them are kept.
EnsMart Approach: cDNA sequence plus an additional length of downstream sequence immediately following the most 3' exon. The individual probe sequences are mapped by exact matching. If more than 50 % of the probes map, the probe set is listed as a hit.
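The BLAT threshold and the EnsMart rule described above can be written out as a short sketch. The record layout (a dict per mapping location) and the function names are assumptions for illustration; only the formula and the cut-offs come from the slide.

```python
def blat_score(match, mismatch, gap_count, gap_size, qsize):
    """score = (match - mismatch - gap# x 5 - gap_size x 2) / Qsize"""
    return (match - mismatch - 5 * gap_count - 2 * gap_size) / qsize

def best_mappings(hits, min_coverage=0.75, min_score=0.70):
    """Keep locations with match/Qsize >= 75% and score >= 0.70, then return
    the highest-scoring location(s); ties at the top score are all kept.
    Each hit is a dict with keys: location, match, mismatch, gap_count,
    gap_size, qsize (a hypothetical layout)."""
    kept = []
    for hit in hits:
        coverage = hit["match"] / hit["qsize"]
        score = blat_score(hit["match"], hit["mismatch"], hit["gap_count"],
                           hit["gap_size"], hit["qsize"])
        if coverage >= min_coverage and score >= min_score:
            kept.append((score, hit["location"]))
    if not kept:
        return []
    top = max(score for score, _ in kept)
    return [location for score, location in kept if score == top]

def ensmart_hit(mapped_probes, total_probes):
    """EnsMart rule: a hit if more than 50 % of the probes map exactly."""
    return mapped_probes / total_probes > 0.5

# Hypothetical example records for one representative sequence.
example_hits = [
    {"location": "chr1:100", "match": 95, "mismatch": 5, "gap_count": 0, "gap_size": 0, "qsize": 100},
    {"location": "chr5:200", "match": 95, "mismatch": 5, "gap_count": 0, "gap_size": 0, "qsize": 100},
    {"location": "chr7:300", "match": 80, "mismatch": 0, "gap_count": 2, "gap_size": 3, "qsize": 100},
]
```

Here the first two locations tie at the top score and are both returned, while the third passes the coverage cut-off but falls below the score cut-off because of its gaps.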
Comparison of Various Annotations
[Figure: Venn diagrams comparing the NetAffx, EnsMart and Tagger annotations, with counts given separately for chips A and B. Panels: Mouse MOE A and B; Human U133 A and B]
Quality of Probe Sets
Mapped on: RefSeqs
Chip      High    Medium  Low    Undefined
HG-133A   13792   1663    1103   5657
HG-133B   3795    790     519    17473
Mu74v2A   5340    1283    1697   4102
Mu74v2B   2587    969     1190   7665
Mu74v2C   756     302     982    9828
MOE-A     12683   2395    1194   6354
MOE-B     2453    620     592    18846

Mapped on: RefSeqs, mRNAs, ESTs, HTCs
Chip      High    Medium  Low    Undefined
HG-133A   15703   1196    3983   1333
HG-133B   10096   2026    3125   7330
Mu74v2A   8015    615     2127   1665
Mu74v2B   7010    1421    2306   1674
Mu74v2C   2600    780     2555   5933
MOE-A     18070   1222    2383   951
MOE-B     11602   2376    2478   6055
Distribution: UGs per Probe Set
[Figure: log-log plot of Number of Probe Sets (1 to 100000) against Number of UniGenes (1 to 100). Series: EnsMart A, EnsMart B, Tagger A, Tagger B, NetAffx A, NetAffx B]
Distribution: Probe Sets per UG
[Figure: log-log plot of Number of UniGenes (1 to 100000) against Number of Probe Sets (1 to 100). Series: U133A, U133B, U133AB, U74Av2, U74Bv2, U74Cv2, U74ABCv2, U74ABCv3_NA, MOE430A, MOE430B, MOE430AB]
Io: Isrec Ontologizer
Selection of hierarchical level
Classification of probe sets
Classification of UniGenes
Classification of RefSeqs
Flagging of ambiguous results
Multiple probe sets per UniGene: addressed via flagging
Multiple UniGenes per probe set: addressed via quality threshold (user defined annotation)
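The two mechanisms named above can be sketched as follows. This is only an illustration under assumptions, not Io's actual logic: the status names, the rule that a UniGene is flagged when only some of its probe sets are selected, and the IDs in the example are hypothetical. The four quality levels (high, medium, low, undefined) are the ones used in the annotation slides.

```python
from collections import defaultdict

# Assumed ordering of the four annotation quality levels.
QUALITY_RANK = {"undefined": 0, "low": 1, "medium": 2, "high": 3}

def filter_by_quality(probe_quality, threshold="medium"):
    """Probe sets whose annotation quality is at or above the threshold."""
    min_rank = QUALITY_RANK[threshold]
    return {ps for ps, quality in probe_quality.items()
            if QUALITY_RANK[quality] >= min_rank}

def classify_unigenes(probe_to_ug, selected):
    """Assign a status to each UniGene from the calls of its probe sets.
    Assumed rule: 'flagged' means only some of several probe sets were selected."""
    calls_by_ug = defaultdict(list)
    for probe_set, ug in probe_to_ug.items():
        calls_by_ug[ug].append(probe_set in selected)
    status = {}
    for ug, calls in calls_by_ug.items():
        if not any(calls):
            status[ug] = "not selected"
        elif len(calls) == 1:
            status[ug] = "single"
        elif all(calls):
            status[ug] = "consistent"
        else:
            status[ug] = "flagged"
    return status

# Hypothetical probe set and UniGene identifiers.
example_map = {"ps1": "UG.A", "ps2": "UG.B", "ps3": "UG.B", "ps4": "UG.C", "ps5": "UG.C"}
example_selected = {"ps1", "ps2", "ps3", "ps4"}
```

In the example, UG.C has one selected and one unselected probe set and is therefore flagged, while UG.B is consistent across both of its probe sets.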
Io: Overview
GO Consortium
Ontology Files
Probesets
Affymetrix
Annotation Files
(Custom)
Quality Files
Results File
Io engine independent from data structure: Can classify anything hierarchical, provided well-structured files are given to the program. (E.g.: Simple extension to spotted arrays.)
Flexibility improved by a single configuration file (v0.1.2).
Io: Annotation Organization
UniGene
Tagger
UG ID
Loc2UG
Probe Set ID
RefSeq
Tagger
Loc2GO
RefSeq ID
Loc2ref
Probe sets
of interest
Quality Filter
UG ID
NetAffx
GO term
Probe Set ID
RefSeq ID
Loc2UG
Loc2GO
GO term
IO classification
Functional classification of differentially regulated UGs along the Intestine
Function
Depth 2 Depth 3
molecular function
anticoagulant activity
antifreeze activity
antioxidant activity
apoptosis regulator activity
binding
catalytic activity
cell adhesion molecule activity
chaperone activity
chaperone regulator activity
cytoskeletal regulator activity
defense/immunity protein activity
enzyme regulator activity
ice nucleation activity
molecular_function unknown
motor activity
nutrient reservoir activity
obsolete
protein stabilization activity
protein tagging activity
reg. of establishment of comp. for transf. activity
signal transducer activity
structural molecule activity
surfactant activity
toxin activity
transcription regulator activity
translation regulator activity
transporter activity
amine/polyamine transporter activity
auxiliary transport protein activity
boron transporter activity
carbohydrate transporter activity
carrier activity
channel/pore class transporter activity
drug transporter activity
electron transporter activity
group translocator activity
intracellular transporter activity
ion transporter activity
lipid transporter activity
neurotransmitter transporter activity
nitric oxide transporter activity
nucleob/nucleos/nucleot./nucl.a. transp. activity
organic acid transporter activity
organic alcohol transporter activity
oxygen transporter activity
peptide transporter activity
peptidoglycan transporter activity
permease activity
protein transporter activity
toxin transporter activity
vitamin/cofactor transporter activity
water transporter activity
triplet codon-amino acid adaptor activity
All
# UG (1/F/G) # total UG
2167 (881/1096/190) 6794
0 (0/0/0) 2
0 (0/0/0) 0
9 (5/4/0) 10
22 (11/11/0) 57
1056 (386/586/84) 3697
24 (10/12/2) 89
24 (10/14/0) 72
0 (0/0/0) 0
1 (0/1/0) 4
38 (16/17/5) 79
992 (421/477/94) 2642
68 (34/32/2) 195
0 (0/0/0) 0
0 (0/0/0) 0
16 (4/11/1) 57
0 (0/0/0) 0
109 (47/47/15) 313
0 (0/0/0) 0
0 (0/0/0) 0
0 (0/0/0) 0
272 (113/137/22) 1188
126 (47/66/13) 358
0 (0/0/0) 4
3 (3/0/0) 8
137 (51/73/13) 539
17 (3/12/2) 58
233 (100/106/27) 718
14 (9/3/2) 30
0 (0/0/0) 1
0 (0/0/0) 0
5 (3/1/1) 10
101 (45/44/12) 227
45 (19/23/3) 217
0 (0/0/0) 0
15 (7/7/1) 29
0 (0/0/0) 0
1 (0/1/0) 9
42 (18/18/6) 112
3 (0/1/2) 7
7 (2/4/1) 15
0 (0/0/0) 0
3 (2/1/0) 6
16 (10/4/2) 37
0 (0/0/0) 0
1 (0/1/0) 9
3 (1/2/0) 5
0 (0/0/0) 0
0 (0/0/0) 0
38 (13/23/2) 110
0 (0/0/0) 0
0 (0/0/0) 5
0 (0/0/0) 1
0 (0/0/0) 0
All
% of all UG
31.9
0.0
90.0
38.6
28.6
27.0
33.3
25.0
48.1
37.5
34.9
28.1
34.8
22.9
35.2
0.0
37.5
25.4
29.3
32.5
46.7
0.0
50.0
44.5
20.7
51.7
11.1
37.5
42.9
46.7
50.0
43.2
11.1
60.0
34.5
0.0
0.0
All
# UG filt.
1341 (1094/167/80) 4685
0 (0/0/0) 2
0 (0/0/0) 0
5 (4/1/0) 7
16 (13/3/0) 42
610 (486/87/37) 2522
11 (8/2/1) 59
13 (13/0/0) 50
0 (0/0/0) 0
1 (0/1/0) 3
17 (14/2/1) 47
651 (538/79/34) 1833
33 (27/3/3) 123
0 (0/0/0) 0
0 (0/0/0) 0
10 (7/3/0) 29
0 (0/0/0) 0
71 (62/2/7) 205
0 (0/0/0) 0
0 (0/0/0) 0
0 (0/0/0) 0
160 (134/19/7) 855
83 (72/7/4) 233
0 (0/0/0) 3
1 (1/0/0) 6
79 (58/15/6) 371
7 (5/1/1) 33
149 (119/14/16) 532
11 (9/1/1) 24
0 (0/0/0) 0
0 (0/0/0) 0
3 (1/1/1) 8
65 (52/6/7) 154
26 (21/4/1) 170
0 (0/0/0) 0
11 (10/0/1) 22
0 (0/0/0) 0
1 (1/0/0) 6
30 (25/1/4) 78
1 (1/0/0) 6
5 (3/0/2) 12
0 (0/0/0) 0
2 (2/0/0) 5
13 (10/2/1) 29
0 (0/0/0) 0
1 (1/0/0) 7
0 (0/0/0) 2
0 (0/0/0) 0
0 (0/0/0) 0
25 (17/3/5) 84
0 (0/0/0) 0
0 (0/0/0) 4
0 (0/0/0) 0
0 (0/0/0) 0
All
% of all filt. UG
28.6
0.0
71.4
38.1
24.2
18.6
26.0
33.3
36.2
35.5
26.8
34.5
34.6
18.7
35.6
0.0
16.7
21.3
21.2
28.0
45.8
37.5
42.2
15.3
50.0
16.7
38.5
16.7
41.7
40.0
44.8
14.3
0.0
29.8
0.0
Self Organizing Maps
[Figure: self-organizing maps across Duodenum, Jejunum, Ileum and Colon. Panels: pGEA = 0.01: All genes; pGEA = 0.01: Transporters; pGEA = 0.05: All genes; pGEA = 0.05: Transporters]
Pair-wise Comparison: M vs A Plots
2 * SD according to lowess fitting
3 * SD according to lowess fitting
M (log2 of fold change) vs. A (log2 of absolute average intensity) plots of the pair-wise comparisons of the four intestinal segments. Highlighted are genes for which a significant difference was measured between the two segments of interest and for which the annotation was of “high” or “medium” quality.
Legend: • differentially regulated genes, p (GEA) ≤ 0.05; • differentially regulated transporters, p (GEA) ≤ 0.05; • differentially regulated transporters, p (GEA) ≤ 0.01
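As a rough illustration of the M and A values and of an intensity-dependent 2 * SD or 3 * SD band, consider the sketch below. It uses the common convention A = mean of the two log2 intensities, and it replaces the lowess fit of the slides with a simple sliding window over probe sets ranked by A; both choices, the assumption that M is centred on zero, and all function names are assumptions made for illustration.

```python
import math
import statistics

def ma_values(intensity_1, intensity_2):
    """M = log2 fold change, A = mean log2 intensity, per probe set."""
    m = [math.log2(b / a) for a, b in zip(intensity_1, intensity_2)]
    a_vals = [0.5 * (math.log2(a) + math.log2(b))
              for a, b in zip(intensity_1, intensity_2)]
    return m, a_vals

def local_sd(m, a_vals, window=50):
    """SD of M among the probe sets closest in A (sliding window over the
    A-ranked list; a stand-in for a lowess-based estimate)."""
    n = len(m)
    order = sorted(range(n), key=lambda i: a_vals[i])
    sd = [0.0] * n
    for pos, idx in enumerate(order):
        lo = max(0, min(pos - window // 2, n - window))
        hi = min(n, lo + window)
        sd[idx] = statistics.pstdev(m[j] for j in order[lo:hi])
    return sd

def flag_outliers(m, a_vals, k=2.0, window=50):
    """True where |M| exceeds k times the local SD (k = 2 or 3 on the slides)."""
    sd = local_sd(m, a_vals, window)
    return [abs(mi) > k * s for mi, s in zip(m, sd)]
```

Making the SD depend on A reflects the note "GEA: SD a function of A": low-intensity probe sets scatter more, so a fixed fold-change cut-off would over-call them.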
Conclusions I
Bioinformatic aspects:
• Annotation provided by NetAffx does not capture the entire complexity of Affymetrix-based microarray experiments
• Heterogeneous representation of genes on GeneChips: 1 unique probe set ≠ 1 unique gene
• Need for coherent and comparable annotation when comparing results of microarray experiments
• Filtering of genes using an annotation quality threshold
• No significant bias in general regarding the distribution of the selected probe sets into the different molecular functions for the top hierarchical levels
• Possible influence regarding the distribution of the selected probe sets into the different molecular functions at lower hierarchical levels
• Functional classification of genes at the UniGene level and at the RefSeq level yields very similar results
• Flagged genes are ambiguous because of technical issues rather than because splice variants may be differentially expressed
Conclusions II
Biological aspects:
• About 28 % of genes with transporter activity are differentially regulated along the intestine, indicating that the majority of transporter genes are not segment-specific.
• Some transporters, however, or genes involved in transport activity*, may be used as locally specific drug targets, such as:
    • Apoa1*, Fabp1*, Xtrp3s1, CNT2 for the small intestine
    • GLUT1 (Slc2a1), the amino acid transporter B0+ (Slc6a14) and the multidrug resistance-associated protein Abcc6 for the colon
• Fabp1* might be an interesting target for absorption of fatty acid type drugs in the proximal small intestine
• The tumor suppressor gene SLC5A8 seems to be highly expressed in the more distal part of the intestine, namely the ileum and the colon
• The mRNA levels need to be quantified by quantitative RT-PCR.
• The expression of SLC34A2, Xtrp3s1, CNT2, SLC10A2, SLC5A8, GLUT1, AI648912 will be measured in the villi, FAE and crypts using LDM and quantitative RT-PCR
Acknowledgments I
UCSF/OSU
Xenoport
National Cancer Institute
Wolfgang Sadée
Vera Rakhmanova
Shoshana Brown
Katie Woodford
Noa Zerangue
John Weinstein
Kimberly Bussey
Joe DeRisi
Adam Carroll
Jingchun Zhu
Acknowledgments II
ISREC
Bioinformatics Core Facility
Nestlé
Jean-Pierre Kraehenbuhl
Martin Rumbo
Mauro Delorenzi
Thierry Sengstag
Gary Williamson
Muriel Fiaux
Robert Mansourian
David Mutch
Matthew-Alan Roberts
Swiss Institute of Bioinformatics
Philipp Bucher
Viviane Praz
Christian Iseli
Fluorescence signal of 22 probes → 1 numeric value
• MAS5 or RMA?
Normalization across chips → Comparability of chips
• Loess, quantile or others?
Statistical analysis of data → Identification of differentially regulated probe sets
• Classical ANOVA or GEA?
• GEA: SD a function of A
Clustering of genes with similar expression profiles → Identification of similarly regulated genes
• What clustering method?
• Which measurement of similarity?
Functional annotation → Identification of genes with similar functions
• Mapping to GO terms?
[Figure: GO output, showing GO terms (GO:X, GO:Y, GO:Z) at hierarchy levels L2 to L4 of the three ontologies: Cellular Component, Molecular Function, Biological Process]
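Of the normalization options raised on this slide, quantile normalization is the easiest to show in a few lines. The sketch below is a generic textbook version, not the procedure used in the study; the function name is hypothetical and ties are broken by position rather than averaged.

```python
def quantile_normalize(chips):
    """Give every chip the same intensity distribution.
    chips: list of equal-length lists, one list of probe-set signals per chip.
    Each value is replaced by the mean, across chips, of the values that hold
    the same rank on their own chip."""
    n = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    # Reference distribution: mean of the r-th smallest value over all chips.
    reference = [sum(chip[r] for chip in sorted_chips) / len(chips)
                 for r in range(n)]
    normalized = []
    for chip in chips:
        order = sorted(range(n), key=lambda i: chip[i])
        values = [0.0] * n
        for rank, idx in enumerate(order):
            values[idx] = reference[rank]
        normalized.append(values)
    return normalized
```

After this step every chip contains exactly the same set of values, only in a different order, which is what makes the chips directly comparable.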
GO Classification Programs
Name
Input
GO annotation Quality
Assessment Statistics
threshold of ambiguity
IO
PS
NetAffx,
LocusLink
GOA
Selection Classification Classification on
of level
on UG basis RefSeq basis
Comments
Yes
Yes
No
Yes
Yes
Yes
GeneBank/
SwissProt,
Trembl
Onto-Express PS and others NetAffx
No
No
Yes: z-score
No/Yes
No
No
Linked to pathway
maps
No
No
No
Yes
No
No
Included in OntoTools: OntoTranslate etc
Affymetrix GO PS
Mining Tool
NetAffx
No
No
No
No
No
No
GoMiner
HUGO
?
No
No
Yes: Fisher
Predeterm.
No
No
FatiGo
Depending on
species: e.g.
UG, SP
PS
GOA
No
No
Yes: Fisher, rel.
Enr. factor
Yes
No
No
NetAffx*
No
No
No
No
No
No
No
No
No
Prechosen
No
No
GenMapp/
MappFinder
GeneSpring
David
PS,GB,LL,Ref NetAffx, UM
Seq,UG
associations
Linked to other DBs
Linked to other DBs