Compiling polymorphic miRNA-target interactions: the Patrocles database.
Samuel Hiard1, Xavier Tordoir2, Wouter Coppieters2, Carole Charlier2 and Michel Georges2
1 Bioinformatics and Modeling, GIGA & Department of Electrical Engineering and Computer Science – University of Liège, Sart-Tilman B28, Liège, Belgium
2 Unit of Animal Genomics, Department of Animal Production, Faculty of Veterinary Medicine & CBIG, University of Liège (B43), 20 Boulevard de Colonster, 4000-Liège, Belgium.
Abstract
Using positional cloning, we have recently identified the mutation responsible for the muscular phenotype of the Texel sheep. It is located in the 3’UTR of the GDF8 gene, a known developmental repressor of muscle growth, and creates an illegitimate target site for miRNAs expressed in the same tissue. This causes miRNA-mediated translational inhibition of mutant GDF8 transcripts, which leads to muscle hypertrophy.
We followed up on this finding by searching for common polymorphisms and mutations that affect either (i) RNAi silencing machinery components, (ii) miRNA precursors or (iii) target sites. These might likewise alter miRNA-target interactions and could be responsible for substantial differences in gene expression level.
They have been compiled in a public database (“Patrocles”: www.patrocles.org), where they are classified into (i) DNA sequence polymorphisms (DSP) affecting the silencing machinery, (ii) DSP affecting miRNA structure or expression and (iii) DSP affecting miRNA target sites. DSP from the last category were organized in four classes: destroying a target site conserved between mammals (DC), destroying a non-conserved target site (DNC), creating a non-conserved target site (CNC), or shifting a target site (S). To aid in the identification of the most relevant DSP (such as those where a target site is created in an antitarget gene), we have quantified the level of coexpression for all miRNA-gene pairs.
Analysis of the numbers of Patrocles-DSP as well as their allelic frequency distribution indicates that a substantial proportion of them undergo purifying selection. The signature of selection was most pronounced for the DC class but was significant for the DNC and CNC classes as well, suggesting that a significant proportion of non-conserved targets is truly functional.
The Patrocles database allowed for the selection of DSP that are likely to affect gene function and possibly disease susceptibility. The effect of these DSP is being studied both in vitro and in vivo.
In conclusion, Patrocles-DSP could be widespread and underlie an appreciable amount of phenotypic variation, including common disease susceptibility.
Table 1: Categories of DNA sequence polymorphisms (DSP) affecting miRNA-mediated gene regulation
miRNA
- DSP altering the sequence of the miRNA
  - Stabilizing or destabilizing the interaction with the target
- DSP altering the concentration of the miRNA
  - Copy Number Variants encompassing the pri-miRNA
  - DSP altering the transcription rate of the pri-miRNA (cis- or trans-acting)
  - DSP affecting the processing efficiency of the pri- or pre-miRNA
Target
- DSP altering miRNA recognition sites in the target (pSNP)
  - Altering existing target sites (stabilizing or destabilizing the interaction with the miRNA)
  - Creating illegitimate target sites
- DSP altering the target’s 3’UTR
  - e.g. polymorphic polyadenylation
Silencing machinery
- DSP altering the amino-acid sequence of silencing components
- Copy Number Variants encompassing silencing components
[Figure: MSTN protein blot, molecular weight in Kd]
[Figure: numbers of candidate pSNPs (Created, Destroyed, Polymorphic, Shifted) for the X, L and B target-site sets]
[Figure: Texel and Romanov sheep]
Quantifying miRNA putative target co-expression
The g+6723G-A natural polymorphism causes translational inhibition of the Texel MSTN allele by creating an illegitimate
target site for two miRNAs expressed in the same tissue; this leads to muscle hypertrophy.
Mutations in miRNAs
Initial dG =
miRNA Expression :
- Farh et al. 2005
- Compute observed frequency
- Compute expected frequency ($1 - (1 - P(\text{8nt\_match}))^{\text{UTR\_Length} - 7}$)
- Kolmogorov-Smirnov test
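The expected-frequency term above can be sketched in a few lines of Python. This is only an illustration: it treats every octamer start position in the UTR as an independent trial, and the uniform base composition used for `p_uniform` is an assumption, since the poster does not say how P(8nt_match) is estimated. The function and variable names are hypothetical.

```python
def expected_site_frequency(p_match: float, utr_length: int, k: int = 8) -> float:
    """Probability that a UTR holds at least one match to a given k-mer.

    Each of the (utr_length - k + 1) start positions is treated as an
    independent trial with success probability p_match (an assumption).
    """
    positions = utr_length - (k - 1)  # UTR_Length - 7 for octamers
    if positions <= 0:
        return 0.0
    return 1.0 - (1.0 - p_match) ** positions


# Assumed uniform base composition: P(8nt_match) = (1/4)^8
p_uniform = 0.25 ** 8
print(expected_site_frequency(p_uniform, 1000))
```

Longer UTRs offer more start positions, so the expected frequency rises with UTR length even when the per-position match probability is fixed.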
c)
10
20
30
40
U|
C
- C
A
GA
UAAUG
GAGG GCC CUCU G GUGUUCAC GCG CCUUGAUU
U
CUCC CGG GAGA C CGUAAGUG CGC GGAAUUAA
C
C^
G
A
A G
AC
CAUAU
80
70
60
50
P-value: determined by 1000 random permutations of genes + KS test
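A permutation-based Kolmogorov-Smirnov p-value could be sketched as follows. The poster does not say which two distributions are compared or how genes are permuted, so the two-sample setup below (two lists of scores whose group labels are shuffled) is an assumption, and all names are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def permutation_p_value(sample_a, sample_b, n_perm=1000, seed=0):
    """Share of label shufflings whose KS statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            extreme += 1
    # +1 correction keeps the estimate away from an impossible p = 0
    return (extreme + 1) / (n_perm + 1)
```

The default of 1000 permutations mirrors the number quoted above; the fixed seed only makes the sketch reproducible.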
[Figure: numbers of candidate pSNPs for the X, L and B target-site sets]
L = Lewis et al. 2005 : Reverse complement of (A + nucleotides 2 to 8) of the mature miRNA (miRBase)
B = Both
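As an illustration of the L definition, the sketch below derives an octamer site from a mature miRNA sequence. It assumes one reading of the shorthand above: a 7-nt match to miRNA nucleotides 2 to 8, followed by an A on the target. The function names are hypothetical, and the let-7a sequence is only an example input, not taken from the poster.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def l_type_octamer(mature_mirna: str) -> str:
    """Octamer site: match to miRNA nucleotides 2-8, then an A (assumed reading)."""
    rna = mature_mirna.upper().replace("T", "U")
    seed = rna[1:8]  # nucleotides 2-8 in 1-based numbering
    if len(seed) != 7:
        raise ValueError("mature miRNA must be at least 8 nt long")
    return reverse_complement(seed) + "A"


let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative input
print(l_type_octamer(let7a))  # CUACCUCA
```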
First try : [equation, illegible in this transcript]
mutations acting in cis or trans on the pri-miRNA promoter (or host gene) may influence
transcription rate:
Copy Number Variants (CNV) may affect the number of copies of the miRNA or the
integrity of the pri-miRNA host:
DSP in components of the RNA silencing machinery may affect its overall efficacy.
We examined 19 genes involved in miR biology for coding SNP, CNV, eQTL and allelic imbalance:
CNV encompass the Drosha and DGCR8 genes, and 6 genes carry non-synonymous mutations (Table 3)
miRNA_id         nt   allele    SNP_ID
hsa-mir-627      2    A/C       rs2620381
hsa-mir-124a-3   5    G/T       rs34059726
hsa-mir-513-1    6    -/C       rs35027589
hsa-mir-662      7    G/A       rs9745376
hsa-mir-518e     7    -/A       rs34416818
hsa-mir-125a     8    G/T       rs12975333
hsa-mir-606      10   -/A       rs34610391
hsa-mir-449b     12   A/G       rs10061133
hsa-mir-520c     13   G/C       rs7255628
hsa-mir-34a      14   C/A/T     rs35301225
hsa-mir-646      14   T/G       rs6513497
hsa-mir-560      15   -/GCGG    rs10660600
hsa-mir-568      15   T/G       rs28632138
hsa-mir-581      15   G/A       rs810917
hsa-mir-92b      17   G/C       rs12759620
hsa-mir-581      21   T/G       rs1694089
hsa-mir-608      22   C/G       rs4919510
Table 2: DSP in mature miRNA
Globally
CoExpression : [equation, illegible in this transcript]
[Figure: predicted pre-miRNA hairpin structure, nucleotide positions 10 to 80]
A first CNV map of the human genome was recently constructed (Redon et al.,
2006). We found 43 miRNAs residing in regions involved in CNV: 19 without a known host
gene and 24 in host genes that were completely (18) or partially (6) included in a CNV.
- 80% of miRNAs hosted by genes
- Deduce expression from
corresponding gene expression
- Experimental data
mutations in the pre-miRNA may affect stability or processing
efficacy:
71 SNP in the pre-miR, e.g.: *
-32.2
At least eight host genes were found amongst the differentially regulated genes reported
in these studies. An additional one shows allelic imbalance.
derived
expression
1
*
Initial dG =
We identified miRNA host genes characterized by inherited variation in expression levels,
reasoning that this might affect the cellular concentration of passenger miRNAs. We
compiled host genes influenced by both trans- and cis-acting “expression QTL” (eQTL)
identified either by linkage analysis or by association studies and host genes having
shown allelic imbalance in heterozygous individuals (reviewed by Pastinen et al., 2006;
Spielman et al., 2006).
X = Xie et al. 2005 : Putative miRNA target sites predicted by identifying
octamer motifs in 3’UTRs with unusually high motif conservation
scores (i.e. the proportion of conserved occurrences amongst all occurrences).
mutations in the mature miRNA (table 2)
6 SNP in the miR seed (yellow)
11 SNP in the mature miR (white)
For the 474 human miRNAs in Rfam (oct 2006):
- 186 host genes for 229 miR (48.3%)
- 245 miR without host gene
-40.3
[Figure panel labels: In Human / In Mouse; Conserved / Not conserved; Polymorphic]
Reduced circulating MSTN protein
in Texel (T1) vs WT (W1)
Reduction of ~1.5X
Allelic imbalance of MSTN at the mRNA level
Texel allele (A) < WT allele (G) in heterozygous animals
Reduction of >3X
Schematic representation of the MSTN gene and sequence context of the
polymorphic miRNA-MSTN interaction (left).
Muscle hypertrophy in Texel compared to wild-type Romanov sheep (right).
Gene Expression : SymAtlas (http://symatlas.gnf.org/SymAtlas/)
For specific miRNAs
How?
For a pSNP to affect function, the miRNA and its putative target need to have overlapping
expression domains. To assist in the identification of relevant pSNPs, we therefore
devised a way to quantify the degree of co-expression for miRNA-gene pairs.
Compiling candidate pSNPs
Destroyed
cDNA
genomic
DSP altering the concentration of
silencing components
Why?
Created
Nature Genetics, 2006
MWM
Target
T1
Introduction
miRNA-mediated gene silencing is emerging as a key regulator of cellular differentiation and
homeostasis to which metazoans devote a considerable amount of sequence space. This
sequence space is bound to suffer its toll of mutations, of which some will be selectively
neutral while others will be advantageous or, more often, at least slightly deleterious. DNA
sequence polymorphisms (DSP) occurring within this sequence space certainly contribute
to phenotypic variation, including disease susceptibility and agronomically important traits.
A key question is how large their contribution actually is.
DSP may affect miRNA-mediated gene regulation by perturbing core components of the
silencing machinery, by affecting the structure or expression level of miRNAs, or by
altering target sites (Table 1).
DSP in core components of the silencing machinery may affect its overall efficacy.
Mutations that drastically perturb RNA silencing will obviously be rare given their
predictably highly deleterious consequences. Yet, DSP with subtle effects on gene function
may occur. As distinct targets may be more or less sensitive to variations in miRNA
concentration or silencing efficiency, such DSP may affect some pathways more than
others. Specific miRNA-target interactions may be influenced by mutations affecting either
the miRNA or its target. On the miRNA side of the equation: (i) the sequence of the mature
miRNA may be altered, thereby either stabilizing or destabilizing its interaction with targets,
(ii) mutations in the pri- or pre-miRNA may affect stability or processing efficiency, (iii)
mutations acting in cis or trans on the pri-miRNA promoter may influence transcription rate,
and (iv) Copy Number Variants (CNV) may affect the number of copies of the miRNA or
the integrity of the pri-miRNA host. On the target side of the equation: (i) mutations may
affect functional target sites, thereby destabilizing or stabilizing the interaction with the
miRNA, (ii) mutations may create illegitimate miRNA target sites (either in the 3’UTR or
maybe even in other segments of the transcript), which will be particularly relevant if
occurring in antitargets, and (iii) mutations causing polymorphic alternative polyadenylation
may affect a gene’s content in target sites.
miR-mediated translational inhibition of the Texel
MSTN allele
gene          allele   external_id
Drosha        G/T      rs35342496
Drosha        G/C      rs12517177
Drosha        C/T      rs1559205
DGCR8         G/T      rs9606253
DGCR8         A/G      rs11546015
DGCR8         T/C      rs35569747
DGCR8         T/C      rs35987994
DGCR8         C/T      rs5748529
Exportin-5    C/T      rs34324334
Exportin-5    G/T      rs11544379
Exportin-5    C/G      rs12173786
Exportin-5    G/A      rs35794454
Exportin-5    C/T      rs1111785
Exportin-5    A/G      rs11544382
Exportin-5    C/T      rs7759854
Dicer1        C/G      rs4566088
Argonaute 1   C/T      rs12564106
Argonaute 1   A/T      rs12735796
Argonaute 1   T/C      rs12739932
Argonaute 1   G/A      rs17855789
Argonaute 1   G/C      rs12746607
Argonaute 2   C/G      rs35369360
Table 3: non-synonymous SNP in
components of the miR pathway
CoExpression distribution
of known antitargets (y axis: Nb of known antitargets)
- CoExpression of known antitarget genes
and miRNAs is quite low
- Why? This function doesn’t differentiate
moderate coexpression across all
tissues from extremely high
coexpression in one tissue
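The limitation described above can be illustrated with a toy score. The poster's own scoring function is not legible in this transcript, so the dot product over tissues used here is a hypothetical stand-in, and the expression values are made up for the example.

```python
def dot_product_score(mirna_expr, gene_expr):
    """Toy co-expression score: sum over tissues of miRNA x gene expression."""
    if len(mirna_expr) != len(gene_expr):
        raise ValueError("expression vectors must cover the same tissues")
    return sum(m * g for m, g in zip(mirna_expr, gene_expr))


# Moderate expression of both partners in all four tissues
moderate_everywhere = dot_product_score([1, 1, 1, 1], [1, 1, 1, 1])
# Strong expression of both partners in a single tissue
high_in_one_tissue = dot_product_score([2, 0, 0, 0], [2, 0, 0, 0])

print(moderate_everywhere, high_in_one_tissue)  # 4 4
```

Both profiles receive the same score, so a score of this kind cannot tell the two situations apart.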
Screen shot
CoExpression Score
Patrocles finder
Evidence for purifying selection against pSNPs of
conserved and non-conserved target sites
Why?
Why?
What is the evidence that any of the candidate pSNPs listed above truly affect gene
function and hence phenotype? Indirect evidence that a significant proportion of them
are functional can be obtained from population genetics. Indeed, pSNPs without
appreciable effect on gene function will evolve neutrally, subject only to the vagaries
of random genetic drift while pSNPs affecting gene function may undergo positive,
negative or balancing selection via their effect on phenotype. Selection may leave
distinct signatures on the level of inter-species divergence, intra-species variability,
allelic distribution and linkage disequilibrium.
How?
Patrocles is built using the public information provided by Ensembl. However, laboratories
that work on SNPs often discover new ones. A tool is therefore needed that allows these
labs to obtain information about stabilized, destabilized or illegitimate target sites.
How?
End users must provide one or two sequences for, respectively, (i) the analysis of the
presence of octamers or (ii) the comparison of the two sequences with regard to their
content in octamers. They also have the possibility to provide an alignment for each
sequence if they care about conservation.
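The two-sequence comparison described above can be sketched as follows. This is not the Patrocles finder itself: it only reports octamers present in one allele and absent from the other, ignores shifted sites and conservation, and all function names and example sequences are hypothetical.

```python
def octamer_positions(sequence, octamers):
    """Map each known octamer found in the sequence to its start positions."""
    seq = sequence.upper()
    hits = {}
    for start in range(len(seq) - 7):
        word = seq[start:start + 8]
        if word in octamers:
            hits.setdefault(word, []).append(start)
    return hits


def compare_alleles(ref_seq, alt_seq, octamers):
    """List octamers lost or gained when going from ref_seq to alt_seq."""
    ref_hits = octamer_positions(ref_seq, octamers)
    alt_hits = octamer_positions(alt_seq, octamers)
    return {
        "destroyed": sorted(set(ref_hits) - set(alt_hits)),
        "created": sorted(set(alt_hits) - set(ref_hits)),
    }


known_octamers = {"CUACCUCA"}  # illustrative target-site octamer
print(compare_alleles("AAGCUACCUCGAA", "AAGCUACCUCAAA", known_octamers))
```

With a single sequence, `octamer_positions` alone covers case (i); handing both alleles to `compare_alleles` covers case (ii).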
Screen shots
- Generation of 100 random sets of SNPs
- Processed through pipeline
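One way to compare the real data with the random SNP sets is an empirical p-value for depletion. The poster does not state which statistic was used, so the sketch below is an assumption; the function name is hypothetical and the counts are invented for illustration.

```python
from typing import Sequence


def empirical_depletion_p(observed: int, random_counts: Sequence[int]) -> float:
    """Share of random sets with a pSNP count no higher than the observed one.

    The +1 correction counts the observed data as one of the sets, so the
    estimate never reaches exactly zero.
    """
    as_low = sum(1 for count in random_counts if count <= observed)
    return (as_low + 1) / (len(random_counts) + 1)


# Invented counts for 100 random SNP sets and one real data set
random_counts = [100 + i for i in range(100)]
print(empirical_depletion_p(90, random_counts))
```

A small value indicates that the real data contain fewer pSNPs than almost all random sets, which is the pattern expected under purifying selection.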
Results:
- Fewer pSNPs in real data
- Differences between X and L
- pSNPs that destroy conserved target sites are highly underrepresented (expected)
- pSNPs that either destroy or create non-conserved target sites are also underrepresented (hence functional even if not conserved across mammals)
Acknowledgements
PAI P5/25 from the Belgian SSTC (n° R.SSTC.0135), EU “Callimir” STREP project.
C.C. is chercheur qualifié from the FNRS.