L23_ABPG2014 - University of Toledo

Applications of Bioinformatics,
Proteomics, and Genomics
Small ncRNA part I: snoRNA
Alexei Fedorov
1
Post-genomic era
Human Genome Project
Completed in 2003, the Human
Genome Project (HGP) was a 13-year project coordinated by the U.S.
Department of Energy and the
National Institutes of Health. During
the early years of the HGP, the
Wellcome Trust (U.K.) became a
major partner; additional
contributions came from Japan,
France, Germany, China, and others.
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml
2
Latest investigation of collective data on
mouse and human genome
Suzuki M., Hayashizaki Y. BioEssays 26:833-843, 2004
“Mouse-centric comparative transcriptomics of protein
coding and non-coding RNAs”
         Number of coding TU   Number of non-coding TU
Mouse    17,594                15,815
Human    20,487                16,599
Transcriptional Unit (TU) – a cluster of transcripts
that contain a common core of genetic information.
3
New era of RNomics
Why did a majority of scientists ignore
this field for a long time?
Until recently, investigation of non-messenger
RNA (nmRNA) was not a popular area of
molecular biology because mutations in a single
nmRNA gene do not usually cause a severe
disease, phenotypic abnormality, or reduction of
viability. These effects are only detected under
simultaneous disruption of several nmRNA genes
4
Great variety of non-coding RNAs
EXAMPLES
Long ncRNAs
1) Xist RNA ~17,000 nt (inactivation of the second X-chromosome
in female cells)
2) Air RNA 108,000 nt (regulates expression of imprinted genes;
probably responsible for the paternal-specific type of expression)
Short ncRNAs
1) microRNA ~20-30 nt
2) snoRNA ~60-140 nt
Long vs Short ?
5
An entire issue of the journal Science (September 2, 2005) was
devoted to the roles of various ncRNAs. This issue of
Science describes the structures, functions, evolution, and
as-yet unresolved puzzles of these RNA molecules.
The bottleneck in our understanding of ncRNA lies in characterizing
and predicting the intricate shapes that these flexible molecules can
form by themselves and in complexes with other macromolecules
(proteins, DNA, RNA). While it is possible to use x-ray
crystallography and NMR spectroscopy to determine the
three-dimensional structure of certain RNA molecules, these methods
are so expensive and time-consuming that the 3D structures of
relatively few RNA molecules will ever be determined with atomic
precision. Meanwhile, many thousands of new RNA sequences are being
produced continuously by genomic sequencing efforts worldwide.
6
Ribo-gnome: The big world of small RNAs
Zamore P.D. Haley B.
In humans, ncRNAs regulate transcription of thousands of
genes, define chromatin structure, and govern genome
integrity and mRNA stability (Science, 2005, v. 309,
pp. 1519-1526). “Without these RNAs transposons jump (wreaking
havoc on the genome), stem cells are lost, brain and
muscle fail to develop, cells fail to divide for lack of
functional centromeres, insulin secretion is dysregulated,
and plants succumb to viral infection” (ibid). Therefore,
ncRNA will soon become a universal tool for medicine and
biotechnology. Human clinical trials testing RNA-based
drugs are currently under way.
7
http://www.genequantification.de/micro-rna.html
• Great WEB site for microRNAs
8
snoRNA
http://bioinf.scri.sari.ac.uk/cgi-bin/plant_snorna/home
Small nucleolar RNA, or snoRNA, is a major component
of small nucleolar ribonucleoprotein (snoRNP) particles
that are located inside the nucleolus of eukaryotes and
participate in post-transcriptional chemical modification or
processing of different RNAs, including ribosomal RNA and
small nuclear RNA. snoRNA genes are ancient: they are
found throughout the eukaryotes as well as in Archaea.
snoRNAs are inside introns!
9
Two types of snoRNA
There are two types of snoRNA - C/D-box and H/ACA-box
snoRNAs - characterized by distinct three-dimensional structures
and conserved elements.
• C/D snoRNA is associated with 2'-O-ribose methylation of
ribosomal RNAs and other substrate RNA molecules.
• H/ACA snoRNA is associated with pseudouridylation of
substrate RNA molecules.
The snoRNA itself determines the site of chemical modification:
its specific sequence (the antisense element) pairs with the
complementary segment of the substrate RNA to be modified. The
enzymatic activity belongs not to the snoRNA but to a protein
component of the snoRNP complex, such as fibrillarin in
C/D-box snoRNPs.
10
Structure of C/D-box snoRNA
Figure labels: 5' end, box C (rugauga), antisense elements, box D' (cuga),
box C' (ugauga), box D (cuga), 3' end, external stem.
11
C/D-box snoRNA sequences
(in each sequence the figure marks the boxes in the order C, D', C', D)
MBII-52 mouse
gggtcaatgatgacaaccaatgtcatgaagaaaggtgatgacataaaattcatgctcaataggattacgctgaggccc
snR38 mouse
agcctatgatggattggttatccctgtctgaagatttcagctgagggaaaatactctattctgaggctta
Yeast U50
tatctgtgatgatcttatcccgaacctgacttctgttgaaaaaaaaaagttttacggatctggcttctgagat
12
Structure of H/ACA-box snoRNA
Figure labels: 5' end, antisense elements, box H (ananna), box ACA, 3' end.
13
H/ACA-box snoRNA sequences
Cer-6 C. elegans (marked in the figure: box H and elements 1a, 1b, 2a, 2b)
5'aatgcagatgtccattacgaaaaggctctttaccttttgacgtttagttaaatttgcgaaataaaa
ttgatgtctcgaagacatgtgcttcatattttgatgctcatgttcaagatcagcaaacaaac3'
HBI-36 human (marked in the figure: box H and elements 1a, 1b, 2a, 2b)
5'cagcactgccaagtgacccattgggctccatcttgaccaactgggcatcaagcggtgcaaaagcaaatccctctc
aagctgggagagtcacaccgtgggctactcctgcatgcagctgggtacatat3'
14
Insights into the structure and function of a guide RNP
Fatica & Tollervey, Nature Struct. Biol. 10:237-239, 2003
15
2000
16
Through its unique ability to coordinate a structural water
molecule via its free N1-H, Ψ exerts a subtle but significant
“rigidifying” influence on the nearby sugar-phosphate
backbone and also enhances base stacking. These effects
may underlie the biological role of most (but perhaps not
all) of the Ψ residues in RNA.
17
18
2'-O-ribose methylation
19
Computer prediction of snoRNA
Fedorov A, Stombaugh J., Harr M.W., Yu S., Nasalean L., Shepelev V.
Computer identification of snoRNA genes using a Mammalian
Orthologous Intron Database.
Nucl. Acids Res. 2005. 33, 4578-4583.
To search for snoRNAs, use:
1) sequence motifs (C- and D-boxes)
2) specific arrangement of C- and D-boxes
(distances 40-100 nt)
3) characteristic secondary structures
(terminal stems with MFE < -8 kcal/mol)
4) conservation during evolution
(presence in human, mouse, and rat)
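The first two criteria (box motifs and their spacing) can be sketched in a few lines of Python. This is only an illustration, not the method of the cited paper: it uses the box consensus strings from the C/D-box structure slide (rugauga for box C, cuga for box D), and it assumes the 40-100 nt distance is measured from the start of box C to the start of box D, which the slide does not specify. The function name and its parameters are hypothetical. Criteria 3 and 4 need an RNA folding tool and orthologous intron data and are not covered.

```python
import re

# Box consensus motifs from the C/D-box structure slide, written as DNA.
# Lookaheads are used so that overlapping occurrences are all reported.
C_BOX = re.compile(r"(?=[AG]TGATGA)")  # rugauga
D_BOX = re.compile(r"(?=CTGA)")        # cuga


def find_cd_candidates(seq, min_dist=40, max_dist=100):
    """Return (c_start, d_start) pairs whose spacing fits the window.

    Assumption: the distance is counted from the first base of box C
    to the first base of box D (0-based positions).
    """
    dna = seq.upper().replace("U", "T")
    c_starts = [m.start() for m in C_BOX.finditer(dna)]
    d_starts = [m.start() for m in D_BOX.finditer(dna)]
    return [
        (c, d)
        for c in c_starts
        for d in d_starts
        if min_dist <= d - c <= max_dist
    ]


# MBII-52 mouse sequence from the "C/D-box snoRNA sequences" slide.
MBII_52 = (
    "gggtcaatgatgacaaccaatgtcatgaagaaaggtgatgacataaaattcatgctcaataggattacgctgaggccc"
)

if __name__ == "__main__":
    print(find_cd_candidates(MBII_52))
```

A short motif such as cuga occurs by chance in almost any intron, so a hit from this scan is only a candidate; the stem and conservation criteria do most of the filtering.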
20
Comparison of mouse and human
introns with snoRNA
21
Structure of snoRNA terminal stem
represents double-helix with mismatches
A common RNA structural feature is an internal loop called a kink turn (K-turn)
http://www.brynmawr.edu/scienceresearch/NataleeSmith.shtml
22
Examples of human snoRNA
terminal stems
• Usually the MFE value for snoRNA terminal
stems is < -10 kcal/mol
U45B
.TGTCCTACAAGGTCAATGATGTAATGGCATGT
|::||| |||||||
***|
AATGGGAGGTTCCAG---AGTC
snR38A
.AAGTCCCACAAGCCTATGATGGTTAGTTAT
|: ||| |||||
***|
GGTTTGGGAATTCGG---AGTC
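A rough way to inspect a candidate terminal stem before running a folding program is to count the base pairs between the 5' flank and the 3' flank. The sketch below is a minimal illustration under stated assumptions: the alignment is ungapped and antiparallel, the two input sequences are toy examples (not taken from the slides), and the function name is hypothetical. It distinguishes Watson-Crick pairs from G-U wobble pairs, on the assumption that these correspond to the "|" and ":" symbols in the diagrams above. It does not compute an MFE.

```python
# Watson-Crick pairs and G-U wobble pairs, written with RNA letters.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def count_stem_pairs(five_prime, three_prime):
    """Count pairs in an ungapped antiparallel alignment of two flanks.

    Base i of the 5' flank is compared with base i counted back from the
    3' end of the 3' flank. If the flanks differ in length, the extra
    bases of the longer one are ignored.
    Returns (watson_crick, wobble, mismatches).
    """
    a = five_prime.upper().replace("T", "U")
    b = three_prime.upper().replace("T", "U")[::-1]
    wc = wobble = mismatch = 0
    for x, y in zip(a, b):
        if (x, y) in WATSON_CRICK:
            wc += 1
        elif (x, y) in WOBBLE:
            wobble += 1
        else:
            mismatch += 1
    return wc, wobble, mismatch


if __name__ == "__main__":
    # Toy flanks chosen for illustration only.
    print(count_stem_pairs("GGGAC", "GUCCC"))
    print(count_stem_pairs("GGU", "GCU"))
```

Real terminal stems contain bulges and internal loops (see the "---" gaps in the diagrams), so an ungapped count underestimates pairing in such cases; a tool such as RNAcofold handles them properly.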
23
RNAcofold
http://rna.tbi.univie.ac.at/cgi-bin/RNAcofold.cgi
The Vienna RNA Servers
http://rna.tbi.univie.ac.at/
• Example 1:
aaaacccggaaa
&
ttttcgggttt
MFE = -12.90 kcal/mol
• Example 2:
AAAAAGGGGGGG
&
UUUUUUUUUUUUU
MFE = -2.70 kcal/mol
24
Homework (part 1, computational assignment)
Analyze four sequences of putative snoRNAs and
determine whether they are real functional human
snoRNAs. Provide your arguments (as many as possible).
• CHOICE 1: INTRON_11 13449_NT_026437
tttcactgtggcaactgtgatgaaagatttggtctgtatgtaatagattttattactaaatga
ggacaacagtccctctaaactgatgttgccatttaaaaa
• CHOICE 2: INTRON_5 920_NT_032977
agagatgagctgctgaatgatgatatcccactaactgagcagtcagtagttggtcctttggtt
gcatatgatgcgataattgtttcaagacgggactgatggcagctactaaagt
• CHOICE 3: INTRON_23 9499_NT_008470
ggaggacgggaggacggtgatgatctcccagtcttgttaaaggtgacacctagtcattgagtg
gcactggcctggccccaggcagccaccagcccacgtccctgcatgtcagggcttcctgaggcc
tcctgagcagca
• CHOICE 4: INTRON_3 4476_NT_005612
gactatattcaaggccatgatgatgagttcactgatactctaatgttgtaacagtgtccactt
tccataaaagtttctaagcacttattcgcaatgtccgatcttatttctgtgcatagtctgaca
gtgaattagtgaat
25
Orphan snoRNAs
Nearly one hundred orphan snoRNAs are located in
the imprinted region of chromosome 15q11-q13, which
expresses a complex transcription unit known as
IC-SNURF-SNRPN (Runte et al. 2001). Several strong yet
indirect lines of evidence indicate that IC-SNURF-SNRPN is
built from at least 148 exons that code for two proteins,
SNURF and the SmN spliceosomal protein, as well as a
segment of antisense non-coding RNA, which is
probably involved in the development of imprinted
properties of another gene (UBE3A) from the same
locus.
26
A model for regulation of imprinted gene expression in 15q11-q13
Runte, M. et al. Hum. Mol. Genet. 2001 10:2687-2700; doi:10.1093/hmg/10.23.2687
Exons (short vertical blue bars) #1-3 encode the SNURF protein; #4-10 encode the SmN protein.
snoRNAs (long vertical blue bars) are located inside introns.
27
Dysfunction of 15q11-q13 snoRNAs
causes Prader-Willi syndrome
Introns of the IC-SNURF-SNRPN transcription unit contain a
number of C/D-box snoRNAs, including 47 copies of HBII-52,
27 copies of HBII-85, 2 copies of HBII-438, and single copies of
HBII-13, HBII-436, and HBII-437. These snoRNAs have been
designated as “orphan” since they do not have canonical guiding
targets for chemical modification of rRNAs or snRNAs. The
deletion or dysfunction of the IC-SNURF-SNRPN locus causes
Prader-Willi syndrome (PWS) – a genetic disorder with complex
manifestations and separate clinical phases. Despite the
co-localization of snoRNAs with two protein-coding genes on the
same transcriptional unit, it is commonly believed that the lack of
snoRNA expression is responsible for many symptoms of PWS.
28
Sahoo T, del Gaudio D, German JR, Shinawi M,
Peters SU, Person RE, Garnica A, Cheung SW,
Beaudet AL.
Prader-Willi phenotype caused by paternal
deficiency for the HBII-85 C/D box small
nucleolar RNA cluster.
Nat Genet. 2008 Jun;40(6):719-21.
Epub 2008 May 25. PMID: 18500341
29
snoRNAs regulate alternative splicing
(serotonin receptor 5-HT2c mRNA case)
30
NAR
2011
31
LITERATURE
Bachellerie J.P., Cavaille J., Huttenhofer A.
The expanding snoRNA world.
Biochimie 84:775-790, 2002.
snoRNA database
http://www-snorna.biotoul.fr/index.php
32
RNA editing
http://www.youtube.com/watch?v=Uw1aF2UfQ_8
http://dna.kdna.ucla.edu/rna/index.aspx
Credit: Nicolle Rager, National Science Foundation
http://www.nsf.gov/news/news_images.jsp?cntn_id=103132&org=NSF
33
RNA editing (A -> I)
adenosine
inosine
34
Assignment #2
• Watch a lecture by Dr. Eli Eisenberg
http://www.youtube.com/watch?v=Uw1aF2UfQ_8
• Read about different types of RNA editing:
1) Wikipedia
2) RNA Editing Web site http://dna.kdna.ucla.edu/rna/index.aspx
• Write a one-page essay in which you do the following:
1) List all types of RNA editing
2) List all possible biological reasons for this editing
35
HOMEWORK
• Assignment #1 (see slide #25)
• Assignment #2 (see slide #35)
36