gene - Ministerios Probe
The Myth of Junk DNA
Dr. Raymond G. Bohlin
Fellow, Discovery Institute
Probe Ministries
Non-Protein Coding DNA
2001 – 65,000 mRNAs, but only 4% from exons
2002 – ENCODE found 11,655 non-protein-coding RNAs
2005 – most of mammalian DNA is transcribed
2008 – both strands used in transcription and frequently from overlapping segments
Evolutionary predictions
If a sequence is non-functional, then over time the sequence should degrade.
If a sequence is functional, then the sequence should be conserved by natural selection.
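The two predictions above are usually checked by comparing aligned sequences from two species and asking how many positions still match. A minimal sketch in Python, assuming the sequences are already aligned and equal in length; the function name `percent_identity` and the toy sequences are hypothetical and not taken from any cited study:

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of aligned positions where both sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Toy, made-up sequences (not real genomic data).
conserved = percent_identity("ACGTACGTAC", "ACGTACGTAC")
diverged = percent_identity("ACGTACGTAC", "ACTTACGAAC")
print(conserved, diverged)
```

A high value is read as conservation and a low value as degradation; real analyses also need an alignment step and a neutral baseline, both of which this sketch leaves out.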
Non-Protein-coding DNA
2005 – non-coding regions hundreds of nucleotides long are identical in humans and mice.
Such ultraconserved regions (UCRs) regulate developmentally important functions.
This is not expected by evolution!
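Stretches like the ones described above can be located by scanning two aligned sequences for the longest run of identical bases. A rough sketch, assuming a gap-free alignment of equal length; `longest_identical_run` and the example strings are hypothetical illustrations, not data from the studies mentioned:

```python
def longest_identical_run(seq_a: str, seq_b: str) -> int:
    """Length of the longest stretch of consecutive matching positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    best = current = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Extend the current run on a match, reset it on a mismatch.
        current = current + 1 if a == b else 0
        best = max(best, current)
    return best


# Toy, made-up sequences: mismatches at positions 4 and 10.
run = longest_identical_run("ACGTTTGACA", "ACGATTGACT")
print(run)
```

On real data the run length would be compared against a chosen threshold, which is an analysis decision rather than something the slide specifies.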
Introns
Introns are not just inert spacers between exons
2005 – intronic sequence is highly conserved among humans, mice, rats, dogs and chickens – likely functional
The mammalian thyroid receptor gene produces two variant proteins with opposite effects – splicing is regulated by an intron.
Co-expressed loci are clustered together in the nucleus, sometimes to “create” genes
[Figure: loops from chromosomes 5, 21 and 2 meeting in a nuclear compartment with concentrated transcription factors]
Pseudogenes
A pseudogene is a gene that closely resembles a functional gene but appears to be a useless leftover
Pseudogenes as defined above would be predicted by evolution but are difficult to explain under intelligent design (ID)
The human genome may have as many as 2000 pseudogenes
Pseudogenes
Some pseudogenes appear to suppress expression of the functional gene.
The pseudogene can be transcribed, and this transcript binds to the mRNA sequence of the functional gene, thus blocking translation (“RNA interference”).
Transcribed pseudogenes serve as “perfect decoys” for RNA-degrading enzymes, thus enhancing expression.
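The binding step described above depends on base-pair complementarity: a transcript can pair with an mRNA where one is the reverse complement of the other. A minimal sketch, assuming perfect Watson-Crick pairing only (no G-U wobble, no mismatches); the helper names and the RNA strings are hypothetical:

```python
# Map each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def can_pair(antisense: str, mrna: str) -> bool:
    """True if `antisense` is perfectly complementary to a stretch of `mrna`."""
    return reverse_complement(antisense) in mrna.upper()


# Toy, made-up sequences (not a real gene or pseudogene).
mrna = "AUGGCCAUUGUAAUGGGCCGC"
antisense = reverse_complement("GCCAUUGUA")
print(antisense, can_pair(antisense, mrna))
```

Real antisense regulation tolerates imperfect pairing and depends on cellular machinery, so this only illustrates the complementarity idea.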
Repetitive Sequences
About half of the mammalian genome consists of various types of repetitive sequences.
Long Interspersed Nuclear Elements – LINEs
Short Interspersed Nuclear Elements – SINEs
Endogenous Retroviruses – ERVs
Overview of LINEs
LINEs and SINEs have different structural arrangements. The major LINE in the human genome is L1. This sequence:
Is found throughout Mammalia but is largely taxon-specific
Is variously truncated at the 5’ end: ranges from 6-8 kb to a few hundred bp in length
Has a biased chromosomal distribution: AT-rich chromosome bands and the X chromosome
[Figure: L1 structure. Labels: species-specific regulatory region; ORF1; ORF2 (reverse transcriptase and endonuclease); 3’ UTR; G-dense Pu:Py element; A-rich ‘tail’]
Chimp- vs. Human-Specific L1s*
Counts since divergence, 5-6 million years ago
Element       Chimp   Human
L1Hs(Ta)          0     271
L1 nonTa        210     252
L1Pa2           476     490
*Mills, R.E. et al. 2006. Recently mobilized transposons in the human and chimpanzee genomes. Am. J. Hum. Genet. 78: 671-679.
Remember the layout of a mammalian gene? Many human gene folders are bordered by species-specific repertoires of L1s.
[Figure: “Gene” 1 through “Gene” 5 with their RNA outputs, flanked on both sides by L1s]
Almost forty percent of human nuclear matrix attachment elements are L1 sequences.
Overview of SINEs
The major SINE in the human genome is Alu. Unlike LINE-1, Alu (and other SINEs) do not encode enzymes for their mobilization. This sequence:
Is primate-specific; subfamilies are distributed in a taxonomically hierarchical manner (as with LINE-1)
Is ~300 bp in length; consists largely of two monomers (with sequence differences)
Has a biased genomic distribution: GC-rich chromosome bands
[Figure: Alu structure. Labels: Monomer A; central A-stretch; Monomer B; 31 bp insert; A-rich ‘tail’]
Chimp- vs. Human-Specific SINEs*
Counts since divergence, 5-6 million years ago
Element       Chimp   Human
other Alu       233   1,167
AluS             50     263
AluYa5           10   1,709
AluYb8            9   1,290
AluY            360     484
AluYc1          979     356
AluYg6            1     261
SVA (SINE)      396     864
*Mills, R.E. et al. 2006. Recently mobilized transposons in the human and chimpanzee genomes. Am. J. Hum. Genet. 78: 671-679.
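The per-subfamily SINE counts above are easier to compare once they are totaled per species. A small sketch, assuming each subfamily on the slide lists the chimp count first and the human count second; the variable names are hypothetical, and the numbers are copied from the slide rather than re-derived from Mills et al.:

```python
# (chimp-specific, human-specific) insertion counts as listed on the slide.
alu_counts = {
    "other Alu": (233, 1167),
    "AluS": (50, 263),
    "AluYa5": (10, 1709),
    "AluYb8": (9, 1290),
    "AluY": (360, 484),
    "AluYc1": (979, 356),
    "AluYg6": (1, 261),
}
sva_counts = (396, 864)

# Sum each column across all Alu subfamilies.
chimp_alu = sum(chimp for chimp, _ in alu_counts.values())
human_alu = sum(human for _, human in alu_counts.values())

# Subfamilies where the chimp lineage has more insertions than the human one.
chimp_heavy = [name for name, (c, h) in alu_counts.items() if c > h]

print(chimp_alu, human_alu, round(human_alu / chimp_alu, 2), chimp_heavy)
```

Under that reading the human lineage carries roughly three times as many species-specific Alu insertions overall, with AluYc1 as the one subfamily running the other way.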
Seemingly random aspects of chromosome sequence arrangement are not random. A case in point involves endogenous retroviruses (ERVs):
A. Human ERVs contribute 51,197 promoter elements that initiate transcription at various stages (Conley et al., Bioinformatics 24: 1563-1567, 2008).
B. Mouse ERVs are highly expressed at the 2-cell embryo stage (and are the earliest to be transcribed in the zygote) and are essential for ontogenesis (Kigami et al., Biology of Reproduction 68: 651-654, 2003).
ERVs
In humans, ERVs help regulate blood cell production and fat metabolism.
ERVs also regulate gene expression in the gastrointestinal tract, mammary glands, and testes.
The ERV-derived protein syncytin is required for the fusion of fetal and maternal cells in the placenta.
Although less than 2% of genomic DNA in many vertebrates (e.g., mammals) can be placed in the traditional “gene” category, nearly all sequences are transcribed in a cell- and tissue-specific manner.
DNA as Computer
Information carried by DNA is bidirectional, multi-layered, and interleaved.
Repetitive elements format and punctuate the information at different scales.
Cells can write codes onto non-coding DNA, so phenotype is not always equal to genotype.
“metaprogramming” – Cornell Conf.