Introduction to Genetics and Genomics

51:123
Terry Braun
1
Outline
• Basic Mendelian Genetics
– Mendel’s laws
• independent assortment
• segregation
– mitosis and meiosis
– PCR and markers
– dominant/recessive and pedigrees
• genotype and phenotype
– alleles
• Basic molecular genetics
– DNA
– RNA
– proteins
– Central Dogma
• genes and gene structure
– cells and chromosomes
Principles of Genetics, Tamarin; Human Molecular Genetics 2, Strachan and Read
2
Key Terms
• marker – a region of the genome that can often be uniquely identified and that distinguishes between individuals
• minisatellite – a type of marker that varies in length from 14 to 100 nucleotides
• microsatellite – a type of marker with a very short repeat unit (2 to 6 nucleotides), aka STRPs (short tandem repeat polymorphisms)
• polymorphism – a sequence variation
• SNP – single nucleotide polymorphism
• polymerase chain reaction (PCR) – a reaction that mimics cellular DNA replication in vitro (aka DNA amplification); invented by Kary Mullis
• DNA polymerase – an enzyme that is essential for DNA duplication (and PCR)
• primer – a short piece of DNA that is essential for starting DNA replication (and PCR)
• genotype – the genetic state of an individual (typically represented by a marker)
3
Genetic Marker
– A genetic marker allows for the observation of the
genetic state at a particular genomic location (locus).
• A genotype is the measured state of a genetic marker.
• A tool for observing inheritance patterns (Mendel's rules and
meiosis)
• It may never be feasible to sequence every case directly, although the cost of sequencing is decreasing
– An “informative” marker is often “heterogeneous” or “polymorphic” and enables the observation of the inheritance of genetic material.
4
Example -- genotypes
[Figure: two pedigrees (legend: male, female, parents, offspring). Uninformative pedigree: every individual has genotype 11. Heterogeneous pedigree: genotypes 12, 14, 34, 24.]
These labels (markers) are a measure of the genetic state of each individual.
Recall from the "Rule of Segregation" that offspring get one allele of each gene from each parent.
Markers are not genes, but they are regions on chromosomes, so they segregate the same way during meiosis.
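A minimal sketch of why a heterogeneous marker is informative and a homozygous one is not. The parental genotypes 1/2 and 3/4 are an assumption made for illustration, and the function names are hypothetical.

```python
from itertools import product

def transmissions(father, mother):
    """List every (allele from father, allele from mother) pair an offspring can receive."""
    return list(product(father, mother))

def is_informative(parent):
    """A parent is informative at a marker when its two alleles differ."""
    return parent[0] != parent[1]

# Uninformative mating: both parents are 1/1, so every offspring is 1/1
# and we cannot tell which parental chromosome was passed on.
uninformative = transmissions((1, 1), (1, 1))

# Informative mating (assumed parents 1/2 and 3/4): each offspring genotype
# identifies exactly which allele came from each parent.
informative = transmissions((1, 2), (3, 4))

print(sorted(set(uninformative)))
print(sorted(set(informative)))
```

With four distinct parental alleles, every offspring genotype maps back to a single pair of transmitted alleles, which is what makes the inheritance observable.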
5
What a marker looks like in the
Genome
Geneticists assign numerical values to different versions of markers
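One way to picture "different versions" of a repeat marker: the sketch below builds two hypothetical alleles that share flanking sequence but carry different numbers of a four-base repeat, then reports their lengths. The sequences, the motif, and the function name are made up for illustration.

```python
def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif found in sequence."""
    sequence, motif = sequence.lower(), motif.lower()
    best = 0
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Walk forward one motif at a time while the motif keeps matching.
        while sequence.startswith(motif, pos):
            count += 1
            pos += len(motif)
        best = max(best, count)
    return best

# Hypothetical alleles: same flanking sequence, different repeat counts.
flank_left, flank_right = "ggacaga", "ccttact"
allele_a = flank_left + "tcta" * 8 + flank_right
allele_b = flank_left + "tcta" * 10 + flank_right

for name, allele in (("allele_a", allele_a), ("allele_b", allele_b)):
    print(name, "length:", len(allele), "repeats:", longest_repeat_run(allele, "tcta"))
```

The two alleles differ in length by a whole number of repeat units, which is the kind of difference a size-based genotype label captures.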
6
Sources of Markers in the Genome
• duplications
• unequal homologous recombination
• slippage and errors during DNA
duplication
7
Duplicating DNA – to Use Markers
to "Probe" Genomes of Individuals
• DNA replication is the process that copies DNA in cells (it precedes mitosis and meiosis)
• the first step is to "unzip" the 2 strands of the double helix (DNA)
• an enzyme called DNA polymerase makes a copy by using each strand as a template
• two other components
– nucleotides (A, G, T, C), which pair A-T and G-C
– a short stretch of DNA called a "primer" (to prime the process)
8
PCR – Polymerase Chain Reaction
• PCR is a process that copies DNA exponentially
• mimics the process used by organisms, but in vitro (in a test tube)
• relies on the ability of DNA-copying enzymes to remain stable at high temperatures
• Necessary components (in a vial)
– piece of DNA to be copied
– large quantities of the four nucleotides
– large quantities of primer sequence
– DNA polymerase (Taq – named for Thermus aquaticus, a bacterium that lives in hot springs)
9
PCR Reaction
• The reaction can be carried out entirely in a vial
simply by changing the temperature
– separate the 2 strands (in DNA)
• heat to 75-90 C (167-194 F) for 30 seconds
• this "melts" the DNA apart – the base pairing comes undone
– "anneal" the primers
• primers cannot bind to the template strands at such a high temperature, so the vial is cooled to 55 C for 20 seconds
– make complete copy of template (and thus new
templates for the next cycle)
• Taq polymerase works best at 75 C (hot springs)
• nucleotides are added (complement – if template has A, T is
added, etc)
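The base-pairing rule used during synthesis can be sketched in a few lines. The template string is a hypothetical example, and the function names are illustrative only.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Base-by-base complement, in the same left-to-right order as the input."""
    return "".join(COMPLEMENT[base] for base in strand.upper())

def reverse_complement(strand):
    """The new strand read in its own 5' to 3' direction."""
    return complement(strand)[::-1]

template = "ATGGCT"  # hypothetical template
print(complement(template))
print(reverse_complement(template))
```

The reversal matters because the two strands of the double helix run in opposite directions, so the newly made strand is conventionally written as the reverse complement.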
10
PCR Reaction
• Three steps
– separation of strands
– annealing of primers to template
– synthesis of new strands
• Takes approx. 2 minutes
• Each reaction is carried out in the same vial, and after every cycle, each piece of DNA is duplicated (exponential copying)
• Cycle can be repeated 30 times (2^30 = 1,073,741,824)
• 1 million copies can be made in approximately 3 hours from a single copy of DNA
– this is why very minute samples can be used to identify individuals in crime scene investigations
• Valuable tool to multiply unique regions of DNA so they can be detected in LARGE genomes
• Note, we need to know the flanking sequence to be able to design primers
• Also, this flanking sequence needs to be unique, otherwise the reaction could amplify sequence from multiple regions of the genome
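The requirement that the flanking sequence be unique can be checked on a toy example. This is only a sketch with exact string matching: the template, the primers, and the function names are hypothetical, and real primer design also has to consider mismatches and melting temperature.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_all(template, query):
    """Return every start index at which query occurs in template (overlaps allowed)."""
    hits, start = [], template.find(query)
    while start != -1:
        hits.append(start)
        start = template.find(query, start + 1)
    return hits

def amplicon_length(template, forward, reverse):
    """Length of the product if both primers bind exactly once, else None."""
    template = template.upper()
    fwd_hits = find_all(template, forward.upper())
    # The reverse primer binds the opposite strand, so search for its reverse complement.
    rev_hits = find_all(template, reverse_complement(reverse))
    if len(fwd_hits) != 1 or len(rev_hits) != 1:
        return None  # a missing or non-unique primer site makes the product ambiguous
    end = rev_hits[0] + len(reverse)
    return end - fwd_hits[0] if end > fwd_hits[0] else None

# Hypothetical template: unique flanks around a short tandem repeat.
template = "GGATCCAAGT" + "TCTA" * 5 + "CCGTTAGCAA"
forward = "GGATCCAAGT"
reverse = reverse_complement("CCGTTAGCAA")

print(amplicon_length(template, forward, reverse))  # unique primers
print(amplicon_length(template, "TCTA", reverse))   # primer inside the repeat
```

A primer placed inside the repeat matches several positions, so the sketch refuses to report a single product size.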
11
Exponential Nature of Reaction
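The doubling arithmetic can be sketched as follows, assuming an idealized reaction in which every molecule is copied in every cycle (real reactions are less efficient and eventually plateau). The function names are illustrative.

```python
import math

def copies_after(cycles, starting_copies=1):
    """Copies present after a number of perfect doubling cycles."""
    return starting_copies * 2 ** cycles

def cycles_needed(target_copies, starting_copies=1):
    """Smallest number of perfect doubling cycles that reaches the target."""
    return math.ceil(math.log2(target_copies / starting_copies))

print(copies_after(30))          # 1073741824
print(cycles_needed(1_000_000))  # 20
```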
12
Sequencing Reaction
13
Automated
14
Components of the Reaction
15
DNA polymerase (Taq) and
Synthesis
16
17
Animations
• http://allserv.rug.ac.be/~avierstr/principles/pcrani.html
18
Markers – the early days
• Prior to the HGP, markers were (and still are)
valuable tools for observing inheritance patterns
• Investigators consumed considerable time and
resources identifying markers
• Some markers were observed in a test group of individuals to assess quality and heterogeneity.
– CEPH (Centre d'Etude du Polymorphisme Humain)
• Affymetrix SNP Chip -- 500,000 SNPs (~$450 in 2007)
19
Marker GATA50G06/D15S643,
Genotypes, and primers –
133101: 215, 197
133102: 219, 215
Genomic chr15 :
ttctgctctt
ccctactttg
ATACCTGGAG
ctcttatcct
tgtctgtcta
tctatctatc
AGGTTTTAAA
tctatctatc
atctgtcacc
ttgtctaaaa
ccgttgctgc
TCCTTGGTCC
tggggacaga
tctatctatc
tatctatcta
GCTGTTatcc
tatctatcta
tattta
tgtcagtcta
ctggctatac
ttcttgggaa
ttaaaccctt
tgtctatcta
cctacctaac
ttggggacag
tctatctatc
aatccttact
cttgtattta
aaagtattga
aaactatcta
tctatctatc
tacctaccaa
attaaaccct
tatctatcta
tgtaattgtg
ttgctggcct
ggttttaaag
tctgtctgtc
tatctatcta
aaaaGCATTG
caaccctcta
tctatctatc
57501064
57501114
57501164
57501214
57501264
57501314
57501364
57501414
http://genome.ucsc.edu/cgi-bin/hgc?hgsid=76756345&o=57501058&t=57501337&g=stsMap&i=GATA50G06&c=chr15&l=57401058&r=57601337&db=hg18&pix=800
http://research.marshfieldclinic.org/genetics/genotypingData_Statistics/genotypes_referenceIndividuals.asp
20
Genome to Gene
Sequence
Markers are typically NOT genes, however they may
reside in the genome relatively close to a gene.
22
Basis for Inheritance of Disease: Examples
[Figure: pedigree (legend: male, female, parents, offspring) with parents Aa and Aa; offspring shown include AA and Aa.]
Punnett square for Aa x Aa:
          1/2 A     1/2 a
1/2 A     1/4 AA    1/4 Aa
1/2 a     1/4 Aa    1/4 aa
P(AA) = 1/4
P(Aa) = 1/2
P(aa) = 1/4
For each offspring: A from mom/dad? a from mom/dad?
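The genotype probabilities for an Aa x Aa mating can be reproduced by enumerating the equally likely gamete pairs. This is a small sketch; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_distribution(parent1, parent2):
    """Probability of each offspring genotype, assuming each allele is passed on with chance 1/2."""
    # Sort each pair so that "aA" and "Aa" are counted as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

dist = offspring_distribution("Aa", "Aa")
for genotype, prob in sorted(dist.items()):
    print(genotype, prob)
```

Note that the Aa class pools two different events (A from the mother, or A from the father), which is exactly the ambiguity raised by the mom/dad question.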
23
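Examples
[Figure: allele numbering and a genotyped pedigree. Alleles numbered 1-6 correspond to sizes 232, 234, 236, 238, 240, 242; an alternative numbering 1-5 corresponds to sizes 234, 236, 238, 240, 242. Genotype labels shown: 234 238; 232 238; 14; 24; 3; (234, 238); (234, 238); (238, 238); (238, 232); (234, 232).]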
If you "genotype" an individual at enough markers, you can calculate the
probability of uniquely identifying an individual.
Note that the lawyers for OJ Simpson argued that "recoded" allele numbers
increased the likelihood of contamination and false identification.
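A rough sketch of how the identification calculation can work, assuming the markers are inherited independently and genotype frequencies follow Hardy-Weinberg proportions. The allele frequencies below are invented for illustration and do not come from any real population.

```python
from math import prod

def genotype_frequency(freq1, freq2, homozygous):
    """Expected population frequency of one genotype under Hardy-Weinberg."""
    return freq1 ** 2 if homozygous else 2 * freq1 * freq2

def match_probability(profile):
    """Chance that a random person shares the whole profile (independent markers assumed)."""
    return prod(genotype_frequency(f1, f2, hom) for f1, f2, hom in profile)

# Hypothetical profile: (frequency of allele 1, frequency of allele 2, homozygous?)
profile = [
    (0.10, 0.20, False),
    (0.05, 0.05, True),
    (0.15, 0.30, False),
    (0.25, 0.10, False),
]

print(match_probability(profile))
```

Each additional marker multiplies in another small number, so the chance of a coincidental match shrinks quickly as more markers are genotyped.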
24
Examples
Affected individuals
25
Examples
Dominant model
Geneticists then look for genes that mimic this pattern of inheritance
26
Example
Recessive model.
Very unlikely, because "founders"
marrying in also carry the disease,
which by definition is a rare
genetic disorder.
27
BBS4 Pedigree
28
Monogenic and Polygenic
Diseases
– monogenic (Mendelian) -- one gene
• “simple” (dominant and recessive) Mendelian inheritance
• direct correspondence between one gene mutation and one
disorder
• majority of disease genes found are monogenic
– polygenic -- (complex) multiple genes
• heterogeneity – disease caused by multiple genes
• epistasis – disease caused by multiple interacting genes
• obviously finding these is harder -- but why???
29
...Monogenic and Polygenic
Diseases
• phenocopy
• reduced penetrance
– Example -- sickle cell anemia
• “classic” recessive disorder
• defect in red blood cells (hemoglobin)
• but… fetal hemoglobin gene can “leak”
• wide range of phenotypes
30
Bardet-Biedl Syndrome (BBS)
• Obesity
– Diabetes/hypertension
• Retinopathy
• Hypogenitalism
• Polydactyly
• Mental Retardation
• Renal Anomalies
• Heart defects
Rare disorder, but common phenotypes
31
Molecular Analysis of BBS
• BBS1 - 11q13 - Novel*
• BBS2 - 16q22 - Novel*
• BBS3 - 3p13
• BBS4 - 15q21 - Novel†, TPR Repeats
• BBS5 - 2q31
• BBS6 - 20p12 - Type II Chaperonins
• BBS7 - 4q27 - Novel*
• BBS8 - 14q31 - Novel†, TPR Repeats
*,† - Some Similarity
32
33
Some Useful Properties of DNA
• fragments of DNA have a minute negative charge
– if you apply an electric field to DNA in a matrix, it will migrate to
the positive pole
• DNA is a linear molecule, but it tends to fold up (similar
to a knot)
– this bound up molecule of DNA will have a unique cross-sectional area profile that is dependent on its sequence
• Gel electrophoresis – DNA is placed in a polyacrylamide
gel and a voltage is applied
– polyacrylamide gel and pool analogy
– applied charge will cause DNA to migrate dependent on its size,
and its sequence
34
BBS4 Deletion (by PCR)
Example of Usage
[Figure: PCR results for exons 3 and 4]
35
Molecular Genetics
• Not covered
– molecular details of DNA duplication
• continuous replication, discontinuous, Okazaki
fragments, etc.
36
Genome – so now we know where
it comes from biologically – at least
most of it
• mitochondria
– organelle of eukaryotes
– number varies per cell – 10 to 10K
– the human mitochondrial genome is 16,569 nts
– mostly coding (no introns???)
– double-stranded and circular
– inherited maternally only
• consequences
– mito thought to be originally free-living bacteria
– origins (one or multiple events?)
37
Leber Optic Atrophy
• LHON
– mid-life, central vision loss
– caused by missense mutations in mtDNA
– generally familial
38
• Evolution of the mitochondrial genome and origin of eukaryotic cells
39
END
40
Another Marker?
BRCA1-A good predictive marker of drug sensitivity in breast cancer treatment?
* Mullan PB,
* Gorski JJ,
* Harkin DP.
Centre for Cancer Research and Cell Biology, Queen's University Belfast, Belfast,
Northern Ireland, BT9 7AB, United Kingdom.
There are currently only two predictive markers of response to chemotherapy for breast
cancer in routine clinical use, namely the Estrogen receptor-alpha and the HER2
receptor. The breast and ovarian cancer susceptibility gene BRCA1 is an important
genetic factor in hereditary breast and ovarian cancer and there is increasing evidence
of an important role for BRCA1 in the sporadic forms of both cancer types. Our group
and numerous others have shown in both preclinical and clinical studies that BRCA1 is
an important determinant of chemotherapy responses in breast cancer. In this review
we will outline the current understanding of the role of BRCA1 as a determinant of
response to DNA damaging and microtubule damaging chemotherapy. We will then
discuss how the known functions of this multifaceted protein may provide mechanistic
explanations for its role in chemotherapy responses.
41
Hardy-Weinberg Equilibrium
• Rule that relates allelic and genotypic frequencies in a
population of diploid, sexually reproducing individuals
if that population has random mating, large size, no
mutation or migration, and no selection
• Assumptions
– allelic frequencies will not change in a population
from one generation to the next
– genotypic frequencies are determined in a
predictable way by allelic frequencies
– the equilibrium is neutral -- if perturbed, it will
reestablish within one generation of random mating
at the new allelic frequency
• Ideal case
42
Expected allele frequencies
Deviations from distribution
may indicate special cases.
43
H-W
• f(AA) = p^2
• f(Aa) = 2pq
• f(aa) = q^2
• (p + q)^2 = p^2 + 2pq + q^2
• (p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr
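The expansion above generalizes to any number of alleles: homozygotes get the square of the allele frequency and heterozygotes get twice the product. A minimal sketch, with made-up allele frequencies and an illustrative function name:

```python
from itertools import combinations_with_replacement

def hw_genotype_frequencies(allele_freqs):
    """Expected genotype frequencies from allele frequencies under Hardy-Weinberg."""
    expected = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        pa, pb = allele_freqs[a], allele_freqs[b]
        # Homozygote: p^2. Heterozygote: 2pq.
        expected[a + b] = pa * pa if a == b else 2 * pa * pb
    return expected

two = hw_genotype_frequencies({"A": 0.7, "a": 0.3})             # hypothetical p, q
three = hw_genotype_frequencies({"A": 0.5, "B": 0.3, "C": 0.2})  # hypothetical p, q, r

print(two)
print(three)
```

In both cases the genotype frequencies sum to 1, because the allele frequencies themselves sum to 1.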
44
Use of H-W
• All other things being equal, we can "expect" that the distribution of genes in a subset of a population would be represented by the distribution of genes in the population
• Deviations from this expected distribution are evidence of selection or enrichment
• Association – when a specific variation of a gene (allele) is correlated with a phenotype (or disease, or trait) more frequently than you would expect by H-W
– also called Linkage Disequilibrium (since genes are normally in equilibrium)
Often used to evaluate validity of an assay. For example, let us say that I genotype 400 people at a marker with 2 alleles (A and B). I observe the following genotypes:
marker1: AA 36, AB 168, BB 196
marker2: AA 2, AB 37, BB 360
marker3: AA 64, AB 144, BB 192
Which marker is suspicious?
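One common way to approach this question is a chi-square goodness-of-fit test against Hardy-Weinberg expectations, sketched below for the three markers listed above. The 3.841 cutoff is the standard 5% critical value for 1 degree of freedom; the function name is illustrative, and the approximation is rough when an expected count is very small (as for the AA class of marker2, whose counts also add up to 399 rather than 400).

```python
def hw_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with H-W expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A estimated from the sample
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

markers = {
    "marker1": (36, 168, 196),
    "marker2": (2, 37, 360),
    "marker3": (64, 144, 192),
}
CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

results = {name: hw_chi_square(*counts) for name, counts in markers.items()}
for name, chi2 in results.items():
    verdict = "deviates from H-W" if chi2 > CRITICAL_1DF_05 else "consistent with H-W"
    print(f"{name}: chi-square = {chi2:.2f} ({verdict})")
```

A marker that fails this check in an unselected sample is a candidate for genotyping error, for example heterozygotes being miscalled as homozygotes.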
45
Will return to Linkage in Later
Lectures
46
47
48