Statistics for Microarrays
Biological background: Molecular Biology
Class web site: http://statwww.epfl.ch/davison/teaching/Microarrays/
Acknowledgements
• http://www.accessexcellence.org/AB/GG
• http://www.oup.co.uk/best.textbooks/biochemistry/genesvii
• Sandrine Dudoit, UC Berkeley Biostatistics
• Yee Hwa Yang, UC Berkeley Statistics
• Terry Speed, UC Berkeley Statistics and
WEHI, Melbourne, Australia
Two types of organisms*
* Every biological ‘rule’ has exceptions!
Timeline of Genetics Highlights
Mendelian Genetics
http://www.stg.brown.edu/webs/MendelWeb/MWtoc.html
Human Chromosomes
Human Chromosome Banding Patterns
Chromosomes and DNA
Cell Division -- Mitosis
Cell Division -- Meiosis
Crossing over and Recombination
Mitosis and Meiosis Compared
(BREAK)
DNA Structure Discovery
Nature (1953), 171:737
“We wish to suggest a structure for the salt of deoxyribose
nucleic acid (D.N.A.). This structure has novel features which
are of considerable biological interest.”
DNA
• A deoxyribonucleic acid or DNA molecule is a
double-stranded linear polymer composed of
four molecular subunits called nucleotides
• Each nucleotide comprises a phosphate group, a
deoxyribose sugar, and one of four nitrogen
bases: adenine (A), guanine (G), cytosine (C), or
thymine (T)
• The two strands are held together by weak
hydrogen bonds between complementary bases
• Base-pairing occurs according to the rule:
G pairs with C, and A pairs with T
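The pairing rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the names PAIR, complement and reverse_complement are hypothetical, and the reversal step assumes the usual convention that the two strands of a duplex run in opposite directions.

```python
# Watson-Crick pairing rule from the slide: G pairs with C, A pairs with T.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```

Any base outside A, C, G, T raises a KeyError here, which is a deliberate simplification for the sketch.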
Polymorphic DNA Tertiary Structures
DNA B-type (7BNA): Watson-Crick form
DNA A-type (140D): low water content
DNA Z-type (2ZNA): high salt concentration
Genes are linearly arranged along chromosomes
DNA Structure
(overview)
DNA Structure
The monomeric units of nucleic acids are called nucleotides.
A nucleotide is a phosphate, a sugar, and a purine (A, G) or a
pyrimidine (T, C) base.
Nucleotide Bases
Purines: Adenine (A), Guanine (G)
Pyrimidines: Thymine (T) (DNA), Cytosine (C), Uracil (U) (RNA)
Nucleotide codes
Code  Meaning               Code  Meaning
A     Adenine               W     Weak (A or T)
G     Guanine               S     Strong (G or C)
C     Cytosine              M     Amino (A or C)
T     Thymine               K     Keto (G or T)
U     Uracil                B     Not A (G or C or T)
R     Purine (A or G)       H     Not G (A or C or T)
Y     Pyrimidine (C or T)   D     Not C (A or G or T)
N     Any nucleotide        V     Not T (A or G or C)
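As a rough illustration of how these ambiguity codes are used, the sketch below checks whether a DNA sequence is compatible with a pattern written in the codes. The dictionary IUPAC and the function matches are hypothetical names chosen for this example, and U is left out on the assumption that only DNA sequences are being compared.

```python
# Each code maps to the set of DNA bases it can stand for.
# U (RNA) is omitted: this sketch assumes DNA-only input.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "R": "AG", "Y": "CT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def matches(pattern: str, sequence: str) -> bool:
    """True if every base of sequence is allowed by the code at that position."""
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, sequence))


print(matches("GAWTC", "GAATC"))  # True: W allows A
print(matches("GAWTC", "GAGTC"))  # False: W does not allow G
print(matches("RYN", "GCT"))      # True
```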
Base Pairing
Proteins
• Proteins: macromolecules composed of one
or more chains of amino acids
• Amino acids: class of 20 different organic
compounds containing a basic amino group (-NH2)
and an acidic carboxyl group (-COOH)
• The order of amino acids is determined by
the base sequence of nucleotides in the
gene coding for the protein
• Proteins function as enzymes, antibodies,
structures, etc.
Amino acid codes
Three-letter  One-letter  Name
Ala           A           Alanine
Arg           R           Arginine
Asn           N           Asparagine
Asp           D           Aspartic acid
Cys           C           Cysteine
Gln           Q           Glutamine
Glu           E           Glutamic acid
Gly           G           Glycine
His           H           Histidine
Ile           I           Isoleucine
Leu           L           Leucine
Lys           K           Lysine
Met           M           Methionine
Phe           F           Phenylalanine
Pro           P           Proline
Ser           S           Serine
Thr           T           Threonine
Trp           W           Tryptophan
Tyr           Y           Tyrosine
Val           V           Valine
Asx           B           Asn or Asp
Glx           Z           Gln or Glu
Sec           U           Selenocysteine
Unk           X           Unknown
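The two code systems above translate into each other with a simple lookup. The following sketch assumes a peptide written as three-letter codes joined by hyphens; the function name to_one_letter and the example peptide are made up for illustration.

```python
# Three-letter to one-letter amino acid codes, as listed in the table above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Asx": "B", "Glx": "Z", "Sec": "U", "Unk": "X",
}


def to_one_letter(peptide: str, sep: str = "-") -> str:
    """Convert e.g. 'Met-Val-Leu' into 'MVL'."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split(sep))


print(to_one_letter("Met-Val-Leu-Ser-Glu-Gly"))  # MVLSEG
```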
Primary Protein Structure
Multiple Levels of Protein Structure
(Protein folding)
Tertiary Structure of Sperm whale myoglobin (1MBN)
(RT)
DNA Replication
Nature (1953), 171:737
“It has not escaped our notice that the specific pairing we have
postulated immediately suggests a possible copying
mechanism for the genetic material.”
DNA Replication
• The DNA strand that is copied to form a new
strand is called a template
• In the replication of a double-stranded or
duplex DNA molecule, both original (parental)
DNA strands are copied
• When copying is finished, the two new duplexes,
each consisting of one of the original strands
plus its copy, separate from each other
(semiconservative replication)
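A toy model can make the bookkeeping of semiconservative replication concrete. In the sketch below, the Strand type, the "parental"/"new" labels, and the function names are all illustrative assumptions; strands are written aligned base by base, and strand direction is ignored for simplicity.

```python
from typing import List, NamedTuple, Tuple

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


class Strand(NamedTuple):
    bases: str
    origin: str  # "parental" or "new" (illustrative labels)


Duplex = Tuple[Strand, Strand]


def copy_strand(template: Strand) -> Strand:
    """Synthesize the complement of a template; the copy is always 'new'."""
    return Strand("".join(PAIR[b] for b in template.bases), "new")


def replicate(duplex: Duplex) -> List[Duplex]:
    """Copy both strands; each daughter duplex keeps one original strand."""
    top, bottom = duplex
    return [(top, copy_strand(top)), (copy_strand(bottom), bottom)]


parent = (Strand("GATTACA", "parental"), Strand("CTAATGT", "parental"))
for top, bottom in replicate(parent):
    print(top.bases, top.origin, "|", bottom.bases, bottom.origin)
```

Both daughter duplexes carry the same sequence as the parent, and each contains exactly one strand labeled "parental" and one labeled "new".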
Semiconservative Replication
DNA Replication, ctd
• DNA synthesis occurs in the chemical direction 5’→3’
• Nucleic acid chains are assembled from 5’ triphosphates of
deoxyribonucleosides (the triphosphates supply energy)
• DNA polymerases are enzymes that copy (replicate) DNA
• DNA polymerases require a short preexisting RNA or DNA strand
(primer) to begin chain growth. With a primer base-paired
to the template strand, a DNA polymerase adds
nucleotides to the free hydroxyl group at the 3’ end of the
primer.
• DNA replication requires assembly of many proteins (at
least 30) at a growing replication fork: helicases to unwind,
primases to prime, ligases to ligate (join), topoisomerases to
remove supercoils, RNA polymerase, etc.
DNA Replication Fork
DNA Synthesis
DNA is unwinding 
RNA
• RNA, or ribonucleic acid, is similar to DNA, but
-- RNA is single-stranded
-- the sugar is ribose rather than deoxyribose
-- uracil (U) is used instead of thymine
• RNA is important for protein synthesis and other
cell activities
• There are several classes of RNA molecules,
including messenger RNA (mRNA), transfer RNA
(tRNA), ribosomal RNA (rRNA), and other small
RNAs
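In sequence data, the U-for-T substitution is often handled as a plain string replacement. The sketch below only rewrites a sequence from one alphabet to the other; it does not model transcription itself, and the function names are hypothetical.

```python
def dna_to_rna(dna: str) -> str:
    """Write the same base sequence in the RNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")


def rna_to_dna(rna: str) -> str:
    """Inverse mapping (U becomes T)."""
    return rna.upper().replace("U", "T")


print(dna_to_rna("ATGGCTTGGTAA"))  # AUGGCUUGGUAA
```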
The Genetic Code
• DNA: sequence of four different
nucleotides
• Protein: sequence of twenty different
amino acids
• The correspondence between the four-letter
DNA alphabet and the twenty-letter
protein alphabet is specified by the
genetic code, which relates nucleotide
triplets, or codons, to amino acids
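The codon-to-amino-acid correspondence can be sketched as a lookup table. The example below builds the standard code from a compact 64-character string (codons ordered with bases U, C, A, G, third position varying fastest, "*" marking stop codons) and reads an mRNA three bases at a time. The names STANDARD_CODE and translate are hypothetical, and the sketch assumes the reading frame starts at the first base and that translation ends at the first stop codon.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for UUU, UUC, UUA, UUG, UCU, ... GGG; "*" = stop.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna: str, code: dict = STANDARD_CODE) -> str:
    """Translate codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCUUGGUAA"))  # MAW
```

Here AUG codes for Met (M), GCU for Ala (A), UGG for Trp (W), and UAA is a stop codon.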
Standard Genetic Code
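Variation of genetic codes
(- = same as T1)
Codon  T1    T2    T3    T4    T5    T6    T9    T10   T12   T13   T14   T15
CUU    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUC    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUA    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUG    Leu   -     Thr   -     -     -     -     -     Ser   -     -     -
AUU    Ile   -     -     -     -     -     -     -     -     -     -     -
AUC    Ile   -     -     -     -     -     -     -     -     -     -     -
AUA    Ile   Met   Met   -     Met   -     -     -     -     Met   -     -
AUG    Met   -     -     -     -     -     -     -     -     -     -     -
UAU    Tyr   -     -     -     -     -     -     -     -     -     -     -
UAC    Tyr   -     -     -     -     -     -     -     -     -     -     -
UAA    Stop  -     -     -     -     Gln   -     -     -     -     Tyr   -
UAG    Stop  -     -     -     -     Gln   -     -     -     -     -     Gln
AAU    Asn   -     -     -     -     -     -     -     -     -     -     -
AAC    Asn   -     -     -     -     -     -     -     -     -     -     -
AAA    Lys   -     -     -     -     -     Asn   -     -     -     Asn   -
AAG    Lys   -     -     -     -     -     -     -     -     -     -     -
UGU    Cys   -     -     -     -     -     -     -     -     -     -     -
UGC    Cys   -     -     -     -     -     -     -     -     -     -     -
UGA    Stop  Trp   Trp   Trp   Trp   -     Trp   Cys   -     Trp   Trp   -
UGG    Trp   -     -     -     -     -     -     -     -     -     -     -
AGU    Ser   -     -     -     -     -     -     -     -     -     -     -
AGC    Ser   -     -     -     -     -     -     -     -     -     -     -
AGA    Arg   Stop  -     -     Ser   -     Ser   -     -     Gly   Ser   -
AGG    Arg   Stop  -     -     Ser   -     Ser   -     -     Gly   Ser   -
T1: standard
T2: vert mt
T3: yeast mt
T4: other mt
T5: invert. mt
T6: cil. etc nuc.
T9: ech. mt
T10: eup. nuc.
T12: alt yeast nuc
T13: asc. mt
T14: flat. mt
T15: bleph. nuc.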
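One way to work with variant codes is to store each as a small set of changes applied to the standard table. The sketch below does this for T2 (vertebrate mitochondrial), where AGA and AGG become stop codons, AUA codes for Met, and UGA codes for Trp. The names STANDARD, VERT_MT_CHANGES, variant_code and translate are hypothetical, and the example mRNA is invented to show the difference.

```python
from itertools import product

BASES = "UCAG"
# Standard code (T1) for UUU, UUC, UUA, UUG, UCU, ... GGG; "*" = stop.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Differences of T2 (vertebrate mitochondrial) from the standard code.
VERT_MT_CHANGES = {"AGA": "*", "AGG": "*", "AUA": "M", "UGA": "W"}


def variant_code(changes: dict) -> dict:
    """Return a copy of the standard code with the given codons overridden."""
    code = dict(STANDARD)
    code.update(changes)
    return code


def translate(mrna: str, code: dict) -> str:
    """Translate codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = "AUAUGAAGA"
print(translate(mrna, STANDARD))                       # I
print(translate(mrna, variant_code(VERT_MT_CHANGES)))  # MW
```

Under the standard code the message stops at UGA after a single Ile; under T2 the same bases read Met, Trp, and then stop at AGA.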
Protein Synthesis