Genetic Maps - BIOTEC - Biotechnology Center TU Dresden


Genome
Lesk, Introduction to Bioinformatics, Chapter 2
Michael Schroeder
BioTechnological Center (Biotec), TU Dresden
Organisms and cells
 All organisms consist of small cells
 The human body has approx. 6 x 10^13 cells of about 320 different types
 Cell size can vary greatly
   Human red blood cell: approx. 5 microns (0.005 mm)
   Neuron from the spinal cord: 1 m long
 Two types of organisms
   Prokaryotes, for example bacteria
   Eukaryotes: most other organisms
   Archaea: a few organisms living in hostile environments
By Michael Schroeder, Biotec, 2004
Genomes and Genes:
Not all DNA codes for genes

Organism                  Number of bp   Genes    Note
ФX-174                    5,386          10       Virus infecting E. coli
Human mitochondrion       16,569         37       Subcellular organelle
Mycoplasma pneumoniae     816,394        680      Pneumonia
Mycoplasma laboratorium   (not given)    382      Minimal genome project
Hemophilus influenzae     1,830,138      1,738    Middle ear infection
E. coli                   4,639,221      4,406
Saccharomyces cerevisiae  12.1 x 10^6    5,885    Yeast
C. elegans                95.5 x 10^6    19,099   Worm
Drosophila melanogaster   1.8 x 10^8     13,601   Fruit fly
H. sapiens                3.2 x 10^9     22,333   Human
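The point that not all DNA codes for genes can be made concrete by dividing genome size by gene count. The following is a minimal sketch that uses only the figures from the table; the function name bp_per_gene is an illustrative choice, and Mycoplasma laboratorium is left out because no genome size is listed for it.

```python
# (organism, genome size in bp, number of genes), copied from the table.
GENOMES = [
    ("PhiX-174", 5_386, 10),
    ("Human mitochondrion", 16_569, 37),
    ("Mycoplasma pneumoniae", 816_394, 680),
    ("Hemophilus influenzae", 1_830_138, 1_738),
    ("E. coli", 4_639_221, 4_406),
    ("Saccharomyces cerevisiae", 12_100_000, 5_885),
    ("C. elegans", 95_500_000, 19_099),
    ("Drosophila melanogaster", 180_000_000, 13_601),
    ("H. sapiens", 3_200_000_000, 22_333),
]


def bp_per_gene(size_bp, n_genes):
    """Average number of base pairs available per gene."""
    return size_bp / n_genes


for name, size_bp, n_genes in GENOMES:
    print(f"{name:26s} {bp_per_gene(size_bp, n_genes):12,.0f} bp per gene")
```

With these figures E. coli has roughly one gene per 1,000 bp, while the human genome has more than 140,000 bp per gene, so most human DNA cannot be coding sequence.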
Genetic information
 Genes, as discovered by Mendel, were entirely abstract entities
 Chromosomes are physical entities, and their banding patterns serve as landmarks
 Chromosomes are numbered by size (1 = largest)
 Human chromosome: p (petit = short) arm and q (queue) arm, e.g. 15q11.1
 DNA sequences = hereditary information in physical form
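A band label such as 15q11.1 can be split into its parts mechanically. The sketch below assumes the usual cytogenetic reading (chromosome, arm, one region digit, one band digit, optional sub-band after the dot), which goes slightly beyond what the slide states; the name parse_band and the returned dictionary keys are illustrative.

```python
import re

# chromosome (1-22, X or Y), arm (p or q), region digit, band digit,
# optional sub-band after a dot. The region/band split is an assumption
# based on standard cytogenetic notation, not something the slide spells out.
BAND_RE = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_band(label):
    """Split a band label like '15q11.1' into its components."""
    match = BAND_RE.match(label)
    if match is None:
        raise ValueError(f"not a recognised band label: {label!r}")
    chrom, arm, region, band, sub_band = match.groups()
    return {
        "chromosome": chrom,
        "arm": "short" if arm == "p" else "long",
        "region": int(region),
        "band": int(band),
        "sub_band": sub_band,  # None when the label has no sub-band
    }


print(parse_band("15q11.1"))
```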
Locating genes
 The disease cystic fibrosis has been known since the Middle Ages; the relevant protein was not
 Folklore: "Children with excessive salt in their sweat, noticeable when kissing them on the forehead, were short-lived"
 Implication: chloride channel in epithelial tissues
 A search in family pedigrees identified various genetic markers (Variable Number Tandem Repeats), which narrowed the genomic region from 1-2 million bp to 300 kb
 Finally, the deletion of Phe508 (ΔF508) in the CFTR gene was identified as the cause
Chromosome
Chromosome banding pattern map
2 Types of Maps: Physical Map
Genome sequencing projects supply the DNA
sequence of each chromosome
The physical distance is the number of base
pairs that separate two genes
[Figure: physical map of a chromosome, scale 0 to 180 Mbp, with Gene A and Gene B marked on the sequence …ACTGTATGACTGGCATGGCACTGGGGCAAATGTGCACTC…]
C. Voigt, S. Ibrahim, S. Möller, P. Serrano Fernández. Non-linear map conversions. German Conference on Bioinformatics, 2003
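As a minimal sketch of the physical distance defined on this slide, the function below counts the base pairs lying strictly between two genes. The coordinate convention (1-based, inclusive start and end) and the example coordinates are assumptions made for illustration.

```python
def physical_distance_bp(gene_a, gene_b):
    """Number of base pairs strictly between two genes.

    Each gene is a (start, end) pair in 1-based inclusive coordinates
    (an assumed convention). Overlapping or adjacent genes give 0.
    """
    first, second = sorted([gene_a, gene_b])
    return max(0, second[0] - first[1] - 1)


# Hypothetical coordinates, not taken from the figure.
gene_a = (100, 200)
gene_b = (301, 400)
print(physical_distance_bp(gene_a, gene_b))
```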
2 Types of Maps: Genetic Map
• Chromosomes are carriers of genetic information
• Genetic information is linked and linearly arranged inside the chromosome
• This linkage is sometimes broken: recombination (crossing-over)
2 Types of Maps: Genetic Maps
 Genes located far from each other are more likely to be uncoupled during a crossing-over
 A Morgan is the genetic distance over which one crossing-over is expected to occur; 1 centimorgan (cM) = 0.01 Morgan
[Figure: genetic map of a chromosome, scale 0 to 110 cM, with ticks at 0, 2, 70, 78 and 110 cM]
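Map distance in Morgans and the observed fraction of recombinant offspring are not the same quantity, because multiple crossovers between two loci can cancel out. One common way to relate them is Haldane's mapping function, which assumes crossovers occur independently. The slides do not name a mapping function, so the choice of Haldane's here is an assumption, and the function names are illustrative.

```python
import math


def haldane_recombination_fraction(d_morgan):
    """Recombination fraction r for a map distance d (in Morgans),
    under Haldane's model: r = 0.5 * (1 - exp(-2d))."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgan))


def haldane_map_distance(r):
    """Inverse: map distance in Morgans for a recombination fraction r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r)


# For short distances r is close to d: 1 cM is about 1% recombinants.
print(haldane_recombination_fraction(0.01))
# For long distances r levels off below 0.5 (loci behave as unlinked).
print(haldane_recombination_fraction(1.10))
```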
Why 2 Types of Maps?
 Historical background
 Genetic markers may be mapped in only one system (conversions needed)
 Genetic markers may be ambiguous
 Different systems provide us with complementary information (not completely redundant)
Expected Map Conversion
[Figure: expected map conversion, physical position (bps) plotted against genetic position (cM), showing a linear relationship]
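If the relationship really were linear, conversion between the two maps would be a single multiplication. The sketch below assumes a constant rate of 1,000,000 bp per cM, a frequently quoted rough average for the human genome; that value and the function names are assumptions, not figures from the slides.

```python
def cm_to_bp(pos_cm, bp_per_cm=1_000_000):
    """Convert a genetic position (cM) to a physical position (bp),
    assuming a constant conversion rate along the chromosome."""
    return pos_cm * bp_per_cm


def bp_to_cm(pos_bp, bp_per_cm=1_000_000):
    """Convert a physical position (bp) to a genetic position (cM)."""
    return pos_bp / bp_per_cm


print(cm_to_bp(78))          # position of a marker at 78 cM
print(bp_to_cm(110_000_000))  # position of a locus at 110 Mbp
```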
Observed Map Conversion
 Non-linear relationship (Yu A, et al. 2001. Nature, 409:951-3)
 Outliers
 Marker ambiguity
 Local marker density
 Inversions
[Figure: human chromosome 12, physical position (bps) plotted against genetic position (cM), compared with the linear relationship; axis labels bps / cM / cR]
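When the relationship is not linear, one simple option is to interpolate between markers whose position is known on both maps. The sketch below uses piecewise-linear interpolation; the marker coordinates are invented for illustration, and the approach assumes the anchors are free of outliers and inversions, so that both coordinates increase together. It is not necessarily the method used in the cited work.

```python
from bisect import bisect_right

# Hypothetical anchor markers as (physical position in bp, genetic position in cM),
# sorted by physical position and increasing in both coordinates.
MARKERS = [(0, 0.0), (10_000_000, 5.0), (20_000_000, 30.0), (40_000_000, 40.0)]


def bp_to_cm_piecewise(pos_bp, markers=MARKERS):
    """Interpolate a genetic position linearly between the two flanking markers."""
    xs = [bp for bp, _ in markers]
    if not xs[0] <= pos_bp <= xs[-1]:
        raise ValueError("position outside the mapped range")
    # Index of the right-hand flanking marker, clamped to the last marker.
    i = min(bisect_right(xs, pos_bp), len(xs) - 1)
    (x0, y0), (x1, y1) = markers[i - 1], markers[i]
    return y0 + (y1 - y0) * (pos_bp - x0) / (x1 - x0)


print(bp_to_cm_piecewise(15_000_000))
```

A real conversion would first have to remove or resolve ambiguous markers and inverted segments, since they break the monotonic ordering this sketch relies on.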
General Properties

Gene density and recombination
 Recombination is mostly higher in areas with a high gene
density.
Yao, et al. (2002) Proc Natl Acad Sci 99(9):6157
[Figure: human chromosome 12, physical position (bps) plotted against genetic position (cM), marking a region of high gene density and high recombination]
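Regions of high recombination show up as steep segments when genetic position is plotted against physical position. A minimal sketch of that idea computes the local rate in cM per Mbp between consecutive markers; the marker values are invented for illustration and the function name is hypothetical.

```python
def local_rates_cm_per_mb(markers):
    """Recombination rate (cM per Mbp) between each pair of consecutive markers.

    markers: list of (physical position in bp, genetic position in cM),
    sorted by physical position.
    """
    rates = []
    for (bp0, cm0), (bp1, cm1) in zip(markers, markers[1:]):
        rates.append((cm1 - cm0) / ((bp1 - bp0) / 1_000_000))
    return rates


# Hypothetical markers; the middle interval recombines five times
# more often per Mbp than its neighbours.
markers = [(0, 0.0), (10_000_000, 5.0), (20_000_000, 30.0), (40_000_000, 40.0)]
print(local_rates_cm_per_mb(markers))
```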
How to Detect Genes?
 Detection of regions similar to known coding regions from other organisms
   Gene expressed (in another organism) → mRNA → cDNA = EST (Expressed Sequence Tag)
   Search for the start of the EST
 Ab initio: derive the gene from the sequence itself
   Bacteria: easy, as genes are contiguous
   Eukaryotes: harder, because of introns and alternative splicing
Initial exon:
 Search for TATA box ~30 bp upstream,
 no in-frame stop codon,
 ends before GT splice signal
Internal exon:
 starts after AG splice signal,
 no in-frame stop codons,
 ends before GT splice signal
 Final exon followed by polyadenylation
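For the bacterial case, where genes are contiguous, the "no in-frame stop codon" rule is essentially the whole method: scan each reading frame for a start codon followed by an uninterrupted run of codons up to the first stop. The sketch below covers the forward strand only and ignores promoters and splice signals; the function name, the min_codons parameter and the test sequence are illustrative assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) index pairs of open reading frames on the forward strand.

    An ORF runs from an ATG to the first in-frame stop codon. 'start' is the
    0-based index of the A in ATG; 'end' is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


dna = "CCATGAAATTTTAGGG"  # made-up sequence for illustration
for start, end in find_orfs(dna):
    print(start, end, dna[start:end])
```

Eukaryotic gene finders have to go further and chain candidate exons bounded by AG and GT splice signals, which is where the exon rules listed above come in.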
Brent, Nat Biotech, 2007
How to detect genes:
De novo prediction
 GenScan (late 90s)
   Predicts 10% of ORFs in the human genome
   Overprediction: 45,000 genes (current estimate ~22,000)
 TwinScan (early 2000s)
   Uses alignment between the target and a related genome: detects one third of ORFs in the human genome
 N-Scan
   Includes pseudogene detection
   Predicts 20,138 genes
Applications
 Genetic diversity and anthropology
   Cheetahs are very closely related to each other, pointing to a population bottleneck 10,000 years ago
   Humans: mitochondrial DNA is passed on through the maternal line, the Y chromosome from father to son
   Variation in mitochondrial DNA in humans suggests a single maternal ancestor 140,000-200,000 years ago
   The population of Iceland (first inhabited 1100 years ago) is descended from Scandinavian males and from females from Scandinavia and the British Isles
   Basques are linguistically and genetically isolated
Evolution of Genomes
 Phylogenetic profiles
   What genes do different phyla share?
   What homologous proteins do different phyla share?
   What functions do different phyla share?
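A phylogenetic profile can be represented as a presence/absence vector of a gene across a fixed list of genomes, and the question of what is shared then becomes a set comparison. The sketch below uses invented gene names and presence data; nothing in it is taken from real genomes.

```python
# Hypothetical data: which genomes contain a homologue of each gene.
GENOME_ORDER = ["bacterium", "archaeon", "eukaryote"]
PRESENCE = {
    "geneA": {"bacterium", "archaeon", "eukaryote"},
    "geneB": {"bacterium"},
    "geneC": {"archaeon", "eukaryote"},
}


def phylogenetic_profile(gene):
    """Presence (1) or absence (0) of the gene in each genome, in GENOME_ORDER."""
    return tuple(int(genome in PRESENCE[gene]) for genome in GENOME_ORDER)


def shared_by_all():
    """Genes that have a homologue in every genome."""
    everything = set(GENOME_ORDER)
    return sorted(gene for gene, present in PRESENCE.items() if present >= everything)


for gene in PRESENCE:
    print(gene, phylogenetic_profile(gene))
print("shared by all:", shared_by_all())
```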
Shared functions of bacteria, archaea, and eucarya
 Functions shared by Haemophilus influenzae (bacteria), Methanococcus jannaschii (archaea), Saccharomyces cerevisiae (eucarya)
 Energy:
   Biosynthesis of cofactors, amino acids
   Central and intermediary metabolism
   Energy metabolism
   Fatty acids and phospholipids
   Nucleotide biosynthesis
   Transport
 Information:
   Replication
   Transcription
   Translation
 Communication and regulation
   Regulatory functions
   Cell envelope/cell wall
   Cellular processes
 Can we construct a minimal organism?
Summary
 Relation of DNA, genes and chromosomes
 Relationship of distance in Morgan and basepairs
 How to find genes in DNA
   By similarity
   Ab initio, with introns, exons, alternative splicing
 Read Lesk, chapter 2