Introduction to Genetics and Genomics
51:123
• Bioinformatics Techniques
• Terry Braun, Ph.D. (genetics), M.S. (EE)
• Administration
– Syllabus
– Webpage
– Text
– Office hours
[email protected]
1
Syllabus
2
Webpage
• http://pdb.eng.uiowa.edu/~tabraun/biotech/2008
• Icon:
• https://icon.uiowa.edu
Please check and make sure you have
access.
3
Textbook
“Beginning Perl for Bioinformatics,” J. Tisdall,
O’Reilly, 2001.
Orders? 39.95 vs 23.95
4
Literature Review
• This year, students will read and
assemble a presentation on a paper
– The rest of the class must read the paper,
write a question pertaining to the paper,
and submit their question
– Also, you will evaluate your peers (and
submit your evaluations)
– More details on papers to come
5
Previous year's text (2006):
Discovering Genomics, Proteomics, & Bioinformatics
second edition
by A. Malcolm Campbell and Laurie J. Heyer
• Comments
web companion:
http://www.aw-bc.com/geneticsplace/
Discovery questions (links on web companion)
I really like this text as it has so many examples and
additional links to external material. It should
allow you to perform additional learning as
needed.
This book has a biology emphasis.
6
Previous year's texts (2004)
• “Programming Perl (3rd edition),” L. Wall, T.
Christiansen, and J. Orwant, O'Reilly, 2000.
Another good one is:
• “Learning Perl,” Schwartz, O’Reilly, 2001
• These books clearly have a Perl emphasis.
7
Assumptions
• No programming background
or
• No biology or genetics background
(I'm assuming you have at least one).
• This will probably change in future versions of this
course.
• Previously there was a bias towards Linux/Unix/vi
– open source
– favorite development platform in my lab
– in this version of the course -- I will try to provide examples
from many different platforms
8
Motivation of this course
• Intro to Informatics (in CS)
• Intro to Bioinformatics (51:121)
– provides a first exposure to some available computational
techniques and resources
– however, the emphasis is on utilization
• In this course (51:123) -- I try to emphasize tools and
techniques that you would use to go about
developing your own computational resources
(software, systems, tools, etc).
• Computational Methods in Molecular Biology (51:122
-- Casavant, Scheetz, Xing)
– advanced topics
9
More Motivation
• In 2003, "bioinformatics" was not a
searchable term on "monster.com"
• 2008 -- you will find many "hits"
– 21 for July and August
10
Programming
• This course will go through the Perl language
at a fairly introductory level. However, there
are some basic concepts common to all
programming languages that are not covered
in this course.
• If you are new to programming, then you will
be responsible for the additional work of
learning basic programming concepts
• I will attempt to provide as much help as
possible
11
Editorial
• This is a fairly difficult course to teach just
because of the changes in technologies and
techniques (for example, PHP, web services,
webStart, Ruby, SNP chips, SNPlex,
pyrophosphate sequencing etc. are fairly
recent developments).
• However, whether you go on to develop your
own applications, or never write another line
of code again -- your knowledge and
understanding of this field will be a benefit to
your career.
12
Bioinformatics
What is it?
13
Many different things to many
different people?
• Generally, so broad, hard to define
• Bio . informatics =?= biology . informatics
• Informatics
– problem solving with computers (or custom
hardware) and software
– can you really use a computer without
software?
14
Bioinformatics
• Multiple dimensions upon which to
carve up bioinformatics
• One dimension
– software tool use
– software tool development
15
Examples: Tool Use
• Is amino acid 269 in LRP5 conserved in vertebrates?
• How many SNPs are in ABCA4?
• I have treated tumor cells with a new compound in an experiment and
observe 32 genes that appear to be up-regulated. Is there a
commonality between these 32 genes that is significant?
• I find a mutation in a gene in an individual with AMD. Does that gene's
protein interact with any other known proteins?
16
Examples: Tool Building
• Given a novel genome sequence, find all genes and p-genes.
• I want to design "sequence capture" probes for the exons of 40 genes that
cause RP. Obtain the exonic sequence, with at least 100 nt flanking, and 1000
nt of the promoter from the transcription start.
• I propose a new way to find disease-causing mutations in humans. I want to
only look in genes that have regions that 1) are highly conserved across
species, 2) have known functional protein domains (e.g., transmembrane
domains), and 3) have mRNA secondary structure. Is this a good idea?
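The "sequence capture" example above comes down to interval arithmetic on exon coordinates. Below is a minimal sketch of that step only, assuming 1-based inclusive coordinates. The function names (`pad_exons`, `promoter_region`) and the exon coordinates are hypothetical and used purely for illustration; fetching the actual sequence is not shown.

```python
def pad_exons(exons, flank=100):
    """Extend each (start, end) exon by `flank` nt on both sides.

    Padded exons that overlap or touch are merged so that no region
    would be targeted twice.
    """
    padded = sorted((max(1, start - flank), end + flank) for start, end in exons)
    merged = []
    for start, end in padded:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps (or abuts) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def promoter_region(tss, strand, length=1000):
    """Return the `length` nt immediately upstream of the transcription start.

    On the "+" strand upstream means smaller coordinates; on the "-"
    strand it means larger coordinates.
    """
    if strand == "+":
        return (max(1, tss - length), tss - 1)
    return (tss + 1, tss + length)


# Hypothetical exon coordinates for a made-up gene on the "+" strand.
exons = [(5000, 5200), (5350, 5500), (9000, 9100)]
print(pad_exons(exons))            # [(4900, 5600), (8900, 9200)]
print(promoter_region(5000, "+"))  # (4000, 4999)
```

The first two exons are close enough that their 100 nt flanks overlap, so they merge into a single capture region.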
17
Major Areas of Bioinformatics
• Genomics
– sequencing
– model organisms
– annotation
– discovery and understanding
– epigenomics
• Genetics
– mapping
• Disease
– discovery
– treatment/therapeutics
• Proteomics
• Genetic Engineering
– Gene therapy
• Clinic/patient/translational
– TCGA
• "Systems Biology"
– e-QTLA mapping
• New technology
– 454-sequencing
• Basic science
– model organisms
– structure and function
– discovery (number of genes, alternative
splicing, alternative transcription start,
miRNAs)
18
A Biology "Dilemma"
• There are students in high-school today
that think they like biology because it is
not quantitative
• Their surprise is that biology is
increasingly becoming quantitative.
19
Growing and Evolving Fields of
Genetics, Genomics, and
Computational Biology
• “Explosion”, “Avalanche”, “Tsunami” of data
• Complexity of data
– "genome is 106.2 feet tall" (not 555 and 1/3 feet).
– 30,000 “genes”
– structure, function unknown
– pathways (circuits) unknown
– so much of biology that we do not know
• chromosomal packaging, transcriptional regulation, post-translational
modification, evolutionary questions – introns early/late, etc.
• Stone Age of medicine for determining drugs and treatment?
• Multiple model organism genomes
• Requires the basics of genetics and biology
• Also represents an increasing need in computational expertise
20
Biomedical information tsunami
• overwhelming volume of data
• multitude of sources
Taken from Ken Buetow, NCI
21
Incredible developments in
biomedical information generation
Taken from Ken Buetow, NCI
22
Treatment of Disease
[Figure: diagram with a time axis. Labels: Disease with genetic component; Accelerated by Human Genome Project; ID genes; Diagnostics; Preventative medicine; Pharmacogenomics; Understanding basic biological defect; Gene therapy; Drug therapy.]
Adapted from Francis Collins
23
Stone Age of Medicine
[Flowchart. Boxes: Try compound on model organism; Does it work? (Yes/No); Does it harm/kill? (Yes/No); Try compound on different model organism or for other result; Seek approval, human testing, clinical trials, FDA, 16 years, millions of dollars; Done.]
Knowledge from the genome
is moving us away from this.
24
Former BME student
“The ability to program is a must in this day of technology. As data
is collected at higher and higher rates for more accuracy, the
tasks of processing data has become a must. Through CIE, I
learned the basics of C programming as well as digital to analog
and analog to digital conversions which I use consistently. My
programming capabilities have allowed me to write programs
that process data as needed for my particular experiment and
does not limit my capabilities."
The bottom line -- whether you end up in industry or academia, you
can be wildly successful without ever programming a computer.
But those with even a basic understanding will be much better
prepared for the challenges of the future.
25
Genomes
(NIH/Ensembl 2003)
• Human, Drosophila, mosquito, malaria
parasite, various microbial genomes (112),
mouse, rat, retroviruses (50 – HIV, etc.),
zebrafish, fugu, plants (Arabidopsis thaliana,
barley, corn, cotton, potato, tomato, rice,
wheat, others)
26
What’s in a Genome?
• Stone age of medicine?
– new drugs, treatments, procedures (genetic engineering), diagnosis
• New materials (drugs, proteins, enzymes embedded – contact lenses,
prosthetics, etc.)
• Tissue engineering
– organ replacement, improving rejection response, new materials
– neuronal
• Genome discoveries: RNAi, haplotype blocks, SNPs, the number of
genes, the number of pseudogenes, etc.
• Cells
• Systems/pathways
• Imaging (functional, cellular, molecular)
• New devices and technology
– automated sequencers, gene chips, SNPlex, proteomics
….Opportunity
27
First Assignment
• Obtain access to a computer
• Examples:
– home computer with Linux, Windows, or MacOS
– Lab computer with Windows, MacOS, or Linux
– CSS account
– http://css.engineering.uiowa.edu (or go to 1256 SC)
• We will utilize (or install) perl on this computer at a later time.
28
Survey
29
Questions
30
Introduction to Genetics and
Genomics
51:123
Bioinformatics Techniques
Terry Braun
31
Outline
• Basic Mendelian Genetics
– Mendel’s laws
• independent assortment
• independent segregation
– mitosis and meiosis
– dominant/recessive and pedigrees
– alleles
• Basic molecular genetics
– DNA
– RNA
– proteins
– Central Dogma
• genes and gene structure
– cells and chromosomes
Sources: Principles of Genetics, Tamarin; Human Molecular Genetics 2, Strachan
and Read
32
Outline
• Basic Genomics
– Genome
• human
• others
– molecular genetics and genomics
• clones, contigs, libraries
33
Mendelian Genetics
• Humans have 22 pairs (diploid) of
autosomal chromosomes
• plus the sex chromosomes, XX or XY
34
Genome Lexicon Overview (3 Bb)
Adenine, Thymine, Guanine, Cytosine (ATGC)
purines: A, G
pyrimidines: C, T
35
www.ensembl.org
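The purine/pyrimidine split in the lexicon is easy to express in code. This is a minimal sketch, assuming a DNA string containing only A, T, G, and C. The function name `base_composition` and the example sequence are made up for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def base_composition(seq):
    """Count each base, then total the purines and pyrimidines."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ATGC"}
    purines = sum(counts[base] for base in PURINES)
    pyrimidines = sum(counts[base] for base in PYRIMIDINES)
    return counts, purines, pyrimidines


# Made-up 12-nt sequence, for illustration only.
counts, purines, pyrimidines = base_composition("ATGGCGTACCTA")
print(counts)                # {'A': 3, 'T': 3, 'G': 3, 'C': 3}
print(purines, pyrimidines)  # 6 6
```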
Mendelian Genetics
• Rule of Segregation
– offspring receive ONE allele (genetic
material) from the pair of alleles possessed
by EACH parent (offspring receives 2 of 4
possible)
– a gamete receives only one allele from the
pair of alleles possessed by an organism
– fertilization (union of 2 gametes)
reestablishes the double number
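The rule of segregation can be sketched as a small enumeration: each parent contributes one of its two alleles, and fertilization pairs them up. This is an illustrative sketch only. The two-letter genotype strings (such as "Aa") and the function names are assumptions, not part of the course material.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """A gamete carries exactly one of the two alleles in the genotype."""
    return list(genotype)


def cross(parent1, parent2):
    """Enumerate all equally likely offspring genotypes (a Punnett square)."""
    offspring = Counter()
    for allele1, allele2 in product(gametes(parent1), gametes(parent2)):
        # Sort so that "aA" and "Aa" are counted as the same genotype.
        offspring["".join(sorted((allele1, allele2)))] += 1
    return offspring


# Two heterozygous parents: 4 possible allele pairings, 3 distinct genotypes.
print(dict(cross("Aa", "Aa")))  # {'AA': 1, 'Aa': 2, 'aa': 1}
```

The four pairings correspond to the "2 of 4 possible" alleles an offspring can receive.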
36
Mendelian Genetics
• Rule of Independent Assortment
– alleles of one gene can segregate
independently of alleles of other genes
– (Linkage Analysis relies on the violation of the
Independent Assortment Rule)
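Under independent assortment, the gametes of an individual are all combinations of one allele per gene. This is a minimal sketch assuming unlinked genes. The genotype notation and the function name `assorted_gametes` are hypothetical.

```python
from itertools import product


def assorted_gametes(genotype_by_gene):
    """List the gametes for unlinked genes.

    genotype_by_gene holds one two-allele string per gene, e.g. ["Aa", "Bb"].
    Each gamete takes one allele from every gene, independently of the others.
    """
    return ["".join(combo) for combo in product(*genotype_by_gene)]


# A double heterozygote yields four gamete types, each expected 1/4 of the time.
print(assorted_gametes(["Aa", "Bb"]))  # ['AB', 'Ab', 'aB', 'ab']
```

For linked genes the four types are no longer equally frequent, and linkage analysis exploits that departure.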
37
mitosis
• cell duplication (duplicate genetic material)
– DNA synthesis (broad bean)
• S phase (40%), Gap2 (25%), Mitosis (10%), Gap1 (25%)
• DNA duplicates in S phase (engineering marvel)
– mitosis
• prophase, metaphase, anaphase, telophase
• prophase
– chromosomes coalesce (shorten, thicken – analogous to
“packaging”)
– each “chromosome” is now a pair of sister “chromatids”
– other structural activities (formation of spindle – microtubules that
are the structural mechanism for separating sister chromatids,
centrosome divides [individual centriole])
– nuclear membrane breaks down
38
mitosis
• metaphase
– microtubules attached to centromeres
– chromosomes are lined up
• anaphase
– physical separation of sister chromatids
– microtubules consumed
• telophase
– sister chromatids are separated (end of anaphase) and
pulled to opposite poles of cell
– nuclear membranes reform
– cell constricts and separates
– chromosomes uncoil and protein synthesis resumes
39
Significance of mitosis
• two daughter cells (“clones”?)
• identical genetic material to parent cell
(assuming perfect fidelity of copy
mechanism)
40
Ended Lecture
41
meiosis
• gamete formation (halving of genetic material, diploid to haploid)
• but also duplicating (cell divides in 2 phases, meiosis I, and
meiosis II)
– prophase I
• chromosomes more spread out (relative to mitosis)
• identical pairs matched
• homologous pairs match up (called a bivalent)
• crossing over can now occur
• as chromatids shorten and thicken, they are called “tetrads”
• “chiasmata” – regions where crossover occurs
• virtually all tetrads form at least one “chiasma”
• thought to stabilize the tetrad
– see Holliday structure for “homologous recombination”
42
meiosis I
• metaphase I
– tetrads line up
– microtubules attach to sister chromosomes
• anaphase I
– sister chromatids are pulled to the same pole (in
mitosis, sister chromatids were pulled apart)
• telophase I – cell divides
• meiosis I separates maternal and paternal
chromosome pairs
• meiosis II separates sister chromatids
43
meiosis II
• metaphase II
– sister pairs line up
• anaphase II
– sister pairs are pulled apart
• telophase II
– cell constricts and divides
44
meiosis
• significance
– four cells formed
– diploid to haploid
– randomness of chromosome separation
• very large number of different chromosomal combinations
• gamete can get either maternal or paternal chromosome: 2^23 = 8,388,608 (more than 8
million) combinations
• more combinations of alleles because of recombination
• recombination – new arrangements of alleles due to either crossing
over or by independent segregation of homologous pairs
• 30,000 genes: 2^30,023 combinations
– each gamete receives only one chromosome of each homologous pair (rule of segregation)
– anaphase I – direction of separation of each tetrad is independent of the other tetrads (rule of
independent assortment)
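The "more than 8 million" figure follows from one maternal-or-paternal choice for each of the 23 chromosome pairs. This is a small sketch of that arithmetic, ignoring crossing over. The function names and the "M"/"P" labels are illustrative assumptions.

```python
import random

HAPLOID_NUMBER = 23  # chromosome pairs in a human cell


def count_chromosome_combinations(n_pairs=HAPLOID_NUMBER):
    """Two choices (maternal or paternal) per pair gives 2**n_pairs gametes."""
    return 2 ** n_pairs


def random_gamete(n_pairs=HAPLOID_NUMBER, rng=random):
    """Pick maternal ('M') or paternal ('P') for each pair, with no crossing over."""
    return [rng.choice("MP") for _ in range(n_pairs)]


print(count_chromosome_combinations())  # 8388608
print("".join(random_gamete()))         # one random string of 23 M/P letters
```

Crossing over raises the number of distinct gametes far beyond this count, because each chromosome can itself become a mix of maternal and paternal segments.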
45
Molecular Genetics
• Not covered
– chemical structure of nucleotides and DNA
– molecular details of DNA duplication
• continuous replication, discontinuous, Okazaki
fragments, etc.
46
End
47