Bioinformatics,
Computational Biology
— An Introduction
“…the most wondrous map ever produced by
humankind” -- Bill Clinton
DNA
Post-Genome Era
Why does a small variation make a BIG DIFFERENCE?
The difference between you and a chimpanzee is ~1.24%
The difference between you and Maggie is ~0.1%
Genetics: From DNA to population
Source: GSK
Introduction – Gene History
1865 Mendel: Inheritance is carried by discrete units (later called genes).
Mendel’s work was largely forgotten until its rediscovery in 1900.
1944 Genes were shown to be made of DNA (deoxyribonucleic acid).
1953 James Watson and Francis Crick: Double-helical structure of DNA.
Introduction – Gene History
(Cont.)
1990 The Human Genome Project started.
1995 The first free-living organism to be sequenced: Haemophilus influenzae.
1998 Celera Genomics joined the human genome sequencing effort.
2000 The human DNA sequence draft was
completed (published in 2001).
Animal cell (nucleus, cytoplasm, cell membrane)
DNA is located inside the cell nucleus.
DNA Sequence
DNA Length
The total length of human DNA is about 3 × 10^9 (3 billion) base pairs.
Only about 1% to 1.5% of the DNA sequence codes for proteins.
Number of human genes: 30,000 to 40,000 (conclusion from the Human Genome Project).
The originally expected number was about 100,000.
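As a rough worked example, the percentages quoted on these slides can be converted into base-pair counts. This is only a back-of-the-envelope sketch: the constant and function names are illustrative, and the inputs are the approximate slide figures (3 × 10^9 base pairs, 1% to 1.5% coding, ~0.1% and ~1.24% differences), not precise measurements.

```python
# Back-of-the-envelope arithmetic using the approximate figures from the slides.
GENOME_BP = 3 * 10**9  # approximate total length of human DNA in base pairs


def percent_to_bp(percent: float, total_bp: int = GENOME_BP) -> int:
    """Convert a percentage of the genome into a (rounded) base-pair count."""
    return round(total_bp * percent / 100)


coding_low = percent_to_bp(1.0)    # lower end of the 1% to 1.5% coding range
coding_high = percent_to_bp(1.5)   # upper end of the coding range
person_diff = percent_to_bp(0.1)   # ~0.1% difference between two people
chimp_diff = percent_to_bp(1.24)   # ~1.24% difference from a chimpanzee

print(f"coding sequence: {coding_low:,} to {coding_high:,} bp")
print(f"person vs. person: about {person_diff:,} bp")
print(f"person vs. chimpanzee: about {chimp_diff:,} bp")
```

Even the "small" 0.1% figure corresponds to millions of base pairs, which is one way to read the earlier question about small variation and big difference.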
DNA Double Helix
DNA/RNA Nucleotide Molecules
A nucleotide consists of:
A five-carbon sugar (deoxyribose in DNA, ribose in RNA)
A phosphate group
One of four nitrogenous bases (A, G, C, and T in DNA or U in RNA)
Backbone of DNA and RNA
Watson-Crick Base Pairs
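Watson-Crick pairing (A with T, G with C) means one strand determines the other. The sketch below shows this as a lookup table; the names `PAIRS`, `complement`, and `reverse_complement` are illustrative, and the code assumes the input contains only the letters A, C, G, and T.

```python
# Watson-Crick pairing in DNA: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


print(complement("ATGCCG"))          # TACGGC
print(reverse_complement("ATGCCG"))  # CGGCAT
```

The reverse complement is the form usually needed in practice, because the two strands of the double helix run in opposite directions.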
DNA Double Helix
From DNA to RNA to Protein
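The flow from DNA to RNA to protein can be illustrated with a minimal sketch. It assumes the input is the coding strand of DNA, uses the standard genetic code, and ignores real-world details such as introns, reading-frame detection, and alternative codon tables. The function names and the example sequence are made up for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Coding-strand DNA to mRNA: the same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon ('*')."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
print(mrna)             # AUGGCCUGGUAA
print(translate(mrna))  # MAW
```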
Biochemical Context of
Genomics and Proteomics
DNA
Genome
“Genomics”
mRNA
Proteins
Proteome
“Proteomics”
Cell functions
What is Bioinformatics?
Deduction of knowledge by computer
analysis of biological data
See 988,000 pages on this topic on the WWW
Information stored in the genetic code (DNA), protein sequences
Protein 3D structures, chromosome structure
Protein interaction, transcription factor, motif
Microarray gene expression, functional MRI, 2D gel
Experimental results
Patient statistics
Scientific literature
Analysis tools
Computational Biology & Bioinformatics
Computational Biology
Biological Hypothesis
Formal Specifications
Raw Data
Algorithms
Bioinformatics
Information
End with Experiments
Key Strategy for Analysis
In Biology
In Computer Science
Evolution
Information
Consensus
Clustering
Sequences
FESS
Structures
Distance Measurement
Functions
Data
Key Strategy for Systems Biology
Experiment Computer Aided Design
Specification, Simulation and Reverse Engineering
Reverse Engineering Strategy
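Hypothesis -> Simulation Results
Do the simulation results match the actual microarray output?
  No: revise the hypothesis and repeat.
  Yes: add the hypothesis to the Candidate Set.
Is the match unique?
  Yes: Believe it or not.
  No: run further distinguishing experiments.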
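The reverse engineering loop on this slide can be sketched in a few lines. Everything here is a toy assumption made for illustration: each hypothesis is a plain function from a stimulus level to an expression level, the "microarray output" is a short list of numbers, and the names `matches`, `reverse_engineer`, and the tolerance `tol` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical model type: maps a stimulus level to a predicted expression level.
Model = Callable[[float], float]


def matches(simulated: Sequence[float], observed: Sequence[float], tol: float) -> bool:
    """True when every simulated value is within tol of the observed value."""
    return all(abs(s - o) <= tol for s, o in zip(simulated, observed))


def reverse_engineer(hypotheses: Dict[str, Model],
                     stimuli: Sequence[float],
                     observed: Sequence[float],
                     tol: float = 0.1) -> Tuple[str, List[str]]:
    """Simulate each hypothesis and return a verdict plus the candidate set."""
    candidates = [name for name, model in hypotheses.items()
                  if matches([model(x) for x in stimuli], observed, tol)]
    if not candidates:
        return "revise hypotheses", candidates
    if len(candidates) == 1:
        return "unique match", candidates
    return "run distinguishing experiments", candidates


hypotheses = {"linear": lambda x: 2 * x, "quadratic": lambda x: x * x}

# Two measurements cannot separate the two models; a third one can.
two_points = reverse_engineer(hypotheses, [0, 2], [0.0, 4.0])
three_points = reverse_engineer(hypotheses, [0, 1, 2], [0.0, 1.0, 4.0])
print(two_points)    # ('run distinguishing experiments', ['linear', 'quadratic'])
print(three_points)  # ('unique match', ['quadratic'])
```

The extra measurement at stimulus level 1 plays the role of the distinguishing experiment: it is chosen where the surviving candidates predict different outcomes.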
Problems on Different Levels
Some Problems in Bioinformatics
Sequence comparison
  Longest common subsequence
  Edit distance
  Similarity
  Multiple sequence alignment
Fragment assembly of DNA sequences
  Shortest common superstring
Physical mapping
  Double digest problem
  Consecutive ones problem
Evolutionary trees
Molecular structure prediction
  Protein folding
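Two of the sequence comparison problems listed above, longest common subsequence and edit distance, have short textbook dynamic-programming solutions. The sketch below is one common formulation, not necessarily the one the course uses. It assumes unit costs for insertion, deletion, and substitution, and the function names are illustrative.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)             # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))   # skip a character in a or b
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # turning "" into b[:j] costs j insertions
    for i, ch_a in enumerate(a, start=1):
        curr = [i]                  # turning a[:i] into "" costs i deletions
        for j, ch_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ch_a == ch_b else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


print(lcs_length("ACGT", "AGT"))     # 3
print(edit_distance("ACGT", "AGT"))  # 1
```

Both functions keep only two rows of the table, so they use memory proportional to the shorter dimension while still taking time proportional to the product of the two lengths.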
Bioinformatics and Computer
Science
Algorithms: central to all of these computing problems.
Image processing: 3D images of RNA folds or proteins.
Databases: massive data storage and retrieval.
Distributed systems and parallel processing: massive storage and accelerated computation.
Conclusion
Biology easily has 500 years of exciting
problems to work on.
-- Donald E. Knuth
Go work on integrating Nano, Cognition, Biology, and Informatics!
Reference – Journals
Bioinformatics (SCI)
Bulletin of Mathematical Biology (SCI)
Computer Applications in the Biosciences
Journal of Computational Biology (SCI expanded)
Journal of Mathematical Biology (SCI)
Journal of Molecular Biology (SCI)
Nucleic Acids Research (SCI)
Gene (SCI)
Science (SCI)
Reference – Web Sites
BioWeb
http://bioweb.uwlax.edu/
MIT Biology Hypertextbook
http://esgwww.mit.edu:8001/esgbio/
Bioinformatics Related Journals
http://www.iscb.org/journals.html
NCBI
http://www.ncbi.nlm.nih.gov/