Lecture 1: Introduction, bioinformatics in biological study and


LSM3241: Bioinformatics and Biocomputing
Lecture 1: Introduction
Prof. Chen Yu Zong
Tel: 6874-6877
Email: [email protected]
http://xin.cz3.nus.edu.sg
Room 07-24, level 7, SOC1,
National University of Singapore
What is Expected:
To learn the most widely used bioinformatics tools
• Basic understanding of the method in each tool (normally required
in a college module)
• Capable of explaining the algorithm to a layperson (so that you
are perceived as an expert!)
• Knowing the application range and limitation of each tool (now
the real expert!)
To learn through real-case studies, focused on applications
and problem solving:
• Lectures, labs, tutorials oriented toward real-case studies (we
have an “open-lab” policy).
• Study of real, recently emerged biological problems: virus
research, drug design, systems biology (to give you the experience
to work in a life-science lab or a pharmaceutical company).
2
Labs, Exams and Textbook:
“Open-lab” policy:
• Our lab assignments use only internet tools and downloadable software
(which means that you can do the projects “any time, any place”)
• You do not have to show up in the lab, as long as you submit your lab report
on time.
• Project-report submission system at:
http://bidd.nus.edu.sg/lsm3241/upload.htm
Exams:
• 2 Projects (25% each), 1 Final (open-book, 50%).
Textbook:
• As most of the topics are not covered by existing textbooks, you are not
required to have a textbook. The following are recommended reference
books:
– Introduction to Bioinformatics. Arthur M. Lesk. 2002. Oxford University Press;
ISBN: 0199251967
– Bioinformatics: The Machine Learning Approach (Adaptive Computation and
Machine Learning). Pierre Baldi, Soren Brunak. 2001. The MIT Press; ISBN:
026202506X
– Molecular Modelling: Principles and Applications. Andrew R. Leach. Imprint
Harlow, England; Singapore:
3
Topics covered:
Lecture 1: Introduction (week1):
• Examples of bioinformatics tools applied to real-life
biological and drug design problems
• Identification of SARS pathogen.
• How does a protein substrate escape?
• Computer aided drug design
Lecture 2: Bioinformatics of viral genome (week2):
• Viral genome database
• Protein annotation.
• Protein inhibitors.
Note: Please do not just listen. Get familiar with the biology-side of the topics in
advance
4
Topics covered:
Lecture 3: Molecular database development (week3):
• Protein inhibitor search
• Getting chem-info about inhibitors.
• 2D and 3D structures.
• Database construction
Lecture 4: Sequence analysis (week4):
• Sequence alignment methods revisited (pair-wise, BLAST,
MSA, PSI-BLAST)
• Identification of a novel coronavirus as the SARS pathogen.
Project 1: Functional prediction of proteins in viral
genomes by PSI-BLAST and SVM (25%) (week3-6)
Note: Please do not just listen. Get familiar with the biology-side of the topics in advance
5
Topics covered:
Lecture 5: Support vector machines for protein function
prediction (week5):
• Support vector machines method for protein function
prediction
• Use of SVMProt for protein function prediction.
Lecture 6: Fundamentals of molecular modeling (week6):
• Structural organization of a molecule.
• Basic interactions and models
• Modeling methods (conformation search, energy
minimization)
Lecture 7: Modeling software (week7):
• Learn to use a modeling software package.
6
Topics covered:
Lecture 8: Gene Expression profiles and microarray data
analysis (week8)
Lecture 9: Clustering analysis of microarray data
Project 2: Clustering analysis of microarray data from
GEO database
(25%) (week9-11).
Lecture 10: Biological pathway simulation
7
Bioinformatics in Life Science Research
and Drug Discovery:
Examples:
• Identification of a novel coronavirus as the SARS
pathogen.
• How does a metabolite escape from a protein?
• Design of anti-HIV drugs
Note:
• Learn from these examples how bioinformatics tools can be used to
solve biological and drug design problems, which tool to use.
• Also pay attention to the biological nature of each problem.
8
SARS Coronavirus
A novel coronavirus identified as the cause of
severe acute respiratory syndrome (SARS)
9
SARS Infection
How does the SARS coronavirus enter a cell and reproduce itself?
10
History of SARS Epidemics
Big question in the early stages:
What is the cause of SARS?
11
The Search for the SARS Pathogen
Suspect groups:
• A broad range of viral, bacterial, chlamydial, and rickettsial agents
that are likely to cause the SARS symptoms
Chief suspects:
• Yersinia, mycoplasma, chlamydia, legionella, Coxiella burnetii
• Spotted fever and typhus group rickettsiae, influenzaviruses A and B,
Paramyxovirinae and Pneumovirinae subfamily viruses (specifically,
human respiratory syncytial virus and human metapneumovirus),
Mastadenoviridae, Herpetoviridae, Picornaviridae, Old and New
World hantaviruses, and Old World arenaviruses.
Consider yourself a detective: how would you solve the crime?
Identify the suspects and come up with search strategies.
12
The Search for the SARS Pathogen
Traditional detection methods:
• Virus isolation in suckling mice and cell culture
• Electron microscopy
• Histopathological examination
• Serologic analysis
• General and specialized bacterial culture techniques
Molecular detection methods:
• Polymerase chain reaction (PCR)
• Reverse-transcription PCR (RT-PCR)
• Real-time PCR
Followed by sequence comparison with those of existing pathogens
New England Journal of Medicine 348, 1953-1966 (2003)
13
The Search for the SARS Pathogen
Findings from molecular detection:
• A 405-nucleotide segment of the coronavirus polymerase gene open
reading frame 1b was amplified from the isolation material by RT-PCR
with the broadly reactive primer set IN-2–IN-4. In contrast, this primer set
produced no specific band against uninfected cells.
• When compared with other human and animal coronaviruses, the
nucleotide and deduced amino acid sequence from this region had
similarity scores ranging from 0.56 to 0.63 and from 0.57 to 0.74,
respectively. The highest sequence similarity was obtained with group II
coronaviruses.
New England Journal of Medicine 348, 1953-1966 (2003)
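A similarity score between two sequences can be illustrated with a minimal sketch. It assumes the two sequences are already aligned (gaps written as "-") and defines the score as the fraction of ungapped positions that are identical. This is one common and simple definition, not necessarily the scoring used in the cited paper. The function name and the short toy sequences are made up for illustration and are not real coronavirus data.

```python
def similarity_score(aligned_a: str, aligned_b: str) -> float:
    """Fraction of aligned, non-gap positions at which two sequences agree.

    Assumption: both inputs come from a prior pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return matches / compared


# Toy, invented fragments used only to exercise the function.
query = "ATGGCTAGCTAA-GT"
reference = "ATGGTTAGCAAACGT"
score = similarity_score(query, reference)
print(f"similarity score: {score:.2f}")
```

With real data, the same comparison would be repeated against each known pathogen sequence, and the highest-scoring group would be reported as the closest relative.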
14
The Search for the SARS Pathogen
• Sequence comparison identifies a novel coronavirus as the SARS
pathogen
15
Sequence Comparison of
SARS coronavirus with others
16
SARS Coronavirus Genome
Get familiar with all the known genes (genome location, sequence,
function). Where can you get this information?
17
How can an enzyme metabolite escape?
The enzyme acetylcholinesterase
generates a strong electrostatic field
that can attract the cationic substrate
acetylcholine to the active site.
However, the long and narrow active
site gorge seems inconsistent with the
enzyme's high catalytic rate.
E+S E+P
How does the metabolite P escape?
Acetylcholinesterase (AChE) is the enzyme
responsible for the termination of signaling
in cholinergic synapses (such as the
neuromuscular junction) by degrading the
neurotransmitter acetylcholine. AChE has a
gorge, 2 nm deep, leading to the catalytic
site
18
How can an enzyme metabolite escape?
The metabolite is unlikely to escape through the entrance.
How can it escape?
19
How can an enzyme metabolite escape?
How can it escape?
Can you tell which of the following possibilities is likely or
unlikely, and why?
• Protein unfolding
• Condensation of ions on the protein surface to counterbalance the force
• Change of electric charge on the metabolite
• Alternative escape route
20
How can an enzyme metabolite escape?
Alternative route
An “open back door” policy:
Transient opening of a channel to allow the metabolite to escape
21
Protein structure
From static view to dynamic view
A protein should not be viewed as a static structure
Protein flexibility is an intrinsic feature of enormous biological significance
22
Modeling of protein motion
by molecular dynamics simulation
Protein motion can be simulated by means of molecular dynamics
(MD) simulations:
The trajectory of each atom is determined by Newton’s second law:
F = ma
x(t) = x(0) + v t + (1/2) a t^2
Typical MD software:
AMBER, CHARMM, TINKER
TINKER is freely downloadable
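The update rule above can be sketched in a few lines of code. The example below is a minimal illustration and not how AMBER, CHARMM, or TINKER are implemented: it uses the velocity Verlet scheme (a standard MD integrator, chosen here as an assumption) on a single particle attached to a spring, with F = -k x standing in for a real force field. All names and parameter values are hypothetical.

```python
def simulate(x0, v0, mass, k, dt, n_steps):
    """Integrate a 1D particle on a spring (F = -k x) with velocity Verlet.

    Assumption: the spring force is a toy stand-in for the inter-atomic
    forces a real MD package would compute from its force field.
    """
    x, v = x0, v0
    a = -k * x / mass  # Newton's second law: a = F / m
    trajectory = [x]
    for _ in range(n_steps):
        # Position update: x(t + dt) = x(t) + v dt + 1/2 a dt^2
        x = x + v * dt + 0.5 * a * dt * dt
        # Recompute the acceleration from the force at the new position
        a_new = -k * x / mass
        # Velocity update using the average of old and new accelerations
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append(x)
    return trajectory


# With mass = k = 1 the oscillation period is 2*pi, about 628 steps of 0.01.
traj = simulate(x0=1.0, v0=0.0, mass=1.0, k=1.0, dt=0.01, n_steps=628)
print(f"start x = {traj[0]:.3f}, end x = {traj[-1]:.3f}")
```

The constant-acceleration formula is only valid over a very short time step, which is why the force is recomputed at every step. A protein simulation applies the same loop to every atom in three dimensions.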
23
MD simulation of acetylcholinesterase
MD simulation clearly
reveals transient opening
of a channel “back door”
Science 263, 1276-1278
(1994)
The open “back door”
allows the metabolite P
to escape
24
Design of anti-HIV drugs
HIV virus structure
25
Design of anti-HIV drugs
HIV viral genome
26
Design of anti-HIV drugs
Recognition of HIV-infected cells
Vaccine-based drugs
27
Design of anti-HIV drugs
Pathways of HIV infection and reproduction and
sites of drug action
28
Design of anti-HIV drugs
Pathways of HIV infection and reproduction and
sites of drug action
29
Design of anti-HIV drugs
Selection of a target: HIV-1 protease
30
Design of anti-HIV drugs
HIV-1 protease structure and cavity
31
Design of anti-HIV drugs
Drug and protein:
Lock-and-key mechanism: blocking the protein stops its function
32
Design of anti-HIV drugs
33
Design of anti-HIV drugs
Drug design:
• Step 1: Finding the right target in the genome
– A key protein involved in the viral cycle (to stop the disease process)
– Different from human proteins (to reduce side effects)
• Step 2: Finding or making a chemical agent to stop the
protein
– In the majority of cases: protein inhibitors
• Step 3: Testing and clinical trials
34
Design of HIV-1 Protease Inhibitor
35
Design of HIV-1 Protease Inhibitor
36
Design of HIV-1 Protease Inhibitor
37
Design of HIV-1 Protease Inhibitor
38
Success Stories:
• HIV-1 protease inhibitors on the market:
– Invirase (Hoffmann-La Roche, 1995)
– Norvir (Abbott, 1996)
– Crixivan (Merck, 1996)
– Viracept (Agouron, 1997)
Drug Discovery Today 2, 261-272 (1997)
39
Design of anti-SARS drugs
Pathways of SARS infection and reproduction and sites of
drug action.
Research work is underway, but efforts have cooled down
since the “elimination” of this virus.
40
Summary of Today’s lecture
• Bioinformatics tools in real-life biological research and
drug design problems
• Tools include:
– Sequence analysis
– Microarray data analysis (relatively new, not covered)
– Molecular modeling
– Computer-aided drug design
41