To determine whether related genes appear in other species

Bioinformatics
Ch. 1 Introduction (continued, part 2)
阮雪芬
Nov 7, 2002
NTUST
www.ntut.edu.tw/~yukijuan/lectures/bioinfo/Nov7.ppt
Searching for Similar Sequences in Databases: PSI-BLAST
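• Starting point: you have determined the sequence of a new gene, or identified within the human genome a gene responsible for some disease
• Goal: determine whether related genes appear in other species
• The ideal search method is both
1. Sensitive: it picks up even very distant relationships
2. Selective: all the relationships it reports are true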
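The two criteria can be made concrete with a small calculation. The sketch below is an illustration only: it reads "sensitive" as the fraction of true relatives that the search finds, and "selective" as the fraction of reported hits that are true relatives. The function name and the sequence labels are hypothetical, not taken from any search tool.

```python
def sensitivity_and_selectivity(reported, true_relatives):
    """Return (sensitivity, selectivity) for one database search.

    sensitivity: fraction of the true relatives that were reported
    selectivity: fraction of the reported hits that are true relatives
    """
    reported = set(reported)
    true_relatives = set(true_relatives)
    true_positives = reported & true_relatives
    sensitivity = len(true_positives) / len(true_relatives) if true_relatives else 0.0
    selectivity = len(true_positives) / len(reported) if reported else 0.0
    return sensitivity, selectivity


# Hypothetical search: four hits reported, five true relatives in the database.
reported_hits = ["seqA", "seqB", "seqC", "seqX"]
known_relatives = ["seqA", "seqB", "seqC", "seqD", "seqE"]
sens, sel = sensitivity_and_selectivity(reported_hits, known_relatives)
print(f"sensitivity={sens:.2f} selectivity={sel:.2f}")
```

A search that reports everything in the database is perfectly sensitive but not selective; one that reports only the query itself is selective but not sensitive. A useful method has to balance the two.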
Searching for Similar Sequences in Databases: PSI-BLAST
• A powerful tool for searching sequence databases with a probe sequence is PSI-BLAST
• BLAST: an earlier program that worked by identifying local regions of similarity without gaps and then piecing them together.
• PSI-BLAST: refers to enhancements that identify patterns within the sequences at preliminary stages of the database search, and then progressively refine them.
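The idea of a "local region of similarity without gaps" can be sketched in a few lines. The code below is not BLAST's algorithm (BLAST uses word seeds and substitution matrices); it is a brute-force illustration that walks every diagonal of the comparison and keeps the highest-scoring gap-free run. The scoring values (+1 match, -1 mismatch) and the toy sequences are assumptions chosen for the example.

```python
def best_ungapped_segment(query, subject, match=1, mismatch=-1):
    """Find the highest-scoring gap-free local match between two sequences.

    Returns (score, query_start, subject_start, length).
    """
    best = (0, 0, 0, 0)
    # Each diagonal fixes one offset between the sequences, so no gaps occur.
    for offset in range(-(len(query) - 1), len(subject)):
        q = max(0, -offset)
        s = max(0, offset)
        run_score = 0
        run_start_q, run_start_s = q, s
        while q < len(query) and s < len(subject):
            run_score += match if query[q] == subject[s] else mismatch
            if run_score <= 0:
                # The run has lost all its score: start a new one after here.
                run_score = 0
                run_start_q, run_start_s = q + 1, s + 1
            elif run_score > best[0]:
                best = (run_score, run_start_q, run_start_s, q - run_start_q + 1)
            q += 1
            s += 1
    return best


# Toy sequences made up for illustration.
query = "MQNSHSGVNQ"
subject = "AAAQNSHSGVAA"
score, q_start, s_start, length = best_ungapped_segment(query, subject)
print(score, query[q_start:q_start + length], subject[s_start:s_start + length])
```

Piecing several such segments together, and later building patterns from the hits, are separate steps that this sketch does not attempt.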
Ex. 1.4
Query: P26367 (PAX-6)
http://www.ncbi.nlm.nih.gov/blast/index.html
PSI- and PHI-BLAST: Sequence
Hit: gi|1352719|sp|P47237|PAX6_CHICK Paired box protein Pax-6 379 e-105
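The hit line quoted in this exercise can be taken apart programmatically. The sketch below assumes the layout "identifier, description, score, E-value" with a pipe-delimited identifier, which matches that one line; other BLAST output formats may differ. The function name and the dictionary keys are hypothetical.

```python
def parse_hit_line(line):
    """Split a one-line BLAST hit summary into its fields.

    Assumes: pipe-delimited identifier first, score and E-value last,
    and everything in between is the description.
    """
    tokens = line.split()
    identifier = tokens[0]
    description = " ".join(tokens[1:-2])
    score_text, evalue_text = tokens[-2], tokens[-1]
    # The E-value is abbreviated as 'e-105'; float() needs the leading '1'.
    if evalue_text.startswith("e"):
        evalue_text = "1" + evalue_text
    id_fields = identifier.split("|")
    return {
        "gi": id_fields[1],
        "database": id_fields[2],
        "accession": id_fields[3],
        "entry_name": id_fields[4],
        "description": description,
        "score": float(score_text),
        "evalue": float(evalue_text),
    }


hit = parse_hit_line(
    "gi|1352719|sp|P47237|PAX6_CHICK Paired box protein Pax-6 379 e-105"
)
print(hit["accession"], hit["entry_name"], hit["score"], hit["evalue"])
```

The very small E-value indicates that a match this good is extremely unlikely to arise by chance.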
Ex. 1.5
• What species contain homologues of human PAX-6 detectable by PSI-BLAST?
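One way to approach the species question is to group the hits by the species code in their entry names. Swiss-Prot entry names have the form PROTEIN_SPECIES, as in PAX6_CHICK. In the sketch below, only PAX6_CHICK comes from the hit shown in Ex. 1.4; the other names are made-up placeholders, not real search results, and the function name is hypothetical.

```python
from collections import defaultdict


def species_of_hits(entry_names):
    """Group PROTEIN_SPECIES entry names by their species code."""
    by_species = defaultdict(list)
    for name in entry_names:
        protein, _, species = name.partition("_")
        if species:  # skip names that do not follow the PROTEIN_SPECIES pattern
            by_species[species].append(protein)
    return dict(by_species)


# PAX6_CHICK is from the hit in Ex. 1.4; the rest are placeholders.
names = ["PAX6_CHICK", "PROTA_SPECX", "PROTB_SPECX", "PROTC_SPECY"]
groups = species_of_hits(names)
for species, proteins in sorted(groups.items()):
    print(species, proteins)
```

The sorted list of species codes is then a first answer to the exercise, subject to checking that each hit is a genuine homologue.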
Introduction to Protein Structure
• Protein function:
  • Structural proteins
  • Enzymes, which catalyse chemical reactions
  • Transport and storage proteins
  • Regulatory proteins
    • Hormones
    • Receptor/signal transduction proteins
    • Proteins that control genetic transcription
  • Proteins involved in recognition
    • Cell adhesion molecules
    • Antibodies and other proteins of the immune system
Primary Structure
Peptide-bond Formation
Amino Acid Sequences Have Direction
Components of a Polypeptide Chain
•Backbone: constant
•Side chains: variable
Introduction to Protein Structure
• Proteins are large molecules
(backbone)
Secondary Structure
α-helix
β-sheet
Supersecondary structure
• α-helix hairpin
• β-hairpin
• β-α-β unit
Domains
• Many proteins contain compact units within the folding pattern of a single chain that look as if they should have independent stability.
The cell-surface protein CD4 consists of four similar domains
Modular proteins
• Modular proteins are multidomain proteins that often contain many copies of closely related domains.
• Fibronectin:
  • a large extracellular protein involved in cell adhesion and migration
  • contains 29 domains, including multiple tandem repeats of three types of domains called F1, F2 and F3
http://www.bork.embl-heidelberg.de/Modules/07-matrix.gif
Classification of Protein Structure
• Occupies a key position in bioinformatics, not least as a bridge between sequence and function
Divide the Proteins into Domains
Protein Structure Prediction and Engineering
• Secondary structure prediction
• Fold recognition
• Homology modelling
Protein Structure Prediction and Engineering
• Critical Assessment of Structure Prediction (CASP)
• Protein engineering
Clinical Implications
• Categories of applications
  • Diagnosis of disease and disease risks
  • Genetics of responses to therapy
  • Identification of drug targets
  • Gene therapy
Diagnosis of Disease and Disease Risks
• DNA sequencing can detect the absence of a particular gene, or a mutation
• Identification of specific gene sequences associated with disease
• In many cases our genes do not irrevocably condemn us to contract a disease, but raise the probability that we will.
  • α1-antitrypsin: inhibits elastase in the alveoli of the lung
  • A combination of genetic and environmental factors
Genetics of Responses to Therapy
• Customized treatment
• Sequence analysis permits selecting drugs and dosages optimal for individual patients (pharmacogenomics)
• The very toxic drug 6-mercaptopurine is used in the treatment of childhood leukaemia
  • Patients who lack thiopurine methyltransferase, the enzyme needed to metabolize the drug, can die from the treatment
Identification of Drug Targets
• A target is a protein whose function can be selectively modified by interaction with a drug, to affect the symptoms or underlying causes of a disease.
Gene Therapy
• A gene is missing or defective → replace it
• A gene is overactive → turn it off