slide - Molecular Biomedical Informatics / 分子生醫資訊實驗室
Molecular Biomedical Informatics
分 子 生 醫 資 訊 實 驗 室
Machine Learning and Bioinformatics
機 器 學 習 與 生 物 資 訊 學
Machine Learning & Bioinformatics
1
Molecular biology
Nucleic acid
– DNA
– RNA
Central dogma
– Transcription
– Translation
Protein
– Amino acid
– Primary structure
– Secondary structure
– Tertiary structure
Nucleic acid
A nucleic acid is a macromolecule composed
of chains of monomeric nucleotides
In biochemistry these molecules carry genetic
information or form structures within cells
The most common nucleic acids are
deoxyribonucleic acid (DNA) and ribonucleic
acid (RNA)
http://juang.bst.ntu.edu.tw/BC2008/images/NA%20Fig1.jpg
Nucleic acid components
Sugar
http://www.mun.ca/biology/scarr/Fg10_09b_revised.gif
Nucleic acid components
Base
Purine
– Adenine (A) and guanine (G)
Pyrimidine
– Thymine (T), cytosine (C)
– Uracil (U, only in RNA)
http://www.elmhurst.edu/~chm/vchembook/images/580bases.gif
http://fig.cox.miami.edu/~cmallery/150/chemistry/sf3x14a.jpg
DNA
Chemically, DNA is a long polymer of simple units
called nucleotides, with a backbone made of sugars and
phosphate groups joined by phosphodiester bonds
Attached to each sugar is one
of four types of molecules
called bases
It is the sequence of these four
bases along the backbone that
encodes information
http://upload.wikimedia.org/wikipedia/commons/8/87/DNA_orbit_animated_small.gif
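One way to make "the sequence of four bases encodes information" concrete: four symbols can carry at most log2(4) = 2 bits each. The sketch below packs a DNA string into an integer at 2 bits per base. The mapping A=0, C=1, G=2, T=3 and the function names are arbitrary choices for illustration, not something the slides define.

```python
import math

BASES = "ACGT"  # the four DNA bases; the numeric order is an arbitrary assumption

# Four distinct symbols can carry at most log2(4) = 2 bits each.
BITS_PER_BASE = math.log2(len(BASES))


def encode(seq: str) -> int:
    """Pack a DNA string into an integer, 2 bits per base."""
    value = 0
    for base in seq.upper():
        value = (value << 2) | BASES.index(base)
    return value


def decode(value: int, length: int) -> str:
    """Unpack `length` bases from an integer produced by encode()."""
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


seq = "CGCGAATT"
packed = encode(seq)
print(BITS_PER_BASE, packed, decode(packed, len(seq)))
```

The length has to be passed to `decode` because leading A bases are stored as zeros and would otherwise be lost.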
DNA
Base pairing
Each type of base on one strand forms a bond
with just one type of base on the other strand
Here, purines form hydrogen bonds to
pyrimidines, with A bonding only to T, and C
bonding only to G
DNA sequence
– 5’CpGpCpApApTpT
3’TpTpApApCpGpC
– CGCGAATT
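The pairing rules above (A with T, C with G) are enough to compute the partner strand of any sequence. A minimal sketch follows, assuming the input is written 5'→3' and that the two strands run antiparallel, which is standard for double-stranded DNA although this slide does not say so. The function names are illustrative.

```python
# Watson-Crick pairing rules from the slide: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the partner strand read in its own 5'->3' direction."""
    return complement(seq)[::-1]


top = "CGCGAATT"  # example sequence from the slide, read 5'->3'
print(complement(top))          # partner strand, read 3'->5'
print(reverse_complement(top))  # partner strand, read 5'->3'
```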
http://www.ucl.ac.uk/~sjjgsca/NucleotidePairing.jpg
Double helix
http://www.coe.drexel.edu/ret/personalsites/2005/dayal/curriculum1_files/image001.jpg
Hydrogen bond
A hydrogen bond exists between an electronegative atom
and a hydrogen atom bonded to another electronegative
atom
This type of attraction always involves a hydrogen atom, hence the name
hydrogen bonding
Most hydrogen bonds are much weaker than covalent bonds (roughly 5 to 30
kJ/mol); only the strongest approach the energy of weak covalent bonds
(about 155 kJ/mol)
Biological functions
– DNA/RNA base pairing
– protein secondary/tertiary structure formation
– some properties of the water molecule
– antibody-antigen (and other protein-protein) binding
Hydrogen bonds result from
differences in electronegativity
http://upload.wikimedia.org/wikipedia/commons/4/43/Liquid_water_hydrogen_bond.png
Grooves
http://courses.biology.utah.edu/horvath/biol.3525/1_DNA/Fig2/marty_1.jpg
DNA structure
http://www.youtube.com/watch?v=qy8dk5iS1f0&NR=1
Any Questions?
About DNA
Central dogma
http://fig.cox.miami.edu/~cmallery/255/255hist/mcb4.1.dogma.jpg
Central dogma
The process by which information is extracted
from the nucleotide sequence of a gene and then
used to make a protein is essentially the same for
all living things on Earth
It is described by the grandly
named central dogma of
molecular biology
Information in cells passes from
DNA to RNA to proteins
http://upload.wikimedia.org/wikipedia/commons/3/3a/Crick's_1958_central_dogma.svg
RNA
Information stored in DNA is used to make a more
transient, single-stranded polynucleotide called RNA
(ribonucleic acid)
RNA is very similar to DNA, but differs in a few
important structural details
– in the cell RNA is usually single stranded, while DNA is
usually double stranded
– RNA nucleotides contain ribose while DNA contains
deoxyribose (a type of ribose that lacks one oxygen atom)
– in RNA the base uracil takes the place of thymine, which
is present in DNA
http://www.dadamo.com/wiki/dna-rna.png
Central dogma
Transcription
Transcription is the synthesis of RNA under
the direction of DNA
Both nucleic acid sequences use the same
language, and the information is simply
transcribed, or copied
The DNA sequence is copied by RNA polymerase
to produce a complementary RNA strand, which for
protein-coding genes is called messenger RNA (mRNA)
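Transcription can be sketched as a string operation. The code below is a simplification under stated assumptions: both strands are given 5'→3', the whole input is transcribed (no promoter or terminator), and the function names are illustrative. The mRNA is complementary to the template strand and therefore matches the coding strand, with U in place of T.

```python
# DNA template base -> RNA base placed opposite it (U replaces T in RNA).
RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_template(template_5to3: str) -> str:
    """Build the mRNA (5'->3') complementary to a template strand given 5'->3'."""
    # Walk the template from its 3' end so the RNA comes out 5'->3'.
    return "".join(RNA_PAIRS[base] for base in reversed(template_5to3.upper()))


def transcribe_coding(coding_5to3: str) -> str:
    """Shortcut: the mRNA matches the coding strand with T replaced by U."""
    return coding_5to3.upper().replace("T", "U")


coding = "ATGGCCTAA"    # hypothetical coding strand, 5'->3'
template = "TTAGGCCAT"  # its reverse complement, i.e. the template strand 5'->3'
print(transcribe_template(template))
print(transcribe_coding(coding))
```

Both calls yield the same mRNA, which is the sense in which the information is "simply transcribed, or copied".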
DNA transcription
http://www.youtube.com/watch?v=vJSmZ3DsntU
Transcription detail
http://wwwclass.unl.edu/biochem/gp2/m_biology/animation/m_animations/gene2.swf
RNA
Various types
mRNA
– messenger RNA (mRNA) is the RNA that carries
information from DNA to the ribosome
– the coding sequence of the mRNA determines the
amino acid sequence in the protein that is produced
Non-coding RNA
Various RNA types
Non-coding RNA
Many RNAs do not code for protein
These ncRNAs are encoded by their own genes (RNA
genes) or derived from mRNA introns
The most common ncRNAs are transfer RNA
(tRNA) and ribosomal RNA (rRNA)
Other ncRNAs such as microRNA (miRNA)
are involved in post-transcriptional gene regulation
http://eurheartj.oxfordjournals.org/content/vol0/issue2010/images/large/ehp57301.jpeg
Central dogma
Translation
Translation is the second stage of protein
biosynthesis
Translation occurs in the cytoplasm where the
ribosomes are located
In translation, mRNA is decoded to produce a
specific polypeptide according to the rules
specified by the genetic code
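Decoding an mRNA with the genetic code can be sketched as a table lookup over consecutive codons. The example below assumes the standard genetic code, starts at the first AUG, and stops at the first stop codon. Real translation initiation is more involved, and the function name is illustrative.

```python
from itertools import product

# Standard genetic code with codons enumerated in U, C, A, G order at each
# position; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first stop codon (one-letter codes)."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("AUGGCCUAA"))       # start codon, one alanine codon, stop
print(translate("GGAUGUUUUGGUGA"))  # the leading GG is skipped
```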
From RNA to protein
synthesis
http://www.youtube.com/watch?v=NJxobgkPEAo
Protein translation
http://www.youtube.com/watch?v=nl8pSlonmA0
http://biology.kenyon.edu/courses/biol114/Chap05/code.gif
Any Questions?
About central dogma
Protein
Protein
Proteins are large organic compounds made of amino
acids arranged in a linear chain and joined together by
peptide bonds between the carboxyl and amino
groups of adjacent amino acid residues
Proteins can also work together to achieve a
particular function, and they often associate to form
stable complexes
Protein
Amino acid
In chemistry, an amino acid is a molecule that
contains both amine and carboxyl functional
groups
In biochemistry, this term refers to alpha-amino
acids with the general formula
H2NCHRCOOH, where R is an organic
substituent
http://upload.wikimedia.org/wikipedia/commons/thumb/c/ce/AminoAcidball.svg/702px-AminoAcidball.svg.png
Amino acid
Various side chains
The various alpha amino acids differ in which
side chain (R group) is attached to their alpha
carbon
They can vary in size from just a hydrogen
atom in glycine through a methyl group in
alanine to a large heterocyclic group in
tryptophan
http://upload.wikimedia.org/wikipedia/commons/thumb/3/37/Aa.svg/2000px-Aa.svg.png
http://juang.bst.ntu.edu.tw/BC2008/images/Amino%281%29%202007/A1-7.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Amino%281%29%202007/A1-9.JPG
http://www.russell.embl-heidelberg.de/aas/other_images/lb3.gif
Amino acid
The building blocks of proteins
Amino acids combine in a condensation
reaction, and the resulting amino acid residues are
held together by peptide bonds
Proteins are defined by their unique sequence
of residues (primary structure)
Just as letters form many different words, amino acids
form a vast variety of sequences/proteins
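To put a number on "a vast variety": assuming the 20 standard amino acids (a figure not stated on the slide) and no constraints on which residues may follow each other, a chain of n residues has 20^n possible sequences. A small sketch, with an illustrative function name:

```python
NUM_AMINO_ACIDS = 20  # standard proteinogenic amino acids (assumption)


def possible_sequences(length: int) -> int:
    """Number of distinct primary structures of the given length."""
    return NUM_AMINO_ACIDS ** length


for n in (2, 3, 10, 100):
    print(n, possible_sequences(n))
```

Python integers are arbitrary precision, so the count for n = 100 prints exactly rather than overflowing.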
http://upload.wikimedia.org/wikipedia/commons/thumb/6/6d/Peptidformationball.svg/2000px-Peptidformationball.svg.png
http://juang.bst.ntu.edu.tw/BC2008/images/Amino(1)%202007/A1-11.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Amino(1)%202007/A1-13.JPG
Protein
From amino acids to proteins
Amino acids form short polymer chains called
peptides or longer chains called either
polypeptides or proteins
The process of such formation from an mRNA
template (obeying genetic code) is known as
translation, which is part of protein
biosynthesis
Protein structure hierarchy
http://cropandsoil.oregonstate.edu/classes/css430/lecture%209-07/figure-09-03.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Protein(1)%202007/P1-4.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Protein(1)%202007/P1-8.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Protein(1)%202007/P1-9.JPG
Protein structure hierarchy
Secondary structure
In biochemistry and structural biology,
secondary structure is the general three-dimensional
form of local segments of
biopolymers such as proteins and nucleic acids
It does not, however, describe specific atomic
positions in three-dimensional space, which
are considered to be tertiary structure
http://juang.bst.ntu.edu.tw/BC2008/images/Protein(2)%202007/P2-3.JPG
Protein structure hierarchy
Tertiary structure
The three-dimensional structure of a protein or
any other macromolecule, as defined by the
atomic coordinates
It describes the spatial relations among the
secondary structures
Tertiary structure is considered to be largely
determined by the protein’s primary sequence
Protein tertiary structure
Experimental techniques
The majority of protein structures have been
solved with X-ray crystallography
The second most common method is NMR (nuclear
magnetic resonance)
– lower resolution
– limited to small proteins
– provides time-dependent information in solution
http://campusapps.fullerton.edu/news/arts/2003/photos/protein-art.jpg
Protein structure hierarchy
Quaternary structure
Many proteins are actually
assemblies of more than one
polypeptide chain, which in the
context of the larger assemblage
are known as protein subunits
In addition to the tertiary structure
of the subunits, multiple-subunit
proteins possess a quaternary
structure, which is the arrangement
into which the subunits assemble
http://courses.cm.utexas.edu/jrobertus/ch339k/overheads-1/ch6_quat-struct1.jpg
Protein sub-structure
Protein sub-structure
Domain
A part of a protein's sequence
and structure that can
evolve, function, and exist
independently
About 25–500 amino acids (aa)
Often forms a functional
unit
http://upload.wikimedia.org/wikipedia/commons/6/67/1pkn.png
Zinc fingers are
small protein
structural motifs
that can coordinate
zinc ions to help
stabilize their
folds
http://upload.wikimedia.org/wikipedia/commons/7/79/Zinc_finger_DNA_complex.png
Protein sub-structure
Motif
A sequence motif is a nucleotide or
amino-acid sequence pattern that is widespread
and often has biological significance
For proteins, a sequence motif is distinguished
from a structural motif, a motif formed by the
three dimensional arrangement of amino acids,
which may not be adjacent
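A sequence motif is often written as a pattern and searched for with a regular expression. The sketch below uses the N-glycosylation pattern N-{P}-[ST]-{P} (asparagine, any residue except proline, serine or threonine, any residue except proline) as an example. The pattern choice, the test sequence, and the function name are assumptions for illustration and do not come from the slides.

```python
import re

# N-{P}-[ST]-{P} written as a regular expression. The lookahead (?=...)
# lets overlapping occurrences be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(protein: str, pattern: re.Pattern = N_GLYCOSYLATION):
    """Return (0-based position, matched residues) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein.upper())]


seq = "MKNASTNPSANGTV"  # hypothetical protein sequence
print(find_motif(seq))
```

The occurrence starting at position 6 (NPSA) is rejected because a proline follows the asparagine. A structural motif, by contrast, cannot be found this way because its residues need not be adjacent in the sequence.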
Protein sub-structure
Structural motif
A 3D structural element or fold that also
appears in a variety of other molecules
In the context of proteins, the term is
sometimes used interchangeably with
“structural domain,” although a domain need
not be a motif and, if it contains a motif, need
not be made up of only one
http://www.biomedcentral.com/content/figures/1471-2164-8-60-8.jpg
http://juang.bst.ntu.edu.tw/BC2008/images/Protein(1)%202007/P1-3.JPG
Molecular biology
Reference
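Website of Prof. Juang (莊榮輝), National Taiwan University
– http://juang.bst.ntu.edu.tw/BC2008/index.htm
National Chiao Tung University (NCTU) molecular biology website
– http://www.life.nctu.edu.tw/~mb/c40101.htm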
Any Questions?
About molecular biology