Biological Introduction
Class 1: Introduction
Source: Alberts et al
The Tree of Life
The Cell
Example: Tissues in Stomach
DNA Components
Four nucleotide types:
Adenine
Guanine
Cytosine
Thymine
Hydrogen bonds:
A-T
C-G
Source: Alberts et al
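The pairing rule above (A with T, C with G) means one strand fully determines the other. A minimal sketch in Python, where the function names are illustrative and not part of the course material; it also assumes the standard fact that the two strands run in opposite directions, which is why the partner strand is usually read as the reverse complement:

```python
# Watson-Crick pairing from the slide: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction (reversed)."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Applying reverse_complement twice returns the original strand, which is a quick sanity check on the pairing table.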
The Double Helix
Source: Mathews & van Holde
DNA Duplication
Source: Alberts et al
DNA Organization
Genome Sizes
E. coli (bacteria): 4.6 x 10^6 bases
Yeast (simple fungi): 15 x 10^6 bases
Smallest human chromosome: 50 x 10^6 bases
Entire human genome: 3 x 10^9 bases
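To get a feel for these sizes from a computational point of view, here is a rough back-of-the-envelope sketch. It assumes an idealized encoding of 2 bits per base (four nucleotide types, so log2(4) = 2 bits); real file formats typically use more. The dictionary and function names are illustrative only:

```python
import math

# Genome sizes in bases, as listed on the slide.
GENOME_SIZES = {
    "E. coli": 4.6e6,
    "Yeast": 15e6,
    "Smallest human chromosome": 50e6,
    "Entire human genome": 3e9,
}

# Four nucleotide types need log2(4) = 2 bits each (idealized assumption).
BITS_PER_BASE = math.log2(4)


def megabytes(bases: float) -> float:
    """Storage in megabytes (10^6 bytes) at 2 bits per base."""
    return bases * BITS_PER_BASE / 8 / 1e6


for name, size in GENOME_SIZES.items():
    print(f"{name}: {megabytes(size):.2f} MB")
```

Under this assumption the entire human genome fits in roughly 750 MB, while E. coli needs a little over 1 MB.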
Genes
The DNA strings include:
Coding regions (“genes”)
E. coli has ~4,000 genes
Yeast has ~6,000 genes
C. elegans has ~13,000 genes
Humans have ~32,000 genes
Control regions
These typically are adjacent to the genes
They determine when a gene should be
expressed
“Junk” DNA (unknown function)
Transcription
Coding sequences can be transcribed to RNA
RNA nucleotides:
Similar to DNA, slightly different backbone
Uracil (U) instead of Thymine (T)
Source: Mathews & van Holde
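At the level of letters, the T-to-U substitution can be sketched as a one-line string operation. This is a simplification that assumes the input is the coding (sense) strand, so the RNA has the same sequence with U in place of T; the function name is illustrative:

```python
def transcribe(coding_strand: str) -> str:
    """Return the RNA sequence for a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


print(transcribe("ATGGCC"))  # AUGGCC
```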
RNA Editing
Source: Mathews & van Holde
RNA Editing
RNA roles
Messenger RNA (mRNA)
Encodes protein sequences
Transfer RNA (tRNA)
Adaptor between mRNA molecules and amino acids (protein building blocks)
Ribosomal RNA (rRNA)
Part of the ribosome, a machine for translating mRNA to proteins
...
Transfer RNA
Anticodon:
matches a codon (triplet of mRNA nucleotides)
Attachment site:
matches a specific amino-acid
Translation
Translation is mediated by the ribosome
The ribosome is a complex of protein & rRNA molecules
The ribosome attaches to the mRNA at a translation initiation site
The ribosome then moves along the mRNA sequence and in the process constructs a polypeptide
When the ribosome encounters a stop signal, it releases the mRNA. The constructed polypeptide is released and folds into a protein.
Source: Alberts et al
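The scan-and-build process described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: it uses the standard genetic code, treats the first AUG as the initiation site (real initiation is more involved), and reads triplets until a stop codon. The names CODON_TABLE and translate are illustrative:

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
# '*' marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first stop codon.

    Returns one-letter amino-acid codes; empty string if no AUG is found.
    """
    start = mrna.find("AUG")  # simplifying assumption: first AUG initiates
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop signal: release the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGGCCUUUUAAGG"))  # MAF
```

In the example, the codons read are AUG (M), GCC (A), UUU (F), and then the stop codon UAA.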
Translation
Source: Alberts et al
Translation
Source: Alberts et al
Translation
Source: Alberts et al
Translation
Source: Alberts et al
Translation
Genetic Code
Protein Structure
Proteins are polypeptides of 70-3000 amino acids
This structure is (mostly) determined by the sequence of amino acids that make up the protein
Protein Structure
Evolution
Related organisms have similar DNA
Similarity in sequences of proteins
Similarity in organization of genes along the
chromosomes
Evolution plays a major role in biology
Many mechanisms are shared across a wide
range of organisms
During the course of evolution existing
components are adapted for new functions
Evolution
Evolution of new organisms is driven by
Diversity
Different individuals carry different variants of the same basic blueprint
Mutations
The DNA sequence can be changed due to
single base changes, deletion/insertion of DNA
segments, etc.
Selection bias
Course Goals
Computational tools in molecular biology
We will cover computational tasks that are posed by modern molecular biology
We will discuss the biological motivation and setup for these tasks
We will understand the kinds of solutions that exist and what principles justify them
Four Aspects
Biological
What is the task?
Algorithmic
How to perform the task at hand efficiently?
Learning
How to adapt parameters of the task from examples
Statistics
How to differentiate true phenomena from
artifacts
Example: Sequence Comparison
Biological
Evolution preserves sequences, thus similar genes might
have similar function
Algorithmic
Consider all ways to “align” one sequence against
another
Learning
How do we define “similar” sequences? Use examples to
define similarity
Statistics
When we compare to ~10^6 sequences, which matches are random and which are true?
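For the algorithmic aspect, considering all alignments explicitly is infeasible, but dynamic programming finds the best one efficiently. The sketch below computes a global alignment score in the style of Needleman-Wunsch. It is one common approach, not necessarily the one used in this course, and the scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions standing in for parameters that would be learned from examples:

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Best global alignment score of a against b (Needleman-Wunsch style).

    Keeps only two rows of the dynamic-programming table at a time.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b (all gaps).
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("GATTACA", "GATTACA"))  # 7
print(global_alignment_score("ACGT", "AGT"))         # 2
```

The running time is proportional to the product of the two sequence lengths, which is what makes comparison against large databases a statistical as well as an algorithmic problem.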
Topics I
Dealing with DNA/Protein sequences:
Genome projects and how sequences are found
Finding similar sequences
Models of sequences: Hidden Markov Models
Transcription regulation
Protein Families
Gene finding
Topics II
Gene Expression:
Genome-wide expression patterns
Data organization: clustering
Reconstructing transcription regulation
Recognizing and classifying cancers
Topics III
Models of genetic change:
Long term: evolutionary changes among species
Reconstructing evolutionary trees from current day
sequences
Short term: genetic variations in a population
Finding genes by linkage and association
Topics IV
Protein World:
How proteins fold - secondary & tertiary structure
How to predict protein folds from sequence data alone
How to analyze protein changes from raw experimental measurements (MassSpec)
2D gels
Class Structure
2 weekly meetings
Class: Mondays 16-18
Targil: Tuesday 18-20
Grade:
60% in five question sets
Each contains theoretical problems & practical
computer questions
40% test
5% bonus for active participation
Exercises & Handouts
Check regularly: http://www.cs.huji.ac.il/~cbio