Brief Introduction to Bioinformatics

Bioinformatics
Taught by Zhenglin Zhu
School of Life Sciences, Chongqing University


Course outline
Brief introduction to bioinformatics
Linux
• Basic commands and management
• Bash programming
Sequence Alignment
• Basics of sequence alignment
• BLAST
• EMBOSS
Biological Databases
Phylogenetic Tree
• Theory
• Practice: making trees using MEGA
Protein modeling
• Theory
• SPDBViewer
Microarray
• Theory
• Limma
Designing PCR primers
For education materials:
http://life.cqu.edu.cn/courses/bioinformatics/
Brief Introduction to Bioinformatics
Zhenglin Zhu, School of Life Sciences
Outline
1. What is bioinformatics?
2. What is a biological data object?
3. Where to obtain biological data?
4. How to deal with biological data?
5. Start your career as a bioinformatician
Bioinformatics is the storage, management and analysis of biological data.
[Diagram: biological data, with storage, management and analysis]
2. What is a biological data object?
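History
1986: Human Genome Project proposed
1988-1990: Research and feasibility studies
1991: Human Genome Project begins
2000: Draft of the human genome released
2003: Human genome sequencing completed
Cited from “Wikipedia – Central dogma of molecular biology”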
The era of omics
Biological data
• DNA (Genomics)
• RNA (Transcriptomics)
• Protein (Proteomics)
3. Where to obtain biological data?
Raw data
• Sequencing: DNA & RNA
• Microarray: DNA & RNA
• Mass spectrometry: protein
Database
• Level 1: raw data storage
• Level 2: knowledge-based and manually curated
[Diagram: raw data, literature, Level 1 databases, Level 2 databases and user datasets, linked by reanalysis and literature mining]
Level 1 database
NCBI is friendly
• Boolean expression query
  Keyword1 AND keyword2
  Keyword1 OR keyword2
  Keyword1 NOT keyword2
• The Entrez system: a query network of more than 20 nodes
• Data transfer through FTP
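As a rough illustration of the Boolean queries above, the following Python sketch builds a query string and the matching request URL for the ESearch endpoint of the NCBI E-utilities. The helper names `boolean_query` and `esearch_url` are made up for this example, and the sketch only constructs the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def boolean_query(op, *keywords):
    """Join keywords with a Boolean operator (AND, OR or NOT)."""
    op = op.upper()
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {op}")
    # Quote multi-word keywords so they are searched as phrases.
    terms = [f'"{k}"' if " " in k else k for k in keywords]
    return f" {op} ".join(terms)


def esearch_url(term, db="pubmed", retmax=20):
    """Build (but do not send) an ESearch request URL for the given term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


query = boolean_query("AND", "new gene", "microevolution")
print(query)
print(esearch_url(query))
```

The same query string can be pasted directly into the PubMed search box; the URL form is only needed for scripted access.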
Basic literature mining process for new gene evolution, in the form of pseudocode:
While (keywords are not empty)
Do {
    Search the keywords (e.g. “new gene AND microevolution”) in the PubMed system;
    Focus on “review” and “recent” papers;
    Generate new keywords;
}
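The loop can also be sketched in Python. This is only a minimal sketch: `search` and `extract_keywords` are hypothetical callables standing in for a real PubMed query and for the reading step, the toy index is invented data, and the `max_rounds` cap is an added safeguard that the slide does not mention.

```python
from collections import deque


def mine_literature(seed_keywords, search, extract_keywords, max_rounds=10):
    """Iterate search -> read -> new keywords until no keywords remain.

    `search` and `extract_keywords` are hypothetical callables supplied by
    the caller (for example a PubMed query and a manual reading step).
    """
    pending = deque(seed_keywords)   # keywords still to be searched
    seen = set(seed_keywords)        # keywords already queued
    papers = []                      # collected papers, in discovery order
    rounds = 0
    while pending and rounds < max_rounds:
        keyword = pending.popleft()
        # Focus on reviews and recent papers, as in the pseudocode.
        hits = [p for p in search(keyword) if p["is_review"] or p["recent"]]
        for paper in hits:
            if paper not in papers:
                papers.append(paper)
            for new_kw in extract_keywords(paper):
                if new_kw not in seen:
                    seen.add(new_kw)
                    pending.append(new_kw)
        rounds += 1
    return papers


# Toy stand-ins for a real search and reading step (hypothetical data).
TOY_INDEX = {
    "new gene AND microevolution": [
        {"title": "Review A", "is_review": True, "recent": False,
         "keywords": ["gene duplication"]},
        {"title": "Old note", "is_review": False, "recent": False,
         "keywords": ["ignored"]},
    ],
    "gene duplication": [
        {"title": "Paper B", "is_review": False, "recent": True,
         "keywords": ["new gene AND microevolution"]},
    ],
}

found = mine_literature(
    ["new gene AND microevolution"],
    search=lambda kw: TOY_INDEX.get(kw, []),
    extract_keywords=lambda paper: paper["keywords"],
)
print([p["title"] for p in found])
```

Tracking the keywords already seen keeps the loop from cycling when two papers point at each other's keywords.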
EBI is rigorous
• Powerful tool development: EMBOSS, ClustalW, PICR, InterProScan, Bioconductor
• Records
  DNA: uID, Annotation, …
  RNA: uID, Expression, …
  Protein: uID, Structure, …
• Ensembl MySQL interface
  Well-defined tables
  Perl API
Level 2 database
[Figure labels: in situ hybridization, high throughput, deep sequencing]
More specialized and curated details!
4. How to deal with biological data?
Data analysis strategy
Bioinformatician: data-oriented
Biologist: question-driven
Analysis pipelines are usually generated by data-oriented tasks, while question-driven tasks demand a combination of different pipelines.
Data-oriented
Sequence analysis
Sequence:
• Features: GC content, codon usage
• Motif/domain recognition
• Secondary structure analysis
• Searching for similar sequences
• Phylogenetic analysis
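Two of the sequence features listed here, GC content and codon usage, are simple enough to compute by hand. The sketch below is a minimal illustration in Python; the function names and the toy sequence are assumptions for this example, and codons are read in frame from the first base.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / total


def codon_usage(cds):
    """Count codons in a coding sequence read in frame from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


cds = "ATGGCCGCCTAA"   # toy sequence, not a real gene
print(gc_content(cds))
print(codon_usage(cds))
```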
Data-oriented
Genome analysis
Genome:
• De novo assembly
• Annotation
• Comparative genomics
• Population genetics
• Resequencing
• Variation detection
• Quantitative trait locus analysis
• Genome-wide association study
Data-oriented
Transcriptome analysis
RNA:
• Profiling: identification of differentially expressed genes, enrichment analysis
• Small RNA: identification of small RNAs, target prediction
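To make the profiling steps more concrete, here is a minimal Python sketch of two common calculations: a log2 fold change between two groups of expression values, and a one-sided hypergeometric test for enrichment of a gene set among selected genes. The function names, the pseudocount and the toy numbers are assumptions for illustration; a real analysis would rely on a proper statistical model, such as the Limma package covered in this course.

```python
from math import comb, log2


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean expression, with a pseudocount to avoid log(0)."""
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    return log2((mean_t + pseudocount) / (mean_c + pseudocount))


def enrichment_pvalue(total_genes, genes_in_set, selected, selected_in_set):
    """One-sided hypergeometric P(X >= selected_in_set)."""
    upper = min(genes_in_set, selected)
    favourable = sum(
        comb(genes_in_set, k) * comb(total_genes - genes_in_set, selected - k)
        for k in range(selected_in_set, upper + 1)
    )
    return favourable / comb(total_genes, selected)


# Toy numbers: 20 genes in total, 5 in the gene set,
# 4 differentially expressed genes, 3 of them in the set.
print(log2_fold_change([7, 7], [1, 1]))
print(enrichment_pvalue(20, 5, 4, 3))
```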
Data-oriented
Proteomics analysis
Protein:
• Interaction identification
• Peptide mass fingerprinting
• Novel protein/peptide identification
• Profiling
• Structural proteomics
• Homology modeling
• Function inference
The dream of bioinformaticians
From data integration to systems biology
http://intermine.modencode.org/release-26/flyRegulatoryNetwork.do
Question-driven
Information:
• Drosophila gene
• Age-dependent
• Male-biased
• Distribution
Background:
• 12 Drosophila species
• D. melanogaster as the model
• Sex chromosome evolution
Cited from http://rana.lbl.gov/drosophila/
Quick thoughts
Solution
• Fetch gene information from databases
• Determine the presence of genes in different species
• Identify male-biased genes
• Analyze their chromosomal distribution
• Date D. melanogaster genes on the Drosophila phylogenetic tree
• Infer the mechanism of gene origination
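The middle steps of this plan, finding male-biased genes and examining their chromosomal distribution by gene age, can be sketched in a few lines of Python. Everything here is hypothetical: the gene records are invented toy data, not results, and the field layout and function names are assumptions for the example.

```python
from collections import Counter

# Hypothetical toy records: (gene id, chromosome arm, sex bias, age class).
GENES = [
    ("g1", "X", "male", "young"),
    ("g2", "X", "male", "young"),
    ("g3", "2L", "male", "old"),
    ("g4", "X", "unbiased", "old"),
    ("g5", "3R", "male", "old"),
    ("g6", "2L", "female", "young"),
    ("g7", "3L", "male", "young"),
    ("g8", "X", "male", "old"),
]


def male_biased_distribution(genes):
    """Count male-biased genes per (age class, chromosome) pair."""
    counts = Counter()
    for _gene_id, chrom, bias, age in genes:
        if bias == "male":
            counts[(age, chrom)] += 1
    return counts


def x_linked_fraction(genes, age):
    """Fraction of male-biased genes of a given age class that are X-linked."""
    chroms = [c for _g, c, bias, a in genes if bias == "male" and a == age]
    return chroms.count("X") / len(chroms) if chroms else 0.0


print(male_biased_distribution(GENES))
print(x_linked_fraction(GENES, "young"), x_linked_fraction(GENES, "old"))
```

In a real analysis the records would come from the database queries in the first step, and the age classes from dating genes on the phylogenetic tree.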
5. Start your career as a bioinformatician
Two major types of bioinformaticians
Work as a data analyst?
• You should be familiar with different types of biological data and capable of dealing with them in any way
• Strong computer skills
  • Proficient with bioinformatics software
  • Expert in Linux
  • Very familiar with at least one programming language
  • Know how to set up a database
  • Familiar with algorithms and able to build models
Or dedicate yourself to biological questions?
• The most important thing is a sophisticated understanding of biological concepts
• Basic computing skills
  • Comfortable working in Linux
  • Able to write scripts
• Maybe some wet-lab experimental skills
  • PCR
  • Cloning
Useful URLs
Biological sources
• NCBI http://www.ncbi.nlm.nih.gov/
• EBI http://www.ebi.ac.uk/
Related labs
• Our lab http://gene.cqu.edu.cn/labweb/
• CBI http://www.cbi.pku.edu.cn/