Genome Science Syllabus - NCSU Bioinformatics Research Center
NCSU Summer Institute of Statistical Genetics, Raleigh 2004:
Genome Science
Session 1: Genome Projects
Genome Science Syllabus
1. Introduction to Genome Projects (PH)
2. Genome Sequencing (PH)
3. SNPs and Variation (GG)
4. Gene Expression Profiling (PH)
5. Proteomics and Functional Genomics (GG)
6. Metabolomics and in silico Genomics (PH)
Genomics Timeline
The Central Dogma
DNA → DNA (Replication)
DNA → mRNA (Transcription)
mRNA → Protein (Translation)
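The three steps of the central dogma can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool: the function names and the example sequence are hypothetical, the codon table is the standard genetic code, and translation simply starts at the first AUG and stops at the first in-frame stop codon.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Replication: the complementary strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def transcribe(coding_strand):
    """Transcription: the mRNA matches the coding strand, with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Translation: read codons from the first AUG up to the first stop codon."""
    dna = mrna.replace("U", "T")
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCCTAA"  # hypothetical toy coding sequence
    print(reverse_complement(gene))
    print(transcribe(gene))
    print(translate(transcribe(gene)))
```

Real transcripts also carry untranslated regions and, in eukaryotes, introns, which this sketch ignores.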
Gene Structure
• Bacteria (diagram labels): Promoter; Transcription Start; Translation Start; Gene 1, Gene 2, Gene 3; Translation stop
• Eukaryotes (diagram labels): REGULATORY REGIONS; Promoter; Transcription Start; Translation Start; Exon 1, Intron 1, Exon 2, Intron 2, Exon 3; Translation stop
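The eukaryotic exon/intron layout can be illustrated with a small splicing sketch. The sequences, coordinates, and the `splice` helper below are hypothetical toy values chosen for illustration; they are not taken from the slides or from any real gene.

```python
# Toy pre-mRNA assembled from three exons and two introns (hypothetical sequences).
EXON_1, INTRON_1 = "AUGGCC", "GUAAGUAG"
EXON_2, INTRON_2 = "AAAUUU", "GUCCCAG"
EXON_3 = "GGGUAA"
PRE_MRNA = EXON_1 + INTRON_1 + EXON_2 + INTRON_2 + EXON_3

# Exon coordinates as 0-based, end-exclusive intervals on the pre-mRNA.
EXONS = [(0, 6), (14, 20), (27, 33)]


def splice(pre_mrna, exons):
    """Remove introns by joining the exon intervals in transcript order."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


if __name__ == "__main__":
    # Constitutive splicing keeps all three exons.
    print(splice(PRE_MRNA, EXONS))
    # Skipping exon 2 gives a shorter, alternatively spliced transcript.
    print(splice(PRE_MRNA, [EXONS[0], EXONS[2]]))
```

Bacterial genes have no introns in this picture, so the same transcript is translated without a splicing step.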
Gene Regulation
• In any given cell type, only 10 - 20% of all genes are “expressed”
• Regulation at the transcription level occurs by trans-acting proteins binding to cis-acting regulatory enhancers and repressors
• Post-transcriptional regulation includes:
- alternative splicing
- RNA localization
- translational control
- protein modification
• Maybe 15% - 30% of any genome is dedicated to gene regulation
Big Questions
• How can we find genes in the midst of genome sequence?
• Which genes are expressed in which tissues?
• What do all the proteins encoded by the genes do?
• What is the molecular nature of genetic variation?
• What lies beyond reductionist genetics: network genomics?
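As a rough illustration of the first question, one of the simplest ways to look for candidate genes is to scan for open reading frames (ORFs): a start codon followed in frame by a stop codon. The sketch below is an assumption-laden toy, not the method used by any genome project: it scans only the three forward frames, ignores introns, and the `find_orfs` name and `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) intervals of forward-strand ORFs.

    Intervals are 0-based and end-exclusive, and include the stop codon.
    min_codons counts the codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # keep scanning for further ORFs in this frame
    return sorted(orfs)


if __name__ == "__main__":
    sequence = "CCATGAAATTTGGGTAACC"  # hypothetical toy sequence
    for start, end in find_orfs(sequence):
        print(start, end, sequence[start:end])
```

ORF scanning works reasonably for compact bacterial genomes; eukaryotic gene finding has to account for introns and both strands, and in practice relies on statistical models and expression evidence.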
Introduction to Genome Projects
• Aims of Genomics
• Genetic and Physical Maps
• The Human Genome Project
• Animal Genome Projects
• Plant Genome Projects
• Microbial and Parasite Genome Projects
General Goals of Genomics
(i) Establishment of an integrated web-based database and research interface.
(ii) Assembly of physical and genetic maps of the genome.
(iii) Generation and ordering of genomic and expressed sequences.
(iv) Identification and annotation of the complete set of genes encoded by the genome.
(v) Compilation of atlases of gene expression.
(vi) Accumulation of functional data, including biochemical and phenotypic properties.
(vii) Characterization of DNA sequence diversity.
(viii) Provision of resources for online comparison with other genomes.
Assembling a Genetic Map
Radiation Hybrid mapping
Assembly of physical maps
Chromosome painting
Cat-Human Synteny
Cytological, genetic, & physical maps
A Vision for the HGP
Collins et al, Nature 422: 835-847 (2003)
5 Year Plan 2
1. The Human DNA Sequence
- complete sequence by end of 2003, 2 years earlier than initially planned
- complete one third of sequence by end of 2001
- finish working draft of 90% of genome in mapped clones by end of 2001
2. Sequencing Technology
- continue to increase throughput and reduce cost
- support research on novel technologies and integration with genome projects
3. Human Genome Sequence Variation
- develop technologies for large scale SNP identification and scoring
- identify common variants in coding regions of the majority of identified genes
- create a SNP map of at least 100,000 markers
- develop intellectual foundations for studies of sequence variation
- create public resources of DNA samples and cell lines
4. Technology for Functional Genomics
- develop cDNA resources
- support methods for study of function of non-protein-coding sequences
- develop technology for comprehensive analysis of gene expression
- improve methods for genome-wide mutagenesis
- develop technology for global protein analysis
5. Comparative Genomics
- complete sequences of C. elegans and D. melanogaster genomes by 2002
- develop mouse physical and genetic maps and cDNA resources
- aim to complete mouse genome sequence by 2005
- identify and start work on other important model organisms
6. Ethical, Legal and Social Implications (ELSI)
7. Bioinformatics and Computational Biology
- improve content and utility of databases
- develop better tools for data generation, capture and annotation
- develop and improve tools for comprehensive functional studies
- improve tools for representing and analyzing sequence similarity and variation
- create mechanisms for production of robust, exportable shared software
5 Year Plan 3
I. Genomics to Biology
I-1 Comprehensive identification of structural and regulatory components of the human genome
I-2 Elucidate the organization of genetic regulatory circuits and pathways in cells and organisms
I-3 Develop a detailed understanding of heritable variation in the human genome
I-4 Understand evolutionary variation across species and the mechanisms that generate it
I-5 Develop policy options for dissemination and use of genome information in research and clinics
II. Genomics to Health
II-1 Develop robust strategies for identifying genetic components of disease and drug susceptibility
II-2 Begin to develop strategies for identifying alleles that contribute to “good health”
II-3 Develop molecular taxonomy of disease states and predictive algorithms of progression
II-4 Use new understanding of genes and pathways to develop better therapeutics
II-5 Investigate how genetic information is used in the clinical setting and how it influences choice
II-6 Develop genome-based tools to improve the health of all
III. Genomics to Society
III-1 Develop policy options for use of genomics in medical and non-medical settings
III-2 Understand the relationship between genomics, race, and ethnicity
III-3 Understand the consequences of uncovering the genetic basis of human traits and behaviors
III-4 Assess how to define ethical boundaries for use of genomic information
Whose genome was sequenced?
The NCBI website
http://www.ncbi.nlm.nih.gov/
Mendelian Inheritance
MapViewer
Data Mining Tools
Links to Literature
LocusLink Gene viewer
Basic Local Alignment Search Tool
Gene Annotation
Cancer Genome Anatomy Project
Online Mendelian Inheritance in Man
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
The Cancer Genome Anatomy Project
http://www.ncbi.nlm.nih.gov/ncicgap/
Genes
Chromosomes
Tissues
dbEST
UniGene
HomoloGene
Aberrations
Physical Maps
Genetic Maps
Cancer taxonomy
Cancer cell lines
cDNA libraries
NIH Resources
NCI
NHGRI
NLM
Gene Expression Profiling
Tools
SAGEmap
Digital Differential Display
Virtual Northerns
Bioinformatics
Methods
Resources
Mouse Genomics
http://www.informatics.jax.org/
Genes, Alleles and Phenotypes
Molecular Probes and Reagents
Genetic, Physical and Comparative Maps
Strains and Polymorphisms
Gene Expression
Search Engines
Mouse-Human Synteny
Figure: map scale 0-80 cM with segments labeled 22, 7, 2, 5 and 17; sequence comparison (10K-40K positions, 50%-100% scale) across KIF-3A, IL-4 and IL-13, with annotations 163 bp, 83% and 401 bp, 84%.
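Assuming the percentages in the synteny figure are percent sequence identity between aligned mouse and human segments (the slide does not say so explicitly), the calculation can be sketched as follows. The `percent_identity` helper and the example sequences are hypothetical, and the sketch uses one simple convention: identical columns divided by alignment length, with gap columns counted as mismatches.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base.

    Both inputs must already be aligned to the same length; "-" marks a gap
    and a gap column is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    matches = sum(
        a == b and a != "-"
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    mouse = "ACGTACGTAC"  # hypothetical aligned fragment
    human = "ACGTTCGTAC"  # differs at one of ten positions
    print(percent_identity(mouse, human))
```

Other conventions exclude gap columns from the denominator, so reported identities are only comparable when the same definition is used.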
Online Mendelian Inheritance in Animals
http://morgan.angis.su.oz.au/Databases/BIRX/omia/
Cat
Salmon
Cow
Chicken
Tilapia
Sheep
Turkey
Pig
Horse
Deer
Little Humans with wings
Figure: six panels, each with a 0.00-1.00 axis and Fly, Worm and Yeast categories: Endocrine Diseases (31); Cancer (67); Neurological Disorders (59); Renal and Hematological (34); Malformation Syndromes (34); Immune, Metabolic and Other (58).
Plant Genome Projects
acedb Plant genetic maps
http://www.gramene.org/
Arabidopsis segmental duplications
http://www.arabidopsis.org/
Defining the minimal genome
A microbial genome at TIGR
http://www.tigr.org/tigr-scripts/CMR2/CMRHomePage.spl
The SGD yeast database
http://genome-www.stanford.edu/Saccharomyces/
Parasite genomics