PPT - Bioinformatics.ca

http://creativecommons.org/licenses/by-sa/2.0/
Lecture 4.2
1
Lecture 5.2:
Genome Annotation
Francis Ouellette
[email protected]
New slides
• All slides that are different than what you
have in your binder have a “ “ in the left
hand corner of the page. If it’s a modification
of the slide you have in your binder, I’ve tried
to note it as such.
Outline
• What we have
• How we use it
• What we should get out of it
What else?
• Genome sequences (DNA only) by themselves are not as useful
as genomes that are fully annotated.
• The functions of many processes reside in 3D proteins, and the
structure of proteins and RNA is known for only a few sequences.
• We need to know where the protein-coding sequences are, and
what they do: this is a very big challenge in bioinformatics.
• Proteins are not everything: there is also ‘parts’ information
in DNA and RNA (and in carbohydrates and lipids).
• All of this becomes part of “the parts list”, from which all biology
will be understood.
Challenges at building the “Parts List”
• Finding genes involves computational
methods as well as experimental validation
• Computational methods are often inadequate,
and often generate erroneous ‘gene’ (false
positive) sequences which:
– Are missing exons
– Have incorrect exons
– Over-predict genes
– Are missing the 5’ and 3’ UTR
Assumptions we make:
• Reductionist approach still works, albeit we
are now becoming more and more
“systems biologists”
• Evolution drives everything, and will be the
way we figure things out. Or said in another
way:
– Evolutionary relationships and comparisons will be
essential in our efforts to solve and understand
genomes
How we got started:
• GenBank database was populated by
common genes:
– rRNA, tRNA
– Globin
– Histone
– ATPases
– Actin
– Tubulin and others …
Things we are looking to annotate?
• CDS
• mRNA
• Alternative RNA
• Promoter and Poly-A Signal
• Pseudogenes
• ncRNA
Pseudogenes
• As many as 20-30% of all genomic sequence
predictions could be pseudogenes
• Non-functional copy of a gene
– Processed pseudogene
• Retro-transposon derived
• No 5’ promoters
• No introns
• Often includes polyA tail
– Non-processed pseudogene
• Gene duplication derived
– Both include events that make the gene non-functional
• Frameshift
• Stop codons
• We assume pseudogenes have no function, but we
really don’t know!
LOCUS       NG_005487               1850 bp    DNA     linear   ROD 14-FEB-2006
DEFINITION  Mus musculus ubiquitin-conjugating enzyme E2 variant 2 pseudogene
            (LOC625221) on chromosome 6.
ACCESSION   NG_005487
VERSION     NG_005487.1  GI:87239965
KEYWORDS    .
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia;
            Sciurognathi; Muroidea; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1850)
  AUTHORS   Wilson,R.
  TITLE     Mus musculus BAC clone RP24-201D17 from 6
  JOURNAL   Unpublished (2003)
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AC121925.2.
FEATURES             Location/Qualifiers
     source          1..1850
                     /organism="Mus musculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10090"
                     /chromosome="6"
                     /note="AC121925.2 32277..34126"
     gene            101..1750
                     /gene="LOC625221"
                     /pseudo
                     /db_xref="GeneID:625221"
     repeat_region   1792..1827
                     /rpt_family="ID"
ORIGIN
        1 tcttctgcct caattcctca agtgctagta tcatatgccc atgccattat ttttaactcc
       61 cctttttcat gctaagaatt gaacacacgg ccctgcgtgc ggtggtgcgt ctggtagcag
      121 gagaagatgg cggtctccac aggagttaaa gttcctcgta attttcgctt gttggaagaa
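The feature locations in this record (1..1850, 101..1750, 1792..1827) can be read programmatically. The following is a minimal sketch, not an NCBI tool: the function names `parse_location` and `feature_length` are hypothetical, and only the two simplest location forms are handled (a plain `start..end` range and the same range wrapped in `complement(...)`). Joined, fuzzy or remote locations are deliberately rejected.

```python
import re

# Two simple GenBank location forms: "start..end" and "complement(start..end)".
# Anything else (join(...), <1..200, remote accessions) is out of scope here.
_LOCATION = re.compile(r"^(?:complement\((\d+)\.\.(\d+)\)|(\d+)\.\.(\d+))$")


def parse_location(text):
    """Return (start, end, strand) for a simple feature location string."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported location: {text!r}")
    if match.group(1) is not None:
        # complement(...) means the feature lies on the reverse strand
        return int(match.group(1)), int(match.group(2)), "-"
    return int(match.group(3)), int(match.group(4)), "+"


def feature_length(text):
    """Length in bases; GenBank coordinates are 1-based and inclusive."""
    start, end, _strand = parse_location(text)
    return end - start + 1
```

For the pseudogene above, `feature_length("101..1750")` gives 1650 bases out of the 1850 in the record.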
Noncoding RNA (ncRNA)
• ncRNA represent 98% of all transcripts in a
mammalian cell
• ncRNA have not been taken into account in gene
counts, which are based on:
– cDNA
– ORF computational prediction
– Comparative genomics looking at ORF
• ncRNA can be:
– Structural
– Catalytic
– Regulatory
From NW_632744.1
     gene            complement(55100..55691)
                     /locus_tag="CR40465"
                     /note="synonym: CR_tc_AT13310"
                     /db_xref="GeneID:3354945"
     misc_RNA        complement(55100..55691)
                     /locus_tag="CR40465"
                     /note="This annotation is identical to the ncRNA
                     CR_tc_AT13310 annotation, also mapped idenitcally to 2L
                     [20224138,20223553]
                     last curated on Thu Jan 15 13:37:02 PST 2004"
                     /db_xref="FlyBase:FBgn0058465"
                     /db_xref="GeneID:3354945"
Noncoding RNA (ncRNA)
• tRNA – transfer RNA: involved in translation
• rRNA – ribosomal RNA: structural component
of ribosome, where translation takes place
• snoRNA – small nucleolar RNA:
functional/catalytic in RNA maturation
• Antisense RNA: gene regulation/silencing?
Rfam
• Covariance model searches are extremely compute intensive.
A small model (like tRNA) can search a sequence database at a
rate of around 300 bases/sec. The compute time scales roughly
to the 4th power of the length of the RNA, so larger models
quickly become infeasible without significant compute resources.
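To get a feel for the 4th-power scaling, here is a rough back-of-the-envelope sketch. It assumes that cost is exactly proportional to length**4 and that the 300 bases/sec figure applies to a reference model of about 75 nucleotides (a typical tRNA length, chosen here as an assumption; the slide does not state it). The function names are hypothetical.

```python
def relative_search_cost(model_length, reference_length):
    """Cost of a model relative to a reference, assuming cost ~ length**4."""
    return (model_length / reference_length) ** 4


def estimated_search_seconds(database_bases, model_length,
                             reference_length=75, reference_rate=300.0):
    """Estimate search time in seconds.

    reference_rate is the bases/sec achieved by a model of reference_length;
    both defaults are illustrative assumptions, not measured values.
    """
    slowdown = relative_search_cost(model_length, reference_length)
    rate = reference_rate / slowdown  # bases/sec for this model
    return database_bases / rate
```

Under these assumptions, doubling the model length from 75 to 150 nucleotides makes the search 16 times slower, which is why large RNA families quickly become expensive to scan against whole genomes.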
BLAST
• Seeks high-scoring segment pairs (HSP)
– pair of sequences that can be aligned without gaps
– when aligned, have maximal aggregate score
(score cannot be improved by extension or trimming)
– score must be above score threshold S
• Public Search engines
– WWW search form
http://www.ncbi.nlm.nih.gov/BLAST
– Unix command line
blastall -p progname -d db -i query > outfile
• Making your own search space
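The HSP definition above (ungapped, maximal score, above a threshold S) can be illustrated with a small sketch. This is not how BLAST itself is implemented: BLAST seeds on short words and scores with a substitution matrix, whereas this example assumes the two sequences are already lined up position by position and uses a simple match/mismatch score. The names `best_ungapped_segment` and `is_hsp` are hypothetical.

```python
def best_ungapped_segment(seq_a, seq_b, match=1, mismatch=-1):
    """Return (score, start, end) of the maximal-scoring ungapped segment.

    The sequences are compared position by position (no gaps); 'end' is
    exclusive. The returned segment cannot be improved by extending or
    trimming it, which is the defining property of an HSP.
    """
    best = (0, 0, 0)
    running = 0
    start = 0
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if running <= 0:
            # A non-positive prefix can never help: restart the segment here.
            running = 0
            start = i
        running += match if a == b else mismatch
        if running > best[0]:
            best = (running, start, i + 1)
    return best


def is_hsp(seq_a, seq_b, threshold, match=1, mismatch=-1):
    """True if the best ungapped segment scores above the threshold S."""
    score, _start, _end = best_ungapped_segment(seq_a, seq_b, match, mismatch)
    return score > threshold
```

For example, comparing `ACGTACGT` with `TTGTACAA` finds the shared segment `GTAC` at positions 2 to 6 with a score of 4.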
So many matrices...
• Triple-PAM strategy (Altschul, 1991)
– PAM 40: Short alignments, highly similar
• tblastn against ESTs
– PAM 120
– PAM 250: Longer, weaker local alignments
• Looking in the twilight zone
• BLOSUM (Henikoff, 1993)
– BLOSUM 90: Short alignments, highly similar
– BLOSUM 62: Most effective in detecting known
members of a protein family
• Standard on NCBI server – works in most cases
– BLOSUM 30: Longer, weaker local alignments
Protein coding genes in prokaryotes,
and simple eukaryotes
• Use ORF finder
http://www.ncbi.nlm.nih.gov/gorf/orfig.cgi
• Simple ATG/Stop
• Simple link to FASTA formatted files and
BLAST.
• Problems:
– In frame Methionine
– Small protein
• Solution: comparative genomics
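The "simple ATG/Stop" approach can be sketched in a few lines. This is an illustration only, not the NCBI ORF finder: it scans the three forward frames, starts an ORF at the first ATG it meets and ends it at the next in-frame stop codon. The function name and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=1):
    """Return (start, end, frame) tuples for ATG...stop ORFs, forward strand.

    Coordinates are 0-based; 'end' is exclusive and includes the stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # most upstream ATG; later in-frame ATGs are ignored
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None
    return orfs
```

Both problems listed above show up directly in this sketch. In `ATGAAAATGCCCTAA` the scan reports one ORF starting at the first ATG, although the in-frame ATG at position 6 could equally be the true start. And any `min_codons` cutoff high enough to suppress random ORFs also discards genuinely small proteins.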
Figure 11 from: Methods in comparative genomics: genome correspondence, gene identification and
regulatory motif discovery. Kellis M, Patterson N, Birren B, Berger B, Lander ES. J Comput Biol.
2004;11(2-3):319-55.
Saccharomyces cerevisiae, Saccharomyces paradoxus,
Saccharomyces mikatae, Saccharomyces bayanus
Ab initio gene identification
• Goals
– Identify coding exons
– Seek gene structure information
– Get a protein sequence for further analysis
• Relevance
– Characterization of anonymous DNA genomic
sequences
– Works on all DNA sequences
Gene-Finding Strategies
Genomic Sequence
Content-Based
Bulk properties of sequence:
• Open reading frames
• Codon usage
• Repeat periodicity
• Compositional complexity
Site-Based
Absolute properties of sequence:
• Consensus sequences
• Donor and acceptor splice sites
• Transcription factor binding sites
• Polyadenylation signals
• “Right” ATG start
• Stop codons out-of-context
Comparative
Inferences based on sequence homology:
• Protein sequence with similarity to translated product of query
• Modular structure of proteins usually precludes finding
complete gene
Gene-Finding Methods
Genomic Sequence
Rule-Based
Cutoff method:
• Criteria applied sequentially to identify possible exons
• Rank or eliminate candidates from consideration based on
pre-determined cutoff at each step
Neural Network
Composite method:
• Criteria applied in parallel
• Training sets used to optimize performance
• Weight scores in order of importance
Evaluation Statistics
Actual vs. Predicted (diagram): TP, FP, TN, FN
Sensitivity: Fraction of actual coding regions that are correctly
predicted as coding
Specificity: Fraction of the prediction that is actually correct
Correlation Coefficient: Combined measure of sensitivity and specificity,
ranging from –1 (always wrong) to +1 (always right)
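The three statistics can be computed from the TP, FP, TN and FN counts. The slide gives only verbal definitions, so the formulas below are an assumption: they are the ones commonly used in gene-prediction benchmarks, where "specificity" is TP / (TP + FP) (called precision in other fields) and the correlation coefficient is the Matthews form. The function names are hypothetical.

```python
import math


def sensitivity(tp, fn):
    """Fraction of actual coding positions that were predicted as coding."""
    return tp / (tp + fn)


def specificity(tp, fp):
    """Fraction of predicted coding positions that are actually coding."""
    return tp / (tp + fp)


def correlation_coefficient(tp, fp, tn, fn):
    """Combined measure, from -1 (always wrong) to +1 (always right)."""
    denominator = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    if denominator == 0:
        # Happens when a whole row or column of the 2x2 table is empty.
        raise ValueError("correlation coefficient is undefined for these counts")
    return (tp * tn - fp * fn) / denominator
```

As a usage example with made-up counts, `tp=80, fp=20, tn=880, fn=20` gives a sensitivity and specificity of 0.8 each and a correlation coefficient of about 0.78.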
Relative Performance
                      Claverie 1997              Rogic 2000
                      Sn (%)   Sp (%)   CC       CC
Individual Exons
  MZEF                78       86       0.79
  HEXON               71       65       0.64
  SorFind             42       47       0.62
  GRAIL II            51       57       0.47
Gene Structure
  GENSCAN             78       81       0.86     0.91
  FGENES              73       78       0.74     0.83
  GRAIL II/Gap        51       52       0.66
  GeneParser          35       40       0.54
  HMMgene                                        0.91
What works best when?
• Genome survey (draft) data:
expect only a single exon in any given stretch of contiguous sequence
– BLASTN vs. dbEST (3’ UTR)
– BLASTX vs. nr (protein CDS)
• Finished data:
large contigs are available, providing context
– GENSCAN
– HMMgene
What you need
• Compute the prediction
• Confirm with biological sequences (also with
computational tools)
• Integrate all of this
• Annotate genome (often via a GUI: Graphical User
Interface)
• Validate
• Re-annotate/Update
• Check it twice
• Submit to GenBank
Some of the things available:
• EnsEMBL (EBI)
• Sequin (NCBI)
• PseudoCAP (SFU)
• GMOD (CSHL)
• Pegasys (UBiC)
• Apollo (EBI/Berkeley)
ENSEMBL
http://www.pseudomonas.com/
http://bioinformatics.ubc.ca/pegasys/
Pegasys
Example output – GAME XML
(Genome Annotation Markup Elements XML)
• Input to Apollo
– Genome editor created by Berkeley
Drosophila group and Ensembl
– Simultaneously view heterogeneous
computational evidence
– Manually create and/or edit annotations
Apollo
• Apollo is a collaborative project between the Berkeley
Drosophila Genome Project (www.bdgp.org) and
Ensembl (www.ensembl.org). The collaboration was
set up to create a tool to initially annotate fly but
which would also be able to annotate and browse any
large eukaryotic genome.
There is a sister developers' website at
www.fruitfly.org/annot/apollo to download the fly
specific Apollo annotation tool.
• All the code is open source and freely downloadable.
Features of Apollo include:
• Zoomable and scrollable feature display down to sequence level,
optimized for display of large regions of genome.
• User configurable feature types (colour, appearance, size, order,
score threshold)
• Can connect directly to the Ensembl web site for the latest
human genome annotation
• Reads/writes GFF format
• Searchable for feature names or sequence string
• Ability to select features and sort by different feature attributes
• All features are linked out to their source database web sites
(ensembl,swissprot,embl,unigene etc)
• Display of genomic sequence and any associated start and stop
codons
• Prints postscript output
• Display is reversible allowing easy interpretation of reverse strand
features.
GenBank Features
-10_signal
-35_signal
3'clip
3'UTR
5'clip
5'UTR
attenuator
CAAT_signal
CDS
conflict
C_region
D-loop
D_segment
enhancer
exon
GC_signal
gene
iDNA
intron
J_segment
LTR
mat_peptide
misc_binding
misc_difference
misc_feature
misc_recomb
misc_RNA
misc_signal
misc_structure
modified_base
mRNA
N_region
old_sequence
polyA_signal
polyA_site
precursor_RNA
primer_bind
prim_transcript
promoter
protein_bind
RBS
repeat_region
repeat_unit
rep_origin
rRNA
satellite
scRNA
sig_peptide
snoRNA
snRNA
S_region
stem_loop
STS
TATA_signal
terminator
transit_peptide
tRNA
unsure
variation
V_region
V_segment
GenBank Features: the important ones
GenBank Features: the abundant one
Gene Prediction Caveats
• Predictions are of protein coding regions
– Do not detect non-coding areas (5’ and 3’ UTR)
– Non-coding RNA genes are missed
• Predictions are for “typical” genes
– Must predict a beginning and an end
– Partial or multiple genes are often missed
– Training sets may be biased
– Methods are sensitive to G+C content
– Weighting of factors may be inordinately biased
Moving along
• Sequencing technology led genomics, and to some
extent bioinformatics
• ESTs complicated things, and were the beginning of
specialized ‘methods’ or ‘functional’ divisions in
GenBank.
• Yeast chromosomes and bacterial chromosomes
rapidly showed us our obvious ineptitude at genome
annotation, and these genomes were simple!
• A controlled vocabulary was necessary, albeit slow to
be created: Gene Ontology (GO), Sequence
Ontology (SO).
Genome annotation problems:
• Assembling the genome
• Analysis & interpretation
• Lack of consistency from gene to gene
• Lack of consistency from person to person
• Lack of controlled vocabulary
• Parts we don’t know
• Bacteria vs mammals
• Graphical user interface
• Gene expression/molecular interactions
• Dimensions
• Updates and maintenance
Some comments about the human genome
• “Finished” February 15, 2001
• Finished April 25, 2003
• Still not fully understood and definitely not finished.
• We are still in the genomic era.
• To get a full “parts list”, we need, as a community, to
develop a system to rigorously find all of the parts of
the human genome:
– Genes
– Protein coding sequences
– Non coding RNAs (ncRNA)
– Identify and understand regulatory sequences
– Many other cool things we don’t know about!
The ideal annotation of “MyGene”
All clones
All SNPs
Promoter(s)
MyGene
All mRNAs
All proteins
All structures
• All protein modifications
• Ontologies
• Interactions (complexes,
pathways, networks)
• Expression (where and
when, and how much)
• Evolutionary relationships
Things we will need to integrate in the future:
• Better gene predictions
• Haplotypes to map complex diseases
• Micro-array/gene expression data
– SAGE data
• Protein-protein interaction data
• GFP (Green Fluorescent Protein)
• Human-base (Entrez Gene)
• Better standardization of annotation protocols.
• Integration!
Some Concluding remarks
• Trust but verify
• Beware of gene prediction tools!
• Always use more than one gene prediction
tool and more than one genome when
possible.
• Active area of bioinformatics research, so be
mindful of the new literature in this area.
http://bioinformatics.ubc.ca/resources/links_directory/?subcategory_id=113
http://bioinformatics.ubc.ca/resources/links_directory/?subcategory_id=39
Finding records needing to be updated?
• Who updates?
Submitters, Journals, “3rd party”
• What to update?
Gene names, citations, new product,
sequencing errors
• Where?
[email protected]
• Why update?
example
From francis Wed Mar 3 22:32:19 1999
To: [email protected]
Subject: D25291 mito
Dear colleagues,
it appears that DDBJ record D25291 is contaminated with mitochondrial
sequences from nucleotide 673 to 1803, as it is identical to mouse
mitochondrial sequence (EMBL V00711) for more than 1100 nucleotides.
I would recommend deleting that segment of the record, or removing the
record altogether, as it leads to unfortunate misinterpretation of the
data when using GenBank or DDBJ. The protein sequence (which is
erroneous, as it is all of mitochondrial origin) should definitely be
removed as well.
….
LOCUS       MUSNGH       1803 bp    mRNA            ROD       29-AUG-1997
DEFINITION  Mouse neuroblastoma and rat glioma hybridoma cell line NG108-15
            cell TA20 mRNA, complete cds.
ACCESSION   D25291
Sequence Updated
LOCUS       MUSNGH       1803 bp    mRNA            ROD       29-AUG-1997
DEFINITION  Mouse neuroblastoma and rat glioma hybridoma cell line NG108-15
            cell TA20 mRNA, complete cds.
ACCESSION   D25291
NID         g1850791
VERSION     D25291.1  GI:1850791

LOCUS       MUSNGH        619 bp    mRNA            ROD       12-MAR-1999
DEFINITION  Mouse neuroblastoma and rat glioma hybridoma cell line NG108-15
            mRNA.
ACCESSION   D25291
VERSION     D25291.2  GI:4520413

Changed: length, Version, DEF, Date, GI
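The two VERSION values show how a sequence update is tracked: the accession stays the same while the version suffix and the GI change. A minimal sketch of that comparison follows, assuming the VERSION value has the `ACCESSION.version GI:number` layout seen in these two records; the helper names are hypothetical.

```python
def parse_version(value):
    """Split 'D25291.2 GI:4520413' into (accession, version, gi)."""
    accession_version, gi_field = value.split()
    accession, version = accession_version.rsplit(".", 1)
    if not gi_field.startswith("GI:"):
        raise ValueError(f"expected a GI: field, got {gi_field!r}")
    return accession, int(version), int(gi_field[3:])


def sequence_was_updated(old_value, new_value):
    """True if both values share an accession and the version increased."""
    old_acc, old_ver, _old_gi = parse_version(old_value)
    new_acc, new_ver, _new_gi = parse_version(new_value)
    return old_acc == new_acc and new_ver > old_ver
```

Applied to the records above, `sequence_was_updated("D25291.1 GI:1850791", "D25291.2 GI:4520413")` returns `True`.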
Courses in program
Required courses:
• MBB 505/MEDG 548C | PROBLEM BASED LEARNING IN BIOINFORMATICS
• MBB 659 | SPECIAL TOPICS IN BIOINFORMATICS
• MBB 841 | BIOINFORMATICS
• CMPT 881 | THEORETICAL COMPUTING
• CMPT 889 | BIOINFORMATICS ALGORITHMS
• CPSC 545 | ALGORITHMS FOR BIOINFORMATICS
Electives:
• CMPT 354 | DATABASE SYSTEMS I
• CMPT 740 | DATA MINING
• CMPT 880 | SPECIAL TOPICS IN MEDICAL IMAGE ANALYSIS
• CPSC 304 | INTRODUCTION TO RELATIONAL DATABASES
• CPSC 504 | DATABASE DESIGN
• HCEP 511 | CANCER EPIDEMIOLOGY
• CPSC 53A | TOPICS IN ALGORITHMS AND COMPLEXITY – BIOINFORMATICS
• INFO 506 | CRITICAL RESEARCH ANALYSIS
• MATH 561 | MATHEMATICAL BIOLOGY
• MATH 612D | TOPICS IN MATHEMATICAL BIOLOGY MATHEMATICS OF INFECTIOUS DISEASES AND IMMUNOLOGY
• MBB 823 | PROTEIN STRUCTURE AND FUNCTION: PROTEOMIC BIOINFORMATICS
• MBB 831 | MOLECULAR EVOLUTION OF EUKARYOTE GENOMES
• MBB 835 | GENOMIC ANALYSIS
• MEDG 505 | GENOME ANALYSIS
• STAT 540 | STATISTICAL METHODS FOR HIGH DIMENSIONAL BIOLOGY
• STAT 802 | MULTIVARIATE ANALYSIS
• STAT 805 | NON-PARAMETRIC STATISTICS AND DISCRETE DATA ANALYSIS
• STAT 890 | STATISTICS SELECTED TOPICS BIOMETRICAL GENETICS
• PATH 531/MEDG 521 | MOLECULAR AND CELL BIOLOGY OF CANCER
Bioinformatics Faculty/Mentors
• David Baillie Bioinformatics, Molecular Biology & Biochemistry, SFU
• Fiona Brinkman (on maternity leave June - Sept/06) Molecular Biology & Biochemistry, SFU
• Ryan Brinkman Medical Genetics, UBC, BC Cancer Research Centre, BCCA
• Jenny Bryan (on maternity leave beginning Jan/06) Statistics and Michael Smith Laboratories, UBC
• Artem Cherkasov Medicine, Division of Infectious Diseases, UBC
• Ann Condon (on sabbatical until Sept/06) Computer Science, UBC
• Martin Ester Computing Science, SFU
• Arvind Gupta Computing Science, SFU
• Phil Hieter Michael Smith Laboratories, UBC
• Holger Hoos Computer Science, UBC
• Steven Jones Program Director, Bioinformatics Genome Sciences Centre, BCCA
• Marco Marra Genome Sciences Centre, BCCA
• Francis Ouellette Director, UBC Bioinformatics Centre (UBiC) Michael Smith Laboratories and Medical Genetics, UBC
• Paul Pavlidis UBC Bioinformatics Centre (UBiC) Psychiatry, UBC
• Frederic Pio Molecular Biology & Biochemistry, SFU
• Cenk Sahinalp Computing Science, SFU
• Wyeth Wasserman Centre for Molecular Medicine & Therapeutics, UBC
• Mark Wilkinson Medical Genetics, UBC
Associate Faculty
Chris Bajdik, Cancer Control Research, BCCA
Andrew Beckenbach, Biological Sciences, SFU
Christopher Beh, Molecular Biology & Biochemistry, SFU
Bruce Brandhorst, Molecular Biology & Biochemistry, SFU
Felix Breden, Biological Sciences, SFU
Hugh Brock, Zoology, UBC
Angela Brooks-Wilson, Genome Sciences Centre, BCCA
Andy Coldman, Cancer Control Strategy, BCCA
Veronica Dahl, Computing Science, SFU
William Davidson, Molecular Biology & Biochemistry, SFU
Charmaine Dean, Statistical & Actuarial Science, SFU
Allen Eaves, Terry Fox Laboratory, BCCA
Connie Eaves, Terry Fox Laboratory, BCCA
Eldon Emberly, Physics, SFU
Joanne Emerman, Anatomy, UBC
Brett Finlay, Michael Smith Laboratories, UBC
Rick Gallagher, Cancer Control Research, BCCA
Raphael Gottardo, Statistics, UBC
Jinko Graham, Statistical & Actuarial Science, SFU
Qianping Gu, Computing Science, SFU
Jiawei Han, Computing Science, SFU
Robert Hancock, Microbiology & Immunology, UBC
Nicholas Harden, Molecular Biology & Biochemistry, SFU
Nancy Hawkins, Molecular Biology & Biochemistry, SFU
Michael Hayden, Centre for Molecular Medicine & Therapeutics, UBC
Charles Haynes, Michael Smith Laboratories, UBC
Rob Holt, Genome Sciences Centre, BCCA
Barry Honda, Molecular Biology & Biochemistry, SFU
Pam Hoodless, Terry Fox Laboratory, BCCA
Keith Humphries, Terry Fox Laboratory, BCCA
Rob Kay, Terry Fox Laboratory, BCCA
Patrick Keeling, Botany, UBC
Leah Keshet, Math, UBC
Ted Kirkpatrick, Computing Science, SFU
Michael Kobor, Medical Genetics, UBC
Ben Koop, Biology, UVic
Jim Kronstad, Michael Smith Laboratories, UBC
Gerry Krystal, Terry Fox Laboratory, BCCA
Wan Lam, Cancer Genetics, BCCA
Peter Lansdorp, Terry Fox Laboratory, BCCA
Nhu Le, Cancer Control Research, BCCA
Michel Leroux, Molecular Biology & Biochemistry, SFU
Victor Ling, Cancer Genetics, BCCA
Calum MacAuley, Cancer Imaging, BCCA
Dixie Mager, Terry Fox Laboratory, BCCA
Brad McNeney, Statistical & Actuarial Science, SFU
Don Moerman, Zoology, UBC
Ed Moore, Physiology, UBC
Gregg Morin, Genome Sciences Centre, BCCA
Colleen Nelson, Surgery, UBC
Raymond Ng, Computer Science, UBC
Robert Olafson, Biochemistry & Microbiology, UVic
Mark Paetzel, Molecular Biology & Biochemistry, SFU
James Piret, Michael Smith Laboratories, UBC
Ann Rose, Medical Genetics, UBC
Miriam Rosin, Cancer Control Research, BCCA
Carl Schwarz, Statistical & Actuarial Science, SFU
Jamie Scott, Molecular Biology & Biochemistry, SFU
Dipankar Sen, Molecular Biology & Biochemistry, SFU
Elizabeth M. Simpson, Centre for Molecular Medicine & Therapeutics, UBC
Michael J. Smith, Molecular Biology & Biochemistry, SFU
John Spinelli, Cancer Control Research, BCCA
Fumio Takei, Terry Fox Laboratory, BCCA
Peter Unrau, Molecular Biology & Biochemistry, SFU
Chris Upton, Biochemistry & Microbiology, UVic
Esther Verheyen, Molecular Biology & Biochemistry, SFU
Ke Wang, Computing Science, SFU
Z. Jane Wang, Electrical and Computer Engineering, UBC
Kay Wiese, IT, SFU
http://bioinformatics.ubc.ca/faculty/
• Application Deadlines:
– Feb 14, 2006 for International applicants
– Mar 21, 2006 for North American applicants.
• For more information, please see the
graduate training website at:
– http://bioinformatics.bcgsc.ca.