Annotation of Sarcocystis neurona scaffolds
Nigel Austin
Turgay Ibrikci
Liliana Lopez Kleine
Marton Megyeri
Caribbean Training Programme on Bioinformatics January 2010
Sarcocystis neurona
• Genus: Sarcocystis - parasitic protozoa
• Occur as sarcocysts (tissue cysts) in the muscle of
mammals, birds, and reptiles.
• In humans – asymptomatic
• Sarcocystis neurona causes equine
protozoal myeloencephalitis
2
S. neurona & Related Apicomplexa
Sarcocystis neurona
Eimeria
Neospora
Toxoplasma
3
Life Cycle of S. neurona
4
About Data
• Data kindly supplied by Dr. Jessica
Kissinger, who had very recently acquired the
genome sequence
• First 120,000 bp in 4 scaffolds – analysis
• Then 400,000 bp in 4 scaffolds – analysis
5
Objectives
• To annotate novel DNA sequences of S.
neurona.
• Detection of coding sequences by:
– comparison with other sequences in
databases
• NB: No reference genome or other info
was available since sequences were
novel
6
Strategy for Scaffolds
• BLASTX in nr db: search of the translated
nucleotide sequence against a protein database
• TBLASTX in est db: search of the translated
nucleotide sequence against a translated
nucleotide (EST) database
• Comparison in ACT with most closely
related organisms (Toxoplasma gondii and
Neospora caninum)
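
Both BLASTX and TBLASTX start by translating the nucleotide query in all six reading frames. The sketch below illustrates that step only; it is not the BLAST implementation. It assumes the standard genetic code, and the function names are hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return the translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    frames = {}
    rc = reverse_complement(seq)
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each of the six protein strings is then what gets compared against the protein database (BLASTX) or against six-frame translations of the database sequences (TBLASTX).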
7
Results – BLAST Search
8
Results – BLAST
Program  DB   Start  End    Similarity (%)  E-value   Subject
BLASTX   nr   41446  42924  71              2.00E-16  Conserved hypothetical protein Toxoplasma gondii
BLASTX   nr   41464  42942  42              2.00E-44  Conserved hypothetical protein Plasmodium falciparum
BLASTX   nr   41464  42942  44              2.00E-42  Conserved hypothetical protein Plasmodium vivax
BLASTX   nr   41464  42942  41              1.00E-37  Conserved hypothetical protein Plasmodium berghei
BLASTX   nr   41464  42942  40              1.00E-37  Conserved hypothetical protein Cryptosporidium muris
BLASTX   nr   41464  42942  43              1.00E-22  Conserved hypothetical protein Cryptosporidium parvum
BLASTX   nr   10632  10992  69              6.00E-33  Putative lectin domain protein Toxoplasma gondii
BLASTX   nr   32690  32968  66              7.00E-18  Transcript GF18541 Drosophila melanogaster
BLASTX   nr   32690  32968  66              6.00E-17  Putative acylphosphatase Aedes aegypti
BLASTX   nr   32690  32968  69              4.00E-16  Putative acylphosphatase Toxoplasma gondii
TBLASTX  est  1538   1840   45              5.00E-08  Xenopus mRNA (cDNA library)
TBLASTX  est  1538   1840   51              1.00E-07  Cyprinus carpio mRNA (cDNA library)
TBLASTX  est  10986  10967  82              5.00E-10  T. gondii mRNA (cDNA library)
TBLASTX  est  14716  14904  87              2.00E-33  T. gondii mRNA (cDNA library)
9
ACT Results
Match of region with a conserved gene in Neospora caninum and Toxoplasma gondii
[Figure: ACT comparison; tracks labelled "Neospora caninum" and "scaffolds"]
10
Hmmm….
• No genes in 400,000
bp DNA???
• And then….
• Expertise, experience
• He was able to locate
a gene
11
Gene Discovered!
Match of region with a conserved gene in Neospora caninum
12
Discovered Gene - Gene1
• The discovered gene was extended at
both the 5’ and 3’ ends
• Start and stop codons were identified
• Protein sequence was determined
• BLAST – hypothetical protein with high
similarity to one found in Neospora and
Toxoplasma
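
The extension step above can be sketched as follows: starting from a matched region, walk upstream in frame to the nearest stop codon, take the first ATG after it as the start, then walk downstream to the first in-frame stop. This is only an illustration under strong assumptions: the hit is on the forward strand, its start is in the correct reading frame, and the gene has no introns (which may not hold for this genome). The function name and the 0-based, end-exclusive coordinates are hypothetical choices, not the procedure actually used on the slides.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_to_orf(seq, hit_start, hit_end):
    """Extend an in-frame, forward-strand hit to a full open reading frame.

    Coordinates are 0-based and end-exclusive. Returns (orf_start, orf_end),
    with the stop codon included, or None if no start or stop is found.
    """
    seq = seq.upper()

    # Walk upstream codon by codon until a stop codon or the sequence edge.
    pos = hit_start
    while pos - 3 >= 0 and seq[pos - 3:pos] not in STOP_CODONS:
        pos -= 3

    # The most upstream in-frame ATG after that stop is taken as the start.
    orf_start = None
    for i in range(pos, hit_end, 3):
        if seq[i:i + 3] == "ATG":
            orf_start = i
            break
    if orf_start is None:
        return None

    # Walk downstream from the start until the first in-frame stop codon.
    for j in range(orf_start, len(seq) - 2, 3):
        if seq[j:j + 3] in STOP_CODONS:
            return orf_start, j + 3
    return None


if __name__ == "__main__":
    # Toy sequence: stop, spacer, ATG, three codons, stop, trailing bases.
    toy = "TAAGCAATGAAACCCGGGTAGCC"
    start, end = extend_to_orf(toy, 12, 18)
    print(start, end, toy[start:end])
```

The resulting nucleotide range can then be translated to obtain the protein sequence used in the follow-up BLAST search.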
13
Gene Comparison
Match of region with a conserved gene in Neospora caninum and Toxoplasma
gondii
Neighbouring genes are not present in the scaffold.
14
Results – Uniprot Search
Performed with GENE1
15
Further Protein Info
• Characterize our protein product
– Membrane protein? Regions of high
hydrophobicity
– Domains and motifs
– Secondary structures
16
Hydrophobicity Graph
No transmembrane motifs present
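
A hydrophobicity graph of this kind is commonly produced with a sliding-window average over a per-residue scale. The slides do not say which tool or scale was used, so the sketch below assumes the Kyte-Doolittle scale with the conventional 19-residue window and a 1.6 cutoff for candidate transmembrane segments; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of every full window along the protein.

    Residues outside the 20 standard amino acids score 0.0 (an assumption).
    Proteins shorter than the window give an empty profile.
    """
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_candidate_tm_segment(protein, window=19, threshold=1.6):
    """True if any window average exceeds the threshold."""
    return any(value > threshold for value in hydropathy_profile(protein, window))


if __name__ == "__main__":
    print(has_candidate_tm_segment("L" * 25))     # strongly hydrophobic stretch
    print(has_candidate_tm_segment("DEKR" * 10))  # charged, hydrophilic stretch
```

A profile that never rises above the cutoff is consistent with the conclusion on this slide, although dedicated transmembrane predictors are more reliable than a simple threshold.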
17
Domains & Motifs
18
Conclusion
• Various BLAST searches may assist in locating
orthologous genes in other genomes
• ACT is a very useful tool for gene discovery and
annotation (along with experience & expertise)
• One gene (Gene1) was found in 400 kb of DNA –
the scaffolds may lie in a gene-poor region of the genome
• Gene1 is perhaps orthologous with a gene in
Toxoplasma and Neospora
• Hypothetical gene – no function ascribed to it
19
Thank You!!!
20