Cas_ProteinsFinal

CRISPR-associated Proteins
Sarah Pyfrom
Research Questions
 Which Cas proteins does our species share with the 10 other species we chose to study?
 If any are shared, how do they compare?
 How do Cas-proteins function in relation to CRISPR units?
[Edit]:
 Why did JGI change its annotation?
Cas Proteins
 Proteins that are almost always associated with (near)
CRISPR sequences
 Originally four major families
 Now, at least 45 families total
JGI annotation
“Old” Cas-Proteins
 Cas1
 Cas2
 Cas3
 Cas4
 TM1800
 TM1801
“New” Cas-Proteins
 Cas1
 Cas2
 Cas4
 Cas5
 Cas6
 Csh1
 Csh2
Changes:
 TM1800 = Cas5
 TM1801 = Csh2
 Hypothetical protein = Csh1
 Part of hypothetical protein = Cas6
 Cas3 = hypothetical protein
 Cas4:
 MTDSSGDPVDRFLAAARDESAELPFRLTGVMFQYYVVCER
ELWFLSRDVEIDRDTPAIVRGSDVDDSAYADKRRDVRVDGII
AIDVLDSGEILEVKPSSSMTEPARLQLLFYLWYLDRVTGVEK
TGVLAHPAEKRRETVELTPETSAEVESAIEGIRAVVTAESPPP
AEEKPVCDSCAYHDFCWSC (red = original Cas4)
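The renaming listed under Changes can be written as a small lookup table. This is a minimal sketch that only restates the slide: the dictionary name `OLD_TO_NEW` and the helper `new_name` are hypothetical, and Csh1 and Cas6 are left out because they were carved out of previously hypothetical proteins rather than renamed from an old label.

```python
# Old JGI label -> new JGI label, as listed on the "Changes" slide.
# Cas1 and Cas2 kept their names; the old Cas3 call became a hypothetical protein.
OLD_TO_NEW = {
    "Cas1": "Cas1",
    "Cas2": "Cas2",
    "Cas3": "hypothetical protein",
    "Cas4": "Cas4",  # same name, but the protein boundaries were revised
    "TM1800": "Cas5",
    "TM1801": "Csh2",
}


def new_name(old_label: str) -> str:
    """Return the revised label; unknown labels are passed through unchanged."""
    return OLD_TO_NEW.get(old_label, old_label)


print(new_name("TM1800"))
```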
Map of CRISPR region
[Figure: map of the CRISPR region. Labels: TM1800, TM1801, transposases, Cas3, hypothetical proteins, CRISPR, Cas1, Cas2, unidentified, Cas4, Csh1, Cas5, Cas6, Csh2]
Cas1
(from Sulfolobus solfataricus)
 High-affinity nucleic acid binding protein
 Binds DNA, RNA and DNA–RNA hybrids
 Binding is sequence non-specific, in a multi-site binding mode
 Promotes the hybridization of complementary nucleic acid strands
From: SSO1450 – A CAS1 protein from Sulfolobus solfataricus P2 with high affinity for RNA and DNA
Cas3
 Usually similar to helicases
 Unwinds double-stranded DNA
 Thought to be involved in DNA metabolism and repair
Cas4
 Often resembles RecB exonucleases
 Breaks down nucleic acid strands
 Thought to be involved in DNA metabolism
Cas2
 Function unknown
From: GenBank
Cas5
 Often found with Cas1 and Cas6
 Share an N-terminal region of about 43 amino acids in length
 Are usually 210-265 amino acids long
From: EMBL IPR013422 profile page (http://www.ebi.ac.uk/interpro/IEntry?ac=IPR013422)
Cas6
 Characterized by a GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus
From: Sanger PF09559 profile page (http://pfam.sanger.ac.uk/family/PF09559)
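The Cas6 motif GhGxxxxxGhG can be searched for with a regular expression. The sketch below is illustrative only: the set of residues counted as hydrophobic is an assumption (the source only says "h indicates a hydrophobic residue"), the function name `find_motif` is hypothetical, and the test sequence is a made-up toy, not a real Cas6.

```python
import re

# Assumption: these residues are treated as "hydrophobic"; the source does not list them.
HYDROPHOBIC = "AVLIMFWY"

# G h G x x x x x G h G, where x is any residue.
MOTIF = re.compile(rf"G[{HYDROPHOBIC}]G.{{5}}G[{HYDROPHOBIC}]G")


def find_motif(protein: str):
    """Return (start, matched_text) for the hit closest to the C-terminus, or None."""
    hits = [(m.start(), m.group()) for m in MOTIF.finditer(protein.upper())]
    return hits[-1] if hits else None


toy_protein = "MKTGLGAAAAAGVGK"  # toy sequence with the motif planted at position 3
print(find_motif(toy_protein))
```

Reporting the last hit reflects the note that the motif sits at the C-terminus; a stricter check could also require the hit to fall within some distance of the end of the protein.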
Csh1 and Csh2?
 Protein families were defined for ease of alignment
 Often large differences between species
 Alignment is easier if the protein “soup” is divided into more readily compared subgroups
CRISPRs thought to create stable secondary RNA structures
 Spacers remain associated with their direct repeat (DR) neighbors
 Provide a way for Cas proteins to recognize the spacers and facilitate the immune response
From: Evolutionary conservation of sequence and secondary structures in CRISPR repeats
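One simple way to see why a repeat could fold into a stable RNA structure is to count how many bases at its two ends can pair with each other, which would form the stem of a hairpin. This is a rough sketch under stated assumptions: only Watson-Crick pairs are counted, loop-size constraints are ignored, the function name `stem_length` is hypothetical, and the example repeat is a toy sequence rather than a real CRISPR repeat.

```python
# Watson-Crick pairs only; G-U wobble pairs are deliberately left out of this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def stem_length(repeat: str) -> int:
    """Count consecutive base pairs formed between the 5' and 3' ends, working inward."""
    rna = repeat.upper().replace("T", "U")  # accept DNA spelling of the repeat
    left, right = 0, len(rna) - 1
    paired = 0
    while left < right and (rna[left], rna[right]) in PAIRS:
        paired += 1
        left += 1
        right -= 1
    return paired


toy_repeat = "GGGCAAAAGCCC"  # toy hairpin: 4-bp stem around an AAAA loop
print(stem_length(toy_repeat))
```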
Cas-Proteins and Immunity
 Thought to act like Slicer and Dicer (eukaryotic counterparts)
 Create siRNA that will inhibit/break down invading RNA
 Not known if Cas-proteins are involved in integrating
pathogenic DNA into spacers
Video of eukaryotic siRNA process: http://www.youtube.com/watch?v=D77BvIOLd0
Alignments of Cas
 Compared Cas1, Cas2, Cas3 etc. proteins across all 10 species…
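A common way to summarize such a comparison is percent identity between two aligned sequences. The slides do not say which tool or score was used, so the following is only a sketch of one possible measure: the function name `percent_identity` is hypothetical, the inputs are assumed to be already aligned (gaps written as "-"), and the example strings are toy data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned pair: 6 ungapped columns, 5 of them identical.
print(round(percent_identity("MTDS-GD", "MTESSGD"), 1))
```

Skipping gap columns keeps a shortened protein from being penalized for residues it simply lacks; counting gaps as mismatches is an equally defensible choice and gives lower values.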
Comparison with other species (based on “old” proteins):
Species           Cas1  Cas2  Cas3  Cas4  TM1800  TM1801
X
X
X
X
H. vallismortis
H. volcanii
H. sulfurifontis
X
H. sinaiiensis
X
X
X
H. californiae
X
X
H. utahensis      X     X     X     X     X       X
H. mucosum        X     X     X     X     X       X
H. mediterranei   X     X     X     X     X       X
H. denitrificans  X     X     X     X     X       X
H. mukohataei     X     X     X     X     X       X
Phylogenetic tree comparing amino acid sequences for all Cas proteins
[Figure: phylogenetic tree. Leaf labels: 1, 2, 3, 4, 1800, 1801. Species: Halomicrobium mukohataei, Haloarcula sinaiiensis, Haloarcula californiae, Haloferax denitrificans, Haloferax mediterranei, Haloferax sulfurifontis, Haloferax mucosum, Halorhabdus utahensis]
Cas1 and Cas2 did not change
Cas4
 JGI revision shortened this protein
 Would expect low sequence similarity near end of protein
TM1801 (Csh2)
 Revision by JGI simply renamed this protein
 Would expect sequence similarity
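The expectation that a shortened protein shows low similarity near its end can be checked by scoring an alignment window by window. This is a minimal sketch, not the method used for these slides: the function name `window_identity` is hypothetical, gaps are counted as mismatches so that a missing tail shows up as a drop, and the two sequences are toy data.

```python
def window_identity(aligned_a: str, aligned_b: str, window: int = 10) -> list:
    """Fraction of identical columns in consecutive windows; gaps count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    scores = []
    for start in range(0, len(aligned_a), window):
        columns = list(zip(aligned_a[start:start + window],
                           aligned_b[start:start + window]))
        same = sum(1 for a, b in columns if a == b and a != "-")
        scores.append(same / len(columns))
    return scores


# Toy case: the second sequence lacks the last 10 residues of the first.
full_length = "ACDEFGHIKLMNPQRSTVWY"
shortened = "ACDEFGHIKL----------"
print(window_identity(full_length, shortened))
```

A renamed but otherwise unchanged protein would be expected to score close to 1.0 in every window, while a truncated one would drop in the final windows.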
In conclusion:
We don’t know much….
…but we do know everything that
everybody else knows.
Questions?
References

Kunin, V., Sorek, R., Hugenholtz, P. (2007). Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biology. http://genomebiology.com/2007/8/4/R61. Accessed 24 Nov 2009.

Haft, D.H., Selengut, J., Mongodin, E.F., Nelson, K.E. (2005). A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol. http://www.ncbi.nlm.nih.gov/pubmed/16292354. Accessed 24 Nov 2009.