H. diversicolor - Springer Static Content Server

Supplemental information
A. Sequence collection
B. qPCR calibration formula
Database subcollections and their taxonomic relationships

Taxon / species                         Symbol   Database
Mollusca
  Gastropoda
    Patellogastropoda
      Lottia gigantea                   Lgi      Genome, EST
    Vetigastropoda
      Haliotis
        Haliotis asinina                Has      EST
        Haliotis midae                  Hmi      Transcriptome
        Haliotis diversicolor           Hdiv     Transcriptome
        Haliotis discus                 Hdis     EST
    Caenogastropoda
      Littorina littorea                Lli      mRNA
    Heterobranchia
      Aplysia californica               -        EST
  Bivalvia
    Heteroconchia
      Meretrix meretrix                 Mme      Transcriptome
    Pteriomorphia
      Ostreoida
        Crassostrea angulata            Can      Transcriptome
      Mytiloida
        Mytilus galloprovincialis       Mga      EST
  Cephalopoda
    Octopus vulgaris                    -        Transcriptome
Gene symbols and corresponding GenBank IDs

SARP19-like genes
GenBank ID    Symbol
JU004909      Can-II1
JU034913      Can-II2
GD241786      Has-I1
EE676526      Hdis-I1
EG362622      Hdis-I5
DN763845      Hdis-I8
EE664153      Hdis-I9
JU063184      Hdiv-I1
JU069467      Hdiv-I2
JU066047      Hdiv-I3
JU063185      Hdiv-I4
JU071961      Hdiv-I5
JU078033      Hdiv-I6
GT866721      Hdiv-I7
JU063488      Hdiv-II
Cg 4092       Hmi-I2
Cg 5590       Hmi-I3
Cg 4147       Hmi-I4
Cg 19409      Hmi-I5
Cg 1131       Hmi-I6
Cg 9978       Hmi-I7
FC665937      LgiSARP19-1
FC706468      LgiSARP19-2
JI273633      Mme-II1
JI268876      Mme-II2
AF369698      SARP19

vdg3-like genes
GenBank ID    Symbol
GD272046      Has-I1
DY403113      Has-I2
GD241803      Has-I3
DY403155      Has-I4
GD241807      Has-I5
AY916060      Has-I10
EG362075      Hdis-I1
EE675817      Hdis-I10
EE664072      Hdis-I4
EG362106      Hdis-I5
EG362237      Hdis-I7
JU063200      Hdiv-I1
JU063213      Hdiv-I2
JU063214      Hdiv-I3
JU063212      Hdiv-I4
JU063203      Hdiv-I5
JU071971      Hdiv-I6
Cg 21259      Hmi-I3
Cg 22293      Hmi-I5
Cg 18717      Hmi-I7
Cg 22334      Hmi-I8
Cg 22249      Hmi-I9
FC598503      Lgi-vdg3-1
FC641104      Lgi-vdg3-2
FC714859      Lgi-vdg3-3
FL594967      Mga-II1
DQ268867      Mga-II2
FL595024      Mga-II3
AJ625851      Mga-II4
AJ625949      Mga-II5
SARP19 sequence collection
- Step 1.
The AA sequences of H. diversicolor SARP19-I1 (GB:
JU063184) and Littorina littorea SARP19 (GB:
AAM20842) were searched with BLASTp against the
GenBank NR protein database.
Hits were distributed across a wide taxonomic range,
from nematodes to birds (FigS1). This wide distribution
probably results from the conserved EF-hand
calcium-binding motifs.
However, some of the distances are too large to be
consistent with a monophyletic origin, so the sequence
collection needed to be narrowed.
FigS1. Hit pattern of SARP19-I1 against the
GenBank NR protein database
SARP19 sequence collection
- Step 2.
• EST or TSA sequence libraries of three abalones, H. diversicolor, H.
discus and H. asinina (EST), were downloaded from NCBI.
• The CDSs of H. diversicolor SARP19-I1 and Littorina littorea SARP19
were used as tBLASTx queries against these mRNA libraries. Similar
sequences were identified, and the BLAST search was repeated with
them until no new sequence was added.
• Redundant sequences were removed manually, and CDSs were
predicted by interpreting the BLAST results. Because sequencing
errors can break a CDS, obvious break points were corrected manually
by adding or deleting single nucleotides to repair the CDS.
• These CDSs were searched with BLAST against the Lottia gigantea
genome (http://genome.jgi-psf.org/pages/blast.jsf?db=Lotgi1) and EST
collection. New non-redundant CDSs were added to the collection.
• The putative amino acid sequences were aligned with ClustalW and
the alignments were then adjusted manually. Neighbor-joining (NJ)
trees were constructed with MEGA (FigS2).
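The repeat-until-nothing-new search in Step 2 can be sketched as a simple fixed-point loop. This is only an illustration under stated assumptions: `iterative_collect`, the `search` callback and the `toy_hits` table are hypothetical names, and the callback stands in for a real tBLASTx run, which is not reproduced here.

```python
from typing import Callable, Dict, Iterable, Set


def iterative_collect(seeds: Iterable[str],
                      search: Callable[[str], Set[str]],
                      max_rounds: int = 50) -> Set[str]:
    """Repeat a similarity search until no new sequence ID is added.

    `search` is a hypothetical callback that returns the IDs of sequences
    similar to the query (a stand-in for one tBLASTx run).
    """
    collected = set(seeds)
    frontier = set(seeds)          # sequences not yet used as queries
    for _ in range(max_rounds):    # safety cap, an assumption of this sketch
        new_hits: Set[str] = set()
        for query in frontier:
            new_hits |= search(query) - collected
        if not new_hits:           # nothing new: the collection is closed
            break
        collected |= new_hits
        frontier = new_hits
    return collected


# Hypothetical hit table used only to exercise the loop.
toy_hits: Dict[str, Set[str]] = {
    "seed": {"a", "b"},
    "a": {"seed", "c"},
    "b": {"a"},
    "c": set(),
    "unrelated": {"x"},
}

result = iterative_collect(["seed"], lambda q: toy_hits.get(q, set()))
print(sorted(result))  # ['a', 'b', 'c', 'seed']
```

Only newly found sequences are used as queries in the next round, so each sequence is searched once; the manual redundancy removal and CDS repair described above are not modelled.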
FigS2. NJ tree of SARP19-like sequences collected from three
abalones, Lottia gigantea and Littorina littorea. ○, H.
diversicolor; ●, Haliotis discus; ■, Haliotis asinina; △, Lottia
gigantea; ▲, Littorina littorea. Scale bar: 0.2.
SARP19 sequence collection
- Step 3.
• mRNA libraries of Crassostrea angulata and
Meretrix meretrix were added. After searching,
12 more SARP19-like sequences were
recruited.
• The NJ guide tree shows that the collection can
be separated into two distinct groups (FigS3, S4).
• The average branch length clearly differs
between Groups A and B, which may imply that
the two groups evolved under different constraints.
FigS3. NJ tree of 27 SARP19-like sequences collected from
○, H. diversicolor; ●, Haliotis discus; ■, Haliotis asinina;
△, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata;
◆, Meretrix meretrix. Groups A and B are labelled. Scale bar: 0.2.
FigS4. NJ tree of 27 SARP19-like sequences collected from ○, H. diversicolor; ●,
Haliotis discus; ■, Haliotis asinina; △, Lottia gigantea; ▲, Littorina littorea; ◇,
Crassostrea angulata; ◆, Meretrix meretrix. Groups A and B are labelled. Scale bar: 0.2.
SARP19 sequence collection
- Step 4.
• More mRNA sequence libraries were added: Haliotis midae
(Bioproject: PRJNA79815), Aplysia californica (EST) and Octopus
vulgaris (Bioproject: PRJNA79361).
• After searching, 12 more mRNA sequences from H. midae were
recruited. No SARP19-like sequence was found in the sea hare
or octopus neural transcriptomes.
• The NJ guide tree was built as previously described (FigS5).
FigS5. NJ tree of 39 SARP19-like sequences.
○, Haliotis diversicolor; ●, H. discus; □, H. midae; ■, H. asinina;
△, Lottia gigantea; ▲, Littorina littorea; ◇, Crassostrea angulata;
◆, Meretrix meretrix. Figure labels: Group A, Group B, collection
boundary, outgroup for Group A. Scale bar: 0.2.
SARP19 sequence collection
- Step 5.
• Setting the collection boundary
- Best hits. Every sequence in Group A has its best
hit among the other members of Group A. However, some
sequences in Group B show an ambiguous best-hit
pattern.
- To simplify the situation, a boundary was set as
shown in FigS5. The ambiguous sequences can
serve as an outgroup for Group A.
• Finally, the 26 sequences of the collection were searched
against the SWISS-PROT, NR and NT databases to find
any new sequences that fit within the boundary.
However, no new sequence qualified.
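The best-hit check used to set the boundary can be illustrated with a small sketch. It assumes a table of pairwise similarity scores (for example BLAST bit scores) is already available. The function names, the labels and the toy scores are hypothetical, and reading "ambiguous" as "a sequence outside Group A whose best hit falls inside Group A" is an assumption of this sketch rather than the authors' stated rule.

```python
from typing import Dict, Set

Scores = Dict[str, Dict[str, float]]


def best_hit(query: str, scores: Scores) -> str:
    """Return the highest-scoring subject for `query`, ignoring self-hits."""
    hits = {s: v for s, v in scores[query].items() if s != query}
    return max(hits, key=hits.get)


def classify_by_best_hit(core: Set[str], scores: Scores) -> Dict[str, str]:
    """Label each sequence by where its best hit falls relative to `core`."""
    labels = {}
    for query in scores:
        hit_in_core = best_hit(query, scores) in core
        if query in core:
            labels[query] = "core" if hit_in_core else "inconsistent"
        else:
            # Assumption: a non-core sequence whose best hit is a core
            # member is the kind of case treated as ambiguous.
            labels[query] = "ambiguous" if hit_in_core else "outside"
    return labels


# Hypothetical scores, chosen only to exercise each label.
toy_scores: Scores = {
    "A1": {"A2": 300.0, "B1": 80.0, "B2": 60.0},
    "A2": {"A1": 300.0, "B1": 90.0, "B2": 50.0},
    "B1": {"A1": 80.0, "A2": 90.0, "B2": 70.0},
    "B2": {"A1": 60.0, "A2": 50.0, "B1": 70.0},
}

labels = classify_by_best_hit({"A1", "A2"}, toy_scores)
print(labels)
```

In this toy table both Group A members hit each other best, B1 hits a Group A member best and is flagged as ambiguous, and B2 hits B1 best and stays outside.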
Vdg3 sequence collection
- Step 1.
The AA sequence of H. diversicolor vdg3-I1 was
searched with BLASTp against the GenBank NR protein database.
The hit pattern was much simpler than that of
SARP19 (FigS6). No conserved motif was
found in the putative vdg3 proteins.
FigS6. Hit pattern of H. diversicolor vdg3-I1 against the
GenBank NR protein database
Vdg3 sequence collection
- Step 2.
• The CDSs of H. diversicolor vdg3-I1 and H. asinina vdg3 were searched
against the previously constructed sequence libraries and the Lottia
gigantea genome (http://genome.jgi-psf.org/pages/blast.jsf?db=Lotgi1).
• Similar sequences were identified, and the search was repeated
with them until no new sequence was added.
• As in the SARP19 procedure, redundancies were removed, CDSs were
predicted and patched, alignments and NJ trees were
constructed, and collection boundaries were set by best
hits or guide trees.
• Finally, the sequences of the collection were searched against
SWISS-PROT, the NR protein database and the nt DNA database to
recruit any missed sequences that fit within the boundary.
• The NJ tree of 30 vdg3-like proteins is shown in FigS7.
FigS7. NJ tree of 30 vdg3-like
sequences.
○ H. diversicolor;
● H. discus;
□ H. midae;
■ H. asinina;
△ Lottia gigantea;
▲ Littorina littorea;
◇ Crassostrea angulata;
◆ Meretrix meretrix;
▼ Mytilus galloprovincialis.
qPCR calibration formula
• For a set of qPCR reactions in the same run, the fluorescence intensity (designated F)
should equal the same constant (f) when each reaction reaches its threshold cycle
number (Ct), i.e.,
$F_{Ct} = f$  (1).
• Because the fluorescence intensity F in a SYBR Green qPCR system is directly
proportional to the amount of amplicon DNA,
$F = k \cdot N \cdot L$  (2),
where k is an unknown constant, N is the number of molecules and L is the amplicon
length.
• Since
$N = N_0 \cdot E^{Ct}$  (3),
where N0 is the initial number of molecules and E is the PCR efficiency, for each
gene
$k \cdot N_{0(\mathrm{gene})} \cdot E_{(\mathrm{gene})}^{Ct_{(\mathrm{gene})}} \cdot L_{(\mathrm{gene})} = f$  (4).
• Dividing equation (4) for the control gene by equation (4) for the target gene, so that k and f cancel, gives the
calibration formula
$\dfrac{N_{0(\mathrm{target\ gene})}}{N_{0(\mathrm{control\ gene})}} = \dfrac{E_{(\mathrm{control\ gene})}^{Ct_{(\mathrm{control\ gene})}} \cdot L_{(\mathrm{control\ gene})}}{E_{(\mathrm{target\ gene})}^{Ct_{(\mathrm{target\ gene})}} \cdot L_{(\mathrm{target\ gene})}}$  (5),
where OAZ1 was set as the control gene and N0(control gene) in each stage was set to 100.
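Equation (5) can be worked through numerically with a minimal sketch. The function name and the example values (efficiencies, Ct values, amplicon lengths) are hypothetical and serve only to illustrate the arithmetic. E is treated as the per-cycle amplification factor implied by equation (3), so perfect doubling corresponds to E = 2.

```python
def relative_initial_copies(e_target: float, ct_target: float, len_target: float,
                            e_control: float, ct_control: float, len_control: float,
                            control_level: float = 100.0) -> float:
    """Apply equation (5): N0(target) relative to N0(control).

    The control gene's initial amount is fixed at `control_level`
    (100 in the text), so the result is on that scale.
    """
    ratio = (e_control ** ct_control * len_control) / \
            (e_target ** ct_target * len_target)
    return control_level * ratio


# Hypothetical example: equal efficiencies and amplicon lengths,
# target crossing the threshold two cycles later than the control.
same_length = relative_initial_copies(
    e_target=2.0, ct_target=22.0, len_target=150.0,
    e_control=2.0, ct_control=20.0, len_control=150.0,
)
print(same_length)  # 25.0

# Same Ct values, but a target amplicon twice as long as the control.
longer_target = relative_initial_copies(
    e_target=2.0, ct_target=22.0, len_target=300.0,
    e_control=2.0, ct_control=20.0, len_control=150.0,
)
print(longer_target)  # 12.5
```

The second call shows the effect of the length term: a longer amplicon yields more fluorescence per molecule, so the same Ct corresponds to fewer initial molecules.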