
Bioinformatics 90-
Nucleic Acid Secondary
Structure
AND
Primer Selection
Graphic Display in GCG
Configuring Graphics Languages and Devices
Program
CodonPreference
DotPlot
Figure
Frames
FrameSearch -PLOt
GapShow
GrowTree
HelicalWheel
Isoelectric
MapPlot
Moment
PepPlot
PileUp
PAUPDisplay
PlasmidMap
PlotFold
PlotSimilarity
PlotStructure
PlotTest
PrettyBox
Prime
StatPlot
TestCode
WordSearch -PLOt
•GIF (Graphics Interchange Format): GIF87a, GIF89a
•HPGL (HP Graphics Language): ColorPro, HP7470, HP7475, HP7550, HP7580, LaserJet3
•PNG (Portable Network Graphics): For WWW Browser
•PostScript
•ReGIS
•Sixel
•Tektronix
•Xwindows: Download x-win412.exe
Exercise 07-1
Configuring X-windows
Download x-win412.exe from ftp://163.25.92.42
Double click x-win412.exe, accept all default settings.
Start x-win32
Connect to GCG via TELNET
gcg 2% go
gcg 3% xwindows
Use XWindows graphics with what device:
Color Workstation
Gray Scale Workstation
Monochrome Workstation
Please choose one ( * COLORWORKSTATION * )
Plotting Configuration set to:
Language: xwindows
Device: COLORWORKSTATION
Port or Queue: GCG_Graphics
gcg 4% plottest
GIF & PostScript
Nucleic Acid Secondary Structure
Stemloop and Mfold
In nucleic acids, inverted repeat sequences
may indicate foldback (self-pairing) structures.
Identifying Inverted Repeats: StemLoop
Calculating RNA Folding: Mfold
Displaying Folding Structures: PlotFold/DotPlot
STEMLOOP
StemLoop finds stems (inverted repeats) within a sequence.
You specify the minimum stem length (number of nucleotides in a paired
stretch), minimum and maximum loop sizes (length of nucleotide sequence
between the paired regions), and the minimum number of bonds per stem.
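The search described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not StemLoop's actual algorithm: it accepts only perfect Watson-Crick stems (no G-T pairs, no mismatches) and ignores the minimum-bonds parameter. The function name find_stems and its default values are hypothetical.

```python
# Watson-Crick pairs only; G-T pairs and mismatches are ignored in this sketch.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def find_stems(seq, min_stem=4, min_loop=3, max_loop=20):
    """Return (start, end, stem_length, loop_size) tuples, 1-based coordinates."""
    seq = seq.upper()
    n = len(seq)
    stems = []
    for a in range(n):  # a: 0-based index of the last base of the left arm
        for loop in range(min_loop, max_loop + 1):
            b = a + loop + 1  # b: 0-based index of the first base of the right arm
            if b >= n:
                break
            # Skip stems that extend inward into a smaller, still allowed loop;
            # the longer stem is reported when that smaller loop is visited.
            if loop - 2 >= min_loop and (seq[a + 1], seq[b - 1]) in PAIRS:
                continue
            # Extend the stem outward, away from the loop, while bases pair.
            length = 0
            while (a - length >= 0 and b + length < n
                   and (seq[a - length], seq[b + length]) in PAIRS):
                length += 1
            if length >= min_stem:
                stems.append((a - length + 2, b + length, length, loop))
    return stems


if __name__ == "__main__":
    for stem in find_stems("GGGGAAAACCCC"):
        print(stem)
```

For the toy sequence GGGGAAAACCCC the sketch reports a single stem of length 4 enclosing a loop of 4 bases.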
Vertical bars ('|') indicate the base pairs. The associated loop is shown to
the right of the stem. If either the stem or loop is too long to be displayed
in its entirety on the line, then only the part that fits on the line is
shown. The first and last coordinates of the stem are displayed on the left,
and the length of the stem (size), the number of bonds in the stem (quality),
and the loop size are shown on the right.
Example output (start = 217, end = 257, size = 11, quality = 25; the stem is the paired region on the left):
  217 AGGCTGCAGTG AGCCGTGAT   11, 25
      |||||| ||||         C
  257 TCCGGCCTCAC GTCACCGCG
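The quality figure can be reproduced with a short Python sketch. It assumes a bond weighting of G-C = 3, A-T = 2 and G-T = 1, which the slides do not state; that weighting is an assumption here, although it does yield the quality values shown for the stems in this section. The function names are hypothetical.

```python
# Assumed bond weights (not stated in the slides): G-C = 3, A-T = 2, G-T = 1.
BONDS = {frozenset("GC"): 3, frozenset("AT"): 2, frozenset("GT"): 1}


def pairing_line(top, bottom):
    """Return the '|' line drawn between the two strands of a stem."""
    return "".join(
        "|" if frozenset((x, y)) in BONDS else " "
        for x, y in zip(top.upper(), bottom.upper())
    )


def stem_quality(top, bottom):
    """Sum the bonds of a stem.

    top and bottom are the two arms exactly as printed in the output,
    so that character i of top faces character i of bottom.
    """
    if len(top) != len(bottom):
        raise ValueError("both arms of a stem must have the same length")
    return sum(
        BONDS.get(frozenset((x, y)), 0)
        for x, y in zip(top.upper(), bottom.upper())
    )


if __name__ == "__main__":
    top, bottom = "AGGCTGCAGTG", "TCCGGCCTCAC"
    print(top)
    print(pairing_line(top, bottom))
    print(bottom)
    print("quality:", stem_quality(top, bottom))
```

Run on the stem that starts at position 217, the sketch leaves a gap in the bar line at the C/C mismatch and counts 25 bonds.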
STEMLOOP
Output formats
1) See the stems
2) See the stem coordinates
3) File the stems (*.fld)
4) File the stems as points for DOTPLOT
5) Choose new parameters
6) Get a different sequence
Sort stems by:
1) Position
2) Quality
3) Size
  221 TGCAGTG AGCCGTG   7, 18
      |||||||
  248 ACGTCAC CGCGCTA   14
Loop  Start  End  Size  Quality
1     35     54   8     18
*.stem
*.pnt  DOTPLOT
MFOLD
Using energy minimization criteria, any predicted "optimal" secondary
structure for an RNA or DNA molecule depends on the model of
folding and the specific folding energies used to calculate that
structure. Different optimal foldings may be calculated if the folding
energies are changed even slightly. Because of uncertainties in the
folding model and the folding energies, the "correct" folding may not
be the "optimal" folding determined by the program. You may therefore
want to view many optimal and suboptimal structures within a few
percent of the minimum energy. You can use the variation among
these structures to determine which regions of the secondary structure
you can predict reliably. For instance, a region of the RNA molecule
containing the same helix in most calculated optimal and suboptimal
secondary structures may be more reliably predicted than other
regions with greater variation.
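The idea of judging reliability from the variation among near-optimal foldings can be sketched as follows. This is an illustration only, not Mfold's method: the foldings are assumed to be already available as (energy, set of base pairs) tuples, and the function names and the data in the example are hypothetical.

```python
from collections import Counter


def within_percent(foldings, percent):
    """Keep foldings whose energy is within `percent` of the minimum energy.

    foldings: list of (energy, pairs) tuples, where energy is a negative
    free energy and pairs is a set of (i, j) base-pair positions.
    """
    e_min = min(energy for energy, _ in foldings)
    cutoff = e_min + abs(e_min) * percent / 100.0
    return [(energy, pairs) for energy, pairs in foldings if energy <= cutoff]


def pair_support(foldings):
    """Return the fraction of foldings in which each base pair occurs."""
    counts = Counter()
    for _, pairs in foldings:
        counts.update(pairs)
    total = len(foldings)
    return {pair: count / total for pair, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical data: three foldings of a short molecule.
    foldings = [
        (-20.0, {(1, 10), (2, 9)}),
        (-19.5, {(1, 10), (3, 8)}),
        (-15.0, {(4, 7)}),
    ]
    near_optimal = within_percent(foldings, 5)
    for pair, support in sorted(pair_support(near_optimal).items()):
        print(pair, support)
```

Base pairs with support close to 1.0 occur in nearly every near-optimal folding and correspond to the regions the text describes as more reliably predicted.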
Mfold output file: *.mfold
MFOLD
How to read *.mfold?
Survey of optimal and suboptimal foldings
A) sub-optimal energy plot
B) p-num plot
Sampling of optimal and suboptimal foldings
C) circles
D) domes
E) mountains
F) squiggles
PLOTFOLD
A) sub-optimal energy plot
The energy dotplot indicates all of the base pairs involved in all optimal and suboptimal
secondary structures within the energy increment you specify. The plot takes the form
of a two-dimensional graph where both axes of the graph represent the same RNA
sequence. Each point drawn in the graph indicates a base pair between the
ribonucleotides whose positions in the sequence are the coordinates of that point on
the graph.
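A text-only version of such a dot plot can be sketched in Python. It is a simplified illustration that assumes the base pairs of each folding are already known; it does not read *.mfold files and does not reproduce PlotFold's graphics. The function name dotplot_rows is hypothetical.

```python
def dotplot_rows(length, foldings):
    """Return the rows of a text dot plot for a sequence of `length` bases.

    foldings: iterable of sets of (i, j) base pairs, 1-based with i < j.
    Row i, column j holds '*' when bases i and j pair in any folding.
    """
    points = set()
    for pairs in foldings:
        points.update(pairs)
    rows = []
    for i in range(1, length + 1):
        rows.append("".join(
            "*" if (i, j) in points else "."
            for j in range(1, length + 1)
        ))
    return rows


if __name__ == "__main__":
    # Hypothetical foldings of a 4-base sequence.
    for row in dotplot_rows(4, [{(1, 4)}, {(2, 3)}]):
        print(row)
```

Both axes index the same sequence, so every '*' marks one base pair by the positions of its two partners.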
PLOTFOLD
B) p-num plot
This plot shows the amount of variability in pairing at each position in the sequence in all
predicted foldings within the increment of the optimal folding energy you specify.
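One way to quantify this variability is to count, for each position, how many different partners it pairs with across the foldings. The Python sketch below assumes that definition; the slides do not give the exact formula, and the function name p_num and the example data are hypothetical.

```python
def p_num(length, foldings):
    """Count the distinct pairing partners of every position.

    foldings: iterable of sets of (i, j) base pairs, 1-based positions.
    Returns a list whose element k-1 is the count for position k.
    """
    partners = [set() for _ in range(length + 1)]  # index 0 is unused
    for pairs in foldings:
        for i, j in pairs:
            partners[i].add(j)
            partners[j].add(i)
    return [len(partners[k]) for k in range(1, length + 1)]


if __name__ == "__main__":
    # Hypothetical data: base 1 pairs with base 5 in one folding and base 4 in another.
    print(p_num(5, [{(1, 5)}, {(1, 4)}]))
```

A position with a low count always pairs the same way (or not at all) in the foldings considered, while a high count signals a region whose pairing is poorly determined.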
PLOTFOLD
C) circles
PLOTFOLD
D) domes
PLOTFOLD
E) mountains
The program plots representative secondary structures that satisfy the energy increment and
window size criteria you specify.
PLOTFOLD
F) squiggles
Exercise 07-2
Stemloop & X-windows
Open the file “exercise07-2.doc” and follow the steps.
gcg2 4% fetch gb:d00063
d00063.gb_pl1
gcg2 5% stemloop d00063.gb_pl1
There are 16 stems. Would you like to
1) See the stems
2) See the stem coordinates
3) File the stems
4) File the stems as points for DOTPLOT
5) Choose new parameters
6) Get a different sequence
Q)uit?
Please choose one (* 1 *): Try 1-4
Sort stems by:
1) Position
2) Quality
3) Size
Q)uit
Please choose one (* 1 *):
Exercise 07-3
Mfold
Plotfold
Open the file “Exercise07-3.doc” and follow the steps.
gcg2 4% fetch gb:j02061
J02061.gb_vi
gcg2 5% mfold j02061.gb_vi  j02061.mfold
$ Mfold
(Linear) MFOLD what sequence ? j02061.gb_vi
Begin (* 1 *) ?
End (* 121 *) ?
What should I call the energy matrix output file (* j02061.mfold *) ?
Primer Selection
Criteria: Specificity, %GC, Dimer, Hairpin, Tm
Workflow (diagram):
Nucleotide sequences or Amino Acid sequences
-> Pileup
-> Pretty / Prettybox
-> CONSENSUS (Nucleotide; Amino Acid via backtranslate)
-> Primer Selection Program (Prime)
-> Confirm by BLAST
Primer Length
  Minimum
  Maximum
PCR Product Length
  Minimum
  Maximum
Maximum number of primers or PCR products in output (range 1 thru 2500)
Primer DNA concentration (nM) (range .1 thru 500.0)
Salt concentration (mM) (range .1 thru 500.0)
Select:
  forward primers, only
  reverse primers, only
  primers on both strands for PCR
Set maximum overlap (in base pairs) between predicted PCR products
Forward strand primer extension must include position
Reverse strand primer extension must include position
Reject duplicate primer binding sites on template
Specify primer 3' clamp (using IUB ambiguity codes)
Primer % G+C
  Minimum (range 0.0 thru 100.0)
  Maximum
Primer Melting Temperature (degrees Celsius)
  Minimum (range 0.0 thru 200.0)
  Maximum
Maximum difference between melting temperatures of two primers in PCR (degrees Celsius) (range 0.0 thru 25.0)
Product % G+C
  Minimum (range 0.0 thru 100.0)
  Maximum
Product Melting Temperature (degrees Celsius)
  Minimum (range 0.0 thru 200.0)
  Maximum
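How limits like these screen candidate primers can be sketched in Python. The melting temperature here uses the simple Wallace rule (2 degrees per A/T, 4 degrees per G/C), a rough approximation swapped in for illustration; it is not the calculation Prime performs, and it ignores the primer and salt concentrations listed above. All function names and default limits are hypothetical.

```python
def gc_percent(primer):
    """Return the % G+C of a primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Approximate Tm in degrees Celsius with the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def primer_ok(primer, min_len=18, max_len=22,
              min_gc=40.0, max_gc=55.0, min_tm=50.0, max_tm=65.0):
    """Check one primer against hypothetical length, % G+C and Tm limits."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_percent(primer) <= max_gc
            and min_tm <= wallace_tm(primer) <= max_tm)


def pair_ok(forward, reverse, max_tm_diff=2.0):
    """Check that both primers pass and that their Tm values are close."""
    return (primer_ok(forward) and primer_ok(reverse)
            and abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_tm_diff)


if __name__ == "__main__":
    candidate = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer
    print(gc_percent(candidate), wallace_tm(candidate), primer_ok(candidate))
```

A real selection would also test for primer dimers, hairpins and duplicate binding sites on the template, which this sketch leaves out.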
Exercise 07-4
Primer Selection
Use the human npm cDNA sequence to design
a pair of primers that will copy the
whole coding sequence so that it can be
translated in frame.
Then check the specificity of the primers by using BLAST.