Bas's slides from Monday morning introducing

Perl scripting
Bas E. Dutilh, Marc van Driel
This course is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) license.
For more information: http://creativecommons.org/licenses/by-nc-sa/3.0/

Programming

Analysis: What is the problem you want to solve?
Modeling: Which steps do you need?
Implementation: Write the steps in code
Debugging: Removing the errors

Why program?

Perl

Scripting language
Rapid development
For Unix / Windows / …
Big community
    Google answers most scripting questions!
Fast / efficient
Fun

Layout

#!/usr/bin/perl
# My first Perl script
print "My first Perl script\n";

#!       Shebang: location of the interpreter (Perl program), /usr/bin/perl
#        Comment lines indicated with a hash
print    Functions are used to do things
         Functions are case sensitive
         Built-in (functions) or write yourself (subroutines)
;        End each operation with a semicolon
""       String of characters delimited by quotes
\n       Newline character
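
Functions can be built in or written yourself. The sketch below shows one of each, assuming a made-up subroutine name (shout) that is not part of the course. The strict and warnings lines and the word my are additions to the slide style; they make Perl report typos in variable names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Built-in function: length returns the number of characters in a string
my $text = "My first Perl script";
my $n = length ($text);

# Self-written subroutine (hypothetical name): upper-case a string and add "!"
sub shout {
    my ($s) = @_;
    return uc ($s) . "!";
}

my $loud = shout ("fun");
print "$n\n";       # 20
print "$loud\n";    # FUN!
```

Because function names are case sensitive, calling Shout or Length here would fail.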

Variables: scalars ($)

Number
$n = 3;
$a = $n + 2;
$b = $a - 1;
$c = $b * 4;
$d = $c / 2;
$e = $d % 3;
$f = $e ** 2;

Operators (number)
=      Be (assign)
+      Add
-      Subtract
*      Multiply
/      Divide
%      Modulus
**     Exponent
Combined with assignment: +=  -=  *=  /=  %=  **=

String
$s = "hell";
$g = $s . "o";

Operators (string)
=      Be (assign)
.      Concatenate
Combined with assignment: .=

Pointer
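
To make the operators concrete, here is a small worked example with the value of each variable written next to it. The variable names are chosen for this sketch only; strict, warnings and my are additions to the slide style.

```perl
use strict;
use warnings;

my $n    = 3;
my $sum  = $n + 2;       # 5
my $diff = $sum - 1;     # 4
my $prod = $diff * 4;    # 16
my $quot = $prod / 2;    # 8
my $rest = $quot % 3;    # 2, because 8 = 2 * 3 + 2
my $pow  = $rest ** 2;   # 4

# Combined operators change a variable and store the result in it
my $count = 10;
$count += 5;             # 15
$count %= 4;             # 3

# Concatenation glues strings together
my $word = "hell";
$word .= "o";            # "hello"

print "$sum $diff $prod $quot $rest $pow $count $word\n";
# prints: 5 4 16 8 2 4 3 hello
```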

Strings

Single quotes
print 'String';
Escape special characters
'          print 'Single quote: \'';
\          print 'Backslash: \\';

Double quotes
print "String\n";
Tab        print "column1\tcolumn2\n";
Newline    print "line1\n$line2\n";
Escape special characters
"          print "Double quote: \"\n";
\          print "Backslash: \\\n";
$          print "Dollar \$ign\n";
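
The practical difference between the two kinds of quotes is that double quotes fill in variables and special characters such as \n, while single quotes leave them as typed. A minimal sketch, assuming a made-up variable $gene:

```perl
use strict;
use warnings;

my $gene = "BRCA1";             # hypothetical value

my $single = 'Gene: $gene\n';   # nothing is filled in: $gene and \n stay literal
my $double = "Gene: $gene\n";   # $gene is replaced, \n becomes one newline character

print $single, "\n";            # Gene: $gene\n
print $double;                  # Gene: BRCA1

# The lengths differ: 13 characters versus 12
print length ($single), " ", length ($double), "\n";
```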

Conditions

If, Else if, Else
$n = 3;
$g = "hello";
if ($n == 3) { }
elsif ($g eq "hello") { }
else { }

                            Number      String (alphabetically)
Equal to                    $n == 3     $g eq "hello"
Not equal to                $n != 2     $g ne "bye"
Greater than                $n > -2     $g gt "bye"
Equal to or greater than    $n >= 3     $g ge "hello"
Smaller than                $n < 4      $g lt "yes"
Equal to or smaller than    $n <= 3     $g le "hello"
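
Mixing up the number and string operators is a common mistake, because Perl does not stop you: a numeric operator simply treats a word as the number 0. The sketch below shows the effect. The ? : construct is not covered in the slides; it picks the first value if the condition is true and the second otherwise.

```perl
use strict;
use warnings;
no warnings 'numeric';   # silence the "isn't numeric" warning that == on words gives

my $g = "hello";

# == compares numbers: both words count as 0, so this is true
my $numeric = ($g == "bye") ? "true" : "false";

# eq compares the characters, so this is false
my $string = ($g eq "bye") ? "true" : "false";

# 10 is greater than 9 as a number, but "10" comes before "9" alphabetically
my $num_gt = (10 > 9)  ? "true" : "false";
my $str_gt = (10 gt 9) ? "true" : "false";

print "$numeric $string $num_gt $str_gt\n";   # true false true false
```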

Logical operators

TRUE
    Non-zero numbers                          if (-488) { }
    Strings (non-empty, other than "0")
FALSE
    Number: 0                                 if (0) { }
    Empty string: ""

NOT    TRUE if condition is FALSE                          if (! ("")) { }
AND    TRUE if condition1 AND condition2 are true          if (("true") && (1)) { }
OR     TRUE if condition1 AND / OR condition2 are true     if (("false") || (0)) { }
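
A quick way to get a feel for what counts as TRUE is to test a handful of values in a loop. The values below are picked for this sketch; note that the string "false" is TRUE (it is a non-empty string), while the string "0" is FALSE.

```perl
use strict;
use warnings;

# Values to test; "0.0" and " " look empty or zero but are TRUE
my @values = (-488, 0, "", "0", "0.0", " ", "false");
my @truth;

foreach my $v (@values) {
    if ($v) { push (@truth, "TRUE"); }
    else    { push (@truth, "FALSE"); }
}

print join (" ", @truth), "\n";
# prints: TRUE FALSE FALSE FALSE TRUE TRUE TRUE
```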

Some string functions

$g = "hello";

Length
$l = length ($g);
print "$l\n";

Reverse
$r = reverse ($g);
print "$r\n";

Sub-string
Arguments: string, start-position, length
$s = substr ($g, 0, 4);
print "$s\n";

Split
@a = split //, $s;
print "$a[3]\n";
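
The same four functions applied to a short DNA string, as a sketch with a made-up sequence. One detail worth knowing: reverse only reverses the characters of a string when its result is used as a scalar. Inside a print statement it reverses the list of arguments instead, so the word scalar is needed there.

```perl
use strict;
use warnings;

my $dna = "ATGCCGTAA";               # hypothetical sequence

my $len   = length ($dna);           # 9
my $rev   = reverse ($dna);          # "AATGCCGTA" (assigned to a scalar)
my $start = substr ($dna, 0, 3);     # "ATG": 3 characters from position 0
my @bases = split (//, $dna);        # one character per array element

print "$len $rev $start $bases[3]\n";     # 9 AATGCCGTA ATG C
print scalar (reverse ($dna)), "\n";      # AATGCCGTA
```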

Variables: arrays (@)

Array (= list) of scalars:
@a = (1, "A", "yes");
$a[3] = 5.66;
$a[4] = 3;
$a[5] = ($a[3] - $a[0]) ** $a[4];
$a[6] = $a[4] . $a[3];

Index   Value
0       1
1       A
2       yes
3       5.66
4       3
5       101.19
6       35.66

Special elements
$b = $a[-1] - $a[-4];
$c = $#a;
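
Negative indices count from the end of the array, and $# gives the index of the last element (one less than the number of elements). A small sketch with an array named @list, a name chosen for this example:

```perl
use strict;
use warnings;

my @list = (1, "A", "yes", 5.66, 3);

my $last_index = $#list;            # 4
my $count      = scalar (@list);    # 5 elements
my $last       = $list[-1];         # 3
my $before     = $list[-2];         # 5.66

# Assigning past the end grows the array; the skipped positions stay undefined
$list[7] = "far";
my $new_count = scalar (@list);     # 8

print "$last_index $count $last $before $new_count\n";   # 4 5 3 5.66 8
```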

Some array functions

@a = (1, "A", "yes", 5.66, 3);

Scalar
$s = scalar (@a);
print "$s\n";

Join
$j = join ("; ", @a);
print "$j\n";

Push
push (@a, "next");
print "$a[-1]\n";

Splice
Arguments: array, start, length, replace
@b = splice (@a, 3, 2, "no");
print "$a[3]\t$b[-1]\n";

Sort
Default: alphabetically
@c = sort (@a);
print "$c[0]\n";

Reverse
@r = reverse (@a);
print "$r[0]\n";
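
Because sort works alphabetically by default, a list of numbers can come out in a surprising order. The sketch below contrasts the default with a numeric sort. The { $a <=> $b } block is not covered in the slides; it tells sort to compare each pair of elements as numbers.

```perl
use strict;
use warnings;

my @sizes = (10, 9, 100, 1);                # hypothetical numbers

my @alpha   = sort (@sizes);                # compared as strings
my @numeric = sort { $a <=> $b } @sizes;    # compared as numbers
my @down    = reverse (@numeric);           # largest first

print join (" ", @alpha), "\n";             # 1 10 100 9
print join (" ", @numeric), "\n";           # 1 9 10 100
print join (" ", @down), "\n";              # 100 10 9 1
```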

Variables: Hashes (%)

Array: @a
Index   Value
0       1         $a[0]
1       A
2       yes
3       5.66
4       3
5       101.19    $a[5]
6       35.66     (6 is $#a, the last index)

A hash is a list with a string as index
No order

Hash: %h
Value
1         $h{"a"}
A
yes
5.66
3
101.19
35.66     $h{"I am last"}

Some hash functions

%h = ("a" => 1, "segundo" => "A",
      "Bob" => "yes", "23" => 5.66,
      "me too" => 3, "me" => 101.19,
      "I am last" => 35.66);

Keys
@k = keys (%h);
print "$k[2]\n";

Values
@v = values (%h);
print "$v[2]\n";
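
Since a hash has no order, the element that keys or values puts at position 2 can differ from run to run. When a repeatable order matters, one common approach is to sort the keys first and then look up each value. A minimal sketch with a shortened hash:

```perl
use strict;
use warnings;

my %h = ("a" => 1, "segundo" => "A", "Bob" => "yes");

# keys returns the indices in no particular order; sort makes it repeatable
my @k = sort (keys (%h));           # Bob a segundo (capitals sort first)

my @lines;
foreach my $key (@k) {
    push (@lines, "$key => $h{$key}");
}

print join ("\n", @lines), "\n";
# prints:
# Bob => yes
# a => 1
# segundo => A
```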

Loops

Note the indent: keep your script readable!

For each
@a = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
foreach $i (@a) {
    print "$i\n";
}
Equivalent lists: (@a), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0 .. 9)

While
$i = 0;
while ($i < 10) {
    print "$i\n";
    ++$i;
}

For
Start; Condition; Iterate
for ($i = 0; $i < 10; $i += 1) {
    print "$i\n";
}

Jumping loops

Next
ID: foreach $i (keys %seqs) {
    if (substr ($i, 0, 1) ne ">") {
        next ID;
    }
    print "$i\n$seqs{$i}\n";
}

Last
LOOP: for ($i = 0; $i < 10; $i += 1) {
    if ($i >= 5) {
        last LOOP;
    }
    print "$i\n";
}
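
Next and last can be combined in one loop. The sketch below collects ID lines from a list, skipping everything else with next and stopping at an end marker with last. The input lines and the "//" end marker are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical input: IDs start with ">", "//" marks the end
my @lines = (">gene1", "ATGC", ">gene2", "GGCC", "//", ">gene3");
my @ids;

LINE: foreach my $line (@lines) {
    if ($line eq "//") {
        last LINE;                       # leave the loop completely
    }
    if (substr ($line, 0, 1) ne ">") {
        next LINE;                       # skip to the next element
    }
    push (@ids, $line);
}

print join (" ", @ids), "\n";            # >gene1 >gene2
```

The third ID is never reached, because last ends the loop at the marker.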

Reading from files

Open for reading
if (! (open (IN, "<genes.txt"))) {
    die "Can't open genes.txt: $!\n";
}

Read
$line = <IN>;
print "The first line is: $line";

Chomp (remove the newline at the end of the line)
while ($line = <IN>) {
    chomp $line;
    print $line . "\n";
}

Close
close IN;

Writing to files

Open for writing > overwrite
if (! (open (G, ">genes.txt"))) {
    die "Can't write to genes.txt: $!\n";
}

Write
@genes = ("gene1", "gene2", "gene3");
print G "$genes[0]\n";

Close
close G;

Open for writing >> append
if (! (open (G, ">>genes.txt"))) {
    die "Can't append to genes.txt: $!\n";
}
for ($i = 1; $i < 3; ++$i) {
    print G "$genes[$i]\n";
}
close G;
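
The overwrite, append and read steps can be tried together in one script. The sketch below uses a different style of open than the slides: the mode (">", ">>" or "<") is a separate argument and the file handle is kept in a variable. This is a widely used alternative, not the course's notation. The file name genes_sketch.txt is made up, and the script deletes the file again at the end.

```perl
use strict;
use warnings;

my $file  = "genes_sketch.txt";     # hypothetical file name
my @genes = ("gene1", "gene2", "gene3");

# ">" creates the file, or overwrites it if it exists
open (my $out, ">", $file) or die "Can't write to $file: $!\n";
print $out "$genes[0]\n";
close ($out);

# ">>" adds to what is already there
open (my $app, ">>", $file) or die "Can't append to $file: $!\n";
for (my $i = 1; $i < 3; ++$i) {
    print $app "$genes[$i]\n";
}
close ($app);

# "<" reads; every line keeps its newline until chomp removes it
open (my $in, "<", $file) or die "Can't open $file: $!\n";
my @read;
while (my $line = <$in>) {
    chomp $line;
    push (@read, $line);
}
close ($in);
unlink ($file);                     # remove the example file again

print join (",", @read), "\n";      # gene1,gene2,gene3
```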

Further reading

Programming Perl, Larry Wall
Beginning Perl for Bioinformatics, James Tisdall
CPAN - Comprehensive Perl Archive Network: http://search.cpan.org/