Slides - Edwards Lab


Advanced Python Concepts: Object-Oriented Programming
BCHB524 2015, Lecture 17
11/4/2015
BCHB524 - 2015 - Edwards
Using Classes

We've actually been using objects and their methods already!
s = 'ACGTACGTACGTACGT'
print s.count('T')
print s.replace('T','U')
l = [6,5,4,3,2,1]
l.append(10)
l.sort()
s = set()
s.add(1)
s.add(2)
Using Classes

We've actually been using objects and their methods already!

import Bio.SeqIO
thefile = open("ls_orchid.fasta")
for seq_record in Bio.SeqIO.parse(thefile, "fasta"):
    print seq_record.id
    print seq_record.description
    print seq_record.seq
thefile.close()
Using Classes

Classes make instances of objects
    str is a class, 'ACGT' is an instance of str.
    Make new instances using the class name:
        s = str(), d = dict(), s = set(), i = int(2)

Objects can hold information
    seq_record.id, seq_record.seq, seq_record.annotations
    Called data members or attributes

Objects can perform actions
    s = 'ACGT'; print s.count('a')
    Called methods
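The distinction between a class, an instance, a data member, and a method can be checked interactively. The following is a minimal sketch: the Record class and its values are hypothetical stand-ins for a sequence record and are not part of the lecture. Single-argument print(...) is used so the sketch should run under both Python 2 and Python 3.

```python
# 'ACGT' is an instance of the built-in class str.
s = 'ACGT'
is_string = isinstance(s, str)

# Calling the class name makes a new instance.
empty = str()
d = dict()
i = int(2)

# Hypothetical class, used only to illustrate data members (attributes).
class Record:
    pass

rec = Record()
rec.id = 'seq1'          # data member holding information
rec.seq = 'ACGTACGT'     # another data member

# Methods are actions an object can perform; str.count is case-sensitive.
upper_count = s.count('A')
lower_count = s.count('a')

print(is_string)
print(rec.id)
print(upper_count)
print(lower_count)
```

Because count is case-sensitive, counting 'a' in 'ACGT' finds nothing, while counting 'A' finds one match.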
Classes as Concepts

Classes allow us to add new concepts to a language.

Suppose we wanted to add a "DNA sequence" concept to Python
    What information should "DNA sequence" capture?
    What actions or operations should "DNA sequence" provide?
DNA Sequence Class

Data members:
    sequence, name, organism.

Methods:
    length, reverse, complement, reverseComplement, transcribe, translate,
    percentGC, initMet, freq
DNA Sequence Class
class DNASeq:
    def reverse(self):
        # Slice with a step of -1 to reverse the string
        return self.seq[::-1]
    def complement(self):
        # Map each nucleotide to its complement
        d = {'A':'T','C':'G','G':'C','T':'A'}
        return ''.join(map(d.get,self.seq))
    def reverseComplement(self):
        return ''.join(reversed(self.complement()))
    def length(self):
        return len(self.seq)

ds = DNASeq()
ds.seq = 'ACGTACGTACGT'
ds.name = 'My sequence'
print ds.complement(),ds.length(),ds.reverseComplement()
DNA Sequence Class
class DNASeq:
    #....
    def length(self):
        return len(self.seq)
    def freq(self,nuc):
        return self.seq.count(nuc)
    def percentGC(self):
        gccount = self.freq('C') + self.freq('G')
        return 100*float(gccount)/self.length()

ds = DNASeq()
ds.seq = 'ACGTACGTACGT'
ds.name = 'My sequence'
print ds.freq('C'),ds.freq('G'),ds.length(),ds.percentGC()
DNA Sequence Class

The special method __init__ is called when a new instance is created.
    Used to initialize data-members.
    Forces class user to provide valid initial information.

class DNASeq:
    def __init__(self,seq,name):
        self.seq = seq
        self.name = name
    #....

ds = DNASeq('ACGTACGTACGTACGT', 'My sequence')
print ds.freq('C'),ds.freq('G'),ds.length(),ds.percentGC()
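To see how __init__ forces the class user to supply initial information, the sketch below tries to create an instance both with and without the required arguments. It reuses the two-argument __init__ shown on this slide; the created flag is a name introduced only for this illustration.

```python
class DNASeq:
    def __init__(self, seq, name):
        self.seq = seq
        self.name = name

# Both arguments supplied: the instance is created and initialized.
ds = DNASeq('ACGTACGTACGTACGT', 'My sequence')

# Arguments omitted: Python raises TypeError and no instance is made.
try:
    DNASeq()
    created = True
except TypeError:
    created = False

print(ds.name)
print(created)
```

Note that this only guarantees that the arguments are present. Checking that seq really contains nucleotides would need extra code inside __init__.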
DNA Sequence Class

Sometimes __init__ is used to set up an "empty" instance.
    Other methods or data-members are then used to fill in the instance.

class DNASeq:
    def __init__(self):
        self.seq = ""
        self.name = ""
    def read(self,filename):
        # Read the whole file and strip all whitespace, including newlines
        self.seq = ''.join(open(filename).read().split())
    #....

ds = DNASeq()
ds.name = 'Anthrax SASP'
ds.read('anthrax_sasp.nuc')
print ds.freq('C'),ds.freq('G'),ds.length(),ds.percentGC()
DNA Sequence Class

Default arguments allow us to set up "empty", partial, or completely instantiated instances.

class DNASeq:
    def __init__(self,seq="",name=""):
        self.seq = seq
        self.name = name
    def read(self,filename):
        self.seq = ''.join(open(filename).read().split())
    #....

ds = DNASeq(name='Anthrax SASP')
ds.read('anthrax_sasp.nuc')
print ds.freq('C'),ds.freq('G'),ds.length(),ds.percentGC()
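The three ways of creating an instance that default arguments allow can be compared side by side. This is a minimal sketch using a trimmed-down DNASeq; the sequence assigned to the partial instance is a made-up value standing in for a call to read, since the contents of anthrax_sasp.nuc are not shown in the slides.

```python
class DNASeq:
    def __init__(self, seq="", name=""):
        self.seq = seq
        self.name = name
    def length(self):
        return len(self.seq)

# "Empty": both defaults are used.
empty = DNASeq()

# Partial: only the name is given, by keyword.
partial = DNASeq(name='Anthrax SASP')
partial.seq = 'ACGT'   # hypothetical value; the slide uses ds.read(...) here

# Complete: both values are given positionally.
full = DNASeq('ACGTACGT', 'My sequence')

print(empty.length())
print(partial.name)
print(full.length())
```

One consequence worth keeping in mind: an "empty" instance has length zero, so a method that divides by the length, such as percentGC, will fail until a sequence has been set.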
Complete DNASeq.py Module
class DNASeq:
    def __init__(self,seq="",name=""):
        self.seq = seq
        self.name = name
    def read(self,filename):
        self.seq = ''.join(open(filename).read().split())
    def reverse(self):
        return self.seq[::-1]
    def complement(self):
        d = {'A':'T','C':'G','G':'C','T':'A'}
        return ''.join(map(d.get,self.seq))
    def reverseComplement(self):
        return ''.join(reversed(self.complement()))
    def length(self):
        return len(self.seq)
    def freq(self,nuc):
        return self.seq.count(nuc)
    def percentGC(self):
        gccount = self.freq('C') + self.freq('G')
        return 100*float(gccount)/self.length()
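The module covers most of the methods named in the earlier "Methods" list, but transcribe is not among them. One way it might be added is sketched below, assuming transcription is modeled as replacing T with U, as in the s.replace('T','U') example at the start of the lecture. This is an illustration and not necessarily the course's own implementation; only the parts of DNASeq needed for the demo are repeated.

```python
class DNASeq:
    def __init__(self, seq="", name=""):
        self.seq = seq
        self.name = name

    def transcribe(self):
        # Assumption: transcription is modeled as replacing T with U
        return self.seq.replace('T', 'U')

ds = DNASeq('ACGTACGT', 'My sequence')
rna = ds.transcribe()
print(rna)
```

Like the other methods in the module, this sketch returns a new string and leaves self.seq unchanged.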
Complete DNASeq.py Module

Describe the class in a module, then access it using an import statement
from DNASeq import DNASeq
ds = DNASeq('ACGTACGTACGTACGT','My sequence')
print ds.complement(),ds.length(),ds.reverseComplement()
print ds.freq('C'),ds.freq('G'),ds.length(),ds.percentGC()
ds = DNASeq()
ds.read('anthrax_sasp.nuc')
print ds.complement(),ds.length(),ds.reverseComplement()
print ds.freq('C'),ds.freq('G'),ds.length(),ds.percentGC()
A class for codon tables
Method calls, for an instance named "codons":

codons.read(filename)
    stores the contents of filename in the codon_table object.
codons.amino_acid(codon)
    returns amino-acid symbol for codon
codons.is_init(codon)
    returns true if codon is an initiation codon, false otherwise
codons.get_ambig_aa(codon)
    returns single amino-acid represented by a codon with N's
codons.startswith_init(seq)
    returns true if DNA sequence seq starts with init codon
codons.translate(seq,frame)
    returns amino-acid sequence for DNA sequence seq
A class for codons
from DNASeq import *
from codon_table import *
import sys

if len(sys.argv) < 3:
    print "Require codon table and DNA sequence on command-line."
    sys.exit(1)

codons = codon_table()
codons.read(sys.argv[1])

seq = DNASeq()
seq.read(sys.argv[2])

if codons.startswith_init(seq):
    print "Initial codon is an initiation codon"

for frame in (1,2,3):
    print "Frame",frame,"(forward):",codons.translate(seq,frame)
A class for codons

In codon_table.py:
class codon_table:
    def __init__(self):
        self.table = {}
    def read(self,filename):
        # magic
        pass
    def amino_acid(self,codon):
        # magic
        return aa
    def is_init(self,codon):
        # magic
        return result
    def get_ambig_aa(self,codon):
        # magic
        return aa
    def startswith_init(self,seq):
        # magic
        return result
    def translate(self,seq,frame):
        # magic
        return aaseq
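The "# magic" placeholders are left for the reader to fill in. As a rough idea of the shape such a class could take, here is a sketch of three of the methods under several assumptions that do not come from the slides: the table is a plain dictionary from codon to amino-acid symbol, initiation codons are kept in a separate set named init, unknown codons are reported as 'X', seq is a plain string, and frame is 1, 2, or 3. The table is filled in directly with four entries of the standard genetic code rather than read from a file, because the codon-table file format is not described here.

```python
class codon_table:
    def __init__(self):
        self.table = {}      # codon -> amino-acid symbol
        self.init = set()    # assumption: initiation codons kept in a set

    def amino_acid(self, codon):
        # Assumption: codons missing from the table are reported as 'X'
        return self.table.get(codon, 'X')

    def is_init(self, codon):
        return codon in self.init

    def translate(self, seq, frame):
        # Assumption: seq is a plain string and frame is 1, 2, or 3.
        # Step through the sequence three nucleotides at a time,
        # stopping before any incomplete codon at the end.
        aaseq = []
        for pos in range(frame - 1, len(seq) - 2, 3):
            aaseq.append(self.amino_acid(seq[pos:pos + 3]))
        return ''.join(aaseq)

codons = codon_table()
# Four entries of the standard genetic code, enough for this demo.
codons.table = {'ATG': 'M', 'GCC': 'A', 'AAA': 'K', 'TAA': '*'}
codons.init.add('ATG')

protein = codons.translate('ATGGCCAAATAA', 1)
print(protein)
print(codons.is_init('ATG'))
```

A real read method would populate table and init from the file, and translate might accept a DNASeq instance instead of a string, as the calling script in the previous slide suggests.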
Side by side
Using functions (MyNucStuff module):

from MyNucStuff import *
from codon_table import *
import sys

if len(sys.argv) < 3:
    print "Require codon table and", \
          "DNA sequence on command-line."
    sys.exit(1)

codons = read_codons(sys.argv[1])
seq = read_seq(sys.argv[2])

if is_init(codons,seq[:3]):
    print "Initial codon"

print translate(codons,seq,1)

Using classes (DNASeq and codon_table):

from DNASeq import *
from codon_table import *
import sys

if len(sys.argv) < 3:
    print "Require codon table and", \
          "DNA sequence on command-line."
    sys.exit(1)

codons = codon_table()
codons.read(sys.argv[1])
seq = DNASeq()
seq.read(sys.argv[2])

if codons.startswith_init(seq):
    print "Initial codon"

print codons.translate(seq,1)
Exercises

Convert your modules for DNA sequence and codons to a codon_table class and a DNASeq class.

Demonstrate the use of the DNASeq module and the codon table module to translate a DNA sequence into amino acids in all six frames with just a few lines of code.

    Hint: just import the new classes from their module(s) and call the necessary methods/functions!
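As a starting point for the six-frame exercise, the sketch below only enumerates the six reading frames as subsequences: three offsets on the forward strand and three on the reverse complement. The function name six_frames and the example sequence are made up for this illustration, and the translation step is deliberately left to the codon_table class.

```python
def six_frames(seq):
    # Return (strand, frame, subsequence) for each of the six reading frames.
    d = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    revcomp = ''.join(map(d.get, seq))[::-1]
    frames = []
    for strand, s in (('forward', seq), ('reverse', revcomp)):
        for frame in (1, 2, 3):
            # Frame n starts at the n-th nucleotide, i.e. index n-1
            frames.append((strand, frame, s[frame - 1:]))
    return frames

frames = six_frames('ATGGCCAAATAA')
for strand, frame, sub in frames:
    print("%s frame %d: %s" % (strand, frame, sub))
```

With the classes from this lecture, the reverse strand would come from reverseComplement() and each frame would be passed to translate(), so the whole task stays within a few lines.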
Homework 10

Due Monday, November 9th.

Exercises from Lecture 16

Exercises from Lecture 17