Slides - Edwards Lab

Introduction to Python
BCHB524
Lecture 4
BCHB524 - Edwards
Outline

Homework #1 Notes
Review
Functions, again (End of Lecture 3)
Control flow: if statement
Control flow: for statement
Exercises
Homework #1 Notes

Python programs:
  Upload .py files
  Don't paste into comment box
  Don't paste into your writeup
Writeup:
  Upload .txt files
  Don't paste into comment box
  Text document preferred
Homework #1 Notes

Multiple submissions:
  OK, but…
  …I'll ignore all except the last one
  Make each (re-)submission complete
Grading:
  Random grading order
  Comments
  Grading "curve"
Review

Printing and execution
Variables and basic data-types:
  integers, floats, strings
  Arithmetic with, conversion between
  String characters and chunks, string methods
Functions, using/calling and defining:
  Use in any expression
  Parameters as input, return for output
  Functions calling other functions (oh my!)
If statements – conditional execution
Control Flow: if statement
# The input DNA sequence
seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'
# Remove the initial Met codon if it is there
if seq.startswith('atg'):
    print "Sequence without initial Met:",seq[3:]
else:
    print "Sequence (no initial Met):",seq


The execution path depends on the string in seq.
Change seq to different values so that each branch gets executed.
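As a minimal sketch of trying different values, the same test can be wrapped in a small function and called on several sequences. The name strip_init_met is hypothetical (it does not appear in the slides), and print is called as a function on a single string so that the sketch also runs under Python 3.

```python
# Hypothetical helper (not from the slides): remove an initial Met codon.
def strip_init_met(seq):
    if seq.startswith('atg'):
        # This branch runs only when seq begins with 'atg'
        return seq[3:]
    else:
        # Otherwise the sequence is returned unchanged
        return seq

# One sequence for each branch, plus the empty string
print("%r -> %r" % ('atggcatga', strip_init_met('atggcatga')))
print("%r -> %r" % ('gcatgacgt', strip_init_met('gcatgacgt')))
print("%r -> %r" % ('', strip_init_met('')))
```

Only the first call takes the if branch; the other two take the else branch.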
Control Flow: if statement
# The input DNA sequence
seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'
# Remove the initial Met codon if it is there
if seq.startswith('atg'):
    initMet = True
    newseq = seq[3:]
else:
    initMet = False
    newseq = seq
# Output the results
print "Original sequence:",seq
print "Sequence starts with Met:",initMet
print "Sequence without initial Met:",newseq
Control Flow: if statement
# The input DNA sequence
seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'
# Remove the initial Met codon if it is there
initMet = seq.startswith('atg')
if initMet:
    newseq = seq[3:]
else:
    newseq = seq
# Output the results
print "Original sequence:",seq
print "Sequence starts with Met:",initMet
print "Sequence without initial Met:",newseq
Control Flow: if statement
# The input DNA sequence
seq = 'atggcatgacgttattacgactctgtgtggcgtctgctggg'
# Remove the initial Met codon if it is there
initMet = seq.startswith('atg')
if initMet:
    seq = seq[3:]
# Output the results
print "Sequence starts with Met:",initMet
print "Sequence without initial Met:",seq
Serial if statement
# Determine the complementary nucleotide
def complement(nuc):
    if nuc == 'A':
        comp = 'T'
    if nuc == 'T':
        comp = 'A'
    if nuc == 'C':
        comp = 'G'
    if nuc == 'G':
        comp = 'C'
    return comp
# Use the complement function
print "The complement of A is",complement('A')
print "The complement of T is",complement('T')
print "The complement of C is",complement('C')
print "The complement of G is",complement('G')
Compound if statement
# Determine the complementary nucleotide
def complement(nuc):
    if nuc == 'A':
        comp = 'T'
    elif nuc == 'T':
        comp = 'A'
    elif nuc == 'C':
        comp = 'G'
    elif nuc == 'G':
        comp = 'C'
    else:
        comp = nuc
    return comp
# Use the complement function
print "The complement of A is",complement('A')
print "The complement of T is",complement('T')
print "The complement of C is",complement('C')
print "The complement of G is",complement('G')
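One difference between the serial and compound versions shows up when the argument is not A, T, C, or G. The sketch below repeats both functions under the hypothetical names complement_serial and complement_compound (the slides call both of them complement) and passes 'N' to each. The try/except is used here only to report the failure, and print is called as a function so the sketch also runs under Python 3.

```python
# Serial version: four independent if statements, no else
def complement_serial(nuc):
    if nuc == 'A':
        comp = 'T'
    if nuc == 'T':
        comp = 'A'
    if nuc == 'C':
        comp = 'G'
    if nuc == 'G':
        comp = 'C'
    return comp

# Compound version: one if/elif/else chain
def complement_compound(nuc):
    if nuc == 'A':
        comp = 'T'
    elif nuc == 'T':
        comp = 'A'
    elif nuc == 'C':
        comp = 'G'
    elif nuc == 'G':
        comp = 'C'
    else:
        comp = nuc
    return comp

# The else clause gives comp a value for any input
print("compound: N -> %s" % complement_compound('N'))

# In the serial version no test succeeds, so comp is never assigned
try:
    complement_serial('N')
except UnboundLocalError:
    print("serial: comp was never assigned for 'N'")
```

For the four standard nucleotides the two versions return the same result. The serial version still evaluates all four tests on every call, while the compound version stops at the first test that succeeds.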
If statement conditions

Any expression (variable, arithmetic, function call, etc.) that evaluates to True or False
Any expression tested against another expression using:
  == (equality), != (inequality)
  < (less than), <= (less than or equal)
  > (greater than), >= (greater than or equal)
  in (an element of)
Conditions can be combined using:
  and, or, not, and parentheses
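The kinds of conditions listed above can be sketched on a short sequence. The variable names are made up for illustration, and print is called as a function so the sketch also runs under Python 3.

```python
seq = 'ATGGCAT'
nuc = 'G'

# Comparisons evaluate to True or False
same = (nuc == 'G')
long_enough = (len(seq) >= 6)

# 'in' tests whether one string occurs inside another
is_dna_symbol = (nuc in 'ACGT')
has_gg = ('GG' in seq)

# and, or, not, and parentheses combine simpler conditions
is_purine = (nuc == 'A' or nuc == 'G')
starts_atg_and_long = (seq.startswith('ATG') and len(seq) > 3)
not_pyrimidine = not (nuc == 'C' or nuc == 'T')

# A combined condition used directly in an if statement
if is_dna_symbol and not is_purine:
    print("%s is a pyrimidine" % nuc)
else:
    print("%s is a purine or not a DNA symbol" % nuc)
```

Storing a condition in a variable, as the initMet examples do, keeps the if statement itself short.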
For (each) statements

Sequential/Iterative execution
# Print the numbers 0 through 4
for i in range(0,5):
    print i
# Print the nucleotides in seq
seq = 'ATGGCAT'
for nuc in seq:
    print nuc

Note the use of indentation to define a block!
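A small sketch of how indentation decides what belongs to the loop: the indented statement runs once per nucleotide, and the unindented statement after it runs once, when the loop has finished. The counter names are made up for illustration, and print is called as a function so the sketch also runs under Python 3.

```python
seq = 'ATGGCAT'
inside = 0
outside = 0
for nuc in seq:
    # Indented: part of the block, executed once for each symbol in seq
    inside = inside + 1
# Not indented: executed once, after the loop is done
outside = outside + 1
print("inside ran %d times, outside ran %d time" % (inside, outside))

# range(0,5) supplies the values 0, 1, 2, 3, 4 (the end value 5 is excluded)
numbers = list(range(0, 5))
print("range(0,5) gives %s" % numbers)
```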
For (each) statements
# Input to program
seq = 'AGTAGTTCGCGTAGCTAGCTAGCTATGCG'
# Examine each symbol in seq and count the A's
count = 0
for nuc in seq:
    if nuc == 'A':
        count = count + 1
# Output the result
print "Sequence",seq,"contains",count,"A symbols"
For (each) statements
# Examine each symbol in seq and count the A's
def countAs(seq):
    count = 0
    for nuc in seq:
        if nuc == 'A':
            count = count + 1
    return count
# Input to program
inseq = 'AGTAGTTCGCGTAGCTAGCTAGCTATGCG'
# Compute count
aCount = countAs(inseq)
# Output the result
print "Sequence",inseq,"contains",aCount,"A symbols"
For (each) statements
# Examine each symbol in seq and count those that match sym
def countSym(seq,sym):
    count = 0
    for nuc in seq:
        if nuc == sym:
            count = count + 1
    return count
# Input to program
inseq = 'AGTAGTTCGCGTAGCTAGCTAGCTATGCG'
# Compute count
aCount = countSym(inseq,'A')
# Output the result
print "Sequence",inseq,"contains",aCount,"A symbols"
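One way to check a hand-written counting loop is to compare it with the built-in string method count, which for a single-character argument should report the same number. The sketch below repeats countSym so that it is self-contained. The comparison is an addition for illustration and is not part of the slides, and print is called as a function so the sketch also runs under Python 3.

```python
# Examine each symbol in seq and count those that match sym
def countSym(seq, sym):
    count = 0
    for nuc in seq:
        if nuc == sym:
            count = count + 1
    return count

inseq = 'AGTAGTTCGCGTAGCTAGCTAGCTATGCG'

# Compare the loop-based count with the string method for each nucleotide
total = 0
for sym in 'ACGT':
    loop_count = countSym(inseq, sym)
    method_count = inseq.count(sym)
    total = total + loop_count
    print("%s: loop=%d, str.count=%d" % (sym, loop_count, method_count))

# Every symbol in inseq is one of A, C, G, T, so the counts add up to its length
print("total=%d, length=%d" % (total, len(inseq)))
```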
Exercise 1

Write a Python program to compute the reverse complement of a codon
  Use my solution to Homework #1 Exercise #1 as a starting point
  Add the "complement" function of this lecture (slide 12) as provided.
Modularize! Place the reverse complement code in a new function.
  Call the new function with a variety of codons
Change the complement function to handle upper and lower-case nucleotide symbols.
  Test your code with upper and lower-case codons.
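One possible shape for this exercise is sketched below, using only the if and for statements from this lecture. It does not start from the Homework #1 solution, which is not reproduced in these slides. The function name reverse_complement and the choice to handle lower-case symbols with extra elif branches are assumptions, and print is called as a function so the sketch also runs under Python 3.

```python
# Compound-if complement, extended with lower-case branches (one possible approach)
def complement(nuc):
    if nuc == 'A':
        comp = 'T'
    elif nuc == 'T':
        comp = 'A'
    elif nuc == 'C':
        comp = 'G'
    elif nuc == 'G':
        comp = 'C'
    elif nuc == 'a':
        comp = 't'
    elif nuc == 't':
        comp = 'a'
    elif nuc == 'c':
        comp = 'g'
    elif nuc == 'g':
        comp = 'c'
    else:
        # Any other symbol is returned unchanged
        comp = nuc
    return comp

# Hypothetical function name: build the reverse complement symbol by symbol
def reverse_complement(codon):
    revcomp = ''
    for nuc in codon:
        # Adding each complemented symbol at the front reverses the order
        revcomp = complement(nuc) + revcomp
    return revcomp

print("ATG -> %s" % reverse_complement('ATG'))
print("atg -> %s" % reverse_complement('atg'))
print("GCA -> %s" % reverse_complement('GCA'))
```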
Exercise 2

Write a Python program to determine whether or not a DNA sequence consists of an integer number of (perfect) "tandem" repeats.
  Test it on sequences:
    AAAAAAAAAAAAAAAA
    CACACACACACACAC
    ATTCGATTCGATTCG
    GTAGTAGTAGTAGTA
    TCAGTCACTCACTCAG
  Hint: Is the sequence the same as many repetitions of its first character?
  Hint: Is the first half of the sequence the same as the second half of the sequence?
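The two hints are the smallest and largest cases of one idea: pick a candidate repeat unit from the front of the sequence and check whether repeating it rebuilds the whole sequence. The sketch below tries every unit length between those two cases. It assumes that "tandem repeats" means the sequence equals two or more copies of its leading unit. The function name is_tandem_repeat is hypothetical, and print is called as a function so the sketch also runs under Python 3.

```python
# Hypothetical function: True if seq is two or more copies of a leading unit
def is_tandem_repeat(seq):
    n = len(seq)
    # Candidate unit lengths run from 1 (first hint) to half of seq (second hint)
    for unit_len in range(1, n // 2 + 1):
        # The unit must fit into seq a whole number of times
        if n % unit_len == 0:
            unit = seq[0:unit_len]
            if unit * (n // unit_len) == seq:
                return True
    return False

# Report the result for a given sequence
def report(seq):
    if is_tandem_repeat(seq):
        print("%s: tandem repeats" % seq)
    else:
        print("%s: not tandem repeats" % seq)

report('AAAAAAAAAAAAAAAA')
report('CACACACACACACAC')
report('ATTCGATTCGATTCG')
report('GTAGTAGTAGTAGTA')
report('TCAGTCACTCACTCAG')
```

Multiplying a string by an integer, as in unit * 3, repeats it that many times, which is what makes the comparison with seq a single expression.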