Introduction to Python

Modified from
• Chen Lin
• Guido van Rossum
• Mark Hammond
• John Zelle
What Is Python?

Created in 1990 by Guido van Rossum
While at CWI, Amsterdam
Now hosted by the Corporation for National Research
Initiatives (CNRI), Reston, VA, USA


Free, open source


And with an amazing community
Object oriented language

“Everything is an object”
Why Python?

Designed to be easy to learn and master
Clean, clear syntax
 Very few keywords


Highly portable
Runs almost anywhere: high-end servers
and workstations, down to Windows CE
Uses machine-independent byte-codes


Extensible

Designed to be extensible using C/C++,
allowing access to many external libraries
Most obvious and notorious features

Clean syntax plus high-level data types
Leads to fast coding

Uses white-space to delimit blocks
Humans generally do, so why not the language?

Variables do not need declaration
Although not a type-less language
4 Major Versions of Python

“Python” or “CPython” is written in C
Version 2.7
 Version 3.3


“Jython” is written in Java for the JVM

“IronPython” is written in C# for the .Net
environment
Pydev with Eclipse
Python Interactive Shell
% python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
You can type things directly into a running Python session
>>> 2+3*4
14
>>> name = "Andrew"
>>> name
'Andrew'
>>> print "Hello", name
Hello Andrew
>>>
Background
 Data Types/Structure
 Control flow
 File I/O
 Modules

“Hello World” in Python
print "hello World!"
Blocks

Blocks are delimited by indentation
Colon used to start a block
Tabs or spaces may be used
Mixing tabs and spaces works, but is
discouraged


>>> if 1:
...     print "True"
...
True
>>>
Python Built-in types
Numbers: 3.1415
Strings: "Hello World"
Lists: [1,2,3,4]
Dictionaries: {'test': 'yum'}
Files: input=open('file.txt', 'r')
Variables and Types
Objects always have a type
>>> a = 1
>>> type(a)
<type 'int'>
>>> a = "Hello"
>>> type(a)
<type 'str'>
>>> type(1.0)
<type 'float'>

Number in Action
>>> a=3                # Name Created
>>> b=4
>>> a+1, a-1           # (3+1), (3-1)
(4, 2)
>>> b*3, b/2
(12, 2)
Simple Data Types

Integer objects implemented using C
longs
Dividing two integers returns the floor of the quotient
(as C does for positive operands)
>>> 5/2
2

Float types implemented using C doubles

No point in having single precision since
execution overhead is large anyway
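The division behaviour above can be sketched as follows. This assumes Python 3, where `/` always returns a float and `//` gives the floor that Python 2's `/` returns for two integers; the variable names are illustrative only.

```python
# Floor division vs. true division (Python 3 semantics assumed).
floor_quotient = 5 // 2      # floor of 2.5
true_quotient = 5 / 2        # float result in Python 3
negative_floor = -5 // 2     # the floor rounds toward negative infinity

print(floor_quotient, true_quotient, negative_floor)
```

The negative case is where flooring and truncation toward zero give different answers.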
Simple Data Types

Long Integers have unlimited size
Limited only by available memory
>>> long = 1L << 64
>>> long ** 5
2135987035920910082395021706169552114602704522356652769947041607822219725780640550022962086936576L
High Level Data Types

Tuples are similar to lists
Sequence of items
 Key difference is they are immutable
 Often used in place of simple structures

Automatic unpacking
 >>> point = 2,3
>>> x, y = point
>>> x
2

Reference Semantics

Assignment manipulates references
x = y does not make a copy of y
x = y makes x reference the object y references
Very useful; but beware!
 Example:

>>> a = [1, 2, 3]
>>> b = a
>>> a.append(4)
>>> print b
[1, 2, 3, 4]
Changing a Shared List
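a = [1, 2, 3]    a references a new list object [1, 2, 3]
b = a            b references the same list object as a
a.append(4)      the single shared list becomes [1, 2, 3, 4], seen through both a and b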
Changing an Integer
a = 1        a references the int object 1
b = a        b references the same int object 1
a = a+1      new int object created by add operator (1+1); a now references 2
             old reference deleted by assignment (a=...); b still references 1
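A minimal sketch of the two cases above, a shared list and a rebound integer, written for Python 3. The names `a2`, `b2` and `c` are illustrative, and the explicit copy with `list()` is an addition not shown on the slides.

```python
# Two names bound to one mutable list see the same changes.
a = [1, 2, 3]
b = a                 # b references the same list object as a
a.append(4)
shared = a is b       # one object, two names

# Rebinding an integer name creates a new object; b2 keeps the old one.
a2 = 1
b2 = a2
a2 = a2 + 1           # a2 now refers to 2; b2 still refers to 1

# An explicit shallow copy breaks the sharing.
c = list(a)
c.append(5)

print(b, shared, a2, b2, a, c)
```

Copying is only needed when the two names are meant to evolve independently.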
Strings

The next major built-in type is the Python
STRING: an ordered collection of characters
to store and represent text-based information

Python strings are categorized as immutable
sequences, meaning they have a left-to-right
order (sequence) and cannot be changed in
place (immutable)
Single- and Double-Quoted
Single- and Double-Quoted strings are the
same
>>> 'Hello World' , "Hello World"

The reason for including both is that it
allows you to embed a quote character of
the other inside a string
>>> "knight's" , 'knight"s'

len()

The len built-in function returns the length
of strings
>>> len('abc')
3
>>> a='abc'
>>> len(a)
3
+

Adding two string objects creates a new string
object
>>> 'abc' + 'def'
>>> a='Hello'
>>> b='World'
>>> a + b
>>> a+ ' ' +b
*
Repetition may seem a bit obscure at first,
but it comes in handy in a surprising
number of contexts
 For example, to print a line of 80 dashes
>>> print '-' * 80

Strings and Secret Codes
In the early days of computers, each
manufacturer used their own encoding of
numbers for characters.
The ASCII system (American Standard Code
for Information Interchange) uses 7-bit
codes, giving 128 characters (0 to 127)
Python supports Unicode (100,000+
characters)

Strings and Secret Codes


The ord function returns the numeric (ordinal)
code of a single character.
The chr function converts a numeric code to the
corresponding character.
>>> ord("A")
65
>>> ord("a")
97
>>> chr(97)
'a'
>>> chr(65)
'A'
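As a small illustration of how ord and chr pair up, the sketch below turns a message into a list of numeric codes and back again. It is written for Python 3; the helper names `encode` and `decode` are hypothetical and not part of the slides.

```python
def encode(message):
    """Return the list of numeric codes, one per character in message."""
    return [ord(ch) for ch in message]

def decode(codes):
    """Rebuild the string from a list of numeric codes."""
    return "".join(chr(code) for code in codes)

codes = encode("Aa")
print(codes)
print(decode(codes))
```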
Lists
Lists are Python’s most flexible ordered
collection object type
Lists can contain any sort of object:
numbers, strings and even other lists
Ordered collections of arbitrary objects
Accessed by offset
Variable length, heterogeneous,
arbitrarily nestable

List
A compound data type:
[0]
[2.3, 4.5]
[5, "Hello", "there", 9.8]
[]
Use len() to get the length of a list
>>> names = ["Ben", "Chen", "Yaqin"]
>>> len(names)
3
Use [ ] to index items in the list
>>> names[0]          # [0] is the first item
'Ben'
>>> names[1]          # [1] is the second item
'Chen'
>>> names[2]
'Yaqin'
>>> names[3]          # out of range values raise an exception
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> names[-1]         # negative values go backwards from the last element
'Yaqin'
>>> names[-2]
'Chen'
>>> names[-3]
'Ben'
Strings share many features with lists
>>> smiles = "C(=N)(N)N.C(=O)(O)O"
>>> smiles[0]
'C'
>>> smiles[1]
'('
>>> smiles[-1]
'O'
Use “slice” notation to get a substring
>>> smiles[1:5]
'(=N)'
>>> smiles[10:-4]
'C(=O)'
String Methods: find, split
>>> smiles = "C(=N)(N)N.C(=O)(O)O"
Use “find” to find the start of a substring.
>>> smiles.find("(O)")
15
>>> smiles.find(".")
9
Start looking at position 10.
Find returns -1 if it couldn’t find a match.
>>> smiles.find(".", 10)
-1
Split the string into parts with “.” as the delimiter
>>> smiles.split(".")
['C(=N)(N)N', 'C(=O)(O)O']
>>>
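The find and slice operations above combine naturally. The sketch below, written for Python 3, splits a string at the first occurrence of a delimiter and handles the -1 "no match" result explicitly; `split_at_first` is a hypothetical helper, not a built-in method.

```python
smiles = "C(=N)(N)N.C(=O)(O)O"

def split_at_first(text, delimiter):
    """Split text at the first delimiter; return (text, "") if it is absent."""
    position = text.find(delimiter)
    if position == -1:          # find signals "no match" with -1
        return text, ""
    return text[:position], text[position + len(delimiter):]

first, rest = split_at_first(smiles, ".")
print(first, rest)
```

Checking for -1 matters because a negative index is also a valid slice position, so a missed match would otherwise cut the string in the wrong place without any error.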
String operators: in, not in
if "Br" in "Brother":
    print "contains brother"
email_address = "clin"
if "@" not in email_address:
    email_address += "@brandeis.edu"
String Method: “strip”, “rstrip”, “lstrip” are ways to
remove whitespace or selected characters
>>> line = " # This is a comment line \n"
>>> line.strip()
'# This is a comment line'
>>> line.rstrip()
' # This is a comment line'
>>> line.rstrip("\n")
' # This is a comment line '
>>>
More String methods
email.startswith("c"), email.endswith("u")    each returns True/False
>>> "%s@brandeis.edu" % "clin"
'clin@brandeis.edu'
>>> names = ["Ben", "Chen", "Yaqin"]
>>> ", ".join(names)
'Ben, Chen, Yaqin'
>>> "chen".upper()
'CHEN'
Unexpected things about strings
Strings are read only
>>> s = "andrew"
>>> s[0] = "A"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>> s = "A" + s[1:]
>>> s
'Andrew'
“\” is for special characters
\n -> newline
\t -> tab
\\ -> backslash
...
But Windows uses backslash for directories!
filename = "M:\nickel_project\reactive.smi" # DANGER!
filename = "M:\\nickel_project\\reactive.smi" # Better!
filename = "M:/nickel_project/reactive.smi" # Usually works
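One further option, not shown on the slide, is a raw string literal: the `r` prefix tells Python to leave backslashes alone. The short sketch below (Python 3) compares the three spellings; the variable names are illustrative.

```python
escaped = "M:\\nickel_project\\reactive.smi"   # doubled backslashes
raw = r"M:\nickel_project\reactive.smi"        # raw string: backslashes kept as-is
broken = "M:\nickel_project\reactive.smi"      # \n and \r become control characters

print(escaped == raw)
print(repr(broken))
```

The broken form is two characters shorter than the other two because each two-character escape collapses into a single control character.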
Lists are mutable - some useful
methods
>>> ids = ["9pti", "2plv", "1crn"]
>>> ids.append("1alm")          # append an element
>>> ids
['9pti', '2plv', '1crn', '1alm']
>>> del ids[0]                  # remove an element
>>> ids
['2plv', '1crn', '1alm']
>>> ids.sort()                  # sort by default order
>>> ids
['1alm', '1crn', '2plv']
>>> ids.reverse()               # reverse the elements in a list
>>> ids
['2plv', '1crn', '1alm']
>>> ids.insert(0, "9pti")       # insert an element at some specified position (slower than .append())
>>> ids
['9pti', '2plv', '1crn', '1alm']
ids.extend(L) extends the list by appending all the items in the given list L; equivalent to ids[len(ids):] = L.
Tuples: sort of an immutable list
>>> yellow = (255, 255, 0) # r, g, b
>>> one = (1,)
>>> yellow[0]
255
>>> yellow[1:]
(255, 0)
>>> yellow[0] = 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
Very common in string interpolation:
>>> "%s lives in %s at latitude %.1f" % ("Andrew", "Sweden", 57.7056)
'Andrew lives in Sweden at latitude 57.7'
zipping lists together
>>> names
['ben', 'chen', 'yaqin']
>>> gender = [0, 0, 1]
>>> zip(names, gender)
[('ben', 0), ('chen', 0), ('yaqin', 1)]
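The same zip call behaves slightly differently in Python 3, where it returns a lazy iterator rather than a list. A brief sketch under that assumption follows; the names `pairs` and `lookup` are illustrative.

```python
names = ["ben", "chen", "yaqin"]
gender = [0, 0, 1]

pairs = list(zip(names, gender))     # list() is needed in Python 3, where zip is lazy
lookup = dict(zip(names, gender))    # the pairs can seed a dictionary directly

print(pairs)
print(lookup["yaqin"])
```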
High Level Data Types

Dictionaries hold key-value pairs
Often called maps or hashes. Implemented
using hash-tables
 Keys may be any immutable object, values
may be any object
 Declared using braces


>>> d={}
>>> d[0] = "Hi there"
>>> d["foo"] = 1
Dictionaries




Dictionaries are lookup tables.
They map from a “key” to a “value”.
symbol_to_name = {
    "H": "hydrogen",
    "He": "helium",
    "Li": "lithium",
    "C": "carbon",
    "O": "oxygen",
    "N": "nitrogen"
}
Duplicate keys are not allowed
Duplicate values are just fine
Keys can be any immutable value
numbers, strings, tuples, frozenset,
not list, dictionary, set, ...
A set is an unordered collection with no duplicate elements.
atomic_number_to_name = {
    1: "hydrogen",
    6: "carbon",
    7: "nitrogen",
    8: "oxygen",
}
nobel_prize_winners = {
    (1979, "physics"): ["Glashow", "Salam", "Weinberg"],
    (1964, "chemistry"): ["Hodgkin"],
    (1983, "medicine"): ["McClintock"],
}
Dictionary
>>> symbol_to_name["C"]            # get the value for a given key
'carbon'
>>> "O" in symbol_to_name, "U" in symbol_to_name    # test if the key exists
(True, False)
>>> "oxygen" in symbol_to_name     # “in” only checks the keys, not the values
False
>>> symbol_to_name["P"]            # [] lookup failures raise an exception
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'P'
>>> symbol_to_name.get("P", "unknown")    # use “.get()” if you want to return a default value
'unknown'
>>> symbol_to_name.get("C", "unknown")
'carbon'
Some useful dictionary methods
>>> symbol_to_name.keys()
['C', 'H', 'O', 'N', 'Li', 'He']
>>> symbol_to_name.values()
['carbon', 'hydrogen', 'oxygen', 'nitrogen', 'lithium', 'helium']
>>> symbol_to_name.update( {"P": "phosphorus", "S": "sulfur"} )
>>> symbol_to_name.items()
[('C', 'carbon'), ('H', 'hydrogen'), ('O', 'oxygen'), ('N', 'nitrogen'), ('P',
'phosphorus'), ('S', 'sulfur'), ('Li', 'lithium'), ('He', 'helium')]
>>> del symbol_to_name['C']
>>> symbol_to_name
{'H': 'hydrogen', 'O': 'oxygen', 'N': 'nitrogen', 'P': 'phosphorus', 'S': 'sulfur', 'Li': 'lithium', 'He': 'helium'}
Control Flow
Things that are False
 The boolean value False
 The numbers 0 (integer), 0.0 (float) and 0j (complex).
 The empty string "".
 The empty list [], empty dictionary {} and empty set set().
Things that are True
 The boolean value True
 All non-zero numbers.
 Any string containing at least one character.
 A non-empty data structure.
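The two lists above can be checked directly by filtering a handful of sample values on their truth value. This is a small Python 3 sketch; the sample values are chosen for illustration.

```python
values = [False, 0, 0.0, 0j, "", [], {}, set(), True, 7, "x", [0]]

falsy = [v for v in values if not v]    # everything that counts as False
truthy = [v for v in values if v]       # everything that counts as True

print(len(falsy), truthy)
```

Note that `[0]` is true: the list is non-empty even though its only element is false.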
If
>>> smiles = "BrC1=CC=C(C=C1)NN.Cl"
>>> bool(smiles)
True
>>> not bool(smiles)
False
>>> if not smiles:
...     print "The SMILES string is empty"
...
The “else” case is always optional
Use “elif” to chain subsequent tests
>>> mode = "absolute"
>>> if mode == "canonical":
...     smiles = "canonical"
... elif mode == "isomeric":
...     smiles = "isomeric"
... elif mode == "absolute":
...     smiles = "absolute"
... else:
...     raise TypeError("unknown mode")
...
>>> smiles
'absolute'
>>>
“raise” is the Python way to raise exceptions
Boolean logic
Python expressions can have “and”s and
“or”s:
if (ben <= 5 and chen >= 10 or
    chen == 500 and ben != 5):
    print "Ben and Chen"
Range Test
if (3 <= Time <= 5):
    print "Office Hour"
Looping
The for statement loops over sequences
>>> for ch in "Hello":
...     print ch
...
H
e
l
l
o
>>>

For
>>> names = ["Ben", "Chen", "Yaqin"]
>>> for name in names:
...     print name
...
Ben
Chen
Yaqin
Tuple assignment in for loops
data = [ ("C20H20O3", 308.371),
         ("C22H20O2", 316.393),
         ("C24H40N4O2", 416.6),
         ("C14H25N5O3", 311.38),
         ("C15H20O2", 232.3181)]
for (formula, mw) in data:
    print "The molecular weight of %s is %s" % (formula, mw)
The molecular weight of C20H20O3 is 308.371
The molecular weight of C22H20O2 is 316.393
The molecular weight of C24H40N4O2 is 416.6
The molecular weight of C14H25N5O3 is 311.38
The molecular weight of C15H20O2 is 232.3181
Break, continue
Use “break” to stop the for loop
Use “continue” to stop processing the current item
>>> for value in [3, 1, 4, 1, 5, 9, 2]:
...     print "Checking", value
...     if value > 8:
...         print "Exiting for loop"
...         break
...     elif value < 3:
...         print "Ignoring"
...         continue
...     print "The square is", value**2
...
Checking 3
The square is 9
Checking 1
Ignoring
Checking 4
The square is 16
Checking 1
Ignoring
Checking 5
The square is 25
Checking 9
Exiting for loop
>>>
Range()



“range” creates a list of numbers in a specified range
range([start,] stop[, step]) -> list of integers
When step is given, it specifies the increment (or decrement).
>>> range(5)
[0, 1, 2, 3, 4]
>>> range(5, 10)
[5, 6, 7, 8, 9]
>>> range(0, 10, 2)
[0, 2, 4, 6, 8]
How to get every second element in a list?
for i in range(0, len(data), 2):
    print data[i]
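The index-based loop above can be compared with an extended slice, which is an alternative not shown on the slide. The sketch assumes Python 3, where `range` is lazy and needs `list()` to display; the sample `data` is made up.

```python
data = ["a", "b", "c", "d", "e"]     # made-up sample data

by_index = [data[i] for i in range(0, len(data), 2)]   # indices 0, 2, 4
by_slice = data[::2]                                   # extended slice with step 2

print(list(range(0, 10, 2)))
print(by_index, by_slice)
```

The slice form is shorter, while the `range` form is still useful when the index itself is needed inside the loop.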
Functions
Functions are defined with the def
statement:
>>> def foo(bar):
...     return bar
...
>>>
This defines a trivial function named foo
that takes a single parameter bar

Functions

A function definition simply places a
function object in the namespace

>>> foo
<function foo at fac680>
>>>

And the function object can obviously be
called:

>>> foo(3)
3
>>>
Functions, Procedures
def name(arg1, arg2, ...):
    """documentation"""    # optional doc string
    statements
return                     # from procedure
return expression          # from function
Example Function
def gcd(a, b):
    "greatest common divisor"
    while a != 0:
        a, b = b%a, a    # parallel assignment
    return b
>>> gcd.__doc__
'greatest common divisor'
>>> gcd(12, 20)
4
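One way to gain confidence in the gcd function is to compare it with the standard library. The sketch below repeats the function and checks it against `math.gcd`, assuming Python 3.5 or later where that function exists; the test pairs are arbitrary.

```python
import math

def gcd(a, b):
    "greatest common divisor"
    while a != 0:
        a, b = b % a, a    # parallel assignment: both right-hand values use the old a and b
    return b

# Compare against math.gcd on a few non-negative pairs.
pairs = [(12, 20), (20, 12), (7, 13), (0, 5), (48, 18)]
agrees = all(gcd(a, b) == math.gcd(a, b) for a, b in pairs)

print(gcd(12, 20), agrees)
```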
Classes
Classes are defined using the class
statement
>>> class Foo:
...     def __init__(self):
...         self.member = 1
...     def GetMember(self):
...         return self.member
...
>>>

Classes

A few things are worth pointing out in the
previous example:
The constructor has a special name
__init__, while a destructor (not shown)
uses __del__
 The self parameter is the instance (ie, the
this in C++). In Python, the self parameter
is explicit (c.f. C++, where it is implicit)
 The name self is not required - simply a
convention

Classes

Like functions, a class statement simply
adds a class object to the namespace

>>> Foo
<class __main__.Foo at 1000960>
>>>

Classes are instantiated using call syntax

>>> f=Foo()
>>> f.GetMember()
1
Example class
class Stack:
    "A well-known data structure…"
    def __init__(self):                # constructor
        self.items = []
    def push(self, x):
        self.items.append(x)           # the sky is the limit
    def pop(self):
        x = self.items[-1]             # what happens if it’s empty?
        del self.items[-1]
        return x
    def empty(self):
        return len(self.items) == 0    # Boolean result
Using classes

To create an instance, simply call the class object:
x = Stack()          # no 'new' operator!

To use methods of the instance, call using dot notation:
x.empty()            # -> 1
x.push(1)            # [1]
x.empty()            # -> 0
x.push("hello")      # [1, "hello"]
x.pop()              # -> "hello"    # [1]

To inspect instance variables, use dot notation:
x.items              # -> [1]
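The example class asks what happens when pop is called on an empty stack. The sketch below, written for Python 3, repeats a minimal version of the class and shows that indexing the empty list raises IndexError; the variable names are illustrative.

```python
class Stack:
    "A minimal stack, mirroring the example class."
    def __init__(self):
        self.items = []
    def push(self, x):
        self.items.append(x)
    def pop(self):
        x = self.items[-1]      # raises IndexError when the list is empty
        del self.items[-1]
        return x
    def empty(self):
        return len(self.items) == 0

s = Stack()
s.push(1)
s.push("hello")
popped = s.pop()

empty_stack = Stack()
try:
    empty_stack.pop()
    outcome = "no error"
except IndexError:
    outcome = "IndexError"

print(popped, s.items, outcome)
```

Callers can either test `empty()` first or catch the exception, depending on which reads better at the call site.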
Subclassing
class FancyStack(Stack):
    "stack with added ability to inspect inferior stack items"
    def peek(self, n):
        "peek(0) returns top; peek(1) returns item below that; etc."
        size = len(self.items)
        assert 0 <= n < size           # test precondition
        return self.items[size-1-n]
Subclassing (2)
class LimitedStack(FancyStack):
    "fancy stack with limit on stack size"
    def __init__(self, limit):
        self.limit = limit
        FancyStack.__init__(self)      # base class constructor
    def push(self, x):
        assert len(self.items) < self.limit
        FancyStack.push(self, x)       # "super" method call
Class & instance variables
class Connection:
    verbose = 0                        # class variable
    def __init__(self, host):
        self.host = host               # instance variable
    def debug(self, v):
        self.verbose = v               # make instance variable!
    def connect(self):
        if self.verbose:               # class or instance variable?
            print "connecting to", self.host
Instance variable rules

On use via instance (self.x), search order:
(1) instance, (2) class, (3) base classes
this also works for method lookup

On assignment via instance (self.x = ...):
always makes an instance variable

Class variables "default" for instance variables

But...!
mutable class variable: one copy shared by all
mutable instance variable: each instance its own
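The "But...!" case is easy to trip over, so here is a minimal Python 3 sketch. The `Registry` class and its attribute names are hypothetical, chosen only to illustrate the difference.

```python
class Registry:                     # hypothetical class, for illustration only
    shared = []                     # mutable class variable: one list for every instance
    def __init__(self):
        self.own = []               # mutable instance variable: one list per instance
    def add(self, item):
        self.shared.append(item)    # mutates the class-level list in place
        self.own.append(item)

first = Registry()
second = Registry()
first.add("x")

print(second.shared, second.own)
```

Because `append` mutates the list instead of assigning to `self.shared`, no instance variable is created and every instance sees the change.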

Exceptions

Python uses exceptions for errors
try / except block can handle exceptions
>>> try:
...     1/0
... except ZeroDivisionError:
...     print "Eeek"
...
Eeek
>>>


Exceptions

try / finally block can guarantee
execution of code even in the face of
exceptions

>>> try:
...     1/0
... finally:
...     print "Doing this anyway"
...
Doing this anyway
Traceback (innermost last):
  File "<interactive input>", line 2, in ?
ZeroDivisionError: integer division or modulo
>>>
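The two forms can also be combined in a single statement. The sketch below, written for Python 3, records the order in which the except and finally clauses run; `safe_divide` and `log` are hypothetical names.

```python
log = []

def safe_divide(a, b):              # hypothetical helper
    try:
        return a / b
    except ZeroDivisionError:
        log.append("division by zero")
        return None
    finally:
        log.append("cleanup")       # runs on both the normal and the error path

result_ok = safe_divide(6, 3)
result_bad = safe_divide(1, 0)

print(result_ok, result_bad, log)
```

The finally clause runs even though both other branches return, which is what makes it suitable for releasing resources.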
More on exceptions

User-defined exceptions
subclass Exception or any other standard exception

Old Python: exceptions can be strings
WATCH OUT: compared by object identity, not ==

Last caught exception info:
sys.exc_info() == (exc_type, exc_value, exc_traceback)

Last uncaught exception (traceback printed):
sys.last_type, sys.last_value, sys.last_traceback

Printing exceptions: traceback module
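A short sketch of a user-defined exception together with sys.exc_info(), written for Python 3. The class name `StackEmptyError` and its message are made up for illustration.

```python
import sys

class StackEmptyError(Exception):    # hypothetical user-defined exception
    """Raised when an operation needs a non-empty stack."""

try:
    raise StackEmptyError("nothing to pop")
except StackEmptyError:
    # Inside the handler, exc_info() describes the exception being handled.
    exc_type, exc_value, exc_traceback = sys.exc_info()

caught_name = exc_type.__name__
caught_message = str(exc_value)

print(caught_name, caught_message)
```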
Files: Multi-line Strings
A file is a sequence of data that is stored
in secondary memory (disk drive).
 Files can contain any data type, but the
easiest to work with are text.
 A file usually contains more than one line
of text.
 Python uses the standard newline
character (\n) to mark line breaks.

File Processing

Reading a file into a word processor
File opened
Contents read into RAM
File closed
Changes to the file are made to the copy
stored in memory, not on the disk
The in-memory copy has to be flushed (written back) before the changes reach the disk
File Objects

f = open(filename[, mode[, buffersize]])
mode can be "r", "w", "a" (like C stdio); default "r"
append "b" for binary mode (no newline translation)
append "+" for read/write open
buffersize: 0=unbuffered; 1=line-buffered; larger values give a buffer of about that size
methods:
read([nbytes]), readline(), readlines()
write(string), writelines(list)
seek(pos[, how]), tell()
flush(), close()
fileno()
Reading files
>>> f = open("names.txt")
>>> f.readline()
'Yaqin\n'
File Processing

readline can be used to read the next line
from a file, including the trailing newline
character

infile = open(someFile, "r")
for i in range(5):
    line = infile.readline()
    print line[:-1]
This reads the first 5 lines of a file
Slicing is used to strip out the newline
characters at the ends of the lines

File Processing
Another way to loop through the contents
of a file is to read it in with readlines and
then loop through the resulting list.
infile = open(someFile, "r")
for line in infile.readlines():
    # Line processing here
infile.close()

File Processing
Python treats the file itself as a sequence
of lines!
infile = open(someFile, "r")
for line in infile:
    # process the line here
infile.close()
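The open / loop / close pattern above is often written with a `with` statement, which is an addition to what the slides show. The Python 3 sketch below first creates a throwaway file so that it is self-contained; the file name and contents are made up.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "names.txt")   # hypothetical file name
with open(path, "w") as outfile:
    outfile.write("Ben\nChen\nYaqin\n")

# The with statement closes the file even if the loop raises an exception.
lines = []
with open(path) as infile:
    for line in infile:
        lines.append(line.rstrip("\n"))   # drop only the trailing newline

print(lines, infile.closed)
```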

Quick Way
>>> lst = [x for x in open("text.txt", "r").readlines()]
>>> lst
['Chen Lin\n', '[email protected]\n', 'Volen 110\n', 'Office Hour: Thurs. 3-5\n', '\n',
'Yaqin Yang\n', '[email protected]\n', 'Volen 110\n', 'Office Hour: Tues. 3-5\n']
Ignore the header?
for (i, line) in enumerate(open('text.txt', "r").readlines()):
    if i == 0: continue
    print line
Using dictionaries to count
occurrences
>>> name_count = {}
>>> for line in open('names.txt'):
...     name = line.strip()
...     name_count[name] = name_count.get(name, 0) + 1
...
>>> for (name, count) in name_count.items():
...     print name, count
...
Chen 3
Ben 3
Yaqin 3
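The counting idiom does not depend on files. The sketch below applies it to an in-memory list in Python 3; the sample names and their counts are made up and differ from the file used above.

```python
names = ["Chen", "Ben", "Yaqin", "Ben", "Chen", "Chen"]   # made-up sample data

name_count = {}                       # the dictionary must exist before the loop
for name in names:
    # get() returns 0 for a name not seen yet, so no KeyError is raised
    name_count[name] = name_count.get(name, 0) + 1

print(name_count)
```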
File Processing
Opening a file for writing prepares the file
to receive data
If you open an existing file for writing, you
wipe out the file’s contents. If the named
file does not exist, a new one is created.
outfile = open("mydata.out", "w")
print >>outfile, <expressions>
(in Python 3: print(<expressions>, file=outfile))

File Output
input_file = open("in.txt")
output_file = open("out.txt", "w")
for line in input_file:
    output_file.write(line)
"w" = "write mode"
"a" = "append mode"
"wb" = "write in binary"
"r" = "read mode" (default)
"rb" = "read in binary"
"U" = "read files with Unix or Windows line endings"
Modules
When a Python program starts it only has
access to a basic set of functions and classes.
(“int”, “dict”, “len”, “sum”, “range”, ...)
“Modules” contain additional functionality.
Modules can be implemented either in
Python, or in C/C++
Use “import” to tell Python to load a
module.
>>> import math

import the math module
>>> import math
>>> math.pi
3.1415926535897931
>>> math.cos(0)
1.0
>>> math.cos(math.pi)
-1.0
>>> dir(math)
['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh',
'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos',
'cosh', 'degrees', 'e', 'exp', 'fabs', 'factorial', 'floor', 'fmod',
'frexp', 'fsum', 'hypot', 'isinf', 'isnan', 'ldexp', 'log', 'log10',
'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan',
'tanh', 'trunc']
>>> help(math)
>>> help(math.cos)
“import” and “from ... import ...”
>>> import math                 # then use math.cos
>>> from math import cos, pi    # then use cos
>>> from math import *
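Both import forms bind names to the same underlying objects; only the name used to reach them differs. A brief Python 3 sketch, with illustrative variable names:

```python
import math
from math import cos, pi

same_function = math.cos is cos     # both names refer to one function object
value = cos(pi)

print(same_function, value)
```

The `from math import *` form is usually avoided in larger programs because it makes it hard to tell where a name came from.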
Standard Library
Python comes standard with a set of
modules, known as the “standard library”
 Rich and diverse functionality available
from the standard library


All common internet protocols, sockets, CGI,
OS services, GUI services (via Tcl/Tk),
database, Berkeley style databases, calendar,
Python parser, file globbing/searching,
debugger, profiler, threading and
synchronisation, persistency, etc
External library

Many modules are available externally
covering almost every piece of
functionality you could ever desire

Imaging, numerical analysis, OS specific
functionality, SQL databases, Fortran
interfaces, XML, Corba, COM, Win32 API, etc
For More Information?
http://python.org/
- documentation, tutorials, beginners guide, core
distribution, ...
Books include:
 Learning Python by Mark Lutz
 Python Essential Reference by David Beazley
 Python Cookbook, ed. by Martelli, Ravenscroft and
Ascher
 (online at
http://code.activestate.com/recipes/langs/python/)
 http://wiki.python.org/moin/PythonBooks
Python Videos
http://showmedo.com/videotutorials/python
 “5 Minute Overview (What Does Python
Look Like?)”
 “Introducing the PyDev IDE for Eclipse”
 “Linear Algebra with Numpy”
 And many more