Python Training for Java Programmers

Python Training
for HP OSO
Guido van Rossum
CNRI
7/23/1999
9am - 1pm
1
Plug
• The Practice of
Programming
• Brian W. Kernighan
and Rob Pike
• Addison-Wesley, 1999
Mostly about C, but very useful!
http://cm.bell-labs.com/cm/cs/tpop/
2
CODE STRUCTURE
3
The importance of readability
• Most time is spent on maintenance
• Think about the human reader
• Can you still read your own code...
– next month?
– next year?
4
Writing readable code
• Be consistent
(but not too consistent!)
• Use whitespace judiciously
• Write appropriate comments
• Write helpful doc strings
– not novels
• Indicate unfinished business
5
Modifying existing code
• Conform to the existing style
– even if it’s not your favorite style!
– local consistency overrides global
• Update the comments!!
– and the doc strings!!!
6
Organizing code clearly
• Top-down or bottom-up?
• Pick one style, stick to it
• Alternative: group by functionality
– e.g.:
• constructor, destructor
• housekeeping
• low level methods
• high level methods
7
When to use classes
(...and when not!)
• Use a class:
– when multiple copies of state needed
• e.g.: client connections; drawing objects
• Use a module:
– when one copy of state always suffices
• e.g.: logger; cache
• Use functions:
– when no state needed; e.g. sin()
8
Class hierarchies
• Avoid deep class hierarchies
– inefficient
• multi-level lookup
– hard to read
• find method definitions
– easy to make mistakes
• name clashes between attributes
9
Modules and packages
• Modules collect classes, functions
• Packages collect modules
• For group of related modules:
– consider using a package
• minimizes chance of namespace clashes
10
Naming conventions
(my preferred style)
• Modules, packages: lowercase
• except when 1 module ~ 1 class
• Classes: CapitalizedWords
• also for exceptions
• Methods, attrs: lowercase_words
• Local variables: i, j, sum, x0, etc.
• Globals: long_descriptive_names
11
The main program
• In script or program:
    def main(): ...
    if __name__ == "__main__":
        main()
• In module:
    def _test(): ...
    if __name__ == "__main__":
        _test()
• Always define a function!
12
DOCUMENTATION
13
Writing comments
• Explain salient points (only)
    n = n+1    # include end point
• Note dependencies, refs, bugs
# Assume reader() handles I/O errors
# See Knuth, vol.3, page 410
# XXX doesn’t handle x<0 yet
14
Writing doc strings
"""Brief one-line description.
Longer description, documenting
argument values, defaults,
return values, and exceptions.
"""
15
When NOT to use comments
• Don’t comment what’s obvious
    n = n+1    # increment n
• Don’t put a comment on every line
• Don’t draw boxes, lines, etc.
    #-----------------------
    def remove_bias(self):
    #-----------------------
        self.bias = 0
16
One more thing...
UPDATE THE COMMENTS WHEN
UPDATING THE CODE!
(dammit!)
17
THE LIBRARY
18
The library is your friend!
• Know what's there
• Study the library manual
– especially the early chapters:
• Python, string, misc, os services
• Notice platform dependencies
• Avoid obsolete modules
19
Stupid os.path tricks
• os.path.exists(p), isdir(p), islink(p)
• os.path.isabs(p)
• os.path.join(p, q, ...), split(p)
• os.path.basename(p), dirname(p)
• os.path.splitdrive(p), splitext(p)
• os.path.normcase(p), normpath(p)
• os.path.expanduser(p)
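A few of these calls combined, as a small sketch in Python 3. The describe() helper and the example path are made up for illustration; only the os.path functions themselves come from the slide.

```python
import os.path

def describe(path):
    """Split a path into the pieces the os.path functions expose."""
    directory, filename = os.path.split(path)
    stem, ext = os.path.splitext(filename)
    return {
        "directory": directory,
        "filename": filename,
        "stem": stem,
        "extension": ext,
        "is_absolute": os.path.isabs(path),
    }

# Build the path with join() so the separator suits the current platform.
example = os.path.join("reports", "1999", "summary.txt")
info = describe(example)
print(info["filename"], info["extension"])
```

Using join() and split() instead of hand-written string slicing is what keeps this kind of code portable across platforms.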
20
PORTING YOUR BRAIN
(from Java to Python)
21
Class or module?
• Stateless operations, factory funcs
• Java: static methods
• Python: functions in module
• Singleton state
• Java: static members, methods
• Python: module globals, functions
22
Private, protected, public?
• Java:
• private, protected, public
– enforced by compiler (and JVM?)
• Python:
• __private
– enforced by compiler
– loophole: _Class__private
• _private, _protected, public
– used by convention
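The name-mangling rule and its loophole can be seen in a short sketch (Python 3; the Account class and its attributes are hypothetical):

```python
class Account:
    def __init__(self):
        self.__balance = 100   # stored as _Account__balance
        self._audit_log = []   # "protected" by convention only

    def balance(self):
        return self.__balance  # mangled the same way inside the class

acct = Account()

try:
    acct.__balance             # outside the class no mangling happens
    reached = True
except AttributeError:
    reached = False

# The loophole from the slide: spell out the mangled name yourself.
loophole = acct._Account__balance
```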
23
Method/constr. overloading
• Java:
    class C {
        int f() { ... }
        int f(int i) { ... }
        int f(int i, int arg) { ... }
    }
• Python:
    class C:
        def f(self, i=0, arg=None):
            ...
24
Java interfaces
• In Python, interfaces often implied
    class File:
        def read(self, n): ...
    class CompressedFile:
        def read(self, n): ...
25
Abstract classes
• Not used much in Python
• Possible:
    class GraphicalObject:
        def draw(self, display):
            raise NotImplementedError
        def move(self, dx, dy):
            raise NotImplementedError
        ....
26
ERROR HANDLING
27
When to catch exceptions
• When there's an alternative option
    try:
        f = open(".startup")
    except IOError:
        f = None  # No startup file; use defaults
• To exit with nice error message
    try:
        f = open("data")
    except IOError, msg:
        print "I/O Error:", msg; sys.exit(1)
28
When NOT to catch them
• When the cause is likely a bug
• need the traceback to find the cause!
• When the caller can catch it
• keep exception handling in outer layers
• When you don't know what to do
    try:
        receive_message()
    except:
        print "An error occurred!"
29
Exception handling style
• Bad:
    try:
        parse_args()
        f = open(file)
        read_input()
        make_report()
    except IOError:
        print file, "not found"
    # (what if read_input()
    # raises IOError?)
• Good:
    parse_args()
    try:
        f = open(file)
    except IOError, msg:
        print file, msg
        sys.exit(1)
    read_input()
    make_report()
30
Error reporting/logging
• Decide where errors should go:
– sys.stdout - okay for small scripts
– sys.stderr - for larger programs
– raise exception - in library modules
• let caller decide how to report!
– log function - not recommended
• better redirect sys.stderr to log object!
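Redirecting sys.stderr to a log object could look roughly like this in Python 3. LogWriter and report() are hypothetical names; any object with a write() method can stand in for the stream.

```python
import sys

class LogWriter:
    """Hypothetical file-like object: collects whatever is written to it."""
    def __init__(self):
        self.lines = []

    def write(self, text):
        if text.strip():                      # skip the bare newline writes
            self.lines.append(text.rstrip("\n"))

    def flush(self):
        pass

def report(message):
    # Code that reports a problem simply writes to sys.stderr ...
    print(message, file=sys.stderr)

log = LogWriter()
saved = sys.stderr
sys.stderr = log          # ... and the application decides where it ends up
try:
    report("disk almost full")
finally:
    sys.stderr = saved    # always restore the real stream
```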
31
The danger of “except:”
• What's wrong with this code:
    try:
        return self.children[O]  # first child
    except:
        return None  # no children
• Solution:
    except IndexError:
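One way to see the danger, sketched in Python 3 with a hypothetical Node class and a deliberately misspelled attribute: the bare except turns the typo into "no children", while except IndexError lets the real error surface.

```python
class Node:
    def __init__(self):
        self.children = []

    def first_child_bad(self):
        try:
            return self.childern[0]   # typo: AttributeError, silently swallowed
        except:
            return None

    def first_child_good(self):
        try:
            return self.childern[0]   # same typo, but now it is not hidden
        except IndexError:
            return None

node = Node()
node.children.append("leaf")
hidden = node.first_child_bad()       # None, although a child exists

try:
    node.first_child_good()
    surfaced = False
except AttributeError:
    surfaced = True                   # the bug shows up with a traceback
```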
32
PYTHON PITFALLS
33
Sharing mutable objects
• through variables
    a = [1,2]; b = a; a.append(3); print b
• as default arguments
    def add(a, list=[]):
        list.append(a); return list
• as class attributes
    class TreeNode:
        children = []
        ...
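A sketch of the default-argument pitfall and the usual workarounds, in Python 3. The function names add_shared and add_fresh are made up for this example; the None-default idiom is a common convention, not something the slide prescribes.

```python
def add_shared(item, items=[]):
    # The default list is created once, when the def statement runs.
    items.append(item)
    return items

def add_fresh(item, items=None):
    # Common workaround: default to None and build a new list per call.
    if items is None:
        items = []
    items.append(item)
    return items

class TreeNode:
    def __init__(self):
        self.children = []   # per-instance list instead of a class attribute

first = add_shared(1)
second = add_shared(2)       # same list object as `first`
```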
34
Lurking bugs
• bugs in exception handlers
    try:
        f = open(file)
    except IOError, err:
        print "I/O Error:", file, msg
• misspelled names in assignments
    self.done = 0
    while not done:
        if self.did_it(): self.Done = 1
35
Global variables
# logging module
log = []
def addlog(x):
    log.append(x)
def resetlog():
    log = []    # doesn’t work!

# logging module
# corrected version
log = []
def addlog(x):
    log.append(x)
def resetlog():
    global log
    log = []
36
kjpylint
• Detects many lurking bugs
– http://www.chordate.com/kwParsing/
37
PERFORMANCE
38
When to worry about speed
• Only worry about speed when...
– your code works (!)
– and its overall speed is too slow
– and it must run many times
– and you can't buy faster hardware
39
Using the profile module
>>> import profile
>>> import xmlini
>>> data = open("test.xml").read()
>>> profile.run("xmlini.fromxml(data)")
>>> profile.run("for i in range(100): xmlini.fromxml(data)")
         10702 function calls in 1.155 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.013    0.013    1.154    1.154 <string>:1(?)
        1    0.001    0.001    1.155    1.155 profile:0(for i in range(100): xmlini.fromxml(data))
        0    0.000             0.000          profile:0(profiler)
      500    0.018    0.000    0.018    0.000 xmlini.py:105(end_group)
      700    0.032    0.000    0.032    0.000 xmlini.py:109(start_item)
      700    0.050    0.000    0.050    0.000 xmlini.py:115(end_item)
      200    0.007    0.000    0.007    0.000 xmlini.py:125(start_val)
      200    0.014    0.000    0.014    0.000 xmlini.py:129(end_val)
     1600    0.190    0.000    0.270    0.000 xmlini.py:134(finish_starttag)
     1600    0.163    0.000    0.258    0.000 xmlini.py:143(finish_endtag)
      100    0.004    0.000    0.004    0.000 xmlini.py:152(handle_proc)
      100    0.007    0.000    0.007    0.000 xmlini.py:162(handle_charref)
      100    0.007    0.000    0.007    0.000 xmlini.py:167(handle_entityref)
     3600    0.161    0.000    0.161    0.000 xmlini.py:172(handle_data)
      100    0.003    0.000    0.003    0.000 xmlini.py:182(handle_comment)
      100    0.420    0.004    1.141    0.011 xmlini.py:60(fromxml)
      100    0.007    0.000    0.007    0.000 xmlini.py:70(__init__)
      100    0.004    0.000    0.004    0.000 xmlini.py:80(getdict)
      200    0.012    0.000    0.012    0.000 xmlini.py:86(start_top)
      200    0.014    0.000    0.014    0.000 xmlini.py:92(end_top)
      500    0.029    0.000    0.029    0.000 xmlini.py:99(start_group)
40
Measuring raw speed
# Here's one way
import time
def timing(func, arg, ncalls=100):
    r = range(ncalls)
    t0 = time.clock()
    for i in r:
        func(arg)
    t1 = time.clock()
    dt = t1-t0
    print "%s: %.3f ms/call (%.3f seconds / %d calls)" % (
        func.__name__, 1000*dt/ncalls, dt, ncalls)
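For readers on Python 3, where time.clock() no longer exists (it was removed in 3.8), the same idea can be sketched with time.perf_counter(). Returning the per-call time is an addition made here for convenience, not part of the slide.

```python
import time

def timing(func, arg, ncalls=100):
    """Print and return the average time of func(arg) in milliseconds."""
    t0 = time.perf_counter()
    for _ in range(ncalls):
        func(arg)
    dt = time.perf_counter() - t0
    ms_per_call = 1000 * dt / ncalls
    print("%s: %.3f ms/call (%.3f seconds / %d calls)"
          % (func.__name__, ms_per_call, dt, ncalls))
    return ms_per_call

# Example use: time the built-in sorted() on a 1000-element list.
result = timing(sorted, list(range(1000)), ncalls=50)
```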
41
How to hand-optimize code
import string, types
def dictser(dict, ListType=types.ListType, isinstance=isinstance):
    L = []
    group = dict.get("main")
    if group:
        for key in group.keys():
            value = group[key]
            if isinstance(value, ListType):
                for item in value:
                    L.extend([" ", key, " = ", item, "\n"])
            else:
                L.extend([" ", key, " = ", value, "\n"])
    ...
    return string.join(L, "")
42
When NOT to optimize code
• Usually
• When it's not yet working
• If you care about maintainability!
• Premature optimization is the
root of all evil (well, almost :)
43
THREAD PROGRAMMING
44
Which API?
• thread - traditional Python API
    import thread
    thread.start_new(doit, (5,))
    # (can't easily wait for its completion)
• threading - resembles Java API
    from threading import Thread  # and much more...
    t = Thread(target=doit, args=(5,))
    t.start()
    t.join()
45
Atomic operations
• Atomic:
    i = None
    a.extend([x, y, z])
    x = a.pop()
    v = dict[k]
• Not atomic:
    i = i+1
    if not dict.has_key(k): dict[k] = 0
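A non-atomic update such as i = i+1 can be protected with a lock. The following is a minimal sketch in Python 3; the Counter class and bump() helper are hypothetical.

```python
import threading

class Counter:
    """Hypothetical shared counter; the lock makes increment() safe."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:                 # read, add, and store as one unit
            self.value = self.value + 1

def bump(counter, times):
    for _ in range(times):
        counter.increment()

shared = Counter()
workers = [threading.Thread(target=bump, args=(shared, 10000))
           for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
```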
46
Python lock objects
• Not reentrant:
    lock.acquire(); lock.acquire()  # i.e. twice!
– blocks until another thread calls lock.release()
• No "lock owner"
• Solution:
– threading.RLock class
• (more expensive)
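The difference between the two lock types can be observed without hanging the program by asking for the lock in non-blocking mode. This is a sketch in Python 3; the variable names are illustrative.

```python
import threading

lock = threading.Lock()
lock.acquire()
# A second blocking acquire() here would hang, so ask without blocking.
got_it_again = lock.acquire(blocking=False)   # already held: not acquired
lock.release()

rlock = threading.RLock()
rlock.acquire()
reentered = rlock.acquire(blocking=False)     # same thread may re-enter
rlock.release()
rlock.release()                               # one release per acquire
```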
47
Critical sections
lock.acquire()
try:
    "this is the critical section"
    "it may raise an exception..."
finally:
    lock.release()
48
"Synchronized" methods
class MyObject:
    def __init__(self):
        self._lock = threading.RLock()
        # or threading.Lock(), if no reentrancy needed
    def some_method(self):
        self._lock.acquire()
        try:
            "go about your business"
        finally:
            self._lock.release()
49
Worker threads
• Setup:
    def consumer(): ...
    def producer(): ...
    for i in range(NCONSUMERS):
        thread.start_new(consumer, ())
    for i in range(NPRODUCERS):
        thread.start_new(producer, ())
    "now wait until all threads done"
50
Shared work queue
• Shared:
    import Queue
    Q = Queue.Queue(0)  # or maxQsize
• Producers:
    while 1:
        job = make_job()
        Q.put(job)
• Consumers:
    while 1:
        job = Q.get()
        finish_job(job)
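A complete, terminating version of this pattern might look as follows in Python 3, where the module is named queue. The DONE sentinel, the squared "job result", and the job counts are assumptions made for this sketch; the slide's loops run forever and leave shutdown open.

```python
import queue
import threading

NCONSUMERS = 3
DONE = object()                 # hypothetical sentinel meaning "no more jobs"

jobs = queue.Queue()            # unbounded, like Queue.Queue(0) on the slide
results = []
results_lock = threading.Lock()

def producer(n):
    for job in range(n):
        jobs.put(job)

def consumer():
    while True:
        job = jobs.get()
        if job is DONE:
            break
        with results_lock:
            results.append(job * job)   # stand-in for finish_job(job)

consumers = [threading.Thread(target=consumer) for _ in range(NCONSUMERS)]
for t in consumers:
    t.start()

maker = threading.Thread(target=producer, args=(20,))
maker.start()
maker.join()                    # all jobs are now in the queue

for _ in consumers:
    jobs.put(DONE)              # one sentinel per consumer
for t in consumers:
    t.join()                    # wait until all threads are done
```

Using threading.Thread instead of the low-level thread API is what makes the final wait straightforward: join() is available on every worker.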
51
Using a list as a queue
• Shared:
    Q = []
• Producers:
    while 1:
        job = make_job()
        Q.append(job)
• Consumers:
    while 1:
        try:
            job = Q.pop()
        except IndexError:
            time.sleep(...)
            continue
        finish_job(job)
52
Using a condition variable
• Shared:
    Q = []
    cv = Condition()
• Producers:
    while 1:
        job = make_job()
        cv.acquire()
        Q.append(job)
        cv.notify()
        cv.release()
• Consumers:
    while 1:
        cv.acquire()
        while not Q:
            cv.wait()
        job = Q.pop()
        cv.release()
        finish_job(job)
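In Python 3 the acquire()/release() pairs are usually written with a with statement, which also releases the condition if an exception occurs. The sketch below assumes a fixed number of jobs so that it terminates; the finished list stands in for finish_job().

```python
import threading

Q = []
cv = threading.Condition()
finished = []

def producer(n):
    for job in range(n):
        with cv:                # acquire() on entry, release() on exit
            Q.append(job)
            cv.notify()

def consumer(n):
    for _ in range(n):
        with cv:
            while not Q:        # re-check the queue after every wake-up
                cv.wait()
            job = Q.pop()
        finished.append(job)    # the work itself happens outside the lock

c = threading.Thread(target=consumer, args=(10,))
p = threading.Thread(target=producer, args=(10,))
c.start()
p.start()
p.join()
c.join()
```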
53
TIME FOR DISCUSSION
54