An Introduction to Pythonx - Observatoire Astronomique de
Download
Report
Transcript An Introduction to Pythonx - Observatoire Astronomique de
Observatoire Astronomique de Strasbourg
March 28, 2011 – One Day Session
http://astro.unistra.fr/~morgan/python/M11.pptx
Morgan Fouesneau
[email protected]
“Meals are not provided!”
why python? and good references
What, Why?
In astronomy
Should I throw away my C, Fortran codes?
Language power
Starting points when stuck in python
Language
Application
Not to scale…
“Hello World!” …
A programmer can be 5-10 times more productive than others
Java:
Fortran (90):
Python:
As easy as IDL (or Yorick):
public class HelloWorld{
public static void main(string[] args){
System.out.println(“Hello World!”);
}
}
PROGRAM HelloWorld
PRINT *, "Hello World!"
END PROGRAM HelloWorld
print “Hello World!”
print, “Hello World!”
Easy and Quick coding
Already a reference in
astronomy
Faster than IDL (especially
for data access)
More flexible than any
other language
Between fully compiled and
pure script program
Free
Quantitative study based on 80 programmers
implementing the same program
http://page.mi.fu-berlin.de/prechelt/Biblio/jccpprtTR.pdf
no reason to replace them (yet), rather cooperate
Working horses of science…
zillions of codes existing
optimized for runtime performance
no reason to replace them (at this time)
but...
rather cooperate (f2py, swig)
not really general purpose
relatively primitive datatypes
manual memory management
slow edit/compile/test cycle
Extremely popular these days
Interactive, great visualization, good libraries
but...
Not really general purpose
Not really ready for large scale programs
Not free, even expensive
We can still call them from Python
but you will never do that unless you have a gun over your head
Perl very strong at text manipulation
Ruby seems to be popular in scientific community in Japan
Shell scripts... you will not want to do shell scripting any more
once you know Python (unless forced)!
Easy to learn but yet powerful
Fast and easy solutions for everyday problems
Allows development of large and sophisticated software packages
Several astronomical program packages implemented using python
Excellent tool for scientists
Easy parallel programming
official Python website www.python.org
Simple interactive example:
>>> t
array([
# print all t-values (assumed to be already defined)
0.
, 0.8975979 ,
1.7951958, 2.6927937,
3.5903916, 4.48798951, 5.38558741, 6.28318531 ])
>>> t[2:4] # get sub-array with the elements at the indexes 2,3
array([ 1.7951958, 2.6927937])
>>> p = 1.8 # set our point
>>> abs(t-p) # how much do the t[:]-values differ from p?
array([ 1.8
, 0.9024021 , 0.0048042 , 0.8927937,
1.7903916, 2.68798951, 3.58558741, .48318531 ])
>>> abs(t-p) # how much do the t[:]-values differ from p?
array([ 1.8
, 0.9024021 , 0.0048042 , 0.8927937,
1.7903916, 2.68798951, 3.58558741, .48318531 ])
<...>
>>> # how large is the 4-th largest absolute distance
between the t[:]-values and p
>>> dt_m = sort(abs(t-p))[3]
>>> # where are the four elements of t[:]closest to p ?
>>> abs(t-p) <= dt_m # corresponds to a “where” command in IDL
array([False, True, True, True, True, False, False, False])
>>> # construct the sub-arrays; (1) get the 4 t[:]-values
>>> t_p = t[abs(t-p)<=dt_m]
>>> t_p
array([ 0.8975979,
1.7951958,
2.6927937,
3.5903916])
Programs and libraries implemented (mainly in Python):
BoA (Bolometer Analysis Package)
APECS (APEX control software)
Astro-WISE (widefield imaging system)
PyFITS (access to FITS files)
SAMPy (Samp communications)
...
Wrappers for existing programs and libraries:
PYGILDAS (GILDAS)
ParselTongue (AIPS)
PyRAF (IRAF)
PyMIDAS (MIDAS)
PyDS9 (DS9 via XPA messaging system)
…
Keep it simple
Embed “basic” languages were really needed (e.g. brut/intensive
comp.)
C/C++
• Jython
Java
Fortran
• Pyrex
• Weave
• Cython
• f2py
• pyfort
Python
Communicate with almost any application
Python reference book
Python from novice to pro
Available online
available in multiple languages.
http://diveintopython.org
Websites
Python
http://www.python.org
Ipython
http://ipython.scipy.org
SciPy
http://www.scipy.org
NumPy
http://numpy.scipy.org
Matplotlib http://matplotlib.sourceforge.net
Astropython http://www.astropython.org
AstroPy
http://www.astro.washington.edu/users/rowen/AstroPy.html
Mathesaurus http://mathesaurus.sourceforge.net quick reference for switching
F2py
http://www.scipy.org/F2py c / f77 / f90 glue (is part of NumPy)
PyTables
http://www.pytables.org/ all around hdf5 format
PyFITS
http://www.stsci.edu/resources/software_hardware/pyfits/ FITS
SAMPy
http://cosmos.iasf-milano.inaf.it/pandora/sampy SAMP in python
Mayavi
http:// code.enthought.com/projects/mayavi/ SAMP in python
“Daily Specials”
Who, where, what, how?
Audience
Observatoire: Toddler, grown up, elderly… anyone curious
Must have at least basic experience in one or more computer languages
Content: Give you what you need for everyday problems asap
First half of course: Basic language elements and basic visualization
some modules from standard library, elements of NumPy, SciPy and matplotlib
Second half of course: Advanced topics and installing packages
Complex scripting, exceptions, classes… finding packages and how to install them…
Avoid any personal installation
SSH connection to `ASTROMASTER`
Username: `guestpython`
Password: `python2011`
This account will be deleted within the next few days
Create your own folder in this user’s home directory
Installed packages
numpy, matplotlib, pyfits, python-tables, setuptools, tkinter, ipython
IDE: Interactive Development Environment
i.e.: Very assisted
Examples: Eclipse (ext. Python), Spyder…
Spyder*
Editor, Console,
History Log,
Variable explorer,
Project manager,
online help
* Scientific PYthon
Development EnviRonment
http://code.google.com/p/spyderlib/
“comprehensive environment for interactive
and exploratory computing.”
http://ipython.scipy.org/
“comprehensive environment for interactive and exploratory
computing.”
In [ ]: myVariable? prints documentation of the variable
In [ ]: myFunction?? prints the source code (if accessible)
“Classic stuffs and some specials.”
Numbers, variables, math and while
Python's numerical types
Operators for numerical types
Variables, variable names, and assignment to variables
The math module from the Standard Library
Loops with the while statement
>>> 2, 0
>>> -4711
>>> 07, 022
# Octal literals
>>> 0x9, 0xa, 0XF
# Hex literals
>>> 17 + 4
# expression
>>> 0xa - 2
>>> 23 ** (2+3)
# 23⁵
>>> 7 / 2, 7 / -2
# Int division
>>> 2.3
>>> -4.
>>> 0.1, .1
>>> 2.99E10, 6.62607e-27, -1e10
>>> 1.7 + .4
>>> 17. + 4
>>> 7./2., 7./2., 7./2.
>>> 1 + .4
>>> 2 - 3
>>> 25 * -17
>>> 2.**100
>>> 2/3, 2/-3
# Int division
>>> 2./3
# Float division
>>> 2.//3, 2.//3
# Floor division
>>> 7 % 3, 7 % -3
# Int Modulus
>>> 7. % 3, 7.2 % 2.5
# Modulus
Nb: (n//m)*m + (n%m) = n
>>> 17 & 5
# Bitwise AND
>>> 17 | 5
# Bitwise OR
>>> 17 ^ 5
# Bitwise EXOR
>>> ~17
# Bitwise complement
>>> 17 << 3
# Bitshift left
>>> 0x11 >> 2
# Bitshift right
>>> 2.+3j, 2-3J
# complex literals
>>> j
# will not work
>>> 1J
# but this will
>>> complex(1,2)
>>> 1+1j*2, (1+1j)*2
>>> (2.+3j).real, (2+3j).imag
>>> x = 2
# Assign variable
>>> x
# Display
>>> x + 3
# Use variable
>>> y = x + 3
# New variable
>>> x = x + 1
# Assign new value
>>> x += 1
# Shorthand; no x++
>>> x, y = 2, 3
>>> x, y = y, x
>>> x = y = z = 0
# !! WARNING !!
Case sensitive
>>> xy, Xy = 2, 3
Must begin w. letter
>>> 9x = 2
>>> x9 = 2
>>> _x = 2
# not working
# ok
# ok, but special
must not be keyword
>>> if = 2
and
is
try
while
with
yield
pass
in
finally
return
or
import
exec
continue
def
not
if
except
class
raise
None
global
else
break
lambda
from
elif
assert
del
as
for
print
>>> abs(-2.)
>>> abs(1+1j)
>>> max(1,2,3,4)
>>> min(1,2,3,4)
>>> hex(17), oct(-5)
>>> round(1.23456, 2)
>>> x = 2
# Assignment
>>> x == 2
# Testing equality
>>> x != 0
# True
>>> x < 1
# False
>>> (x >= 2.) == False # False
>>> 1 < x <= 3
>>> 3 < x <= 1
>>> 1 < x >= 0
# (1<x) and (x<=3)
>>> x = 0
>>> while x <= 10:
... print x, x**2
... x += 1
... <RETURN>
# Bool. expr.
# Indentation
>>> print x
# Unindented again
Top level must not be indented
It does not matter how much blanks you use, but:
Uniform within each block
Most people use 4 blanks per level but tabs work too.
http://www.python.org/doc/essays/styleguide.html
How much is the smallest / largest positive float?
Help: Use a while loop, compare to 0 and float(‘inf’)
>>> x = 1.
>>> while x > 0:
... x /= 2.
... print x
>>> x = 2.** 32
>>> while x != float(‘inf’):
... x *= 2.
... print x
How precise is your installation?
Help: Add smaller and smaller numbers to 1 while comparing the sum to 1
>>> x = 1.
>>> while 1. + x > 1.:
... x /= 2.
... print x
How much is the smallest / largest positive float?
Help: Use a while loop, compare to 0 and float(‘inf’)
>>> x = 1.
>>> while x > 0:
... x /= 2.
... print x
>>> x = 2.** 32
>>> while x != float(‘inf’):
... x *= 2.
... print x
4.94065645841e-324
How precise is your installation?
Help: Add smaller and smaller numbers to 1 while comparing the sum to 1
>>> x = 1.
>>> while 1. + x > 1.:
... x /= 2.
... print x
How much is the smallest / largest positive float?
Help: Use a while loop, compare to 0 and float(‘inf’)
>>> x = 1.
>>> while x > 0:
... x /= 2.
... print x
>>> x = 2.** 32
>>> while x != float(‘inf’):
... x *= 2.
... print x
4.94065645841e-324
8.98846567431e+307
How precise is your installation?
Help: Add smaller and smaller numbers to 1 while comparing the sum to 1
>>> x = 1.
>>> while 1. + x > 1.:
... x /= 2.
... print x
1.11022302463e-16
“A complex system that works is invariably found to
have evolved from a simple system that worked”
– John Gall
Large scale program
Python's package notion (‘import’)
Module, Package
Differences between import and execfile or %run
Documentation
Run script from Unix shell with $> python myscript.py
Alternative: Make script executable
$> chmod u+x myscript.py
Add “ShiBang” line at the top of the file:
Execute: $> ./myscript.py
Or run script from within Python
>>> execfile(“myscript.py”)
Or within IPython:
In [1]: %run myscript[.py]
#!/usr/bin/python
Multiline codes (functions, classes, …) or interactive command list
Documentation
Usually in the __doc__ function within the “module” (file)
Ex:
def __doc__:
... “””
this is my documentation for this module.
usage: >helloWorld()
... “””
Directly in each block:
def helloWorld():
... “”” This is a dummy function “””
... print “hello World!”
Usage:
>>> import myscript
In[1]: %run myscript
>>> help(myscript)
In[2]: helloWorld?
>>> help(myscript.helloWorld) In[3]: helloWorld??
__doc__ : introduce documentation
In general, any __<name>__ is a special function/variable
Select instructions that are executed only when running the
script and not when import is called
everything in the “__main__” block
__main__ : Top-level script environment
if __name__ == “__main__”:
<...>
Reading a photometric dataset to generate HR diagram
Python's package notion (‘import’)
Mandatory packages for doing science: numpy, matplotlib
File access: reading from / writing to files
Lists, array and indexes, slices
For loop statements: while, for
Functions: classical and lambda functions
Basic visualization: plot, scatter, contours, colormaps
First python scripts
Read the file
Manipulate the data
Build the figures
File description: data.ascii
made of a header (lines start with #)
6424 lines into 9 columns (space as separator)
First solution: elbow-grease (like in C)
f = open('data.ascii', 'r') # 'r' by default
Reading: f.read(), f.readline(), f.readlines()
f.close()
Reading is done by simply looping over the lines
Long solution, you rarely need to be at this low-level.
Learning interest…
#include <stdio>
#include <stdlib>
If you had to do it with C code…
int main(void)
{
int ncols = 9, nlines = 6424, i, j;
string filename = “data.ascii”;
FILE *file;
float *data = (float *) malloc( ncols * nlines * sizeof(float) );
file = fopen(filename, "r");
for ( i = 0; i < nlines; i++ ){
for ( j=0; j < ncols; j++ ){
fscanf( file, "%f ", data[j + ncols*i] );
}
}
fclose(file);
...
free(data);
}
…I am not sure this works though
>>>
>>>
>>>
>>>
>>>
>>>
import numpy
# low-level does not mean very-low
ncol = 9
nrow = 6424
data = numpy.array(ncol, nrow) # common array (one type of data)
l = '#'
while l[0] == '#': #Skip the comments
... l = f.readline()
"""Warning: at the end of the loop, l contains the first data row """
>>> for i in range(nrow): # <=> i in [0, ..., 6423]
... """ We should check whether ‘\n’ is present at the end of l
... and cut it out. In any other language, we may have something like:
...
if (l[len(l)-1] == '\n') { l = l[0:len(l)-2] }
... But not in python: """
... if l[-1] == '\n':
... l = l[:-1]
... l = l.split()
# split according to spaces
... for k in range(ncol): # Now fill data array
... """Python only reads strings that have to be converted then """
... data[k,i] = float(l[k])
... l = f.readline() # go on to the next line
>>> f.close()
A bit more in a reptile way:
>>> data = [] # declare an empty list
>>> for l in f: #loop over each line of f (don’t care about #lines)
... if l[0] != '#':
... # Magical `inline` for loop, faster than a classic `for`
... cl = [ float(k) for k in l[:-1].split()]
... data.append( cl ) # add processed line to data
>>> f.close()
""" data constains n-cells of m-values. In order to access the
data by columns, we either transpose the "matrix", or create a
sub-list:"""
>>> d = numpy.transpose(data)
>>> bflux = d[0]
>>> blux = [ k[0] for k in data ]
Finally in python: (think functions)
>>> def readAscii(fichier, transpose = False):
""" Function that will read the file """
... f = open('data.ascii')
... # lambda function <=> math function, x-->f(x)
... convertToFloat = lambda x: [ float(k) for k in x ]
... data = [ convertToFloat( l[:-1].split() ) \
for l in f if l[0] != '#' ]
... f.close()
... if transpose:
... return numpy.transpose(data)
... else:
... return data
>>> data = readAscii(f)
>>> d
= readAscii(f, transpose = True) # pédagoqique ;)
BUT: Someone already coded that for you
Either numpy or scipy or matplotlib or pylab
>>> data = numpy.loadtxt('data.ascii')
Variations:
>>> data = numpy.loadtxt('data.ascii', comments='#',
skiprows=0, delimiter=None
)
>>> bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
numpy.loadtxt('data.ascii', unpack=True)
loadtxt() is usually fine
But sometimes can be limited for big files
Other options are available
e.g. ATpy, asciitable, …
http://astropython.org/resources for other references
ATpy is also able to read and write fits, votable, sql …
Calibrate the dataset:
We apply operations to the whole arrays
>>> bmag = bmag - 1.3
>>> vmag -= 2.5
# classic
# better for memory space
sereval possibilities for visualization
2D: matplotlib/pylab (other exists)
3D, matplotlib (still young), mayavi2 (openGL), …
We will use matplotlib/pylab in this workshop
>>> from numpy import loadtxt # only what we need
>>> import pylab as plt
# import & alias to plt
>>> bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
loadtxt('data.ascii', unpack=True)
>>> bmag += 1.3; vmag += 2.5
# zeropoint correction
Plot star positions:
>>> plt.figure() # declare a new figure
>>> plt.plot(x, y, ',', color='0.0') # do the plot
# do not hesitate with help cmds: `plt.plot?`
>>> plt.xlabel('X (in pixels)') # set labels
>>> plt.ylabel('Y (in pixels)')
>>> plt.grid()
>>> plt.show()
# display (when not auto)
Plot star positions:
>>> plt.figure() # declare a new figure
>>> plt.plot(x, y, ',', color='0.0') # do the plot
# do not hesitate with help cmds: `plt.plot?`
>>> plt.xlabel('X (in pixels)') # set labels
>>> plt.ylabel('Y (in pixels)')
>>> plt.grid()
>>> plt.show()
# display (when not auto)
Build HR diagram of the stars:
>>> plt.figure()
>>> plt.scatter(bmag-vmag, vmag)
# `plt.plot` would also work ;)
>>> plt.xlabel(r'B$-$V')
>>> plt.ylabel(r'V')
>>> plt.xlim(1.9,3.4)
>>> plt.ylim(6.7,15)
>>> plt.ylim(plt.ylim()[::-1])
# reverse y-axis, or also `plt.ylim(15, 6.7)`
>>> plt.savefig(‘myfigure.png’)
# can export into eps,ps,pdf,png,svg…
>>> plt.show()
Build HR diagram of the stars:
>>> plt.figure()
>>> plt.scatter(bmag-vmag, vmag)
# `plt.plot` would also work ;)
>>> plt.xlabel(r'B$-$V')
>>> plt.ylabel(r'V')
>>> plt.xlim(1.9,3.4)
>>> plt.ylim(6.7,15)
>>> plt.ylim(plt.ylim()[::-1])
# reverse y-axis, or also `plt.ylim(15, 6.7)`
>>> plt.savefig(‘myfigure.png’)
# can export into eps,ps,pdf,png,svg…
>>> plt.show()
Now explore matplotlib/pylab
Generate a density plot of this HR diagram
numpy.where,
pylab.histogram2d
pylab.contour and/or pylab.contourf
pylab.figure and/or pylab.subplot
import numpy, pylab
# get the data
bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
numpy.loadtxt('data.ascii', unpack=True)
bmag += 1.3
vmag += 2.5
# Select space region of interest
ind = numpy.where( ( (bmag-vmag) > -0.5 ) &
( (bmag-vmag) < 2.
) &
(
vmag
> 10. ) &
(
vmag
< 22. )
)
#do the histogram
h = pylab.histogram2d( bmag[ind]-vmag[ind], vmag[ind],
bins=100
)
image = h[0]
extent=[ h[1][0], h[1][-1], h[2][0], h[2][-1] ] #axes limits
#display part
ax = pylab.subplot(121) #Create a subplot, 1 row 2 cols, #1
ax.imshow( image.transpose(), extent=extent,
origin='lower', aspect='auto‘, cmap=pylab.cm.Blues)
ax.set_xlim(-0.5,1.0) ; ax.set_ylim(20, 11)
ax.set_ylabel(r'V‘)
; ax.set_xlabel(r'B$-$V')
ax.set_title('histogram2d')
ax = pylab.subplot(122) #Create a subplot, 1 row 2 cols, #2
levels = [0.1, 3.,
6.,
9., 12., 15., 18., 21.]
ax.contourf( image.transpose(), extent=extent,
levels=levels,cmap=pylab.cm.Blues )
ax.set_xlim(-0.5,1.0); ax.set_ylim(20, 11)
ax.set_ylabel(r'V') ; ax.set_xlabel(r'B$-$V')
ax.set_title('contourf')
pylab.savefig('m3_ex2.png')
ax = pylab.gcf().get_axes() # get subplots
ax1 = ax[1]
# second subplot (contourf)
ax1.set_ytitle('')
# delete tick labels on the y-axis
pylab.setp(ax1.get_yticklabels(), visible=False)
# shrink space between the plots
pylab.subplots_adjust(wspace=0.001)
Use some more magic and add minor ticks, change fonts, ...
(battery included in figure.py)
“It’s a trap!”
Data: values, pointers and references
Location, affectation in C
Location, affectation in Python
Differences between equality and identity
int x, y;
Memory location x
type: int
Memory location y
type: int
?
?
Memory location x
type: int
Memory location y
type: int
int x, y;
?
?
x = 2;
2
?
Memory location x
type: int
Memory location y
type: int
int x, y;
?
?
x = 2;
2
?
x = 3;
3
?
Memory location x
type: int
Memory location y
type: int
int x, y;
?
?
x = 2;
2
?
x = 3;
3
?
y = x;
3
3
Value copied
Variables
Objects
(names)
(data)
>>> planet = “Pluto”
>>
Planet
“Pluto”,
type: str
Variables
Objects
(names)
(data)
>>> planet = “Pluto”
>>> planet = “Tiamat”
Planet
“Pluto”,
type: str
“Tiamat”,
type: str
Variables
Objects
(names)
>>> planet = “Pluto”
(data)
>>> planet = “Tiamat”
Planet
>>> legend = planet
legend
“Tiamat”,
type: str
Variables
Objects
(names)
>>> planet = "Pluto“
(data)
>>> planet = “Tiamat“
Planet
>>> legend = planet
legend
>>> del planet
“Tiamat”,
type: str
Variables
Objects
(names)
>>> planet = "Pluto“
(data)
>>> planet = “Tiamat“
>>> legend = planet
legend
>>> del planet
>>> del legend
“Tiamat”,
type: str
>>> l = [1,2,3]
>>> m = [1,2,3] # l and m different objects
>>> l == m # Equality (of values) tested with ==
>>> l is m # Identity (of objects) tested with is
>>> l[0] = 42
>>> print l
>>> print m # m[0] = ?
>>> l = [1,2,3]; m = l # l == m ? l is m ?
>>> l = numpy.array([1,2,3])
>>> m = l.copy() # l and m different objects
>>> l == m # Equality (of values) tested with ==
>>> l is m # Identity (of objects) tested with is
>>> l[0] = 42
>>> print l
>>> print m # m[0] = ?
Particular variables in Python: tuples, lists, arrays, & dictionaries
tuples, lists, arrays
dictionaries
Looping philosophy
Use case: Find out which the 10 most frequent words in a text are.
“It feels like somebody is using your head for origami, but it
doesn't bend that way…”
Tuples and lists are ordered sequences of values of different types
Tuples are immutable: does not support item assignment
Lists are mutable
In [1]: t = (1,2,3); u = [1,2,3]
In [2]: t[0], t[-1], t[1:], u[0], u[-1], u[1:]
Out[2]: (1, 3, (2, 3), 1, 3, [2, 3])
In [3]: t[0] = 10
TypeError : 'tuple' object does not support item
assignment
In [4]: u[0] = 10; u.append(4)
#ok
In [5]: v = list(t)
In [6]: del u[0]
# v is now a list of t-values
# delete one element
“It feels like somebody is using your head for origami, but it
doesn't bend that way…”
In [1]: t = ([1,2,3],[4,5])
In [2]: t[0] = 42 # list is immutable
TypeError : ‘tuple’ object does not support item
assignment
In [3]: t[0][0] = 42; print t
Out[3]: ([42,2,3],[4,5])
In [4]: t+t
# same result with lists
Out[4]: ([42,2,3],[4,5],[42,2,3],[4,5])
In [4]: t*2
# duplicate!! (also with lists)
Out[4]: ([42,2,3],[4,5],[42,2,3],[4,5])
Kind of named lists or like c-structures
Dictionaries are unordered, mutable set of (key, value) pairs
In [1]: d = {'spam': 2, 'eggs': 3}
Order is not preserved
Use of keys instead of indexes to access data (kind of named array)
In [2]: d[‘spam’]
Out[2]: 2
In [3]: d[‘spam’] = 10000
In [4]: d[‘bacon’]
KeyError: ...
Kind of named lists or like c-structures
In [2]: d[‘spam’]
Out[2]: 2
In [3]: d[‘spam’] = 10000
In [4]: d[‘bacon’]
KeyError: ...
Either check in advance: >>> d.has_key('bacon')
Or use get method with def. val. >>> d.get('bacon', 42)
Adding items: >>> d['bacon'] = 13
Removing items: >>> del d['spam']
Keys may be of any immutable type but unique (value are any)
[1,2]
(1,2) ‘str’, val…
To be or not to be iterable…
Loops use while or for constructions
while constructions are “classical”
For construction aims at iterating over an object
Lists, tuples and dictionaries are iterable objects
Any object with a __iter__ function is iterable
>>> for i in range(10): … # range(10) = [0,…,9]
>>> d = {'spam': 2, 'eggs': 3, 'bacon': 5}
>>> for k in d:
... print k, d[k]
>>> for k in d.keys(): # Clearer equivalent version
... print k, d[k]
>>> for k in sorted(d.keys()): # Sorted loop
... print k, d[k]
Special for-loop construction
>>> d = {'spam': 2, 'eggs': 3, 'bacon': 5}
>>> val = []:
>>> for k in d.keys():
... if d[k] > 2.5:
... val.append( (k,d[k]) )
>>> v = [ (k,d[k]) for k in d.keys() if d[k] > 2.5 ]
Automatic and optimized allocation, hence faster!
Find out which the 10 most frequent words in a text are.
“The First Men in the Moon” by H. G. Wells (Herbert Georges)
http://www.gutenberg.org/ebooks/1013 (UTF-8 plain text format)
Help: use dictionaries or lists or tuples, and for-loops
Useful help:
import re – this let you use regular expressions (see in particular: re.sub)
Open the text file, read the full text into a variable
Filter for `\r\n` sequences
Be careful of case sensitivity, punctuation and so on… during word counts
Some commands (not all needed):
▪ <str>.split, <str>.join, <str>.upper, <str>.lower, <str>.replace …
▪ <list>.count, re.sub, numpy.argsort, numpy.unique ...
codesources/textCount.py
Subplots & linked axes
<…>.subplot(…,sharex=ax)
Power Law !!!
codesources/textCount.py
def count():
f
= open('firstMenOnTheMoon.txt') #open file
# =============== Get all lines and set to lower case ================
txt
= [ l.lower() for l in f if l != '\r\n']
f.close() #file can be closed now
txt
= ''.join(txt)
# ================== filter & replace punctuation ====================
txt
= txt.replace('\r\n', ' ').replace('--', ' ')
w
= re.sub('[.,:\'"!?;()]',' ', txt)
w
= w.split()
# ========================== Do the stats ============================
#words = set(w) # also works, generate a sequence of unique words
words = numpy.unique(w)
counts = [ w.count(kwd) for kwd in words ]
sind
= numpy.argsort(counts)[::-1]
sc
= numpy.array(counts)[sind]
sw
= numpy.array(words)[sind]
print “Statistics:\ncontains %d words\n“ % (len(w)
for k in range(10):
print "Word '%s' used '%d' times" % (sw[k], sc[k])
return sw,sc
Fitting a black body model on the Sun’s spectrum
Fitting data with arbitrary functions
Using scipy:
scipy.optimize, scipy.optimize.leastsq
reading FITS files with pyfits
Lambda functions
We will fit a blackbody model onto the solar spectrum.
First, we need to code the blackbody model
Planck constant:
speed of light:
Boltzmann constant:
h = 6.626069e-34 (J.s)
c = 2.99792458e8 (m.s)
k = 1.3807e-23 (J/K)
1 parameter: T
Read the Solar spectrum: sun.fits
Optimize using a least square method
import numpy
from numpy import exp
#
h
c
k
Module global variables
= 6.626069e-34 # Planck (J.s)
= 2.99792458e8 # speed of light (m.s)
= 1.3807e-23
# Boltzmann (J/K)
def blackbody(wave, temp):
""" wave in angstroms, """
BBCONST = h*c/(k * 1.e-10)
exparg = BBCONST/(numpy.array(wave)*temp)
return 2*h*c**2*1e50/( wave**5 *
(numpy.exp(exparg)-1.) )
from scipy import optimize
import pylab
Def fakeData():
#Pure Black body with noise
dx = numpy.linspace(1000,20000, 1000)
dy = blackbody(dx, 7000)
dy /= dy.sum()
dy += numpy.random.normal(0,1e-4, len(dy))
return dx, dy
# The target function
fitfunc = lambda
p, x: p[0]*blackbody(x,p[1])
# Distance to the target function
errfunc = lambda p, x, y: fitfunc(p,x)-y
dx, dy = fakeData() # Test case
p0 = [1., 6000] # Initial guess for the parameters
p1, success = optimize.leastsq(errfunc, p0[:],
args=(dx, dy)
)
pylab.figure()
pylab.plot(dx,dy, ',')
pylab.plot(dx,fitfunc(p1,dx), color='#ff0000')
pylab.xlabel(r'Wavelength ($\rm{\AA}$)')
pylab.ylabel(r'Flux (Arbitrary Units)')
A short digression using pyfits
In [1]: h = pyfits.open('sun.fits')
In [2]: h
Out[2]:
[<pyfits.core.PrimaryHDU object at 0xae03bcc>,
<pyfits.core.BinTableHDU object at 0xb9f57ec>]
In [3]: print h[1].header
XTENSION= 'BINTABLE'
/ binary table extension
<...>
In [4]: h[1].data.field('flux')
Out[4]:
array([ 6.19000000e-05, 5.61400000e-04, 4.90100000e-03, ...,
2.95000000e-10, 1.01000000e-10, 3.38000000e-12])
In [5]: x = h[1].data.field('wavelength')
In [6]: y = h[1].data.field('flux')
Using pyfits
def getSunData():
import pyfits
t = pyfits.open(‘sun.fits’)
x = h[1].data.field('wavelength')
y = h[1].data.field('flux')
t.close()
return x, y
Now we can go back to our problem and fit the Solar spectrum
Adapt the previous method to the new dataset
Plot the result
Look at the statistics
What is the official temperature of the Sun?
x, y = getSunData() # Sun spectrum Case
ind = numpy.where((x>1e3) & (y< 2.0e4))
p1, success = optimize.leastsq(errfunc, p0[:],
args=(x[ind], y[ind]))
e1 = errfunc(p1, x[ind], y[ind])
e1 *= e1
print " Norm = %g, Temperature = %f, error=%f " %
(p1[0],p1[1], numpy.sqrt(e1.sum()) )
pylab.figure()
pylab.plot(x[ind], y[ind])
pylab.plot(x[ind], fitfunc(p1, x[ind]), color='#ff0000')
pylab.xlabel(r'Wavelength ($\rm{\AA}$)')
pylab.ylabel(r'Flux (Arbitrary Units)')
Norm = 8.91261e-14, Temperature = 5466.278346, error=4.390300
Norm = 6.65493e-14, Temperature = 5800.000000, error=5.359789
Glue It All Together With Python
With SAMPy
Simple example: (requires mytables, mysamp and dependencies)
In [1]: import mysamp, mytables
In [2]: c = mysamp.Client()
[SAMP] Warning (2011-03-21T11:08:53.322297): Another
SAMP Hub is already running. Start-up process
aborted.
<=> Aladin & Topcat launched before already providing a hub
In [3]: t = mytables.load('toto.csv')
Auto-detected input type: csv
In [4]: c['t'] = t
Auto-detected input type: fits
Overwrite existing file 't.fits'.
Data exported into t.fits
Auto-detected input type: fits
Objects oriented codes, graphical interface, Fortran-C interaction
Everything is object in python
int, float, …, variables, functions, classes, modules …
“a class is a construct that is used as a blueprint to create
objects of that class.” – Hum…
A class is a definition of a set of variables associated with functions
Equivalent to c/fortran structures + functions
An object is an instance of one class
Example: a class “Vector”
A vector of 3 coordinates
Function/”methods”: addition, scalar product, projection …
Coding a Vector class in Python:
class Vector(object): # “object” is the top-lvl type
def __init__(self, x,y,z): # constructor
self.x, self.y, self.z = x,y,z
def sum(self, p):
# addition: p1+p2
return Vector(self.x+p.x, self.y+p.y, self.z+p.z)
def scalarProd(self,p):
# scalar product p1*p2
return self.x*p.x + self.y*p.y + self.z*p.z
Instanciate Vector objects:
>>>
>>>
>>>
>>>
a = Vector(1,0,0)
b = Vector(0,1,0)
c = a.scalarProd(b)
print c # ???
What about print a?
Print is equivalent to a.__str__ (“convert a to string”)
Hence you need to define (overload) a __str__ method
Example:
def __str__(self):
return “Vect(%f,%f,%f)” % (self.x, self.y, self.z)
Sometimes it is better to use ‘+’ and ‘*’
Reading and usage are easier
You need to define operator functions:
__add__, __sub__, __mul__, __div__,
__pow__ …
More at http://docs.python.org/reference/datamodel.html
Now comes the good part: exercise
Try to implement a complete “Vector class” of your own
Manage a string representation of the vectors
Manage the operators ‘+’, ‘– ’ as common sense
Implement a scalar product solution based on ‘*’
Implement vector product based on ‘**’ (power operator)
class Vector(object):
def __init__(self, x, y=None, z=None):
# Class constructor: from list/tuple or 3 values
if hasattr(x,'__len__'):
assert len(x) == 3
self.x, self.y, self.z = x
else:
assert((y != None) & (z != None))
self.x, self.y, self.z = x,y,z
def __repr__(self):
# String repr, called when >>> Vector([1,2,3]) <CR>
s = self.__str__()+", "
s += object.__repr__(self)
return s
def __str__(self):
# String Conversion
return "Vect(%s)" % [self.x,self.y,self.z]
def __add__(self, p):
if isinstance(p, Vector): # if p Vector, sum of p1*p2
return Vector(self.x+p.x, self.y+p.y, self.z+p.z)
else: # if not p Vector
return Vector(self.x+p, self.y+p, self.z+p)
def __mul__(self,p):
if isinstance(p, Vector): # scalar product p1*p2
return self.x*p.x + self.y*p.y + self.z*p.z
else:
return Vector(self.x*p, self.y*p, self.z*p)
def __pow__(self,p):
if isinstance(p, Vector): # vector product p1**p2
x = self.y*p.z-self.z*p.y
y = self.z*p.x-self.x*p.z
z = self.x*p.y-self.y*p.x
return Vector(x,y,z)
else:
return Vector(self.x**p, self.y**p, self.z**p)
In [1]:
In [2]:
In [3]:
In [4]:
Out[4]:
import vector
a = vector.Vector(1,0,0)
b = vector.Vector([0,1,0])
a+1
Vect([2, 1, 1]), <vector.Vector object at 0x9544bcc>
In [5]: print a+b
Vect([1, 1, 0])
In [6]: print (a+1)**2
Vect([4, 1, 1])
In [7]: print a**b, b**a
Vect([0, 0, 1]) Vect([0, 0, -1])
Graphical User Interface from python
btw JK, may I be cannibal? ;)
Many classical libraries are available
Tcl/TK, QT, Wx, WxGTK …
An example is better than words…
Example taken from Joachim K. homeworks
“Applet” to show the physics of a 2 body gravitational interaction
One body is fixed and the second “orbits” around
Trajectory of the second body is plotted
▪ Depending on its initial position and velocity
Source in `OneBody.py`
C/C++, Fortran interaction
Python presents many advantages
However, too slow for highly intensive numerical computations
You can interface Python with C or Fortran codes
Best of all worlds
Pre- and post- processing in Python
Intensive part in Fortran (or in C)
F2py is one solution (many others exists)
Requires NumPy (> = 1.3) and Scipy_distutils (>= 0.2.2)
A few lines in your fortran source code
Interface file (F2py is able to auto-generate simple interface that you can edit)
Example (F90/95):
subroutine norme(a,b,c)
real(8), intent(in) :: a,b
Real(8), intent(out) :: c
c = sqrt( a*a + b*b )
end subroutine norme
Generate the interface with f2py
$> f2py -c norme.f90 -m vect \
--fcompiler=gfortran --f90flags=-03
<norme.f90>
From a python shell:
>>> import vect
>>> c = vect.norme(3,4)
>>> print c
5.0
Documentation is also generated by f2py:
>>> print vect.norme.__doc__
norme – Function signature:
c = norme(a,b)
Required arguments:
a : input float
b : input float
Return objects:
c: float
Works with lists and numpy.arrays
subroutine norme(a,c,n)
integer :: n, i
real(8), dimension(n), intent(in) :: a
!f2py optional, depend(a)::n=len(a)
real(8), intent(out) :: c
real(8) :: sumc
sumc = 0.
do i=1,n
sumc = sumc + a(i)*a(i)
end do
c = sqrt( sumc )
end subroutine norme
From a python shell:
>>> from vect import *
>>> a = [2,3,4] # a given list
>>> type(a)
<type ‘list’>
>>> norme(a)
5.3851648071345037
>>> from numpy import array
>>> a = array([2,3,4]) # an equivalent numpy.array
>>> type(a)
<type ‘numpy.ndarray’>
>>> norme(a)
5.3851648071345037
You can do much more
Callbacks are available (call python from fortran)
Encapsulate fortran within python objects
Compile fortran code inside python scripts
…
Visit http://www.scipy.org/F2py
C/C++ wrapper for python and many more (19 languages!)
Please don’t shoot… I haven’t use it yet!
Just trying to answer some of your questions
http://www.swig.org
SWIG is an interface compiler
connects programs written in C and C++
with scripting languages such as Perl, Python …
works by taking the declarations found in C/C++ header files
generates the wrapper code that scripting languages need to access
the underlying C/C++ code.
SWIG currently generates wrapper code for nineteen
different target languages:
Allegro CL, C#, CFFI, CLISP, Chicken, D, Go, Guile, Java, Lua, Modula-3,
Mzscheme, OCAML, Octave, Perl, PHP, Python, R, Ruby, Tcl, UFFI
So you want to get going in a hurry? To illustrate the use of SWIG,
suppose you have some C functions you want added to Python
/* File : example.c */
#include <time.h>
double My_variable = 3.0;
int fact(int n) {
if (n <= 1) return 1;
else return n*fact(n-1);
}
int my_mod(int x, int y) {
return (x%y);
}
char *get_time() {
time_t ltime;
time(<ime);
return ctime(<ime);
}
Interface file
Now, in order to add these files to your favorite language, you
need to write an "interface file" which is the input to SWIG. An
interface file for these C functions might look like this :
/* example.i */
%module example
%{ /* Put header files here or function declarations */
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
%}
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
Building a Python module
Turning C code into a Python module is also easy.
Simply do the following:
$> swig -python example.i
$> gcc -c example.c example_wrap.c \
-I/usr/local/include/python2.1
$> ld -shared example.o example_wrap.o -o _example.so
We can now use the Python module as follows :
>>> import example
>>> example.fact(5)
120
>>> example.my_mod(7,3)
1
>>> example.get_time()
'Sun Feb 11 23:01:07 1996’
More at http://www.swig.org
Mayavi2, Blender
A short example from the official website:
def getData():
import numpy as np
# Create data, z Gaussian function of x and y.
np.random.seed(12345)
x = 4*(np.random.random(500) – 0.5)
y = 4*(np.random.random(500) – 0.5)
f = lambda (x, y): np.exp(-(x**2 + y**2))
z = f(x, y)
return (x,y,z)
from enthought.mayavi import mlab
mlab.figure(1, fgcolor=(0, 0, 0), bgcolor=(1, 1, 1))
# get Dataset
x, y, z = getData()
# Visualize the points
pts = mlab.points3d(x, y, z, z, scale_mode=‘none’,
scale_factor=0.2)
# Create and visualize the mesh
mesh = mlab.pipeline.delaunay2d(pts)
surf = mlab.pipeline.surface(mesh)
mlab.view(47, 57, 8.2, (0.1, 0.15, 0.14))
mlab.show()
This example: how you can use Mayavi interactively to visualize
numpy arrays while doing numerical work with scipy.
It assumes that you are now familiar with numerical Python tools.
Use-case: study the trajectories of a particle in a potential.
very common problem: orbits, particles…
visualization of the potential and the trajectories
The potential:
a periodic lattice, immersed in a parabolic confinement.
We will shake this potential and see how the particle jumps from a
hole of the lattice to another.
The parabolic confinement is there to limit the excursions of the
particle
More at http://code.enthought.com/projects/mayavi
Code in the appendixes
Who said science is not fun?
Blender is a free open-source 3D content creation suite
Movies are being made with it (e.g. Sintel, available on youtube)
A suite like 3D Studio Max: 3D scenes, movies, animations
Full integration of python: scripting scenes and movies.
2.5x version: Full UI in python (calculations might be in C)
And you can even fly-in !!
Non scientific model of M31 (Andromeda Galaxy)
https://bitbucket.org/quiark/galaxy/.../galaxy2.blend
Just imagine…
N-body simulation of the Antennae Galaxy from Florent Renaud
Python to build the scene
Blender to fly-in
Setuptools mainly! (easy_install, pip)
How to install python?
Apt-get install python / fink install python / python-x.xx.msi(ish?)
About the “packages”:
First install `setuptools` (or `pip` equiv., python package index)
▪ Apt-get / fink / windows binary
Then easy_install [opts] <package name>
▪ E.g. easy_install numpy, easy_install pyfits…
▪ Is also able to update packages (-U option)
Quid if I am not root of my computer?
VirtualEnv is the fix. http://pypi.python.org/pypi/virtualenv
▪ Isolated python envir. (bottle)
You got out of the egg ...
You have now the materials to go on by yourself…
Soon you will dance to the flute songs
One day is quite short to show everything…
This example: how you can use Mayavi interactively to visualize
numpy arrays while doing numerical work with scipy.
It assumes that you are now familiar with numerical Python tools.
Use-case: study the trajectories of a particle in a potential.
very common problem: orbits, particles…
visualization of the potential and the trajectories
The potential:
a periodic lattice, immersed in a parabolic confinement.
We will shake this potential and see how the particle jumps from a
hole of the lattice to another.
The parabolic confinement is there to limit the excursions of the
particle
More at http://code.enthought.com/projects/mayavi
Use the mlab module to interactively visualize the data.
For this it is best to use an interactive Python shell
Mayavi2 application has a built-in interactive shell also
Let us visualize the 3D isosurfaces of the potential:
>>> from enthought.mayavi import mlab
>>> mlab.contour3d(X, Y, Z, V)
We can interact with the visualization
Rotation, zoom, …
In particular vary the iso-surfaces (maya-tree)
We want to study the motion of a particle in this potential
We derive the corresponding forces from the gradient of the potential.
We create a gradient function
def gradient(f, x, y, z, d=0.01):
""" Return the gradient of f in (x, y, z). """
fx = f(x+d, y, z); fx_ = f(x-d, y, z)
fy = f(x, y+d, z); fy_ = f(x, y-d, z)
fz = f(x, y, z+d); fz_ = f(x, y, z-d)
return ( (fx-fx_)/(2*d), (fy-fy_)/(2*d),
(fz-fz_)/(2*d) )
Let us visualize the corresponding vector field.
only along a cut for X=50, and every three points on our grid
Vx, Vy, Vz = gradient( V, X[50,::3,::3], Y[50,::3,::3],
Z[50,::3,::3]
)
mlab.quiver3d( X[50,::3,::3], Y[50,::3,::3], Z[50,::3,::3],
Vx, Vy, Vz, scale_factor=-0.2, color=(1,1,1) )
$> ipython -wthread
import numpy as np
def V(x, y, z):
""" Potential: A 3D sinusoidal lattice with a
parabolic confinement. """
return np.cos(10*x) + np.cos(10*y) + p.cos(10*z)
+ 2*(x**2 + y**2 + z**2)
We would like to see what it looks like in 3D.
To do this we create a 3D grid of points, and sample it on these later
X, Y, Z = np.mgrid[-2:2:100j, -2:2:100j, -2:2:100j]
A good view of the potential:
turning off auto contours
choosing -0.5 and 15
We can also change the colors used
by selecting a different LUT (Look Up Table).
select a paired as it separates well levels.
To get a better view of the potential:
display more contours, but closed contours hide their interior.
One solution: a cut plane.
Right-click on the IsoSurface node and add a ScalarCutPlane. You can
move the cut plane by clicking on it and dragging.
Add a colorbar:
mlab.colorbar(title=“Potential”)
We can use scipy to integrate the trajectories.
first define a dynamical flow of the system
The flow is used by every ODE (ordinary differential equation) solver
The dynamics is made of the force deriving from the potential
def flow(r, t):
""" The dynamical flow of the system """
x, y, z, vx, vy, vz = r
fx, fy, fz = gradient(V, x-.2*np.sin(6*t),
y-.2*np.sin(6*t+1),
z-.2*np.sin(6*t+2))
return np.array( (vx, vy, vz, -fx - 0.3*vx,
-fy - 0.3*vy,
-fz - 0.3*vz)
)
Now we can integrate the trajectory:
from scipy.integrate import odeint
R0 = (0, 0, 0, 0, 0, 0)
# Initial conditions
t = np.linspace(0, 50, 500) # Time Sampling
R = odeint(flow, R0, t)
Finally we can plot the trajectories:
x, y, z, vx, vy, vz = R.T
trajectory = mlab.plot3d(x, y, z, t,
colormap=“hot”,tube_radius=None)
mlab.colorbar(trajectory, title=“Time”)