An Introduction to Pythonx - Observatoire Astronomique de

Download Report

Transcript An Introduction to Pythonx - Observatoire Astronomique de

Observatoire Astronomique de Strasbourg
March 28, 2011 – One Day Session
http://astro.unistra.fr/~morgan/python/M11.pptx
Morgan Fouesneau
[email protected]
“Meals are not provided!”
why python? and good references
 What, Why?
In astronomy
 Should I throw away my C, Fortran codes?
 Language power
 Starting points when stuck in python

Language

Application
Not to scale…
“Hello World!” …

A programmer can be 5-10 times more productive than others

Java:

Fortran (90):

Python:

As easy as IDL (or Yorick):
public class HelloWorld{
public static void main(string[] args){
System.out.println(“Hello World!”);
}
}
PROGRAM HelloWorld
PRINT *, "Hello World!"
END PROGRAM HelloWorld
print “Hello World!”
print, “Hello World!”

Easy and Quick coding

Already a reference in
astronomy

Faster than IDL (especially
for data access)

More flexible than any
other language

Between fully compiled and
pure script program

Free
Quantitative study based on 80 programmers
implementing the same program
http://page.mi.fu-berlin.de/prechelt/Biblio/jccpprtTR.pdf
no reason to replace them (yet), rather cooperate
Working horses of science…

zillions of codes existing

optimized for runtime performance

no reason to replace them (at this time)
but...

rather cooperate (f2py, swig)

not really general purpose

relatively primitive datatypes

manual memory management

slow edit/compile/test cycle

Extremely popular these days

Interactive, great visualization, good libraries
but...

Not really general purpose

Not really ready for large scale programs

Not free, even expensive

We can still call them from Python
but you will never do that unless you have a gun over your head

Perl very strong at text manipulation

Ruby seems to be popular in scientific community in Japan

Shell scripts... you will not want to do shell scripting any more
once you know Python (unless forced)!

Easy to learn but yet powerful

Fast and easy solutions for everyday problems

Allows development of large and sophisticated software packages

Several astronomical program packages implemented using python

Excellent tool for scientists

Easy parallel programming
official Python website www.python.org

Simple interactive example:
>>> t
array([
# print all t-values (assumed to be already defined)
0.
, 0.8975979 ,
1.7951958, 2.6927937,
3.5903916, 4.48798951, 5.38558741, 6.28318531 ])
>>> t[2:4] # get sub-array with the elements at the indexes 2,3
array([ 1.7951958, 2.6927937])
>>> p = 1.8 # set our point
>>> abs(t-p) # how much do the t[:]-values differ from p?
array([ 1.8
, 0.9024021 , 0.0048042 , 0.8927937,
1.7903916, 2.68798951, 3.58558741, .48318531 ])
>>> abs(t-p) # how much do the t[:]-values differ from p?
array([ 1.8
, 0.9024021 , 0.0048042 , 0.8927937,
1.7903916, 2.68798951, 3.58558741, .48318531 ])
<...>
>>> # how large is the 4-th largest absolute distance
between the t[:]-values and p
>>> dt_m = sort(abs(t-p))[3]
>>> # where are the four elements of t[:]closest to p ?
>>> abs(t-p) <= dt_m # corresponds to a “where” command in IDL
array([False, True, True, True, True, False, False, False])
>>> # construct the sub-arrays; (1) get the 4 t[:]-values
>>> t_p = t[abs(t-p)<=dt_m]
>>> t_p
array([ 0.8975979,
1.7951958,
2.6927937,
3.5903916])
Programs and libraries implemented (mainly in Python):

BoA (Bolometer Analysis Package)

APECS (APEX control software)

Astro-WISE (widefield imaging system)

PyFITS (access to FITS files)

SAMPy (Samp communications)

...
Wrappers for existing programs and libraries:

PYGILDAS (GILDAS)

ParselTongue (AIPS)

PyRAF (IRAF)

PyMIDAS (MIDAS)

PyDS9 (DS9 via XPA messaging system)

…

Keep it simple

Embed “basic” languages were really needed (e.g. brut/intensive
comp.)
C/C++
• Jython
Java
Fortran
• Pyrex
• Weave
• Cython
• f2py
• pyfort
Python

Communicate with almost any application

Python reference book
 Python from novice to pro
 Available online
 available in multiple languages.
 http://diveintopython.org
Websites













Python
http://www.python.org
Ipython
http://ipython.scipy.org
SciPy
http://www.scipy.org
NumPy
http://numpy.scipy.org
Matplotlib http://matplotlib.sourceforge.net
Astropython http://www.astropython.org
AstroPy
http://www.astro.washington.edu/users/rowen/AstroPy.html
Mathesaurus http://mathesaurus.sourceforge.net quick reference for switching
F2py
http://www.scipy.org/F2py c / f77 / f90 glue (is part of NumPy)
PyTables
http://www.pytables.org/ all around hdf5 format
PyFITS
http://www.stsci.edu/resources/software_hardware/pyfits/ FITS
SAMPy
http://cosmos.iasf-milano.inaf.it/pandora/sampy SAMP in python
Mayavi
http:// code.enthought.com/projects/mayavi/ SAMP in python
“Daily Specials”
Who, where, what, how?
Audience
 Observatoire: Toddler, grown up, elderly… anyone curious
 Must have at least basic experience in one or more computer languages
Content: Give you what you need for everyday problems asap
 First half of course: Basic language elements and basic visualization
some modules from standard library, elements of NumPy, SciPy and matplotlib
 Second half of course: Advanced topics and installing packages
Complex scripting, exceptions, classes… finding packages and how to install them…

Avoid any personal installation
 SSH connection to `ASTROMASTER`
 Username: `guestpython`
 Password: `python2011`
 This account will be deleted within the next few days
 Create your own folder in this user’s home directory

Installed packages
 numpy, matplotlib, pyfits, python-tables, setuptools, tkinter, ipython



IDE: Interactive Development Environment
i.e.: Very assisted
Examples: Eclipse (ext. Python), Spyder…
Spyder*

Editor, Console,
History Log,
Variable explorer,
Project manager,
online help
* Scientific PYthon
Development EnviRonment
http://code.google.com/p/spyderlib/
“comprehensive environment for interactive
and exploratory computing.”
http://ipython.scipy.org/

“comprehensive environment for interactive and exploratory
computing.”

In [ ]: myVariable? prints documentation of the variable
In [ ]: myFunction?? prints the source code (if accessible)

“Classic stuffs and some specials.”
Numbers, variables, math and while
 Python's numerical types
Operators for numerical types
 Variables, variable names, and assignment to variables
 The math module from the Standard Library
 Loops with the while statement
>>> 2, 0
>>> -4711
>>> 07, 022
# Octal literals
>>> 0x9, 0xa, 0XF
# Hex literals
>>> 17 + 4
# expression
>>> 0xa - 2
>>> 23 ** (2+3)
# 23⁵
>>> 7 / 2, 7 / -2
# Int division
>>> 2.3
>>> -4.
>>> 0.1, .1
>>> 2.99E10, 6.62607e-27, -1e10
>>> 1.7 + .4
>>> 17. + 4
>>> 7./2., 7./2., 7./2.
>>> 1 + .4
>>> 2 - 3
>>> 25 * -17
>>> 2.**100
>>> 2/3, 2/-3
# Int division
>>> 2./3
# Float division
>>> 2.//3, 2.//3
# Floor division
>>> 7 % 3, 7 % -3
# Int Modulus
>>> 7. % 3, 7.2 % 2.5
# Modulus
Nb: (n//m)*m + (n%m) = n
>>> 17 & 5
# Bitwise AND
>>> 17 | 5
# Bitwise OR
>>> 17 ^ 5
# Bitwise EXOR
>>> ~17
# Bitwise complement
>>> 17 << 3
# Bitshift left
>>> 0x11 >> 2
# Bitshift right
>>> 2.+3j, 2-3J
# complex literals
>>> j
# will not work
>>> 1J
# but this will
>>> complex(1,2)
>>> 1+1j*2, (1+1j)*2
>>> (2.+3j).real, (2+3j).imag
>>> x = 2
# Assign variable
>>> x
# Display
>>> x + 3
# Use variable
>>> y = x + 3
# New variable
>>> x = x + 1
# Assign new value
>>> x += 1
# Shorthand; no x++
>>> x, y = 2, 3
>>> x, y = y, x
>>> x = y = z = 0
# !! WARNING !!

Case sensitive
>>> xy, Xy = 2, 3

Must begin w. letter
>>> 9x = 2
>>> x9 = 2
>>> _x = 2

# not working
# ok
# ok, but special
must not be keyword
>>> if = 2

and











is

try

while

with

yield
pass
in
finally
return
or
import
exec
continue
def


not
if
except
class



raise
None
global
else
break




lambda
from
elif
assert






del
as

for
print
>>> abs(-2.)
>>> abs(1+1j)
>>> max(1,2,3,4)
>>> min(1,2,3,4)
>>> hex(17), oct(-5)
>>> round(1.23456, 2)
>>> x = 2
# Assignment
>>> x == 2
# Testing equality
>>> x != 0
# True
>>> x < 1
# False
>>> (x >= 2.) == False # False
>>> 1 < x <= 3
>>> 3 < x <= 1
>>> 1 < x >= 0
# (1<x) and (x<=3)
>>> x = 0
>>> while x <= 10:
... print x, x**2
... x += 1
... <RETURN>
# Bool. expr.
# Indentation
>>> print x
# Unindented again

Top level must not be indented

It does not matter how much blanks you use, but:

Uniform within each block

Most people use 4 blanks per level but tabs work too.
http://www.python.org/doc/essays/styleguide.html

How much is the smallest / largest positive float?
 Help: Use a while loop, compare to 0 and float(‘inf’)
>>> x = 1.
>>> while x > 0:
... x /= 2.
... print x

>>> x = 2.** 32
>>> while x != float(‘inf’):
... x *= 2.
... print x
How precise is your installation?
 Help: Add smaller and smaller numbers to 1 while comparing the sum to 1
>>> x = 1.
>>> while 1. + x > 1.:
... x /= 2.
... print x

How much is the smallest / largest positive float?
 Help: Use a while loop, compare to 0 and float(‘inf’)
>>> x = 1.
>>> while x > 0:
... x /= 2.
... print x
>>> x = 2.** 32
>>> while x != float(‘inf’):
... x *= 2.
... print x
4.94065645841e-324

How precise is your installation?
 Help: Add smaller and smaller numbers to 1 while comparing the sum to 1
>>> x = 1.
>>> while 1. + x > 1.:
... x /= 2.
... print x

How much is the smallest / largest positive float?
 Help: Use a while loop, compare to 0 and float(‘inf’)
>>> x = 1.
>>> while x > 0:
... x /= 2.
... print x
>>> x = 2.** 32
>>> while x != float(‘inf’):
... x *= 2.
... print x
4.94065645841e-324

8.98846567431e+307
How precise is your installation?
 Help: Add smaller and smaller numbers to 1 while comparing the sum to 1
>>> x = 1.
>>> while 1. + x > 1.:
... x /= 2.
... print x
1.11022302463e-16
“A complex system that works is invariably found to
have evolved from a simple system that worked”
– John Gall
Large scale program
 Python's package notion (‘import’)
 Module, Package
 Differences between import and execfile or %run
 Documentation

Run script from Unix shell with $> python myscript.py

Alternative: Make script executable
 $> chmod u+x myscript.py
 Add “ShiBang” line at the top of the file:
 Execute: $> ./myscript.py

Or run script from within Python
 >>> execfile(“myscript.py”)

Or within IPython:
 In [1]: %run myscript[.py]
#!/usr/bin/python

Multiline codes (functions, classes, …) or interactive command list

Documentation
 Usually in the __doc__ function within the “module” (file)
Ex:
def __doc__:
... “””
this is my documentation for this module.
usage: >helloWorld()
... “””
 Directly in each block:
def helloWorld():
... “”” This is a dummy function “””
... print “hello World!”
 Usage:
>>> import myscript
In[1]: %run myscript
>>> help(myscript)
In[2]: helloWorld?
>>> help(myscript.helloWorld) In[3]: helloWorld??

__doc__ : introduce documentation

In general, any __<name>__ is a special function/variable

Select instructions that are executed only when running the
script and not when import is called
 everything in the “__main__” block
 __main__ : Top-level script environment
 if __name__ == “__main__”:
<...>
Reading a photometric dataset to generate HR diagram
 Python's package notion (‘import’)
 Mandatory packages for doing science: numpy, matplotlib
 File access: reading from / writing to files
 Lists, array and indexes, slices
 For loop statements: while, for
 Functions: classical and lambda functions
 Basic visualization: plot, scatter, contours, colormaps
 First python scripts

Read the file

Manipulate the data

Build the figures

File description: data.ascii
 made of a header (lines start with #)
 6424 lines into 9 columns (space as separator)

First solution: elbow-grease (like in C)
 f = open('data.ascii', 'r') # 'r' by default
 Reading: f.read(), f.readline(), f.readlines()
 f.close()
 Reading is done by simply looping over the lines
 Long solution, you rarely need to be at this low-level.
 Learning interest…
#include <stdio>
#include <stdlib>
If you had to do it with C code…
int main(void)
{
int ncols = 9, nlines = 6424, i, j;
string filename = “data.ascii”;
FILE *file;
float *data = (float *) malloc( ncols * nlines * sizeof(float) );
file = fopen(filename, "r");
for ( i = 0; i < nlines; i++ ){
for ( j=0; j < ncols; j++ ){
fscanf( file, "%f ", data[j + ncols*i] );
}
}
fclose(file);
...
free(data);
}
…I am not sure this works though
>>>
>>>
>>>
>>>
>>>
>>>
import numpy
# low-level does not mean very-low
ncol = 9
nrow = 6424
data = numpy.array(ncol, nrow) # common array (one type of data)
l = '#'
while l[0] == '#': #Skip the comments
... l = f.readline()
"""Warning: at the end of the loop, l contains the first data row """
>>> for i in range(nrow): # <=> i in [0, ..., 6423]
... """ We should check whether ‘\n’ is present at the end of l
... and cut it out. In any other language, we may have something like:
...
if (l[len(l)-1] == '\n') { l = l[0:len(l)-2] }
... But not in python: """
... if l[-1] == '\n':
... l = l[:-1]
... l = l.split()
# split according to spaces
... for k in range(ncol): # Now fill data array
... """Python only reads strings that have to be converted then """
... data[k,i] = float(l[k])
... l = f.readline() # go on to the next line
>>> f.close()

A bit more in a reptile way:
>>> data = [] # declare an empty list
>>> for l in f: #loop over each line of f (don’t care about #lines)
... if l[0] != '#':
... # Magical `inline` for loop, faster than a classic `for`
... cl = [ float(k) for k in l[:-1].split()]
... data.append( cl ) # add processed line to data
>>> f.close()
""" data constains n-cells of m-values. In order to access the
data by columns, we either transpose the "matrix", or create a
sub-list:"""
>>> d = numpy.transpose(data)
>>> bflux = d[0]
>>> blux = [ k[0] for k in data ]

Finally in python: (think functions)
>>> def readAscii(fichier, transpose = False):
""" Function that will read the file """
... f = open('data.ascii')
... # lambda function <=> math function, x-->f(x)
... convertToFloat = lambda x: [ float(k) for k in x ]
... data = [ convertToFloat( l[:-1].split() ) \
for l in f if l[0] != '#' ]
... f.close()
... if transpose:
... return numpy.transpose(data)
... else:
... return data
>>> data = readAscii(f)
>>> d
= readAscii(f, transpose = True) # pédagoqique ;)

BUT: Someone already coded that for you
 Either numpy or scipy or matplotlib or pylab
>>> data = numpy.loadtxt('data.ascii')
 Variations:
>>> data = numpy.loadtxt('data.ascii', comments='#',
skiprows=0, delimiter=None
)
>>> bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
numpy.loadtxt('data.ascii', unpack=True)

loadtxt() is usually fine
 But sometimes can be limited for big files

Other options are available
 e.g. ATpy, asciitable, …
 http://astropython.org/resources for other references
 ATpy is also able to read and write fits, votable, sql …

Calibrate the dataset:

We apply operations to the whole arrays
>>> bmag = bmag - 1.3
>>> vmag -= 2.5
# classic
# better for memory space

sereval possibilities for visualization
 2D: matplotlib/pylab (other exists)
 3D, matplotlib (still young), mayavi2 (openGL), …

We will use matplotlib/pylab in this workshop
>>> from numpy import loadtxt # only what we need
>>> import pylab as plt
# import & alias to plt
>>> bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
loadtxt('data.ascii', unpack=True)
>>> bmag += 1.3; vmag += 2.5
# zeropoint correction

Plot star positions:
>>> plt.figure() # declare a new figure
>>> plt.plot(x, y, ',', color='0.0') # do the plot
# do not hesitate with help cmds: `plt.plot?`
>>> plt.xlabel('X (in pixels)') # set labels
>>> plt.ylabel('Y (in pixels)')
>>> plt.grid()
>>> plt.show()
# display (when not auto)

Plot star positions:
>>> plt.figure() # declare a new figure
>>> plt.plot(x, y, ',', color='0.0') # do the plot
# do not hesitate with help cmds: `plt.plot?`
>>> plt.xlabel('X (in pixels)') # set labels
>>> plt.ylabel('Y (in pixels)')
>>> plt.grid()
>>> plt.show()
# display (when not auto)

Build HR diagram of the stars:
>>> plt.figure()
>>> plt.scatter(bmag-vmag, vmag)
# `plt.plot` would also work ;)
>>> plt.xlabel(r'B$-$V')
>>> plt.ylabel(r'V')
>>> plt.xlim(1.9,3.4)
>>> plt.ylim(6.7,15)
>>> plt.ylim(plt.ylim()[::-1])
# reverse y-axis, or also `plt.ylim(15, 6.7)`
>>> plt.savefig(‘myfigure.png’)
# can export into eps,ps,pdf,png,svg…
>>> plt.show()

Build HR diagram of the stars:
>>> plt.figure()
>>> plt.scatter(bmag-vmag, vmag)
# `plt.plot` would also work ;)
>>> plt.xlabel(r'B$-$V')
>>> plt.ylabel(r'V')
>>> plt.xlim(1.9,3.4)
>>> plt.ylim(6.7,15)
>>> plt.ylim(plt.ylim()[::-1])
# reverse y-axis, or also `plt.ylim(15, 6.7)`
>>> plt.savefig(‘myfigure.png’)
# can export into eps,ps,pdf,png,svg…
>>> plt.show()

Now explore matplotlib/pylab

Generate a density plot of this HR diagram
 numpy.where,
 pylab.histogram2d
 pylab.contour and/or pylab.contourf
 pylab.figure and/or pylab.subplot
import numpy, pylab
# get the data
bflux, bmag, bmerr, sep, vflux, vmag, vmerr, x, y =
numpy.loadtxt('data.ascii', unpack=True)
bmag += 1.3
vmag += 2.5
# Select space region of interest
ind = numpy.where( ( (bmag-vmag) > -0.5 ) &
( (bmag-vmag) < 2.
) &
(
vmag
> 10. ) &
(
vmag
< 22. )
)
#do the histogram
h = pylab.histogram2d( bmag[ind]-vmag[ind], vmag[ind],
bins=100
)
image = h[0]
extent=[ h[1][0], h[1][-1], h[2][0], h[2][-1] ] #axes limits
#display part
ax = pylab.subplot(121) #Create a subplot, 1 row 2 cols, #1
ax.imshow( image.transpose(), extent=extent,
origin='lower', aspect='auto‘, cmap=pylab.cm.Blues)
ax.set_xlim(-0.5,1.0) ; ax.set_ylim(20, 11)
ax.set_ylabel(r'V‘)
; ax.set_xlabel(r'B$-$V')
ax.set_title('histogram2d')
ax = pylab.subplot(122) #Create a subplot, 1 row 2 cols, #2
levels = [0.1, 3.,
6.,
9., 12., 15., 18., 21.]
ax.contourf( image.transpose(), extent=extent,
levels=levels,cmap=pylab.cm.Blues )
ax.set_xlim(-0.5,1.0); ax.set_ylim(20, 11)
ax.set_ylabel(r'V') ; ax.set_xlabel(r'B$-$V')
ax.set_title('contourf')
pylab.savefig('m3_ex2.png')
ax = pylab.gcf().get_axes() # get subplots
ax1 = ax[1]
# second subplot (contourf)
ax1.set_ytitle('')
# delete tick labels on the y-axis
pylab.setp(ax1.get_yticklabels(), visible=False)
# shrink space between the plots
pylab.subplots_adjust(wspace=0.001)
Use some more magic and add minor ticks, change fonts, ...
(battery included in figure.py)
“It’s a trap!”
Data: values, pointers and references
 Location, affectation in C
 Location, affectation in Python
 Differences between equality and identity

int x, y;
Memory location x
type: int
Memory location y
type: int
?
?

Memory location x
type: int
Memory location y
type: int
int x, y;
?
?
x = 2;
2
?


Memory location x
type: int
Memory location y
type: int
int x, y;
?
?

x = 2;
2
?

x = 3;
3
?

Memory location x
type: int
Memory location y
type: int
int x, y;
?
?

x = 2;
2
?

x = 3;
3
?

y = x;
3
3
Value copied
Variables
Objects
(names)

(data)
>>> planet = “Pluto”
>>
Planet
“Pluto”,
type: str
Variables
Objects
(names)


(data)
>>> planet = “Pluto”
>>> planet = “Tiamat”
Planet
“Pluto”,
type: str
“Tiamat”,
type: str
Variables
Objects
(names)
>>> planet = “Pluto”


(data)
>>> planet = “Tiamat”
Planet

>>> legend = planet
legend
“Tiamat”,
type: str
Variables
Objects
(names)
>>> planet = "Pluto“


(data)
>>> planet = “Tiamat“
Planet

>>> legend = planet
legend

>>> del planet
“Tiamat”,
type: str
Variables
Objects
(names)
>>> planet = "Pluto“


(data)
>>> planet = “Tiamat“

>>> legend = planet
legend

>>> del planet

>>> del legend
“Tiamat”,
type: str
>>> l = [1,2,3]
>>> m = [1,2,3] # l and m different objects
>>> l == m # Equality (of values) tested with ==
>>> l is m # Identity (of objects) tested with is
>>> l[0] = 42
>>> print l
>>> print m # m[0] = ?
>>> l = [1,2,3]; m = l # l == m ? l is m ?
>>> l = numpy.array([1,2,3])
>>> m = l.copy() # l and m different objects
>>> l == m # Equality (of values) tested with ==
>>> l is m # Identity (of objects) tested with is
>>> l[0] = 42
>>> print l
>>> print m # m[0] = ?
Particular variables in Python: tuples, lists, arrays, & dictionaries
 tuples, lists, arrays
 dictionaries
 Looping philosophy
 Use case: Find out which the 10 most frequent words in a text are.
“It feels like somebody is using your head for origami, but it
doesn't bend that way…”

Tuples and lists are ordered sequences of values of different types

Tuples are immutable: does not support item assignment

Lists are mutable

In [1]: t = (1,2,3); u = [1,2,3]
In [2]: t[0], t[-1], t[1:], u[0], u[-1], u[1:]
Out[2]: (1, 3, (2, 3), 1, 3, [2, 3])
In [3]: t[0] = 10
TypeError : 'tuple' object does not support item
assignment
In [4]: u[0] = 10; u.append(4)
#ok
In [5]: v = list(t)
In [6]: del u[0]
# v is now a list of t-values
# delete one element
“It feels like somebody is using your head for origami, but it
doesn't bend that way…”

In [1]: t = ([1,2,3],[4,5])
In [2]: t[0] = 42 # list is immutable
TypeError : ‘tuple’ object does not support item
assignment
In [3]: t[0][0] = 42; print t
Out[3]: ([42,2,3],[4,5])
In [4]: t+t
# same result with lists
Out[4]: ([42,2,3],[4,5],[42,2,3],[4,5])
In [4]: t*2
# duplicate!! (also with lists)
Out[4]: ([42,2,3],[4,5],[42,2,3],[4,5])
Kind of named lists or like c-structures

Dictionaries are unordered, mutable set of (key, value) pairs

In [1]: d = {'spam': 2, 'eggs': 3}

Order is not preserved

Use of keys instead of indexes to access data (kind of named array)

In [2]: d[‘spam’]
Out[2]: 2
In [3]: d[‘spam’] = 10000
In [4]: d[‘bacon’]
KeyError: ...
Kind of named lists or like c-structures

In [2]: d[‘spam’]
Out[2]: 2
In [3]: d[‘spam’] = 10000
In [4]: d[‘bacon’]
KeyError: ...

Either check in advance: >>> d.has_key('bacon')

Or use get method with def. val. >>> d.get('bacon', 42)

Adding items: >>> d['bacon'] = 13

Removing items: >>> del d['spam']

Keys may be of any immutable type but unique (value are any)

[1,2]
(1,2) ‘str’, val…
To be or not to be iterable…

Loops use while or for constructions

while constructions are “classical”

For construction aims at iterating over an object

Lists, tuples and dictionaries are iterable objects

Any object with a __iter__ function is iterable
>>> for i in range(10): … # range(10) = [0,…,9]
>>> d = {'spam': 2, 'eggs': 3, 'bacon': 5}
>>> for k in d:
... print k, d[k]
>>> for k in d.keys(): # Clearer equivalent version
... print k, d[k]
>>> for k in sorted(d.keys()): # Sorted loop
... print k, d[k]

Special for-loop construction
>>> d = {'spam': 2, 'eggs': 3, 'bacon': 5}
>>> val = []:
>>> for k in d.keys():
... if d[k] > 2.5:
... val.append( (k,d[k]) )
>>> v = [ (k,d[k]) for k in d.keys() if d[k] > 2.5 ]

Automatic and optimized allocation, hence faster!

Find out which the 10 most frequent words in a text are.

“The First Men in the Moon” by H. G. Wells (Herbert Georges)
 http://www.gutenberg.org/ebooks/1013 (UTF-8 plain text format)

Help: use dictionaries or lists or tuples, and for-loops

Useful help:

import re – this let you use regular expressions (see in particular: re.sub)




Open the text file, read the full text into a variable
Filter for `\r\n` sequences
Be careful of case sensitivity, punctuation and so on… during word counts
Some commands (not all needed):
▪ <str>.split, <str>.join, <str>.upper, <str>.lower, <str>.replace …
▪ <list>.count, re.sub, numpy.argsort, numpy.unique ...
codesources/textCount.py

Subplots & linked axes
<…>.subplot(…,sharex=ax)
Power Law !!!
codesources/textCount.py
def count():
f
= open('firstMenOnTheMoon.txt') #open file
# =============== Get all lines and set to lower case ================
txt
= [ l.lower() for l in f if l != '\r\n']
f.close() #file can be closed now
txt
= ''.join(txt)
# ================== filter & replace punctuation ====================
txt
= txt.replace('\r\n', ' ').replace('--', ' ')
w
= re.sub('[.,:\'"!?;()]',' ', txt)
w
= w.split()
# ========================== Do the stats ============================
#words = set(w) # also works, generate a sequence of unique words
words = numpy.unique(w)
counts = [ w.count(kwd) for kwd in words ]
sind
= numpy.argsort(counts)[::-1]
sc
= numpy.array(counts)[sind]
sw
= numpy.array(words)[sind]
print “Statistics:\ncontains %d words\n“ % (len(w)
for k in range(10):
print "Word '%s' used '%d' times" % (sw[k], sc[k])
return sw,sc
Fitting a black body model on the Sun’s spectrum
 Fitting data with arbitrary functions
 Using scipy:
scipy.optimize, scipy.optimize.leastsq
 reading FITS files with pyfits
 Lambda functions

We will fit a blackbody model onto the solar spectrum.

First, we need to code the blackbody model
 Planck constant:
 speed of light:
 Boltzmann constant:
h = 6.626069e-34 (J.s)
c = 2.99792458e8 (m.s)
k = 1.3807e-23 (J/K)
 1 parameter: T

Read the Solar spectrum: sun.fits

Optimize using a least square method
import numpy
from numpy import exp
#
h
c
k
Module global variables
= 6.626069e-34 # Planck (J.s)
= 2.99792458e8 # speed of light (m.s)
= 1.3807e-23
# Boltzmann (J/K)
def blackbody(wave, temp):
""" wave in angstroms, """
BBCONST = h*c/(k * 1.e-10)
exparg = BBCONST/(numpy.array(wave)*temp)
return 2*h*c**2*1e50/( wave**5 *
(numpy.exp(exparg)-1.) )
from scipy import optimize
import pylab
Def fakeData():
#Pure Black body with noise
dx = numpy.linspace(1000,20000, 1000)
dy = blackbody(dx, 7000)
dy /= dy.sum()
dy += numpy.random.normal(0,1e-4, len(dy))
return dx, dy
# The target function
fitfunc = lambda
p, x: p[0]*blackbody(x,p[1])
# Distance to the target function
errfunc = lambda p, x, y: fitfunc(p,x)-y
dx, dy = fakeData() # Test case
p0 = [1., 6000] # Initial guess for the parameters
p1, success = optimize.leastsq(errfunc, p0[:],
args=(dx, dy)
)
pylab.figure()
pylab.plot(dx,dy, ',')
pylab.plot(dx,fitfunc(p1,dx), color='#ff0000')
pylab.xlabel(r'Wavelength ($\rm{\AA}$)')
pylab.ylabel(r'Flux (Arbitrary Units)')
A short digression using pyfits
In [1]: h = pyfits.open('sun.fits')
In [2]: h
Out[2]:
[<pyfits.core.PrimaryHDU object at 0xae03bcc>,
<pyfits.core.BinTableHDU object at 0xb9f57ec>]
In [3]: print h[1].header
XTENSION= 'BINTABLE'
/ binary table extension
<...>
In [4]: h[1].data.field('flux')
Out[4]:
array([ 6.19000000e-05, 5.61400000e-04, 4.90100000e-03, ...,
2.95000000e-10, 1.01000000e-10, 3.38000000e-12])
In [5]: x = h[1].data.field('wavelength')
In [6]: y = h[1].data.field('flux')
Using pyfits
def getSunData():
import pyfits
t = pyfits.open(‘sun.fits’)
x = h[1].data.field('wavelength')
y = h[1].data.field('flux')
t.close()
return x, y

Now we can go back to our problem and fit the Solar spectrum
 Adapt the previous method to the new dataset
 Plot the result
 Look at the statistics
 What is the official temperature of the Sun?
x, y = getSunData() # Sun spectrum Case
ind = numpy.where((x>1e3) & (y< 2.0e4))
p1, success = optimize.leastsq(errfunc, p0[:],
args=(x[ind], y[ind]))
e1 = errfunc(p1, x[ind], y[ind])
e1 *= e1
print " Norm = %g, Temperature = %f, error=%f " %
(p1[0],p1[1], numpy.sqrt(e1.sum()) )
pylab.figure()
pylab.plot(x[ind], y[ind])
pylab.plot(x[ind], fitfunc(p1, x[ind]), color='#ff0000')
pylab.xlabel(r'Wavelength ($\rm{\AA}$)')
pylab.ylabel(r'Flux (Arbitrary Units)')


Norm = 8.91261e-14, Temperature = 5466.278346, error=4.390300
Norm = 6.65493e-14, Temperature = 5800.000000, error=5.359789
Glue It All Together With Python
With SAMPy

Simple example: (requires mytables, mysamp and dependencies)
In [1]: import mysamp, mytables
In [2]: c = mysamp.Client()
[SAMP] Warning (2011-03-21T11:08:53.322297): Another
SAMP Hub is already running. Start-up process
aborted.
<=> Aladin & Topcat launched before already providing a hub
In [3]: t = mytables.load('toto.csv')
Auto-detected input type: csv
In [4]: c['t'] = t
Auto-detected input type: fits
Overwrite existing file 't.fits'.
Data exported into t.fits
Auto-detected input type: fits
Objects oriented codes, graphical interface, Fortran-C interaction

Everything is object in python
 int, float, …, variables, functions, classes, modules …

“a class is a construct that is used as a blueprint to create
objects of that class.” – Hum…
 A class is a definition of a set of variables associated with functions
 Equivalent to c/fortran structures + functions
 An object is an instance of one class

Example: a class “Vector”
 A vector of 3 coordinates
 Function/”methods”: addition, scalar product, projection …

Coding a Vector class in Python:
class Vector(object): # “object” is the top-lvl type
def __init__(self, x,y,z): # constructor
self.x, self.y, self.z = x,y,z
def sum(self, p):
# addition: p1+p2
return Vector(self.x+p.x, self.y+p.y, self.z+p.z)
def scalarProd(self,p):
# scalar product p1*p2
return self.x*p.x + self.y*p.y + self.z*p.z

Instanciate Vector objects:
>>>
>>>
>>>
>>>
a = Vector(1,0,0)
b = Vector(0,1,0)
c = a.scalarProd(b)
print c # ???

What about print a?
 Print is equivalent to a.__str__ (“convert a to string”)
 Hence you need to define (overload) a __str__ method
 Example:
def __str__(self):
return “Vect(%f,%f,%f)” % (self.x, self.y, self.z)

Sometimes it is better to use ‘+’ and ‘*’
 Reading and usage are easier
 You need to define operator functions:
__add__, __sub__, __mul__, __div__,
__pow__ …
More at http://docs.python.org/reference/datamodel.html

Now comes the good part: exercise

Try to implement a complete “Vector class” of your own
 Manage a string representation of the vectors
 Manage the operators ‘+’, ‘– ’ as common sense
 Implement a scalar product solution based on ‘*’
 Implement vector product based on ‘**’ (power operator)
class Vector(object):
def __init__(self, x, y=None, z=None):
# Class constructor: from list/tuple or 3 values
if hasattr(x,'__len__'):
assert len(x) == 3
self.x, self.y, self.z = x
else:
assert((y != None) & (z != None))
self.x, self.y, self.z = x,y,z
def __repr__(self):
# String repr, called when >>> Vector([1,2,3]) <CR>
s = self.__str__()+", "
s += object.__repr__(self)
return s
def __str__(self):
# String Conversion
return "Vect(%s)" % [self.x,self.y,self.z]
def __add__(self, p):
if isinstance(p, Vector): # if p Vector, sum of p1*p2
return Vector(self.x+p.x, self.y+p.y, self.z+p.z)
else: # if not p Vector
return Vector(self.x+p, self.y+p, self.z+p)
def __mul__(self,p):
if isinstance(p, Vector): # scalar product p1*p2
return self.x*p.x + self.y*p.y + self.z*p.z
else:
return Vector(self.x*p, self.y*p, self.z*p)
def __pow__(self,p):
if isinstance(p, Vector): # vector product p1**p2
x = self.y*p.z-self.z*p.y
y = self.z*p.x-self.x*p.z
z = self.x*p.y-self.y*p.x
return Vector(x,y,z)
else:
return Vector(self.x**p, self.y**p, self.z**p)
In [1]:
In [2]:
In [3]:
In [4]:
Out[4]:
import vector
a = vector.Vector(1,0,0)
b = vector.Vector([0,1,0])
a+1
Vect([2, 1, 1]), <vector.Vector object at 0x9544bcc>
In [5]: print a+b
Vect([1, 1, 0])
In [6]: print (a+1)**2
Vect([4, 1, 1])
In [7]: print a**b, b**a
Vect([0, 0, 1]) Vect([0, 0, -1])
Graphical User Interface from python
btw JK, may I be cannibal? ;)

Many classical libraries are available
 Tcl/TK, QT, Wx, WxGTK …

An example is better than words…
 Example taken from Joachim K. homeworks
 “Applet” to show the physics of a 2 body gravitational interaction
 One body is fixed and the second “orbits” around
 Trajectory of the second body is plotted
▪ Depending on its initial position and velocity
 Source in `OneBody.py`
C/C++, Fortran interaction

Python presents many advantages

However, too slow for highly intensive numerical computations

You can interface Python with C or Fortran codes
 Best of all worlds
 Pre- and post- processing in Python
 Intensive part in Fortran (or in C)

F2py is one solution (many others exists)
 Requires NumPy (> = 1.3) and Scipy_distutils (>= 0.2.2)
 A few lines in your fortran source code
 Interface file (F2py is able to auto-generate simple interface that you can edit)

Example (F90/95):
subroutine norme(a,b,c)
real(8), intent(in) :: a,b
Real(8), intent(out) :: c
c = sqrt( a*a + b*b )
end subroutine norme

Generate the interface with f2py
$> f2py -c norme.f90 -m vect \
--fcompiler=gfortran --f90flags=-03
<norme.f90>

From a python shell:
>>> import vect
>>> c = vect.norme(3,4)
>>> print c
5.0

Documentation is also generated by f2py:
>>> print vect.norme.__doc__
norme – Function signature:
c = norme(a,b)
Required arguments:
a : input float
b : input float
Return objects:
c: float

Works with lists and numpy.arrays
subroutine norme(a,c,n)
integer :: n, i
real(8), dimension(n), intent(in) :: a
!f2py optional, depend(a)::n=len(a)
real(8), intent(out) :: c
real(8) :: sumc
sumc = 0.
do i=1,n
sumc = sumc + a(i)*a(i)
end do
c = sqrt( sumc )
end subroutine norme

From a python shell:
>>> from vect import *
>>> a = [2,3,4] # a given list
>>> type(a)
<type ‘list’>
>>> norme(a)
5.3851648071345037
>>> from numpy import array
>>> a = array([2,3,4]) # an equivalent numpy.array
>>> type(a)
<type ‘numpy.ndarray’>
>>> norme(a)
5.3851648071345037

You can do much more
 Callbacks are available (call python from fortran)
 Encapsulate fortran within python objects
 Compile fortran code inside python scripts
 …

Visit http://www.scipy.org/F2py
C/C++ wrapper for python and many more (19 languages!)
Please don’t shoot… I haven’t use it yet!
Just trying to answer some of your questions
http://www.swig.org

SWIG is an interface compiler
 connects programs written in C and C++
 with scripting languages such as Perl, Python …
 works by taking the declarations found in C/C++ header files
 generates the wrapper code that scripting languages need to access
the underlying C/C++ code.

SWIG currently generates wrapper code for nineteen
different target languages:
 Allegro CL, C#, CFFI, CLISP, Chicken, D, Go, Guile, Java, Lua, Modula-3,
Mzscheme, OCAML, Octave, Perl, PHP, Python, R, Ruby, Tcl, UFFI

So you want to get going in a hurry? To illustrate the use of SWIG,
suppose you have some C functions you want added to Python
/* File : example.c */
#include <time.h>
double My_variable = 3.0;
int fact(int n) {
if (n <= 1) return 1;
else return n*fact(n-1);
}
int my_mod(int x, int y) {
return (x%y);
}
char *get_time() {
time_t ltime;
time(&ltime);
return ctime(&ltime);
}
Interface file
 Now, in order to add these files to your favorite language, you
need to write an "interface file" which is the input to SWIG. An
interface file for these C functions might look like this :
/* example.i */
%module example
%{ /* Put header files here or function declarations */
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
%}
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
Building a Python module
 Turning C code into a Python module is also easy.
 Simply do the following:
$> swig -python example.i
$> gcc -c example.c example_wrap.c \
-I/usr/local/include/python2.1
$> ld -shared example.o example_wrap.o -o _example.so

We can now use the Python module as follows :
>>> import example
>>> example.fact(5)
120
>>> example.my_mod(7,3)
1
>>> example.get_time()
'Sun Feb 11 23:01:07 1996’
More at http://www.swig.org
Mayavi2, Blender

A short example from the official website:
def getData():
import numpy as np
# Create data, z Gaussian function of x and y.
np.random.seed(12345)
x = 4*(np.random.random(500) – 0.5)
y = 4*(np.random.random(500) – 0.5)
f = lambda (x, y): np.exp(-(x**2 + y**2))
z = f(x, y)
return (x,y,z)
from enthought.mayavi import mlab
mlab.figure(1, fgcolor=(0, 0, 0), bgcolor=(1, 1, 1))
# get Dataset
x, y, z = getData()
# Visualize the points
pts = mlab.points3d(x, y, z, z, scale_mode=‘none’,
scale_factor=0.2)
# Create and visualize the mesh
mesh = mlab.pipeline.delaunay2d(pts)
surf = mlab.pipeline.surface(mesh)
mlab.view(47, 57, 8.2, (0.1, 0.15, 0.14))
mlab.show()

This example: how you can use Mayavi interactively to visualize
numpy arrays while doing numerical work with scipy.

It assumes that you are now familiar with numerical Python tools.

Use-case: study the trajectories of a particle in a potential.
 very common problem: orbits, particles…
 visualization of the potential and the trajectories

The potential:
 a periodic lattice, immersed in a parabolic confinement.
 We will shake this potential and see how the particle jumps from a
hole of the lattice to another.
 The parabolic confinement is there to limit the excursions of the
particle
More at http://code.enthought.com/projects/mayavi
Code in the appendixes
Who said science is not fun?

Blender is a free open-source 3D content creation suite

Movies are being made with it (e.g. Sintel, available on youtube)

A suite like 3D Studio Max: 3D scenes, movies, animations

Full integration of python: scripting scenes and movies.

2.5x version: Full UI in python (calculations might be in C)
And you can even fly-in !!
Non scientific model of M31 (Andromeda Galaxy)
https://bitbucket.org/quiark/galaxy/.../galaxy2.blend
Just imagine…

N-body simulation of the Antennae Galaxy from Florent Renaud

Python to build the scene

Blender to fly-in
Setuptools mainly! (easy_install, pip)

How to install python?
 Apt-get install python / fink install python / python-x.xx.msi(ish?)

About the “packages”:
 First install `setuptools` (or `pip` equiv., python package index)
▪ Apt-get / fink / windows binary
 Then easy_install [opts] <package name>
▪ E.g. easy_install numpy, easy_install pyfits…
▪ Is also able to update packages (-U option)

Quid if I am not root of my computer?
 VirtualEnv is the fix. http://pypi.python.org/pypi/virtualenv
▪ Isolated python envir. (bottle)
You got out of the egg ...
You have now the materials to go on by yourself…
Soon you will dance to the flute songs
One day is quite short to show everything…

This example: how you can use Mayavi interactively to visualize
numpy arrays while doing numerical work with scipy.

It assumes that you are now familiar with numerical Python tools.

Use-case: study the trajectories of a particle in a potential.
 very common problem: orbits, particles…
 visualization of the potential and the trajectories

The potential:
 a periodic lattice, immersed in a parabolic confinement.
 We will shake this potential and see how the particle jumps from a
hole of the lattice to another.
 The parabolic confinement is there to limit the excursions of the
particle
More at http://code.enthought.com/projects/mayavi

Use the mlab module to interactively visualize the data.
 For this it is best to use an interactive Python shell
 Mayavi2 application has a built-in interactive shell also

Let us visualize the 3D isosurfaces of the potential:
>>> from enthought.mayavi import mlab
>>> mlab.contour3d(X, Y, Z, V)

We can interact with the visualization
 Rotation, zoom, …
 In particular vary the iso-surfaces (maya-tree)

We want to study the motion of a particle in this potential
 We derive the corresponding forces from the gradient of the potential.
 We create a gradient function
def gradient(f, x, y, z, d=0.01):
""" Return the gradient of f in (x, y, z). """
fx = f(x+d, y, z); fx_ = f(x-d, y, z)
fy = f(x, y+d, z); fy_ = f(x, y-d, z)
fz = f(x, y, z+d); fz_ = f(x, y, z-d)
return ( (fx-fx_)/(2*d), (fy-fy_)/(2*d),
(fz-fz_)/(2*d) )

Let us visualize the corresponding vector field.
 only along a cut for X=50, and every three points on our grid
Vx, Vy, Vz = gradient( V, X[50,::3,::3], Y[50,::3,::3],
Z[50,::3,::3]
)
mlab.quiver3d( X[50,::3,::3], Y[50,::3,::3], Z[50,::3,::3],
Vx, Vy, Vz, scale_factor=-0.2, color=(1,1,1) )


$> ipython -wthread
import numpy as np

def V(x, y, z):
""" Potential: A 3D sinusoidal lattice with a
parabolic confinement. """
return np.cos(10*x) + np.cos(10*y) + p.cos(10*z)
+ 2*(x**2 + y**2 + z**2)

We would like to see what it looks like in 3D.

To do this we create a 3D grid of points, and sample it on these later
 X, Y, Z = np.mgrid[-2:2:100j, -2:2:100j, -2:2:100j]

A good view of the potential:
 turning off auto contours
 choosing -0.5 and 15

We can also change the colors used
 by selecting a different LUT (Look Up Table).
 select a paired as it separates well levels.

To get a better view of the potential:
 display more contours, but closed contours hide their interior.
 One solution: a cut plane.
 Right-click on the IsoSurface node and add a ScalarCutPlane. You can
move the cut plane by clicking on it and dragging.

Add a colorbar:
mlab.colorbar(title=“Potential”)

We can use scipy to integrate the trajectories.
 first define a dynamical flow of the system
 The flow is used by every ODE (ordinary differential equation) solver
 The dynamics is made of the force deriving from the potential
def flow(r, t):
""" The dynamical flow of the system """
x, y, z, vx, vy, vz = r
fx, fy, fz = gradient(V, x-.2*np.sin(6*t),
y-.2*np.sin(6*t+1),
z-.2*np.sin(6*t+2))
return np.array( (vx, vy, vz, -fx - 0.3*vx,
-fy - 0.3*vy,
-fz - 0.3*vz)
)

Now we can integrate the trajectory:
from scipy.integrate import odeint
R0 = (0, 0, 0, 0, 0, 0)
# Initial conditions
t = np.linspace(0, 50, 500) # Time Sampling
R = odeint(flow, R0, t)

Finally we can plot the trajectories:
x, y, z, vx, vy, vz = R.T
trajectory = mlab.plot3d(x, y, z, t,
colormap=“hot”,tube_radius=None)
mlab.colorbar(trajectory, title=“Time”)