NumPy, SciPy, Mpi4Py

Shepelenko Olha
History of NumPy
Originally, Python was not developed as a language for
numerical computing. However, its simplicity
attracted the attention of scientists.
- The Numeric matrix package was designed in 1995. It is slow
for large arrays (still operating, but outdated);
- Numarray was designed as a replacement for the Numeric
package. It is fast for large arrays, but slow for small arrays
(also outdated);
- NumPy, which combines Numeric and Numarray, was
released in 2006 (NumPy 1.0).
NumPy package
NumPy is the fundamental package for scientific computing with
Python. It contains among other things:
• a powerful N-dimensional array object
• sophisticated (broadcasting) functions
• tools for integrating C/C++ and Fortran code
• useful linear algebra, Fourier transform, and random number
capabilities
Besides its obvious scientific uses, NumPy can also be used as an
efficient multi-dimensional container of generic data. Arbitrary
data-types can be defined. This allows NumPy to seamlessly and
speedily integrate with a wide variety of databases.
from www.numpy.org/
NumPy arrays
The main feature of NumPy is an array object.
• All array elements have to be the same type (usually float
or integer);
• Array elements can be accessed, sliced, and manipulated
in the same way as list elements;
• Arrays can be N-dimensional;
• The number of elements in the array is fixed;
• The shape of the array can be changed.
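The array properties listed above can be sketched in a few lines. This is only a minimal illustration; the variable names and values are chosen here for the example and do not come from the slides.

```python
import numpy as np

# All elements share one type: mixing ints and floats upcasts to float64.
a = np.array([1, 2, 3.5, 4, 5, 6])
print(a.dtype)

# Indexing and slicing work as they do for lists.
first = a[0]          # element at position 0
middle = a[1:4]       # elements at positions 1, 2 and 3

# The shape can change as long as the number of elements stays the same.
b = a.reshape(2, 3)   # 2 rows, 3 columns, still 6 elements
print(b.shape, b.size)
```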
History of SciPy
SciPy is a package for scientific computing that provides a
standard collection of common numerical operations on top of
the Numeric array data structure. SciPy is the product of merging
three pieces of code (based on Numeric), developed by
Travis Oliphant, Eric Jones and Pearu Peterson, into
one package. SciPy was released in 2001.
https://en.wikipedia.org/wiki/SciPy#History_of_SciPy
SciPy as a scientific tool
Modules for:
– statistics, optimization;
– integration, interpolation;
– linear algebra, solving non-linear equations;
– Fourier transforms;
– ODE solvers, special functions;
– signal and image processing.
Library overview
The SciPy package contains key algorithms and functions core to
Python's scientific computing capabilities. Available subpackages include:
• constants: physical constants and conversion factors (since
version 0.7.0)
• cluster: hierarchical clustering, vector quantization, K-means
• fftpack: Discrete Fourier Transform algorithms
• integrate: numerical integration routines
• interpolate: interpolation tools
• io: data input and output
• lib: Python wrappers to external libraries
from https://en.wikipedia.org/wiki/SciPy#The_SciPy_Library.2FPackage
Library overview
• linalg: linear algebra routines
• misc: miscellaneous utilities (e.g. image reading/writing)
• ndimage: various functions for multi-dimensional image
processing
• optimize: optimization algorithms including linear
programming
• signal: signal processing tools
• sparse: sparse matrix and related algorithms
• spatial: KD-trees, nearest neighbors, distance functions
• special: special functions
• stats: statistical functions
• weave: tool for writing C/C++ code as Python multiline
strings
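Each subpackage is imported by name from scipy. As a small sketch, the two calls below (picked here only for illustration) use the constants and special subpackages from the list above.

```python
from scipy import constants, special

# Speed of light in vacuum, in m/s.
print(constants.c)

# Gamma function: gamma(n) equals (n-1)! for positive integers n.
print(special.gamma(5))
```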
Linear algebra
Python’s mathematical libraries, NumPy and SciPy, have
extensive tools for numerically solving problems in linear
algebra.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
Basic computations in linear algebra
SciPy has a number of routines for performing basic
operations with matrices.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
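A minimal sketch of such basic operations, assuming a small hypothetical matrix that is easy to check by hand (it is not taken from the slides):

```python
import numpy as np
import scipy.linalg

# Hypothetical 2x2 matrix chosen for easy checking by hand.
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

det_A = scipy.linalg.det(A)   # determinant: 4*6 - 7*2 = 10
inv_A = scipy.linalg.inv(A)   # inverse matrix

print(det_A)
# A times its inverse should be the identity, up to rounding.
print(np.allclose(A @ inv_A, np.eye(2)))
```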
Solving systems of linear equations
Solving systems of equations is nearly as simple as
constructing a coefficient matrix and a column vector.
Suppose you have the following system of linear equations
to solve:
The first task is to recast this set of equations as a matrix
equation of the form Ax=b. In this case, we have:
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
Next we construct the array A and vector b as NumPy arrays and find vector x:
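The slide's own system is not reproduced in this transcript, so the sketch below uses a hypothetical 2x2 system to show the pattern: build A and b as NumPy arrays, then call scipy.linalg.solve.

```python
import numpy as np
import scipy.linalg

# Hypothetical system (not the one from the slides):
#   2*x1 + 1*x2 = 3
#   1*x1 + 3*x2 = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = scipy.linalg.solve(A, b)   # solves A x = b
print(x)                       # approximately [0.8, 1.4]
print(np.allclose(A @ x, b))   # check that the solution satisfies the system
```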
Eigenvalue problems
One of the most common problems in science and
engineering is the eigenvalue problem, which in matrix form is
written as Ax=λx,
where A is a square matrix, x is a column vector, and λ is a
scalar (number). Given the matrix A, the problem is to find the
set of eigenvectors x and their corresponding eigenvalues λ
that solve this equation.
We can solve eigenvalue equations like this using
scipy.linalg.eig. The outputs of this function are an array whose
entries are the eigenvalues and a matrix whose columns are the
corresponding eigenvectors. Let’s return to the matrix we were using
previously and find its eigenvalues and eigenvectors.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
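As a sketch of the call, the example below uses a hypothetical symmetric matrix (not the matrix from the slides) and checks Ax = λx for each eigenpair.

```python
import numpy as np
import scipy.linalg

# Hypothetical symmetric matrix with eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, evec = scipy.linalg.eig(A)   # eigenvalues, eigenvectors (stored as columns)

# scipy.linalg.eig returns the eigenvalues as complex numbers even when they are real.
for i in range(len(lam)):
    v = evec[:, i]                # the i-th eigenvector is the i-th column
    print(lam[i].real, np.allclose(A @ v, lam[i] * v))
```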
Numerical integration
When a function cannot be integrated analytically, or is very
difficult to integrate analytically, one generally turns to
numerical integration methods. SciPy has a number of
routines for performing numerical integration. Most of them
are found in the scipy.integrate library. Some of them are
listed here:
quad - single integration
dblquad - double integration
tplquad - triple integration
fixed_quad - Gaussian quadrature, order n
trapz - trapezoidal rule (named trapezoid in recent SciPy releases)
polyint - analytical polynomial integration (NumPy)
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
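Two of the listed routines can be compared on an integral with a known answer. The sketch below integrates x^2 from 0 to 1 (exact value 1/3); the integrand is chosen here only for illustration.

```python
import numpy as np
import scipy.integrate

# Fixed-order Gaussian quadrature; the integrand must accept an array of x values.
gauss_val, _ = scipy.integrate.fixed_quad(lambda x: x**2, 0.0, 1.0, n=3)

# Analytical antiderivative of the polynomial x**2 (coefficients, highest power first).
P = np.polyint([1.0, 0.0, 0.0])                     # coefficients of x**3 / 3
poly_val = np.polyval(P, 1.0) - np.polyval(P, 0.0)  # evaluate between the limits

print(gauss_val, poly_val)   # both approximately 0.3333
```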
Single integrals
The function quad is the workhorse of SciPy’s integration
functions. Numerical integration is sometimes called
quadrature, hence the name. It is normally the default choice
for performing single integrals of a function f(x) over a given
fixed range from a to b.
The general form of quad is scipy.integrate.quad(f, a, b),
where f is the name of the function to be integrated and a
and b are the lower and upper limits, respectively. The routine
uses adaptive quadrature methods to numerically evaluate
integrals, meaning it successively refines the subintervals
(makes them smaller) until a desired level of numerical
precision is achieved. For the quad routine, this is about 10^{-8}, although it usually does even better.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
As an example, let’s integrate a Gaussian function over the
range from 0 to 1:
$\int_0^1 e^{-x^2}\,dx$
We first need to define the function $f(x) = e^{-x^2}$,
which we do
using a lambda expression, and then we call the function
quad to perform the integration.
The function call scipy.integrate.quad(f, 0, 1) returns two
numbers. The first, 0.7468..., is the value of the
integral. The second, 8.29...e-15, is an estimate of
the absolute error in that value, which we see is
quite small compared to 0.7468.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
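A minimal sketch of this calculation, assuming the Gaussian is exp(-x^2), which reproduces the quoted value 0.7468:

```python
import numpy as np
import scipy.integrate

f = lambda x: np.exp(-x**2)    # Gaussian integrand (assumed form)

value, abs_err = scipy.integrate.quad(f, 0, 1)
print(value)      # approximately 0.7468
print(abs_err)    # estimate of the absolute error
```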
Double integrals
The scipy.integrate function dblquad can be used to
numerically evaluate double integrals of the form
$\int_a^b dx \int_{g(x)}^{h(x)} dy\, f(x, y)$
The general form of dblquad is
scipy.integrate.dblquad(func, a, b, gfun, hfun),
where func is the name of the function to be integrated, a
and b are the lower and upper limits of the x variable,
respectively, and gfun and hfun are the names of the
functions that define the lower and upper limits of the y
variable. Note that dblquad calls func with the inner
variable first, as func(y, x).
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
As an example, let’s perform the double integral
We define the functions f, g, and h using lambda
expressions. Note that even if g and h are constants, as
they may be in many cases, they must be defined as
functions, as we have done here for the lower limit.
Once again, there are two outputs: the first is the value of
the integral (0.5) and the second is its absolute uncertainty
(5.551…e-15).
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
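The integrand from the slide is not reproduced in this transcript. The sketch below assumes the integrand 16xy, with x from 0 to 1/2 and y from 0 to sqrt(1 - 4x^2), because this choice gives the quoted value 0.5; treat it as an illustration rather than the original example.

```python
import numpy as np
import scipy.integrate

# Assumed integrand: 16*x*y. dblquad calls it as f(y, x), inner variable first.
f = lambda y, x: 16 * x * y
g = lambda x: 0.0                      # constant lower limit for y, written as a function
h = lambda x: np.sqrt(1 - 4 * x**2)    # upper limit for y depends on x

# x runs from 0 to 0.5; for each x, y runs from g(x) to h(x).
value, abs_err = scipy.integrate.dblquad(f, 0, 0.5, g, h)
print(value, abs_err)                  # value is approximately 0.5
```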
Solving ODEs
The scipy.integrate library has two powerful routines, ode
and odeint, for numerically solving systems of coupled first
order ordinary differential equations (ODEs). While ode is
more versatile, odeint (ODE integrator) has a simpler Python
interface and works very well for most problems.
A typical problem is to solve a second or higher order ODE
for a given set of initial conditions. Here we illustrate solving
the equation for a driven damped pendulum using odeint.
The equation of motion for the angle Θ that the pendulum
makes with the vertical is given by
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
where t is time, Q is the quality factor, d is the forcing
amplitude, and Ω is the driving frequency of the forcing.
Reduced variables have been used such that the natural
(angular) frequency of oscillation is 1. The ODE is nonlinear
owing to the sin Θ term. Of course, it’s precisely because
there are no general methods for solving nonlinear ODEs
that one employs numerical techniques, so it seems
appropriate that we illustrate the method with a nonlinear
ODE.
The first step is always to transform any nth-order ODE into a
system of n first order ODEs of the form:
$dy_i/dt = f_i(t, y_1, \ldots, y_n)$, for $i = 1, \ldots, n$
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
We also need n initial conditions, one for each variable yi.
Here we have a second order ODE so we will have two
coupled ODEs and two initial conditions.
We start by transforming our second order ODE into two
coupled first order ODEs. The transformation is easily
accomplished by defining a new variable ω=dΘ/dt. With this
definition, we can rewrite our second order ODE as two
coupled first order ODEs:
In this case the functions on the right hand side of the
equations are
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
Note that there are no explicit derivatives on the right hand
side of the functions fi; they are all functions of t and the
various yi, in this case Θ and ω.
The initial conditions specify the values of Θ and ω at t=0.
SciPy’s ODE solver scipy.integrate.odeint has three required
arguments and many optional keyword arguments, of which
we only need one, args, for this example. So in this case,
odeint has the form odeint(func, y0, t, args=())
The first argument func is the name of a Python function that
returns a list of values of the n functions fi(t, y1, ..., yn) at a
given time t. The second argument y0 is an array (or list) of the
values of the initial conditions of (y1, ..., yn). The third argument
is the array of times at which you want odeint to return the
values of (y1, ..., yn). The keyword argument args is a tuple that
is used to pass parameters (besides y0 and t) that are needed
to evaluate func. Our example should make all of this clear.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
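Before the pendulum, the call pattern odeint(func, y0, t, args=()) can be sketched on a deliberately simple equation. The example below assumes the single ODE dy/dt = -k*y, whose exact solution exp(-k*t) makes the result easy to check; the equation and the value of k are chosen here for illustration only.

```python
import numpy as np
from scipy.integrate import odeint

def func(y, t, k):
    """Right-hand side of dy/dt = -k*y; y arrives as an array of length n (here n = 1)."""
    return [-k * y[0]]

y0 = [1.0]                       # initial condition y(0) = 1
t = np.linspace(0.0, 2.0, 21)    # times at which the solution is returned
k = 0.5                          # hypothetical decay rate

soln = odeint(func, y0, t, args=(k,))   # args must be a tuple

print(soln.shape)   # one row per time, one column per variable
print(np.allclose(soln[:, 0], np.exp(-k * t), atol=1e-6))
```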
After having written the nth-order ODE as a system of n first-order
ODEs, the next task is to write the function func. The
function func should have three arguments: (1) the list (or
array) of current y values, (2) the current time t, and (3) a list of any
other parameters params needed to evaluate func. The
function func returns the values of the derivatives dyi/dt = fi(t,
y1, ..., yn) in a list (or array). Lines 5-11 illustrate how to write
func for our example of a driven damped pendulum.
The only other tasks remaining are to define the parameters
needed in the function and bundle them into a list (see line 22
below), and to define the initial conditions and bundle
them into another list (see line 25 below). After defining the
time array in lines 28-30, the only remaining task is to call
odeint with the appropriate arguments and a variable, psoln
in this case, to store the output.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
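The original listing is not part of this transcript. The following is a minimal sketch of the same steps, assuming the right-hand sides dΘ/dt = ω and dω/dt = -ω/Q - sin Θ + d cos(Ωt); the sign convention of the sin Θ term, the initial conditions and the time array are assumptions made here, while Q, d and Omega use the values quoted in the text. Plotting is omitted.

```python
import numpy as np
from scipy.integrate import odeint

def f(y, t, params):
    theta, omega = y          # unpack current values of y
    Q, d, Omega = params      # unpack parameters
    # Assumed right-hand sides of the two coupled first-order ODEs.
    return [omega,
            -omega / Q - np.sin(theta) + d * np.cos(Omega * t)]

# Parameters, bundled into a list
Q, d, Omega = 2.0, 1.5, 0.65
params = [Q, d, Omega]

# Initial conditions (hypothetical): theta = 0 and omega = 0 at t = 0
y0 = [0.0, 0.0]

# Time array (hypothetical range and spacing)
t = np.linspace(0.0, 200.0, 2001)

# Solve; args is a tuple holding the single extra argument params
psoln = odeint(f, y0, t, args=(params,))

theta = psoln[:, 0]   # angle at each time in t
omega = psoln[:, 1]   # angular velocity at each time in t
print(psoln.shape)
```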
The output psoln is a two-dimensional array with one row for
each time in the time array t that was an argument of odeint
and one column for each variable yi. For this
example, the first column psoln[:,0] is the y0 or theta array,
and the second column psoln[:,1] is the y1 or omega array.
The remainder of the code simply plots out the results in
different formats. The resulting plots are shown in the figure
Pendulum trajectory after the code.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
The plots above reveal that for the particular set of input parameters
chosen, Q = 2.0, d = 1.5, and Omega = 0.65, the pendulum trajectories
are chaotic.
from http://www.physics.nyu.edu/pine/pymanual/html/chap9/chap9_scipy.html
Parallel programming with MPI
MPI stands for Message Passing Interface. MPI is a standardized
library interface used for running programs in parallel.
To use MPI in Python, we need to install the mpi4py
module. It provides standard functions, for instance, to get
the rank of a process or to send and receive messages between
nodes in a cluster. It allows the program to be executed
in parallel while passing messages between nodes.
Simple examples
Hello World program for multiple processes:
Five Python processes will be executed (as we specified 5), and they
can all communicate with each other. When each process
runs, it prints ‘hello world from process’ and displays its rank:
MPI.COMM_WORLD is a static reference to a Comm
object, and comm is just a reference to it for our
convenience.
A communicator is a logical unit that defines which
processes are allowed to send and receive messages.
A group of independent processes, called ranks, shares a
communicator.
http://materials.jeremybejarano.com/MPIwithPython/introMPI.html
When an MPI program is run, each process receives the
same code. However, each process is assigned a different
rank. This allows us to embed separate code for each
process into one file. In the following code, all processes are
given the same two numbers. Although there is only
one file, the 3 processes are given completely different
instructions for what to do with them. Process 0 sums them,
process 1 multiplies them, and process 2 takes the
maximum of them:
http://materials.jeremybejarano.com/MPIwithPython/introMPI.html
Note the difference between upper/lower case!
• send/recv: general Python objects, slow
• Send/Recv: contiguous arrays, fast
Computing Pi
https://mpi4py.scipy.org/docs/usrman/tutorial.html
Thank you
