Transcript Slide 1
Numerical Linear Algebra
By: David McQuilling;
Jesus Caban
Deng Li
Solving Linear Equations
2u 2v 8
4u 9v 21
A
x
b
2 2 u 8
4 9 v 21
Gaussian Elimination
Start with the matrix representation of Ax = b by adding the solution vector b
to the matrix A
2 2 8
4 9 21
Now perform Gaussian elimination by row reductions to transform the matrix Ab
into an upper triangular matrix that contains the simplified equations.
Find the first multiplier by dividing the first entry in the second row by the first
entry in the first row. This will allow you to transform the first entry in the
second row to a zero by a row reduction, thus reducing the system of
equations.
Gaussian Elimination (cont.)
Multiply the first row by the multiplier (4 / 2 = 2) and then subtract that from
the second row to obtain the new reduced equation.
4 9 21
2(2 2 8)
0 5 5
2 2 8 2 2 8
4 9 21 0 5 5
A = LU
The matrix resulting from the Gaussian elimination steps is actually U
concatenated with the new right hand side c.
L is obtained from the multipliers used in the Gaussian elimination steps.
L
U
x
c
1 0 2 2 u 8
2 1 0 5 v 5
Solving Ux=c by back substitution
U
x
c
2 2 u 8
0 5 v 5
2u 2v 8
5v 5
2u 2 *1 8
v 1
2u 6
u3
u 3
x
v 1
Solution to Ax=b
The solution vector to Ux=c is actually the
solution to the original problem Ax=b.
A
x
b
2 2 u 8
4 9 v 21
2 2 3 8
4 9 1 21
Gaussian algorithm to find LU
After this algorithm completes, the diagonal and upper triangle part
of A contains U while the lower triangle contains L. This algorithm
runs in O(n3) time.
Gram-Schmidt Factorization
Yields the factorization A=QR
Starts with n linearly independent vectors in A
Constructs n orthonormal vectors
A vector x is orthonormal if xTx = 1 and ||x|| = 1
Any vector is made orthonormal by making it a unit vector in the same
direction, i.e. the length is 1.
2
a 1
2
Dividing a vector x by its length (||x||) creates a unit vector in the direction of x
|| a || 22 12 22 3
2 / 3
a / || a || 1 / 3
2 / 3
Composition of Q and R
Q is the matrix with n orthonormal column vectors, which were
obtained from the n linearly independent vectors of A.
Orthonormal columns implies that QTQ = I
R is obtained by multiplying QT on both sides of A = QR
This equation becomes QTA = QTQR = I R = R
R becomes an upper triangular matrix because later q’s are orthogonal to
earlier a’s, i.e. their dot product is zero. This creates zeroes in the entries
below the main diagonal.
Obtaining Q from A
Start with the first column vector from A and use that as your first
vector q1 for Q (have to make it a unit vector before adding it to Q)
To obtain the second vector q2, subtract from the second vector in A,
a2, its projection along the previous qi vectors.
The projection of one vector onto another is defined as (xTy / xTx ) x
where the vector y is being projected onto the vector x.
Once all qi have been found this way, divide them by their lengths so
that they are unit vectors. Now those are the orthonormal vectors
which make up the columns of Q
Obtaining R from QTA
Once Q has been found, simply multiply A by the
transpose of Q to find R
Overall, the Gram-Schimdt algorithm is O(mn2)
for an mxn matrix A.
Eigenvalues and Eigenvectors
Eigenvectors are unique vectors such that when
multiplied by a matrix A, they do not change their
direction only their magnitude, i.e. Ax=λx
Eigenvalues are the scalar multiples of the eigenvectors,
i.e. the λ’s
Solving for eigenvalues of A
Solving for the eigenvalues involves solving det(A–λI)=0
where det(A) is the determinant of the matrix A
For a 2x2 matrix the determinant is a quadratic equation,
for larger matrices the equation increases in complexity
so lets solve a 2x2
Some interesting properties of eigenvalues are that the
product of them is equal to the det(A) and the sum of
them is equal to the trace of A which is the sum of the
entries on the main diagonal of A
Solving det(A-λI)=0
2
3
A I
1
2
3 2
A
1
2
det(A I ) (3 )(2 ) 2 *1 5 4
2
5 4 0
2
( 4)( 1) 0
1 4
2 1
Solving for the eigenvectors
Once you have the eigenvalues, λ’s, you can solve for
the eigenvectors by solving (A-λI)x=0 for each λi
In our example we have two λ’s, so we need to solve for
two eigenvectors.
Now solve (A-λI)x = 0 for λ=4
( A 4I ) x 0
1 2 x1
1 2 x 0
2
x1 2 x2 0
x1 2 x2 0
1
x1 2 x2 x
2
Solve (A-λI)x=0 for λ=1
( A I )x 0
2 2 x1
1 1 x 0
2
2 x1 2 x2 0
x1 x2 0
1
x1 x2 x
1
Positive definite matrices
Positive definite:
all eigenvalues are positive
All pivots are positive
xTAx > 0 is positive except
at x=0
Applications
Ellipse: find the axes of the
a tilted ellipse
ax 2bxy cy 1
2
Cholesky Factorization
•Cholesky factorization factors an N*N , symmetric, positive-definite
matrix into the product of a lower triangular matrix and its transpose.
•A = LLT
Least Square
•It often happens that Ax = b has no solution
•Too many equations (more equations than unknowns)
e b Ax
•When the length of e is as small as possible, x is a least square solution
•Example: Find the closest straight line to three points (0,6), (1,0), and (2,0)
•Line: b = C + Dt
•We are asking for two numbers C and D that satisfy three equations
•No straight line goes through those points
•Choose the one that minimize the error
Homework
Make a example for A=LU, show me L and
U.
Matrix Multiplication
Matrix Multiplication
Matrix multiplication is commonly used in
graph theory
numerical algorithms
computer graphics
signal processing
Multiplication of large matrices requires a lot of
computation time as its complexity is O(n3), where n is
the dimension of the matrix
Most parallel matrix multiplication algorithms use matrix
decomposition based on the number of processors
available
Matrix Multiplication
1 2
1 2 3 4 4 5
5 6 7 8 7 8
10 11
3
6 70 80 90
9 158 184 210
12
Example (a11):
1
4
a11 1 2 3 4 1*1 2 * 4 3 * 7 4 *10 70
7
10
Sequential Matrix Multiplication
The product C of the two matrices A and B is
defined by
n 1
cij aik bkj
k 0
aij, bij, and cij is the element in ith row and jth column of
the matrix A, B, and C respectively.
Strassen’s Algorithm
The Strassen’s algorithm is a sequential
algorithm reduced the complexity of matrix
multiplication algorithm to O(n2.81).
In this algorithm, n n matrix is divided into 4
n/2 n/2 sub-matrices and multiplication is done
recursively by multiplying n/2 n/2 matrices.
Parallel Matrix Partitioning
Parallel computers can be used to process large
matrices.
In order to process a matrix in parallel, it is necessary to
partition the matrix so that the different partitions can be
mapped to different processors.
Partitions:
Block row-wise/column-wise striping
Cyclic row-wise/colum-wise striping
Block checkerboard striping
Cyclic checkerboard
Striped Partitioning
Matrix is divided into groups of complete rows or
columns, and each processor is assigned one such
group.
Striped Partitioning
Block-Striped: each processor is assigned
contiguous rows or columns
Cyclic-Striped: is when rows or columns are
sequentially assigned to processors in a
wraparound manner.
Block Stripping
(Application examples: (a) is in the case of car front crash,
(b) is in the case of car side crash simulation.)
From: http://www.iitk.ac.in/cc/param/tpday21.html
Checkerboard Partition
The matrix is divided into smaller square or
rectangular blocks or sub-matrices that are
distributed among processors
Checkerboard Partitioning
Block-Checkerboard:
partitioning splits both the
rows and the columns of the matrix, so no processor
is assigned any complete row or column .
Cyclic-Checkerboard: when it wraparound.
Checkerboard Partitioning
From: http://www.iitk.ac.in/cc/param/tpday21.html
Parallel Matrix-Vector Product
Algorithm:
For each processor I
Broadcast X(i)
Compute A(i)*x
A(i) refers to the n by n/p block row that processor i owns
Algorithm uses the formula
Y(i) = Y(i) + A(i)*X = Y(i) + Sj A(i)*X(j)
Parallel Matrix Multiplication
Algorithms:
systolic
algorithm
Cannon's algorithm
Fox and Otto's algorithm
PUMMA (Parallel Universal Matrix Multiplication)
SUMMA (Scalable Universal Matrix Multiplication)
DIMMA (Distribution Independent Matrix
Multiplication)
Systolic Communication
Communication design
where the data exchange
and communication
occurs between the
nearest-neighbors.
From: http://www-unix.mcs.anl.gov
Matrix Decomposition
To implement the matrix multiplication, the A and
B matrices are decomposed into several submatrices.
Four methods of matrix decomposition and
distribution:
one-dimensional decomposition
two-dimensional
square decomposition
general decomposition
scattered decomposition.
One-dimensional Decomposition
The
matrix
is
horizontally
decomposed.
The ith processor
holds ith Asub and Bsub and
communicates them to two neighbor
processors, i.e., to the (i-1)th and
(i+1)th processors.
The first and last processor
communicate with each other as in a
ring topology.
p0 p1 p2 p3 p4 p5 p6 p7
Two-Dimensional Decomposition
Square decomposition: The twodimensional square decomposition
Matrix is decomposed into square
processor template.
General decomposition: General
Two-Dimensional Decomposition
allow having 2*3, 4*3, etc.
configurations.
Two-Dimensional Decomposition
Scattered decomposition: the
matrix is divided into several sets
of blocks. Each set of blocks
contains as many elements as
the number of processors, and
every element in a set of blocks is
scattered according to the twodimensional processor templates.
Applications
Computer Graphics
In computer graphics all the transformation are
represented as matrices and a set of transformations as
a matrix multiplication.
X Rotation
Transformations:
Rotation
Translation
Scaling
Shearing
© Pixar
2D/3D Mesh
•Laplacian matrix L(G) of the graph G
•If you take the eigenvalues and
eigenvectors of L(G), you can get
interesting properties as vibration of
the graph.
From: http://www.cc.gatech.edu/
Computer Vision
In computer vision we can calibrate a camera to determine the relationship
between what appears on the image (or retinal) plane and where it is
located in the 3D world by using and representing the parameters as
matrices and vectors.
Software for Numerical
Linear Algebra
LAPACK
LAPACK: Routines for solving systems of
simultaneous linear equations, least-squares
solutions of linear systems of equations,
eigenvalue problems, and singular value
problems.
Link:
http://www.netlib.org/lapack/
LINPACK
LINPACK is a collection of Fortran subroutines
that analyze and solve linear equations and
linear least-squares problems. The package
solves linear systems whose matrices are
general, banded, symmetric indefinite,
symmetric positive definite, triangular, and tridiagonal square.
Link:
http://www.netlib.org/linpack/
ATLAS
ATLAS (Automatically Tuned Linear Algebra
Software) provides portable performance and
highly optimized Linear Algebra kernels for
arbitrary cache-based architectures.
Link:
https://sourceforge.net/projects/math-atlas/
MathWorks
MATLAB: provide
engineers, scientists,
mathematicians, and
educators with an
environment for technical
computing applications
NAG
NAG Numerical Library:
solve complex problems in
areas such as research,
engineering, life and earth
sciences, financial
analysis and data mining .
NMath Matrix
NMath Matrix is an advanced matrix manipulation
library. Full-featured structured sparse matrix
classes, including triangular, symmetric, Hermitian,
banded, tridiagonal, symmetric banded, and
Hermitian banded.
Nastran
program participates in NASA-sponsored project
to develop a unified approach to computerized
structural analysis. The program became known
as NASTRAN (NASA Structural Analysis
Program).
NX Nastran,
http://www.ugs.com/index.shtml
MSC Nastran
http://www.mscsoftware.com/
Homework
Collect more information about LAPACK,
LINPACK, ATLAS, MathWorks,NAG,
NMath Matrix and Nastran from their
website then choose one which you feel
most interesting.