Maximum entropy methods (MAXENT)


An Information-theoretic Tool for
Property Prediction Of Random
Microstructures
Sethuraman Sankaran and Nicholas Zabaras
Materials Process Design and Control Laboratory
Sibley School of Mechanical and Aerospace Engineering
188 Frank H. T. Rhodes Hall
Cornell University
Ithaca, NY 14853-3801
Email: [email protected], [email protected]
URL: http://mpdc.mae.cornell.edu/
CORNELL
U N I V E R S I T Y
Materials Process Design and Control Laboratory
RESEARCH SPONSORS
U.S. AIR FORCE PARTNERS
Materials Process Design Branch, AFRL
Computational Mathematics Program, AFOSR
ARMY RESEARCH OFFICE
Mechanical Behavior of Materials Program
NATIONAL SCIENCE FOUNDATION (NSF)
Design and Integration Engineering Program
CORNELL THEORY CENTER
An overview

Mathematical representation of random microstructures

Extraction of higher order features from limited microstructural
information : the MAXENT approach

MAXENT optimization schemes

Evaluation of homogenized elastic properties from microstructures

Effect of varying information content on property statistics

Numerical examples

Summary and future work
Idea Behind Information Theoretic Approach
Basic Questions:
1. Microstructures are realizations of a random field. Is there a principle by which the underlying pdf itself can be obtained?
2. If so, how can the known information about the microstructure be incorporated in the solution?
3. How do we obtain actual statistics of properties of the microstructure characterized at the macro scale?
Information Theory and Statistical Mechanics: linkage?
Information theory: rigorously quantifying and modeling uncertainty, linking scales using criteria derived from information theory, and using information-theoretic tools to predict parameters in the face of incomplete information.
Representation of random microstructures
Indicator functions are used to represent the microstructure at different regions in the physical domain.
• Indicator functions take values over a binary alphabet.
• Statistical features of the microstructure are mathematically tractable in terms of expected values over indicator functions.

Two-phase material: $I(x) = 1$ if $x$ is in phase 1, $I(x) = 0$ if $x$ is in phase 2.
n-phase material: $I_i(x) = 1$ if $x$ is in phase $i$, $I_i(x) = 0$ otherwise.

Define $I_i$ as the set comprising $(I_i(x_1), I_i(x_2), \dots, I_i(x_n))$. $I_i$ represents a random field of indicator functions over the domain. Microstructures are hierarchically characterized over a set of random variables of this field.
Defining correlation vectors using indicator functions
Two-point probability functions
Lineal Path Functions
n-point probability functions
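The formulas for these descriptors did not survive in the transcript. As a minimal sketch, assuming the usual definitions (the two-point probability function as the expectation $E[I(x)\,I(x+r)]$, and the lineal path function as the probability that a segment of length $r$ lies wholly in one phase), both can be estimated on a periodic one-dimensional binary image. The function names and the sample image below are illustrative and not taken from the slides.

```python
import numpy as np

def two_point_probability(img, max_r):
    """Estimate S2(r) = E[I(x) I(x+r)] on a periodic 1D binary image."""
    img = np.asarray(img, dtype=float)
    return np.array([np.mean(img * np.roll(img, -r)) for r in range(max_r + 1)])

def lineal_path(img, max_r):
    """Estimate L(r): fraction of positions x where pixels x..x+r are all phase 1."""
    img = np.asarray(img, dtype=float)
    window = np.ones_like(img)
    values = []
    for r in range(max_r + 1):
        # window[x] stays 1 only if every pixel from x to x+r is phase 1
        window = window * np.roll(img, -r)
        values.append(window.mean())
    return np.array(values)

# Illustrative image (assumption): 1 marks phase 1, 0 marks phase 2
img = np.array([1, 1, 0, 1, 1, 1, 0, 0])
S2 = two_point_probability(img, 3)
L = lineal_path(img, 3)
```

At $r = 0$ both functions reduce to the volume fraction of phase 1, and $L(r) \le S_2(r)$ because a segment lying wholly in phase 1 also has both end points in phase 1.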
Microstructure Reconstruction Schemes
Reconstruction of microstructures
Correlation features of the desired microstructures are provided.
• Aim to reconstruct microstructures that satisfy these ensemble statistical properties.
• Ill-posed problem: many distributions satisfy the given ensemble properties.
Examples: Pb-Sn microstructures; high strength steel microstructures obtained by thermal processing; media with short range interactions.
Current schemes for microstructure reconstruction
D. Cule and S. Torquato ’99, Reconstruction of porous media using stochastic optimization
C. Manwart, S. Torquato and R. Hilfer ’00, Reconstruction of sandstone structures using stochastic optimization
N. Zabaras et al. ’05, Reconstruction of microstructures using SVMs
T.C. Baroni et al. ’02, Reconstruction of microstructures using contrast imaging techniques
A.P. Roberts ’97, Reconstruction of porous media using image mapping techniques from 2D planar images
Stochastic Optimization Procedure
Input: given statistical correlation or lineal path functions
Obtain: microstructures that satisfy the given properties
• Start from a random configuration over the specified problem domain such that the volume fraction information is satisfied.
• Randomly choose two locations (pixels) and define a move by interchanging the intensities of the two pixels.
• If the error norm, defined as the deviation of the correlation features from the target features, decreases, accept the move; otherwise reject it.
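The procedure above can be sketched for a periodic one-dimensional two-phase image. This is a minimal sketch under stated assumptions: the squared-error norm, the number of moves, the periodic estimate of the two-point function, and the reference pattern used to build a target are all illustrative choices, not details from the cited papers.

```python
import numpy as np

def s2(img, max_r):
    """Two-point probability of phase 1 on a periodic 1D image, r = 0..max_r."""
    return np.array([np.mean(img * np.roll(img, -r)) for r in range(max_r + 1)])

def reconstruct(target, n_pixels, n_phase1, n_moves=3000, seed=0):
    """Greedy pixel-swap reconstruction; returns image, initial and final error."""
    rng = np.random.default_rng(seed)
    max_r = len(target) - 1
    # Random start that already satisfies the volume fraction
    img = np.zeros(n_pixels)
    img[:n_phase1] = 1.0
    rng.shuffle(img)
    initial_err = err = np.sum((s2(img, max_r) - target) ** 2)
    for _ in range(n_moves):
        i, j = rng.integers(0, n_pixels, size=2)
        if img[i] == img[j]:
            continue  # swapping equal pixels changes nothing
        img[i], img[j] = img[j], img[i]  # trial move: interchange intensities
        new_err = np.sum((s2(img, max_r) - target) ** 2)
        if new_err < err:
            err = new_err  # accept: error norm decreased
        else:
            img[i], img[j] = img[j], img[i]  # reject: undo the swap
    return img, initial_err, err

# Hypothetical target: S2 of a striped reference image
reference = np.tile([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0], 8)
target = s2(reference, 8)
recon, err0, err1 = reconstruct(target, reference.size, int(reference.sum()))
```

Because every move is a swap, the volume fraction is conserved throughout, and the error can only stay the same or decrease.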
A MAXENT viewpoint
Information Theoretic Scheme: the MAXENT principle
Input: given statistical correlation or lineal path functions
Obtain: microstructures that satisfy the given properties
• Constraints are viewed as expectations of features over a random field. The problem is viewed as finding the distribution whose ensemble properties match those that are given.
• Since the problem is ill-posed, we choose the distribution that has the maximum entropy.
• Additional statistical information is available using this scheme.
The MAXENT Principle
E.T. Jaynes 1957
The principle of maximum entropy (MAXENT) states that amongst the probability
distributions that satisfy our incomplete information about the system, the probability
distribution that maximizes entropy is the least-biased estimate that can be made. It
agrees with everything that is known but carefully avoids anything that is unknown.
A MAXENT viewpoint
• Trivial case: no information is available about the microstructure. From MAXENT, the equiprobable case is the case with maximum entropy for an unconstrained problem. This agrees with intuition as to the most unbiased case.
• Information about volume fraction given: the MAXENT distribution is one wherein we sample from the volume fraction distribution itself at all material points.
• Higher order information provided: correlation between material points is to be taken into account. The result is not trivial and needs to be numerically computed.
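The first two cases can be written out in closed form. The block below is a sketch for a two-phase image with $N$ material points $x_1, \dots, x_N$ and phase-1 volume fraction $\phi$; these symbols are assumptions introduced here, not notation from the slides.

```latex
% No information: maximize H(p) = -\sum_I p(I) \log p(I) subject only to \sum_I p(I) = 1
p(I) = \frac{1}{2^{N}}, \qquad H = N \log 2 .

% Volume fraction given at every material point: E_p[I(x_k)] = \phi, \; k = 1, \dots, N
p(I) = \prod_{k=1}^{N} \phi^{\,I(x_k)} \, (1-\phi)^{\,1 - I(x_k)}, \qquad
H = -N \left[ \phi \log \phi + (1-\phi) \log (1-\phi) \right] .
```

In the second case the distribution factorizes over material points, which is why sampling reduces to independent draws from the volume fraction distribution at each point.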
MAXENT as a feature matching tool
• D. Pietra et al. ’96, MAXENT principle for language processing: features of language are extracted and the MAXENT principle is used to develop a language translator.
• Zhu et al. ’98, MAXENT principle for texture processing: texture features in the form of histograms are extracted from images and the MAXENT principle is used to reconstruct texture images.
• Sobczyk ’03: MAXENT used for obtaining distributions of grain sizes from macro constraints in the form of expected grain size.
• Koutsourelakis ’05: MAXENT for generation of random media. Correlation features of random media are used as constraints to generate samples of random media.
MAXENT for microstructure reconstruction
• MAXENT is essentially a way of generating a PDF on a hypothesis space which, given a measure of entropy, is guaranteed to incorporate only known constraints.
• MAXENT cannot be derived from Bayes' theorem. It is fundamentally different, as Bayes' theorem concerns itself with inferring the a-posteriori probability once the likelihood and a-priori probability are known, while MAXENT is a guiding principle to construct the a-priori PDF.
• We associate the PDF with a microstructure image and generate samples of the image.
• MAXENT produces images with features (information) that are consistent with the known constraints. Another way of stating this is that MAXENT produces the most uniform distribution consistent with the data.
MAXENT optimization schemes
MAXENT as an optimization problem
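Find
$$p^{*} = \arg\max_{p} \; -\sum_{I} p(I) \log p(I)$$
Subject to the feature constraints
$$\sum_{I} p(I)\, f_i(I) = \langle f_i \rangle, \quad i = 1, \dots, m, \qquad \sum_{I} p(I) = 1,$$
where $f_i(I)$ are the features of image $I$ and $\langle f_i \rangle$ are the given feature values.
Lagrange multiplier optimization (the sign of $\lambda_i$ is a matter of convention):
$$p(I; \lambda) = \frac{1}{Z(\lambda)} \exp\Big(\sum_{i} \lambda_i f_i(I)\Big)$$
Partition function:
$$Z(\lambda) = \sum_{I} \exp\Big(\sum_{i} \lambda_i f_i(I)\Big)$$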
Equivalent log-linear model
Equivalent log-likelihood problem
Find $\lambda$ that maximizes
$$L(\lambda) = \sum_{i} \lambda_i \langle f_i \rangle - \log Z(\lambda),$$
where $\langle f_i \rangle$ are the given feature values and $Z(\lambda) = \sum_{I} \exp\big(\sum_i \lambda_i f_i(I)\big)$ is the partition function.
Kuhn-Tucker theorem: the $\lambda$ that maximizes the dual function $L$ also maximizes the system entropy and satisfies the constraints posed by the problem.

A comparison
Direct models | Log-linear models
Concave | Concave
Constrained (simplex) | Unconstrained
“Count and normalize” (closed form solution) | Iterative methods
Optimization Schemes
• Generalized Iterative Scaling
• Improved Iterative Scaling
• Gradient Ascent
• Newton/Quasi-Newton Methods
  – Conjugate Gradient
  – BFGS
  – …
Gradient-based procedure:
• Start from $\lambda$ equal to 0. This is equivalent to a uniform distribution over the sample space.
• Evaluate the gradient at this point.
• Perform a line search along a direction based on the gradient information.
• Evaluate the gradient at the next point and continue the procedure until it is within the tolerance limit.
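The iteration above can be illustrated on a system small enough to enumerate exactly. This is a minimal sketch, assuming a log-linear model $p(I;\lambda) \propto \exp(\lambda \cdot f(I))$ with dual gradient equal to target features minus model expectations. It swaps in a fixed step size for the line search and exact enumeration for sampling. The two features (volume fraction and nearest-neighbour two-point probability), the 8-pixel periodic image, and the target values are hypothetical.

```python
import itertools
import numpy as np

N = 8  # pixels in a periodic 1D binary image (assumption: small enough to enumerate)
states = np.array(list(itertools.product([0, 1], repeat=N)), dtype=float)

# Feature vector of every state: [volume fraction, S2 at distance 1]
features = np.column_stack([
    states.mean(axis=1),
    (states * np.roll(states, -1, axis=1)).mean(axis=1),
])
target = np.array([0.5, 0.3])  # hypothetical target feature values

def model_expectation(lam):
    """Exact E_p[f] under p(I) proportional to exp(lam . f(I))."""
    logits = features @ lam
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    p /= p.sum()
    return p @ features, p

lam = np.zeros(2)   # lambda = 0 is the uniform distribution
step = 2.0          # fixed step (assumption); the slides use a line search
for iteration in range(20000):
    expected, p = model_expectation(lam)
    grad = target - expected          # gradient of the dual function
    if np.max(np.abs(grad)) < 1e-10:  # tolerance check
        break
    lam += step * grad

expected, p = model_expectation(lam)
entropy_bits = -np.sum(p * np.log2(p))
```

At convergence the model expectations match the targets, and the entropy is below the $N$ bits of the uniform starting distribution because the constraints carry information.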
Gradient Evaluation
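• Objective function and its gradients, with $\langle f_i \rangle$ the given feature values and $Z(\lambda)$ the partition function of the distribution $p(I;\lambda)$:
$$L(\lambda) = \sum_i \lambda_i \langle f_i \rangle - \log Z(\lambda) \quad \text{(stochastic function)}$$
$$\frac{\partial L}{\partial \lambda_i} = \langle f_i \rangle - E_{p(\cdot;\lambda)}[f_i(I)] \quad \text{(stochastic function)}$$
• Infeasible to compute at all points in one conjugate gradient iteration.
• Use sampling techniques to sample from the distribution evaluated at the previous point (Gibbs sampler).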
Sampling techniques
• Sample from an exponential (log-linear) distribution using the Gibbs algorithm:
  – Choose a random point.
  – Evaluate the effective “energy” for the various phases at that point, using the updation algorithm to estimate the “energy”.
  – Draw a sample from the resulting distribution and replace the pixel value at the material point.
  – Continue the procedure until a sufficiently large number of samples are drawn.
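The steps above can be sketched for a periodic one-dimensional two-phase image. This is a sketch under assumptions: the distribution is taken as $p(I) \propto \exp(\lambda \cdot f(I))$, the two features are the volume fraction and the nearest-neighbour two-point probability, and the features are recomputed in full for each candidate phase rather than with an incremental update. All names and parameter values are hypothetical.

```python
import numpy as np

def features(img):
    """Illustrative features: [volume fraction, S2 at distance 1] (periodic)."""
    return np.array([img.mean(), np.mean(img * np.roll(img, -1))])

def gibbs_sample(lam, n_pixels, n_sweeps, seed=0):
    """Single-site Gibbs sampler for p(I) proportional to exp(lam . f(I))."""
    rng = np.random.default_rng(seed)
    img = rng.integers(0, 2, size=n_pixels).astype(float)
    samples = []
    for _ in range(n_sweeps):
        for _ in range(n_pixels):
            k = rng.integers(n_pixels)  # choose a random point
            log_w = np.empty(2)
            for phase in (0, 1):
                img[k] = phase
                log_w[phase] = lam @ features(img)  # negative "energy" of each phase
            # Conditional probability that pixel k is phase 1 given all other pixels
            p1 = 1.0 / (1.0 + np.exp(log_w[0] - log_w[1]))
            img[k] = float(rng.random() < p1)  # draw and replace the pixel value
        samples.append(img.copy())
    return np.array(samples)

# With only the volume fraction feature active, pixels are independent and
# each is phase 1 with probability 3 / (1 + 3) = 0.75.
n_pixels = 32
lam = np.array([n_pixels * np.log(3.0), 0.0])
samples = gibbs_sample(lam, n_pixels, n_sweeps=200)
mean_fraction = samples[20:].mean()  # discard the first sweeps as burn-in
```

Averaging features over such samples gives the model expectations needed for the gradient; the full recomputation of features at each pixel is the cost that an incremental update scheme avoids.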
Updation Scheme
A scheme to update correlation function of an image when the
phase of a single pixel is changed
[Figure: two panels, one for the two-point correlation function and one for the lineal path function, each showing the material point whose intensity is changed and its zone of influence of radius r, the region where the correlation function is affected (Rozman and Utz ’01).]
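For the two-point function the idea can be sketched as follows, assuming a periodic one-dimensional image and the estimate $S_2(r) = \frac{1}{N}\sum_x I(x)\,I(x+r)$ with $r$ smaller than the image length. Changing pixel $k$ only touches the products that contain $I(k)$, so each $S_2(r)$ is corrected using the two neighbours at distance $r$. The function names are hypothetical and the lineal path update is not covered here.

```python
import numpy as np

def s2_full(img, max_r):
    """Reference: recompute S2(r), r = 0..max_r, from scratch (periodic image)."""
    return np.array([np.mean(img * np.roll(img, -r)) for r in range(max_r + 1)])

def s2_update(s2, img, k, new_value):
    """Update S2 in O(max_r) after pixel k changes; modifies img in place."""
    n = img.size
    old_value = img[k]
    delta = new_value - old_value
    r = np.arange(1, s2.size)
    updated = s2.copy()
    # r = 0: only the term I(k)^2 changes
    updated[0] += (new_value ** 2 - old_value ** 2) / n
    # r >= 1: only the terms I(k) I(k+r) and I(k-r) I(k) change
    updated[1:] += delta * (img[(k + r) % n] + img[(k - r) % n]) / n
    img[k] = new_value
    return updated

rng = np.random.default_rng(1)
img = rng.integers(0, 2, size=50).astype(float)
max_r = 10
s2 = s2_full(img, max_r)
for _ in range(20):
    k = rng.integers(img.size)
    s2 = s2_update(s2, img, k, 1.0 - img[k])  # flip the phase of pixel k
```

After any number of flips the incrementally updated values agree with a full recomputation, at a cost per flip that depends on the zone of influence rather than on the image size.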
Line search and conjugate directions
• Brent’s parabolic interpolation is used for the line search.
• Stabilization in the conjugate gradient machinery (Schraudolph ’02): add a correction (stabilization) term so that, as the line search becomes increasingly inaccurate, its effect on the conjugate direction is also subdued.
Optimization Schemes
[Figures: convergence analysis with stabilization; convergence analysis without stabilization]
Noise in the function evaluation increases as the step size to the next minimum increases. Stabilization ensures that the impact on the next evaluation is reduced.
Entropy variation during MAXENT algorithmic scheme
[Figure: entropy (bits), axis 0 to 100, versus iteration, axis 0 to 200]
Evaluation of effective
elastic properties
Effective elastic property of microstructures
Variational principle: subject to applied loads and other boundary conditions, minimize the energy stored in the microstructure.
Pixel-based mesh with a single phase inside each pixel (E. Garboczi, NIST ’98). Each pixel is attributed the property of that particular phase.
Homogenization: the effective homogenized property of the microstructure is obtained by equating the energy of the microstructure with that of a specimen with uniform properties.
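The energy-equivalence idea can be illustrated in the simplest setting, a one-dimensional bar of equal-length pixels loaded in series. This is a simplification chosen for illustration, not the pixel-based finite element computation used in the slides. In this setting the energy-minimizing state has uniform stress, and the phase moduli below are hypothetical values.

```python
import numpy as np

def effective_modulus_1d(img, modulus_phase0, modulus_phase1, stress=1.0):
    """Effective Young's modulus of a 1D series bar by equating strain energies."""
    img = np.asarray(img)
    moduli = np.where(img == 1, modulus_phase1, modulus_phase0)
    # Strain energy per unit cross-section with unit pixel length:
    # each pixel stores 0.5 * stress^2 / E_k
    energy = 0.5 * stress ** 2 * np.sum(1.0 / moduli)
    # A uniform bar of the same length stores 0.5 * stress^2 * n / E_eff;
    # equating the two energies gives E_eff
    return 0.5 * stress ** 2 * img.size / energy

# Hypothetical two-phase bar: 100 GPa and 300 GPa phases in equal amounts
E_eff = effective_modulus_1d([1, 0, 1, 0], 100.0, 300.0)
E_uniform = effective_modulus_1d([1, 1, 1, 1], 100.0, 300.0)
```

For this series arrangement the result is the harmonic mean of the pixel moduli, here 150 GPa, and a single-phase bar recovers the modulus of that phase.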
Consolidated Algorithm
Inputs: experimental images or analytical correlation functions.
1. Extract features and rephrase them as mathematical constraints.
2. Pose as a MAXENT problem and use gradient-based schemes for obtaining the solution, using the Gibbs sampling algorithm for sampling from the underlying distribution.
3. Generate samples and interrogate them using FEM.
4. Obtain property statistics and use them for further analysis.
Numerical Examples
Example 1
Reconstruction of 1d hard disks
Reconstruct one-dimensional hard disk microstructures based on two different kinds of information: (a) two-point correlation functions and (b) two-point correlation and lineal path functions. Obtain elastic property statistics and compare them for the two schemes.
Input: analytical two-point and lineal path functions (Torquato et al. ’99)
Microstructures based on two-point correlation function
[Figure: samples from the MAXENT distribution and histogram of the number of samples versus effective Young's modulus (GPa), 150 to 250 GPa]
Microstructures based on two-point and lineal path function
[Figure: samples from the MAXENT distribution and histogram of the number of samples versus effective Young's modulus (GPa), 140 to 260 GPa]
Comparison of property statistics between two schemes
[Figure: side-by-side histograms of the number of samples versus effective Young's modulus (GPa) for the two schemes, with axis ranges 150 to 250 GPa and 140 to 260 GPa]
Example 2
Porous Media with short range order
Generate microstructures of porous media that exhibit short range order of a given specific structure ($S_2$ is the two-point correlation function; $k$ and $r_o$ depend on the characteristic length scales chosen).
Input: analytical two-point correlation functions (Torquato et al. ’99)
Problem parameters:
• correlation length $r_o = 32$
• oscillation parameter $a_o = 8$, with $k = 2\pi / a_o$
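The functional form of $S_2$ is not recoverable from the transcript. As a hedged sketch, one damped-oscillation form commonly used in the stochastic reconstruction literature is $S_2(r) = \phi^2 + \phi(1-\phi)\, e^{-r/r_o} \sin(kr)/(kr)$; treating it as the target here is an assumption, and the volume fraction $\phi$ below is a hypothetical value that the slide does not give.

```python
import numpy as np

def s2_short_range(r, phi, r0=32.0, a0=8.0):
    """Assumed damped-oscillation two-point function with k = 2*pi/a0."""
    r = np.asarray(r, dtype=float)
    k = 2.0 * np.pi / a0
    # np.sinc(x) = sin(pi x) / (pi x), so sin(k r) / (k r) = np.sinc(k r / pi);
    # this also handles r = 0 without a division by zero
    oscillation = np.sinc(k * r / np.pi)
    return phi ** 2 + phi * (1.0 - phi) * np.exp(-r / r0) * oscillation

phi = 0.5  # hypothetical volume fraction
r = np.arange(0, 129)
target_s2 = s2_short_range(r, phi)
```

Under this form $S_2(0)$ equals the volume fraction, $S_2$ decays to $\phi^2$ at large separations, and the oscillation crosses $\phi^2$ at multiples of $a_o/2$.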
Property statistics for media with short range order
[Figure: samples from the MAXENT distribution]
Example 3
Reconstruction using heterogeneous graded materials
Heterogeneous Graded Materials
Given a description of the gradation of the phase distribution in a graded material, reconstruct microstructures compatible with the given information and estimate statistics of microstructure properties from this set.
Input: analytical volume fraction information throughout the sample (Koutsourelakis ’04)
Applications:
• Tools with desirable properties at tips.
• Artificial joints for implants in humans.
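When only pointwise volume fraction information is given, the MAXENT distribution samples each material point independently from its local volume fraction. The sketch below assumes that "bilinear" means the volume fraction is interpolated bilinearly between four corner values on a unit square; that reading, the grid size, and the corner values are all assumptions made for illustration.

```python
import numpy as np

def bilinear_volume_fraction(nx, ny, corners):
    """Volume fraction field interpolated from (phi00, phi10, phi01, phi11)."""
    x = np.linspace(0.0, 1.0, nx)[None, :]  # column coordinate
    y = np.linspace(0.0, 1.0, ny)[:, None]  # row coordinate
    p00, p10, p01, p11 = corners
    return (p00 * (1 - x) * (1 - y) + p10 * x * (1 - y)
            + p01 * (1 - x) * y + p11 * x * y)

def sample_graded(phi, n_samples, seed=0):
    """Draw images whose pixels are independent Bernoulli(phi) variables."""
    rng = np.random.default_rng(seed)
    return (rng.random((n_samples,) + phi.shape) < phi).astype(int)

# Hypothetical corner volume fractions of phase 1
phi = bilinear_volume_fraction(16, 16, (0.1, 0.9, 0.3, 0.7))
samples = sample_graded(phi, 2000)
empirical = samples.mean(axis=0)
```

Averaged over many samples, the empirical phase fraction at each pixel approaches the prescribed field, and each sample can then be passed to a property solver to build up property statistics.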
Samples of bilinearly graded heterogeneous materials
at smooth resolution levels
Elastic properties of bilinear graded materials
[Figure: histogram of effective elastic properties for a tungsten-silver bilinear graded material at 25 °C: number of samples versus effective Young's modulus (GPa), 203 to 294 GPa]
Conclusions and future work
Conclusions

Microstructures were characterized stochastically, and a scheme for obtaining samples based on MAXENT, with a time-efficient update scheme, was implemented.

Gradient-based schemes and the properties of the system entropy were analyzed in detail.

Elastic properties were obtained using FEM, and property statistics were developed.

Schemes were discussed for numerical microstructures, and the effect of incorporating higher order information on property statistics was studied.
Future Work

Extend the method to polycrystalline materials, incorporating information in the form of orientation distribution functions (ODFs).

Couple the scheme with pixel based methods for obtaining plastic
properties.

Extend the method to physical deformation processes taking into
account the evolution of microstructure.
References
1. E.T. Jaynes, Information Theory and Statistical Mechanics I, Physical Review 106(4) (1957) 620-630.
2. D. Cule and S. Torquato, Generating random media from limited microstructural information via stochastic optimization, Journal of Applied Physics 86(6) (1999) 3428-3437.
3. P.S. Koutsourelakis, A general framework for simulating random multi-phase media, NSF Workshop-Probability and Materials: From Nano to Macro scale (2005).
4. K. Sobczyk, Reconstruction of random material microstructures: patterns of Maximum Entropy, Probabilistic Engineering Mechanics 18 (2003) 279-287.
5. S.C. Zhu et al., Filters, Random Fields and Maximum Entropy (FRAME): Towards a Unified Theory for Texture Modeling, IJCV 27 (1998) 107-126.
6. A. Berger et al., A maximum entropy approach to natural language processing, Computational Linguistics 22 (1996) 39-71.