Cellular Automata Generalized to an Inferential System

Download Report

Transcript Cellular Automata Generalized to an Inferential System

Cellular Automata Generalized To
An Inferential System
27th International Workshop on Bayesian Inference and
Maximum Entropy Methods in Science and Engineering,
Saratoga Springs, New York, 11 July 2007
David J. Blower
Cogon Systems
Pensacola FL
Motivation
Why is it
impossible to
predict the
behaviour of
a cellular
automaton?
The Motivating Question
But isn’t the quintessential feature
of probability theory and
inferential systems the ability to
predict future events?
Proposed Solution
Jaynes used
probability theory
to generalize
classical logic
functions.
Treat CA from an
inferential and
informational
point of view.
Why Examine Cellular Automata?
Cellular Automata are a stand-in for
any sufficiently detailed complicated
ontological explanation for how the
world works.
Issues in Order Addressed
Boolean Algebra
Logic Functions
Cellular Automata
Why Start with Boolean Algebra?
The following few slides on
Boolean Algebra are solely to set
the stage for analogous operations
with classical logic functions (and
cellular automata).
Some Basic Boolean Questions
1.How are Boolean functions defined?
2.How are syntactically correct Boolean
formulas produced?
3.What is a good canonical expression for
Boolean functions?
Why Is It Helpful?
1.Useful for both logic functions and cellular
automata
2.Axioms of Boolean Algebra used in
Bayes’s Theorem
3.Canonical expressions substituted for
complicated logic and CA rules
Formal Rules, Boolean Algebra,
and Probability Theory
Boolean Algebra on a finite carrier
set is a “closed” system, that is,
there is always an answer.
Moreover, neither numbers nor
arithmetic operations are required
to find that answer.
Perfect choice for discussing formal manipulation
rules of probability theory where the actual
numerical assignments are not the issue.
Boolean Algebra
The carrier set
Special Elements
Characterized by
the quintuple
Binary operators
Function definition: A mapping from the set of ordered
pairs of the carrier to an element in the carrier set.
Boolean Algebra
An example of a carrier set with four elements
All 16 ordered pairs from the carrier set
A mapping from an element of B x B into an element of B
Example of Boolean Function
Function Table
Boolean Formula
Substitute specific arguments
Boole’s Expansion Theorem
Any Boolean function can be expanded in the
following manner. Applying this theorem in a
recursive manner yields the disjunctive
normal form (DNF).
Disjunctive Normal Form
For example, here is the expansion of any Boolean
function
f(x,y)
with n = 2 arguments.
These are called the discriminants.
Disjunctive Normal Form
Repeating generic expansion from previous slide
Calculate discriminants and then substitute
Issues in Order Addressed
Boolean Algebra
Logic Functions
Cellular Automata
Logic Functions
A special case of Boolean Algebra
Different Notation
Functions with two arguments written in
generic Boolean Algebra notation and
then in Classical Logic notation.
Boole’s Expansion Theorem
Using Boole’s expansion theorem, the DNF
for any logic function now looks like,
The DNF for Logic Functions
Since the discriminants always take a
functional assignment of either TRUE or
FALSE, a typical expansion might look
something like this,
The logic function that returns B when arguments A and B given. f10 (A,B) on next slide.
All 16 Logic Functions
Classical Syllogism
A is TRUE
A implies B
Therefore, B is TRUE
Classical Syllogism
Recapitulate Jaynes’s demonstration generalizing
classical logic with probability theory (but here I emphasize
the Boolean Algebra aspects). *
* Chapter 2, pp. 35--36
Generalizing Classical Logic
Bayes’s Theorem
Substitute DNF expansion of Implication function A  B
Boolean operations reduce above to
Modus Ponens
Substitute the shortened DNF expansion for the
implication function in Bayes’s Theorem
Bayes’s Theorem now looks like this
Modus Ponens
Boolean operations on denominator
Boolean operations on numerator
Bayes’s Theorem
Different Approach
Now solve modus ponens using a
joint probability table.
The answer should be the same.
Joint Probability Table
Two statements A and B each take on only two values. There
are four cells in the joint probability table. The model assigning
these numerical values is the implication function AB.
Joint Probability Table
Cell 1
Cell 1
Cell 3
The same answer as before.
The model Mk assigns legitimate numerical
values to the joint statements in the four cells
of the jpt. The model is the implication
function.
Probability Theory Generalizes
Classical Logic
Here is an “invalid’’ logical argument, but one that is easily solved
using probability theory in exactly the same manner as before.
Assume the same logic function, but now B is TRUE.
What is the impact on A?
The “Invalid” Logical Argument
Solved by Probability Theory
Use the numerical assignments from the joint probability table.
Probability theory as a generalization of logic returns an answer, while
classical logic refuses to address the issue calling it “undecidable.”
Placement of 0s in JPT
Cell 3 indexes joint statement: A is TRUE and B is FALSE.
f13 (A, B) has functional assignment of FALSE if A is TRUE and B is FALSE
by very definition of  operator.
Therefore, cell 3 MUST HAVE a numerical assignment of 0
under this model.
Issues in Order Addressed
Boolean Algebra
Logic Functions
Cellular Automata
Elementary One-Dimensional
Cellular Automata
Wolfram’s famous example of a CA proven to
be a Universal Turing Machine
Elementary One-Dimensional
Cellular Automata
First few steps of a CA following Rule 110
Many Steps of CA Following
Rule 110
Interacting Localized Structures.
They compute anything that can
be computed!!
Our stand-in for a complicated detailed ontological model of the world.
Logic Functions with Three Variables
But Rule 110 is simply a logic function with three arguments instead
of the two arguments as we have examined previously in discussing
classical logic. There are a total of 256 possible logic functions with
three variables and Rule 110 is one of those 256 functions.
Here is Rule 110’s function table for all eight possible settings of the
three variables.
Black  T
White  F
The DNF for Rule 110
And, like any logic function, Rule
110 can be expanded via the DNF.
The DNF expansion of a three variable logic function, Rule 110.
The DNF and 0s in the JPT
And, just as we did when examining
modus ponens, we can use the DNF
expansion of a logic function to
locate the 0s in a joint probability
table.
We will employ the jpt as an easier alternative to the formal
Boolean operations in solving Bayes’s Theorem applied to CA.
Joint Probability Table with
Numerical Assignments Following
Rule 110
Placement of
0s dictated
by model
following
logic function
f110 (A,B,C)
Joint Probability Table with
Numerical Assignments Following
Rule 110
BN+1 cannot be
TRUE (black) if
AN, BN and CN
are also TRUE
(black)
Joint Probability Table with
Numerical Assignments Following
Rule 110
The DNF expansion of Rule
110 also tells us where the
non-0s must be placed.
If functional assignment
BN+1 is TRUE at these
five terms, non-zero
probability is assigned.
Or, 0s are placed where
functional assignment is
FALSE.
Bayes’s Theorem for Rule 110
Write out the generic template for updating
a state of knowledge (Bayes’s Theorem, of course) about
the cell to be updated given the colors of
three cells at the previous time step and the
numerical assignment following Rule 110.
Updating Color of Cell in CA
Using Bayes’s Theorem
Bayes’s Theorem with denominator expanded. Numerical
assignment follows from model implementing Rule 110.
Locate and insert values from jpt. Insert the numerical
assignment from Cell 12 in the numerator and the numerical
assignment from Cells 12 and 4 in the denominator.
What is the Point?
Jaynes generalized logic functions (Boolean functions) by
treating them from a probabilistic and inferential standpoint.
CA are composed of Boolean functions, so think of them not
from the deductive viewpoint, but rather from an
informational standpoint.
For example, let a wide range of models insert legitimate
numerical values into the cells of a joint probability table for a
CA. Suppose there is a lack of information about which
model correctly governs the evolution of the CA.
Then, as a consequence ensuing for all informational systems,
prediction at least becomes a feasible concept to explore.
Different Models Assign
Different Numerical Values
“Almost” the  model.
How Do We Predict Future
Events in an Inferential System?
We use the same formal manipulation
rules we always use in probability theory.
For example, to update a state of
knowledge about some future event FE
conditioned on some observed data D and
involving models assigning numerical
values Mk
Predicting Future Events in CA
The inferential approach provides a
quantitative way for the information
processor to update its state of knowledge
about the color of any cell in the cellular
automaton.
Here, we have lost information about which
model (rule) is governing the evolution of the CA.
We must average over the predictions made by 256 models
Predicting Many Cells of the CA
The prediction formula for updating arbitrarily many cells of the CA.
Joint probability table is
impossibly huge! (And any summation
is hard too!)
Any Existing Solutions?
We have to make the difficult
transition from micro-events to
macro-events in trying to predict the
far future for CA.
Historically, Statistical Physics faced
the same problem.
Macro-Statements Are Predicted
by Inferential Approach
Macro-Statements in the form of future frequency counts
can be predicted conditioned on past frequency counts.*
* See Jaynes, Chapter 18 , for discussion.
Useful for CA?
The same analytical apparatus could be
applied to predict frequency counts of
black and white cells for each of n cells at
some future Mth time step.
But is this type of macro-statement one Wolfram would accept
as meaningful? (What has been voluntarily discarded might be
viewed as very important!)
Can Techniques from Information
Geometry Help?
 Use concepts from Information
Geometry.
 Try a- projection from complicated
probability distribution at point P in
general manifold to computationally
tractable distribution at point Q within
sub-manifold. (Amari)
 Maximum Entropy Principle with
measure function m(x) (Jaynes)
 Mean Field Approximation in
Statistical Physics
 Belief Propagation algorithms
Sine qua non of predicting
In order to predict,
micro-information must be
voluntarily discarded!
Implications for CA
What kind of microinformation should be
discarded?
What kind of macrostatements make sense for
CA?
My Main Argument
 Wolfram is pessimistic about being able to predict the behaviour of
deductive systems like cellular automata into the far future.
 He is correct.
 However, (following Jaynes) if CA are treated from the perspective of
inference and information, then a limited form of prediction becomes
possible.
 This means that we must employ probability theory as the
generalization of classical logic, that is, we use inference instead of
deduction.
 It’s a repetition of the old analogy from statistical mechanics. We
voluntarily give up information about the micro details of molecular dynamics
in exchange for the ability to predict macrovariables.
 The moral: We MUST approach the problem of predicting the
consequences of any sufficiently detailed explanatory model of the
world through inference and information.
Summary of Technical Points




Boolean functions f(x1, x2, …xn) = f: Bn  B
for any number of variables.
Abstract, general, closed, do not involve
numbers or arithmetic operations, and are
easy to comprehend.
Talk focused on n=2 (CL) and n=3 (CA).
Perfect foil for probability theory’s formal
manipulation rules.
Summary of Technical Points




Any such Boolean function can be expanded
into the Disjunctive Normal Form.
A classical logic function is a Boolean
function and can be expressed via the DNF.
Logic problems (classical syllogisms) can be
solved by probability theory.
Joint probability tables illustrate how some
model assigns numerical values to joint
statements.
Summary of Technical Points




The DNF points to where 0s should be
placed in a jpt.
Different numerical assignments via
Information Geometry as dictated by
different models.
A CA is comprised from logic functions of
n=3 variables. There is a DNF expansion for
each of the 256 possible functions.
The DNF points to where 0s should be
inserted in a jpt for cellular automata.
Summary of Technical Points



CA can be solved and generalized using
Bayes’s Theorem, DNF and the jpt.
Therefore, all the relative advantages of
inferencing vs. deduction can be brought to
bear on the problem of PREDICTING the
future behaviour of CA.
For example, formal manipulation rules lead
to the prediction formula.
Summary of Technical Points
Use technical notions from Information
Geometry.
 Find computationally tractable
distributions that are “close” to
original complicated distribution.
 What is an appropriate macro-event
for a CA amenable to prediction?

Final Thought
Initial Conditions
Candidate Macro-Event: Wolfram’s localized
structures in Rule 110 CA interacting over time.
Isn’t it interesting to speculate that such macroevents conspire to produce an unexpected event
in the far future?
Surprising answer
My Book
(in progress)