A High Security Information System (Joe Johnson)

Download Report

Transcript A High Security Information System (Joe Johnson)

DARPA
Oasis PI Meeting
Santa Rosa, CA
August 2002
General Progress Report
Complex Problems Group
University of South Carolina
Joseph E. Johnson, PhD
DARPA OASIS PI Meeting
Santa Rosa CA - Hilton
Monday August 19, 2002
Overview


System replication for non-network threats
Network behavior sought: normal / aberrant



Network analysis – wavelets & dimensionality
Network classification – Markov type groups
Markov type group representations




New variables that are measures of the network topology
Especially entropy/information measures on graphs




As a basis for logical & numerical uncertainty
To extend Shannon information measure
To seek clusters
To define optimal wavelet basis
Some Details
Our current activity
Objectives

To utilize new network parameters that characterize the
network topology, especially new entropy/information
measures that will:







Help characterize networks flows: statics & dynamics in real time
Define clusters & structures for static graphs
Suggest the best basis for wavelet decompositions
By helping define the critical network dimensions (variables)
Which with multiple time scales defined by complexity theory
Provide sensitivity to aberrant network dynamics
And thus which will allow a network to be monitored in real time
with very rapid detection of and isolation of intrusion.
System replication
for non-network threats


Our group built IRIS, the emergency management
information system used by SC
Replication of IRIS has been studied & performed



USC, UU, Maui
IRIS is now being totally rewritten in a Java
environment (esp. struts & multi-tier) to achieve
isolation of components (ready by Dec 2002)
IRIS has modules for:

Donated Goods, Critical Facilities, WMD, Voice Recognition,
Mapping, Administration, Security, Damage Tracking, and
Person Tracking I
Network behavior sought:
normal / aberrant
Network analysis:
wavelets & dimensionality
Network classification
Markov type groups
Markov type group representations


As a basis for logical & numerical uncertainty
To extend Shannon information measure
New variables that are measures of
the network topology
Especially entropy/information
measures on graphs


To seek clusters
To define optimal wavelet basis
Some of the Details
Lie Groups and Lie Algebras
1.
2.
3.
Group theory provides a representation of
both exact and approximate symmetries using
continuous and discrete transformations.
Groups have a multiplication operation with
closure, associatively, a unit, and an inverse.
Lie Groups / Lie Algebras represent
transformations that are continuously
connected to the identity.
Examples:






Rotation Group x2 + y2 + z2 is invariant
Lorentz Group c2t2 - x2 - y2 - z2 is invariant (relativity).
Poincare Group c2dt2 - dx2 - dy2 - dz2 is invariant thus
allowing translations.
Unitary Group x*x + y*y + …. Or <a|b> invariant
including SU3, SUn (quantum mechanics)
Heisenberg Lie Algebra [x,p] = I
Creation & annihilation algebra [a+r,as] = Irs
Introduction to
Markov Processes
Markov Transformations





A Markov matrix transformation M preserves the sum of the
non-negative components of a vector upon which it acts with
no negative transformed components (thus x + y + z + w +…
is invariant) .
It follows that a Markov transformation must itself have nonnegative components with the sum of elements in each column
equal to 1.
Markov transformations have no inverse and thus they do not
form a group.
Markov transformations are useful in a variety of problems: such
as economic and population redistribution.
Markov transformation represent diffusion of the conserved
component such as information and thus increases entropy.
Markov Type Lie Groups
(Johnson J.E., Jour. Math. Phys.1985)





In the paper above, I suggested relaxing the non-negative condition to get a
‘Markov like’ Lie Group that preserves Sum xi but which allows for
unphysical components.
A careful choice of the associated Lie Algebra generators Lij gives Markov
transformations when non-negative combinations of Lij are used to generate
the transformation.
Thus the term: Markov-Type Lie Algebra or Group
One can see the connection to diffusion. The lack of an inverse results in a
loss of ‘information’.
Lij along with Lii diagonal generators form a basis of GL(n,R) = Markov
Subgroup M(n,R) + an Abelian Scale (growth) group A(n,r) – a new
decomposition of GL(n,R).
Network or Graph Theory
Adjacency (or Connectivity)
Matrix for a Network (graph)





An undirected graph consists of a set of points or
nodes (numbered 1, 2, …) connected by lines.
If node i is connected to node j then we represent this
with a matrix Lij = 1 else =0. Thus Lij contains the
topological connectivity of the graph.
The diagonal Lii is taken as 0 or 1 if a node is
considered to be (or not to be) connected to itself.
It also contains the superfluous numbering of nodes.
It is an unsolved problem to tell, from Lij, if two graphs
are topologically equivalent.
A Network as a MarkovDynamical Informational Flow




If we take the diagonal elements of the connectivity matrix so
that the sum over each column is ‘0’ then Lij is a Markov Lie
algebra generator for a transformation that preserves a vector
sum xi.
We could take this vector to be the information stored at the
node xi or water or electrical charge or whatever.
This gives us a dynamical system where information is moved
from node to node by equal bandwidth by all connections. L is
the time evolution operator with x(t) = exp(tL).
The eigenvectors are linear combinations of nodes with
information content decreasing at the rate of exp(tz) where z is
the associated eigenvalue. There is a strong analogy with exp(tH)
in quantum theory.
Network Dynamics




This work extends the customary graph theory with
the associated model of information (or water or
population) flow of a conserved quantity.
Thus one now has a dynamical physical model and
interpretation for the connectivity matrix as well as the
power of Lie group theory that can be applied to
network dynamics and topology.
The choice of 0 or 1 as diagonal elements can also be
achieved within this theory and represents exponential
growth or decay of the quantity at that node.
This also has the interpretation of forbidding or
inducing a revisit to a given node in the transitions.
Asymmetric & Directed Graphs &
Information Nonconservation


We can readily extend this work to dynamical problems on:
 Different data transfer rates between nodes where Lij is not
equal to 1 but is still symmetric.
 Directed graphs where Lij is not symmetric (but rather has
the values 0 and 1 indicating flow direction).
 Graphs which allow for the creation and annihilation of
information at the nodal points.
With these collective generalizations, one obtains the
transformations of GL(n,R) with a restricted Lie algebra
parameter space.
An Approach to
Network Classification
Self Connectivity Defined




It is known that these eigenvalue sets are almost but not
quite isomorphic to the topologically different graphs
as some graphs are isospectral.
Define L1ij = 1 or 0 as before when i and j are not
equal.
Save the diagonal vector and reset the diagonal terms
=0 because we do not want to allow a transition from a
node back to itself.
Now define L2 = L1 L1 via normal matrix multiplication
and set the diagonal terms =0. Then define L3 = L1 L2
+ L2 L1 , etc up to L2n-2 where n is the number of
nodes.


This gives a sequence of 2n-2 matrices. We
require 2n-2 possible transitions to ‘feel out’ the
paths from a node and back to that node again.
We now extract the diagonal components to
construct an (2n-2) x n matrix with columns
labeled by node number and the rows labeled by
the power of the matrix.
Self-Connectivity Matrix Defined



We now reorder the columns by sorting the values (ascending) in
order of the first row. Then for each set of identical values in
the first row, we resort the columns in the second row
(ascending).
The final matrix gives a relatively unique order to the nodes but
it is not proven that it is unique. To the extent that the new
column order is unique, one obtains a natural numbering of the
nodes.
This matrix will constitute the first part of identification of the
topology. We call this the Self-Connectivity Matrix.
Interconnectivity Matrices
Defined





Return now to the first n powers of L including the
first power.
Note that each power is a symmetrized product of
matrices and thus is symmetric and has real eigenvalues.
We call this the n^2 matrix of eigenvalues, the
interconnectivity eigenvalue matrix as it describes
interconnectivities.
It is independent of the numbering of the nodes.
Each of the eigenvalue sets represents the normal
nodes of a Markov transformation that takes n steps as
an infinitesimal motion.



Set all diagonal values equal to the negative of the sum
of the remaining column elements in that matrix thus
giving a set of Markov Lie generators.
Find the eigenvalues and eigenvectors of these matrices
and group these n eigenvectors as rows in a new matrix
and sort each row by ascending associated eigenvalues
(placed in an associated column ‘0’) for each power.
This gives an (n+1) x n^2 matrix of the eigenvalues and
eigenvectors that correspond to dynamic evolution of
the system when one dynamically evolves with multiple
node connectivity.
Graph Description




Neither the n x (2n-1) self connectivity matrix nor the n x n interconnectivity
eigenvalues are dependent upon the ordering of the nodes but only on the
topology.
It is our hope that this removes much of the isospectral aspects of a graphs
description.
We would adjoin the sequence of eigenvector matrices at each of the n levels
where one sorts the rows by ascending order of the associated eigenvalues,
for each power, and sorts the columns by the node ordering prescribed by the
final ordering that results from the self connectivity matrix as described
above.
Any degeneracy that remains from the self-connectivity matrix is to be
resolved by ordering the components of the eigenvectors by the lowest order
values that differ.


We are studying the extent to which this method
removes the degeneracy's and identifies the
topology of the graph.
We are also studying the utility of the
connectivity matrix as a Lie generator for
transformation flows of information on a
network.
Current Activity



Study the self connectivities and eigenvalues of
the interconnectivity matrix as possible
dimensions sensitive to network dynamics
Study these variables as a way to identify the
optimal wavelet basis functions for describing
network dynamics
Identify variables with the maximum
sensitivities to normal verses aberrant behavior.

Thank You
The Importance of Lie Algebras




We note that the foundations of each of the important
physical theories are related to a Lie algebra (or group).
Is there another Lie algebra that is critical and
foundational ?
Maybe general relativity, string theory, a theory of
everything,
Let us consider a set of problems that is somewhat
intriguing:
Information





Information is closely related to entropy – and is often called
negentropy I = - ln P.
Information is related to order just as entropy is related to
disorder thus to study one is to study another.
Entropy gives a direction to time by the second law of
thermodynamics.
But according to physical laws we have time reversal invariance.
But we all know that this is no conflict because it is just
‘improbable’ to reverse a system and achieve decreased entropy.
Information - problems




All scientific measurements contain error and are represented by
‘numbers’, actually pairs of numbers, that represent a mean value
and the error. But these distributions do not close under +-*/.
The human mind is so incredible because it can manage fuzzy
logic, fuzzy calculations, and numerical uncertainty. But there is
no clear mathematical structure on how to do this.
How does a living organism advance in time, increasing order,
while fighting entropy.
Do we really know how to define information below 1/0 ?
Information – more problems




If so why cannot we just program a computer to compute a
series expansion or subroutine ‘as long as it contributes to the
information of the term being computed?
How much information is in the value: An 81% chance of rain.
How do we effectively computer the cost or value ($) of
information.
Do we really have a grasp on how mathematical operations
degrade information when acting on uncertain numbers? Are
there optimal computational paths?
Information – still more problems


Consider the measurement problem in quantum
mechanics: Do we all feel confident that we exactly and
mathematically understand the information gained in a
measurement?
To me it seems intractably difficult to get to the bottom
of the entanglement between a quantum system
interacting with a classical measuring system that
extracts ‘information’ and forces it into a state.
Information from a Lie group?




Obviously group theory cannot help because a group
always has an inverse.
Since one cannot undo entropy and diffusion then a
group is not a reasonable avenue.
But I will contend that a Lie group might give the same
insight into Information and all of these problems as
was the case elsewhere in physics.
A Lie Algebra of ‘Information’


When we look at diffusion, we perform a
transformation that shifts things around but
maintains the total sum of things.
My talk will center on transformations that leave
the sum of non-negative values invariant: x + y
+ z + …. The Markov “Group”