Transcript Document

Information Sciences Institute
Viterbi School of Engineering
University of Southern California
Detecting and Tracking Hostile Plans in the Hats World
Aram Galstyan and Paul Cohen
Center for Research on Unexpected Events (CRUE)
Information Sciences Institute
University of Southern California
U.S. Army Conference on Applied Statistics, Houston, 2007
Outline
• Detection and tracking in symbolic spaces
• Plan recognition in a virtual domain: The Hats Game
• Probabilistic framework for plan recognition
• Scalability issues
  – Approximate methods for tracking
  – Hybrid probabilistic/incremental search approach
• Summary and future work
Detection and Tracking of Plans
• Goals define plans, plan execution results in observations
[Figure: Intentions (Goals) → Plans → Actions → Observations]
• Inference problem: Given a stream of observations and some behavioral model of the agent that generates those observations:
  – What are the most likely intentions of the agent?
  – What kind of plan is the agent pursuing?
  – What stage of plan execution has been reached?
• Detect and track interesting plans/activities
Classical Detection Theory and Plan Recognition
• Large body of work on detection in metric spaces
• Classical approach: detect and track
  – Applications: tracking missiles, aircraft, submarines, etc.
  – Detection usually happens at relatively short time scales
  – Linear (Kalman) filters and various extensions
  – Non-linear methods (HMMs, particle filters)
• Goal: Develop a similar theory for symbolic spaces (plans, intentions)
  – Plans/intentions are not directly observable and must be inferred
  – Detection might require long observation sequences
  – Interesting activities are often obscured by benign activities
    • Extremely low signal-to-noise ratio (SNR)
  – In adversarial settings the agents might engage in deliberately deceptive behavior
Plan Recognition for Intelligence Analysis
• Requirements for Intelligence Analysis scenarios:
– Should not assume known identities of hostile agents
– Must handle clutter generated by a large number of benign agents
– Scalable
• Terabytes of streaming data
• Tens of thousands to millions of agents
– Probabilistic, on-line, and recursive
• Incorporating new evidence incrementally
The Hats World
• A simulation of ~10^5 agents (called hats) engaged in individual and collective activities. Most agents are benign, very few (e.g., 20) are known adversaries, and some (e.g., 500) are covert adversaries.
• Hats have shareable capabilities, passed from one hat to another at meetings.
• Beacons have vulnerabilities. When a task force of hats possessing capabilities that match a beacon's vulnerabilities meets at the beacon, it is destroyed.
• Adversarial plans are sequences of meetings of many hats to pass capabilities from one to another, culminating in the destruction of a beacon.
• Hats belong to one or more organizations, which are not known and must be inferred from meetings. Some are adversarial organizations; most are not.
The Hats Game
• Goal: Given the record of meetings between hats, find and neutralize the adversarial task force before the attack, while minimizing the cost:
  – Penalty for "arresting" wrong hats
  – Information costs
A Hierarchical Model for Hats
The planner chooses an organization that will carry out the attack and the beacon that will be attacked. Then a task force (TF) for carrying out the attack is chosen. The planner then generates a meeting schedule (or a meeting tree) for each TF member, so that by the completion of those meetings each TF member carries the assigned capability.
Highlighted nodes describe a particular plan instantiation.
Plans in Hats
• A plan is a sequence of meetings, structured as an inverted tree
  – Designed for passing around capabilities and confusing the observer
• The observer's goal is to find this meeting tree (target graph) before the final meeting takes place
[Figure: meeting tree unfolding over time, culminating in the beacon attack]
Bayesian Filtering Approach
• Treat the Hats world as a discrete-time stochastic process, and develop methods for estimating this process
• The state of each hat at time t, x(t), is characterized by a set of variables describing its hidden state:
  – "Adversary" indicator variable: a ∈ {0,1}
  – Intention to acquire a capability: φ ∈ {0,1,..,M}
  – Actual capability carried at time t: c_t ∈ {0,1,..,M}
[Figure: DBN model of agent activity — hidden states X_n, X_{n+1} with corresponding observations Y_n, Y_{n+1}]
Collective Dynamics
• The collective state is characterized by a vector X(t) = (x_1(t), x_2(t), .., x_N(t))
• Our beliefs about the system are expressed in a pdf Pr[X(t) = X]
• Dynamics is governed by some type of hidden Markov process
  – HMM with observation-dependent state transitions
  [Figure: coupled chains of hidden states and observations]
  – Given the current state and observation, the next state is conditionally independent of the past history:
    Pr[X(t+1) | X(t), Y(t), .., X(1), Y(1)] = Pr[X(t+1) | X(t), Y(t)]
HMM Definitions
• Transition Model:
  M(x, x′; y) = P(X_t = x | X_{t−1} = x′, Y_{t−1} = y)
• Evidence-Generating (or Emission) Model:
  ε(y, x) = P(Y_t = y | X_t = x)
• These functions are chosen based on expert knowledge, e.g.,
  – Agents from the same organization are more likely to meet
  – An agent intending to acquire a capability will try to meet with agents carrying that capability
Bayesian Filtering
• Filtering distribution (joint distribution over the hidden state at
time t and observations up to time t):
  π_t(x, y_{0:t}) = P(X_t = x, Y_{0:t} = y_{0:t}),   where Y_{0:t} = {Y_0, Y_1, .., Y_t}
• Bayesian recursive update formula:
  π_t(x, y_{0:t}) = ε(y_t, x) Σ_{x′} M(x, x′; y_{t−1}) π_{t−1}(x′, y_{0:t−1})
• Given the state transition and evidence generation models, as well as priors (initial beliefs), we can use this formula to update our beliefs as new observations arrive.
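As a minimal sketch of this recursion (the two-state transition and emission models below are illustrative toys, not the actual Hats models):

```python
import numpy as np

def filter_step(pi_prev, y_prev, y_t, M, eps):
    """One recursive Bayesian update; returns the normalized filtering distribution.

    M[y][x_prev, x] = P(X_t = x | X_{t-1} = x_prev, Y_{t-1} = y)
    eps[x, y]       = P(Y_t = y | X_t = x)
    """
    predicted = M[y_prev].T @ pi_prev   # propagate beliefs through the transition model
    pi = eps[:, y_t] * predicted        # weight by the likelihood of the new observation
    return pi / pi.sum()                # normalize to a distribution

# Toy two-state models (assumptions for illustration only)
M = {0: np.array([[0.9, 0.1], [0.2, 0.8]]),
     1: np.array([[0.6, 0.4], [0.1, 0.9]])}
eps = np.array([[0.8, 0.2], [0.3, 0.7]])

pi = np.array([0.5, 0.5])               # prior belief
ys = [0, 1, 1, 0]                       # observation stream
for y_prev, y_t in zip(ys, ys[1:]):
    pi = filter_step(pi, y_prev, y_t, M, eps)
```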
Example of a Transition Function
• The adversarial status and intention variables do not change
• The only change is due to capability trades, which are specified as follows:
  – If our hat has no intention of acquiring a specific capability, then with probability ν₁ he will acquire the capability c of the other hat, and with probability 1 − ν₁ he will keep his old one.
  – If the hat has already acquired the capability he intended to, then it will remain unchanged.
  – If the hat has not yet acquired its intended capability, then, if the second hat carries that capability, our hat will acquire it with probability ν₂.
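The trade rules above can be sketched as a capability-update function; the names `trade_capability`, `nu1`, `nu2`, and the default values are stand-ins matching the slide, not the actual Hats implementation:

```python
import random

def trade_capability(cap, intent, other_cap, nu1=0.3, nu2=0.7, rng=random.random):
    """Return our hat's capability after meeting a hat carrying other_cap.

    cap    -- capability currently carried
    intent -- capability the hat intends to acquire (None = no intention)
    """
    if intent is None:
        # No intention: copy the other hat's capability with probability nu1
        return other_cap if rng() < nu1 else cap
    if cap == intent:
        # Intended capability already acquired: it remains unchanged
        return cap
    if other_cap == intent:
        # Second hat carries the intended capability: acquire it with probability nu2
        return other_cap if rng() < nu2 else cap
    return cap
```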
Example of an Evidence Generating Function
• Agents from the same group meet with probability p₁ > 0.5, while agents from different groups meet with probability 1 − p₁ (homophily, guilt-by-association):
  ε(σ₁, σ₂) = p₁ δ_{σ₁,σ₂} + (1 − p₁)(1 − δ_{σ₁,σ₂})
  where σ_i is the group label of hat i and δ is the Kronecker delta
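A one-line sketch of this emission term (the function name and the default p₁ are illustrative choices, not values from the slides):

```python
def meet_probability(group1, group2, p1=0.8):
    """Homophily-based meeting probability: delta is 1 when the group labels match."""
    same = 1.0 if group1 == group2 else 0.0
    return p1 * same + (1.0 - p1) * (1.0 - same)
```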
Results for the Bayesian GBA model
• Given some known adversaries, find the covert adversaries using the sequence of meetings between different hats
[Figure: posterior probability of adversarial status over iterations, for covert vs. benign hats]
• Able to correctly identify the coverts (high posteriors after a number of iterations) if p₁ is sufficiently large
Scaling Issues
• For sufficiently large systems it is not feasible to maintain and update beliefs about all possible joint (collective) states
  – E.g., for N agents with binary states, one needs a table with 2^N entries
• Flat state representation will not work, need alternatives
Factorial HMMs
• Models the simultaneous dynamics of multiple chains (hats)
• Different chains contribute to the observation process
• Note: Conditioned on the observation sequence, the state dynamics across the chains are not independent
Coupled Factorial HMMs
• Different chains are coupled not only through observations, but also explicitly
Coupled HMMs for Hats
• Hats are coupled via observations (meetings)
• An observation is an act of interaction between two (or more) stochastic processes
Approximation for the Belief State
• After a long enough time all the hats will become "coupled"
  – E.g., construct a graph by linking any two hats that have met with each other. Then complexity is determined by the largest connected component of the graph
• Exact inference becomes infeasible for long observation sequences
• Use factorization (Boyen-Koller approximation):
  P(X_n | y_{0:n}) ≈ P̃(X_n | y_{0:n}) = Π_{c=1}^{C} P(X_{c,n} | y_{0:n})
• X_{c,n} are subsets (clusters) of variables (which need not be disjoint)
• Exact inference is recovered if each connected component is a cluster
Factorization
• Full factorization: each cluster contains a single agent
  P(X_n | y_{0:n}) = Π_{c=1}^{N} P(X_{c,n} | y_{0:n})
• Projection to the factorized distributions after the update:
  P^i_n(x^i_n) ∝ Σ_{x^i_{n−1}} P^i_{n−1}(x^i_{n−1}) ⟨ M(x^i_n, x^i_{n−1}; x^j_{n−1}) ε(⟨i,j⟩ | x^i_{n−1}, x^j_{n−1}) ⟩_{P^j_{n−1}}
  where ⟨·⟩_{P^j_{n−1}} denotes averaging over agent j's marginal belief
• Intuition: To update our beliefs about agent i after its meeting with j, we average the state transition and evidence-generating functions over the state of j (e.g., treat the meeting {i,j} as a noisy observation)
• Goodness of the approximation depends on the mixing properties of the Markov chain
  – E.g., works well if the transition matrix is strongly mixing (very stochastic)
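The per-agent update can be sketched as follows: average the pairwise transition and evidence terms over the other agent's marginal, then renormalize. The two-state models are illustrative toys, not the Hats functions:

```python
import numpy as np

def update_after_meeting(p_i, p_j, M, eps_meet):
    """Update agent i's marginal belief after an observed meeting with agent j.

    p_i, p_j           -- current marginal beliefs over single-agent states
    M[xj][xi_prev, xi] -- P(x_i(n) = xi | x_i(n-1) = xi_prev, x_j(n-1) = xj)
    eps_meet[xi, xj]   -- P(meeting {i,j} observed | x_i = xi, x_j = xj)
    """
    n_states = len(p_i)
    new = np.zeros(n_states)
    for xi_prev in range(n_states):
        for xj in range(n_states):
            # weight by j's marginal and by the likelihood of the observed meeting
            w = p_i[xi_prev] * p_j[xj] * eps_meet[xi_prev, xj]
            new += w * M[xj][xi_prev]
    return new / new.sum()

# Toy two-state models (illustrative assumptions)
M = {0: np.array([[0.9, 0.1], [0.2, 0.8]]),
     1: np.array([[0.5, 0.5], [0.1, 0.9]])}
eps_meet = np.array([[0.7, 0.3], [0.3, 0.7]])
p_i = update_after_meeting(np.array([0.6, 0.4]), np.array([0.5, 0.5]), M, eps_meet)
```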
Summary of Initial Results
• Able to correctly infer the intention of an agent if
  – the behavior is sufficiently repetitive (e.g., a hat needs to attend many meetings to acquire the intended capability)
  – the priors on capabilities are sufficiently accurate
• Not so good at inferring the adversarial/benign status
  – for a sufficiently large probability of observing a meeting between benign and adversarial hats
  – for a small fraction of initially known hats
Other Scaling Issues
• Even if we have an efficient representation for P(X), testing certain hypotheses might be computationally prohibitive (combinatorial complexity)
  – E.g., evaluating a hypothesis about a task force with m members requires summing ~N^m terms
• For sufficiently large problems, the simple strategy of
constructing and scoring all the hypotheses is prohibitively
expensive
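To make the blow-up concrete (the values of N and m here are illustrative, not from the slides):

```python
from math import comb

# Number of candidate task forces of size m drawn from N agents: C(N, m) ~ N^m / m!
N, m = 10_000, 5
n_hypotheses = comb(N, m)   # ~8.3e17 hypotheses to construct and score naively
```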
Solution: Hybrid Approach
• Combine the Bayesian tracking framework with incremental search
  – Start with simple hypotheses
  – Build more complex hypotheses incrementally
  – Prune irrelevant hypotheses and expand the promising ones
Finding the Covert Plan using the Hybrid Approach
• Start from single-meeting nodes
• Score these nodes using the Bayesian state estimator
• Expand promising nodes by adding another meeting to the graph
• This is a search problem, with costs determined by the Bayesian estimator
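The steps above can be sketched as a best-first search over meeting-set hypotheses. The `score` callable stands in for the Bayesian state estimator, and the beam-pruning strategy is one assumed choice among the search strategies discussed next:

```python
import heapq

def hybrid_search(meetings, score, beam=3, max_size=3):
    """Grow meeting-set hypotheses incrementally, pruning to the best-scoring beam."""
    frontier = [frozenset([m]) for m in meetings]   # single-meeting hypotheses
    best = max(frontier, key=score)
    while True:
        frontier = heapq.nlargest(beam, frontier, key=score)   # prune to the beam
        expansions = [h | {m} for h in frontier for m in meetings
                      if m not in h and len(h) < max_size]     # add one more meeting
        if not expansions:
            return best
        frontier = expansions
        top = max(frontier, key=score)
        if score(top) > score(best):
            best = top

# Toy usage: the stand-in scorer rewards hypotheses covering meetings 1 and 3
result = hybrid_search([1, 2, 3, 4], lambda h: len(h & {1, 3}))
```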
Open Research Questions
• What are the appropriate search strategies?
– Breadth first: should we commit our resources to analyze low
level data, hoping that something will flag a malicious intent?
– Depth first: or should we quickly build more complex
hypotheses, assuming that the global picture will emerge as
the hypotheses graph gets larger?
– Other heuristic search methods: A*, beam search
• Initial results
– The choice of the optimal search strategy depends on the
accuracy of the Bayesian state estimator
Summary
• A general probabilistic framework for detecting and tracking adversarial activities in a virtual environment
  – Mines dynamical patterns (processes) rather than static structures (highly connected sub-graphs, communities)
  – Not domain-specific, and not limited to Hats
    • Underlying assumptions are very generic
    • Agent activity is modeled through some type of HMM
    • Hypotheses on coordinated activities are represented as graphs
• Initial results on approximate filtering methods
• Initial steps towards a hybrid approach for plan recognition in large-scale systems
  – Bayesian framework for non-linear tracking
  – Incremental search for early hypothesis pruning
Future Work
• Experiments with the new factorial approximation schemes
  – Factorization over larger clusters
  – Augmenting the algorithm with a better GBA model
• Other approximations of the belief state (particle filters)
• Developing increasingly complex models of behavior
  – Accounting for and recognizing higher-level policies
    • Recruitment, change of tactics
    • Formation, evolution, and transformation of groups
• Information fusion at different levels of abstraction, e.g.,
  – "Hat #29 behaves suspiciously"
  – "Organization #2 is up to something"
• Theoretical issues: "trackability"
  – Under what circumstances can one track certain processes?
  – Bounds on accuracy
• Examining optimal search strategies for hypothesis building
Detection on Networks
• Understanding and analyzing adversarial networks
– Static vs. dynamical properties of networks
• Most network analysis research is concerned with static patterns
– Strongly connected sub-graphs, communities
– Degree distribution, clustering, hubs, etc.
• Certain patterns are inherently dynamical
  – Cannot be revealed by analyzing static structures
• Need approaches for mining dynamical patterns
– Detecting processes that represent hostile activities
– Tracking and predicting the evolution of such processes