Presentation to KSU-CIS Advisory Board, 09/15/2000


Transcript

Real-Time Bayesian Network Inference for
Decision Support in Personnel Management:
Report on Research Activities
William H. Hsu, Computing and Information Sciences
Haipeng Guo, Computing and Information Sciences
Shing I. Chang, Industrial and Manufacturing Systems Engineering
Kansas State University
http://groups.yahoo.com/group/onr-mpp
This presentation is available at:
http://www.kddresearch.org/KSU/CIS/ONR-2002-Jun-04.ppt
Overview
• Knowledge Discovery in Databases (KDD)
– Towards scalable data mining
– Applications of KDD: learning and reasoning
• Building Causal Models for Decision Support
• Time Series and Model Integration
– Prognostic (prediction and monitoring) applications
– Crisis monitoring and simulation
– Anomaly, intrusion, fraud detection
– Web log analysis
– Applying high-performance neural, genetic, Bayesian computation
• Information Retrieval: Document Categorization, Text Mining
– Business intelligence applications (e.g., patents)
– “Web mining”: dynamic indexing and document analysis
• High-Performance KDD Program at K-State
High-Performance Database Mining and KDD:
Current Research Programs at K-State
• Laboratory for Knowledge Discovery in Databases (KDD)
– Research emphases: machine learning, reasoning under uncertainty
– Applications
• Decision support
• Digital libraries and information retrieval
• Remote sensing, robot vision and control
• Human-Computer Interaction (HCI) - e.g., simulation-based training
• Computational science and engineering (CSE)
• Curriculum and Research Development
– Real-time automated reasoning (inference)
– Machine learning
– Probabilistic models for multi-objective optimization
– Intelligent displays: visualization of diagrammatic models
– Knowledge-based expert systems, data modeling for KDD
Stages of Data Mining and
Knowledge Discovery in Databases
Visual Programming:
Java-Based Software Development Platform
D2K © 2002 National Center for Supercomputing Applications (NCSA)
Used with permission.
Bayesian Belief Networks (BBNs):
Definition
• Conditional Independence
– X is conditionally independent (CI) from Y given Z (sometimes written X ⊥ Y | Z) iff
P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z
– Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning) ⇒ T ⊥ R | L
• Bayesian Network
– Directed graph model of conditional dependence assertions (or CI assumptions)
– Vertices (nodes): denote events (each a random variable)
– Edges (arcs, links): denote conditional dependencies
• General Product (Chain) Rule for BBNs

  P(X_1, X_2, …, X_n) = ∏_{i=1}^{n} P(X_i | parents(X_i))

• Example (“Sprinkler” BBN)
  – X1 – Season: Spring, Summer, Fall, Winter
  – X2 – Sprinkler: On, Off
  – X3 – Rain: None, Drizzle, Steady, Downpour
  – X4 – Ground: Wet, Dry
  – X5 – Ground: Slippery, Not-Slippery
  [Figure: X1 → X2, X1 → X3; X2, X3 → X4; X4 → X5]
P(Summer, Off, Drizzle, Wet, Not-Slippery) = P(S) · P(O | S) · P(D | S) · P(W | O, D) · P(N | W)
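To make the chain rule concrete, here is a minimal Java sketch that evaluates the joint probability above. Every CPT value in it is invented for illustration; none comes from the slide.

public class SprinklerChainRule {
    public static void main(String[] args) {
        // Hypothetical CPT entries (invented for illustration only)
        double pSummer = 0.25;               // P(Season = Summer)
        double pOffGivenSummer = 0.4;        // P(Sprinkler = Off | Summer)
        double pDrizzleGivenSummer = 0.1;    // P(Rain = Drizzle | Summer)
        double pWetGivenOffDrizzle = 0.7;    // P(Ground = Wet | Off, Drizzle)
        double pNotSlipperyGivenWet = 0.2;   // P(Ground = Not-Slippery | Wet)

        // Chain rule for BBNs: multiply each variable's CPT entry,
        // conditioned only on the values of its parents.
        double joint = pSummer * pOffGivenSummer * pDrizzleGivenSummer
                     * pWetGivenOffDrizzle * pNotSlipperyGivenWet;
        System.out.println("P(Summer, Off, Drizzle, Wet, Not-Slippery) = " + joint);
    }
}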
Bayesian Networks and
Recommender Systems
• Current Research
– Efficient BBN inference (parallel, multi-threaded Lauritzen-Spiegelhalter in D2K)
– Hybrid quantitative and qualitative inference (“simulation”)
– Continuous variables and hybrid (discrete/continuous) BBNs
– Induction of hidden variables
– Local structure: localized constraints and assumptions, e.g., Noisy-OR BBNs
– Online learning
• Incrementality (aka lifelong, situated, in vivo learning)
• Ability to change network structure during inferential process
– Polytree structure learning (tree decomposition): alternatives to Chow-Liu
– Complexity of learning, inference in restricted classes of BBNs
• Future Work
– Decision networks aka influence diagrams (BBN + utility)
– Anytime / real-time BBN inference for time-constrained decision support
– Some temporal models: Dynamic Bayesian Networks (DBNs)
Data Mining:
Development Cycle
• Model Identification
  – Queries: classification, assignment
  – Specification of data model
  – Grouping of attributes by type
• Prediction Objective Identification
  – Assignment specification
  – Identification of metrics
• Reduction
  – Refinement of data model
  – Selection of relevant data (quantitative, qualitative)
• Synthesis: New Attributes
• Integration: Multiple Data Sources (e.g., Enlisted Master File, Surveys)
[Chart: Effort (%) by stage: Objective Determination, Data Preparation, Machine Learning, Analysis & Assimilation]
[Diagram: Environment (Data Model) → Learning Element → Knowledge Base → Decision Support System]
Learning Bayesian Networks:
Gradient Ascent
• Algorithm Train-BN (D)
  – Let wijk denote one entry in the CPT for variable Yi in the network
    • wijk = P(Yi = yij | parents(Yi) = <the list uik of values>)
    • e.g., if Yi ≡ Campfire, then (for example) uik ≡ <Storm = T, BusTourGroup = F>
  – WHILE termination condition not met DO    // perform gradient ascent
    • Update all CPT entries wijk using training data D:

      wijk ← wijk + r · Σ_{x ∈ D} P_h(yij, uik | x) / wijk

    • Renormalize the wijk to assure invariants: Σ_j wijk = 1 and ∀j, 0 ≤ wijk ≤ 1
• Applying Train-BN
  – Learns CPT values
  – Useful in case of known structure
  – Key problems: learning structure from data, approximate inference
[Figure: example BBN with nodes Storm, BusTourGroup, Lightning, Campfire, Thunder, ForestFire; Storm and BusTourGroup are parents of Campfire]
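The update rule above needs the posterior P_h(yij, uik | x), which in general requires inference. Under the simplifying assumption of fully observed training data it reduces to a 0/1 indicator, and one gradient step can be sketched directly in Java, as below. All class and method names here are illustrative, not part of any released tool.

import java.util.List;

/** Sketch of one gradient-ascent pass over a single CPT column.
    Assumes fully observed data, so P_h(yij, uik | x) is 1 or 0. */
public class TrainBNStep {
    /** w[j] = current CPT entries wijk for a fixed variable i and parent context k. */
    static void gradientStep(double[] w, List<int[]> data,
                             int varIndex, int[] parentIdx, int[] parentVals,
                             double r) {
        double[] grad = new double[w.length];
        for (int[] x : data) {
            if (!matchesParents(x, parentIdx, parentVals)) continue; // uik must hold in x
            int j = x[varIndex];          // observed value yij of Yi
            grad[j] += 1.0 / w[j];        // indicator / wijk (assumes w[j] > 0)
        }
        double sum = 0.0;
        for (int j = 0; j < w.length; j++) {  // ascend: w <- w + r * gradient
            w[j] += r * grad[j];
            sum += w[j];
        }
        for (int j = 0; j < w.length; j++) w[j] /= sum; // renormalize: sum_j wijk = 1
    }

    static boolean matchesParents(int[] x, int[] idx, int[] vals) {
        for (int p = 0; p < idx.length; p++)
            if (x[idx[p]] != vals[p]) return false;
        return true;
    }
}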
Scores for Learning Structure:
The Role of Inference
• General-Case BBN Structure Learning: Use Inference to Compute Scores
• Recall: Bayesian Inference aka Bayesian Reasoning
  – Assumption: h ∈ H are mutually exclusive and exhaustive
  – Optimal strategy: combine predictions of hypotheses in proportion to likelihood
    • Compute conditional probability of hypothesis h given observed data D
    • i.e., compute expectation over unknown h for unseen cases
    • Let h ≡ structure, parameters Θ ≡ CPTs

  P(x_{m+1} | D) = P(x_{m+1} | x_1, x_2, …, x_m) = Σ_{h ∈ H} P(x_{m+1} | D, h) · P(h | D)

  Posterior Score:      P(h | D) ∝ P(D | h) · P(h)    (Marginal Likelihood × Prior over Structures)
  Marginal Likelihood:  P(D | h) = ∫ P(D | h, Θ) · P(Θ | h) dΘ    (Likelihood × Prior over Parameters)
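For discrete networks with Dirichlet parameter priors, the marginal-likelihood integral above has a closed form: the Cooper-Herskovits score that the K2 algorithm on the next slide maximizes. With r_i the arity of x_i, q_i the number of parent contexts of x_i, N_ijk the count of cases where x_i takes its k-th value in parent context j, and N_ij = Σ_k N_ijk:

  P(D \mid h) = \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!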
Learning Structure:
K2 Algorithm and ALARM
• Algorithm Learn-BBN-Structure-K2 (D, Max-Parents)
  FOR i ← 1 to n DO    // arbitrary ordering of variables {x1, x2, …, xn}
    WHILE (Parents[xi].Size < Max-Parents) DO    // find best candidate parent
      Best ← argmax_{j > i} P(D | xj ∈ Parents[xi])    // max Dirichlet score
      IF ((Parents[xi] + Best).Score > Parents[xi].Score) THEN Parents[xi] += Best
  RETURN ({Parents[xi] | i ∈ {1, 2, …, n}})
• A Logical Alarm Reduction Mechanism [Beinlich et al., 1989]
– BBN model for patient monitoring in surgical anesthesia
– Vertices (37): findings (e.g., esophageal intubation), intermediates, observables
– K2: found a BBN differing in only one edge from the gold standard (elicited from experts)
[Figure: the 37-vertex ALARM network (gold standard)]
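A compact Java sketch of the K2 greedy loop follows, using the log form of the Cooper-Herskovits score from the previous slide; it adopts the standard K2 convention that candidate parents precede xi in the chosen ordering. Class and method names are illustrative, not BNJ's API.

import java.util.*;

/** Sketch of K2's greedy parent search with the log Cooper-Herskovits score.
    data[m][i] = value of variable i in case m; arity[i] = number of values of i.
    The variable ordering is the column order. */
public class K2Sketch {
    static List<Set<Integer>> learnK2(int[][] data, int[] arity, int maxParents) {
        int n = arity.length;
        List<Set<Integer>> parents = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Set<Integer> pi = new TreeSet<>();
            double oldScore = logScore(data, arity, i, pi);
            while (pi.size() < maxParents) {
                int best = -1; double bestScore = oldScore;
                for (int j = 0; j < i; j++) {   // candidates precede xi in the ordering
                    if (pi.contains(j)) continue;
                    pi.add(j);
                    double s = logScore(data, arity, i, pi);
                    pi.remove(j);
                    if (s > bestScore) { bestScore = s; best = j; }
                }
                if (best < 0) break;            // no candidate improves the score: stop
                pi.add(best); oldScore = bestScore;
            }
            parents.add(pi);
        }
        return parents;
    }

    /** log P(D | parent set of variable i), Cooper-Herskovits closed form. */
    static double logScore(int[][] data, int[] arity, int i, Set<Integer> pi) {
        int ri = arity[i];
        Map<List<Integer>, int[]> counts = new HashMap<>(); // parent context -> N_ijk
        for (int[] x : data) {
            List<Integer> ctx = new ArrayList<>();
            for (int p : pi) ctx.add(x[p]);
            counts.computeIfAbsent(ctx, c -> new int[ri])[x[i]]++;
        }
        double score = 0.0;
        for (int[] nijk : counts.values()) {
            int nij = 0;
            for (int k = 0; k < ri; k++) { nij += nijk[k]; score += logFact(nijk[k]); }
            score += logFact(ri - 1) - logFact(nij + ri - 1);
        }
        return score;
    }

    static double logFact(int n) {
        double s = 0.0;
        for (int k = 2; k <= n; k++) s += Math.log(k);
        return s;
    }
}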
Major Software Releases, FY 2002
• Bayesian Network Tools in Java (BNJ)
– v1.0a released Wed 08 May 2002 to www.Sourceforge.net
– Key features
• Standardized data format (XML)
• Existing algorithms: inference, structure learning, data generation
– Experimental results
• Improved structure learning using K2, inference-based validation
• Adaptive importance sampling (AIS) inference competitive with best
published algorithms
• Machine Learning in Java (MLJ)
– v1.0a released Fri 10 May 2002 to www.Sourceforge.net
– Key features: three (3) inductive learning algorithms from MLC++, two (2) inductive
learning wrappers (one from MLC++, one from the GA literature)
– Experimental results
• Genetic wrappers for feature subset selection: Jenesis, MLJ-CHC
• Overfitting control in supervised inductive learning for classification
Bayesian Network Tools in Java (BNJ)
• About BNJ
– v1.0a, 08 May 2002: 26000+ lines of Java code, GNU General Public License (GPL)
– http://www.kddresearch.org/Groups/Probabilistic-Reasoning/BNJ
– Key features [Perry, Stilson, Guo, Hsu, 2002]
• XML BN Interchange Format (XBN) converter – to serve 7 client formats
(MSBN, Hugin, SPI, IDEAL, Ergo, TETRAD, Bayesware)
• Full exact inference: Lauritzen-Spiegelhalter (Hugin) algorithm
• Five (5) importance sampling algorithms: forward simulation (likelihood
weighting) [Shachter and Peot, 1990], probabilistic logic sampling
[Henrion, 1986], backward sampling [Fung and del Favero, 1995],
self-importance sampling [Shachter and Peot, 1990], adaptive importance
sampling [Cheng and Druzdzel, 2000]
• Data generator
• Published Research with Applications to Personnel Science
– Recent work
• GA for improved structure learning: results in [HGPS02a; HGPS02b]
• Real-time inference framework – multifractal analysis [GH02b]
– Current work: prediction – migration trends (EMF); Sparse Candidate
– Planned continuation: (dynamic) decision networks; continuous BNs
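As a concrete illustration of the importance sampling family listed above, here is a minimal likelihood-weighting (forward simulation) sketch in Java on an invented three-node chain. The network, its CPT numbers, and the class name are assumptions for illustration, not BNJ code.

import java.util.Random;

/** Likelihood weighting on a toy chain BN: Cloudy -> Rain -> WetGrass.
    Estimates P(Rain = T | WetGrass = T). All CPT values are invented. */
public class LikelihoodWeighting {
    public static void main(String[] args) {
        Random rng = new Random(42);
        double pCloudy = 0.5;
        double[] pRainGivenCloudy = {0.8, 0.2}; // index by (cloudy ? 0 : 1)
        double[] pWetGivenRain = {0.9, 0.1};    // index by (rain ? 0 : 1)

        double wRain = 0.0, wTotal = 0.0;
        for (int s = 0; s < 100000; s++) {
            // Sample non-evidence variables forward, in topological order
            boolean cloudy = rng.nextDouble() < pCloudy;
            boolean rain = rng.nextDouble() < pRainGivenCloudy[cloudy ? 0 : 1];
            // WetGrass is evidence (= T): do not sample it;
            // weight the sample by the evidence likelihood instead
            double weight = pWetGivenRain[rain ? 0 : 1];
            wTotal += weight;
            if (rain) wRain += weight;
        }
        System.out.println("P(Rain | WetGrass) ~ " + wRain / wTotal);
    }
}

Evidence variables are never sampled; each sample instead carries the likelihood of the evidence as its weight, which is what distinguishes likelihood weighting from plain probabilistic logic sampling.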
GA for BN Structure Learning
[Hsu, Guo, Perry, Stilson, GECCO-2002]
[Diagram: training data D is split into Dtrain (inductive learning) and Dval (inference). [A] Genetic Algorithm (change of representation and inductive bias control) emits a candidate representation α; [B] Representation Evaluator for Learning Problems, given inference specification Ie, returns the candidate's fitness f(α); the loop outputs an optimized representation α̂.]
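A sketch in the spirit of this architecture follows: a permutation GA over K2 variable orderings, where the fitness callback stands in for the representation evaluator [B] (e.g., structure score plus negated inferential loss on Dval). Crossover is omitted for brevity, and all names are illustrative assumptions.

import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Permutation GA over variable orderings: [A] proposes orderings,
    the fitness callback plays the role of evaluator [B]. */
public class OrderingGA {
    public static int[] evolve(int n, int pop, int gens,
                               ToDoubleFunction<int[]> fitness, Random rng) {
        int[][] P = new int[pop][];
        for (int i = 0; i < pop; i++) P[i] = randomPerm(n, rng);
        int[] best = P[0];
        double bestF = fitness.applyAsDouble(best);
        for (int g = 0; g < gens; g++) {
            int[][] next = new int[pop][];
            for (int i = 0; i < pop; i++) {
                int[] child = tournament(P, fitness, rng).clone(); // binary tournament
                int a = rng.nextInt(n), b = rng.nextInt(n);        // swap mutation
                int t = child[a]; child[a] = child[b]; child[b] = t;
                next[i] = child;
                double f = fitness.applyAsDouble(child);
                if (f > bestF) { bestF = f; best = child; }
            }
            P = next;
        }
        return best;   // the optimized representation (ordering) alpha-hat
    }

    static int[] randomPerm(int n, Random rng) {
        int[] p = new int[n];
        for (int i = 0; i < n; i++) p[i] = i;
        for (int i = n - 1; i > 0; i--) {   // Fisher-Yates shuffle
            int j = rng.nextInt(i + 1);
            int t = p[i]; p[i] = p[j]; p[j] = t;
        }
        return p;
    }

    static int[] tournament(int[][] P, ToDoubleFunction<int[]> f, Random rng) {
        int[] a = P[rng.nextInt(P.length)], b = P[rng.nextInt(P.length)];
        return f.applyAsDouble(a) >= f.applyAsDouble(b) ? a : b;
    }
}

In the cited GECCO-2002 setting, the returned ordering would then be handed to K2 (cf. the K2 sketch earlier) to induce the final structure.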
Model-Based Validation
[Hsu, Guo, Perry, Stilson, GECCO-2002]
[Diagram: Dtrain feeds [i] Inductive Learning (parameter estimation from training data), which produces hypothesis h; Dval and evidence specification Ie feed [ii] Validation (measurement of inferential loss). Together [i] and [ii] form [B], the Representation Evaluator for Input Specifications: a candidate input specification α is scored by specification fitness f(α), its inferential loss.]
BNJ: Integrated Tool for
Bayesian Network Learning and Inference
XML Bayesian Network Learned from Data using K2 in BNJ
Machine Learning in Java (MLJ)
• About MLJ
– v1.0a, 10 May 2002: 24000+ lines of Java code, GNU General Public License (GPL)
– http://www.kddresearch.org/Groups/Machine-Learning/MLJ
– Key features [Hsu, Schmidt, Louis, 2002]
• Conformant to MLC++ input-output specification
• Three (3) inductive learning algorithms: ID3, C4.5, discrete Naïve Bayes
• Two (2) wrapper inducers: feature subset selection [Kohavi and John,
1997], CHC [Eshelman, 1990; Guerra-Salcedo and Whitley, 1999]
• Published Research with Applications to Personnel Science
– Recent work
• Multi-agent learning [GH01, GH02a]
• Genetic feature selection wrappers [HSL02, HWRC02, HS02]
– Current work: WEKA compatibility, parallel online continuous arcing
– Planned continuations
• New inducers: instance-based (k-nearest-neighbor), sequential rule
covering, feedforward artificial neural network (multi-layer perceptron)
• New wrappers: theory-guided constructive induction, boosting (Arc-x4,
AdaBoost.M1, POCA)
• Integration of reinforcement learning (RL) inducers
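Of the three MLJ inducers listed above, discrete Naïve Bayes is the simplest; the following self-contained Java sketch shows the idea with add-one (Laplace) smoothing. It is illustrative only and does not reflect MLJ's actual class structure.

/** Minimal discrete Naive Bayes with Laplace (add-one) smoothing.
    X[m][a] = value of attribute a in example m; y[m] = class label of example m. */
public class NaiveBayesSketch {
    double[] logPrior;     // log P(c)
    double[][][] logCond;  // log P(attribute a = v | class c), indexed [a][c][v]

    void train(int[][] X, int[] y, int numClasses, int[] arity) {
        int nAttr = arity.length;
        int[] classCount = new int[numClasses];
        for (int c : y) classCount[c]++;
        logPrior = new double[numClasses];
        for (int c = 0; c < numClasses; c++)
            logPrior[c] = Math.log((double) classCount[c] / y.length);
        logCond = new double[nAttr][numClasses][];
        for (int a = 0; a < nAttr; a++)
            for (int c = 0; c < numClasses; c++)
                logCond[a][c] = new double[arity[a]];
        for (int m = 0; m < X.length; m++)        // accumulate raw counts first
            for (int a = 0; a < nAttr; a++)
                logCond[a][y[m]][X[m][a]] += 1.0;
        for (int a = 0; a < nAttr; a++)           // convert counts to smoothed log probs
            for (int c = 0; c < numClasses; c++)
                for (int v = 0; v < arity[a]; v++)
                    logCond[a][c][v] = Math.log((logCond[a][c][v] + 1.0)
                                                / (classCount[c] + arity[a]));
    }

    int predict(int[] x) {
        int best = 0; double bestLp = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < logPrior.length; c++) {
            double lp = logPrior[c];   // log P(c) + sum_a log P(x[a] | c)
            for (int a = 0; a < x.length; a++) lp += logCond[a][c][x[a]];
            if (lp > bestLp) { bestLp = lp; best = c; }
        }
        return best;
    }
}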
Infrastructure for High-Performance
Computation in Data Mining
Rapid KDD Development Environment: Operational Overview
National Center for Supercomputing
Applications (NCSA) D2K
Visual Programming Interface (Java):
Parallel Genetic Algorithms
Time Series Modeling and Prediction:
Integration with Information Visualization
New Time Series Visualization System (Java3D)
Demographics-Based Clustering for
Prediction (Continuing Research)
[Figure: Cluster Formation and Segmentation Algorithm (sketch): a dimensionality-reducing projection (x′) yields clusters of similar records; a Delaunay triangulation and a Voronoi (nearest-neighbor) diagram (y) segment them]
Data Clustering in
Interactive Real-Time Decision Support
[Figures: 15 × 15 Self-Organizing Map (U-Matrix output); cluster map over the personnel database]
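The U-Matrix view above is produced from a trained self-organizing map (SOM). The following Java sketch shows the core SOM training loop for a 15 × 15 grid; the learning-rate and neighborhood schedules are illustrative assumptions, not the deployed system's settings.

import java.util.Random;

/** Minimal Self-Organizing Map (SOM) training sketch for a 15 x 15 grid.
    Each grid cell holds a weight vector; records pull the best-matching
    unit (BMU) and its neighbors toward themselves. */
public class SomSketch {
    static final int GRID = 15;

    public static double[][][] train(double[][] records, int dim, int epochs, Random rng) {
        double[][][] w = new double[GRID][GRID][dim];
        for (double[][] row : w)                 // random initial weights
            for (double[] cell : row)
                for (int d = 0; d < dim; d++) cell[d] = rng.nextDouble();

        int steps = epochs * records.length, t = 0;
        for (int e = 0; e < epochs; e++) {
            for (double[] x : records) {
                double frac = 1.0 - (double) t / steps;
                double rate = 0.5 * frac;                  // decaying learning rate
                double radius = 1.0 + (GRID / 2.0) * frac; // shrinking neighborhood
                // find the best-matching unit (closest weight vector)
                int bi = 0, bj = 0; double bestD = Double.MAX_VALUE;
                for (int i = 0; i < GRID; i++)
                    for (int j = 0; j < GRID; j++) {
                        double d2 = 0;
                        for (int d = 0; d < dim; d++) {
                            double diff = x[d] - w[i][j][d]; d2 += diff * diff;
                        }
                        if (d2 < bestD) { bestD = d2; bi = i; bj = j; }
                    }
                // pull the BMU's neighborhood toward the record (Gaussian falloff)
                for (int i = 0; i < GRID; i++)
                    for (int j = 0; j < GRID; j++) {
                        double g2 = (i - bi) * (i - bi) + (j - bj) * (j - bj);
                        double h = Math.exp(-g2 / (2 * radius * radius));
                        for (int d = 0; d < dim; d++)
                            w[i][j][d] += rate * h * (x[d] - w[i][j][d]);
                    }
                t++;
            }
        }
        return w;
    }
}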
Summary:
State of High-Performance KDD at KSU-CIS
• Laboratory for Knowledge Discovery in Databases (KDD)
– Applications: interdisciplinary research programs at K-State, FY 2002
• Decision support, optimization (Hsu, CIS; Chang, IMSE)
• (NSF EPSCoR) Bioinformatics – gene expression modeling (Hsu, CIS;
Welch, Agronomy; Roe, Biology; Das, EECE)
• Digital libraries, info retrieval (Hsu, CIS; Zollman, Physics; Math, Art)
• Human-Computer Interaction (HCI) - e.g., simulation-based training
• Curriculum Development
– Real-time intelligent systems (Chang, Hsu, Neilsen, Singh)
– Machine learning and artificial intelligence; info visualization (Hsu)
– Other: bioinformatics, digital libraries, robotics, DBMS
• Research Partnerships
– NCSA: National Computational Science Alliance, National Center for
Supercomputing Applications
– Defense (ONR, ARL, DARPA), Industry (Raytheon)
• Publications, More Info: http://www.kddresearch.org