Statistical Relational AI

Download Report

Transcript Statistical Relational AI

CSE 574 – Artificial Intelligence II
Statistical Relational Learning
Instructor:
Pedro Domingos
Logistics
• Instructor: Pedro Domingos
Email: [email protected]
Office: 648 Allen Center
Office hours: Wednesdays 4:30-5:30
• TA: Stanley Kok
Email: [email protected]
Office: 216 Allen Center
Office hours: Mondays 4:30-5:30
• Web: www.cs.washington.edu/574
• Mailing list: cse574
Evaluation
• Seminar (Pass/Fail)
• Project (100% of grade)
–
–
–
–
Proposals due April 8
Progress report due May 6
Presentation in class
Final report due June 3
Materials
• L. Getoor & B. Taskar (eds.), Statistical
Relational Learning, MIT Press (to appear).
– Draft chapters
– Feedback for authors
• Papers
Topics
• Background
• SRL approaches
• SRL problems and applications
Background
• Statistical learning
• Inductive logic programming
• Sequential and spatial models
SRL Approaches
•
•
•
•
•
•
Probabilistic relational models
Stochastic logic programs
Bayesian logic programs
Relational Markov networks
Markov logic networks
Etc.
SRL Problems and Applications
•
•
•
•
•
•
Aggregation
Autocorrelation
Information extraction and NLP
Biology and medicine
Relational reinforcement learning
Etc.
Today: Introduction
• Motivation
–
–
–
–
The AI view
The data mining view
The statistical view
The computer science view
• Applications
• Major problem types
• A map of the field
The AI View
Statistical Relational AI
Probability
First-Order Logic
Propositional Logic
The Data Mining View
•
•
•
•
•
Most databases contain multiple tables
Data mining algorithms assume one table
Manual conversion: slow, costly bottleneck
Important patterns may be missed
Solution: Multi-relational data mining
The Statistical View
• Most statistical models assume i.i.d. data
(independent and identically distributed)
• A few assume simple regular dependence
(e.g., Markov chain)
• This is a huge restriction – Let’s remove it!
– Allow dependencies between samples
– Allow samples with different distributions
The Computer Science View
• CS faces a complexity bottleneck
– Cost of hand-coding
– Brittleness
• Machine learning and probability overcome this
• But they mostly apply only to attribute vectors
• Let’s extend them to handle structured objects,
class hierarchies, relational databases, etc.
Applications
• Bottom line: Using statistical and relational
information gives better results
–
–
–
–
–
–
–
Web search (Brin & Page, WWW-98)
Text classification (Chakrabarti et al, SIGMOD-98)
Marketing (Domingos & Richardson, KDD-01)
Record linkage (Pasula et al, NIPS-02)
Gene expression (Segal et al, UAI-03)
Information extraction (McCallum & Wellner, NIPS-04)
Etc.
Major Problem Types
•
•
•
•
•
•
•
•
Collective classification
Link discovery
Link-based search
Link-based clustering
Social network analysis
Object identification
Transfer learning
Etc.
A Map of the Field
• There are many approaches
(“Alphabet soup”)
• Every year new ones are proposed
(and for good reason)
• Key is to understand the major dimensions
along which approaches can differ
Major Dimensions
•
•
•
•
•
Probabilistic language
Logical language
Type of learning
Type of inference
Aggregation
Probabilistic Language
•
•
•
•
Bayesian networks
Markov networks (aka Markov random fields)
Restrictions of these (e.g., logistic regression)
Probabilistic context-free grammars
Logical Language
•
•
•
•
Prolog / Horn clauses
Frame systems / Description logics
Conjunctive database queries
Full first-order logic
Type of Learning
• Generative vs. discriminative
• Structure vs. parameters
• Knowledge-poor vs. knowledge-rich
Type of Inference
• Marginal/conditional vs. MAP
– Marg./cond.: MCMC, belief propagation, etc.
– MAP: Graph cuts, weighted satisfiability, etc.
• Full grounding vs. KBMC
Aggregation
• Quantifiers
• SQL-like aggregators
(MAX, AVG, SUM, COUNT, MODE, etc.)
• Noisy-OR
• Logistic regression
Examples
• Probabilistic relational models
(Friedman et al, IJCAI-99)
• Stochastic logic programs
(Muggleton, SRL-00)
• Bayesian logic programs
(Kersting & De Raedt, ILP-01)
• Relational Markov networks
(Taskar et al, UAI-02)
• Markov logic networks
(Richardson & Domingos, SRL-04)