Multi-Relational Data Mining: An Introduction

Joe Paulowskey

Overview

Introduction to Data Mining
Relational
  Data
  Patterns
Inductive Logic Programming (ILP)
Relational Association Rules
Relational Decision Trees
Relational Distance-Based Approaches

Relational Data

Relational Database
  Multiple Tables
  Defined Views
Tables

Relational Pattern

Multiple relations from a relational database
  More Expressive
Opens up relational versions of
  Classification
  Association
  Regression

Relational Pattern (Cont.)

Expressed in subsets of first-order logic

Data Mining

Look for patterns in data
What do you discover?
  Associations
  Sequences
  Classifications

Goals of Data Mining

Predict
Identify
Classify
Optimize

Uses

Business Data
Environmental/Traffic
Engineering
Web Mining
Drug Design

Data Mining: Relational Databases

Most Data Mining approaches deal with single tables
  Not safe to merge multiple tables into a single table
Number of patterns increases
  Explicit constraints required
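
One well-known reason merging is unsafe is that a one-to-many join repeats rows, so statistics computed on the merged table no longer describe the original objects. The sketch below illustrates this in Python with made-up data; the tables `customers` and `orders` and the helpers `natural_join` and `fraction_gold` are hypothetical names chosen for illustration only.

```python
# Hypothetical toy data: each customer can have many orders (one-to-many).
customers = [
    {"cust_id": 1, "gold_member": True},
    {"cust_id": 2, "gold_member": False},
]
orders = [
    {"cust_id": 1, "item": "book"},
    {"cust_id": 1, "item": "pen"},
    {"cust_id": 1, "item": "lamp"},
    {"cust_id": 2, "item": "book"},
]


def natural_join(left, right, key):
    """Join two lists of dict rows on a shared key (simple nested-loop join)."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def fraction_gold(rows):
    """Fraction of rows whose gold_member attribute is True."""
    return sum(r["gold_member"] for r in rows) / len(rows)


joined = natural_join(customers, orders, "cust_id")

# Per customer, half are gold members; per joined row, customer 1 is
# counted three times, so the same statistic comes out higher.
print(fraction_gold(customers))
print(fraction_gold(joined))
```

A pattern found on `joined` is a statement about orders, not about customers, which is why single-table algorithms cannot simply be run on the merged table.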

Inductive Logic Programming (ILP)

Logic Programs used to find patterns
  Clauses
    Head and Body
    Literals
  Types
    Definite
    Program

ILP (Cont.)

Predicates -> Relations in relational database
  Arguments -> Attributes
Attributes are Typed
Database Clauses are typed program clauses
  Deductive Database
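
The correspondence between predicates and relations can be sketched in Python by storing a relation as a set of tuples and evaluating a clause as a join over it. This is only an illustration under assumed data: the `parent` relation and the `grandparent` clause are hypothetical examples, not taken from the slides.

```python
# Hypothetical relation parent(Parent, Child): one tuple per table row,
# one tuple position per attribute (predicate argument).
parent = {("ann", "bob"), ("bob", "cid"), ("bob", "dee")}


def grandparent(parent_facts):
    """Evaluate the clause grandparent(X, Z) :- parent(X, Y), parent(Y, Z).

    The head is grandparent(X, Z); the body is the two parent literals.
    Sharing the variable Y between the literals acts as a join condition.
    """
    return {
        (x, z)
        for (x, y1) in parent_facts
        for (y2, z) in parent_facts
        if y1 == y2
    }


derived = grandparent(parent)
print(sorted(derived))  # [('ann', 'cid'), ('ann', 'dee')]
```

In database terms the clause behaves like a view defined over the `parent` table, which is the sense in which a set of such clauses forms a deductive database.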

Relational Rule Induction ILP

Learn logical definitions of relations
  Classification
  Rules can be found by decision trees
  Simple Algorithm
Dealing with noisy/incomplete data

ILP Problems to Propositional Forms

Propositional
  attribute-value
Use Single Table Data Mining algorithms
  LINUS
  Background Knowledge
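
The idea of transforming a relational problem into attribute-value form can be sketched as follows. This is a simplified illustration in the spirit of such a transformation, not the actual LINUS system: each background-knowledge predicate applied to the arguments of an example becomes one boolean column. The relations `female` and `parent`, the target `daughter(X, Y)`, and the helper `propositionalize` are all hypothetical.

```python
# Hypothetical background knowledge, stored as sets of tuples.
female = {("mary",), ("ann",), ("eve",)}
parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve")}

# Hypothetical labelled examples of the target relation daughter(X, Y).
examples = [
    (("mary", "ann"), True),
    (("eve", "tom"), True),
    (("tom", "ann"), False),
    (("eve", "ann"), False),
]


def propositionalize(args):
    """Turn one (X, Y) example into a flat attribute-value row."""
    x, y = args
    return {
        "female(X)": (x,) in female,
        "female(Y)": (y,) in female,
        "parent(X,Y)": (x, y) in parent,
        "parent(Y,X)": (y, x) in parent,
    }


# One row per example: a single table that any propositional learner can use.
table = [dict(propositionalize(args), label=label) for args, label in examples]
for row in table:
    print(row)
```

On this toy table the label is true exactly when `female(X)` and `parent(Y,X)` are both true, which a single-table rule learner could find and which translates back to the relational rule daughter(X, Y) :- female(X), parent(Y, X).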

ILP/RDM Algorithms

Share
  Learning as a Search Paradigm
Differences
  Representation of Data, Patterns
  Refinement operators
  Testing Coverage
Upgrading from Propositional to Relational

Relational Association Rules

Frequent Patterns
  Determining Frequency
  Itemsets
Association Rules
  Obtained from frequent itemsets
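
How frequency is determined and how rules are obtained from frequent itemsets can be sketched for the single-table (propositional) case. The transactions, the threshold, and the function names below are assumptions made for illustration, and the enumeration is brute force rather than an optimized algorithm. In the relational setting, itemsets are replaced by more expressive patterns over several tables, but the support and confidence calculations follow the same shape.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(min_support):
    """Brute-force list of all itemsets with support >= min_support."""
    items = sorted(set().union(*transactions))
    return [
        frozenset(c)
        for size in range(1, len(items) + 1)
        for c in combinations(items, size)
        if support(frozenset(c)) >= min_support
    ]


def confidence(body, head):
    """Confidence of the rule body -> head: support(body and head) / support(body)."""
    return support(body | head) / support(body)


frequent = frequent_itemsets(0.5)
print(frequent)
print(confidence(frozenset({"bread"}), frozenset({"milk"})))
```

With a minimum support of 0.5, five itemsets are frequent here, and the rule bread -> milk holds in two of the three transactions that contain bread.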

Relational Decision Trees

Used for Prediction
  Binary Trees
  First Order Decision List

Relational Distance-Based Approaches

Calculated distance between two objects
  Statistical Approaches

Conclusion