Intelligent Database

Intelligent Databases
An Overview of Ideas and Developments
Jenny Carter – De Montfort University, UK
Web ref: www.jennycarter.com
1
Introduction
• Overview of developments and research on database/AI integration
• Active databases
• Overview of KBS (Knowledge Based Systems)
• Deductive databases
• Coupling of KBS and standard DBMS
• State of the art
• Other developments
2
1. Active Databases
• Traditional databases are passive: queries, updates and transactions are executed only when requested.
• Certain applications, e.g. inventory control and factory automation, are not well supported by a passive DBMS.
• Capabilities such as automatic monitoring of conditions and the ability to take actions (e.g. in response to timing) require an ACTIVE DBMS.
• An active DBMS uses the idea of TRIGGERS.
3
1. Active Databases
• Two initial approaches:
  - Writing specific code in application programs to perform these tasks. (Problems: maintenance can be difficult because conditions and actions might be spread over several application programs. Such code fragments can also be hard to understand.)
  - Building special application software that periodically polls the DB to detect relevant events. (This is generally all coded in one application program, but the frequency of polling is an issue.)
• Because of the problems with both methods, many systems were extended with a built-in centralised subsystem that provides active capabilities (i.e. active rules, or triggers).
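The polling approach can be sketched roughly as follows. This is only an illustration in Python using the standard sqlite3 module; the stock table, the threshold of 10 and the function names are assumptions made for the example and do not come from any particular system.

```python
import sqlite3


def poll_low_stock(conn, threshold):
    """One polling pass: return the items whose quantity is below threshold."""
    rows = conn.execute(
        "SELECT item, qty FROM stock WHERE qty < ? ORDER BY item", (threshold,)
    ).fetchall()
    return [item for item, _qty in rows]


def demo_polling():
    # Hypothetical inventory table, held in memory for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
    conn.executemany("INSERT INTO stock VALUES (?, ?)", [("bolt", 50), ("nut", 40)])

    first = poll_low_stock(conn, 10)   # nothing is low yet
    conn.execute("UPDATE stock SET qty = 3 WHERE item = 'nut'")
    second = poll_low_stock(conn, 10)  # the change is seen only at the next poll
    conn.close()
    return first, second
```

In a real application poll_low_stock would be called on a timer. A long interval means events are noticed late, and a short one loads the database with queries that mostly find nothing, which is the polling-frequency issue noted above.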
4
Basic Concepts of Active Rules
Active rules take the ECA form:
On Event If Condition Do Action
e.g.
On update of an employee's salary, If the new salary < 10000, Do roll back the update
• Example events: insertions, deletions or updates on columns; temporal events (the time at which a rule should be activated); application-defined events, which can be external to the database, e.g. a temperature measured by a sensor. The application needs to tell the DBMS about these.
• Example conditions: like the WHERE clause of an SQL statement, or even a complete query; or the result of a procedure written in a host language, possibly with embedded database queries. Conditions can also refer to special system variables, e.g. the current user.
• Actions: data updates, further queries, other DB operations (commit, abort, etc.), calls to application procedures.
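The salary rule above can be tried out with a small sketch. It uses Python's standard sqlite3 module, so the trigger is written in SQLite's dialect rather than Oracle's; the employees table, the trigger name salary_floor and the sample figures are assumptions made for the illustration.

```python
import sqlite3


def demo_eca_rule():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT PRIMARY KEY, salary INTEGER)")
    conn.execute("INSERT INTO employees VALUES ('smith', 20000)")
    conn.execute(
        """
        CREATE TRIGGER salary_floor
        BEFORE UPDATE OF salary ON employees          -- Event
        FOR EACH ROW
        WHEN NEW.salary < 10000                       -- Condition
        BEGIN
            SELECT RAISE(ABORT, 'salary below 10000'); -- Action: reject the update
        END
        """
    )
    conn.commit()

    rejected = False
    try:
        conn.execute("UPDATE employees SET salary = 5000 WHERE name = 'smith'")
    except sqlite3.DatabaseError:
        # The trigger fired and the offending statement was undone.
        rejected = True
        conn.rollback()
    query = "SELECT salary FROM employees WHERE name = 'smith'"
    after_rejected = conn.execute(query).fetchone()[0]

    # An update that satisfies the rule goes through untouched.
    conn.execute("UPDATE employees SET salary = 25000 WHERE name = 'smith'")
    conn.commit()
    final = conn.execute(query).fetchone()[0]
    conn.close()
    return rejected, after_rejected, final
```

The application code contains no salary check at all: the condition and action live in the database, which is the point of the centralised approach.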
5
Active Rules
• The format for writing a trigger in Oracle is:
create trigger name
{before | after} {insert | delete | update [of list-of-column-names]}
on table-name
[referencing references]
[for each row]
[when condition]
PL/SQL block;
6
Example of Trigger in Oracle
create trigger salary_check
before insert or update of salary, job
on employee
for each row
when (new.job <> 'PRESIDENT')
declare
/* start of PL/SQL block */
minsal number;
maxsal number;
begin
/* look up the permitted salary range for the new job */
select minsal, maxsal into minsal, maxsal from sal_guide
where job = :new.job;
if (:new.salary < minsal or :new.salary > maxsal)
then raise_application_error(-20601, 'salary ' || :new.salary ||
' out of range for job ' || :new.job || ' for employee ' || :new.name);
end if;
end;
/* end of PL/SQL block */
Oracle provides commands for trigger management, e.g. alter trigger, drop trigger, enable, disable, etc.
7
Active Databases
• Most work on active DBs is associated with RDBMSs rather than OODBs.
• This is partly because OODBs already incorporate methods as well as data.
• It is also because of the complexity that including rules would cause, e.g. scope issues due to inheritance and overriding.
• Many attempts have taken the approach that rules apply to a whole class.
• There are a number of research prototypes, and work in this area is ongoing.
8
2. Brief overview of Knowledge Based Systems (KBS)
• KBSs differ substantially from traditional DBs: they contain rules (as well as simple facts) and they have an inference engine.
• The two main types of representation for KBS are:
  - Rule based: supports inference by resolution (could be in Expert System form, or logic programming form)
  - Frame based: supports inference by inheritance.
9
Example Expert System in Leonardo
10
KBS – Rule Based
• The MYCIN system is probably the best-known (and historically important) example of an Expert System. Built in the mid-1970s and containing about 500 rules, it was designed to perform medical diagnosis in the field of bacterial infections.
• An example rule from MYCIN is:
IF
The infection type is primary-bacteremia, and
The site of the culture is one of the sterile sites, and
The suspected portal of entry of the organism is the gastrointestinal tract
THEN
There is suggestive evidence (0.7) that the identity of the organism is bacteroides.
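To show how the 0.7 in such a rule might be used, here is a much simplified sketch in Python. It assumes the commonly described certainty-factor scheme in which a conjunction is as certain as its weakest premise and the rule's factor scales that value; the 0.2 firing threshold, the evidence values and the function name fire_rule are assumptions for the illustration, not details taken from MYCIN's actual implementation.

```python
def fire_rule(evidence, premises, rule_cf, threshold=0.2):
    """Return the certainty of the rule's conclusion, or None if it does not fire.

    evidence maps each known finding to a certainty between 0 and 1.
    """
    # A missing finding counts as certainty 0.
    premise_cf = min(evidence.get(p, 0.0) for p in premises)
    if premise_cf <= threshold:
        return None  # the premises are too uncertain for the rule to fire
    return rule_cf * premise_cf


PREMISES = [
    "infection type is primary-bacteremia",
    "culture site is a sterile site",
    "portal of entry is the gastrointestinal tract",
]

# Hypothetical certainties for one patient.
evidence = {PREMISES[0]: 1.0, PREMISES[1]: 0.8, PREMISES[2]: 0.9}

# Weakest premise is 0.8, so the conclusion gets roughly 0.7 * 0.8 = 0.56.
cf_bacteroides = fire_rule(evidence, PREMISES, rule_cf=0.7)
```

If any premise is missing from the evidence, fire_rule returns None and the rule contributes nothing.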
11
KBS – Frame Based
• Inheritance is one of the most powerful and popular concepts used in AI. It:
• allows similar notions to be grouped into classes, economising on the descriptions of some of the attributes;
• allows deductions to be made about properties of lower-level entities;
• allows new classes to be defined as variants of existing ones.
12
A simple inheritance hierarchy
The hierarchy needs to allow for overriding where necessary (e.g. all elephants are grey, except for a particular known instance, Nellie, who is pink).
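Inference by inheritance with overriding can be sketched in a few lines of Python. The Frame class, the slot names and the second elephant, Clyde, are assumptions introduced for the illustration; only the grey elephants and pink Nellie come from the example above.

```python
class Frame:
    """A frame holds its own slots and inherits the rest from its parent."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = slots

    def get(self, slot):
        # Walk up the hierarchy; the nearest definition wins, which is
        # what lets a lower-level frame override an inherited value.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


animal = Frame("animal", alive=True)
elephant = Frame("elephant", parent=animal, colour="grey", legs=4)
clyde = Frame("clyde", parent=elephant)                  # hypothetical ordinary elephant
nellie = Frame("nellie", parent=elephant, colour="pink")  # overrides the class default
```

Here clyde.get("colour") is found on the elephant frame, while nellie.get("colour") stops at Nellie's own slot; both still inherit legs and alive from higher up.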
13
3. Deductive Database Systems
• DBs need to store and manage data from which users can extract relevant information.
• This is difficult where there are large amounts of complex data.
• It is more difficult still when information must be derived according to complex rules.
• One approach might be to code the rules into application programs.
14
3. Deductive Databases
• Deductive databases attempt to solve the problem by storing explicit data together with deductive rules that enable inferences to be made from the stored data.
• Data obtained via the action of deductive rules on stored data is known as Derived Data.
• Deductive databases are therefore the result of combining logic programming with traditional databases.
• They are characterised by handling large amounts of data as well as performing reasoning based on that information.
15
Basic Concepts of Deductive DBs
• Includes a set of data: FACTS (sometimes known as the extensional database)
• Includes a set of inference rules: RULES (sometimes known as the intensional database)
• The DATALOG language offers an approach to this.
• It combines database ideas with the logic language Prolog.
• It allows definition of both tables and rules.
• Includes facilities for defining integrity constraints etc.
• Facts are easier to store than with a logic programming language.
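The split between stored facts and derived data can be illustrated with a tiny sketch. It is written in Python rather than DATALOG, and the parent facts, the two ancestor rules and the function name are assumptions chosen for the example. The rules are applied repeatedly to the stored facts until nothing new can be derived.

```python
def derive_ancestors(parent_facts):
    """Apply the rules below to the stored facts until a fixpoint is reached.

    Rules (written informally):
        ancestor(X, Y) if parent(X, Y)
        ancestor(X, Z) if parent(X, Y) and ancestor(Y, Z)
    """
    ancestors = set(parent_facts)  # first rule: every parent is an ancestor
    while True:
        # Second rule: extend each parent link by an already known ancestor link.
        new = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in ancestors
            if y == y2
        }
        if new <= ancestors:  # nothing new was derived
            return ancestors
        ancestors |= new


# Extensional database: explicitly stored facts (hypothetical names).
parent = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}

# Derived data: ancestor pairs that are not stored anywhere as facts.
derived = derive_ancestors(parent) - parent
```

Only the three parent facts are stored; the pairs in derived exist solely because the rules were applied.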
16
Prolog Database – Simple Example
17
Prolog database with facts and rules
A possible query might be:
?- weather(X).
18
Deductive DB Architectures
An example of a heterogeneous system is NAIL (Not Another Implementation of Logic), developed at Stanford University. It links DATALOG to a conventional SQL DB system.
DDBs are especially useful for problems involving temporal and/or spatial aspects.
Also see ProDBI:
www.sics.se/isl/quintus/prodbi/db.htm
19
4. Coupling of KBS and ‘standard’ DBMSs
• These types of system often use a KBS as the front end, with a DBMS as the back end.
• Some people say this leads to a fundamental mismatch due to:
  - Knowledge representation (KR): flat DB tables are not compatible with some of the advanced KR techniques used.
  - ESs often have a fact base developed in an ad hoc way, which can result in performance problems that a traditional DB system would not have.
  - Redundant data descriptions often end up being used in order to make data exchange possible.
20
4. Coupling of KBS and ‘standard’ DBMSs
  - Static inference processing in AI versus dynamic queries in DBs:
    • A DBMS takes its operational knowledge from application programs, e.g. embedded SQL, stored procedures, etc. The operational part of a KBS is represented by declarative knowledge in the rule base.
  - Granularity mismatch: a KBS can't exploit the set optimisation that is a benefit of a DBMS:
    • A KBS works with one row at a time instead of sets of rows, and hence loses the effect of optimisation on sets of data.
• There are existing implementations that suit the purpose and are not seriously affected by these problems.
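The granularity mismatch described on this slide can be made concrete with a short sketch. It uses Python's standard sqlite3 module; the employee table, its contents and the function names are assumptions made for the illustration. The first function mimics a front end that pulls every row across and tests each one itself, while the second hands the whole question to the DBMS as a single set-oriented query.

```python
import sqlite3


def make_employee_db():
    # Hypothetical data, held in memory for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [("ann", "sales", 20000), ("bob", "sales", 30000), ("cat", "it", 40000)],
    )
    return conn


def total_row_at_a_time(conn, dept):
    """Fetch every row and test it in the front end, one row at a time."""
    total = 0
    for _name, row_dept, salary in conn.execute(
        "SELECT name, dept, salary FROM employee"
    ):
        if row_dept == dept:
            total += salary
    return total


def total_set_at_a_time(conn, dept):
    """Let the DBMS filter and aggregate the whole set in one statement."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(salary), 0) FROM employee WHERE dept = ?", (dept,)
    ).fetchone()
    return total
```

Both functions return the same answer, but only the second gives the DBMS a chance to use indexes and its query optimiser, and it moves one value instead of the whole table to the front end.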
21
Coupling of DBMS & KBS
• Different levels of coupling between the two types of system can be adopted:
• A communication channel between the two subsystems
• Extracting data from the DBMS, then storing and using the snapshot in the KBS
  - (Problems here: snapshots soon become obsolete because DBs are updated frequently; snapshots may be needed from several sources at once; and the approach is slow.)
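The obsolescence problem with snapshots is easy to reproduce. The following sketch uses Python's standard sqlite3 module; the stock table, its contents and the function names are assumptions for the illustration. It copies the data into an in-memory dictionary, standing in for the KBS fact base, and then updates the database.

```python
import sqlite3


def take_snapshot(conn):
    """Copy the current stock levels out of the DBMS into a plain dictionary."""
    return dict(conn.execute("SELECT item, qty FROM stock"))


def demo_stale_snapshot():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
    conn.execute("INSERT INTO stock VALUES ('bolt', 50)")

    snapshot = take_snapshot(conn)  # what the KBS will reason over
    conn.execute("UPDATE stock SET qty = 5 WHERE item = 'bolt'")
    live = take_snapshot(conn)      # what the database now says

    conn.close()
    return snapshot["bolt"], live["bolt"]
```

Any rule evaluated against the snapshot after the update would still see the old quantity, so conclusions drawn from it may no longer hold.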
22
Architectural solutions for KBS/DBMS integration
The first architecture in the diagram was implemented at Trinity College, Dublin, in a system known as DIFEAD (Dictionary Interface for ES and DB). It was one of the first systems to do this, and also the first to base the interface functionalities on the data dictionary concept.
An earlier, similar example is KADBASE, which uses a network data access manager to provide a central interface between the different components.
(KESE = Knowledge Engineering Software Environment)
23
Extending KBS with DB components
• This is the solution adopted by ES tool vendors, so that their systems can use information extracted from a database.
• A well-known product that operates in this way is KBMS.
• Written in C
• Uses the idea of forward and backward chaining
• Incorporates an NL facility by allowing developers to write rules in English
• Includes its own relational DB storage facility
• Uses If-Then type rules
24
5. State of The Art
A series of annual international workshops aims to go beyond the classical KBS/DB connection. The first one was held in 1994. The proceedings can be found at:
http://sunsite.informatik.rwthaachen.de/Publications/CEURWS/Vol-54/
KRDB (Knowledge Representation meets Databases) Workshops
25
CYC project
A website about this can be seen at:
http://www.CYC.com
• Launched in late 1984 as an MCC (Microelectronics and Computer Technology Corporation, Texas) project.
• A very large KB built with a huge amount of common-sense knowledge. Includes ideas to do with time, space, substances, contradiction, causality, emotions, beliefs, etc.
• Contains more than a million hand-inserted assertions, made up of facts and rules.
• Includes interface tools and runs on various platforms. Another interface is currently being developed so that the general public can insert facts and rules.
26
CYC Project
Attempts to redress the ‘narrowness’ of the domains addressed by KBSs.
It is now being used in concrete applications.
27
CYC products
28
6. Other Developments
• Temporal databases
• Ontologies
• Semi-structured & unstructured data
• Internet indexing & retrieval
• Data mining
29
Temporal DB example - Temibase
• Integrates AI rules with a temporal database
• Can handle incomplete temporal information
• Supports temporal reasoning
• Supports learning through derivation performed on data and rules
• Supports both active and passive rules, depending on the purpose of the system
• An NL interface is currently being developed
See web page for links
30
7. Personal interest – music representation
• Symbolic approaches have proved useful
  - A vocabulary of symbols is used to represent concepts or objects
  - The programmer uses the vocabulary to say, in its terms, how programs can achieve results
• Level of abstraction for music?
  - No right answer for everyone
  - Jackendoff’s idea of the “musical surface”: the “lowest level of representation which has musical significance”
31
Music representation
• Wiggins & Smaill propose 2 dimensions on which to compare music representations:
  - Expressive completeness
  - Structural generality
32
Music representation
• They aim for “a representation with an explicit but not too restrictive musical surface, within which the widest possible range of data can be represented”
• Enables sharing of data between researchers
• Better means of expressing and exchanging new and difficult ideas
• They propose the CHARM system
• Language independent, but most implementations have been in Prolog, which is very good for symbol manipulation
33
Summary
• Various approaches to bringing AI and database technology together
• Many applications for which these are already being used
• Many potential applications
• Especially useful for unusual problems
34