The Louse and the Mars Explorer


Abductive Logic Programming Agents
• The ALP agent cycle
• ALP combines backward and forward reasoning
• ALP gives a semantics to production rules
• ALP can be used for explaining observations, conditional solutions, generating actions, and default reasoning
• Pre-active reasoning, combining utility and uncertainty
• Deciding whether or not to carry an umbrella
• The prisoner’s dilemma
Abductive logic programming (ALP) agent model
[Diagram: the ALP agent cycle. The agent observes the world; forward reasoning with beliefs derives consequences of the observations and triggers maintenance goals, yielding achievement goals; backward reasoning with beliefs reduces achievement goals to sub-goals; forward reasoning derives the consequences of candidate actions; the agent judges probabilities and utilities, decides, and acts on the world.]
ALP agents combine beliefs and goals
Beliefs, represented by logic programs,
describe how things are.
Goals, represented by integrity constraints,
prescribe how things should be. They include:
• condition-action rules
• commands
• queries
• obligations and prohibitions
• atomic and non-atomic actions
• denials
The ALP agent cycle
• Record current observations.
• Use forward reasoning to derive consequences of the observations, triggering any integrity constraints and adding any new goals.
• Use backward reasoning to reduce goals to sub-goals.
• Perform conflict resolution to choose between candidate sub-goals that are atomic actions.
• Execute the associated actions.
Conflict resolution can be performed by using forward reasoning to derive consequences of candidate actions. Decision theory can then be used to choose actions whose consequences have maximal expected utility.
Backward reasoning can also be used to explain observations, before using forward reasoning to derive their consequences.
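To make the cycle concrete, here is a minimal Python sketch of a single pass through it, using the London Underground emergency example that follows. The helper names (forward_reason, backward_reason, decide) are illustrative stand-ins for these notes, not part of any real ALP system.

# A minimal sketch of one pass through the ALP agent cycle, assuming a
# simple propositional encoding of the emergency example.

# maintenance goal (integrity constraint): if there is an emergency then get help
maintenance_goals = [("there is an emergency", "get help")]

# beliefs as (conclusion, condition) clauses
beliefs = [
    ("there is an emergency", "there is a fire"),
    ("get help", "alert the driver"),
    ("alert the driver", "press the alarm signal button"),
]

atomic_actions = {"press the alarm signal button"}

def forward_reason(observation):
    """Derive consequences of an observation and return any triggered goals."""
    derived = {observation}
    changed = True
    while changed:
        changed = False
        for conclusion, condition in beliefs:
            if condition in derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return [goal for trigger, goal in maintenance_goals if trigger in derived]

def backward_reason(goal):
    """Reduce a goal to candidate atomic actions by chaining through beliefs."""
    if goal in atomic_actions:
        return [goal]
    return [action
            for conclusion, condition in beliefs if conclusion == goal
            for action in backward_reason(condition)]

def decide(candidates):
    """Conflict resolution: here, simply pick the first candidate action."""
    return candidates[0] if candidates else None

for observation in ["there is a fire"]:          # record current observations
    for goal in forward_reason(observation):     # forward reasoning
        action = decide(backward_reason(goal))   # backward reasoning, then decide
        if action:
            print("act:", action)                # act: press the alarm signal button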
The London underground
Goal
If there is an emergency then I get help.
Beliefs
A person gets help
if the person alerts the driver.
A person alerts the driver
if the person presses the alarm signal button.
There is an emergency if there is a fire.
There is an emergency if one person attacks another.
There is an emergency if someone becomes seriously ill.
There is an emergency if there is an accident.
There is a fire if there are flames.
There is a fire if there is smoke.
ALP combines forward and backward reasoning
[Diagram: the observation "there is a fire" arrives from the world; forward reasoning derives "there is an emergency", which triggers the goal "if there is an emergency then get help" and produces the achievement goal "get help"; backward reasoning reduces "get help" to "alert the driver" and then to "press the alarm signal button", which is executed as an action in the world.]
Abductive Logic Programming
Abductive Logic Programs <P, A, IC> have three components:
• P is a normal logic program.
• A is a set of abducible predicates.
• IC, the set of integrity constraints, is a set of first-order sentences.
Often, ICs are expressed as conditionals:
If A1 &...& An then B
or as denials:
not (A1 &...& An & not B)
Normally, P is not allowed to contain any clauses whose conclusion
contains an abducible predicate.
(This restriction can be made without loss of generality.)
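As a purely illustrative way of holding such a triple in code, one might use a record like the one below; the field layout and names are assumptions of these notes, not a standard ALP interface.

# One possible Python representation of an abductive logic program <P, A, IC>:
# P as (conclusion, conditions) clauses, A as a set of abducible atoms, and
# IC as denial-style checks over a set of derived atoms.

from typing import Callable, NamedTuple

class AbductiveProgram(NamedTuple):
    P: list[tuple[str, set[str]]]          # normal logic program clauses
    A: set[str]                            # abducible predicates (here, ground atoms)
    IC: list[Callable[[set[str]], bool]]   # integrity constraints as checks

# the alarm-button example used later in these slides
program = AbductiveProgram(
    P=[("there is an emergency", {"there is a fire"}),
       ("you get help", {"you alert the driver"}),
       ("you alert the driver", {"you press the alarm signal button"})],
    A={"there is a fire", "you press the alarm signal button"},
    # "if there is an emergency then you get help", as the denial
    # not (there is an emergency & not you get help)
    IC=[lambda atoms: not ("there is an emergency" in atoms
                           and "you get help" not in atoms)],
)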
ALP Semantics and Proof Procedures
Semantics:
Given an abductive logic program, < P,A,IC > , an abductive
explanation for a goal G is a set Δ of ground atoms in terms of
the abducible predicates such that:
G holds in P ∪ Δ
IC holds in P ∪ Δ
(or, alternatively, P ∪ Δ ∪ IC is consistent).
Proof procedures:
Backward reasoning to show G.
Forward reasoning to show that observations and explanations satisfy IC.
Different notions of “holds” are compatible with these characterisations, e.g. truth in the “intended” minimal model, truth in all models, etc.
ALP gives a logical semantics to production rules.
• Logical rules used to reason forward can be represented by LP clauses, with forward reasoning.
• Reactive rules that implement stimulus-response associations can be represented by integrity constraints, with forward reasoning.
• Pro-active rules that simulate goal-reduction:
If goal G and conditions C then add H as a sub-goal.
can be represented by LP clauses, with backward reasoning.
ALP viewed in Active Deductive Database terms
Logic programs define data. E.g.
The bus leaves at 9:00.
The bus leaves at 10:00.
The bus leaves at X:00
if X is an integer & 9 ≤ X ≤ 18.
Integrity constraints maintain integrity. E.g.
There is no bus before 9:00.
If the bus leaves at X:00,
then it arrives at its destination at X:Y, where 20 ≤ Y ≤ 30.
ALP can be used to explain observations
Program:
Grass is wet if it rained.
Grass is wet if the sprinkler was on.
The sun was shining.
Abducible predicates:
it rained,
the sprinkler was on
Integrity constraint:
not (it rained and the sun was shining)
Observation:
Grass is wet
Two potential explanations:
it rained,
the sprinkler was on
The only explanation that satisfies the integrity constraint is
the sprinkler was on.
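As a sketch of how such explanations could be computed, the following Python fragment enumerates sets of abducibles, closes them under the program, and keeps only those that derive the observation and satisfy the integrity constraint; the encoding is illustrative, not a real ALP proof procedure.

# Abduction for the wet-grass example: candidate explanations are sets of
# abducible atoms that make the observation hold and respect the constraint.

from itertools import combinations

facts = {"the sun was shining"}
rules = [("grass is wet", {"it rained"}),
         ("grass is wet", {"the sprinkler was on"})]
abducibles = ["it rained", "the sprinkler was on"]

def consequences(delta):
    """Forward closure of the facts plus the assumed abducibles."""
    known = facts | set(delta)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if body <= known and head not in known:
                known.add(head)
                changed = True
    return known

def integrity_ok(known):
    return not {"it rained", "the sun was shining"} <= known   # not (rained & sunny)

observation = "grass is wet"
for size in range(1, len(abducibles) + 1):
    for delta in combinations(abducibles, size):
        known = consequences(delta)
        if observation in known and integrity_ok(known):
            print("explanation:", set(delta))   # only {'the sprinkler was on'}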
ALP can be used to generate conditional solutions
Program:
X citizen if X born in USA.
X citizen if X born outside USA & X resident of USA & X naturalised.
X citizen if X born outside USA & Y is mother of X & Y citizen & X registered.
Mary is mother of John.
Mary is citizen.
Abducible predicates: X born in USA,
X born outside USA,
X resident of USA, X naturalised, X registered
Integrity constraint:
if John resident of USA then false.
Goal:
John citizen
Two abductive solutions:
John born in USA,
John born outside USA & John registered
ALP can be used to generate actions
Program:
there is an emergency if there is a fire
you get help if you alert the driver
you alert the driver if you press the alarm signal button
Abducible predicates
there is a fire,
you press the alarm signal button
Integrity constraint functioning as a maintenance goal:
If there is an emergency, then you get help
Abductive solution
you press the alarm signal button
ALP can be used for default reasoning
Program:
X can fly if X is a bird and normal X
X is a bird if X is a penguin
Abducible predicate:
normal X
Integrity constraint functioning as a denial:
If normal X and penguin X then false
Observation:
tweety is a bird
Abductive consequence:
tweety can fly,
assuming normal tweety
New observation:
tweety is a penguin
Consequence withdrawn, because the assumption normal tweety now violates the integrity constraint.
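A tiny Python sketch of this nonmonotonic behaviour (again an illustrative encoding, not an ALP system): the conclusion holds only while the abducible assumption normal tweety is consistent with the integrity constraint.

# Default reasoning: "tweety can fly" follows only if "normal tweety" can be
# assumed without violating the constraint not (normal X and penguin X).

def tweety_can_fly(observations):
    known = set(observations)
    if "tweety is a penguin" in known:                 # X is a bird if X is a penguin
        known.add("tweety is a bird")
    normal_ok = "tweety is a penguin" not in known     # integrity constraint check
    return "tweety is a bird" in known and normal_ok   # X can fly if X is a bird and normal X

print(tweety_can_fly({"tweety is a bird"}))                          # True
print(tweety_can_fly({"tweety is a bird", "tweety is a penguin"}))   # False: withdrawn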
ALP agents can reason pre-actively,
taking into account utility and uncertainty
A common form of belief has the form:
an effect takes place if
an agent does something and
some conditions hold in the environment.
Different effects have different utilities.
The state of the environment is uncertain.
The same belief can be used:
• to reason forwards from observations
• to reason backwards from desired effects
• to reason forwards from candidate actions
• to reason backwards from observed effects
Combining utility and uncertainty
with pre-active thinking
To get rich, I am thinking about robbing a bank
But before constructing a plan in all its detail,
I mentally infer the possible consequences.
Apart from any moral considerations,
if I rob a bank, get caught, and am convicted, then I will end up in jail.
But I don’t want to go to jail.
I can control whether or not I try to rob a bank,
but I cannot control whether I will be caught or convicted.
I can only judge their likelihood.
If I judge that the likelihood of getting caught and being convicted is high,
then I will decide not to rob a bank, because I don’t want to go to jail.
I will not even think about how I might rob a bank,
because all of the alternatives lead to the same undesirable consequence.
Pre-active thinking can be applied at different levels of detail
[Diagram: the ALP agent cycle, as shown earlier.]
Pre-active thinking
Goal
I carry an umbrella or I do not carry an umbrella.
Beliefs
I stay dry if I carry an umbrella.
I get wet if I do not carry an umbrella and it rains.
I stay dry if it doesn’t rain (whether or not I carry an umbrella).
Assume
I carry an umbrella.
Infer
I stay dry (whether or not it rains).
Assume
I do not carry an umbrella.
Infer
I get wet if it rains.
I stay dry if it doesn’t rain.
Decision Theory:
to find the expected utility of a proposed action,
find all the alternative resulting states of affairs,
weigh the utility of each such state by its probability, and
add them all up.
Expected utility of action1 = p11·u11 + p12·u12 + p13·u13 + p14·u14
Expected utility of action2 = p21·u21 + p22·u22 + p23·u23 + p24·u24
Choose the action of highest expected utility.
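The rule can be written directly as a short Python sketch; the two actions and their (probability, utility) outcomes below are placeholder numbers, standing in for the p's and u's above.

# Expected utility of an action = sum of probability x utility over its outcomes.

def expected_utility(outcomes):
    """outcomes is a list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def best_action(actions):
    """actions maps each action to its (probability, utility) outcomes."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

actions = {"action1": [(0.5, 6), (0.5, -2)],    # expected utility 2.0
           "action2": [(0.25, 4), (0.75, 0)]}   # expected utility 1.0
print({a: expected_utility(o) for a, o in actions.items()})
print(best_action(actions))                     # action1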
Deciding whether or not to carry an umbrella
Assume
Probability it rains = .1
Probability it doesn’t rain = .9
Utility of getting wet = -10
Utility of staying dry = 1
Utility of carrying an umbrella = -2
Utility of not carrying an umbrella = 0
Assume
I carry an umbrella.
Infer
I stay dry with probability 1.
Expected utility: -2 + 1 = -1
Assume
I do not carry an umbrella.
Infer
I get wet with probability .1.
I stay dry with probability .9.
Expected utility: 0 + (-10)·.1 + 1·.9 = -1 + .9 = -.1
Decide
I do not carry an umbrella!
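The same calculation for the umbrella decision, as a self-contained Python sketch using the probabilities and utilities assumed above.

p_rain, p_no_rain = 0.1, 0.9
u_wet, u_dry = -10, 1
u_carry, u_not_carry = -2, 0

eu_carry = u_carry + 1.0 * u_dry                                  # dry for certain
eu_not_carry = u_not_carry + p_rain * u_wet + p_no_rain * u_dry
print(eu_carry, eu_not_carry)                                     # -1.0 and about -0.1
print("carry" if eu_carry > eu_not_carry else "do not carry")     # do not carry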
A more practical alternative might be to use maintenance
goals or condition-action rules instead:
If I leave home and it is raining
then I take an umbrella.
If I leave home and there are dark clouds in the sky
then I take an umbrella.
If I leave home and the weather forecast predicts rain
then I take an umbrella.
The maintenance goals compile decision-making into the thinking
component of the agent cycle. The compilation might be an exact
implementation of the Decision Theoretic specification, or only an
approximation.
The Prisoner’s Dilemma
Goal
I turn state witness or
I do not turn state witness.
Beliefs
A prisoner gets 0 years in jail
if the prisoner turns state witness
and the other prisoner does not.
A prisoner gets 4 years in jail
if the prisoner does not turn state witness
and the other prisoner does.
A prisoner gets 3 years in jail
if the prisoner turns state witness
and the other prisoner does too.
A prisoner gets 1 year in jail
if the prisoner does not turn state witness
and the other prisoner does not either.
Pre-active thinking
Assume
I turn state witness
Infer
I get 0 years in jail
if the other prisoner does not turn state witness.
I get 3 years in jail
if the other prisoner turns state witness.
Assume
I do not turn state witness
Infer
I get 4 years in jail
if the other prisoner turns state witness.
I get 1 year in jail
if the other prisoner does not turn state witness.
In Classical Logic
Given the additional belief
the other prisoner turns state witness
or the other prisoner does not turn state witness.
Infer
If I turn state witness
then I get 0 years in jail or I get 3 years in jail.
If I do not turn state witness
then I get 4 years in jail or I get 1 year in jail.
In Decision Theory
Assume
Probability the other prisoner turns state witness = .5
Probability the other prisoner does not turn state witness = .5
Utility of getting N years in jail = N
Assume
I turn state witness
Infer
Probability I get 0 years in jail = .5
Probability I get 3 years in jail = .5
Expected utility: .5·0 + .5·3 = 1.5 years in jail.
Assume
I do not turn state witness
Infer
Probability I get 4 years in jail = .5
Probability I get 1 year in jail = .5
Expected utility: .5·4 + .5·1 = 2.5 years in jail.
Decide I turn state witness
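The same style of calculation for the prisoner’s dilemma, as a self-contained Python sketch; "turn" and "silent" are simply labels for turning, or not turning, state witness, and expected years in jail are minimised.

p_other_turns = 0.5
years = {("turn", "other turns"): 3, ("turn", "other silent"): 0,
         ("silent", "other turns"): 4, ("silent", "other silent"): 1}

expected = {
    "turn":   p_other_turns * years[("turn", "other turns")]
              + (1 - p_other_turns) * years[("turn", "other silent")],
    "silent": p_other_turns * years[("silent", "other turns")]
              + (1 - p_other_turns) * years[("silent", "other silent")],
}
print(expected)                          # {'turn': 1.5, 'silent': 2.5}
print(min(expected, key=expected.get))   # turn: fewer expected years in jail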
Conclusion:
Logic can be used to combine
pro-active, reactive and pre-active thinking, together with
Decision Theory (and other ways of making decisions).
[Diagram: the ALP agent cycle, as shown earlier.]