Solving Problems by Searching
11 Planning
Feng Zhiyong
Tianjin University
Fall 2008
Given:
– Initial state, goal state, and actions
Find:
– A plan: a sequence of actions that, when applied beginning with the initial state, transforms the world into a goal state
11.1 The Planning Problem
11.2 Planning with State-Space Search
11.3 Partial-Order Planning
11.4 Planning Graphs
11.5 Planning with Propositional Logic
11.6 Analysis of Planning Approaches
11.7 Summary
Difficulties an ordinary problem-solving agent may face:
◦ The problem-solving agent can be overwhelmed by irrelevant actions.
◦ It is difficult to find a good heuristic function.
◦ It is inefficient: it cannot take advantage of problem decomposition.
The agent is the sole cause of change in the environment.
The world is accessible (i.e. the agent knows all it needs to know about the environment).
Closed World Assumption:
◦ The state description lists all that is true
◦ Anything else is assumed false
The planning task is very difficult, even with such a simplified framework!
Dressing
◦ Initial state: socks, shoes
◦ Goal state: socks on, shoes on correct feet
◦ Actions: PutOnSock(f), PutOnShoe(f)
Blocks World
◦ Initial state: some configuration of blocks on a table
◦ Goal State: another configuration (stacked?)
◦ Actions: Pickup(x), Putdown(x), Stack(x,y), Unstack(x,y)
Shopping
◦ Initial state: at home, with no items
◦ Goal state: at home, with the items on the list
◦ Actions: Go(store), Buy(item), etc…
Facts: ground literals, containing no variables
◦ e.g. Poor, Unknown, At(Plane, Beijing)
Situations: conjunction of facts
◦ At(Plane1, Beijing) ⋀ At(Plane2, Tianjin)
◦ Poor ⋀ Unknown
Goal: conjunction of positive literals
◦ Variables allowed, assume all variables are existential
◦ Rich ⋀ Famous, At(Plane1, Xi’an)
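As a concrete illustration (a Python sketch, not part of the lecture; the literal strings and names below are assumptions for the example), a state can be held as a set of ground literals and a goal checked by set inclusion:

# Minimal sketch (assumed representation, not from the slides):
# a state is a set of positive ground literals; a goal (a conjunction of
# positive literals) is satisfied when all of its literals are in the state.

state = frozenset({"At(Plane1, Beijing)", "At(Plane2, Tianjin)"})
goal = frozenset({"At(Plane1, Beijing)"})

def satisfies(state, goal):
    """Goal test under the closed-world assumption: every goal literal
    must appear in the state; anything absent is taken to be false."""
    return goal <= state

print(satisfies(state, goal))  # True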
Actions:
◦ Action name
◦ Preconditions: conjunction of positive literals that
defines if action is legal/applicable
◦ Effects: conjunction of positive literals (called the add
list) and negative literals (called the delete list)
Action(Fly(p, from, to),
PRECOND: At(p, from) ⋀ Plane(p) ⋀ Airport(from) ⋀ Airport(to)
EFFECT: ¬At(p, from) ⋀ At(p, to))
◦ Negative effect literals form the delete list; positive effect literals form the add list
◦ Assumption: everything stays the same unless explicitly on the delete list (avoids the frame problem)
Result of an action:
◦ The positive literals in the effect are added to the
state.
◦ Any negative literals in the effect that match
existing positive literals in the state make the
positive literals disappear.
Exceptions:
◦ Positive literals already in the state are not added
again.
◦ Negative literals that match with nothing in the
state are ignored.
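The add/delete-list semantics above can be sketched in a few lines of Python; this is an illustrative assumption, not code from the lecture, and the Action class and the Fly instance below are made up for the example:

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    precond: frozenset   # positive literals that must hold before the action
    add: frozenset       # positive effect literals (add list)
    delete: frozenset    # negated effect literals (delete list)

def applicable(state, action):
    return action.precond <= state

def result(state, action):
    """STRIPS result: remove the delete list, then add the add list.
    Sets give both exceptions for free: literals already present are not
    added twice, and deletions of absent literals are simply ignored."""
    return (state - action.delete) | action.add

fly = Action("Fly(P1, Tianjin, Beijing)",
             precond=frozenset({"At(P1, Tianjin)", "Plane(P1)",
                                "Airport(Tianjin)", "Airport(Beijing)"}),
             add=frozenset({"At(P1, Beijing)"}),
             delete=frozenset({"At(P1, Tianjin)"}))

s0 = frozenset({"At(P1, Tianjin)", "Plane(P1)",
                "Airport(Tianjin)", "Airport(Beijing)"})
if applicable(s0, fly):
    print(sorted(result(s0, fly)))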
[Figure: blocks-world example, initial and goal configurations of blocks A, B, and C]
The planning problem can be seen as a
search problem.
We can move from one state of the problem
to another in both a forward and backward
direction because the actions are defined in
terms of both preconditions and effects.
Forward search: progression planning
Backward search: regression planning
Progression: Forward Chaining
◦ Like state-space search except for representation
◦ Inefficient due to large situation space to explore
Regression: Backward Chaining
◦ Start from the goal state and solve its subgoals (preconditions)
◦ More efficient and goal-directed than progression
(fewer applicable operators)
[Figure: forward (progression) search from the initial state vs. backward (regression) search from the goal]
The initial state of the search is the initial state from the planning
problem. In general, each state will be a set of positive ground
literals; literals not appearing are false.
The actions that are applicable to a state are all those whose
preconditions are satisfied. The successor state resulting from an
action is generated by adding the positive effect literals and deleting
the negative effect literals. (In the first-order case, we must apply
the unifier from the preconditions to the effect literals.) Note that a
single successor function works for all planning problems - a
consequence of using an explicit action representation.
The goal test checks whether the state satisfies the goal of the
planning problem.
The step cost of each action is typically 1. Although it would be easy
to allow different costs for different actions, this is seldom done by
STRIPS planners.
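Put together, this formulation yields a very small progression planner. The sketch below is only illustrative: it assumes Action objects with precond/add/delete sets as in the earlier snippet, uniform step cost 1, and plain breadth-first search:

from collections import deque

def forward_plan(init, goal, actions):
    """Breadth-first progression planning: repeatedly apply every
    applicable action and stop at the first state satisfying the goal."""
    frontier = deque([(init, [])])
    visited = {init}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                       # goal test
            return plan
        for a in actions:
            if a.precond <= state:              # preconditions satisfied?
                nxt = (state - a.delete) | a.add
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
    return None                                 # no plan exists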
Forward planning is equivalent to forward search and
is very inefficient. In fact, it suffers from all the
caveats of the underlying search algorithm.
A better way to solve a planning problem is through
backward state-space search, i.e. by starting at the
goal and working our way back to the initial state.
Advantage: we need only consider moves that achieve
part of the goal!
In STRIPS, there is no problem in finding the predecessors of a state.
Example: for the goal in our 10-airport air cargo problem, backward search considers only the actions relevant to achieving the goal, rather than all applicable actions.
Searching backwards is sometimes called regression planning.
Regressing a goal G through an action A:
◦ Relevant: at least one positive effect of A appears in G
◦ Consistent: A must not undo (delete) any literal of G; otherwise it is not consistent
◦ Any positive effects of A that appear in G are deleted.
◦ Each precondition literal of A is added, unless it already appears.
In the first-order case, the appropriate substitution is applied, as in FOL.
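As an illustrative sketch (assuming the same set-based action representation as before, and ignoring first-order substitution), one regression step can be written as:

def relevant(action, goal):
    """The action achieves at least one goal literal."""
    return bool(action.add & goal)

def consistent(action, goal):
    """The action does not undo (delete) any goal literal."""
    return not (action.delete & goal)

def regress(goal, action):
    """Predecessor goal: drop the goal literals the action achieves,
    then add the action's preconditions."""
    if not (relevant(action, goal) and consistent(action, goal)):
        return None
    return (goal - action.add) | action.precond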
State-space search (forward and backward) is not
efficient enough.
Can we perform A* style search with an admissible
heuristic?
Key Assumption
Sub-goals are independent of each other
◦ Divide and conquer the problem without worrying
about other parts of the problem
e.g. With putting on socks: the order doesn’t matter;
putting on left sock first doesn’t preclude putting
on the right
◦ Whole plan is sum of all sub-plans
This heuristic is:
◦ – Optimistic (admissible) when the goals do interact, i.e. an action in a subplan deletes a goal achieved by another subplan.
◦ – Pessimistic (inadmissible) when subplans contain redundant actions.
This heuristic assumes that all actions have
only positive effects.
For example, if an action has the effects A
and ¬B, the empty-delete list heuristic
considers the action as if it only had the
effect A.
In that way, we assume that no action can
delete the literals achieved by another action.
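One way to turn this idea into code is a greedy relaxed-reachability count: delete lists are ignored, and we count how many relaxed actions are applied before all goal literals appear. This is only a rough sketch (the greedy count is not guaranteed admissible) and again assumes actions with precond/add/delete sets:

def empty_delete_list_heuristic(state, goal, actions):
    """Greedy relaxed-plan length: apply actions with their delete lists
    ignored until every goal literal is reachable, counting applications."""
    reached = set(state)
    count = 0
    while not (goal <= reached):
        progress = False
        for a in actions:
            if a.precond <= reached and not (a.add <= reached):
                reached |= a.add            # negative effects are dropped
                count += 1
                progress = True
        if not progress:
            return float("inf")             # goal unreachable even when relaxed
    return count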
Up to now, plans have been totally ordered, i.e. the exact temporal relationships between the actions are known: A_i comes after A_(i-1) and before A_(i+1).
In partially ordered plans, we don’t have to
specify the temporal relationships between all
the actions.
In practice, this means that we can identify
actions that happen in any order.
Total-order planner (linear):
◦ – Maintains a partial solution as a “totally ordered”
list of steps found so far
◦ – e.g. STRIPS
◦ – e.g. Situation-space progression/regression
planners
Partial-order planner (non-linear):
◦ – Only maintains partial order
◦ – Constraints on the ordering of steps in the plan
Principle of Least Commitment: don’t make
an ordering choice unless required to do so
◦ – Property of partial-order planners (POP)
◦ – Not a property of situation-space planners: they
commit to an ordering when an operator is applied
Keep the ordering choice as general as
possible
Reduces the amount of backtracking needed
◦ – Don’t waste time undoing steps
Ordering constraints:
◦ – S1 < S2: S1 before S2
◦ – S1 must occur before S2, but not necessarily immediately before it
◦ – Drawn as thin links
Causal constraints:
◦ – S1 →c S2: S1 achieves c for S2
◦ – S1 has a literal c in its effect list that is needed to satisfy part of the precondition for S2
An action threatens a causal link when it might delete
the goal that the link satisfies.
Example: in the dynamic blocks world, pickup(a) deletes handempty (i.e. ¬handempty is among its effects), so it threatens the link (putdown(c,b), handempty, pickup(d)).
The consequences of adding an action that breaks a
causal link into the plan are serious. We have to make
sure to remove the threat by demotion (move earlier)
or promotion (move later).
An open (i.e. unsatisfied) precondition is one
that does not have a causal link to it. How is
an open precondition p for step S solved?
◦ – Step addition: add a new plan step R that contains p in its Effects list
◦ – Simple establishment: find an existing plan step R prior to S that has p in its Effects list
◦ – Then add a causal link and an ordering link from R to S
To keep the search focused, the planner only
adds steps that achieve an open precondition
POP is sound and complete
POP Plan is a solution if:
◦ All preconditions are supported (by causal links),
i.e., no open conditions.
◦ No threats
◦ Consistent temporal ordering
By construction, the POP algorithm reaches a
solution plan
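To make the bookkeeping concrete, the core POP data (causal links, explicit ordering constraints, threat detection, demotion/promotion) might be sketched as below. This is an illustrative fragment with made-up names, not an implementation of the full algorithm:

from dataclasses import dataclass

@dataclass(frozen=True)
class CausalLink:
    producer: str    # step that achieves the condition
    condition: str   # protected literal c
    consumer: str    # step whose precondition c supports

def threatens(step, step_deletes, link, ordering):
    """step threatens the link if it can delete the protected condition
    and is not already ordered outside the producer..consumer interval.
    (Only explicit ordering pairs are checked, not their transitive closure.)"""
    if link.condition not in step_deletes:
        return False
    demoted = (step, link.producer) in ordering    # already before the producer
    promoted = (link.consumer, step) in ordering   # already after the consumer
    return not (demoted or promoted)

def resolve_threat(step, link, ordering):
    """Return the two candidate repairs: demotion (order the step before the
    producer) and promotion (order it after the consumer). A real planner
    would still check each candidate for cycles."""
    return [ordering | {(step, link.producer)},
            ordering | {(link.consumer, step)}]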
A. Blum and M. Furst, “Fast Planning Through Planning Graph Analysis,” Artificial Intelligence, 90:281-300, 1997
Propositionalize actions and situations
Construct a planning graph
◦ Levels (e.g. time steps) with potential action nodes
◦ Include persistence actions (inactions) to deal with the frame problem
◦ Link actions to situation nodes between each level
◦ Indicate which situation descriptions are mutually
exclusive with “mutex links”
Planning graphs work only for propositional
planning problems
Inconsistent effects: one action negates an effect of the other. For example, Eat(Cake) and the persistence of Have(Cake) have inconsistent effects because they disagree on the effect Have(Cake).
Interference: one of the effects of one action is the negation of a precondition of the other. For example, Eat(Cake) interferes with the persistence of Have(Cake) by negating its precondition.
Competing needs: one of the preconditions of one action is mutually exclusive with a precondition of the other. For example, Bake(Cake) and Eat(Cake) are mutex because they compete on the value of the Have(Cake) precondition.
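The three action-mutex tests translate directly into set operations. The sketch below assumes actions with precond/add/delete sets and a set of literal pairs already known to be mutex at the previous level; all names are illustrative:

def inconsistent_effects(a, b):
    """One action deletes a literal the other adds."""
    return bool(a.delete & b.add) or bool(a.add & b.delete)

def interference(a, b):
    """One action deletes a precondition of the other."""
    return bool(a.delete & b.precond) or bool(b.delete & a.precond)

def competing_needs(a, b, literal_mutexes):
    """Some precondition of a is mutex with some precondition of b
    at the previous literal level."""
    return any((p, q) in literal_mutexes or (q, p) in literal_mutexes
               for p in a.precond for q in b.precond)

def actions_mutex(a, b, literal_mutexes):
    return (inconsistent_effects(a, b) or interference(a, b)
            or competing_needs(a, b, literal_mutexes))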
Literals increase monotonically: Once a literal appears at a given
level, it will appear at all subsequent levels. This is because of the
persistence actions; once a literal shows up, persistence actions
cause it to stay forever.
Actions increase monotonically: Once an action appears at a given
level, it will appear at all subsequent levels. This is a consequence of
literals' increasing; if the preconditions of an action appear at one
level, they will appear at subsequent levels, and thus so will the
action.
Mutexes decrease monotonically: If two actions are mutex at a given
level Ai, then they will also be mutex for all previous levels at which
they both appear. The same holds for mutexes between literals. It
might not always appear that way in the figures, because the figures
have a simplification: they display neither literals that cannot hold at
level Si nor actions that cannot be executed at level Ai. We can see
that "mutexes decrease monotonically" is true if you consider that
these invisible literals and actions are mutex with everything.
H. Kautz and B. Selman, “Planning as Satisfiability,” ECAI, 1992
◦ Initial state ⋀ all possible action descriptions ⋀ goal
Recall that a planning environment can be
expressed in situation calculus
◦ Axioms of the form a→b (equivalently, ¬a ⋁ b)
Recall that plans are considered to be a
conjunction of sub-goals:
◦ Start state ∧axioms ∧ goals
The basic idea with SAT-Plan:
◦ Describe the environment in situation calculus
◦ Propositionalize all the axioms (turning them into propositional clauses/disjunctions), enumerated for each of an arbitrary number of time steps
◦ Conjoin all instantiated rules with the initial state
and goal descriptions
This provides us with a PL formula in CNF, which we can try to solve using HC (hill climbing), SA (simulated annealing), tabu search, GAs (genetic algorithms), etc.
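As a toy illustration of the encoding (an entirely made-up problem, not from the lecture): one fluent at_store, one action go, and T time steps; instead of HC/SA/tabu/GAs, the tiny formula is checked by brute-force enumeration, which is the job a SAT solver would do:

from itertools import product

T = 2  # number of time steps

def satplan_bruteforce():
    """Enumerate truth assignments to go[t] (action) and at_store[t]
    (fluent) and return one satisfying the initial state, the
    successor-state axioms, the precondition axioms, and the goal."""
    for bits in product([False, True], repeat=T + (T + 1)):
        go, at_store = bits[:T], bits[T:]
        if at_store[0]:                                     # initial state: at home
            continue
        ok = all(at_store[t + 1] == (go[t] or at_store[t])  # successor-state axioms
                 for t in range(T))
        ok = ok and all(not (go[t] and at_store[t])         # precondition axioms
                        for t in range(T))
        if ok and at_store[T]:                              # goal: at the store
            return {"go": go, "at_store": at_store}
    return None

print(satplan_bruteforce())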
Initial state: asserted at time T0; some propositions are unknown
Successor-state axioms: relate the propositions at one time step (e.g. T0) to those at the next (T1)
KB: initial state ⋀ successor-state axioms ⋀ Goal
Precondition axioms: an action at time t implies that its preconditions hold at time t
Action exclusion axioms:
◦ ¬(Fly(P2, JFK, SFO)0 ⋀ Fly(P2, JFK, LAX)0)
The number of clauses can be very large. For example, with 10 time steps, 12
planes, and 30 airports, the complete action exclusion axiom has 583 million
clauses (for each of the T time steps, one clause per pair of the |Planes| × |Airports|² ground actions).
This can be reduced by using a set of binary predicates (symbol splitting): a ground action symbol such as Fly(P1, SFO, JFK)0 is split into separate symbols for its arguments, so the number of symbols grows roughly as T × Act × P × O instead.
Parallel actions: more than one action can occur at the same time step if the exclusion axioms permit it, e.g. Fly(P1, SFO, JFK)0 and Fly(P2, JFK, SFO)0 at time 0.
State-space search (STRIPS) can be directed
using logic, but is still incomplete
Partially-ordered planners are complete, but
are practically limited in the number of steps
they can accurately plan
Planning was sort of a “dead” AI research area
for a while
Since 1992, there have been several new
approaches to the planning task discovered
(e.g. Graph-Plan and SAT-Plan) that can find
plans up to thousands of steps long
D. Weld, “Recent Advances in AI Planning,” AI Magazine, 20(2):93-123, 1999
◦ Excellent coverage of these new approaches
Planning agents search to find a sequence of
actions to achieve a goal using a flexible
representation of states, operators, goals,
plans
◦ – STRIPS language describes actions in terms of
their preconditions and effects
Not feasible to search through the entire
space as was done with search agents
◦ Regression planning focuses the search
◦ STRIPS assumes sub-goals are independent
◦ POP uses the principle of least commitment and declobbering (threat resolution)
Partial-Order Planning (POP) is a sound and
complete planning algorithm, but can be
limited by plan length
Recent advances in AI planning reduce the
planning environment to other problems
(Graphs, SAT formulas) that can be solved
using other methods