PowerPoint Transcript
CS 416
Artificial Intelligence
Lecture 2
Agents
Review
We’ll study systems that act rationally
• They need not necessarily “think” or act like humans
• They need not “think” in rational ways
The domain of AI research changes over time
AI research draws from many fields
• Philosophy, psychology, neuroscience, mathematics,
economics, mechanical engineering, linguistics
AI has had ups and downs since 1950
What is an agent?
Perception
• Sensors receive input from environment
– Keyboard clicks
– Camera data
– Bump sensor
Action
• Actuators impact the environment
– Move a robotic arm
– Generate output for computer display
Perception
Percept
• Perceptual inputs at an instant
• May include perception of internal state
Percept Sequence
• Complete history of all prior percepts
Do you need a percept sequence to play Chess?
An agent as a function
Agent maps percept sequence to action
• Agent: f(percept sequence) → action
– Set of all possible inputs is known as the state space
• Repeating loop: sense, map percepts to an action, act
We must construct f( ), our agent
• It must act rationally
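A minimal sketch of this view in Python; the percepts, actions, and the body of f( ) below are hypothetical placeholders, not the lecture's design:

```python
# Sketch of the agent-as-a-function view: f maps the full percept
# sequence to an action, inside a repeating sense/act loop.
# get_percept and do_action stand in for real sensors and actuators.

def f(percept_sequence):
    """Agent function: percept sequence -> action."""
    latest = percept_sequence[-1]
    return "forward" if latest == "clear" else "turn"

def run_agent(get_percept, do_action, steps=10):
    percept_sequence = []
    for _ in range(steps):                      # the repeating loop
        percept_sequence.append(get_percept())  # sense
        do_action(f(percept_sequence))          # act
```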
The agent’s environment
What is known about percepts?
• Quantity, range, certainty…
– If percepts are finite, could a table store mapping?
What is known about environment?
• Is f(a, e), mapping an action and environment to the next environment state, a known and predictable function?
More on this later
Evaluating agent programs
We agree on what an agent must do
Can we evaluate its quality?
Performance Metrics
• Very Important
• Frequently the hardest part of the research problem
• Design these to suit what you really want to happen
Performance vis-à-vis rationality
For each percept sequence, a rational agent
should select an action that maximizes its
performance measure
Example: autonomous vacuum cleaner
• What is the performance measure?
• Penalty for eating the cat? How much?
• Penalty for missing a spot?
• Reward for speed?
• Reward for conserving power?
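One way to make those questions concrete is to fold them into a single scoring function; every weight below is a made-up illustration, and choosing such weights is exactly the hard part:

```python
# Hypothetical performance measure for the vacuum example.
# All weights here are illustrative assumptions.

def vacuum_score(spots_cleaned, spots_missed, seconds_elapsed,
                 power_used, ate_the_cat):
    score = 10.0 * spots_cleaned      # reward cleaning
    score -= 5.0 * spots_missed       # penalty for missing a spot
    score -= 0.1 * seconds_elapsed    # reward for speed
    score -= 0.01 * power_used        # reward for conserving power
    if ate_the_cat:
        score -= 1_000_000.0          # penalty for eating the cat
    return score
```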
Learning and Autonomy
Learning
• To update the agent function, f( ), in light of the observed
performance of percept-sequence-to-action pairs
– Does the agent control observations?
What parts of state space to explore?
Learn from trial and error
– How do observations affect agent function?
Change internal variables that influence action
selection
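A minimal sketch of the trial-and-error idea above, using an epsilon-greedy rule (a standard scheme, not something specified in the lecture); the value estimates play the role of the internal variables that influence action selection:

```python
import random

# Epsilon-greedy trial and error: usually exploit the best-looking
# action, occasionally explore, and update internal value estimates
# from observed feedback.

def select_action(values, epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(list(values))  # explore the state space
    return max(values, key=values.get)      # exploit best estimate

def update_value(values, action, reward, alpha=0.5):
    # Nudge the internal variable toward the observed reward.
    values[action] += alpha * (reward - values[action])
```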
Adding intelligence to the agent function
At design time
• Some agents are designed with clear procedure to improve
performance over time. Really the engineer’s intelligence.
– Camera-based user identification
At run-time
• Agent executes complicated equation to map input to output
Between trials
• With experience, agent changes its program (parameters)
How big is your percept?
Dung Beetle
• Almost no perception (percept)
– Rational agents fine-tune actions based on feedback
Sphex Wasp
• Has percepts, but lacks percept sequence
– Rational agents change plans entirely when fine tuning fails
A Dog
• Equipped with percepts and percept sequences
– Reacts to environment and can significantly alter behavior
Qualities of a task environment
Fully Observable
• Agent need not store any aspects of state
– The Brady Bunch as intelligent agents (lost in Hawaii)
– Volume of observables may be overwhelming
Partially Observable
• Some data is unavailable
– Maze
– Noisy sensors
Qualities of a task environment
Deterministic
• Always the same outcome for environment/action pair
Stochastic
• Not always predictable – random
Partially Observable vs. Stochastic
• My cats think the world is stochastic (lack of perception)
• Physicists think the world is deterministic
Qualities of a task environment
Markovian
• Future environment depends only on current environment and action
Episodic
• Percept sequence can be segmented into independent episodes
– Behavior at traffic light independent of previous traffic
Sequential
• Current decision could affect all future decisions
Which is easiest to program?
Qualities of a task environment
Static
• Environment doesn’t change over time
– Crossword puzzle
Dynamic
• Environment changes over time
– Driving a car
Semi-dynamic
• Environment is static, but performance metrics are dynamic
– Drag racing (reward for reaching finish line after 12 seconds is
different from reward for reaching it after 14 seconds)
Qualities of a task environment
Discrete
• Values of a state space feature (dimension) are constrained
to distinct values from a finite set
– Blackjack: card values come from a finite set
Continuous
• Variable has infinite variation
– Antilock brakes: wheel speed varies continuously
– Are computers really continuous?
Qualities of a task environment
Towards a terse description of problem domains
• Environment: features, dimensionality, degrees of freedom
• Observable?
• Predictable?
• Dynamic?
• Continuous?
• Performance metric
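As a sketch, this checklist can be captured as a small record type; the field names and the crossword numbers are my own illustrative choices:

```python
from dataclasses import dataclass

# A terse, structured description of a problem domain,
# mirroring the checklist above.

@dataclass
class TaskEnvironment:
    features: int            # dimensionality / degrees of freedom
    observable: bool
    predictable: bool        # deterministic?
    dynamic: bool
    continuous: bool
    performance_metric: str

crossword = TaskEnvironment(
    features=225,            # e.g. a 15x15 grid
    observable=True, predictable=True,
    dynamic=False, continuous=False,
    performance_metric="correct words filled",
)
```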
Building Agent Programs
The table approach
• Build a table mapping states to actions
– Chess has 10^150 entries (only 10^80 atoms in the universe)
– I’ve said memory is free, but keep it within the confines of the observable universe
• Still, tables have their place
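For a small enough world, the table approach is a few lines; the toy two-square vacuum table below is an assumed example:

```python
# Table-driven agent: the whole policy is a lookup from percept
# sequence to action. Fine for a two-square world, hopeless for chess.

TABLE = {
    (("A", "dirty"),): "suck",
    (("A", "clean"),): "right",
    (("B", "dirty"),): "suck",
    (("B", "clean"),): "left",
}

def table_driven_agent(percept_sequence):
    # The key is the entire history, so the table grows
    # exponentially with the length of the percept sequence.
    return TABLE.get(tuple(percept_sequence), "no-op")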
Discuss four agent program principles
Simple Reflex Agents
• Sense environment
• Match sensations with rules in database
• Rule prescribes an action
Reflexes can be bad
• Don’t put your hands down when falling backwards!
Inaccurate information
• Misperception can trigger reflex when inappropriate
But rules databases can be made large and complex
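A simple reflex agent in sketch form: match the current percept alone against a rule database (the rules here are hypothetical):

```python
# Simple reflex agent: rules map the CURRENT percept to an action.
# No memory and no model, which is why misperception can trigger
# a reflex when it is inappropriate.

RULES = {
    "obstacle_ahead": "turn_left",
    "dirt_detected": "suck",
    "falling_backwards": "put_hands_down",  # reflexes can be bad!
}

def simple_reflex_agent(percept):
    return RULES.get(percept, "no-op")
```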
Simple Reflex Agents w/ Incomplete Sensing
How can you react to things you cannot see?
• Vacuum cleaning the room w/o any sensors
• Vacuum cleaning room w/ bump sensor
• Vacuum cleaning room w/ GPS and perfect map of static
environment
Model-based Reflex Agents
So when you can’t see something, you model it!
• Create an internal variable to store your expectation of
variables you can’t observe
• If I throw a ball to you and it falls short, do I know why?
– I don’t really know why…
Aerodynamics, mass, my energy levels…
– I do have a model
Ball falls short, throw harder
Model-based Reflex Agents
Admit it, you can’t see and understand everything
Models are very important!
• We all use models to get through our lives
– Psychologists have many names for these context-sensitive models
• Agents need models too
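The ball-throwing model above fits in a few lines: one internal variable stands in for everything we cannot observe, and feedback updates it without any real physics (the numbers are illustrative):

```python
# Model-based reflex sketch of "ball falls short, throw harder".
# `strength` is the internal variable modeling aerodynamics, mass,
# energy levels... none of which we actually observe.

class ThrowerAgent:
    def __init__(self):
        self.strength = 1.0

    def act(self):
        return self.strength          # throw with current strength

    def observe(self, outcome):
        if outcome == "short":
            self.strength *= 1.1      # throw harder
        elif outcome == "long":
            self.strength *= 0.9      # throw softer
```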
Goal-based Agents
Overall goal is known, but lacking moment-to-moment
performance measure
• Don’t exactly know what the performance-maximizing action is at each step
Example:
• How to get from A to B?
– Current actions have future consequences
– Search and Planning are used to explore paths through state space
from A to B
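Getting from A to B is the classic search problem; a breadth-first search over a small made-up state graph shows what exploring paths through state space looks like:

```python
from collections import deque

# Breadth-first search from A to B: the goal is known, so the agent
# searches for a whole path rather than reacting step by step.
# The graph is a made-up example.

GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}

def bfs_path(start, goal):
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

print(bfs_path("A", "E"))  # ['A', 'B', 'D', 'E']
```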
Utility-based Agents
Goal-directed agents that have a utility function
• Function that maps internal and external states into a scalar
– A scalar is a number used to make moment-to-moment
evaluations of candidate actions
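With a utility function in hand, action selection is an argmax over candidate actions; `result` (a model predicting the next state) and `utility` are assumed to be given:

```python
# Utility-based action selection: score the state each candidate
# action would lead to, then pick the best-scoring action.

def choose_action(state, actions, result, utility):
    # result(state, action) -> predicted next state
    # utility(state)        -> scalar evaluation of a state
    return max(actions, key=lambda a: utility(result(state, a)))
```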
Learning Agents
Desirable to build a system that “figures it out”
• Generalizable
• Compensates for absence of designer knowledge
• Reusable
• Learning by example isn’t easy to accomplish
– What exercises do you do to learn?
– What outcomes do you observe?
– What inputs do you alter?
Learning Agents
Performance Element
• Selecting actions (this is the “agent” we’ve been discussing)
Problem Generator
• Provides suggestions for new tasks to explore state space
Critic
• Provides learning element with feedback about progress (are we doing good things or should we try something else?)
Learning Element
• Making improvements (how is agent changed based on experience)
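A skeleton wiring the four components together might look like this; all four interfaces are assumptions for illustration:

```python
# Learning-agent skeleton: performance element picks actions, the
# critic grades outcomes, the learning element improves the
# performance element, and the problem generator suggests experiments.

def learning_agent_loop(performance, critic, learner, generator,
                        get_percept, do_action, steps=100):
    for _ in range(steps):
        percept = get_percept()
        action = performance.select_action(percept)  # performance element
        do_action(action)
        feedback = critic.evaluate(percept, action)  # critic
        learner.improve(performance, feedback)       # learning element
        task = generator.suggest(feedback)           # problem generator
        if task:
            performance.explore(task)
```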
A taxi driver
Performance Element
• Knowledge of how to drive in traffic
Problem Generator
• Proposes new routes to try in hopes of improving driving skills
Critic
• Observes tips from customers and horn honking from other cars
Learning Element
• Relates low tips to actions that may be the cause
Review
Outlined families of AI problems and solutions
I consider AI to be a problem of searching
• Countless things differentiate search problems
– Number of percepts, number of actions, amount of a priori
knowledge, predictability of world…
• Textbook is divided into sections based on these differences
Sections of book
• Problem solving: Searching through predictable, discrete environments
• Knowledge and Reasoning: Searching when a model of the world is
known
– a leads to b and b leads to c… so go to a to reach c
• Planning: Refining search techniques to take advantage of domain
knowledge
• Uncertainty: Using statistics and observations to collect knowledge
• Learning: Using observations to understand the way the world works
and to act rationally within it