
CS 416
Artificial Intelligence
Lecture 2
Agents
Chess Article
Deep Blue (IBM)
• 418 processors, 200 million positions per second
Deep Junior (Israeli Co.)
• 8 processors, 3 million positions per second
Kasparov
• 100 billion neurons in brain, 2 moves per second
But there are 85 billion ways to play the first four moves
Chess Article
1997 - Kasparov Lost to Deep Blue
2002 - Kramnik (current World Champion) tied Deep Fritz
2003 - Kasparov (current number 1) plays Deep Junior
Jan 26 – Feb 7
Chess Article
Cognitive psychologists report chess is a game of
pattern matching for humans
• But what patterns do we see?
• What rules do we use to evaluate perceived patterns?
What is an agent?
Perception
• Sensors receive input from environment
– Keyboard clicks
– Camera data
– Bump sensor
Action
• Actuators impact the environment
– Move a robotic arm
– Generate output for computer display
Perception
Percept
• Perceptual inputs at an instant
• May include perception of internal state
Percept Sequence
• Complete history of all prior percepts
Do you need a percept sequence to play Chess?
An agent as a function
Agent maps percept sequence to action
• Agent:
f(ps) → a, where ps ∈ P* (the set of all percept sequences)
– Set of all inputs known as state space
Agent Function
• If inputs are finite, a table can store mapping
• Scalable?
• Reverse Engineering?
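
As a concrete illustration, here is a minimal table-driven agent function in Python; the percepts, actions, and table entries are invented for this sketch and are not from the lecture.

# Table-driven agent: a sketch of the mapping f(ps) -> a, ps in P*.
# All percepts and actions below are invented for illustration.

table = {
    # key: the full percept sequence so far (a tuple); value: an action
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

percept_sequence = []

def table_driven_agent(percept):
    """Append the latest percept and look up the whole history."""
    percept_sequence.append(percept)
    return table.get(tuple(percept_sequence), "NoOp")

print(table_driven_agent(("A", "Clean")))  # -> "Right"
print(table_driven_agent(("B", "Dirty")))  # -> "Suck"

Note that the table is keyed on the entire percept sequence, so it grows with every possible history; that is exactly the scalability problem the slide raises.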
Evaluating agent programs
We agree on what an agent must do
Can we evaluate its quality?
Performance Metrics
• Very Important
• Frequently the hardest part of the research problem
• Design these to suit what you really want to happen
Rational Agent
For each percept sequence, a rational agent
should select an action that maximizes its
performance measure
Example: autonomous vacuum cleaner
• What is the performance measure?
• Penalty for eating the cat? How much?
• Penalty for missing a spot?
• Reward for speed?
• Reward for conserving power?
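
One way to make these questions concrete: a sketch of a vacuum-world performance measure in Python. All the weights are assumptions chosen for illustration; picking them is precisely the design problem the slide poses.

def vacuum_performance(clean_squares, steps_taken, power_used, ate_cat=False):
    """One possible performance measure for the vacuum agent.
    Every weight here is an illustrative assumption; changing them
    changes what 'rational' behavior looks like."""
    score = 10 * clean_squares   # reward cleanliness
    score -= 1 * steps_taken     # reward speed (penalize dawdling)
    score -= 0.5 * power_used    # reward conserving power
    if ate_cat:
        score -= 1000            # very large penalty for eating the cat
    return score

print(vacuum_performance(clean_squares=8, steps_taken=20, power_used=12))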
Learning and Autonomy
Learning
• To update the agent function in light of observed performance
of percept-sequence to action pairs
– Explore new parts of state space
→ Learn from trial and error
– Change internal variables that influence action selection
Adding intelligence to agent function
At design time
• Some agents are designed with a clear procedure to improve
performance over time. Really, this is the engineer's intelligence.
– Camera-based user identification
At run-time
• Agent executes complicated equation to map input to output
Between trials
• With experience, agent changes its program (parameters)
How big is your percept?
Dung Beetle
• Largely feed-forward
Sphex Wasp
• Reacts to environment (feedback) but not learning
A Dog
• Reacts to environment and can significantly alter behavior
Qualities of a task environment
Fully Observable
• Agent need not store any aspects of state
– The Brady Bunch as intelligent agents
– Volume of observables may be overwhelming
Partially Observable
• Some data is unavailable
– Maze
– Noisy sensors
Qualities of a task environment
Deterministic
• Always the same outcome for state/action pair
Stochastic
• Not always predictable – random
Partially Observable vs. Stochastic
• My cats think the world is stochastic
• Physicists think the world is deterministic
Qualities of a task environment
Markovian
• Future state only depends on current state
Episodic
• Percept sequence can be segmented into independent episodes
– Behavior at traffic light independent of previous traffic
Sequential
• Current decision could affect all future decisions
Which is easiest to program?
Qualities of a task environment
Static
• Environment doesn’t change over time
– Crossword puzzle
Dynamic
• Environment changes over time
– Driving a car
Semi-dynamic
• Environment is static, but performance metrics are dynamic
– Drag racing
Qualities of a task environment
Discrete
• Values of a state space feature (dimension) are constrained
to distinct values from a finite set
– Blackjack: f(your cards, exposed cards) = action
Continuous
• Variable has infinite variation
– Antilock brakes: f (vehicle speed, wheel velocity) = unlock
– Are computers really continuous?
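
To see how finite inputs make the agent function enumerable, here is a toy blackjack policy in Python; the hit-below-17 threshold is a simplification for illustration, not real basic strategy.

# Discrete state space: hand totals are integers 4..21, the dealer's
# exposed card is an integer 2..11. The policy is a toy threshold rule.

def blackjack_policy(my_total: int, dealer_card: int) -> str:
    """f(your cards, exposed cards) = action, over a finite table."""
    if my_total < 17:
        return "hit"
    return "stand"

# Because both inputs range over finite sets, the whole agent
# function can be enumerated as an explicit table:
table = {(t, d): blackjack_policy(t, d)
         for t in range(4, 22) for d in range(2, 12)}
print(len(table), table[(16, 10)])  # 180 entries, 'hit'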
Qualities of a task environment
Towards a terse description of problem domains
• State space: features, dimensionality, degrees of freedom
• Observable?
• Predictable?
• Dynamic?
• Continuous?
• Performance metric
Building Agent Programs
The table approach
• Build a table mapping states to actions
– Chess has 10^150 entries (10^80 atoms in the universe)
– I’ve said memory is free, but keep it within the confines of
the boundable universe
• Still, tables have their place
Discuss four agent program principles
Simple Reflex Agents
• Sense environment
• Match sensations with rules in database
• Rule prescribes an action
Reflexes can be bad
• Don’t put your hands down when falling backwards!
Inaccurate information
• Misperception can trigger reflex when inappropriate
But rules databases can be made large and complex
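
A minimal simple-reflex agent sketch in Python, assuming an invented rule database: sense, match, act, with no memory of prior percepts.

# Simple reflex agent: match the current percept (only) against rules.
# The rules are made-up examples; real databases can be far larger.

rules = {
    "Dirty": "Suck",
    "Bump": "TurnAround",
    "Clean": "MoveForward",
}

def simple_reflex_agent(percept: str) -> str:
    # No memory: the action depends only on the current sensation.
    return rules.get(percept, "NoOp")

print(simple_reflex_agent("Dirty"))  # -> "Suck"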
Simple Reflex Agents
Randomization
• The vacuum cleaner problem
[Diagram: two-square vacuum world; labels: Dirty, Left, Right]
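
A randomized reflex vacuum agent sketch, assuming the classic two-square world; the coin flip lets a memoryless agent escape movement loops.

import random

def randomized_vacuum_agent(location: str, status: str) -> str:
    """Reflex vacuum agent with a coin flip to break movement loops.
    Assumes the classic two-square world: locations 'A'/'B',
    status 'Dirty'/'Clean', actions 'Suck'/'Left'/'Right'."""
    if status == "Dirty":
        return "Suck"
    # Without memory, deterministic movement can oscillate forever;
    # a random choice eventually covers both squares.
    return random.choice(["Left", "Right"])

print(randomized_vacuum_agent("A", "Dirty"))  # -> "Suck"
print(randomized_vacuum_agent("A", "Clean"))  # -> "Left" or "Right"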
Model-based Reflex Agents
So when you can’t see something, you model it!
• Create an internal variable to store your expectation of
variables you can’t observe
• If I throw a ball to you and it falls short, do I know why?
– Aerodynamics, mass, my energy levels…
– I do have a model
→ Ball falls short, throw harder
Model-based Reflex Agents
Admit it, you can’t see and understand everything
Models are very important!
• We all use models to get through our lives
– Psychologists have many names for these context-sensitive models
• Agents need models too
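
A sketch of the ball-throwing example as a model-based reflex agent; the internal "effort" variable and its update rule are assumptions standing in for the unobservable causes (aerodynamics, mass, energy levels).

class ThrowerAgent:
    """Model-based reflex agent for the ball-throwing example.
    It cannot observe wind, mass, or its own fatigue directly, so it
    keeps one internal variable -- the estimated effort needed -- and
    updates it from feedback. The update rule is an assumption."""

    def __init__(self):
        self.effort = 1.0  # internal model state for unobservable causes

    def act(self) -> float:
        return self.effort

    def observe(self, outcome: str) -> None:
        # Ball falls short -> throw harder; overshoots -> ease off.
        if outcome == "short":
            self.effort *= 1.2
        elif outcome == "long":
            self.effort *= 0.9

agent = ThrowerAgent()
agent.observe("short")
print(agent.act())  # -> 1.2: the model compensates without knowing why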
Goal-based Agents
Lacking moment-to-moment performance measure
Overall goal is known
How to get from A to B?
• Current actions have future consequences
• Search and Planning are used to explore paths through state
space from A to B
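
As a sketch of the planning idea, here is breadth-first search over an invented state graph in Python; the agent explores paths from A until it reaches B.

from collections import deque

# Toy state space as an adjacency list (invented for illustration).
graph = {
    "A": ["C", "D"],
    "C": ["B"],
    "D": ["C"],
    "B": [],
}

def breadth_first_path(start: str, goal: str):
    """Goal-based agents plan: explore paths until the goal is reached."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

print(breadth_first_path("A", "B"))  # -> ['A', 'C', 'B']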
Utility-based Agents
Goal-directed agents that have a utility function
• Function that maps internal and external states into a scalar
– A scalar is a number
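
A minimal sketch of action selection with a utility function mapping states to scalars; the states, transition model, and weights are all invented for illustration.

# Utility-based agent: pick the action whose predicted next state
# has the highest utility. Every number here is an assumption.

def utility(state: dict) -> float:
    """Map internal and external state into a single scalar."""
    return 2.0 * state["progress"] - 1.0 * state["risk"]

def predict(state: dict, action: str) -> dict:
    # A stand-in transition model for illustration only.
    if action == "fast_route":
        return {"progress": state["progress"] + 5, "risk": state["risk"] + 2}
    return {"progress": state["progress"] + 3, "risk": state["risk"]}

def choose_action(state: dict, actions) -> str:
    return max(actions, key=lambda a: utility(predict(state, a)))

print(choose_action({"progress": 0, "risk": 0}, ["fast_route", "safe_route"]))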
Learning Agents
Learning Element
• Making improvements
Performance Element
• Selecting actions
Critic
• Provides learning element with feedback about progress
Problem Generator
• Provides suggestions for new tasks to explore state space
A taxi driver
Performance Element
• Knowledge of how to drive in traffic
Critic
• Observes tips from customers and horn honking from other cars
Learning Element
• Relates low tips to actions that may be the cause
Problem Generator
• Proposes new routes to try and improved driving skills
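
A skeleton tying the four components together for the taxi example; every method body is a placeholder assumption, not the lecture's design.

import random

class LearningTaxiAgent:
    """Skeleton of the four learning-agent components from the slides.
    All internals are placeholder assumptions for illustration."""

    def __init__(self):
        self.route_scores = {"route_1": 0.0, "route_2": 0.0}  # knowledge

    def performance_element(self) -> str:
        # Selecting actions: drive the best-scoring known route.
        return max(self.route_scores, key=self.route_scores.get)

    def critic(self, tips: float, honks: int) -> float:
        # Feedback about progress: tips are good, honking is bad.
        return tips - 0.5 * honks

    def learning_element(self, route: str, feedback: float) -> None:
        # Making improvements: relate feedback to the action taken.
        self.route_scores[route] += 0.1 * feedback

    def problem_generator(self) -> str:
        # Suggest new tasks to explore the state space.
        return random.choice(list(self.route_scores))

agent = LearningTaxiAgent()
route = agent.problem_generator()  # try something new
agent.learning_element(route, agent.critic(tips=5.0, honks=2))
print(route, agent.route_scores)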
Review
Outlined families of AI problems and solutions
Next class we study search problems