Adaptive Intelligent Mobile Robotics
Leslie Pack Kaelbling, PI
MIT Artificial Intelligence Laboratory
Hierarchical Domain Decomposition for Probabilistic Planning
• High-level goal determines reward at exit states
• Construct decomposition off line
• Solve for macro operators
• Combine pre-computed value functions to determine near-optimal action
• Plan for new goals in time logarithmic in plan length
• Trade optimality for efficiency
[Figure labels: Planning, Skill Learning, Built-in Behaviors]
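The step of combining pre-computed value functions can be illustrated with a small sketch. It is not the project's algorithm: it assumes a single region, deterministic arrival at the chosen exit, and undiscounted additive step costs. The table `macro_cost`, the state and exit names, and the function `best_macro` are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical table for one region, assumed to be solved off line:
# macro_cost[exit][state] is the expected number of steps the macro
# operator for `exit` needs to reach that exit from `state`.
macro_cost: Dict[str, Dict[str, float]] = {
    "east_door": {"s0": 4.0, "s1": 2.0},
    "west_door": {"s0": 1.0, "s1": 3.0},
}


def best_macro(
    state: str, exit_value: Dict[str, float], step_cost: float = 1.0
) -> Tuple[str, float]:
    """Pick the macro whose exit value minus travel cost is largest.

    `exit_value` is the reward the current high-level goal assigns to
    each exit state; only this small table changes when the goal changes.
    """
    scored = {
        exit_name: exit_value[exit_name] - step_cost * costs[state]
        for exit_name, costs in macro_cost.items()
    }
    choice = max(scored, key=scored.get)
    return choice, scored[choice]


# A new goal only supplies new exit values; the macro table is reused.
print(best_macro("s0", {"east_door": 10.0, "west_door": 5.0}))
```

Because the macro table is reused for every goal, the per-goal work here is one comparison per exit, which is the sense in which optimality is traded for efficiency.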
Vision-Based Navigation
Optical flow gives estimated distance to objects
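One common way to turn optical flow into a distance estimate is sketched below. The slide does not say which formulation was used, so this is an assumption: a pinhole camera translating sideways at a known speed past a static scene, in which case a point at depth Z produces image flow of roughly f·v/Z. The function name and parameters are hypothetical.

```python
def depth_from_lateral_flow(
    flow_px_per_s: float, speed_m_per_s: float, focal_px: float
) -> float:
    """Estimate depth in metres from horizontal optical flow.

    Assumes pure sideways camera translation and a static scene, so
    flow = focal * speed / depth, which is inverted here.
    """
    if flow_px_per_s <= 0:
        raise ValueError("flow must be positive for a finite depth estimate")
    return focal_px * speed_m_per_s / flow_px_per_s


# Faster image motion means a nearer object.
near = depth_from_lateral_flow(flow_px_per_s=100.0, speed_m_per_s=0.2, focal_px=500.0)
far = depth_from_lateral_flow(flow_px_per_s=25.0, speed_m_per_s=0.2, focal_px=500.0)
print(near, far)
```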
Practical Reinforcement Learning
• Human guidance generates efficient exploration
• Locally weighted regression provides fast function approximation
• Uncertainty modeling and experience replay cause fast value propagation
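Locally weighted regression, named above as the function approximator, can be sketched for a one-dimensional input as follows. This is a generic textbook form, not the project's implementation: the Gaussian kernel, the `bandwidth` parameter, and the function name `lwr_predict` are assumptions made for illustration.

```python
import math
from typing import Sequence


def lwr_predict(
    query: float, xs: Sequence[float], ys: Sequence[float], bandwidth: float = 1.0
) -> float:
    """Predict y at `query` by fitting a line weighted toward nearby samples."""
    # Gaussian kernel: samples close to the query get weights near 1.
    w = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]
    total = sum(w)
    if total == 0.0:
        raise ValueError("no stored sample is close enough to the query")
    mean_x = sum(wi * x for wi, x in zip(w, xs)) / total
    mean_y = sum(wi * y for wi, y in zip(w, ys)) / total
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, xs, ys))
    if sxx < 1e-12:
        # Effectively a single input value: fall back to the weighted mean.
        return mean_y
    slope = sxy / sxx
    return mean_y + slope * (query - mean_x)


# Stored experience: inputs and observed values (made-up numbers).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
print(lwr_predict(2.5, xs, ys))
```

Nothing is trained in advance: each query fits a fresh local line to the stored samples, so new experience can be used as soon as it is recorded.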
[Figure: two learning-phase diagrams, "Phase One" and "Phase Two". Each shows the Environment, the Supplied Control Policy, and the Learning System, connected by signals labeled R, O, and A.]
Figure caption: Comparison of potential-field method to empirically discovered human control laws for local navigation
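For readers unfamiliar with the potential-field method named in the comparison caption, the sketch below shows a standard textbook form: an attractive pull toward the goal plus a repulsive push from nearby obstacles. It is not taken from the project. The gains `k_att` and `k_rep`, the `influence` radius, and the function name are assumptions.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def potential_field_force(
    robot: Point,
    goal: Point,
    obstacles: Iterable[Point],
    k_att: float = 1.0,
    k_rep: float = 1.0,
    influence: float = 2.0,
) -> Tuple[float, float]:
    """Return the 2-D steering force acting on the robot."""
    # Attractive term: proportional to the offset from robot to goal.
    fx = k_att * (goal[0] - robot[0])
    fy = k_att * (goal[1] - robot[1])
    for ox, oy in obstacles:
        dx, dy = robot[0] - ox, robot[1] - oy
        d = math.hypot(dx, dy)
        # Obstacles repel only inside their radius of influence.
        if 0.0 < d < influence:
            mag = k_rep * (1.0 / d - 1.0 / influence) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy


# Goal straight ahead, one obstacle just below the robot.
print(potential_field_force((0.0, 0.0), (1.0, 0.0), [(0.0, -1.0)]))
```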
Corridor-following Task
[Figure: average steps to goal. Axes: steps to goal (y) versus training runs (x). Legend: Phase 1, Phase 2, Average training, “Best” possible.]
Current work: acquiring topological maps based on these primitives