CALO Project Research Progress

October 2008
Outline
 Introduction
 CALO System Architecture
 Main Research Topics
• OAA
• SPARK
• IRIS
• PTIME
• SR/AR
 Outlook
Introduction (1)
 Project background
• DARPA, 2003: PAL (Perceptive Assistant that Learns, 2003-2008)
• SRI: CALO (Cognitive Assistant that Learns and Organizes)
 The name comes from the Latin word "calonis", which means "soldier’s servant".
 Project goal
• The goal of the project is to create cognitive software
systems, that is, systems that can reason, learn from
experience, be told what to do, explain what they are
doing, reflect on their experience, and respond robustly
to surprise.
Introduction (2)
 Research areas
• Artificial Intelligence, Machine Learning, Natural Language
Processing, Knowledge Representation, Human-computer
Interaction, Flexible Planning, and Behavioral Studies
 Organization
• SRI International (formerly the Stanford Research Institute), USA
• HTTP://www.ai.sri.com/project/CALO, HTTP://caloproject.sri.com/
• 22 research institutions, 250 researchers
Introduction (3)
Introduction (4)
CALO System Architecture (1)
CALO System Architecture (2)
CALO System Architecture (3)
 ORGANIZE AND MANAGE INFORMATION
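• By collecting many kinds of user information (email, calendars, files,
projects, contacts, and so on), CALO learns a model of the latent
relationships in the user’s environment, laying the groundwork for
higher-level learning.
 PREPARE INFORMATION PRODUCTS
• CALO automatically packages project-related material such as email,
documents, and web pages for the user to use in meetings.
 OBSERVE AND MEDIATE INTERACTIONS
• Covers email interaction, meeting interaction, and multimodal
human-computer interaction. Email interaction includes summarizing,
classifying, and prioritizing replies to messages; meeting interaction
includes annotating meeting records; multimodal human-computer
interaction combines speech, handwriting, pen gestures, GUI
manipulation, and other modes.
CALO System Architecture (4)
 MONITOR AND MANAGE TASKS
• Coordinates and manages complex tasks that involve multiple
subsystems and participants.
 SCHEDULE AND ORGANIZE IN TIME
• Helps the user schedule, detects time conflicts and suggests
resolutions, and negotiates meeting times with other people on the
user’s behalf; it also learns the user’s habits and has adjustable
autonomy (the degree to which the user takes part in scheduling).
 ACQUIRE AND ALLOCATE RESOURCES
• Discovers new information sources; learns and reasons about roles
and expertise.
Core Technologies (bottom-up)
 OAA
 SPARK
 IRIS
 PTIME
 SR/AR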
OAA (1)
 OAA (Open Agent Architecture)
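http://www.openagent.com
http://www.ai.sri.com/oaa/
A Case
 Scenario:
• Perrault tells the CALO system through a microphone: "Notify me
immediately when an email about security arrives."
• Cheyer sends Perrault an email with the subject "security alert".
• Perrault receives a phone call in his office; a voice prompt tells him
that a new email has arrived and asks him to enter his password.
• After Perrault enters his password on the phone keypad, the system
plays the content of the email over the phone.
DEMO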
Collaboration Process (1)
Collaboration Process (2)
Collaboration Process (3)
Collaboration Process (4)
Collaboration Process (5)
Collaboration Process (6)
OAA (2)
 Characteristics [Martin, AAI99] [Cheyer, AAMAS01]
• Open
 Agents can be created in many languages and can interface with
existing systems.
• Extensible
 Agents can be added or replaced on the fly.
• User friendly
 High-level, natural expression of delegated tasks.
• Developer friendly
 Unified approach to service provision, data management, and task
monitoring.
• Multimodal
 Handwriting, speech, gestures, and direct manipulation can be
combined.
• Reusable
 Unanticipated sharing across many applications.
OAA (3)
 ICL (Interagent Communication Language)
• A conversational-protocol layer defined by event types, similar
to KQML.
• A content layer consisting of the specific goals, triggers, and data
elements, similar to KIF.
• Based on an extension of the Prolog language.
 Event
• All communication between agents occurs in the form of events.
 Trigger
• Provides a general mechanism for specifying an action to be
taken when a set of conditions is met.
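The event and trigger notions can be illustrated with a minimal Python sketch. This is not OAA's ICL or its API: `Event`, `Trigger`, and `TriggerRegistry` are hypothetical names, and the example condition simply mirrors the "security alert" scenario from the case slide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    """A message exchanged between agents (hypothetical structure)."""
    kind: str
    data: Dict[str, str] = field(default_factory=dict)


@dataclass
class Trigger:
    """An action to run whenever the condition holds for an incoming event."""
    condition: Callable[[Event], bool]
    action: Callable[[Event], None]


class TriggerRegistry:
    def __init__(self) -> None:
        self._triggers: List[Trigger] = []

    def add(self, trigger: Trigger) -> None:
        self._triggers.append(trigger)

    def post(self, event: Event) -> int:
        """Deliver an event and return how many triggers fired."""
        fired = 0
        for trigger in self._triggers:
            if trigger.condition(event):
                trigger.action(event)
                fired += 1
        return fired


notifications: List[str] = []
registry = TriggerRegistry()
registry.add(Trigger(
    condition=lambda e: e.kind == "email_arrived"
    and "security" in e.data.get("subject", "").lower(),
    action=lambda e: notifications.append(f"Notify user: {e.data['subject']}"),
))

registry.post(Event("email_arrived", {"subject": "lunch on Friday"}))
registry.post(Event("email_arrived", {"subject": "security alert"}))
print(notifications)
```

Only the second event satisfies the condition, so only one notification is recorded.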
OAA (4)
 Facilitation
• Delegation, optimization, interpretation
 Declarations of solvables
• solvable(GoalTemplate, Parameters, Permissions)
• solvable(send_message(email, +ToPerson, +Params),
[type(procedure), callback(send_mail)], [])
• solvable(last_message(email, -MessageId), [type(data),
single_value(true)], [write(true)])
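A rough Python sketch of delegation through a facilitator follows. It only assumes what the slide states (agents declare solvables, and the facilitator routes goals to them); the `Facilitator` class, its method names, and the simplified `send_mail` callback are hypothetical and do not reproduce the OAA library.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[..., Any]


class Facilitator:
    """Routes goals to whichever agents declared a matching solvable."""

    def __init__(self) -> None:
        self._solvables: Dict[str, List[Handler]] = {}

    def declare(self, goal_name: str, handler: Handler) -> None:
        """Register a callback that can solve goals with this name."""
        self._solvables.setdefault(goal_name, []).append(handler)

    def solve(self, goal_name: str, *args: Any) -> List[Any]:
        """Delegate the goal to every agent that declared it."""
        handlers = self._solvables.get(goal_name, [])
        if not handlers:
            raise LookupError(f"no agent can solve {goal_name!r}")
        return [handler(*args) for handler in handlers]


outbox: List[str] = []


def send_mail(to_person: str, params: Dict[str, str]) -> bool:
    """Stand-in for an email agent's callback; it only records the request."""
    outbox.append(f"to={to_person} subject={params.get('subject', '')}")
    return True


facilitator = Facilitator()
facilitator.declare("send_message", send_mail)
results = facilitator.solve("send_message", "Perrault", {"subject": "security alert"})
print(results, outbox)
```

The requester never names the email agent; it states the goal and the facilitator finds a provider, which is what lets agents be added or replaced on the fly.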
SPARK (1)
 SPARK (SRI Procedural Agent Realization Kit)
• Builds on PRS and shares the same Belief Desire Intention (BDI)
model of rationality.
• Supports the construction of large-scale, practical agent
systems, and contains sophisticated mechanisms for
encoding and controlling agent behavior.
• Has a well-defined semantic model that is intended to
support reasoning about the agents' knowledge and
execution.
There is a need for agent systems that can scale to real world applications,
yet retain the clean semantic underpinning of more formal agent frameworks.
[Morley, AAMAS04] [Morley, AAAI04]
http://www.ai.sri.com/~spark/
SPARK (2)
Overall Architecture for a SPARK Agent
SPARK (3)
 Belief
• A knowledge base of beliefs about the world and the agent itself,
updated both by sensory input from the external world and by
internal events.
 Procedures
• Provide declarative representations of activities for
responding to events and for decomposing complex tasks
into simpler tasks.
 Intentions
• At any given time the agent has a set of intentions, which are
procedure instances that it is currently executing.
SPARK (4)
 Executor
• Is SPARK’s core. Its role is to manage the execution of
intentions.
• It does this by repeatedly selecting one of the current
intentions to process and performing a single step of that
intention.
• Steps generally involve activities such as performing tests on
and changing the KB, adding tasks, decomposing tasks by
applying procedures, or executing primitive actions.
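The executor cycle described above (select one intention, perform a single step, repeat) can be sketched in Python. This is an illustration under stated assumptions, not SPARK code: intentions are modeled as generators, the selection policy is plain round-robin, and `greet`, `count`, and `run_executor` are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict, Generator, List

Intention = Generator[str, None, None]


def greet(kb: Dict[str, int]) -> Intention:
    kb["greeted"] = 1          # a step that changes the KB
    yield "greet: updated KB"
    yield "greet: said hello"  # a step standing in for a primitive action


def count(kb: Dict[str, int], limit: int) -> Intention:
    for i in range(limit):
        kb["count"] = i + 1
        yield f"count: {i + 1}"


def run_executor(intentions: List[Intention]) -> List[str]:
    """Repeatedly pick one intention and perform a single step of it."""
    queue: Deque[Intention] = deque(intentions)
    trace: List[str] = []
    while queue:
        intention = queue.popleft()        # selection policy: round-robin
        try:
            trace.append(next(intention))  # perform exactly one step
            queue.append(intention)        # not finished: keep it active
        except StopIteration:
            pass                           # finished: drop the intention
    return trace


kb: Dict[str, int] = {}
trace = run_executor([greet(kb), count(kb, 3)])
print(trace)
print(kb)
```

Because each intention advances one step at a time, the steps of the two intentions interleave in the trace instead of one running to completion before the other starts.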
IRIS (1)
 IRIS: Integrate. Relate. Infer. Share.
• Semantic Desktop [Cheyer, Semantic Web05]
• CALO is an artificial intelligence application for which
IRIS serves as the semantic desktop user interface.
 Integrate
• Information resources
• A knowledge base
• User interface framework
http://www.openiris.org/
IRIS (2)
IRIS (3)
 Relate
• IRIS is used to semantically integrate the tools of
knowledge work.
• Clib (the Component Library Specification)
 CALO’s ontology
 Consists of definitions for everyday objects and
events.
 Uses OWL as the data representation.
IRIS (4)
 Infer
• One of the key differentiators of IRIS, compared to
many semantic desktop systems, is the emphasis on
machine learning and the implementation of a plug-and-play
learning framework.
• A typical use case
 Email Harvesting.
 Contact/Expertise Discovery.
 Learn from Files.
 Project Creation.
 Classification According to Project.
 Higher-level Reasoning.
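One way to picture a plug-and-play learning framework is a registry in which independent learners subscribe to harvested items. The Python sketch below is an assumption-laden illustration, not the IRIS API: `Learner`, `ContactHarvester`, `ProjectClassifier`, and `LearningFramework` are hypothetical, and the keyword lookup stands in for a real learned classifier.

```python
from typing import Dict, List, Protocol, Set


class Learner(Protocol):
    def observe(self, email: Dict[str, str]) -> None: ...


class ContactHarvester:
    """Collects sender addresses seen in harvested email."""

    def __init__(self) -> None:
        self.contacts: Set[str] = set()

    def observe(self, email: Dict[str, str]) -> None:
        self.contacts.add(email["from"])


class ProjectClassifier:
    """Labels an email with the first project whose keyword is in the subject."""

    def __init__(self, keywords: Dict[str, str]) -> None:
        self.keywords = keywords           # keyword -> project name
        self.labels: Dict[str, str] = {}   # email id -> project name

    def observe(self, email: Dict[str, str]) -> None:
        subject = email["subject"].lower()
        for keyword, project in self.keywords.items():
            if keyword in subject:
                self.labels[email["id"]] = project
                break


class LearningFramework:
    """Forwards every harvested item to all registered learners."""

    def __init__(self) -> None:
        self._learners: List[Learner] = []

    def register(self, learner: Learner) -> None:
        self._learners.append(learner)

    def publish(self, email: Dict[str, str]) -> None:
        for learner in self._learners:
            learner.observe(email)


framework = LearningFramework()
contacts = ContactHarvester()
classifier = ProjectClassifier({"calo": "CALO", "budget": "Finance"})
framework.register(contacts)
framework.register(classifier)
framework.publish({"id": "m1", "from": "cheyer@example.org", "subject": "CALO demo plan"})
framework.publish({"id": "m2", "from": "perrault@example.org", "subject": "Lunch"})
print(sorted(contacts.contacts), classifier.labels)
```

A new learner only has to provide `observe`; nothing in the framework changes when it is registered.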
IRIS (5)
 Share
• Shared structures are essential both for end-user
applications, such as team decision making and project
management, and for infrastructural components such as machine
learning algorithms, which improve when given larger
data sets to work on.
PTIME (1)
 PTIME (Personalized Time Management) [Berry, AAAI05]
• PTIME will unobtrusively learn user preferences through a
combination of passive learning, active learning, and advice-taking.
• As a result, over time the user will become more
confident of PTIME’s ability, and will thus let it make more
decisions autonomously.
• As autonomy increases, PTIME will learn when to
involve the user in its decisions.
PTIME (2)
[Berry, AAMAS06]
PTIME (3)
 Three components of PTIME
• Process Controller (Heart of PTIME)
 A SPARK agent that captures possible interactions.
 Manages PTIME’s processes, tasking and coordinating the
activities of the Constraint Reasoner and Preference Learner.
• Constraint Reasoner
 Explores conflict resolution options using relaxation, event
bumping, and explanation techniques.
• Preference Learner
 Is an unobtrusive, online learner where the user’s selections
from suggested alternatives provide feedback to the learning
algorithm.
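The idea of learning from the user's selection among suggested alternatives can be sketched with a perceptron-style ranking update. This is an assumed stand-in technique for illustration, not the algorithm PTIME uses; the class name, feature names, and learning rate are all hypothetical.

```python
from typing import Dict, List

Features = Dict[str, float]


class PreferenceLearner:
    def __init__(self, learning_rate: float = 0.5) -> None:
        self.weights: Dict[str, float] = {}
        self.learning_rate = learning_rate

    def score(self, option: Features) -> float:
        return sum(self.weights.get(name, 0.0) * value
                   for name, value in option.items())

    def rank(self, options: List[Features]) -> List[int]:
        """Indices of options, best first (ties keep the original order)."""
        return sorted(range(len(options)),
                      key=lambda i: -self.score(options[i]))

    def feedback(self, options: List[Features], chosen: int) -> None:
        """Shift weights toward the option the user actually selected."""
        predicted = self.rank(options)[0]
        if predicted == chosen:
            return  # the top suggestion was accepted; nothing to correct
        for name in set(options[chosen]) | set(options[predicted]):
            delta = (options[chosen].get(name, 0.0)
                     - options[predicted].get(name, 0.0))
            self.weights[name] = (self.weights.get(name, 0.0)
                                  + self.learning_rate * delta)


options = [
    {"morning": 1.0, "bumps_meeting": 1.0},  # early slot that displaces a meeting
    {"morning": 0.0, "bumps_meeting": 0.0},  # afternoon slot with no conflict
]
learner = PreferenceLearner()
learner.feedback(options, chosen=1)  # the user picks the afternoon slot
print(learner.rank(options))
```

The learner is unobtrusive in the sense that it needs no extra input: the selection itself is the training signal.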
PTIME (4)
 Research Directions [Berry, AAAI05]
• Soft CSP design [Venable, IJCAI05]
 Simple Temporal Problem (STP)
 Disjunctive Temporal Problem (DTP)
 Simple Temporal Problem with Uncertainty (STPU)
 Disjunctive Temporal Problem with Uncertainty
(DTPU)
• Negotiation: Process Design for Conflict Resolution
• Learning for Adjustable Autonomy
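As an illustration of the simplest formalism in the list, a Simple Temporal Problem can be checked for consistency by running all-pairs shortest paths on its distance graph: the problem is consistent exactly when the graph has no negative cycle. The sketch below is the generic textbook construction, not PTIME's constraint reasoner; `solve_stp` and the meeting example are hypothetical.

```python
from typing import List, Optional, Tuple

INF = float("inf")


def solve_stp(
    n: int, constraints: List[Tuple[int, int, float, float]]
) -> Optional[List[List[float]]]:
    """Each constraint (i, j, lo, hi) means lo <= t_j - t_i <= hi.

    Returns the minimal distance matrix, or None if the STP is inconsistent.
    """
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, lo, hi in constraints:
        d[i][j] = min(d[i][j], hi)    # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # t_i - t_j <= -lo
    for k in range(n):                # Floyd-Warshall shortest paths
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    if any(d[i][i] < 0 for i in range(n)):
        return None                   # negative cycle: no schedule exists
    return d


# Time points in minutes: 0 = reference time, 1 = meeting start, 2 = meeting end.
meeting = [
    (0, 1, 0, 60),   # the meeting starts within an hour of the reference time
    (1, 2, 30, 60),  # it lasts between 30 and 60 minutes
    (0, 2, 0, 70),   # it must end within 70 minutes of the reference time
]
dist = solve_stp(3, meeting)
print("start window:", (-dist[1][0], dist[0][1]))
print("end window:", (-dist[2][0], dist[0][2]))
print("tightened deadline:", solve_stp(3, meeting[:2] + [(0, 2, 0, 20)]))
```

The third call replaces the deadline with one that is shorter than the minimum meeting length, so no assignment of times can satisfy it.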
SR/AR (1)
 SR/AR (Situation Assessment / Activity Recognition)
[Hung 05]
• Empower CALO with the ability to interpret and make
sense of what is going on in its environment.
 Technical Challenges
• Large, dynamic, and relational state space.
• Large sources of temporal and multimodal data.
• Semantic gaps, uncertainty.
SR/AR (2)
 Research Work
• T1: Methods for state estimation in relational domains, including
dealing with an unknown number of objects and their identities,
relevance determination, and focus of attention.
• T2: Methods for inference and learning in continuous-time complex
dynamic processes.
• T3: Methods for active learning, strategic user querying, and fast
inference in large HMMs.
• T4: Methods for learning and recognizing hierarchical activity
models from desktop activity traces.
• T5: Methods for location-based activity recognition.
SR/AR (3)
 Research Work
• T6: Methods for learning and recognizing activities, gestures and
relevant objects from low-level physical sensors.
• T7: Methods for state estimation in communicative activities.
• T8: Methods for tracking the progress of the CALO plan, including
possible failures and missed deadlines.
SR/AR (4)
 T1: Situation assessment in relational domains
• Develop a language for representing domains in which the number
of objects and their identities are unknown: BLOG (Bayesian LOGic)
and DBLOG (Dynamic BLOG).
• Propose an approach based on probabilistic relational models that
does not require a complete propositionalization of the
domain at inference time.
 T2: Continuous-time modeling in complex dynamic
processes
• From DBN (Dynamic Bayesian Network) to CTBN (Continuous Time
Bayesian Network).
SR/AR (5)
 T3: Active learning, strategic user querying, and fast
inference in large HMMs
• Active learning for HMMs has been implemented, with promising
results on user activity data from an instrumented desktop.
• These results will be extended to general graphical models,
including DBNs.
 T4: Learning and recognizing the user’s activities from desktop
traces
• Typical user activities have an inherent hierarchical structure.
• The main challenge for CALO is to chain related events together and
infer the hidden sub-activities and high-level activities.
• Efficient inference algorithms and a semi-supervised learning
approach for abstract and hierarchical hidden Markov models, combined
with continuous time Bayesian networks.
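To make the HMM-based inference concrete, here is a minimal forward-filtering sketch for a flat, two-state HMM over desktop events. It is far simpler than the hierarchical and continuous-time models named on the slides, and every state name, observation name, and probability below is made up for illustration.

```python
from typing import Dict, List

Dist = Dict[str, float]


def forward_filter(prior: Dist, transition: Dict[str, Dist],
                   emission: Dict[str, Dist], observations: List[str]) -> Dist:
    """Return P(state at the last step | all observations).

    `prior` is the belief before the first observation; every step first
    predicts through the transition model, then conditions on the observation.
    """
    belief = dict(prior)
    for obs in observations:
        predicted = {s: sum(belief[p] * transition[p][s] for p in belief)
                     for s in belief}
        unnormalized = {s: predicted[s] * emission[s][obs] for s in predicted}
        total = sum(unnormalized.values())
        if total == 0.0:
            raise ValueError(f"observation {obs!r} is impossible under the model")
        belief = {s: v / total for s, v in unnormalized.items()}
    return belief


prior = {"writing": 0.5, "browsing": 0.5}
transition = {
    "writing": {"writing": 0.8, "browsing": 0.2},
    "browsing": {"writing": 0.3, "browsing": 0.7},
}
emission = {
    "writing": {"keyboard": 0.9, "mouse": 0.1},
    "browsing": {"keyboard": 0.2, "mouse": 0.8},
}
belief = forward_filter(prior, transition, emission, ["keyboard"])
print(belief)
```

After a single keyboard event the belief shifts toward the hypothetical "writing" activity, because that state both persists more strongly and emits keyboard events more often.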
SR/AR (6)
 T5: Location-based activity recognition
• Develop techniques that can reliably estimate location (location
information is extracted from WiFi signal strength).
• Develop methods for learning and inferring higher-level patterns of
movement and activities from the data generated by a location-aware
CALO.
• From RMNs (Relational Markov Networks) to RFGs (Relational
Factor Graphs).
 T6, T7 and T8
• HHMM (Hierarchical Hidden Markov Models) [Nguyen, CVPR05]
• ProPL (Probabilistic Process Language)
SR/AR (7)
Outlook (1)
 Transfer Learning [Dietterich 05]
• Replacing an employee
 Employee A is leaving an organization and being replaced by
employee B. Can B’s CALO demonstrate transfer based on
learning that took place in A’s CALO?
• Moving to a new job
 An employee leaves organization A and moves to a new
organization B. Can his CALO demonstrate transfer learning
from experiences in A to capabilities in B?
Outlook (2)
 Some learning mechanisms for transfer learning
• Hierarchical Bayesian learning
• Shared parameter models
• Instance weighting
• Abstraction regularization
• Cascading classifiers
• Attribute Weights and Low Dimensional
Representations
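Of the mechanisms listed, instance weighting is the easiest to show in a few lines: examples from the source setting are kept but counted at a reduced weight next to the few examples from the target setting. The sketch below is only an illustration of that idea with made-up data; the 0.25 source weight and the `weighted_rate` helper are assumptions.

```python
from typing import Iterable, Tuple


def weighted_rate(examples: Iterable[Tuple[bool, float]]) -> float:
    """Weighted fraction of positive examples; each example is (label, weight)."""
    total = 0.0
    positive = 0.0
    for label, weight in examples:
        total += weight
        if label:
            positive += weight
    if total == 0.0:
        raise ValueError("no weighted examples")
    return positive / total


# Did the user accept a suggested meeting slot? (hypothetical histories)
source = [(True, 0.25)] * 8 + [(False, 0.25)] * 2  # predecessor's CALO, down-weighted
target = [(False, 1.0)] * 2 + [(True, 1.0)] * 1    # new user's own few examples

print("source only:", weighted_rate(source))
print("target only:", weighted_rate(target))
print("combined:", weighted_rate(source + target))
```

The combined estimate falls between the two individual ones, and as more target examples arrive their full weight gradually dominates the transferred data.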
Outlook (3-CALO Learning)
 Types of learning
• Relational: learn relationships among entities.
• Sequential: learn the dynamic structure of the user’s ongoing activity.
• Category: learn relevant groupings for observed information.
• Language: learn new information from text and utterances.
• Procedural: learn to handle new tasks through planning.
• Factual: reason to learn new facts.
• Perceptual: learn to associate images and sounds with other knowledge.
• Advice: learn from the user.
 Diagram labels: Observation, Reflection, Inference, Interaction,
Long-Term Memory, Situational/Episodic Memory
Outlook (4-Using CALO Learning)
 Learning capabilities shown in the diagram
• Learn when to interact.
• Learn important relationships.
• Learn to handle new tasks.
• Associate people with roles and places.
• Learn to adapt to new situations.
 Diagram labels: Timeline (t, Now); Notice, Anticipate, Plan, Act,
Interact, Inference; MMTM; people: Jean, Mary, Harry, John
Outlook (5-Technical Challenges)
• Robust mixed-initiative multitasking in a changing environment
• Enduring improvement through learning
• Integration of heterogeneous cognitive components
• Establishing and maintaining trust
• Knowing what’s out there
• Seamless use across platforms
 Diagram labels: Timeline (t, Now); Notice, Anticipate, Plan, Act,
Interact, Introspect; MMTM
Thanks!
References (1)
 [Morley, AAMAS04] Morley, D. and Myers, K. The SPARK Agent
Framework. In Proc. of the Third Int. Joint Conf. on Autonomous
Agents and Multi Agent Systems (AAMAS-04), New York, NY, pp. 712-719, July 2004.
 [Morley, AAAI04] Morley, D. and Myers, K. Balancing Formal and
Practical Concerns in Agent Design. In Proc. of the AAAI Workshop on
Intelligent Agent Architectures: Combining the Strengths of Software
Engineering and Cognitive Systems, 2004.
 [Cheyer, Semantic Web05] Cheyer, A. and Park, J. and Giuli, R. IRIS:
Integrate. Relate. Infer. Share. In 1st Workshop on The Semantic
Desktop. 4th International Semantic Web Conference, p. 15, Nov 2005.
 [Berry, AAMAS06] Berry, P. and Conley, K. and Gervasio, M. and
Peintner, B. and Uribe, T. and Yorke-Smith, N. Deploying a
Personalized Time Management Agent, in Proceedings of the Fifth
International Joint Conference on Autonomous Agents and Multi Agent
Systems (AAMAS’06) Industrial Track, Hakodate, Japan, May 2006.
References (2)
 [Berry, AAAI05] Berry, P. and Gervasio, M. and Uribe, T. and Pollack, M.
and Moffitt, M. A Personalized Time Management Assistant, in AAAI
2005 Spring Symposium Series, Stanford, CA, Mar 2005.
 [Venable, IJCAI05] Venable, K. B. and Yorke-Smith, N. Disjunctive
Temporal Planning with Uncertainty, in Proceedings of Nineteenth
International Joint Conference on Artificial Intelligence (IJCAI’05),
Edinburgh, UK, pp. 1385–1386, Aug 2005.
 [Nguyen, CVPR05] Nguyen, N. and Phung, D. and Venkatesh, S. and
Bui, H. Learning and detecting activities from movement trajectories
using the hierarchical hidden Markov model, in IEEE International
Conference on Computer Vision and Pattern Recognition, 2005.
 [Duong, CVPR05] Duong, T. and Bui, H. and Phung, D. and Venkatesh,
S. Activity recognition and abnormality detection with the switching
hidden semi-Markov model, in IEEE International Conference on
Computer Vision and Pattern Recognition, 2005.
References (3)
 [Hung 05] Hung Bui. Situation Assessment and Activity Recognition.
Technical Report, SRI International, 2005.
 [Dietterich 05] Tom Dietterich, Girish Acharya. Transfer Learning
Activity for Years 3-5. Technical Report, SRI International, 2005.
 [Martin, AAI99] Martin, David L. and Cheyer, Adam J. and Moran,
Douglas B. The Open Agent Architecture: A Framework for Building
Distributed Software Systems. Applied Artificial Intelligence, vol. 13, no.
1-2, pp. 91-128, January-March 1999.
 [Cheyer, AAMAS01] Cheyer, Adam and Martin, David. The Open Agent
Architecture. Journal of Autonomous Agents and Multi-Agent Systems,
vol. 4, no. 1, pp. 143-148, March 2001.