09/11/01: Lecture 3, Part II

Intelligent Agents
Katia Sycara
The E-Commerce Institute
[email protected]
www.cs.cmu.edu/~softagents
Teaching assistant: Joe Giampapa
[email protected]
Internet Agents
• Web search agents
• Information filtering agents
• Off-line delivery agents
• Notification agents
• Service agents
• Web site agents
• Mobile agents
Information Search
• Ways to Find Information
– Browsing: Following hyper-links that seem of interest
– Searching: Sending a query to a search engine such as Lycos
– Categories: Following existing categories such as Yahoo
• Problems
– Navigating takes a lot of time and effort. Can search be made
more efficient?
– It is difficult to express the user’s intention accurately in a
search query.
– Search engines are not personalized
Search Engines
Web etiquette guidelines for spiders
• Identify the name of the agent
• Identify the user deploying the agent
• Announce the agent by posting a message to the
comp.infosystems.www.providers Usenet newsgroups
• Announce the agent to the Webmasters of the servers the
agent will visit
• Provide additional information (using the HTTP Referer field)
• Be accessible to fix problems the agent may cause
• Design the agent so it does not consume a lot of resources (e.g.
it does not make successive hits on a single server, does not loop,
runs at appointed times, etc.)
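A few of these guidelines can be sketched in code. The following is a minimal illustration, not a complete spider: the agent name, contact address, information URL, and the five-second delay are all hypothetical values chosen for the example. It shows one way to identify the agent and its operator in request headers and to avoid successive hits on a single server.

```python
import time
import urllib.request
from urllib.parse import urlsplit

AGENT_NAME = "ExampleSpider/0.1"                   # hypothetical agent name
OPERATOR_EMAIL = "operator@example.org"            # hypothetical contact address
INFO_URL = "http://example.org/spider-info.html"   # hypothetical info page
MIN_DELAY_SECONDS = 5.0                            # assumed pause between hits on one host

_last_hit = {}  # host -> time of the most recent request to that host


def build_request(url):
    """Build a request that names the agent and the user deploying it."""
    return urllib.request.Request(url, headers={
        "User-Agent": AGENT_NAME,
        "From": OPERATOR_EMAIL,
        "Referer": INFO_URL,
    })


def seconds_to_wait(url, now=None):
    """Return how long to pause before hitting this URL's host again."""
    now = time.monotonic() if now is None else now
    last = _last_hit.get(urlsplit(url).netloc)
    if last is None:
        return 0.0
    return max(0.0, MIN_DELAY_SECONDS - (now - last))


def record_hit(url, now=None):
    """Remember when this URL's host was last visited."""
    now = time.monotonic() if now is None else now
    _last_hit[urlsplit(url).netloc] = now
```

A crawler loop would call `seconds_to_wait` and sleep for that long before each fetch, then call `record_hit` afterwards.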
Advantages and Disadvantages of Search Engines
Feature | Advantage | Disadvantage
Keyword query | Ease of use | Lost productivity due to poor precision
Instant response | Increased productivity, if user knows what he is looking for | Decreased productivity, due to chasing links
Hierarchical subject categories | Increased productivity due to high precision | Low recall in response to user needs
Information discovery via spiders | Reduced user workload | Lack of scalability and bandwidth inefficiency
Limitations of current search engines
• Lack of personalization; this results in low precision
of answers
• Lack of scalability:
* the robot must visit not only new links but also old ones to keep them up to date
* the information gathering is centralized
Some solutions to scalability issues:
• use specialized information brokers for building
information indices
• use massive replication and caching of popular
information
• distribute information gathering by placing gatherers
on the provider’s site; information is then ready for
analysis as new information comes in, but the
provider must implement the software.
Information Filtering Agents
• Information Filtering agents find the content of
interest to a user.
• Information Filtering agents could gather information
from different sources
• They could filter information based on user’s
personal interest
• Filtering agents typically use a fixed number of
information sources
• Information filtering agents may use Information
Retrieval techniques
* Vector space models, where a document is
represented as a vector of attributes
* Tree structure, which represents a hierarchical
view of a document
Filtering Agents Attributes
Element | Description
Environment | Internet
Task Skills | Information gathering, filtering, presentation
Knowledge | Web, news in different domains
Communication | HTTP, HTML, indexing protocols
Filtering Agent Architecture
Figure 3.4 Filtering based on word usage
[Figure: axis "Word usage frequency"; labelled regions "Insignificant low-frequency words" and "Insignificant high-frequency words"]
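The idea behind Figure 3.4 can be sketched as a simple frequency filter: words that occur too rarely or too often are treated as insignificant, and only the middle band is kept. The cut-off values `min_count` and `max_share` below are assumptions chosen for the example, not values from the lecture.

```python
from collections import Counter


def significant_words(tokens, min_count=2, max_share=0.2):
    """Keep words that are neither too rare nor too common in `tokens`."""
    counts = Counter(tokens)
    total = len(tokens)
    return {
        word for word, n in counts.items()
        if n >= min_count and n / total <= max_share
    }


tokens = ("the agent filters the news and the agent ranks the news "
          "for the user").split()
print(sorted(significant_words(tokens)))  # ['agent', 'news']
```

Here "the" is dropped as a high-frequency word, and words that appear only once are dropped as low-frequency words.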
Benefits of Information Filtering Agents
Feature | Advantage | User-benefit
Information profile | Easy-to-use, form-based spec | Good for persistent interests
Web page delivery | Info available as Web page | Browser independent; requires site visits
E-mail delivery | Proactive information delivery | Eliminates site visits; e-mail clutter
Profile filtering | One-to-one “broadcasting” | Reduced information overload
Heterogeneous | Combines hetero info sources | Reduces subscription costs
Functionality of WebMate
• Learning user’s interests for information filtering
– Multiple TF-IDF vectors representation
– Incremental and adaptive Learning
– Compile personal newspaper
• Support for efficiently finding information
– Automatic refinement using Trigger Pairs
– Relevance feedback
_____________________________
Chen, Sycara, “WebMate: A Personal Agent for Browsing and Searching”,
Proceedings of the Second International Conference on Autonomous Agents, Minneapolis, MN, May 1998
Profile Representation
• Multiple TF-IDF vectors representation
• How many vectors are used? (A settable parameter; depends on the number of
user’s interests and on computational complexity)
• How many dimensions are used in a vector? (Depends on computational
complexity and the typical lexicon of a domain)
Learning Algorithm
• Preprocess: parse the HTML page, delete stop words, and stem the remaining words
• Extract the TF-IDF vector of the current interesting document
• If the number of vectors in the profile is less than a predefined
number, add the vector to the profile
• Otherwise, calculate the cosine similarity between every two TF-IDF
vectors in the profile
• Combine the two vectors with the greatest similarity
• Sort the weights in the new vector in decreasing order and keep the
highest several elements
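The profile update steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the WebMate implementation: TF-IDF vectors are taken as given (dictionaries from term to weight), the new document vector is included when comparing pairs, "combining" two vectors is taken to mean adding their weights, and `MAX_VECTORS` and `MAX_TERMS` are hypothetical settings.

```python
import math
from itertools import combinations

MAX_VECTORS = 3  # assumed number of vectors allowed in the profile
MAX_TERMS = 5    # assumed number of elements kept after a merge


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def update_profile(profile, doc_vector):
    """Return a new profile that accounts for one more interesting document."""
    if len(profile) < MAX_VECTORS:
        return profile + [dict(doc_vector)]
    candidates = profile + [dict(doc_vector)]
    # Find the two most similar vectors among all pairs.
    i, j = max(
        combinations(range(len(candidates)), 2),
        key=lambda pair: cosine(candidates[pair[0]], candidates[pair[1]]),
    )
    # Combine them by adding weights (an assumption about "combine").
    merged = dict(candidates[i])
    for term, weight in candidates[j].items():
        merged[term] = merged.get(term, 0.0) + weight
    # Keep only the highest-weighted elements of the merged vector.
    top = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)[:MAX_TERMS]
    rest = [v for k, v in enumerate(candidates) if k not in (i, j)]
    return rest + [dict(top)]
```

Because the two closest vectors are merged whenever the profile is full, the profile size stays bounded while each vector drifts toward one cluster of the user's interests.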
Compile Personal Newspaper
• Automatically spider a list of URLs or construct a query from the
profile
• Calculate the similarity and check whether the similarity is greater
than some threshold
• Experiments: Accuracy in top 10 is between 50% and 60%; accuracy
in top 20 is about 50%; overall accuracy is about 30%
Search Refinement
• Trigger Pairs Based Automated Refinement
– If a word S is significantly¹ correlated with another word T,
then (S, T) is considered a “trigger pair”, with S being the
trigger and T the triggered word.
• Relevance Feedback
– The context of the search keywords in the “relevant” pages is
used to automatically refine the search
• Parallel Search and Rerank
• Similarity-based Query
___________________
¹Significance is measured by mutual information (MI): $MI(s, t) = P(s, t) \log \frac{P(s, t)}{P(s) P(t)}$
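The significance measure in the footnote can be computed directly from co-occurrence counts. The sketch below is illustrative only: it assumes the corpus is split into small units (for example sentences or fixed-size windows) and estimates each probability as the fraction of units containing the word or the pair. The four sample units are made up for the example.

```python
import math


def mutual_information(units, s, t):
    """Score the pair (s, t) as P(s,t) * log(P(s,t) / (P(s) * P(t)))."""
    n = len(units)
    p_s = sum(s in u for u in units) / n
    p_t = sum(t in u for u in units) / n
    p_st = sum(s in u and t in u for u in units) / n
    if p_st == 0:
        return 0.0  # the words never co-occur, so the pair is not a trigger pair
    return p_st * math.log(p_st / (p_s * p_t))


# Hypothetical mini-corpus; each unit is the set of words in one window.
units = [set(text.split()) for text in (
    "cheap fare from the airline",
    "airline fare discount",
    "symphony concert tonight",
    "concert ticket sale",
)]
print(mutual_information(units, "fare", "airline"))
print(mutual_information(units, "fare", "concert"))
```

Pairs whose score exceeds some chosen threshold would then be kept as trigger pairs.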
Examples of Trigger Pairs
• Broadcast News Corpus: 140M words, Distance between S and T is
500
• Example 1: product << {maker, company, corporation, industry,
incorporate, sale, computer, market, business,…}
• Example 2: car <<{motor, auto, model, maker, vehicle, for, buick,
honda, inventory, assembly, chevrolet, sale, …}
• Example 3: fare << {airline, maxsaver, carrier, discount, air, coach,
flight, traveler, continental, unrestrict, ticket,…}
• Example 4: music << {symphony, orchestra, composer, song,
concert, tune, concerto, sound, musician, album, …}
Automatic Search Refinement
• The user chooses the domain, and the system automatically expands
the query using domain specific triggers or ontology
• The user chooses the intended definition of the ambiguous words,
and the system expands the query according to that definition
• For a search with only one keyword, the top several triggers to the
keyword are used to expand the search
• For a search with more than 2 keywords, the intersection of the
triggers to the keywords is used to expand the search
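A minimal sketch of trigger-based query expansion follows. The trigger table is abbreviated from the examples shown earlier and is assumed to be ordered by strength, which the slides do not state. The sketch also applies the intersection rule to any query with more than one keyword, which is a simplifying assumption.

```python
# Hypothetical trigger table, abbreviated from the lecture's examples.
TRIGGERS = {
    "car": ["motor", "auto", "model", "maker", "vehicle", "sale"],
    "product": ["maker", "company", "corporation", "industry", "sale"],
}


def expand_query(keywords, triggers, top_n=3):
    """Return the keywords followed by the trigger words used for expansion."""
    if not keywords:
        return []
    lists = [triggers.get(k, []) for k in keywords]
    if len(keywords) == 1:
        extra = lists[0][:top_n]              # top several triggers
    else:
        common = set(lists[0]).intersection(*lists[1:])
        extra = [t for t in lists[0] if t in common]  # shared triggers only
    return list(keywords) + [t for t in extra if t not in keywords]


print(expand_query(["car"], TRIGGERS))             # ['car', 'motor', 'auto', 'model']
print(expand_query(["car", "product"], TRIGGERS))  # ['car', 'product', 'maker', 'sale']
```

Using the intersection for multi-keyword queries keeps the expansion focused on the sense that the keywords share.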
Relevance Feedback Algorithm
• The context of the search keywords in the “relevant” pages is used to
refine the search
• Given a relevant page, the system looks at the context of the
keywords, counts how often each context word occurs, and uses the
most frequent words to expand the query
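The relevance feedback step can be sketched as counting the words that appear near the search keywords in a page marked as relevant. The window size, the number of terms returned, and the sample page text are assumptions made for the example.

```python
from collections import Counter


def feedback_terms(page_text, keywords, window=3, top_n=3, stop_words=frozenset()):
    """Return the most frequent words found within `window` words of a keyword."""
    tokens = page_text.lower().split()
    keyset = {k.lower() for k in keywords}
    counts = Counter()
    for i, token in enumerate(tokens):
        if token in keyset:
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(
                w for w in context if w not in keyset and w not in stop_words
            )
    return [word for word, _ in counts.most_common(top_n)]


page = "jaguar car dealer sells jaguar car parts and jaguar car service"
print(feedback_terms(page, ["jaguar"], window=1, top_n=1))  # ['car']
```

The returned words would be appended to the original query before it is re-submitted.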
The Query Restart Problem
• Agent A sends query to Agent B.
• Agent B can complete the query in time X, where
  $X = 1$ with probability $p$,
  $X = c$ ($c > 1$) with probability $1 - p$.
Expectation: $EX = p + (1 - p)c$
• If not done by time 1, should agent A abort and restart, or wait?
• Can restarting reduce expectation? The variance? Both?
• Does it help to repeatedly restart k times?
_______________________
Chalasani, Jha, Shehory, Sycara, “Query Restart Strategies for Web Agents”,
Proceedings of Autonomous Agents 98, Minneapolis, MN, May 1998
A Simple Scenario: Single restart
Strategy: restart just after time 1, if not done by then.
Let $X_i$ = completion time of the $i$'th query, $i = 1, 2$.
$X_1$, $X_2$ are independent, identically distributed.
New completion time is $Y$:
$$Y = \begin{cases} 1 & \text{if } X_1 = 1, \\ 1 + X_2 & \text{if } X_1 = c. \end{cases}$$
New expectation ($X_1$, $X_2$ indep.):
$$EY = p + (1 - p)(1 + EX_2) = 1 + p(1 - p) + (1 - p)^2 c$$
If (and only if) $c > 1 + 1/p$, $EY < EX_1$!
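The single-restart result can be checked numerically. Substituting $EX_2 = p + (1 - p)c$ into $EY = p + (1 - p)(1 + EX_2)$ gives $EY = 1 + p(1 - p) + (1 - p)^2 c$. The sketch below evaluates both expectations and compares the closed form with a small simulation; the values $p = 0.5$ and $c = 4$ are arbitrary choices for illustration.

```python
import random


def expected_no_restart(p, c):
    """EX = p + (1 - p) c: wait for the first query to finish."""
    return p + (1 - p) * c


def expected_single_restart(p, c):
    """EY = 1 + p (1 - p) + (1 - p)^2 c: restart once just after time 1."""
    return 1 + p * (1 - p) + (1 - p) ** 2 * c


def simulate_single_restart(p, c, trials=100_000, seed=0):
    """Estimate EY by sampling the restart strategy directly."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        if rng.random() < p:
            total += 1                      # first query done at time 1
        else:
            x2 = 1 if rng.random() < p else c
            total += 1 + x2                 # time 1 wasted, then the second query
    return total / trials


p, c = 0.5, 4
print(expected_no_restart(p, c))       # 2.5
print(expected_single_restart(p, c))   # 2.25
print(simulate_single_restart(p, c))   # close to 2.25
```

With $p = 0.5$ the threshold is $1 + 1/p = 3$, so restarting helps for $c = 4$ and would hurt for, say, $c = 2$.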
A Simple Scenario: k Restarts
[Figure: plot with axis label "Number of Restarts k"]
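One way to extend the single-restart calculation to k restarts is a simple recurrence. This is a sketch under the assumption that every restart is issued just after one time unit of the current attempt and that attempts are independent; it is not necessarily the analysis plotted on the original slide. If $E_k$ is the expected completion time with up to $k$ restarts, then $E_0 = p + (1 - p)c$ and $E_k = p + (1 - p)(1 + E_{k-1}) = 1 + (1 - p)E_{k-1}$.

```python
def expected_with_restarts(p, c, k):
    """Expected completion time when up to k restarts are allowed."""
    expected = p + (1 - p) * c            # k = 0: wait for the query to finish
    for _ in range(k):
        expected = 1 + (1 - p) * expected
    return expected


for k in range(4):
    print(k, expected_with_restarts(0.5, 4, k))
```

For $k = 1$ the recurrence reproduces $1 + p(1 - p) + (1 - p)^2 c$, and as $k$ grows the value approaches $1/p$.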
Off-Line Delivery Agents
Information filtering agents that deliver personalized information without
the need for a direct Internet connection
Off-line Delivery Agents Attributes
Element | Description
Environment | Internet, news feeds
Task skills | Information
Knowledge | Web, news, finance, sports, weather
Communication skills | HTTP, Meta tags, Desktop OS
Benefits of Off-line Delivery Agents
Feature | Advantage | Benefit
Direct delivery | Transparent delivery | User does not need to visit sites
Automatic delivery | Delivery according to user specified schedule | Avoidance of peak traffic hours
Local viewing | HTML links are locally resolved | Avoids the need to get on-line
Disk management | New information replaces out-of-date information | Relieves user from disk management task
Notification Agents
A notification agent is one that notifies a user of significant events, i.e. a change in the state of information, e.g.
• Content change in a particular Web page
• Search engine additions for specific keyword queries
• User-specified reminders for personal events (e.g. birthdays)
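Detecting a content change in a Web page can be sketched by comparing a fingerprint of the current content with the one stored from the previous check. This is a minimal illustration; the class name `ChangeMonitor` is hypothetical, and fetching the page is left out so that the example stays self-contained.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a digest that changes whenever the content changes."""
    return hashlib.sha256(content).hexdigest()


class ChangeMonitor:
    """Remembers the last fingerprint seen for each monitored URL."""

    def __init__(self):
        self._seen = {}

    def check(self, url, content):
        """Return True if `content` differs from the last version seen for `url`."""
        new = fingerprint(content)
        old = self._seen.get(url)
        self._seen[url] = new
        return old is not None and old != new


monitor = ChangeMonitor()
print(monitor.check("http://example.org/", b"version 1"))  # False (first visit)
print(monitor.check("http://example.org/", b"version 2"))  # True (content changed)
```

A notification agent would run such a check on a schedule and notify the user whenever it returns True.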
Notification Agent Attributes
Element | Description
Environment | Internet
Task Skills | Monitoring, determining, and notifying change in information
Knowledge | Web
Communication Skills | HTTP, Meta Tag, IDML
Benefits of Notification Agents
Feature | Advantage | Benefit
Monitoring | Monitors for change in information | Reduces user workload
Browserless monitoring | Monitor only header file or body text | Increased network efficiency
Change determination | Machine check of document change | Reduced user workload
Server implementation | Checks each resource for multiple clients | Eliminates bandwidth waste
Notification | Notifies user of changes | Increases site visits
Other Service Agents
• Announcement agents
• Business information monitoring agents
• Classified ads agents: search database of ads
• Direct mail agents: deliver direct mail advertising
• Financial service agents: deliver e-mails with prices or other
financial news
• Food and wine agents
• Job agents: virtual recruiters to find appropriate employees
• Entertainment agents: find communities of interests similar to
the user and recommend items, such as music, movies etc.
• Shopping agents: comparison shopping for user-specified items
• Site agents: virtual hosts at sites
Shopbots
Advantages:
• Provide unified interface to different stores, thus mitigating need to
navigate and deal with different interfaces
• Find best price and availability of a product
Challenges:
• Virtual stores block agents since they do not want to be compared on
price and availability alone
• User’s trust in a shopbot’s ability to notice sales and promotions.
Solutions:
• Cooperative vendor/agent model
• Vendor form learning agent
Collaborative Filtering
A collaborative filtering system makes
recommendations based on the preferences
of similar users.
People: Yenta, Referral Web
Products: Firefly, Tunes, Syskill & Webert
Readings: Wisewire, Phoaks
Content vs. Collaboration
• Content-based retrieval returns documents
that are similar to a query (search) or a user
profile (preference)
• Collaborative recommendation retrieves
documents liked by others with similar
profiles
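Collaborative recommendation can be sketched with a small nearest-neighbour example. The ratings table is made up, and cosine similarity over shared items is only one of several possible similarity measures. The sketch recommends items that the most similar other user rated highly and that the target user has not rated yet.

```python
import math

# Hypothetical explicit ratings on a 1-5 scale.
RATINGS = {
    "ann": {"album_a": 5, "album_b": 4, "album_c": 1},
    "bob": {"album_a": 5, "album_b": 5, "album_d": 4},
    "cat": {"album_a": 1, "album_c": 5, "album_e": 5},
}


def similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, min_score=4):
    """Recommend unseen items that the most similar user rated highly."""
    mine = ratings[user]
    others = [(similarity(mine, r), name)
              for name, r in ratings.items() if name != user]
    _, neighbour = max(others)
    return sorted(item for item, score in ratings[neighbour].items()
                  if item not in mine and score >= min_score)


print(recommend("ann", RATINGS))  # ['album_d']
```

A content-based system would instead compare item descriptions with the user's own profile, without looking at other users.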
Early Apps
• GroupLens (1994): filtered newsgroups; the news client displays
predicted scores and the user rates articles after reading
• Phoaks: recommended web pages; uses frequency-of-mention data
within Usenet newsgroups to rate URLs
Getting the Data
Explicit: Firefly (rate, match, recommend)
Implicit: Amazon (purchase, match, recommend)
Priming the Pump: Lifestyle Finder uses
demographic data to assign users to market
research categories
Over the Shoulder: Letizia uses observed
browsing behavior & heuristics to recommend
links
Problems in Collaborative Filtering
Incentives & Startup
• Need a critical mass of users/recommenders to
make meaningful predictions
• Need mechanisms to maintain participation
Reliability
• Spoofing: will content providers inflate their
ratings?
• Technical problems with clustering & similarity
measures
Privacy
• Once you share your profile who else may want
it?
Synthetic Agents (e.g. Julia)
Julia is a chatterbot that tries to convince users of its humanlike behavior by:
• Repeating user’s input in questions
• Admitting ignorance
• Changing the topic of conversation
• Using conversational statements
• Using humorous statements
• Providing excerpts from Usenet News
• Simulating typing, mimicking a user’s imperfect performance
Possible applications of chatterbots:
• Visiting on-line chatrooms on topics of interest to your company
• Initiating interesting conversations in chatrooms
• Presenting comparison ads against your rivals
• Answering information requests about your products
• Serving as a site guide for finding information
• Serving as a product guide on your site (e.g. demonstrate an automobile)
Intranets
Business applications of intranets:
• Effective communication medium for enterprises
• Create virtual communities within an enterprise
• Automating order tracking and transaction
processing
• Marketing support automation
• Customer service and knowledge sharing among
customers
• Internal help desk to provide guidance for corporate
processes and resources
• Human resources support
Intranet Search Agent Model Attributes
Table 4.1
Element | Description
Environment | Intranet
Task Skills | Indexing document databases, searching, and retrieval
Knowledge | Corporate databases and document formats
Communication Skills | HTTP, SQL, CGI, WAIS
Benefits of Intranet Search Agents
Feature | Advantage | Benefit
Multidatabase search | Client search of all corporate databases | Increased organizational productivity, reduced costs
Search save on servers | Enables sharing of search results within organization | Reduced workload
Multiple-level access control | Allows access of certain fields to authorized users | Corporate security
Proactive notification | Notifies users of change in information | Increased productivity, enhanced corporate communications
Intranet Filtering Agent Attributes
Element | Description
Environment | Intranet
Task Skills | Information organizing, sharing and presentation
Knowledge Skills | Corporate database, workgroup discussions, newsfeeds
Communication | HTTP, HTML, OLAP
Benefits of Intranet Filtering Agents
Feature | Advantage | Benefit
Information profile | Form-based specification of individual workgroup interests | Ideal for persistent but cumbersome for dynamic interests
Notification | Proactive information delivery | Increased site visits and increased productivity by alleviating information search
Profile based filtering | Relevant information for critical decisions | Increased organizational productivity
Heterogeneous information sources | Combines heterogeneous information sources | Increased productivity and reduced subscription costs through sharing
Drawbacks and extended features
Drawbacks include:
· Separate notification for each user interest, cluttering the mailbox
· Do not incorporate a user model for tracking user’s
actions upon information delivery
Advanced Features
· Recommend an agent for each new user interest topic
· Modify an existing agent, based on user’s use of agent
recommended information (e.g. specialize an information
agent)
· Remove an agent that the user does not use
· Temporarily activate an agent based on user interest and
disinterest in the agent’s recommendation
Collaboration Agents
The software runs over a network and enables a team to work
together and share information. It assists groups in:
· Group scheduling
· Discussion groups
· Resource tracking
· Document management
It could do some simple tasks:
· Save and re-execute shareable queries that search
groupware databases
· Perform a script under pre-specified conditions
· Perform a script according to a pre-specified schedule
Example: Lotus Notes
Agent definition
· Agent name with optional comment
· When the agent should run:
* manually
* if new mail has arrived
* if documents have been created, modified, deleted
* at scheduled times, e.g. hourly, daily etc.
· What document should the agent act on?
* all documents
* all new and modified documents since last time agent ran
* all unread documents
* selected documents
· What should the agent do?
* User can enter a LotusScript program that can examine
named fields and apply simple conditional logic.
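The three parts of such an agent definition (when to run, which documents to act on, and what to do) can be modelled as a small data structure. The sketch below is plain Python written for illustration; it is not LotusScript and does not use any Lotus Notes API, and all class and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Trigger(Enum):
    MANUAL = "manually"
    NEW_MAIL = "new mail has arrived"
    DOCUMENTS_CHANGED = "documents created, modified, or deleted"
    SCHEDULED = "at scheduled times"


class Selection(Enum):
    ALL = "all documents"
    NEW_AND_MODIFIED = "new and modified since last run"
    UNREAD = "unread documents"
    SELECTED = "selected documents"


@dataclass
class AgentDefinition:
    name: str
    trigger: Trigger
    selection: Selection
    action: Callable[[dict], None]   # what the agent does to each document
    comment: str = ""


def matches(selection, doc):
    """Decide whether a document (a dict of named fields) is selected."""
    if selection is Selection.ALL:
        return True
    if selection is Selection.NEW_AND_MODIFIED:
        return doc.get("changed", False)
    if selection is Selection.UNREAD:
        return doc.get("unread", False)
    return doc.get("selected", False)


def run_agent(agent, documents):
    """Apply the agent's action to matching documents; return how many matched."""
    touched = [d for d in documents if matches(agent.selection, d)]
    for doc in touched:
        agent.action(doc)
    return len(touched)
```

A scheduler would call `run_agent` whenever the condition named by `agent.trigger` occurs; the action plays the role of the user-entered script that examines named fields.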
Process Automation Agents
The goal is to use agents to automate workflow in business
applications
Differences between traditional workflow and agent-based
workflow
· Traditional workflow is centralized; agents offer a
distributed infrastructure
· Traditional workflow works only in structured
environments; agents could manage workflow during
execution
· Traditional workflow pre-specifies paths to take for
exception handling; agents can negotiate new tasks and
resources dynamically
Attributes of Process Automation Agents
Element | Description
Environment | Intranet
Task Skills | Process scheduling, negotiation, execution, and notification
Knowledge | Business processes, resources management
Communication skills | KQML, KIF, CORBA
Advantages of Process Agents
Feature | Advantage | Benefit
Task scheduling | Schedule user tasks; negotiating with server agents | Alleviate the need for user to be present to execute a task
Resource management | Dynamically allocate resources for task execution | Reduced workload as the user no longer needs to worry about resource availability
Exception handling | Renegotiate to reschedule in response to execution errors | Reduced workload as this is transparent to user
Proactive notifications | Proactively notify user of task completion | Increased productivity by reducing user need to monitor
Database Agents
Agents that provide enterprise-based support
· Run scheduled database analyses in the
background
· Exception reporting for operations management
· Notify of information changes in a user-specified
database object
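Exception reporting can be sketched as a query that a database agent runs in the background and that returns only the rows violating a user-defined condition. The example below uses an in-memory SQLite database; the `sales` table, its columns, and the threshold are hypothetical.

```python
import sqlite3


def find_exceptions(conn, threshold):
    """Return (region, total) rows whose sales total falls below `threshold`."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "GROUP BY region HAVING SUM(amount) < ? ORDER BY region",
        (threshold,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("east", 80.0), ("west", 30.0), ("west", 20.0)],
)

for region, total in find_exceptions(conn, 100.0):
    print(f"Exception: {region} total {total} is below target")
```

Running the same function on a schedule, and notifying the user only when it returns rows, covers both the background analysis and the notification roles listed above.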
Database Agents: Enterprise data delivery system
[Figure: enterprise data delivery system; labelled components: OLAP Server, DSS Agent, Desktop, VLDB Drivers, Oracle, Informix Server, ..., SQL Server]
Database Agents Attributes
Element | Description
Environment | Intranet
Task Skills | Data analysis automation, exception reporting, notification of information change
Knowledge | Data warehouse, metadata, RDBMS
Communication Skills | SQL, ODBC, OLE
Database Agent Benefits
Feature | Advantage | Benefit
Automatic data analysis | Automates users’ repetitive data analysis | Reduced workload
Exception reporting | Reports user-defined exceptions in business operations | Faster decision making
Notification alerts | Notifies user of changes in information | Increased productivity
Desired Features of Database Agents
Exception reporting alerts
· Time or event triggered report execution
· Workflow actions triggered by reports
· Incorporation of learning capability into the
database agents
· Incorporation of learning into the OLAP server