Mass Collaboration and Data Mining
MASS COLLABORATION AND DATA MANAGEMENT
Raghu Ramakrishnan
Professor, University of Wisconsin-Madison
CTO, QUIQ
DATA MINING IN 2010
• Two possible futures:
– Stand-alone suite of analysis tools
• E.g., part of SAS
– Embedded in various applications
• E.g., Blue Martini, QUIQ
• What will the dominant paradigm be?
University of Wisconsin-Madison
Page 2
CUSTOMER SERVICE CHALLENGES
[Diagram: SERVICE ORGANIZATION at the center, surrounded by four challenges]
• Meet increasing demand
• Improve customer satisfaction
• Control rising costs
• Solve service complexity
“OLD” SERVICE PARADIGM
[Diagram labels: Web; Support KB; Customer Support Center]
MASS COLLABORATION
• People using the web to share knowledge and help each other find solutions
• Answer added to power self service
[Diagram labels: QUESTION; SELF SERVICE KNOWLEDGE BASE; ANSWER; MASS COLLABORATION (experts, partners, customers, employees)]
CURRENT KNOWLEDGE BASES
Support Knowledge Base
Pros (+):
• Agent knowledge management increases productivity
• “Solutions” eliminate repeat inquiries
• Web knowledge base enables “customer self-service”
Cons:
• Requires expensive knowledge engineering
• FAQs & static knowledge not good enough … leading to increased call volume
• Knowledge base only contains what company knows
CURRENT “MASS COLLABORATION”
Support Newsgroups
Pros (+):
• Many high-tech leaders offer informal support newsgroups or message boards
• Small circles of user enthusiasts actively use them
• Low-cost way to tap into the expertise of thousands …
Cons:
• Low “signal to noise” ratio (designed for “social conversations”)
• Hard to find existing “solutions” … similar questions asked over & over again
• Threaded discussions not popular with novice users
QUIQ MASS COLLABORATION
[Diagram labels: Support Newsgroups; Call Center; Support Knowledge Base; Few Experts; Many Experts; Interactions; Solutions]
TYPICAL SERVICE CHAIN
• 40%: FAQ, Self Service Knowledge Base ($)
• 50%: Auto Email, Manual Email, Chat ($$)
• 10%: Call Center, 2nd Tier Support ($$$)
SERVICE CHAIN WITH QUIQ
• 80%: QUIQ Mass Collaboration, QUIQ Self Service ($)
• 15%: Manual Email, Chat ($$)
• 5%: Call Center, 2nd Tier Support ($$$)
CASE STUDIES: COMPAQ
“In newsgroups, conversations disappear and you have to ask
the same question over and over again. The thing that makes
the real difference is the ability for customers to collaborate
and have information persistent. That’s how we found QUIQ.
It’s exactly the philosophy we’re looking for.”
“Tech support people can’t keep up with generating content
and are not experts on how to effectively utilize the product …
Mass Collaboration is the next step in Customer Service.”
– Steve Young, VP of Customer Care, Compaq
CASE STUDIES: NI
“To reduce service costs and provide value, B-to-B sites must
deploy a Meta-Service Network that permits customer-to-customer collaboration. Companies should seek out vendors
that have domain experience, such as QUIQ, to assist in
deploying such a network.
Austin-based National Instruments deployed such a Network to
capture the specialized knowledge of its clients and take the
burden off its costly support engineers, and is pleased with the
results. QUIQ increased customers’ participation, flattened call
volume and continues to do the work of 50 support engineers.”
– David Daniels, Jupiter Media Metrix
CASE STUDIES
“iPlanet relies almost entirely on its 100,000 registered users to serve as a
virtual help line. Each question answered this way saves iPlanet between $50
and $100.”
– Franz Aman, Director of iMarketing, iPlanet
“…I am thrilled that I found the [QUIQ] forum now. I will be able to solve my
problems…” “…the [QUIQ] forum is best because there are SO MANY
people having to fix problems… I look to other experienced users and plug
away…”
– QUIQ end-users
“There is no better place to make customers for life than during their support
interactions… Forums can be powerful retention tools because they create
community and build loyalty, not only to your company, but to your customer
base as well”
– Hans Peter Brondmo, author of “The Engaged Customer”
DATA MANAGEMENT FOR MASS COLLABORATION
MASS COLLABORATION
Communities + Knowledge Management + Service Workflows
• Content driven by users; changes rapidly.
• Interactions must be structured to encourage creation of “solutions”.
• Search central to giving user best available solution, avoiding noise.
• Notifications drive participation, routing.
– Extension of search; scalable triggers.
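One way to read "notifications as an extension of search" is as standing queries: each subscription is a stored query, and a new post is matched against the stored queries instead of the other way around. The sketch below is a minimal illustration under that assumption, not QUIQ's implementation; `Subscription`, `TriggerIndex`, and the all-terms matching rule are hypothetical names and choices made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    """A standing query: notify `user` when a post contains all of `terms`."""
    user: str
    terms: frozenset


class TriggerIndex:
    """Hypothetical trigger index keyed by term, so a new post only touches
    subscriptions that share at least one word with it."""

    def __init__(self):
        self.by_term = {}

    def subscribe(self, user, terms):
        sub = Subscription(user, frozenset(t.lower() for t in terms))
        for term in sub.terms:
            self.by_term.setdefault(term, set()).add(sub)

    def notify(self, post_text):
        words = set(re.findall(r"[a-z0-9]+", post_text.lower()))
        candidates = set()
        for word in words:
            candidates |= self.by_term.get(word, set())
        # A subscription fires only if every one of its terms is in the post.
        return sorted({s.user for s in candidates if s.terms <= words})


triggers = TriggerIndex()
triggers.subscribe("alice", ["Presario", "IP"])
triggers.subscribe("bob", ["Athlon"])
print(triggers.notify("How can I configure the IP address on my Presario?"))
```

Indexing subscriptions by term keeps the work per new post proportional to the subscriptions that could match, which is one plausible reading of "scalable triggers".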
SEARCH AND INDEXING
Text plus metadata, updated constantly
• Quality and performance
– Must exploit metadata to improve quality of results, in addition to considering text.
– Must be fast!
• Control
– Enterprise customers demand ability to “tune” search behavior.
• Timeliness
– Can’t afford to index once a day.
SEARCH AND INDEXING
• KB of Qs and As, each with lots of metadata
– Author status, popularity, date info, approval status, etc.
• User types in “How can I configure the IP address on my Presario?”
– Need to find most relevant content that is of high quality and is approved for external viewing.
• User decides to post question because no good answer was found in the KB.
– Search controls when experts and other users will see this new question; need to make this (near) real time.
– Concurrency, recovery issues!
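The scenario above can be sketched with a small in-memory inverted index in which a newly posted question becomes searchable as soon as it is added. This is an illustrative sketch only: `LiveIndex`, its fields, and the ranking rule (term overlap, then popularity) are assumptions for the example, and it ignores the concurrency and recovery issues the slide flags.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase word set; punctuation is dropped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class LiveIndex:
    """Hypothetical inverted index over Q&A text plus metadata."""

    def __init__(self):
        self.docs = {}
        self.postings = defaultdict(set)

    def add(self, doc_id, text, approved, popularity):
        # The document is visible to search immediately after this call.
        self.docs[doc_id] = {"approved": approved, "popularity": popularity}
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query, limit=5):
        hits = defaultdict(int)
        for term in tokenize(query):
            for doc_id in self.postings.get(term, ()):
                if self.docs[doc_id]["approved"]:  # external viewing only
                    hits[doc_id] += 1
        ranked = sorted(
            hits, key=lambda d: (-hits[d], -self.docs[d]["popularity"], d)
        )
        return ranked[:limit]


index = LiveIndex()
index.add("a1", "Configure the IP address on a Presario from Network Settings", True, 40)
index.add("a2", "Presario IP address internal draft", False, 90)
index.add("a3", "Replacing the Presario battery", True, 10)
query = "How can I configure the IP address on my Presario?"
print(index.search(query))
index.add("q9", "Presario IP address conflict after configure", True, 0)
print(index.search(query))
```

The unapproved document never appears in results, and the question added last shows up in the very next search without a separate batch indexing step.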
DBMS vs. IR
• Database systems and IR systems have developed as independent silos.
– DB: Flexible tables, queries; concurrency control, recovery
– IR: Fast text search; based on “relevance secret sauce”, with little user control
• Mass collaboration requires a hybrid system.
A HYBRID DB-IR SYSTEM
• Searches are queries that can specify boolean filters, and control relevance:
– Degree of match
– Quality of matching document
• Can effectively leverage metadata about text, including some obtained by data mining.
• Data indexed (near) real-time.
• Foundation of QUIQ’s mass collaboration application.
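A hybrid query of this kind can be sketched as a boolean filter over metadata (the database side) followed by a relevance score that the caller can tune (the IR side). The function below is a rough sketch, not QUIQ's query language: `hybrid_search`, the `quality` field, and the linear weighting of match degree and quality are all assumptions made for illustration.

```python
def hybrid_search(docs, query_terms, filters, match_weight=1.0, quality_weight=1.0):
    """Filter on exact metadata values, then rank by a tunable blend of
    degree of match and document quality (both assumed to lie in [0, 1])."""
    terms = {t.lower() for t in query_terms}
    if not terms:
        return []
    scored = []
    for doc in docs:
        # Boolean filter: every requested metadata field must match exactly.
        if any(doc.get(key) != value for key, value in filters.items()):
            continue
        words = set(doc["text"].lower().split())
        match = len(terms & words) / len(terms)  # degree of match
        if match == 0:
            continue
        score = match_weight * match + quality_weight * doc["quality"]
        scored.append((score, doc["id"]))
    return [doc_id for score, doc_id in sorted(scored, key=lambda r: (-r[0], r[1]))]


docs = [
    {"id": "d1", "text": "configure ip address presario", "approved": True, "quality": 0.2},
    {"id": "d2", "text": "presario ip tips", "approved": True, "quality": 0.9},
    {"id": "d3", "text": "configure ip address presario", "approved": False, "quality": 1.0},
]
query = ["configure", "ip", "address"]
print(hybrid_search(docs, query, {"approved": True}))
print(hybrid_search(docs, query, {"approved": True}, match_weight=3.0))
```

With the default weights the high-quality partial match ranks first; raising `match_weight` puts the exact match on top. That is the kind of "tuning" an enterprise customer might ask for.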
DATA MINING TASKS
• There is a lot of insight to be gained by analyzing the data.
– What will help the user with his problem?
– Who does a given user trust?
– Identify high-quality content.
– Summarize content.
– Who can answer this question?
• Question: What does it take to leverage this insight?
LEVERAGING DATA MINING
• How do we get at the data?
– Relevant information is distributed across several sources, not just the DBMS.
• How do we incorporate the insights obtained by mining into the search phase?
– Need to constantly update info about every piece of content (Qs, As, users …)
LEVERAGING DATA MINING
• Three-step approach:
– Off-line analysis to gather new insight
– Periodic refresh of KB and/or indexes
– Use insight (from KB/index) to improve search
• “Periodically” updating an “offline” index is the key idea behind:
– Supporting (near) real-time search
– Incorporating mining results into search
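The three steps can be illustrated with a toy pipeline: an off-line pass turns a ratings log into per-answer quality scores, a refresh swaps those scores into the search structure, and search then ranks with whatever snapshot is current. This is a sketch under stated assumptions; `analyze_offline`, `QualityBoostedSearch`, and the "fraction rated helpful" quality measure are hypothetical and stand in for whatever mining the real system performs.

```python
def analyze_offline(ratings):
    """Step 1 (off-line): turn (answer_id, helpful) events into a quality
    score per answer, here simply the fraction of ratings marked helpful."""
    totals = {}
    for answer_id, helpful in ratings:
        up, n = totals.get(answer_id, (0, 0))
        totals[answer_id] = (up + int(helpful), n + 1)
    return {answer_id: up / n for answer_id, (up, n) in totals.items()}


class QualityBoostedSearch:
    def __init__(self, answers):
        self.answers = answers  # answer_id -> text
        self.quality = {}       # snapshot from the most recent refresh

    def refresh(self, scores):
        """Step 2 (periodic): swap in a new snapshot of mined scores."""
        self.quality = dict(scores)

    def search(self, term):
        """Step 3 (on-line): rank matching answers by the current snapshot."""
        hits = [a for a, text in self.answers.items()
                if term.lower() in text.lower().split()]
        return sorted(hits, key=lambda a: (-self.quality.get(a, 0.0), a))


engine = QualityBoostedSearch({
    "a1": "reset presario bios",
    "a2": "presario driver update",
})
print(engine.search("presario"))  # no mined scores yet
log = [("a2", True), ("a2", True), ("a1", False), ("a1", True)]
engine.refresh(analyze_offline(log))
print(engine.search("presario"))  # ranking now reflects the refresh
```

Search never waits on the analysis: it reads the last snapshot, so the mining job can run as slowly or as often as resources allow.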
A LIST OF CHALLENGES
• Similarity (real-time)
• Matching (real-time)
• Trends (off-line)
• Correlation (off-line)
The Similarity Problem
• Find users with similar tastes, in context.
– Joe’s looking at an Athlon processor; which users are similar to Joe in their PC tastes? Whose recommendations is Joe likely to follow?
• Find similar content, in context.
– Which processors are similar in that they appeal to the same groups of people?
– Which processors are similar in that they have similar performance characteristics?
– Which articles appeal to the same people?
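A common way to make "similar tastes, in context" concrete is to compare users' rating vectors restricted to the items in the current context. The sketch below uses cosine similarity for that comparison; this is one standard choice, not something the slides prescribe, and the ratings data, `similar_users`, and the context sets are invented for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similar_users(ratings, target, context_items):
    """Rank other users by similarity to `target`, counting only ratings
    for items that belong to the current context."""
    def in_context(user):
        return {i: r for i, r in ratings[user].items() if i in context_items}

    base = in_context(target)
    scored = [(cosine(base, in_context(u)), u) for u in ratings if u != target]
    return [u for score, u in sorted(scored, key=lambda p: (-p[0], p[1])) if score > 0]


ratings = {
    "joe": {"athlon": 5, "duron": 4, "jazz_cd": 1},
    "ann": {"athlon": 5, "duron": 5, "jazz_cd": 5},
    "raj": {"pentium": 5, "jazz_cd": 1},
    "lee": {"athlon": 1, "pentium": 5},
}
pc_context = {"athlon", "duron", "pentium"}
print(similar_users(ratings, "joe", pc_context))
print(similar_users(ratings, "joe", {"jazz_cd"}))
```

The same pair of users can look similar in one context and unrelated in another, which is why the context restriction happens before the similarity is computed.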
The Matching Problem
• Match user to data, in context.
– What related information should you recommend to Joe when he is looking at the Athlon PC product?
• Related products: graphics cards, monitors
• Related reviews, discussions
• If Joe’s been looking only at AMD products, other AMD chips; if not, show alternatives from Intel
• Match data to user, in context.
– Which expert is best qualified to answer Joe’s question?
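The "which expert is best qualified" question can be approximated by building a word profile for each expert from their past answers and scoring a new question against those profiles. The sketch below is a deliberately simple illustration, assuming an answer log of (expert, text) pairs; `build_profiles`, `best_expert`, and the word-count scoring are hypothetical, and a real system would also weigh answer quality and availability.

```python
from collections import Counter


def build_profiles(answer_log):
    """Count the words each expert has used in past answers."""
    profiles = {}
    for expert, text in answer_log:
        profiles.setdefault(expert, Counter()).update(text.lower().split())
    return profiles


def best_expert(profiles, question):
    """Return the expert whose past answers overlap most with the question,
    or None if nobody has written about any of its words."""
    words = set(question.lower().split())
    scores = {e: sum(counts[w] for w in words) for e, counts in profiles.items()}
    # Iterating in sorted order makes ties resolve alphabetically.
    expert = max(sorted(scores), key=lambda e: scores[e], default=None)
    if expert is None or scores[expert] == 0:
        return None
    return expert


log = [
    ("maria", "overclocking the athlon processor safely"),
    ("maria", "athlon cooling fan"),
    ("chen", "presario ip address setup"),
]
profiles = build_profiles(log)
print(best_expert(profiles, "which fan for my athlon"))
print(best_expert(profiles, "printer jam"))
```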
The Trends Problem
• Identify trends in sales.
• Identify trends in overall user preferences, user segmentation.
• Identify trends for individual users.
• Identify trends in overall product popularity, product segmentation.
• Identify trends for specific products.
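For product-specific trends, one simple off-line measure is the least-squares slope of a product's activity counts over equally spaced periods. The sketch below assumes such counts are available (for example, weekly page views); the data, `slope`, `trending`, and the `min_slope` threshold are made up for illustration and are not taken from the talk.

```python
def slope(series):
    """Least-squares slope of a series sampled at equally spaced times."""
    n = len(series)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def trending(counts_by_product, min_slope=1.0):
    """Products whose counts grow by at least `min_slope` per period,
    fastest-growing first."""
    slopes = {p: slope(s) for p, s in counts_by_product.items()}
    rising = [p for p, m in slopes.items() if m >= min_slope]
    return sorted(rising, key=lambda p: (-slopes[p], p))


weekly_views = {
    "athlon": [10, 12, 14, 16],
    "duron": [8, 8, 7, 8],
    "pentium": [20, 15, 10, 5],
}
print(trending(weekly_views))
```

The same function applies per user or per segment by changing what the series counts, which matches the slide's split between overall and individual trends.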
The Correlations Problem
• Given a set of trends (e.g., in pricing) track the impact on other trends.
– Are there correlated trends?
– Are there causal relationships?
• Note that correlating a given trend to an overall trend is hard enough, but trying to find all other individual or product-specific trends that happen to be correlated is much harder!
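The difference in difficulty can be made concrete with Pearson correlation: checking one given trend against n others takes n comparisons, while finding every correlated pair takes n(n-1)/2. The sketch below assumes trends are equal-length numeric series; the data and the 0.8 threshold are illustrative only. A high correlation says nothing about causation, which is the slide's separate question.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def correlated_with(target, others, threshold=0.8):
    """One given trend against n others: n comparisons."""
    return sorted(name for name, series in others.items()
                  if abs(pearson(target, series)) >= threshold)


def all_correlated_pairs(series, threshold=0.8):
    """Every pair of trends: n * (n - 1) / 2 comparisons."""
    return [(a, b) for a, b in combinations(sorted(series), 2)
            if abs(pearson(series[a], series[b])) >= threshold]


price = [100, 90, 80, 70]
others = {"athlon_sales": [10, 14, 19, 25], "forum_posts": [5, 9, 4, 8]}
print(correlated_with(price, others))
print(all_correlated_pairs({"price": price, **others}))
```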