Mining for What’s Missing:
How to Find What’s Not in the
Speech Application’s Vocabulary
AMY NEUSTEIN, Ph.D.
LINGUISTIC TECHNOLOGY SYSTEMS
[email protected]
SpeechTEK 2004
First Problem:
Critical business intelligence data is
lost in a sea of recorded calls when
callers use words outside of the
application’s vocabulary
Second Problem:
Early warning signs of caller
frustration are hard to detect
when callers do not use
expected “keywords” from the
application’s vocabulary to
express frustration
Third Problem:
To build a Statistical Language Model to
accommodate all the ways users might
express themselves would require a very
large data corpus that is costly to
assemble;
and still there would be no guarantee
that an accurate word match would be
found.
THE SOLUTION:
SEQUENCE PACKAGE
ANALYSIS
A new natural language intelligence
method that has been peer reviewed
and cited by other researchers as a data
mining method for call center quality
monitoring.
METHODOLOGY
SPA draws mainly from the field of
conversation analysis:
the study of the orderly properties of
interactive dialog that revolve around the
turn-taking process,
and of other sequentially based features that
are part of that process, such as spacing
between turns and overlap of turns.
How Does Sequence Package
Analysis (SPA) Work?
SPA parses NL dialog to locate a series of
related turns, discretely packaged as a
sequence of conversational interaction.
SPA locates generic sequence packages,
rather than isolated key words, because
speakers are more likely to vary in their
choice of words than in their basic
conversational sequence patterns.
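As a rough illustration of what "a series of related turns" could look like as data, the sketch below splits a plain-text transcript into speaker turns and groups consecutive turns into fixed-size windows. This is not the SPA implementation, which the slides do not specify. The `Turn` class, the `Caller:`/`Agent:` line format, and the window size of three are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # "Caller" or "Agent" (assumed labels)
    text: str


def parse_turns(transcript: str) -> List[Turn]:
    """Split 'Speaker: text' lines into turns; unlabeled lines continue the previous turn."""
    turns: List[Turn] = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() in {"Caller", "Agent"}:
            turns.append(Turn(speaker.strip(), text.strip()))
        elif turns:
            # Continuation of the current speaker's turn.
            turns[-1].text += " " + line
    return turns


def sequence_windows(turns: List[Turn], size: int = 3) -> List[List[Turn]]:
    """Return every run of `size` consecutive turns (a hypothetical stand-in for a sequence package)."""
    return [turns[i:i + size] for i in range(len(turns) - size + 1)]
```

A window such as Caller, Agent, Caller can then be examined as a unit, rather than scanning each utterance for isolated keywords.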
WHERE DOES SPA FIT ON THE
SPEECH RECOGNIZER?
SPA provides a “filter” for the front end of a
speech recognizer, using generic templates that
can be deployed in many different applications
and languages.
A SPA “add-on” layer can be used with
conventional vector-based n-gram language
models, which hold spaces and determine
“global weighting” of specific lexical items.
MINING HELP-LINE CALLS
Using SPA to caption the text of a
help-line call to capture signs of
caller frustration
SPA mining tools are based on the detection of
conversational sequence patterns rather than
solely on word spotting (“get me a supervisor!”)
or changes in prosody (e.g., increased pitch)
While speakers can vary widely in their choice of
words or in stress patterns, conversational
sequence patterns are more consistent across a
wide spectrum of callers
Australian Help-Line Desk
Caller: “I’ve installed Office 97 and…I was a bit
stupid. I went into uninstall and um pulled off a
whole stack of items off the uninstall and it was a
very silly thing to do so now when I start up my
computer I get a screen um which say um a black-
a black and white screen which says never delete
this item. It’s a message screen and every time I
start up it comes up……[deleted text]……”
Caller: “I’m wondering if I reinstall will I wipe out
[my documents]”
Agent: “Okay, well look I could certainly have a
technician look at the problem for you; we do
charge for are you aware of that?”
Caller: “I’m just asking a question - I’m just
wondering whether or not I should uninstall
Microsoft Word?”
Using SPA to Find CONVERSATIONAL
SEQUENCE PATTERNS in this Dialog Sample
Step One: Locate the pre-question
phrases of reports of troubles and
requests for assistance:
“I’m wondering if”
“I’m just asking a question”
“I’m just wondering whether or not”
Step Two: Quantify the number of times
and the proximity of such pre-question
phrases.
Step Three: Determine whether they escalate or,
alternatively, diminish.
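The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SPA method itself. The phrase list contains only the three pre-question phrases quoted above, "proximity" is approximated as the number of caller turns between phrase-bearing turns, and the escalation rule (compare the first and last phrase-bearing turns) is a simplification chosen for the example. All function names are hypothetical.

```python
import re
from typing import List, Tuple

# Step One: pre-question phrases taken from the dialog sample (lowercased).
PRE_QUESTION_PHRASES = [
    "i'm just wondering whether or not",
    "i'm just asking a question",
    "i'm wondering if",
]


def normalize(text: str) -> str:
    """Lowercase and map typographic apostrophes to plain ones."""
    return text.lower().replace("\u2019", "'")


def find_pre_questions(turn_text: str) -> List[Tuple[int, str]]:
    """Return (character offset, phrase) for each pre-question phrase in one turn."""
    text = normalize(turn_text)
    hits = []
    for phrase in PRE_QUESTION_PHRASES:
        for match in re.finditer(re.escape(phrase), text):
            hits.append((match.start(), phrase))
    return sorted(hits)


def count_per_turn(caller_turns: List[str]) -> List[int]:
    """Step Two (count): number of pre-question phrases in each caller turn."""
    return [len(find_pre_questions(turn)) for turn in caller_turns]


def turn_gaps(counts: List[int]) -> List[int]:
    """Step Two (proximity): distance in caller turns between phrase-bearing turns."""
    indexes = [i for i, count in enumerate(counts) if count > 0]
    return [later - earlier for earlier, later in zip(indexes, indexes[1:])]


def trend(counts: List[int]) -> str:
    """Step Three: compare the first and last phrase-bearing turns."""
    nonzero = [count for count in counts if count > 0]
    if len(nonzero) < 2:
        return "insufficient data"
    if nonzero[-1] > nonzero[0]:
        return "escalating"
    if nonzero[-1] < nonzero[0]:
        return "diminishing"
    return "steady"


caller_turns = [
    "I’ve installed Office 97 and I was a bit stupid.",
    "I’m wondering if I reinstall will I wipe out [my documents]",
    "I’m just asking a question - I’m just wondering whether or not "
    "I should uninstall Microsoft Word?",
]
counts = count_per_turn(caller_turns)
print(counts, turn_gaps(counts), trend(counts))
```

On the three caller turns from the help-line sample, the per-turn counts rise from zero to one to two, which this simplified rule labels as escalating.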
ANALYSIS
The caller to the Australian help-line began
her complaint as a long-winded narrative,
but with the noticeable absence of a
request for help.
The caller later produced pre-question phrases
when she made her request for help.
However, these phrases began to escalate (by
being combined with one another) just at the point
where she began to show signs of frustration:
“I’m just asking a question - I’m just wondering
whether or not I should uninstall Microsoft Word?”
CODA
Conventional data mining programs
would have “missed” these signs of caller
frustration in that they try to locate
keywords and phrases:
“get me a supervisor”
“I’m frustrated because I’m really not
getting answers to my questions.”
SPA offers an add-on layer to
mining programs in order to locate
what is missing.
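To make the point about keyword-based mining concrete, the short sketch below runs a plain phrase spotter over the caller's frustrated turn from the help-line sample. The keyword list holds only the two example phrases quoted above. The function name and the matching rule (case-insensitive substring search) are assumptions for illustration and do not describe any particular mining product.

```python
from typing import List

# The two example keyword phrases quoted in the slides, lowercased.
KEYWORD_PHRASES = [
    "get me a supervisor",
    "i'm frustrated because i'm really not getting answers to my questions",
]


def keyword_hits(utterance: str) -> List[str]:
    """Return the keyword phrases found in the utterance (case-insensitive)."""
    text = utterance.lower().replace("\u2019", "'")
    return [phrase for phrase in KEYWORD_PHRASES if phrase in text]


frustrated_turn = (
    "I’m just asking a question - I’m just wondering whether or not "
    "I should uninstall Microsoft Word?"
)
print(keyword_hits(frustrated_turn))  # prints [] because no keyword phrase occurs
```

The spotter returns nothing for this turn, even though the stacked pre-question phrases mark it as the point where the caller's frustration surfaces.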