SEQUENCE PACKAGE ANALYSIS:
A NEW WAY TO UNDERSTAND NATURAL
LANGUAGE DATA ACROSS DIFFERENT
LANGUAGES AND DIALECTS
AMY NEUSTEIN, Ph.D.
LINGUISTIC TECHNOLOGY SYSTEMS
[email protected]
LISA FORUM USA
WASHINGTON, D.C.
DECEMBER 8-12
2003
WHY DO WE NEED A NEW NATURAL
LANGUAGE METHOD?
1) In the real world, speakers do not always use
“key” words that can be spotted in a dialog.
2) Dialects can vary so greatly - and be replete
with so many idioms - that the words in the
application vocabulary can provide a very poor
match for what the user actually says.
3) The costs of designing natural language
applications to accommodate so many different
languages and dialects can be high.
How Does Sequence Package Analysis
(SPA) Work?
SPA is based on Conversation Analysis, which studies
dialog as a socially organized activity in which speakers
make requests, report on troubles, ask for help, etc.
Because dialog is a social activity, SPA focuses on the way
speakers organize their interactive dialog as a series of
related turns and parts of turns that are “packaged” as a
sequence rather than as isolated key words and phrases.
By looking for conversational sequence patterns rather
than “key” words, a voice application system can adapt to
a wide variety of dialects and different languages by not
restricting its application vocabulary to a preset lexicon.
Illustration
A caller needs a service call. Rather than use
words in the application vocabulary, such as “service
call” or “technician,” this is what the frustrated caller
says to the IVR-driven auto attendant at the help-line
desk of the customer care and contact center:
Caller: “I really can’t do this myself. I can’t get this
to work without someone coming here. I really don’t
know what to do with this.”
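
To make the mismatch concrete, here is a minimal sketch of a plain
keyword spotter run against the caller's words. It is not part of SPA.
Only “service call” and “technician” come from the slide; the other
vocabulary entries and the function name spot_keywords are assumptions
made for illustration.

```python
import re

# Hypothetical application vocabulary. Only "service call" and "technician"
# appear in the slide; "repair" and "appointment" are assumed for illustration.
APPLICATION_VOCABULARY = ["service call", "technician", "repair", "appointment"]


def spot_keywords(utterance, vocabulary=APPLICATION_VOCABULARY):
    """Return the vocabulary entries that occur verbatim in the utterance."""
    text = utterance.lower()
    return [
        term
        for term in vocabulary
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    ]


caller = (
    "I really can't do this myself. I can't get this to work "
    "without someone coming here. I really don't know what to do with this."
)

print(spot_keywords(caller))  # -> []
```

The spotter returns an empty list: none of the caller's words match the
vocabulary, even though the request for a service call is clear to a
human listener.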
Finding the Sequence Package in the
Dialog Example
Look for a concatenation of the following
utterance components:
• the use of an anaphor - a word that refers
back to a prior word or group of words - with the
noticeable absence of its referent (“I really
can’t do this myself”)
• the amplification of the source of the trouble but
with the frequent use of pronouns that have no
stated subject/object referents (“I can’t get this
to work without someone coming here”)
• a recycle of the first part of the complaint (“I
really don’t know what to do with this”)
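
The three components above can be approximated with a rough heuristic,
sketched below. This is not the SPA algorithm itself, which the slides
do not specify. The word lists, the token-overlap measure used to detect
a “recycle,” the 0.3 threshold, and all function names are assumptions
chosen for illustration. A small list of known referent nouns stands in
for real coreference resolution.

```python
import re

# All word lists below are illustrative assumptions, not part of SPA.
ANAPHORS = {"this", "that", "it", "these", "those"}
TROUBLE_MARKERS = {"can't", "cannot", "don't", "won't", "unable"}
# Stand-in for coreference resolution: nouns an anaphor could refer back to.
KNOWN_REFERENTS = {"printer", "modem", "router", "phone", "computer"}


def tokenize(unit):
    """Lowercase a unit, normalize curly apostrophes, and split into words."""
    return re.findall(r"[a-z']+", unit.lower().replace("\u2019", "'"))


def split_units(utterance):
    """Split an utterance into sentence-like units on ., ! and ?."""
    return [u.strip() for u in re.split(r"[.!?]+", utterance) if u.strip()]


def overlap(a, b):
    """Jaccard overlap between two token lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def find_sequence_package(utterance, recycle_threshold=0.3):
    """Return the roles of the units forming the package, or None."""
    units = [tokenize(u) for u in split_units(utterance)]
    seen = set()   # every token heard so far
    flagged = []   # units reporting trouble with an anaphor lacking a referent
    for i, tokens in enumerate(units):
        seen |= set(tokens)
        dangling = bool(ANAPHORS & set(tokens)) and not (KNOWN_REFERENTS & seen)
        if TROUBLE_MARKERS & set(tokens) and dangling:
            flagged.append(i)
    if len(flagged) < 3:
        return None
    first, last = flagged[0], flagged[-1]
    # The last flagged unit counts as a recycle only if it echoes the first.
    if overlap(units[first], units[last]) < recycle_threshold:
        return None
    return {"complaint": first, "amplification": flagged[1:-1], "recycle": last}


caller = (
    "I really can't do this myself. I can't get this to work "
    "without someone coming here. I really don't know what to do with this."
)

print(find_sequence_package(caller))
# -> {'complaint': 0, 'amplification': [1], 'recycle': 2}
```

On the caller's utterance, the sketch labels the first unit as the
complaint, the second as the amplification, and the third as the
recycle. An utterance that names its referent, or that contains fewer
than three such units, yields None. A production system would need
proper parsing and anaphora resolution in place of these word lists.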
ANALYSIS
Since natural language systems promote natural
and unscripted dialog, callers are more likely to
use pronouns and other indirect referents - and
sometimes even lapse into repetitions and
circumlocutions - than carefully chosen key
words.
SPA uses algorithms that conform to how callers
engaged in the social activity of interactive dialog
truly express themselves.
Since the algorithms are based on conversational
sequence patterns rather than a preset lexicon,
they can be easily modified to adapt to the
conversational sequence patterns indigenous to
other languages.
Sequence Package Analysis
A New Way to Understand Natural Language Across
Different Dialects and Languages
Amy Neustein, Ph.D.
Founder and CEO
Linguistic Technology Systems
[email protected]