Transcript raytheon08
Spoken Dialog Systems
Diane J. Litman
Professor, Computer Science Department
Spoken Dialog Systems
Systems that interact with users via speech
Provide automated telephone or microphone
access to a back-end
Advantages: naturalness, efficiency, eyes
and hands free
[Diagram: user, Speech Recognition, Spoken Dialog System, back-end (DB, web, system), TTS or recording]
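The components named on this slide can be sketched as a single dialog turn. This is a minimal illustration only: the class name, the stand-in functions, and the toy back-end entry are all hypothetical, and text stands in for real audio.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpokenDialogSystem:
    """One user turn: speech recognition -> dialog system/back-end -> TTS or recording."""
    recognize: Callable[[bytes], str]    # speech recognition: audio -> text
    respond: Callable[[str], str]        # dialog system plus back-end: text -> reply text
    synthesize: Callable[[str], bytes]   # TTS or recording: reply text -> audio

    def turn(self, audio: bytes) -> bytes:
        text = self.recognize(audio)
        reply = self.respond(text)
        return self.synthesize(reply)


# Hypothetical stand-ins so the sketch runs without any speech library.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8").lower()


# Toy back-end (made-up entry), standing in for a DB, the web, or another system.
BACK_END = {"next bus": "The next bus leaves at noon."}


def fake_respond(text: str) -> str:
    return BACK_END.get(text, "Sorry, I did not understand.")


def fake_synthesize(reply: str) -> bytes:
    return reply.encode("utf-8")


system = SpokenDialogSystem(fake_recognize, fake_respond, fake_synthesize)
reply_audio = system.turn(b"Next bus")
print(reply_audio.decode("utf-8"))
```

Keeping the three stages as separate callables means a recognizer or synthesizer could be swapped without touching the dialog logic.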
Challenges in Spoken Dialog Systems
Automated speech recognition
Sphinx, Microsoft Speech, Dragon NaturallySpeaking
Natural language understanding
Dialog Management
How to keep the conversation going? Best strategy?
How to detect errors in communication?
How to recover from errors?
Spoken language generation
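One common way to approach the error detection and recovery questions is to act on the recognizer's confidence score. The sketch below is an assumption for illustration, not the strategy of any system named in these slides: the function name and both thresholds are hypothetical, and it presumes the recognizer reports a confidence between 0 and 1.

```python
from typing import Tuple


def choose_action(
    hypothesis: str,
    confidence: float,
    reject_below: float = 0.3,   # hypothetical threshold
    confirm_below: float = 0.7,  # hypothetical threshold
) -> Tuple[str, str]:
    """Return (action, prompt) for one recognized user turn."""
    # Detect a likely error: empty result or very low confidence.
    if not hypothesis.strip() or confidence < reject_below:
        return ("reject", "Could you please repeat that?")
    # Recover cautiously: ask the user to confirm a doubtful result.
    if confidence < confirm_below:
        return ("confirm", f"Did you say '{hypothesis}'?")
    # Confident enough to continue the conversation.
    return ("accept", "")


print(choose_action("it is", 0.1))
print(choose_action("moving", 0.5))
print(choose_action("moving", 0.9))
```

Where to set the thresholds is itself a strategy question: too many rejections frustrate the user, while too few let misrecognitions through.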
Application areas I have worked on
AT&T
Pitt
Phone-based Information Access
Call Centers
Social Networking Systems
(Physics) Tutoring
Backup for Port Authority human operators
Other Interests
Training, Troubleshooting, PDAs
Speech-based Computer Tutors
What are they?
Example
Tutor: Well, if an object has non zero constant velocity, is it moving
or staying still?
Student: Moving
Tutor: Yep. If it’s moving, then its position is changing. So then
what will happen to the packet’s horizontal displacement from the
point of its release?
Student: It will change
Intersection of two fields:
Spoken Dialog Systems
Intelligent Tutoring Systems
Intelligent Tutoring Systems
Education
Classroom instruction [most frequent form]
Human (one-on-one) tutoring [most effective form]
Computer tutors – Intelligent Tutoring Systems
Not as good as human tutors
Ways to address the performance gap
(Spoken) dialog systems
Affective (dialog) systems
• Back-end is Why2-Atlas system [VanLehn, Jordan, Rose et al. 2002]
• Sphinx2 speech recognition and Cepstral text-to-speech
Current Research Directions
Automatic System Optimization
Can a system learn to optimize behavior based on prior data?
Evaluation
How can we tell if we are improving a system?
Can systems be tested with simulated rather than real users?
Affective Computing
Speech vs. keyboard, TTS vs. recordings, graphics vs. transcripts
How can user emotions be predicted in real-time?
How can the system exploit such information?
Prosodic and Linguistic Analysis
Respond to both what a user says, and how it is said
Human-Computer Excerpt
Tutor26: Let me ask my question in a more general way. What happens
to the velocity of a body when there is no force acting on it?
Student27: dammit (ASR: it is)
Tutor28: Could you please repeat that?
Student29: same (ASR: i same)
Tutor30: Right. And what happens to the HORIZONTAL velocity of a
body when there is no HORIZONTAL force acting on it?
Student31: zero (ASR: the zero)
Tutor32: Well... OK, let's consider how forces in the vertical direction
affect the motion of a body in a horizontal direction in a different
scenario <…omitted…>
Student33: oh gravity you already said this (ASR: beats gravity you're
exert less)
Tutor34: Fine. Are there any other forces acting on the apple as it falls?
Student35: no why are you doing this again (ASR: no y and to it yes)
Tutor36: Correct. The only force acting on the apple as it falls to the
earth is the earth's gravitational force. In which direction does
gravitational force act?
Student37: downward you computer (ASR: downward you computer)
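The gap between what the student said and what the recognizer heard in this excerpt can be quantified with word error rate (WER): word-level edit distance divided by the number of words actually spoken. The sketch below is an illustration, not the evaluation used for the system. The utterance pairs are copied from the excerpt above; lowercasing and whitespace tokenization are assumptions.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# (what the student said, what the recognizer output), from the excerpt.
EXCERPT = [
    ("dammit", "it is"),
    ("same", "i same"),
    ("zero", "the zero"),
    ("oh gravity you already said this", "beats gravity you're exert less"),
    ("no why are you doing this again", "no y and to it yes"),
    ("downward you computer", "downward you computer"),
]

errors = 0
ref_words = 0
for said, heard in EXCERPT:
    ref, hyp = said.lower().split(), heard.lower().split()
    errors += edit_distance(ref, hyp)
    ref_words += len(ref)

wer = errors / ref_words
print(f"{errors} word errors over {ref_words} spoken words, WER = {wer:.2f}")
```

WER can exceed 1.0 for a single utterance, as with the first pair, because inserted words count as errors while the denominator counts only the spoken words.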
Thank You! Questions?
Further Information
http://www.cs.pitt.edu/~litman/itspoke.html