Transcript cs2001_08
Speech, Natural Language, & Affect
in Tutorial Dialogue Systems
Prof. Diane J. Litman
Computer Science Department,
Intelligent Systems Program, &
Learning Research and Development Center
http://www.cs.pitt.edu/~litman
A few words about me…
Currently
Professor in CS and ISP
Research Scientist at LRDC
ITSPOKE research group (3 PhD students, 1 CS ugrad, 2 postdocs, 1
programmer)
AI Research (speech and natural language, intelligent tutoring)
Discourse and dialogue
Prosody, spoken dialogue systems
Speech and language technology for education (take my spring seminar!)
Reinforcement learning, user simulation
Affective computing
AI and education
Cognitive science
Previously
Member Technical Staff, AT&T Labs Research, NJ
Assistant Professor, CS at Columbia University, NY
AI Research (speech and NLP, knowledge representation and reasoning,
plan recognition)
2
Speech-based Computer Tutors
What are they?
Example
Tutor: Well, if an object has non zero constant velocity, is it moving or
staying still?
Student: Moving
Tutor: Yep. If it’s moving, then its position is changing. So then what will
happen to the packet’s horizontal displacement from the point of its
release?
Student: It will change
Intersection of two fields:
Intelligent Tutoring Systems (ITS)
Spoken Dialogue Systems (SDS)
3
Intelligent Tutoring Systems (ITS)
Education
Classroom instruction [most frequent form]
Human (one-on-one) tutoring [most effective form]
Computer tutors – Intelligent Tutoring Systems
Not as good as human tutors
Ways to address the performance gap
Language technologies
Text-based dialogue
Talking heads
Speech-based dialogue: react to how in addition to what
Affective computing
4
Adding speech to ITS
Spoken Dialogue Systems (SDS)
Systems that interact with users via speech
Advantages
Naturalness
Efficiency
Eyes- and hands-free
Domains
Information access [Raux et al., 2005; Rudnicky et al., 1999; Zue et al., 2000]
Tutoring [Graesser et al., 2001; Litman and Silliman, 2004; Pon-Barry et al., 2006]
Assistants [Allen et al., 2001; Rayner et al., 2005; Acomb et al., 2007]
5
Challenges in ITS
What does it mean to teach a subject?
What to teach?
Designing instruction
Delivering instruction
Understanding the human learning process
6
Challenges in SDS
Automated speech recognition (ASR)
Sphinx, Microsoft Speech, Dragon NaturallySpeaking
Natural language understanding (NLU)
Dialogue Management (DM)
How to keep the conversation going? Best strategy?
How to detect errors in communication?
How to recover from errors?
Spoken language generation
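One common way to detect a likely misrecognition is to compare the recognizer's confidence score against a threshold and re-prompt when the score is too low. The sketch below illustrates that idea only. The slides do not say how any particular system decides, and the function name `handle_asr_result`, the 0.5 threshold, and the [0, 1] confidence scale are assumptions made for illustration.

```python
from typing import Tuple

REPROMPT = "Could you please repeat that?"  # hypothetical recovery prompt


def handle_asr_result(hypothesis: str, confidence: float,
                      threshold: float = 0.5) -> Tuple[str, str]:
    """Accept the ASR hypothesis, or reject it and ask the user to repeat.

    `confidence` is assumed to lie in [0, 1]; real recognizers differ.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < threshold:
        # Error detection: too uncertain, so recover with a re-prompt.
        return ("reject", REPROMPT)
    return ("accept", hypothesis)


print(handle_asr_result("it is", 0.2))     # low confidence: re-prompt
print(handle_asr_result("downward", 0.9))  # high confidence: accept
```

A fixed threshold trades false rejections against false acceptances, which is one reason the choice of recovery strategy is listed as an open question.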
7
Outline
ITSPOKE
Main research tools & projects
Comparing systems
Modeling learning
Interactions between phenomena
Other projects
8
How to do research in speech-based
computer tutors
ITSPOKE (Intelligent Tutoring SPOKEn Dialogue System) [Litman and
Silliman, 2004]
Speech-enabled version of the Why2-Atlas computer tutor [VanLehn, Jordan,
Rose et al., 2002]
Domain: Qualitative physics
Sample ITSPOKE problem
Suppose a man is in a free-falling elevator and is holding his keys
motionless right in front of his face. He then lets go. What will be the
position of the keys relative to the man's face as time passes? Explain.
9
• Back-end is Why2-Atlas system [VanLehn, Jordan, Rose et al. 2002]
• Sphinx2 speech recognition and Cepstral text-to-speech
12
Human-Computer Excerpt
Tutor26: Let me ask my question in a more general way. What happens
to the velocity of a body when there is no force acting on it?
Student27: dammit (ASR: it is)
Tutor28 : Could you please repeat that?
Student29 : same (ASR: i same)
Tutor30 : Right. And what happens to the HORIZONTAL velocity of a
body when there is no HORIZONTAL force acting on it?
Student31: zero (ASR: the zero)
Tutor32 : Well... OK, let's consider how forces in the vertical direction
affect the motion of a body in a horizontal direction in a different
scenario <…omitted…>
Student33: oh gravity you already said this (ASR: beats gravity you're
exert less)
Tutor34: Fine. Are there any other forces acting on the apple as it falls?
Student35: no why are you doing this again (ASR: no y and to it yes)
Tutor36: Correct. The only force acting on the apple as it falls to the
earth is the earth's gravitational force. In which direction does
gravitational force act?
Student37: downward you computer (ASR: downward you computer)
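The gap between what the student said and what the recognizer heard in this excerpt can be quantified with word error rate (WER), a standard ASR metric that the slides do not name explicitly. The following is a minimal sketch, assuming whitespace tokenization and case-insensitive matching; the function name is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# (what the student said, what the recognizer output), taken from the excerpt
turns = [
    ("same", "i same"),
    ("zero", "the zero"),
    ("downward you computer", "downward you computer"),
]
for said, heard in turns:
    print(f"{said!r} -> {heard!r}: WER = {word_error_rate(said, heard):.2f}")
```

Note that WER can exceed 1.0 when the hypothesis contains more errors than the reference has words, as with "dammit" recognized as "it is".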
How ITSPOKE/WHY works
Simplified conversation structure
Question-answer format
Tutoring information authored in a hierarchical structure - KCDs
[VanLehn, Jordan, Rosé et al, 2002]
[Diagram: Problem, Essay, Dialogue with ITSPOKE (Q1, Q2, Q3)]
14
ESSAY SUBMISSION & ANALYSIS
ITSPOKE behavior
[Diagram: questions Q1 to Q5, including a remediation subdialogue]
Sample KCD (Knowledge Construction Dialogue)
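The hierarchical question-answer structure with remediation subdialogues can be sketched as a small tree walk. This is an illustration under stated assumptions, not the Why2-Atlas implementation: the `Question` class, the `run_kcd` function, the answer checkers, and the placement of Q4 and Q5 under Q2 are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Question:
    """One tutor question; `remediation` holds an optional subdialogue."""
    label: str
    is_correct: Callable[[str], bool]  # hypothetical answer checker
    remediation: List["Question"] = field(default_factory=list)


def run_kcd(questions: List[Question], answers: Iterator[str]) -> List[str]:
    """Ask questions in order; after a wrong answer, descend into remediation.

    Returns the labels of the questions in the order they were asked.
    """
    visited = []
    for question in questions:
        visited.append(question.label)
        answer = next(answers)
        if not question.is_correct(answer) and question.remediation:
            visited.extend(run_kcd(question.remediation, answers))
    return visited


# Hypothetical KCD: Q2 has a two-question remediation subdialogue.
kcd = [
    Question("Q1", lambda a: a == "moving"),
    Question("Q2", lambda a: a == "it will change",
             remediation=[Question("Q4", lambda a: True),
                          Question("Q5", lambda a: True)]),
    Question("Q3", lambda a: True),
]

print(run_kcd(kcd, iter(["moving", "no idea", "ok", "ok", "done"])))
print(run_kcd(kcd, iter(["moving", "it will change", "done"])))
```

A student who answers Q2 correctly skips the subdialogue, so two students can see different question sequences for the same problem.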
16
Outline
ITSPOKE
Main research tools & projects
Comparing systems
Modeling learning
Interactions between phenomena
Other projects
17
Comparing systems
Metrics
Subjective metrics
Questionnaire at the end – agreement with statements like:
“It was easy to learn from the tutor”
“I enjoyed working with the tutor”
“It was easy to lose track of where I was in the conversation”
Problems
Unreliable
Need for standardization (psychometrics)
18
Comparing systems (2)
Objective metrics
Learning (gain)
Time spent with the computer tutor
Correctness of student answers
Number of help/hint requests
Speech recognition performance
PreTest
PostTest
Learning
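Learning gain is computed from the pretest and posttest scores. The slides do not give a formula, so the sketch below shows two common choices as assumptions: the raw gain (posttest minus pretest) and the normalized gain, which divides by the room left for improvement. Scores are assumed to be fractions in [0, 1].

```python
def learning_gain(pretest: float, posttest: float) -> float:
    """Raw gain: how much the score improved."""
    return posttest - pretest


def normalized_gain(pretest: float, posttest: float) -> float:
    """Gain relative to the maximum possible improvement (1 - pretest)."""
    if pretest >= 1.0:
        raise ValueError("no room for improvement when pretest is 1.0")
    return (posttest - pretest) / (1.0 - pretest)


# Hypothetical student: 40% on the pretest, 70% on the posttest.
print(round(learning_gain(0.4, 0.7), 3))
print(round(normalized_gain(0.4, 0.7), 3))
```

The normalized form makes students with different pretest scores more comparable, since a student who starts high has less left to gain.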
19
Comparing systems - projects
Speech vs. typed tutoring [Litman, Rose, Forbes-Riley et al., 2006]
Human-speech vs. Human-typed
Better posttest scores
Shorter instruction time
Computer-speech (ITSPOKE) vs. Computer-typed (WHY)
No improvement in learning
No degradation due to ASR
Prerecorded vs. Synthesized [Forbes-Riley et al., 2006]
No degradation due to synthesized voice
High ASR errors -> prefer synthesized more
Navigation Map ITSPOKE vs. regular ITSPOKE
[Rotaru and Litman, 2007; Rotaru, PhD thesis]
Affective ITSPOKE vs. regular ITSPOKE
[Forbes-Riley, Litman et al., 2008a, 2008b, in progress]
20
The Navigation Map
Outline of the conversation
Theoretical concept: discourse structure
Hypotheses
Easier for users to follow the conversation
Preferred / learn more
21
Discourse segment hierarchy
Intention/purpose structure
Discourse segments
Solution walkthrough
TUTOR1: To analyze this problem we will first describe the
motion of the person and his keys while he is holding
them. Then we will look at the motion of the person and
his keys after he lets go of them.
Let's begin by looking at the motion of the man
and his keys while he is holding them. How does his
velocity compare to that of his keys?
USER1: it’s the same
TUTOR2: Yes. So while he is holding his keys they both
have the same velocity. Now let's see what happens
when he releases them. To do so we will first find the
individual forces exerted on the man and his keys, then
both net forces, both accelerations, and finally we will
describe and compare the velocities.
So what are the forces exerted on the man after
he releases his keys? Please, specify their directions
(e.g., vertically up).
USER2: vertically down gravity
TUTOR3: Good. Because there is just one force, gravity's, it
is trivial to determine the NET force (i.e., the vector sum
of all forces). So what is the direction of the NET force
on the person?
..............
Navigation Map for the walkthrough above:
Two time frames: before release, after release
Before release
Man’s velocity ? keys’ velocity
After release
Recipe: Forces -> Net force -> Acceleration -> Velocity
Man: Forces/acceleration
Forces on the man
Net force on the man
………….
22
Experiment design
Within-subjects design
Rate tutor after each problem
1 problem with the NM; 1 without the NM (noNM)
16 questions, 1 (Strongly Disagree) – 5 (Strongly Agree) scale
Two conditions (to account for order and problem)
F (First) : 1st problem NM; 2nd problem noNM
S (Second) : 1st problem noNM; 2nd problem NM
Experiment procedure
Read -> Pretest -> Problem 1 -> Questionnaire -> Problem 2 -> Questionnaire -> Posttest -> NM Survey -> Interview
F condition: Problem 1 NM, Problem 2 noNM
S condition: Problem 1 noNM, Problem 2 NM
Differences due to NM
Differences due to NM
23
Experiment design (2)
ITSPOKE dialogue history was disabled
Compare Audio-Only versus Audio+Visual (NM)
[Screenshots: NM and noNM interfaces]
24
Results – subjective metrics
NM trend/significant effects on system perception during the dialogue:
Rating scale: 1 - Strongly Disagree ... 5 - Strongly Agree
Each entry: question, statement rated, p, average rating (NM vs. noNM)
Structure
identify tutoring structure: “... the tutor had a clear and structured agenda behind its explanations” (p = 0.008; 4.4 > 3.9)
follow the tutoring structure: “... it was easy to lose track of where I was in the interaction with the tutor” (p = 0.012; 2.7 < 3.3)
Integration
forward looking integration: “... it was easy to figure out where the tutor's instruction was leading me” (p = 0.017; 4.1 > 3.6)
backward looking integration: “... when the tutor asked me a question I knew why it was asking me that question” (p = 0.054; 4.0 > 3.5)
Correct answers
know the correct answer: “... whenever I answered incorrectly, it was easy to know the correct answer after the tutor corrected me” (p = 0.085; 4.1 > 3.7)
know if correct: “... I knew whether my answer to the tutor's question was correct or incorrect” (p = 0.358; 3.6 > 3.4)
Level of concentration
level of concentration: “... a high level of concentration is required to follow the tutor” (p = 0.004; 3.7 < 4.3)
25
Outline
ITSPOKE
Main research tools & projects
Comparing systems
Modeling learning
Interactions between phenomena
Other projects
26
Modeling learning
Problem: What contributes to/causes learning?
Correlations with learning
Events that significantly correlate with learning
Does not imply causality but it is a requirement for it
What events to measure?
… correctness
… time spent
PreTest
PostTest
Learning
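One way to test whether an event correlates with learning is a partial correlation between the event count and the posttest score, controlling for the pretest score. The sketch below assumes that approach and uses made-up data for five hypothetical students; it computes the coefficient only and does not test significance.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data, one value per student (not from any ITSPOKE corpus).
pretest = [0.3, 0.5, 0.4, 0.6, 0.2]
posttest = [0.6, 0.7, 0.8, 0.8, 0.5]
student_words = [120, 150, 200, 180, 90]

r = partial_correlation(student_words, posttest, pretest)
print(f"partial correlation of student words with posttest: {r:.2f}")
```

Controlling for pretest matters because students who already know more tend to score higher on the posttest regardless of what happens in the dialogue.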
27
What events?
Time on task (+), number of student words (+)
[Litman, Rose, Forbes-Riley et al., 2006] [Forbes-Riley, Rotaru and Litman, 2008]
Student emotions [Forbes-Riley, Rotaru and Litman, 2008]
Neutral on certainty (-)
Neutral on frustration (-)
Type of turns – on human-human [Forbes-Riley et al., 2005]
Student: introduce new concept (+)
Tutor: control dialogue (-)
Discourse structure inspired parameters
[Rotaru and Litman, 2006]
Computational implications?
28
Intuition 1 – Conditioning
Student learned?
[Figure: a dialogue drawn as a sequence of student turns, each with a Correctness label (Correct or Incorrect)]
It is more important to be correct at specific “places in the dialogue”.
Phenomena related to performance:
not uniformly important across the dialogue
have more weight at specific places in the dialogue.
Discourse structure can be used to define “places in the dialogue”
29
Intuition 1 - Results
Correctness
Transition – correctness parameters
[Figure: discourse structure over questions Q1, Q2, Q2.1, Q2.2, Q3]
PopUp–Correct, PopUp–Incorrect
Interpretation: Capture successful learning events or failed learning
opportunities
Generalizes across corpora
ITSPOKE modification: engage in an additional remediation
dialogue
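Transition–correctness parameters pair the discourse-structure transition into each question with the correctness of the student's answer. The sketch below is a simplification made for illustration: it derives only three transitions (Push, PopUp, Advance) from the nesting depth of question labels such as Q2.1, plus a hypothetical "Start" label for the first turn. The actual transition inventory in the cited work may differ.

```python
from collections import Counter


def depth(label: str) -> int:
    """Nesting depth from the label: "Q2" -> 1, "Q2.1" -> 2."""
    return label.count(".") + 1


def transition(prev_label, label):
    """Name the move from the previous question to the current one."""
    if prev_label is None:
        return "Start"  # hypothetical label for the first question
    if depth(label) > depth(prev_label):
        return "Push"   # entering a remediation subdialogue
    if depth(label) < depth(prev_label):
        return "PopUp"  # returning from a subdialogue
    return "Advance"


def transition_correctness_counts(turns):
    """Count transition-correctness pairs over (label, is_correct) turns."""
    counts = Counter()
    prev = None
    for label, correct in turns:
        outcome = "Correct" if correct else "Incorrect"
        counts[f"{transition(prev, label)}-{outcome}"] += 1
        prev = label
    return counts


# Hypothetical dialogue: the student misses Q2, passes remediation, misses Q3.
dialogue = [("Q1", True), ("Q2", False), ("Q2.1", True),
            ("Q2.2", True), ("Q3", False)]
print(transition_correctness_counts(dialogue))
```

Counts such as PopUp-Incorrect, computed per student, can then be related to that student's learning.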
30
Intuition 2 – Discrimination
[Figure: two dialogues side by side, one for a student that learned less and one for a student that learned more, with different discourse structure]
31
Intuition 2 - Results
Transition – Transition parameters
Push–Push
Interpretation: system uncovers potential major knowledge gaps
[Figure: discourse structure over questions Q1, Q2, Q2.1, Q2.1.1, Q2.1.2, Q2.2, Q3]
32
Other events
Psychology inspired
Models of reading comprehension – Landscape Model
[Ward and Litman, 2005]
Alignment model – lexical and prosodic convergence
[Ward and Litman, 2007a, 2007b]
NLP inspired
Cohesion – lexical co-occurrence [Ward and Litman, 2006]
33
From Correlations to Causality
Correlation does not imply causality
But can inform modifications
E.g. more instruction after PopUp-Incorrect events
E.g. different instruction depending on student uncertainty
Incorrect -> more tutoring
[Figure: discourse structure over questions Q1, Q2, Q2.1, Q2.2, Q3]
34
Outline
ITSPOKE
Main research tools & projects
Comparing systems
Modeling learning
Interactions between phenomena
Other projects
35
Interactions between phenomena
Things interact in a dialogue
Student correctness -> tutor reply
Student emotion -> tutor reply
Why look for interactions?
Capture human tutor behavior
Extract new patterns
Allow us to formulate hypotheses
How to find interactions?
Dependency tests: χ2 (Chi-Square)
Example with 2 windows
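A chi-square dependency test compares the observed counts in a contingency table with the counts expected if the two variables were independent. The sketch below computes the statistic for a table of any size; the counts are hypothetical, chosen only to illustrate an interaction between speech recognition problems and student frustration.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts. Rows: previous turn recognized correctly / misrecognized.
# Columns: next student turn frustrated / not frustrated.
counts = [[10, 90],
          [30, 70]]
stat = chi_square(counts)
print(f"chi-square = {stat:.2f}")
# For a 2x2 table (1 degree of freedom), values above 3.84 are
# significant at the 0.05 level.
print("dependent" if stat > 3.84 else "no evidence of dependency")
```

A significant statistic says only that the two variables are associated; which cells drive the association has to be read from the observed and expected counts.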
36
Projects
Certainty -> human tutor reply [Forbes-Riley and Litman, 2005]
Student uncertainty associated with
Student certainty associated with
Increase in Bottom-up replies
Decrease in Expansions
Increase in Restatements
Speech recognition errors [Rotaru and Litman, 2005, 2006a, 2006b]
Speech recognition errors -> Next student state
Increase in frustration
Student State -> Speech recognition errors
Incorrect, Uncertain, Frustrated -> more speech errors
Discourse Structure -> Speech recognition errors
37
Other projects
Affective computing (Kate Forbes-Riley’s postdoc)
Emotion prediction
What are the important emotions in tutoring
How to predict them
Emotion adaptation/handling
Model human tutor behavior
Formulate hypotheses from empirical analysis
Reinforcement Learning and User Modeling
System learns best way to react from rewards (Min Chi’s PhD)
Needs a lot of data -> user simulations (Hua Ai’s PhD)
38
Resources
Recommended classes
Introduction to Natural Language Processing
Foundations of Artificial Intelligence
Machine Learning
Knowledge Representation
Seminar classes
Advanced Topics in Artificial Intelligence (Speech and Language Technology for
Educational Applications (this spring!), Affective Spoken Dialogue Systems,
Spoken Dialogue Systems, etc.)
Other resources
ITSPOKE Group Meetings
NLP @ Pitt
DoD @ CMU
YRRSDS
ISP Forum
PSLC
39
Further information
Visit my homepage and talk with me
http://www.cs.pitt.edu
Take my seminar (CS 3710), projects course (CS 2002)
Talk with members of the ITSPOKE group
http://www.cs.pitt.edu/~litman/itspoke.html
40