GRYPHON - Carleton College

GRYPHON
Taylor Curtis
Eric Lantz
Kate Nelson
Gio Messner
Michelle Phillips
General Framework
Speech-to-text: Sphinx
Parser: Phoenix
Dialogue Manager (DM)
Database (DB)
Text-to-speech: Festival
Galaxy Communicator
Parser
Speech-to-Text
DM Server
HUB
Text Input
Speech and Text Output
Database
SPHINX




Speech-to-text
Real-time
CMU
Partial Hypotheses
"KWON "
"KWON CARON "
"PAUL BIOLOGY GROW "
"PAUL BIOLOGY COURSES "
"PAUL BIOLOGY COURSES ARE AND "
"PAUL BIOLOGY COURSES ARE THE "
"WHAT BIOLOGY COURSES ARE THERE"
Vocabulary / .corpus File
Hand dictionary / .handdict
Phoneme Set
TEXT
Phoneme Set
Phoneme  Example  Translation
AA       odd      AA D
AE       at       AE T
AH       hut      HH AH T
AO       ought    AO T
AW       cow      K AW
AY       hide     HH AY D
B        be       B IY
CH       cheese   CH IY Z
D        dee      D IY
DH       thee     DH IY
EH       Ed       EH D
ER       hurt     HH ER T
EY       ate      EY T
F        fee      F IY
G        green    G R IY N
HH       he       HH IY
IH       it       IH T
IY       eat      IY T
JH       gee      JH IY
K        key      K IY
L        lee      L IY
M        me       M IY
N        knee     N IY
NG       ping     P IH NG
OW       oat      OW T
OY       toy      T OY
P        pee      P IY
R        read     R IY D
S        sea      S IY
SH       she      SH IY
T        tea      T IY
TH       theta    TH EY T AH
UH       hood     HH UH D
UW       two      T UW
V        vee      V IY
W        we       W IY
Y        yield    Y IY L D
Z        zee      Z IY
ZH       seizure  S IY ZH ER
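A hand dictionary entry is only usable if every symbol in it comes from the phoneme set above. The sketch below checks that, assuming one entry per line in the form "WORD PH PH ..."; that line format and the name parse_entry are assumptions for illustration, not the actual .handdict specification.

```python
# The 39 phonemes listed in the phoneme set table.
PHONEMES = set(
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH "
    "K L M N NG OW OY P R S SH T TH UH UW V W Y Z ZH".split()
)

def parse_entry(line):
    """Split an assumed 'WORD PH PH ...' line and validate its phonemes."""
    word, *phones = line.split()
    unknown = [p for p in phones if p not in PHONEMES]
    if unknown:
        raise ValueError(f"unknown phonemes in {word!r}: {unknown}")
    return word, phones

if __name__ == "__main__":
    print(parse_entry("GREEN G R IY N"))
```

A misspelled symbol (for example "EE" instead of "IY") raises an error instead of silently reaching the recognizer.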
Text Input



Sphinx can limit the system
Recognizes some voices poorly
Text input option increases reliability
Phoenix




University of Colorado, Boulder
Semantic Frame Parser
No predefined grammar
Grammars are defined in Backus-Naur Form
Traditional syntax parse
Our Advisor,
Jeff Ondich
Keyword parse
Animal: moose
Action: ate
Food: leaves
Our Grammar Is Somewhere In Between
"What Computer Science courses does Jeff Ondich teach?"
[Department]
    [_Cs]
        Computer Science
[Professor]
    [First_name]
        Jeff
    [Last_name]
        Ondich
Format of a Phoenix Grammar File
Frame: Courses
Nets:
    [Department]
    [Professor]
Net definitions are in brackets
and elsewhere in the file:
[Department]
    ([_Biol])
    ([_Cs])
;
[_Cs]
    (CS)
    (Computer Science)
;
[Professor]
    (*[First_name] [Last_name])
;
[First_name]
    (Jeff)
    (David)
;
[Last_name]
    (Ondich)
    (Musicant)
;
Bracketed names inside a definition are slots (subnets)
Strings in parentheses without brackets are terminals
Example:
[Professor]
    [First_name]
        Jeff
    [Last_name]
        Ondich
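To make the net/slot/terminal idea concrete, here is a toy matcher over the grammar above. It is only an illustration, not how Phoenix itself is implemented; the dictionary layout and the names match_seq, match_net, and find_nets are assumptions, and the leading "*" is read as "optional". [_Biol] is left undefined here because the slide does not show its definition.

```python
import re

# Each net maps to a list of alternatives; each alternative is a token list.
GRAMMAR = {
    "[Department]": [["[_Biol]"], ["[_Cs]"]],
    "[_Cs]": [["CS"], ["Computer", "Science"]],
    "[Professor]": [["*[First_name]", "[Last_name]"]],
    "[First_name]": [["Jeff"], ["David"]],
    "[Last_name]": [["Ondich"], ["Musicant"]],
}

def match_seq(tokens, words, pos):
    """Return the end position if tokens match words starting at pos, else None."""
    if not tokens:
        return pos
    tok, rest = tokens[0], tokens[1:]
    optional = tok.startswith("*")
    if optional:
        tok = tok[1:]
    if tok.startswith("["):                      # slot: recurse into the subnet
        end = match_net(tok, words, pos)
    elif pos < len(words) and words[pos].lower() == tok.lower():
        end = pos + 1                            # terminal: compare one word
    else:
        end = None
    if end is not None:
        result = match_seq(rest, words, end)
        if result is not None:
            return result
    if optional:                                 # skip the optional token
        return match_seq(rest, words, pos)
    return None

def match_net(net, words, pos):
    """Try each alternative of a net; undefined nets never match."""
    for alternative in GRAMMAR.get(net, []):
        end = match_seq(alternative, words, pos)
        if end is not None:
            return end
    return None

def find_nets(utterance, top_nets):
    """Scan left to right, skipping words that no top-level net covers."""
    words = re.findall(r"\w+", utterance)
    found, pos = [], 0
    while pos < len(words):
        for net in top_nets:
            end = match_net(net, words, pos)
            if end is not None and end > pos:
                found.append((net, " ".join(words[pos:end])))
                pos = end
                break
        else:
            pos += 1
    return found

if __name__ == "__main__":
    question = "What Computer Science courses does Jeff Ondich teach?"
    print(find_nets(question, ["[Department]", "[Professor]"]))
```

Words that no net covers ("What", "courses", "does", "teach") are simply skipped, which is the behaviour that puts this style of grammar between a full syntax parse and a keyword parse.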
An Example Parse
“Show me all 5As in English and Philosophy”
[Class_Period]
    [_5a]
        5As
[Department]
    [_Engl]
        English
[_And]
    And
[Department]
    [_Phil]
        Philosophy
Phoenix output:
Courses:[Class_Period].5a
Courses:[Department].Engl
Logic:And
Courses:[Department].Phil
“Courses” and “Logic” are frames (“Respond” is the third one we use)
One Huge Table?
•Bad
•Redundant
•Repetitive
+----+---------+------+----------------------------------------+------------+-------+------+-------------+-----------+
| id | syn_num | dept | title                                  | build_name | room1 | day1 | start_time1 | end_time1 |
+----+---------+------+----------------------------------------+------------+-------+------+-------------+-----------+
| 16 | 6398    | ocp  | s i t australia:environment            |            |       |      |             |           |
| 17 | 6399    | ocp  | a c m florence                         |            |       |      |             |           |
| 18 | 6400    | astr | introduction to astronomy              | olin       | 149   | m    | 01:50pm     | 03:00pm   |
| 19 | 6401    | astr | observational and laboratory astronomy | good       | 104   | t    | 07:00pm     | 10:00pm   |
| 20 | 6402    | biol | conservation biology                   | olin       | 149   | m    | 11:10am     | 12:20pm   |
| 21 | 6402    | ents | conservation biology                   | olin       | 149   | m    | 11:10am     | 12:20pm   |
| 22 | 6403    | astr | special project                        |            |       |      |             |           |
| 23 | 6405    | astr | special project                        |            |       |      |             |           |
| 24 | 6412    | biol | genes, evolution, and development      | olin       | 141   | m    | 08:30am     | 09:40am   |
| 25 | 6415    | phys | complexity and chaos                   | olin       | 302   | m    | 09:50am     | 11:00am   |
| 26 | 6417    | phys | cellular automata:new science          | olin       | 302   | m    | 08:30am     | 09:40am   |
| 27 | 6418    | amst | introduction to u.s. latino/a studies  | good       | 3     | m    | 01:50pm     | 03:00pm   |
| 28 | 6419    | phys | newtonian mechanics                    | olin       | 2     | m    | 12:30pm     | 01:40pm   |
| 29 | 6420    | biol | genes and evolution lab                | hul        | 206   | t    | 01:00pm     | 05:00pm   |
| 30 | 6421    | phys | newtonian mechanics lab                | olin       | 301   | t    | 01:00pm     | 05:00pm   |
| 31 | 6422    | biol | genes and evolution lab                | hul        | 206   | w    | 02:00pm     | 06:00pm   |
| 32 | 6423    | biol | genes and evolution lab                | hul        | 206   | th   | 01:00pm     | 05:00pm   |
| 33 | 6425    | phys | newtonian mechanics lab                | olin       | 301   | th   | 01:00pm     | 05:00pm   |
| 34 | 6426    | amst | the sublime in america                 | boli       | 161   | t    | 01:15pm     | 03:00pm   |
+----+---------+------+----------------------------------------+------------+-------+------+-------------+-----------+
The Database Structure
DAY: id, day
TIMESPAN: id, start_time, end_time, day_id
TIMEANDPLACE: id, course_id, room_id, timespan_id
ROOM: id, building_id, room
BUILDING: id, build_name
COURSE: id, syn_num, title, min_cred, max_cred, scrch_only
FULFILLS: id, course_id, dist_id
DISTRO: id, dist_name
COURSEINDEPT: id, course_id, dept_id, course_num, section
DEPARTMENT: id, dept_name
TAUGHTBY: id, course_id, professor_id
PROFESSOR: id, first_name, last_name
Dialogue Manager




Query Manager
Logic Parser
SQL Query Builder
Output Director
Some Queries Can’t Stand Alone
“What math classes are there?”
“Who teaches those?”
“What’s in Sayles 251?”
“Where are they held?”
“Can I see Geology as well?”
“Show me the ones in the morning”
“I said 3A”
A Query is a Continuation if…

It contains a continuation keyword
“Show me the RAD classes too.”

It contains no question words
“And Physics.”

It contains only question words
“When is it?”
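The three rules above can be sketched as a single check. The word lists below are hypothetical stand-ins (the slides do not list the real keywords), and is_continuation is an illustrative name; the actual system presumably works on the Phoenix parse instead of raw words.

```python
import re

# Hypothetical word lists, chosen only so the slide's examples work.
CONTINUATION_KEYWORDS = {"too", "also"}
QUESTION_WORDS = {"what", "when", "where", "who", "which", "show"}
FILLER_WORDS = {"is", "are", "it", "they", "those", "them", "me", "the", "there"}

def is_continuation(utterance):
    words = re.findall(r"[a-z0-9']+", utterance.lower())
    # Rule 1: it contains a continuation keyword.
    if any(w in CONTINUATION_KEYWORDS for w in words):
        return True
    question = [w for w in words if w in QUESTION_WORDS]
    content = [w for w in words
               if w not in QUESTION_WORDS and w not in FILLER_WORDS]
    # Rule 2: it contains no question words.
    if not question:
        return True
    # Rule 3: it contains only question words (nothing left but filler).
    return not content

if __name__ == "__main__":
    for text in ["Show me the RAD classes too.", "And Physics.",
                 "When is it?", "What math classes are there?"]:
        print(text, is_continuation(text))
```

Treating "show" as a question word is part of the assumption: it keeps a fresh request such as "Show me all 5As" from being classified as a continuation by rule 2.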
Combining the Old With the New
OLD QUERY
Courses:[Class_Period].5a
Courses:[Department].Engl
Logic:And
Courses:[Department].Phil
NEW QUERY (“No, 3A”)
Courses:Correction
Courses:[Class_Period].3a
Combined Query (3a replaces 5a)
Courses:[Class_Period].3a
Courses:[Department].Engl
Logic:And
Courses:[Department].Phil
Logic Parser
“Real” Logic vs. Human Logic
“Show me all the English and History courses.”
If we interpret this as a logical AND, we won’t return any results.
“Show me the courses that are English and RAD.”
In this case, we do want to find courses that fulfill both of these criteria.
How can we get most cases right?
Change any occurrence of “and” to “or” unless it comes after the word “course” or “class”.
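A minimal sketch of that heuristic on raw text. It assumes "comes after" means anywhere later in the same utterance and that plurals ("courses", "classes") count; humanize_logic is a hypothetical name.

```python
import re

# Matches course, courses, class, classes.
ANCHOR = re.compile(r"^(course|class)(es|s)?$")

def humanize_logic(utterance):
    """Turn 'and' into 'or' unless 'course'/'class' was already seen."""
    seen_anchor = False
    out = []
    for word in utterance.split():
        bare = word.strip(".,?!\"'").lower()
        if ANCHOR.match(bare):
            seen_anchor = True
        if bare == "and" and not seen_anchor:
            out.append("or")
        else:
            out.append(word)
    return " ".join(out)

if __name__ == "__main__":
    print(humanize_logic("Show me all the English and History courses."))
    print(humanize_logic("Show me the courses that are English and RAD."))
```

The first sentence becomes "English or History", so the query returns courses from either department; the second keeps its "and", so only courses meeting both criteria are returned.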
Resolving Ambiguity
“Classes that are RAD and English or History.”
This has two possible interpretations, because English uses infix ordering.
Interpretation 1:
and
    RAD
    or
        English
        History
Interpretation 2:
or
    and
        RAD
        English
    History
Postfix order is unambiguous, so each of the above trees is mapped to a different postfix statement:
Interpretation 1: RAD English History or and
Interpretation 2: RAD English and History or
Which interpretation to use?
Precedence rules:
1. Apply NOTs.
2. Apply ORs between elements of the same type.
3. Apply ANDs.
4. Apply all other ORs.
5. If a NOT is inside a “homogeneous” clause, apply it to the whole clause.
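Rules 2 to 4 can be sketched as an operator-precedence (shunting-yard style) conversion from infix to postfix, where an OR between two elements of the same type binds tighter than AND. This is one possible implementation, not necessarily the one GRYPHON uses; NOT handling (rules 1 and 5) is left out, and to_postfix and slot_type are hypothetical names.

```python
import re

SLOT = re.compile(r"^Courses:\[(\w+)\]\.(\w+)$")

def slot_type(line):
    """Return the net name, e.g. 'Department', or None for non-slot lines."""
    m = SLOT.match(line)
    return m.group(1) if m else None

def to_postfix(lines):
    """Convert infix Phoenix-style lines to postfix using rules 2-4."""
    output, ops = [], []
    for i, line in enumerate(lines):
        if not line.startswith("Logic:"):
            output.append(line)
            continue
        if line == "Logic:And":
            rank = 2                              # rule 3
        else:
            left = slot_type(lines[i - 1]) if i > 0 else None
            right = slot_type(lines[i + 1]) if i + 1 < len(lines) else None
            same_type = left is not None and left == right
            rank = 3 if same_type else 1          # rule 2 vs. rule 4
        # Higher rank binds tighter; equal rank applies left to right.
        while ops and ops[-1][0] >= rank:
            output.append(ops.pop()[1])
        ops.append((rank, line))
    while ops:
        output.append(ops.pop()[1])
    return output

if __name__ == "__main__":
    infix = [
        "Courses:[Class_Period].5a",
        "Logic:And",
        "Courses:[Department].Engl",
        "Logic:Or",
        "Courses:[Department].Phil",
    ]
    print("\n".join(to_postfix(infix)))
```

For this input the OR joins the two departments first, giving the postfix order 5a, Engl, Phil, Or, And.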
So, for example…
Phoenix output:
Courses:[Class_Period].5a
Courses:[Department].Engl
Logic:And
Courses:[Department].Phil
After changing “and” to “or”:
Courses:[Class_Period].5a
Courses:[Department].Engl
Logic:Or
Courses:[Department].Phil
With the implied And made explicit:
Courses:[Class_Period].5a
Logic:And
Courses:[Department].Engl
Logic:Or
Courses:[Department].Phil
After applying the Or between elements of the same type:
Courses:[Class_Period].5a
Logic:And
Courses:[Department].Engl
Courses:[Department].Phil
Logic:Or
After applying the And (final postfix order):
Courses:[Class_Period].5a
Courses:[Department].Engl
Courses:[Department].Phil
Logic:Or
Logic:And
Make SQL Query
The Problem:
Phoenix Output
Courses:[Class_Period].5a
Courses:[Department].Engl
Courses:[Department].Phil
Logic:Or
Logic:And
?
Database
SELECT course.name FROM course
WHERE course.id = 7;
Tables: DAY, TIMESPAN, TIMEANDPLACE, ROOM, BUILDING, DISTRO, FULFILLS, COURSE, COURSEINDEPT, PROFESSOR, TAUGHTBY, DEPARTMENT
SQL Syntax
•SELECT the information (columns) we want to show
•FROM the tables that have the information we want
•WHERE certain things are true
•JOINS connecting two tables
•VALUES specified by user
“What are the names of all courses in the English department?”
SELECT course.name FROM course, courseindept, department
WHERE course.id = courseindept.course_id AND
courseindept.dept_id = department.id AND department.name = 'Engl';
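The query can be tried against a small in-memory database. This sketch uses Python's sqlite3 module, not the project's actual database, and assumes the columns are named as in the query (course.name, department.name). The rows are a few courses from the big table earlier, so the lookup is by 'biol' instead of 'Engl'; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

def make_demo_db():
    """Build a three-table demo database in memory (assumed column names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE courseindept (
            id INTEGER PRIMARY KEY, course_id INTEGER, dept_id INTEGER);
        INSERT INTO course VALUES
            (1, 'conservation biology'),
            (2, 'introduction to astronomy'),
            (3, 'genes, evolution, and development');
        INSERT INTO department VALUES (1, 'biol'), (2, 'ents'), (3, 'astr');
        -- conservation biology is cross-listed in biol and ents
        INSERT INTO courseindept VALUES
            (1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 3, 1);
    """)
    return conn

def courses_in_department(conn, dept):
    """Run the slide's join, with the department passed as a parameter."""
    query = """
        SELECT course.name FROM course, courseindept, department
        WHERE course.id = courseindept.course_id AND
        courseindept.dept_id = department.id AND department.name = ?
        ORDER BY course.name;
    """
    return [row[0] for row in conn.execute(query, (dept,))]

if __name__ == "__main__":
    conn = make_demo_db()
    print(courses_in_department(conn, "biol"))
    print(courses_in_department(conn, "ents"))
```

The cross-listed course is stored once in course and appears under both departments through courseindept, which is the redundancy the single huge table could not avoid.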
The Reason For Joins
COURSE: Course names are here
DEPARTMENT: Dept names are here
We need to connect these tables together (through COURSEINDEPT)
SELECT course.name FROM course, courseindept, department
WHERE course.id = courseindept.course_id AND
courseindept.dept_id = department.id AND department.name = 'Engl';
Getting Back to Our Problem
•Endless possibilities for Phoenix output
•But many outputs have similar form
Courses:[Department].Geol
Courses:[Department].Engl
Courses:[Department].Soan
Constant Portion: Courses:[Department].
Variable Portion: Geol, Engl, Soan
•We want to look for constant portions and map the variable portions to appropriate SQL statements
Courses:[Department].Engl maps to department.name = 'Engl'
The Patterns File: Our New Best Friend
Courses:[Department].$1                  ($1 is the variable portion, a wildcard)
| course.name                            (add to SELECT field)
| course, courseindept, department       (add to FROM field)
| JOIN_COURSE_DEPARTMENT                 (add to JOINS)
| department.name = $1                   (add to VALUES; set equal to variable portion)
# JOIN_COURSE_DEPARTMENT = course.id = courseindept.course_id AND courseindept.dept_id = department.id
Courses:[Department].Engl
SELECT course.name FROM course, courseindept, department
WHERE course.id = courseindept.course_id AND
courseindept.dept_id = department.id AND department.name = 'Engl';
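A rough sketch of how one patterns-file entry could be applied to one line of Phoenix output. The dictionary layout and the names PATTERN, pattern_to_regex, and build_sql are assumptions for illustration; the real patterns file is parsed from the "|"-separated format shown above.

```python
import re

# One entry, transcribed from the slide, with the join macro expanded.
PATTERN = {
    "match": "Courses:[Department].$1",
    "select": "course.name",
    "from": "course, courseindept, department",
    "joins": ("course.id = courseindept.course_id AND "
              "courseindept.dept_id = department.id"),
    "values": "department.name = $1",
}

def pattern_to_regex(pattern):
    """Escape the constant portion and turn $1 into a capture group."""
    escaped = re.escape(pattern).replace(re.escape("$1"), r"(\w+)")
    return re.compile("^" + escaped + "$")

def build_sql(line, entry=PATTERN):
    """Return a SQL string for a matching line, or None if it does not match."""
    m = pattern_to_regex(entry["match"]).match(line)
    if m is None:
        return None
    condition = entry["values"].replace("$1", "'" + m.group(1) + "'")
    return (f"SELECT {entry['select']} FROM {entry['from']} "
            f"WHERE {entry['joins']} AND {condition};")

if __name__ == "__main__":
    print(build_sql("Courses:[Department].Engl"))
```

Because the wildcard only captures word characters, the captured value cannot contain quotes; a production version would still be better off passing it as a bound parameter.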
Putting It All Together
•For SELECT, FROM, and JOINS, we can simply concatenate the components for different lines
•For VALUES, we need to take logic into account
•(This is why we did all that crazy postfix stuff)
Stack
•If it’s not logic, put it on a stack
•If it is logic, pop things off the stack, match to patterns file, combine, and put back on
Input line                   Stack afterward
Courses:[Class_Period].5a    5a
Courses:[Department].Engl    5a, Engl
Courses:[Department].Phil    5a, Engl, Phil
Logic:Or                     5a, (Phil Or Engl)
Logic:And                    ((Phil Or Engl) And 5a)
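The stack rule can be sketched as a short postfix evaluator. Here render stands in for the patterns-file lookup: by default it keeps only the variable portion of each line, which reproduces the slide's trace. The names combine_values and department_condition are hypothetical.

```python
def variable_portion(line):
    """'Courses:[Department].Engl' -> 'Engl'."""
    return line.rsplit(".", 1)[1]

def combine_values(postfix_lines, render=variable_portion):
    """Fold postfix Phoenix lines into one parenthesized condition."""
    stack = []
    for line in postfix_lines:
        if line.startswith("Logic:"):
            op = line.split(":", 1)[1]           # 'And' or 'Or'
            first = stack.pop()                  # top of stack goes first,
            second = stack.pop()                 # as in the slide's trace
            stack.append(f"({first} {op} {second})")
        else:
            stack.append(render(line))
    if len(stack) != 1:
        raise ValueError("malformed postfix input")
    return stack[0]

def department_condition(line):
    """Assumed stand-in for the patterns file, for department lines only."""
    return f"department.name = '{variable_portion(line)}'"

if __name__ == "__main__":
    print(combine_values([
        "Courses:[Class_Period].5a",
        "Courses:[Department].Engl",
        "Courses:[Department].Phil",
        "Logic:Or",
        "Logic:And",
    ]))
    print(combine_values(
        ["Courses:[Department].Engl", "Courses:[Department].Phil", "Logic:Or"],
        render=department_condition,
    ))
```

The resulting string is what gets appended to the WHERE clause after the concatenated JOINS.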
Preparing Output
•Send the formed SQL query to the database, and process the results
Database
Results
Mozilla (text output)
Festival (speech output)
Why Bimodal Output
•The project was originally envisioned with all-speech output
•Searches can return hundreds of results
•Festival speaks quite slowly
•Not designed for telephone use
•Text output of results allows user to find relevant results quickly
•Mozilla provided a predefined user interface
Festival





Multi-Lingual
Text-to-Speech
British and American English
Initial plan vs. actual implementation
Response
Thank You from
Golibly
Graciously
Ready
Rearranging
Reimagining
Your
Parser
H
O
Ney
emes