Sports Scores Speech Recognition System Major League Baseball


Sports Scores Speech
Recognition System
Major League Baseball Score System
Development Team
Members

Dan Corkum
 Jason NguyenTrieu
 Dan Ragland
 Quang Vu
 Andrew Wagner
(Director)
(Producer)
Sponsor: Jim Larson, Intel Corporation
Goals & Objectives

Develop a compelling Speech Recognition
Application for Retrieval of Sports
Information.
 Incorporate Ease of Use Techniques
including: Tapered Prompts, Global
Commands, Barge-In, Repair Dialogs, and
others.
 Develop an Architecture that is both Robust
and Modular. Design for Reuse.
Core Modules

“Web Viking” – Parse Internet Web Pages to retrieve
sports information.

Data Warehousing & Querying – Database for
storage of searchable information.

Client and Server Communication – Enables
communication between Server and remote Clients.

VUI (Voice User Interface) Voice Prompts
and Response System – The core engine that
controls the entire VUI.

Dialog Database – Contains the content for the text-to-speech prompts and response criteria.
Web Viking

The purpose of the Web Viking is to retrieve data
from web sites, parse it, and format it so that the
database interface can understand it.
 There are three data collection scripts: Schedule,
Scores, and Standing/Ranking
 The data comes from 2 sources:
– Major League Baseball
– ESPN

Two chances to get the right data:
– First, we get data from MLB web site and parse it. If it
fails for any reason, we'll try to get data from the
ESPN web site.
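The two-chance retrieval can be sketched roughly as follows. This is a minimal illustration in Python, not the team's Perl scripts; the function name `fetch_with_fallback` and the stub fetchers are hypothetical, and real fetchers would download the MLB and ESPN pages.

```python
from typing import Callable, List, Sequence, Tuple


def fetch_with_fallback(
    sources: Sequence[Tuple[str, Callable[[], str]]],
    parse: Callable[[str], List[str]],
) -> Tuple[str, List[str]]:
    """Try each source in order; return (source name, rows) from the first that works."""
    errors = []
    for name, fetch in sources:
        try:
            rows = parse(fetch())
            if rows:
                return name, rows
            errors.append(f"{name}: no usable data")
        except Exception as exc:  # any failure triggers the next source
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real page downloads.
def broken_mlb() -> str:
    raise IOError("timeout")


def working_espn() -> str:
    return "Yankees 5, Red Sox 3"


def parse_lines(text: str) -> List[str]:
    return [line for line in text.splitlines() if line.strip()]
```

Calling `fetch_with_fallback([("MLB", broken_mlb), ("ESPN", working_espn)], parse_lines)` falls through to the second source because the first one raises.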
Web Viking

How is the data retrieved?
– We used the library functions available in CPAN
(the Comprehensive Perl Archive Network).
– The HTTP::Request module: packages up the URL
request
– The HTTP::Response module: handles the data coming
back.
How the data is parsed:
1. Match and strip off unnecessary data.
2. Regular expression
3. Split
4. Format data and check result.
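The four parsing steps might look like the sketch below. It is written in Python for illustration (the original scripts were Perl), and the page layout it expects, one game per line in the form "Team Runs, Team Runs", is an assumption rather than the real MLB or ESPN markup.

```python
import re
from typing import Dict, List

TAG = re.compile(r"<[^>]+>")
GAME_LINE = re.compile(r"[A-Za-z. ]+ \d+, [A-Za-z. ]+ \d+")


def parse_scores(page: str) -> List[Dict[str, object]]:
    # 1. Match and strip off unnecessary data (here: markup tags).
    text = TAG.sub("", page)
    # 2. Regular expression: pull out the lines that look like game results.
    lines = GAME_LINE.findall(text)
    games = []
    for line in lines:
        # 3. Split each line into its two sides, then team name and runs.
        sides = [side.strip().rsplit(" ", 1) for side in line.split(",")]
        (team_a, runs_a), (team_b, runs_b) = sides
        # 4. Format the data and check the result before keeping it.
        game = {"team_a": team_a, "runs_a": int(runs_a),
                "team_b": team_b, "runs_b": int(runs_b)}
        if game["team_a"] and game["team_b"]:
            games.append(game)
    return games
```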
Data Warehousing & Querying

The Database was implemented using MS Access.
 It functions as a storage site keeping track of team
names, scores associated with each team,
league/division ranking information, and the
schedules for each game.

The Database Handler was written in Java.
 Its primary purpose is to query the database and
return the results to the sports score server.
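A handler of this kind can be sketched as below. Note the substitutions: the project used MS Access and Java, while this illustration uses Python with an in-memory SQLite database, and the table and column names are hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


def open_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database standing in for the real score store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scores (game_date TEXT, team TEXT, runs INTEGER, "
        "opponent TEXT, opponent_runs INTEGER)"
    )
    conn.execute(
        "INSERT INTO scores VALUES (?, ?, ?, ?, ?)",
        ("2000-05-01", "Yankees", 5, "Red Sox", 3),
    )
    return conn


def latest_score(conn: sqlite3.Connection, team: str) -> Optional[Tuple]:
    """Return (team, runs, opponent, opponent_runs) for the most recent game, or None."""
    return conn.execute(
        "SELECT team, runs, opponent, opponent_runs FROM scores "
        "WHERE team = ? ORDER BY game_date DESC LIMIT 1",
        (team,),
    ).fetchone()
```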
Client & Server Communication

danC stuff
Client & Server Communication

danC #2
VUI (Voice User Interface)
Voice Prompts and Response System
User Interface and Underlying Logic
VUI
Design Considerations
Two Options For Design:
1. Dialog logic coded directly into code.
2. Dialog logic entered into a data structure
and presented by separate internal logic.
VUI
Advantages & Disadvantages
of Hard-Coded Dialogs

Advantages
– Fast initial implementation
– Ultimate flexibility of features

Disadvantages
– Duplicated code
– Difficult to provide consistent global functionality
– Hard-coded grammars
VUI
Advantages & Disadvantages
of Dialog Database

Advantages
– Good design: Data separated from presentation
– Consolidation of code
– Easy to create and maintain dialogs
– Features aided by use of recursion
– Computer-generated grammars

Disadvantages
– Much work required before any results seen
– Difficult to customize specific components
VUI
Decision: Dialog Database

Sports Score dialogs all follow the same
basic pattern
 Implementation could be modularized by
separating the dialogs from their
presentation logic
 The gains made by the ease of entry and
flexibility for the end-user outweighed the
losses in implementation time
 Some features require recursion
VUI
VUI Infrastructure Design

Features
– Tapered, User-Level Sensitive Prompts
– Tapered, User-Level Sensitive Help
– Barge-In capability
– User shortcut capability (users can answer future prompts
from any prompt)
– Navigational user commands (“back”, “quit”, etc.)
– Enumerated user commands to allow the user to say a
number as an alternative to the command
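Enumerated commands could be resolved along these lines. This is a rough Python sketch, assuming the recognizer hands back either the command text, a digit, or a number word; `NUMBER_WORDS`, `spoken_forms`, and `resolve` are illustrative names.

```python
from typing import Dict, List, Optional

NUMBER_WORDS = ["one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine"]


def spoken_forms(commands: List[str]) -> Dict[str, str]:
    """Map every accepted way of saying a command (name or position) to the command."""
    forms = {}
    for position, command in enumerate(commands, start=1):
        forms[command.lower()] = command
        forms[str(position)] = command
        if position <= len(NUMBER_WORDS):
            forms[NUMBER_WORDS[position - 1]] = command
    return forms


def resolve(heard: str, commands: List[str]) -> Optional[str]:
    """Return the command the user meant, or None if nothing matches."""
    return spoken_forms(commands).get(heard.strip().lower())
```

With the commands `["score", "rank", "schedule"]`, saying "two" selects "rank".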
VUI
Dialog Components

Prompt
– Point of user interaction
– Has associated Prompt Text levels, including
text to be read, the user level for which it is to
be read, and the number of visits before the
next user-level is used
– Has associated Commands, or phrases the user
is allowed to say and the actions to take
– Has a parameter name to be used in a query
VUI
Dialog Components (cont)

Commands
– The text the user says to access the command
– The text that will be returned when this
command is accessed (used in a query)
– A flag to indicate whether or not the command
is to be enumerated
– The action the system is to take when the
command is accessed
VUI
Dialog Components (cont)

Scripts
– Series of prompts to be called in succession

Script Steps
– Individual prompts belonging to a script
– Each contains its own grammar (reflecting
shortcuts available later in the script)
– Each contains a flag indicating whether or not a
query will be performed following the step
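The dialog components described above map naturally onto a handful of record types. The following Python sketch is one possible shape; all field names are assumptions chosen to mirror the slide text, not the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptText:
    text: str                      # text to be read
    user_level: int                # user level for which it is read
    visits_before_next_level: int  # visits before the next user level is used


@dataclass
class Command:
    spoken_text: str   # what the user says to access the command
    return_value: str  # text returned when accessed (used in a query)
    enumerated: bool   # whether the command can also be chosen by number
    action: str        # action to take, e.g. "next_step" (hypothetical value)


@dataclass
class Prompt:
    parameter_name: str  # parameter name used in a query
    texts: List[PromptText] = field(default_factory=list)
    commands: List[Command] = field(default_factory=list)


@dataclass
class ScriptStep:
    prompt: Prompt
    grammar: List[str] = field(default_factory=list)  # includes later shortcuts
    query_after: bool = False  # perform a query following this step?


@dataclass
class Script:
    name: str
    steps: List[ScriptStep] = field(default_factory=list)


team_prompt = Prompt(
    parameter_name="Team",
    texts=[PromptText("Which team would you like?", 0, 2)],
    commands=[Command("New York Yankees", "Yankees", True, "next_step")],
)
score_script = Script("score", [ScriptStep(team_prompt, query_after=True)])
```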
VUI
How A Prompt Works

Shortcuts take place when a user answers
multiple prompts in a row, so the first thing
the prompt does is check for overflow from
the last prompt. If there is overflow, jump
ahead to the command processing.
Otherwise, cycle through the following:
– Find the appropriate prompt text to be read to
the user based on user level and number of
visits.
VUI
How A Prompt Works (cont)
– If the user requires help, find the appropriate help text
to be read
– Begin the reading of the help and prompt text to the
user
– At the same time, begin listening for a user response (if
the user responds while it is reading, interrupt the
reading)
– When the computer finishes reading, begin timing.
After five seconds with no user speaking, time-out.
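Choosing the prompt text by user level and visit count (the tapering step) can be sketched as follows. This Python illustration assumes user levels are consecutive integers starting at 0 and that a user moves up one level once the visit count reaches the threshold stored with the current level; the names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PromptText:
    user_level: int
    text: str
    visits_before_next: int


def select_prompt_text(texts: List[PromptText], user_level: int,
                       visits: int) -> Tuple[int, str]:
    """Return (user level to use, text to read) for this visit."""
    ordered = sorted(texts, key=lambda t: t.user_level)
    # Clamp to the levels that actually exist for this prompt.
    level = min(max(user_level, 0), len(ordered) - 1)
    # Enough visits at this level: taper to the next, shorter prompt.
    if visits >= ordered[level].visits_before_next and level + 1 < len(ordered):
        level += 1
    return level, ordered[level].text


MAIN_MENU_TEXTS = [
    PromptText(0, "Welcome. You can say score, rank, or schedule.", 2),
    PromptText(1, "Score, rank, or schedule?", 3),
    PromptText(2, "Function?", 0),
]
```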
VUI
How A Prompt Works (cont)
– Attempt to match what the user said to a
command that is available at this prompt.


Match using the longest available command, so if
the user said “New York Yankees”, match “New
York Yankees”, not “New York”.
Any portion of what the user said that was not
matched (if anything was matched at all) gets sent to
the following prompts for processing. Example: The
user said “score New York” in the first prompt. If
the prompt matches “score”, “New York” will get
passed to any following prompts.
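Longest-command matching with overflow can be sketched as below. This is a minimal Python illustration that assumes commands are matched word by word against the start of the utterance; `match_command` is a hypothetical name.

```python
from typing import List, Optional, Tuple


def match_command(utterance: str,
                  commands: List[str]) -> Tuple[Optional[str], str]:
    """Return (longest matching command or None, unmatched remainder)."""
    words = utterance.split()
    lowered = [w.lower() for w in words]
    best = None
    best_len = 0
    for command in commands:
        cmd_words = command.lower().split()
        # A command matches if its words are a prefix of what was said.
        if lowered[:len(cmd_words)] == cmd_words and len(cmd_words) > best_len:
            best, best_len = command, len(cmd_words)
    if best is None:
        return None, utterance
    # Whatever was not consumed is the overflow for the following prompts.
    return best, " ".join(words[best_len:])
```

For the example in the text, `match_command("score New York", ["score", "rank", "schedule"])` consumes "score" and leaves "New York" as overflow.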
VUI
How A Prompt Works (cont)


When a command is matched, the command’s return
value is attached to the parameter name of the
prompt
The action that is then performed is dictated by the
command. Some possibilities are:
– Calling another script/prompt and returning all values
– Calling another script/prompt and returning only those
values
– Repeating the prompt and reading help to the user
– Changing the user level
– Running a query and repeating or calling another
prompt/script
VUI
How A Script Works

A script is presented simply by presenting the first
script step in the script
 A script step presents its associated prompt, using
its own grammar (reflecting the ability of the user
to shortcut to the next script step)
 After the script step is executed, a query may be
performed and the next script step (if any) may be
performed
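The way a script walks its steps, letting overflow from one answer satisfy the next step, might be sketched like this in Python. The step format (parameter name plus allowed commands) and the decision to raise on an unmatched answer are simplifications; the real system would presumably re-prompt.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def longest_prefix(heard: str, commands: List[str]) -> Optional[str]:
    """Longest command that starts the utterance on a word boundary."""
    hits = [c for c in commands
            if (heard.lower() + " ").startswith(c.lower() + " ")]
    return max(hits, key=len) if hits else None


def run_script(steps: List[Tuple[str, List[str]]],
               utterances: Iterable[str]) -> Dict[str, str]:
    """Present each step in turn; return the accumulated parameter values."""
    answers = iter(utterances)
    params = {}
    overflow = ""
    for parameter_name, commands in steps:
        # Use leftover speech from the previous step before asking again.
        heard = overflow or next(answers)
        command = longest_prefix(heard, commands)
        if command is None:
            raise ValueError(f"no command matches {heard!r}")
        params[parameter_name] = command
        overflow = heard[len(command):].strip()
    return params


SCORE_STEPS = [
    ("Function", ["score", "rank", "schedule"]),
    ("Team", ["New York", "New York Yankees", "Boston Red Sox"]),
]
```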
VUI
Other Dialog Routines

Components are also involved in routines to
build grammars acceptable to Microsoft’s
SAPI interface
– The dialog structure is descended recursively,
with all dependent grammars being included in
each prompt’s grammar
– Global commands are also created and added to
grammars
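The recursive descent can be illustrated with a plain list of phrases. The Python sketch below does not produce SAPI grammar syntax; it only shows how each step's phrase set can include the phrases of later steps plus global commands. `GLOBAL_COMMANDS` holds just the two commands named in the slides.

```python
from typing import List

GLOBAL_COMMANDS = ["back", "quit"]


def build_phrases(steps: List[List[str]], index: int = 0) -> List[str]:
    """Phrases accepted at a step: its commands, alone or followed by later steps."""
    own = steps[index]
    if index + 1 == len(steps):
        return list(own)
    # Recurse into the dependent (later) steps to allow shortcuts.
    later = build_phrases(steps, index + 1)
    return list(own) + [f"{c} {rest}" for c in own for rest in later]


def grammar_for(steps: List[List[str]], index: int) -> List[str]:
    """Step phrases plus the global commands available everywhere."""
    return build_phrases(steps, index) + GLOBAL_COMMANDS
```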
VUI
Queries

All query parameters are accumulated in an
XML document
 When a query occurs, the document is sent
to the server
 The server returns an XML document
containing results
 The results are read to the user based on
administrator-defined result strings
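The XML round trip can be sketched with Python's standard `xml.etree.ElementTree` (the project itself used IBM's XML4J). The element names `query` and `results` are assumptions; the slides do not show the actual document schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_query(params: Dict[str, str]) -> str:
    """Accumulate query parameters into an XML document string."""
    root = ET.Element("query")
    for name, value in params.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def read_results(xml_text: str) -> Dict[str, str]:
    """Turn the server's XML reply into a flat name-to-value mapping."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```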
Why XML?

XML is fast becoming the industry standard
for data transfer over the Internet
 XML’s hierarchical structure lends itself to
this application
 Several XML parsers already exist for
various platforms (we used IBM’s XML4J)
 The HTML-like nature of XML makes
results easy to read, even for a human.
How Query Results Are Read

The administrator defines parameter-value
pairs as criteria for which response is read
 Each response consists of segments of
literal text along with parameter values
(which can be drawn either from the client
or server)
Query Results Example

Criteria
– Function = “Score”
– Team = “Yankees”

To be read
– “The Yankees score was”
– <TeamA>
– <ScoreA>
– “To”
– <TeamB>
– <ScoreB>
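Selecting and reading a result string can be sketched as follows. This Python illustration assumes a response is a list of segments in which `<Name>` marks a parameter value and anything else is literal text; the data layout and the `render` function are hypothetical.

```python
from typing import Dict, List, Optional

RESPONSES = [
    {
        "criteria": {"Function": "Score", "Team": "Yankees"},
        "segments": ["The Yankees score was", "<TeamA>", "<ScoreA>",
                     "to", "<TeamB>", "<ScoreB>"],
    },
]


def render(responses: List[dict], client_params: Dict[str, str],
           server_results: Dict[str, str]) -> Optional[str]:
    """Return the text to read for the first response whose criteria all match."""
    # Parameter values may come from either the client or the server.
    values = {**client_params, **server_results}
    for response in responses:
        criteria = response["criteria"]
        if all(values.get(name) == wanted for name, wanted in criteria.items()):
            parts = []
            for segment in response["segments"]:
                if segment.startswith("<") and segment.endswith(">"):
                    parts.append(str(values[segment[1:-1]]))
                else:
                    parts.append(segment)
            return " ".join(parts)
    return None
```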
VUI
The Results

The front-end is very customizable
 Dialogs can be built simply and quickly
 The system administrator needs no
knowledge of programming concepts
 The overall behavior of the system could be
changed without changing each prompt
 The computer speech engine is accessed in
only one area of code, so it could be
swapped with minimal effort
Dialog Structure

The Dialog System consists of:
– Prompts
– Responses
– Help System
 All Dialogs are tapered (Prompts, Responses, & Help)
 Repair Dialogs – Example: Two teams from the same city
(New York: Mets and Yankees)
Dialog Structure Overview
[Diagram: Main Menu branching on “Score”, “Rank”, and “Schedule” to Score Info., Ranking Info., and Scheduling Info., with an additional Info by League node; HELP is attached to the menu and information nodes.]
Summary

We not only developed a powerful Speech
Recognition Application for Retrieval of Sports
Information, we also developed a reusable
framework which can be easily modified for use in
other applications.

We incorporated Ease of Use Techniques
including: Tapered Prompts, Global Commands,
Barge-In, Repair Dialogs, and others.