Sports Scores Speech Recognition System Major League Baseball


Sports Scores Speech
Recognition System
Major League Baseball Score System
Development Team Members

Dan Corkum
Jason NguyenTrieu
Dan Ragland
Quang Vu
Andrew Wagner
(Director)
(Producer)
Sponsor: Jim Larson, Intel Corporation
Goals & Objectives

Develop a compelling Speech Recognition
Application for Retrieval of Sports
Information.
 Incorporate Ease of Use Techniques
including: Tapered Prompts, Global
Commands, Barge-In, Repair Dialogs, and
others.
 Develop an Architecture that is both Robust
and Modular. Design for Reuse.
Example Application
Cellular Phone Application
– Using Wireless Web
– Embedded Windows CE (Auto PC)
Core Modules

“Web Viking” – Parse Internet Web Pages to retrieve
sports information.

Data Warehousing & Querying – Database for
storage of searchable information.

Client and Server Communication – Enables
communication between Server and remote Clients.

VUI (Voice User Interface) Voice Prompts
and Response System – The core engine that
controls the entire VUI.

Dialog Database – Contains the content for the text-to-speech prompts and response criteria.
Architecture - Server
Architecture - Client
Web Viking

The purpose of the Web Viking is to retrieve data
from web sites, then parse it and convert it into a
format that the database interface can understand.
 There are three data collection scripts: Schedule,
Scores, and Standing/Ranking
 The data comes from 2 sources:
– Major League Baseball
– ESPN

Two chances to get the right data:
– First, we get data from the MLB web site and parse it.
If that fails for any reason, we fall back to the
ESPN web site.
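
The primary-then-fallback idea can be sketched as follows. This is a minimal illustration in Python, not the team's Perl scripts; `fetch_with_fallback`, `fetch_from_mlb`, and `fetch_from_espn` are hypothetical names, and the two fetchers are stand-ins that simulate a failure and a success with made-up scores.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def fetch_with_fallback(
    sources: Sequence[Tuple[str, Callable[[], Dict[str, int]]]]
) -> Tuple[str, Dict[str, int]]:
    """Try each (name, fetcher) in order; return the first usable result."""
    errors: List[str] = []
    for name, fetcher in sources:
        try:
            data = fetcher()
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
            continue
        if data:  # an empty result counts as a failure too
            return name, data
        errors.append(f"{name}: no data")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real download-and-parse scripts.
def fetch_from_mlb() -> Dict[str, int]:
    raise ValueError("page layout changed")


def fetch_from_espn() -> Dict[str, int]:
    return {"Mets": 5, "Braves": 3}  # made-up example data


source, scores = fetch_with_fallback(
    [("MLB", fetch_from_mlb), ("ESPN", fetch_from_espn)]
)
print(source, scores)
```

Keeping the sources in an ordered list means a third site could be added without touching the retry logic.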
Web Viking

How is the data retrieved?
We used library modules available from CPAN
(the Comprehensive Perl Archive Network):
– The HTTP::Request module packages up the URL
request.
– The HTTP::Response module handles the data coming
back.
How the data is parsed:
1. Match and strip off unnecessary data.
2. Apply regular expressions to pick out the fields of interest.
3. Split the matched text into individual values.
4. Format the data and check the result.
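
The four parsing steps can be illustrated with a small sketch. It is written in Python rather than the project's Perl, and the page fragment, the `<pre>` wrapper, and the "Team N, Team N" line shape are all assumptions made up for the example; the real MLB and ESPN pages are not described in this presentation.

```python
import re

# Made-up fragment standing in for a downloaded scores page.
RAW_PAGE = """
<html><body><h1>Scores</h1>
<pre>
Mets 5, Braves 3
Yankees 2, Red Sox 7
</pre>
<p>Advertisement</p></body></html>
"""


def parse_scores(page):
    # Step 1: match the block of interest and strip everything around it.
    block = re.search(r"<pre>(.*?)</pre>", page, re.DOTALL)
    if block is None:
        raise ValueError("score block not found")
    # Step 2: a regular expression picks out lines shaped like "Team N, Team N".
    lines = re.findall(r"^.+ \d+, .+ \d+$", block.group(1), re.MULTILINE)
    games = []
    for line in lines:
        # Step 3: split each line into two halves, then into team and runs.
        first, second = line.split(", ")
        team1, runs1 = first.rsplit(" ", 1)
        team2, runs2 = second.rsplit(" ", 1)
        # Step 4: format the fields for the database interface.
        games.append((team1, int(runs1), team2, int(runs2)))
    # Step 4 (continued): check the result before handing it on.
    if not games:
        raise ValueError("no games parsed")
    return games


games = parse_scores(RAW_PAGE)
print(games)
```

Raising an error when nothing is parsed gives the caller a clear signal to try the second data source.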
Database & Queries

The Database was implemented using MS Access.
 It functions as a storage site keeping track of team
names, scores associated with each team,
league/division ranking information, and the
schedules for each game.

The Database Handler was written in Java.
 Its primary purpose is to query the database and
return the results to the Sports Score server.
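
As an illustration of what such a handler does, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MS Access and the Java handler. The `scores` table, its columns, the `latest_score` function, and the rows are hypothetical; the actual schema is not given in this presentation.

```python
import sqlite3

# In-memory database standing in for the MS Access file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores ("
    "game_date TEXT, team TEXT, opponent TEXT, runs INTEGER, opp_runs INTEGER)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?, ?)",
    [
        ("2000-05-01", "Mets", "Braves", 2, 4),  # made-up rows
        ("2000-05-02", "Mets", "Braves", 5, 3),
    ],
)


def latest_score(db, team):
    """Return the most recent game row for a team, or None if there is none."""
    return db.execute(
        "SELECT game_date, team, opponent, runs, opp_runs "
        "FROM scores WHERE team = ? "
        "ORDER BY game_date DESC LIMIT 1",
        (team,),  # parameter binding keeps user input out of the SQL text
    ).fetchone()


print(latest_score(conn, "Mets"))
```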
Client & Server Communication

Because this is an Internet-based application, the
server is designed to support multiple clients
simultaneously.
 Communication is implemented using TCP
(Transmission Control Protocol), a reliable,
connection-oriented, and widely used Internet protocol.
 The maximum number of clients supported by the
Sports Score server is administrator-configurable
based on the performance needs of the server.
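
A multi-client TCP server with a configurable client cap might look like the sketch below. It is a Python illustration under stated assumptions, not the project's server: `MAX_CLIENTS`, `serve`, and `handle_client` are hypothetical names, the "ACK" reply is invented for the demo, and refusing connections over the limit is one possible policy among several.

```python
import socket
import threading

MAX_CLIENTS = 4  # hypothetical administrator-configurable limit


def handle_client(conn, slots):
    """Serve one client, then free its slot."""
    try:
        with conn:
            data = conn.recv(1024)
            conn.sendall(b"ACK " + data)
    finally:
        slots.release()


def serve(listener, max_clients):
    """Accept clients, one thread each, up to max_clients at a time."""
    slots = threading.BoundedSemaphore(max_clients)
    while True:
        conn, _addr = listener.accept()
        if not slots.acquire(blocking=False):
            conn.close()  # over the limit: refuse this client
            continue
        threading.Thread(
            target=handle_client, args=(conn, slots), daemon=True
        ).start()


# Demo on the loopback interface; port 0 lets the OS pick a free port.
listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]
threading.Thread(target=serve, args=(listener, MAX_CLIENTS), daemon=True).start()

with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"ping")
    reply = client.recv(1024)
print(reply)
```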
Client & Server Communication

Both server and client-side communications are
data independent.
 Data is encapsulated in a packet before
transmission. The packet wrapper records the type
of the encapsulated data and its size.
 Packetizing data allows for multiple information
types (ping, data request, communications
termination, etc.).
 Labeling each packet with a type allows for quick
identification and routing of information to the
necessary destinations within the server or client.
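
One way to picture the type-and-size wrapper is the sketch below. The wire layout (a 1-byte type followed by a 4-byte big-endian size) and the numeric type codes are assumptions chosen for the example; the project's actual packet format is not specified here.

```python
import struct
from enum import IntEnum


class PacketType(IntEnum):
    # Hypothetical codes for the packet types named above.
    PING = 1
    DATA_REQUEST = 2
    TERMINATE = 3


# Assumed header: 1-byte type, then 4-byte payload size, network byte order.
HEADER = struct.Struct("!BI")


def pack_packet(ptype, payload=b""):
    """Wrap a payload with its type and size."""
    return HEADER.pack(ptype, len(payload)) + payload


def unpack_packet(buffer):
    """Return (type, payload, remaining bytes) for the first packet in buffer."""
    if len(buffer) < HEADER.size:
        raise ValueError("incomplete header")
    ptype, size = HEADER.unpack_from(buffer)
    end = HEADER.size + size
    if len(buffer) < end:
        raise ValueError("incomplete payload")
    return PacketType(ptype), buffer[HEADER.size:end], buffer[end:]


# Two packets back to back, as they might arrive on a TCP stream.
stream = pack_packet(PacketType.PING) + pack_packet(
    PacketType.DATA_REQUEST, b"<query/>"
)
first_type, first_payload, rest = unpack_packet(stream)
second_type, second_payload, rest = unpack_packet(rest)
print(first_type.name, second_type.name, second_payload)
```

Because the size travels with the packet, the receiver can split a continuous TCP byte stream into packets without knowing anything about the payload contents.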
VUI (Voice User Interface)
Voice Prompts and Response System
User Interface and Underlying Logic
VUI
Design Considerations
Two Options For Design:
1. Dialog logic hard-coded directly in the program.
2. Dialog logic entered into a data structure
and presented by separate internal logic.
VUI
Advantages & Disadvantages of Hard-Coded Dialogs

Advantages:
– Fast initial implementation
– Ultimate flexibility of features

Disadvantages:
– Duplicated code
– Difficult to provide consistent global functionality
– Hard-coded grammars
VUI
Advantages & Disadvantages of Dialog Database

Advantages:
– Good design: data separated from presentation
– Consolidation of code
– Easy to create and maintain dialogs
– Features aided by use of recursion
– Computer-generated grammars

Disadvantages:
– Much work required before any results seen
– Difficult to customize specific components
VUI
Decision: Dialog Database

Sports Score dialogs all follow the same
basic pattern
 Implementation could be modularized by
separating the dialogs from their
presentation logic
 The gains made by the ease of entry and
flexibility for the end-user outweighed the
losses in implementation time
 Some features require recursion
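
The separation of dialogs from presentation logic can be sketched as a table of dialog nodes driven by one generic function. Everything here is hypothetical (the `DIALOGS` table, its field names, the prompt wording, and the `step` function); it only illustrates why global commands such as "back" and "help" need to be written once when dialogs are data.

```python
# Hypothetical dialog table: each node is data, not code.
DIALOGS = {
    "main": {
        "prompt": "Main menu. Say score, rank, or schedule.",
        "help": "You can ask for scores, rankings, or schedules.",
        "next": {"score": "score", "rank": "rank", "schedule": "schedule"},
    },
    "score": {
        "prompt": "Score for which team?",
        "help": "Say a team name.",
        "next": {},
    },
    "rank": {
        "prompt": "Ranking for which league?",
        "help": "Say a league name.",
        "next": {},
    },
    "schedule": {
        "prompt": "Schedule for which team?",
        "help": "Say a team name.",
        "next": {},
    },
}


def step(history, utterance):
    """Advance the dialog one turn and return the text to speak.

    history is the list of visited node names, newest last.
    """
    node = DIALOGS[history[-1]]
    word = utterance.strip().lower()
    # Global commands are handled here once, for every dialog node.
    if word == "quit":
        return "Goodbye."
    if word == "help":
        return node["help"]
    if word == "back":
        if len(history) > 1:
            history.pop()
        return DIALOGS[history[-1]]["prompt"]
    target = node["next"].get(word)
    if target is None:
        return "Sorry, I did not understand. " + node["prompt"]
    history.append(target)
    return DIALOGS[target]["prompt"]


history = ["main"]
print(step(history, "score"))
print(step(history, "help"))
print(step(history, "back"))
```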
VUI
VUI Features

Tapered, User-Level Sensitive Prompts
Tapered, User-Level Sensitive Help
Barge-In capability
User shortcut capability (users can answer future prompts from any prompt)
Navigational user commands (“back”, “quit”, etc.)
Enumerated user commands to allow the user to say a number as an alternative to the command
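
Two of these features lend themselves to a short sketch: tapered prompts and enumerated commands. The code assumes that "tapered" means the prompt gets shorter the more often the user has heard it; the prompt wording, the visit counter, and the function names are hypothetical.

```python
# Hypothetical prompt set for one menu, from most to least verbose.
MAIN_MENU_PROMPTS = [
    "Welcome to the main menu. You can say score, rank, or schedule, "
    "or say the number of your choice.",
    "Main menu. Say score, rank, or schedule.",
    "Main menu?",
]

COMMANDS = ["score", "rank", "schedule"]
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}


def tapered_prompt(prompts, visits):
    """Pick a shorter prompt the more often the user has heard it."""
    return prompts[min(visits, len(prompts) - 1)]


def resolve_command(utterance, commands=COMMANDS):
    """Map a spoken command, or its number, to the command name."""
    word = utterance.strip().lower()
    if word in commands:
        return word
    number = NUMBER_WORDS.get(word)
    if number is None and word.isdigit():
        number = int(word)
    if number is not None and 1 <= number <= len(commands):
        return commands[number - 1]
    return None  # not recognized; the caller would re-prompt


for visits in range(3):
    print(tapered_prompt(MAIN_MENU_PROMPTS, visits))
print(resolve_command("two"))
```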
VUI
Queries

All query parameters are accumulated in an
XML document
 When a query occurs, the document is sent
to the server
 The server returns an XML document
containing results
 The results are read to the user based on
administrator-defined result strings
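
The round trip can be sketched with Python's standard xml.etree.ElementTree module (the project itself used IBM's XML4J from Java). The element and attribute names (`query`, `results`, `param`, `name`) and the sample reply are assumptions for illustration; the project's actual document layout is not shown in this presentation.

```python
import xml.etree.ElementTree as ET


def build_query(params):
    """Accumulate query parameters into one XML document."""
    root = ET.Element("query")
    for name, value in params.items():
        ET.SubElement(root, "param", name=name).text = value
    return ET.tostring(root, encoding="unicode")


def read_params(xml_text):
    """Read every <param name="...">value</param> into a dict."""
    root = ET.fromstring(xml_text)
    return {p.get("name"): p.text for p in root.iter("param")}


# Parameters gathered over several prompts, then sent as one document.
query_xml = build_query({"type": "score", "team": "Mets"})

# Made-up reply standing in for what the server might send back.
reply_xml = (
    "<results>"
    '<param name="team">Mets</param>'
    '<param name="runs">5</param>'
    "</results>"
)

print(query_xml)
print(read_params(reply_xml))
```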
Why XML?

XML is fast becoming the industry standard
for data transfer over the Internet
 XML’s hierarchical structure lends itself to
this application
 Several XML parsers already exist for
various platforms (we used IBM’s XML4J)
 The HTML-like nature of XML makes
results easy to read, even for a human.
How Query Results Are Read

The administrator defines parameter-value
pairs as the criteria that select which response is read
 Each response consists of segments of
literal text along with parameter values
(which can be drawn either from the client
or from the server)
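
A minimal sketch of this selection scheme follows. The `RESPONSES` table, the parameter names (`status`, `winner`, and so on), and the sentences are hypothetical; the sketch only shows criteria matching followed by assembly of literal segments and parameter values taken from the client or the server.

```python
# Hypothetical administrator-defined responses. A segment is either literal
# text or a (source, parameter) pair naming where the value comes from.
RESPONSES = [
    {
        "criteria": {"status": "final"},
        "segments": [
            "The ", ("server", "winner"), " beat the ", ("server", "loser"),
            ", ", ("server", "winner_runs"), " to ", ("server", "loser_runs"),
            ".",
        ],
    },
    {
        "criteria": {"status": "none"},
        "segments": ["The ", ("client", "team"), " did not play."],
    },
]


def render(client, server):
    """Pick the first response whose criteria all match, then assemble it."""
    merged = {**client, **server}
    for response in RESPONSES:
        criteria = response["criteria"]
        if all(merged.get(key) == value for key, value in criteria.items()):
            parts = []
            for segment in response["segments"]:
                if isinstance(segment, str):
                    parts.append(segment)  # literal text
                else:
                    source, name = segment
                    values = client if source == "client" else server
                    parts.append(str(values[name]))
            return "".join(parts)
    return None  # no response matched


print(render(
    {"team": "Mets"},
    {"status": "final", "winner": "Mets", "loser": "Braves",
     "winner_runs": "5", "loser_runs": "3"},
))
```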
VUI
The Results

The front-end is very customizable
 Dialogs can be built simply and quickly
 The system administrator needs no
knowledge of programming concepts
 The overall behavior of the system could be
changed without changing each prompt
 The computer speech engine is accessed in
only one area of code, so it could be
swapped with minimal effort
Dialog Structure

The Dialog System consists of:
– Prompts
– Responses
– Help System
 All Dialogs are tapered (Prompts, Responses, & Help)
 Repair Dialogs – Example: two teams from the same city
(New York: Mets and Yankees)
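
The New York case can be sketched as a small disambiguation step. The lookup table, the wording of the follow-up question, and the `ask` callback (which stands in for a prompt-and-recognize round trip) are hypothetical.

```python
# Small lookup table; only New York is ambiguous in this example.
TEAMS_BY_CITY = {
    "new york": ["Mets", "Yankees"],
    "atlanta": ["Braves"],
}


def resolve_team(city, ask):
    """Return a single team for a city, asking a repair question if needed.

    ask is a callable that speaks a question and returns the user's answer.
    """
    teams = TEAMS_BY_CITY.get(city.strip().lower(), [])
    if not teams:
        return None
    if len(teams) == 1:
        return teams[0]
    # Ambiguous: run the repair dialog instead of guessing.
    answer = ask("Did you mean the " + " or the ".join(teams) + "?")
    for team in teams:
        if team.lower() == answer.strip().lower():
            return team
    return None


questions = []


def scripted_user(question):
    """Stand-in for the speech engine: records the question, answers 'Yankees'."""
    questions.append(question)
    return "Yankees"


print(resolve_team("New York", scripted_user))
print(questions)
```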
Dialog Structure Overview
[Diagram: dialog tree. Node labels: Main Menu, Score Info., Ranking Info., Scheduling Info., Info by League. Branch labels: Score, Rank, Schedule. HELP appears at several nodes.]
Summary

We not only developed a powerful Speech
Recognition Application for Retrieval of Sports
Information, but also developed a reusable
framework that can be easily modified for use in
other applications.

We incorporated Ease of Use Techniques
including: Tapered Prompts, Global Commands,
Barge-In, Repair Dialogs, and others.
More Information is available on the Web:
http://www.cs.pdx.edu/~danr/public/capstone/