VoiceXML implementation

VoxBuilder vs. OpenVXI
VoiceXML Overview
– Provides an automated user interface to web-like content via telephone or VoIP
– Uses synthesized speech or pre-recorded audio as output
– Recognition of spoken words and DTMF keys serves as input
– Control mechanisms allow for serial access to data
“Hello World” Example
<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <block>Hello World!</block>
  </form>
</vxml>
Form tag and child tags
● Like a paper form, each field must be filled out before accessing the next
● Field, child of form, supplies a location for user response
● Grammar, child of field, specifies expected audio response
● Prompt, child of field, asks user for input
● Filled, child of field, executes when user input matches the grammar
Form Example
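This form prompts for a number between 0 and 9999 and repeats it back to the user.
<form id="hello_form">
  <field name="first">
    <!-- Expected response: a number from 0 to 9999 -->
    <grammar>
      NATURAL_NUMBER_THRU_9999
    </grammar>
    <!-- Asks the user for input -->
    <prompt>
      <audio>Say a number.</audio>
    </prompt>
    <!-- Executes when the user's input matches the grammar -->
    <filled>
      <audio> You said </audio>
      <audio><value expr="first"/></audio>
    </filled>
  </field>
</form>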
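The grammar name used in the form above is not defined by the VoiceXML specification itself. As a hedged sketch, the same dialog can be written against VoiceXML 2.0 with the built-in `number` field type, which the specification describes as accepting both spoken numbers and DTMF digits. Platform support for built-in types is optional, and unlike the original grammar this type does not restrict the range to 0 through 9999. The form id and field name are carried over from the example; the prompt wording is an assumption.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="hello_form">
    <!-- type="number" asks the platform for its built-in number grammar -->
    <field name="first" type="number">
      <prompt>Say a number, or enter it on the keypad.</prompt>
      <!-- Runs once the caller's input matches the grammar -->
      <filled>
        <prompt>You said <value expr="first"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Plain text inside `<prompt>` is rendered by the speech synthesizer, so no pre-recorded audio file is needed for this variant.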
System Overview
(Diagram; only the component labels survive)
● Web Server
● HTML and VXML Content
● IP Network
● VoiceXML Interpreter
● HTML Scraper
● Synthesis & Recognition
● Telephony Services
OpenVXI
● Advantages
  – Open source, for easy modification and verification of code
  – Dedicated use of a server may allow for better performance
  – Runs on your own server, allowing explicit and direct control
    ● Allows the user's choice of speech synthesis and voice recognition packages, telephony integration, and operating system
OpenVXI
● Other notes
  – Tested with a number of telephony and speech APIs, including JTAPI, TAPI, JSAPI, SAPI, and Sphinx III
  – Expects Natural Language Semantics Markup Language (NLSML) as a recognition response
  – Operates on VXML v1.0 and v2.0
  – Does not explicitly require certain telephony API calls, such as the call for the source number
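To make the NLSML point concrete, here is a rough sketch of what a recognition result for the utterance "forty two" might look like. The element and attribute names follow the examples in the W3C NLSML working draft as best recalled; the grammar URI, confidence value, and the `first` field name are hypothetical, and the exact shape that OpenVXI accepts should be checked against its own documentation.

```xml
<?xml version="1.0"?>
<!-- Hypothetical recognizer output; all values are made up for illustration -->
<result xmlns:xf="http://www.w3.org/2000/xforms"
        grammar="http://example.com/number.grxml">
  <interpretation confidence="90">
    <!-- What the caller said, as recognized -->
    <input mode="speech">forty two</input>
    <!-- Semantic value that would fill the VoiceXML field named "first" -->
    <xf:instance>
      <first>42</first>
    </xf:instance>
  </interpretation>
</result>
```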
OpenVXI
● Performance
  – Use of a separate server for speech synthesis and recognition speeds processing
  – Caching any remote documents and scripts can improve response in VXML parsing
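As an illustration of the caching point, VoiceXML 2.0 lets a document declare how old a cached copy may be through the `documentmaxage` and `scriptmaxage` properties. The sketch below is an assumption about how an application might set them: the values and the script URL are hypothetical, and how an interpreter such as OpenVXI honors them depends on its cache configuration.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Accept cached VXML documents up to 5 minutes old (value in seconds) -->
  <property name="documentmaxage" value="300"/>
  <!-- Assumed: scripts change rarely, so allow cached copies up to 1 hour old -->
  <property name="scriptmaxage" value="3600"/>
  <!-- Hypothetical remote script, subject to the cache policy above -->
  <script src="http://example.com/helpers.js"/>
  <form>
    <block>Hello World!</block>
  </form>
</vxml>
```

Longer maximum ages reduce fetches over the IP network at the cost of serving content that may be slightly out of date.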
VoxBuilder
● Remotely hosted, Internet-based VoiceXML service provider
  – Provides all required equipment and software for adding VoiceXML to an existing Web presence
  – Offers hosting for VXML, scripts, and audio; these may optionally be hosted remotely instead
  – Provides a Web interface for all development and management
  – Multiple project support
VoxBuilder
● Focused on European deployment
  – Access to European phone system
  – Multilingual support
    ● Prompts in one language
    ● Accepts responses in another language
  – 13 different languages supported
  – Integrated phone management for different billing rates
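The idea of prompting in one language and accepting responses in another can be sketched in standard VoiceXML 2.0 with the `xml:lang` attribute on `<prompt>` and `<grammar>`. This is not taken from voxBuilder's documentation: the form id, field name, language pair, and grammar URL are all hypothetical, and the platform must have synthesis and recognition resources for both languages.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-GB">
  <form id="city_form">
    <field name="city">
      <!-- The prompt is spoken in British English -->
      <prompt xml:lang="en-GB">Which city are you calling from?</prompt>
      <!-- Responses are matched against a German grammar (hypothetical file) -->
      <grammar xml:lang="de-DE" src="http://example.com/staedte.grxml"
               type="application/srgs+xml"/>
      <filled>
        <prompt xml:lang="en-GB">Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```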
Comparing voxBuilder and OpenVXI
● For small projects, particularly multi-country, multi-language projects, voxBuilder is the right choice
  ● Easy, quick deployment on a proven platform with little initial cash outlay and minimal effort
● For large corporations interested in high volumes or more precise control, OpenVXI works best
  ● Your choice of text-to-speech and recognition software, and your choice of setup for optimal performance under heavy load conditions
Information Retrieval and VXML
● Broadens IR searching toolkit
  – Typical web browsing involves a large display, which is not practical for phones or a small PDA
  – Viewing a display while driving is not recommended
  – A high-speed connection is needed to download the images on a typical web page
Information Retrieval and VXML
● Adds a layer of difficulty for search engine designers
  – Current voice recognition technology requires either a limited grammar set or a long training period
  – A search that can involve any word (including foreign words) is not possible today
  – Future improvements in voice recognition algorithms could alleviate this problem
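To show what a "limited grammar set" means in practice, the sketch below is a small grammar in the W3C Speech Recognition Grammar Specification (SRGS) XML form. It restricts a voice search to three fixed topics; the rule name and the topic words are hypothetical. A recognizer loaded with this grammar can only return one of the listed items, which is why open-ended search terms are out of reach.

```xml
<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="topic">
  <!-- The caller may say exactly one of these search topics -->
  <rule id="topic" scope="public">
    <one-of>
      <item>weather</item>
      <item>news</item>
      <item>sports</item>
    </one-of>
  </rule>
</grammar>
```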