TeleMorph & TeleTuras:
Bandwidth determined Mobile MultiModal
Presentation
Student: Anthony J. Solon
Supervisors: Prof. Paul Mc Kevitt
Kevin Curran
School of Computing and Intelligent Systems
Faculty of Engineering
University of Ulster, Magee
Aims of Research
To develop an architecture, TeleMorph, that dynamically morphs between output
modalities depending on available network bandwidth:
Mobile device’s output presentation (unimodal/multimodal) depending on
available network bandwidth
network latency and bit error rate
mobile device display, available output abilities, memory, CPU
user modality preferences, cost incurred
user’s cognitive load determined by Cognitive Load Theory (CLT)
Utilise Causal Probabilistic Networks (CPNs) for analysing union of constraints
giving optimal multimodal output presentation
Implement TeleTuras, a tourist information guide for city of Derry
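The bandwidth-driven morphing described in the aims can be illustrated with a minimal sketch. It is not the TeleMorph implementation: the per-modality bandwidth figures and the function name `select_modalities` are assumptions made up for illustration, and a simple greedy rule stands in for the CPN-based decision.

```python
# Hypothetical bandwidth needed per output modality, in kbps.
# These figures are illustrative assumptions, not values from the slides.
MODALITY_COST_KBPS = {
    "text": 2,
    "graphics": 30,
    "speech": 40,
    "animation": 150,
    "video": 300,
}


def select_modalities(available_kbps, preferred):
    """Keep the user's preferred modalities, in order, while they fit the budget."""
    chosen = []
    remaining = available_kbps
    for modality in preferred:
        cost = MODALITY_COST_KBPS[modality]
        if cost <= remaining:
            chosen.append(modality)
            remaining -= cost
    return chosen


if __name__ == "__main__":
    preferences = ["video", "speech", "text"]
    for kbps in (9.6, 56, 384):
        print(kbps, select_modalities(kbps, preferences))
```

With these assumed costs, a 9.6 kbps link falls back to a unimodal text presentation, while a 384 kbps link can carry video, speech and text together.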
Objectives of Research
Receive and interpret questions from user
Map questions to multimodal semantic representation
Match multimodal representation to knowledge base to retrieve answer
Map answers to multimodal semantic representation
Monitor user preference or client side choice variations
Query bandwidth status
Detect client device constraints and limitations
Combine the effect of all constraints imposed on the system using CPNs
Generate optimal multimodal presentation based on bandwidth constraint data
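How a CPN could combine constraints can be sketched with a two-parent network evaluated by enumeration. This is only an illustration: the variables, the probabilities and the function `prob_video_ok` are hypothetical, and the sketch does not use HUGIN or any other CPN tool.

```python
from itertools import product

# Hypothetical prior beliefs about two constraints (assumed values).
P_BANDWIDTH = {"low": 0.6, "high": 0.4}
P_DISPLAY = {"small": 0.7, "large": 0.3}

# Hypothetical conditional table: P(video is suitable | bandwidth, display).
P_VIDEO_OK = {
    ("low", "small"): 0.05,
    ("low", "large"): 0.20,
    ("high", "small"): 0.60,
    ("high", "large"): 0.95,
}


def prob_video_ok(evidence=None):
    """P(video suitable | evidence), summing over the unobserved parent states."""
    evidence = evidence or {}
    numerator = 0.0
    normaliser = 0.0
    for bandwidth, display in product(P_BANDWIDTH, P_DISPLAY):
        # Skip parent states that contradict what has been observed.
        if evidence.get("bandwidth", bandwidth) != bandwidth:
            continue
        if evidence.get("display", display) != display:
            continue
        weight = P_BANDWIDTH[bandwidth] * P_DISPLAY[display]
        numerator += weight * P_VIDEO_OK[(bandwidth, display)]
        normaliser += weight
    return numerator / normaliser


if __name__ == "__main__":
    print(round(prob_video_ok(), 3))
    print(round(prob_video_ok({"bandwidth": "high"}), 3))
```

Observing a constraint, such as a high-bandwidth reading, updates the belief that a modality is suitable. The same pattern extends to further constraint nodes such as latency, cost or cognitive load.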
Wireless Telecommunications
Generations of Mobile networks:
1G - Analog voice service with no data services
2G - Circuit-based digital networks, capable of data transmission speeds
averaging around 9.6 Kbps
2.5G (GPRS) - Technology upgrades to 2G, boosting data transmission
speeds to around 56 Kbps. Allows packet-based “always on” connectivity
3G (UMTS) - Digital multimedia, different infrastructure required, data
transmission speeds from 144 Kbps through 384 Kbps to 2 Mbps
4G - IP-based mobile/wireless networks, Wireless Personal Area Networks
(PANs), ‘anywhere and anytime’ ubiquitous services. Speeds up to 100 Mbps
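To show what these rates mean for a presentation, the sketch below computes the ideal transfer time of a media file at each nominal rate. The 100 KB file size is an assumed example. The calculation ignores latency, protocol overhead and bit errors, and takes 1 KB as 1000 bytes and 1 Kbps as 1000 bits per second.

```python
# Nominal rates taken from the generations listed above, in kilobits per second.
RATES_KBPS = {
    "2G": 9.6,
    "2.5G (GPRS)": 56,
    "3G (UMTS), lowest": 144,
    "3G (UMTS), highest": 2000,
    "4G": 100_000,
}


def transfer_seconds(size_kilobytes, rate_kbps):
    """Ideal transfer time: payload size in kilobits divided by the link rate."""
    return (size_kilobytes * 8) / rate_kbps


if __name__ == "__main__":
    size_kb = 100  # assumed size of one image or short audio clip
    for generation, rate in RATES_KBPS.items():
        print(f"{generation}: {transfer_seconds(size_kb, rate):.2f} s")
```

Under these assumptions the same 100 KB file needs well over a minute on 2G and under half a second at the top 3G rate, which is why the output modality has to follow the available bandwidth.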
Network-adaptive multimedia models:
Transcoding proxies
End-to-end approach
Combination approach
Mobile/Nomadic computing
Active networks
Mobile Intelligent MultiMedia
Systems
SmartKom (Wahlster, 2003)
Mobile, Public, Home/office
Saarbrücken, Germany
Combines speech, gesture and facial expressions on input & output
Integrated trip planning, Internet access, communication applications,
personal organising
VoiceLog (BBN, 2002)
BBN Technologies in Cambridge, Massachusetts
Views/diagrams of military vehicles and direct connection to support
Damage identified & ordering of parts using diagrams
MUST (Almeida et al., 2002)
MUltimodal multilingual information Services for small mobile Terminals
EURESCOM, Heidelberg, Germany
Future multimodal and multilingual services on mobile networks
[Figure: example screen with the prompt “Please select a parking place from the map”]
Intelligent MultiMedia Presentation
Flexibly generate various presentations to meet individual requirements of:
1) users, 2) situations, 3) domains
Intelligent MultiMedia Presentation can be divided into following processes:
determination of communicative intent
content selection
structuring and ordering
allocation to particular media
realisation in specific media
coordination across media
layout design
Key research problems:
Semantic Representation
Fusion, integration & coordination
Semantic representation - represents meaning of media information
Frame-based representations:
- CHAMELEON
- REA
XML-based representations:
- M3L (SmartKom)
- MXML (MUST)
- SMIL
- MPEG-7
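A frame-based representation can be pictured as a typed record with slot/value pairs. The sketch below is a generic illustration and does not reproduce the CHAMELEON or REA frame formats; the class `Frame`, the function `match`, and all slot names and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A minimal frame: a type label plus slot/value pairs."""
    frame_type: str
    slots: dict = field(default_factory=dict)


def match(query, knowledge_base):
    """Return stored frames of the same type that agree with every slot in the query."""
    return [
        fact
        for fact in knowledge_base
        if fact.frame_type == query.frame_type
        and all(fact.slots.get(name) == value for name, value in query.slots.items())
    ]


if __name__ == "__main__":
    # Hypothetical knowledge base entries; "map_tile" values are placeholders.
    kb = [
        Frame("location", {"entity": "Millennium Forum", "map_tile": "tile-07"}),
        Frame("location", {"entity": "Guildhall", "map_tile": "tile-03"}),
    ]
    # A question such as "Where is the Millennium Forum?" mapped to a query frame.
    question = Frame("location", {"entity": "Millennium Forum"})
    print(match(question, kb))
```

The query frame fills only the slots the question mentions; matching returns the stored frame whose remaining slots carry the answer.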
Fusion, integration & coordination of modalities
Integrating different media in a consistent and coherent manner
Multimedia coordination leads to effective integrated multiple media in
output
Synchronising modalities
Time threshold between modalities, e.g. Input: “What building is
this?”, Output: “This is the Millennium Forum”
Not synchronised => side effect can be contradiction
SMIL modality synchronisation and timing elements
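SMIL expresses synchronisation with timing containers: children of a `par` element are presented in parallel, and `begin` and `dur` attributes set their timing. The sketch below builds such a document with Python's standard `xml.etree.ElementTree`. The file names are hypothetical and the document is reduced to the timing elements only (no head, layout or namespace).

```python
import xml.etree.ElementTree as ET


def build_smil(audio_src, image_src, duration_s):
    """Build a minimal SMIL document that plays audio and shows an image together."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    # Children of <par> start together, keeping speech and graphics in step.
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "audio", src=audio_src, begin="0s", dur=f"{duration_s}s")
    ET.SubElement(par, "img", src=image_src, begin="0s", dur=f"{duration_s}s")
    return ET.tostring(smil, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical media files for a spoken answer with a matching picture.
    print(build_smil("forum_answer.wav", "forum_photo.jpg", 4))
```

Replacing `par` with `seq` would present the two media one after the other instead of together.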
Intelligent MultiMedia Presentation
Systems
Automatically generate coordinated intelligent multimedia presentations
User-determined presentation:
COMET (Feiner & McKeown, 1991)
COordinated Multimedia Explanation Testbed
Generates instructions for maintenance and repair of military radio
receiver-transmitters
Coordinates text and 3D graphics of mechanical devices
WIP (Wahlster et al., 1992)
Intelligent multimedia authoring system
presents instructions for assembling/using/maintaining/repairing
devices (e.g. espresso machines, lawn mowers, modems)
IMPROVISE (Zhou & Feiner, 1998)
Graphics generation system
constructive/parameterised graphics generation approaches
Uses an extensible formalism to represent a visual lexicon for graphics
generation
Intelligent MultiMedia
Interfaces & Agents
Intelligent multimedia interfaces
Parse integrated input and generate coordinated output
XTRA
Interface to an expert system providing tax form assistance
Generates & interprets natural language text and pointing gestures
automatically; relies on pre-stored graphics
Displays relevant tax form and natural language input/output panes
Intelligent multimedia agents
Embodied Conversational Agents (e.g. MS Agent, REA)
Natural human face-to-face communication - speech, facial expressions, hand
gestures & body stance
MS Agent
Set of programmable services for interactive presentation
Speech, gesture, audio & text output; speech & haptic input
Project Proposal
Research and implement mobile intelligent multimedia presentation architecture
called TeleMorph
Dynamically generates multimedia presentation determined by bandwidth
available; also other constraints:
Network latency, bit error rate
Mobile device display, available output abilities, memory, CPU
user modality preferences, cost incurred
Cognitive Load Theory (CLT)
Causal Probabilistic Networks (CPNs) for analysing union of constraints giving
optimal multimodal output presentation
Implement TeleTuras, a tourist information guide for city of Derry providing
testbed for TeleMorph incorporating:
route planning, maps, spoken presentations, graphics of points of interest &
animations
Output modalities used & effectiveness of communication
TeleTuras examples:
“Where is the Millennium Forum?”
“How do I get to the Guildhall?”
“What buildings are of interest in this area?”
“Is there a Chinese restaurant in this area?”
Architecture of TeleMorph
Data flow of TeleMorph
High level :
Media Analysis :
Comparison of Mobile Intelligent
MultiMedia Systems
Comparison of Intelligent
MultiMedia Systems
Software Analysis
Client output:
SMIL media player (InterObject)
Java Speech API Markup Language (JSML)
Autonomous agent (MSAgent)
Client input:
Java Speech API Grammar Format (JSGF)
J2ME graphics APIs
J2ME networking
Client device status:
SysInfo MIDlet - (type/memory/screen/protocols/input abilities/CPU speed)
TeleMorph server tools:
SMIL & MPEG-7
HUGIN (CPNs)
JATLite/OAA
Project Schedule
Conclusion
A Mobile Intelligent MultiModal Presentation Architecture called TeleMorph will
be developed
Dynamically morphing between output modalities depending on available
network bandwidth in conjunction with other relevant constraints
CPNs for analysing union of constraints giving optimal multimodal output
presentation
TeleTuras will be used as testbed for TeleMorph
Corpora of questions to test TeleTuras (prospective users/tourists)