What’s new in Wideband Audio?

Wideband Audio
• VoIP is indeed a disruptive technology, but has
it changed the life of the average consumer?
– Cost?
– Quality?
– Features?
• Wideband Audio codecs and improved handling
of music could soon change this dynamic
• Let’s discuss
– Technology behind the codecs
– Real-world implementations
Telecom Audio Spectrum
• Human voice: 80 Hz to 14,000 Hz
• Narrowband: 8 kHz sampling (300-3400 Hz bandwidth)
– Used in PSTN, mostly intelligible
• Wideband: 16 kHz sampling (50-7000 Hz bandwidth)
– Used in VoIP
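The link between the sampling rates and the bandwidths above is the Nyquist limit: a sampled signal can only represent frequencies up to half its sampling rate. The sketch below is illustrative only; the helper names `nyquist_hz` and `band_fits` are made up for this example, and the band edges are the figures quoted on the slide.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a sampled signal can represent (half the sampling rate)."""
    return sample_rate_hz / 2.0


def band_fits(sample_rate_hz: float, upper_edge_hz: float) -> bool:
    """True when the band's upper edge lies at or below the Nyquist limit."""
    return upper_edge_hz <= nyquist_hz(sample_rate_hz)


# Figures from the slide: (sampling rate, lower edge, upper edge), all in Hz.
BANDS = {
    "narrowband": (8000, 300, 3400),
    "wideband": (16000, 50, 7000),
}

for name, (rate, low, high) in BANDS.items():
    print(f"{name}: Nyquist limit {nyquist_hz(rate):.0f} Hz, "
          f"passband {high - low} Hz wide, fits: {band_fits(rate, high)}")

# The 7000 Hz wideband edge cannot be carried at the 8 kHz narrowband rate.
print("7000 Hz at 8 kHz sampling:", band_fits(8000, 7000))
```

Both bands stay below their Nyquist limit, which leaves room for the anti-aliasing filter roll-off.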
Wideband Audio?
• Captures significantly more speech information
– Significant improvement in speech quality over
traditional PSTN
– Improved naturalness & presence below 200 Hz
– Increased intelligibility above 3,400 Hz
• Improves user experience and satisfaction
– New applications – voice recognition
– Customer retention
– Fewer misunderstandings
Wideband Enablers
• Telecom was about minimizing transport cost
– Now about differentiation and enhancing the user experience
• Access bandwidth was limited
– Broadband access now a reality: high bandwidth delivered at
low cost
• 1-10 Mbit/s
• Cost of WB is similar to NB at 64 kbit/s
• Endpoints and Network were not wideband capable
Now:
– VoIP, Wideband DECT, Skype, Microsoft OCS
– Wireless deployments: wideband, music codecs
– Private / corporate networks, Tandem Free Operation (TFO),
Wideband extension, Wideband SLICs
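The "similar cost at 64 kbit/s" point can be checked with simple arithmetic. The sketch below compares G.711 (narrowband PCM, 8000 samples/s at 8 bits) with the 64 kbit/s mode of G.722 (wideband). Picking these two codecs and a 20 ms packet interval is an assumption made for illustration, and the helper names are hypothetical.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of a plain PCM stream: one fixed-size word per sample."""
    return sample_rate_hz * bits_per_sample


def payload_bytes(bitrate_bps: int, packet_ms: int) -> int:
    """Codec payload carried by one packet, rounded up to whole bytes."""
    bit_ms = bitrate_bps * packet_ms   # bits per second times milliseconds
    return -(-bit_ms // 8000)          # /1000 for ms, /8 for bytes, ceiling


PACKET_MS = 20  # assumed packetization interval, not taken from the slides

CODECS = {
    "G.711 (narrowband)": pcm_bitrate_bps(8000, 8),  # 64000 bit/s
    "G.722 (wideband, 64 kbit/s mode)": 64000,
}

for name, bitrate in CODECS.items():
    print(f"{name}: {bitrate} bit/s, "
          f"{payload_bytes(bitrate, PACKET_MS)} payload bytes per {PACKET_MS} ms packet")
```

Under these assumptions both streams carry the same payload per packet, so the wideband call costs the transport network no more than the narrowband one.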
The Technology
Lossy Codec Classes
• Speech communication codecs (G.72x, AMR, et al.)
– Designed for “real-time” speech, music handled poorly
– Low sampling rate (8-16 kHz), low fidelity
– Low-medium delay (10-30 ms)
– Mostly time-domain (CELP is the most popular)
• Music codecs (MP3, AAC, Vorbis)
– Can encode any signal (not optimal for speech) – designed for
entertainment
– Up to 48 kHz sampling rate (full bandwidth), high fidelity (“CD quality”)
– High delay (>100 ms)
– Mostly frequency domain (MDCT-based)
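To make "MDCT-based" concrete, here is a minimal, unoptimized sketch of the transform with a sine window and 50% overlap-add. It is not taken from any of the codecs named above; the function names, the frame size and the test signal are assumptions chosen for illustration, and real codecs use fast algorithms and quantize the coefficients between analysis and synthesis.

```python
import math


def sine_window(half):
    """Sine window of length 2*half; satisfies w[n]^2 + w[n+half]^2 = 1."""
    size = 2 * half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def _kernel(n, k, half):
    """Cosine basis shared by the forward and inverse transforms."""
    return math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))


def mdct(frame):
    """Forward MDCT: 2*half windowed samples in, half coefficients out."""
    half = len(frame) // 2
    return [sum(x * _kernel(n, k, half) for n, x in enumerate(frame))
            for k in range(half)]


def imdct(coeffs):
    """Inverse MDCT: half coefficients in, 2*half time-aliased samples out."""
    half = len(coeffs)
    return [2.0 / half * sum(c * _kernel(n, k, half) for k, c in enumerate(coeffs))
            for n in range(2 * half)]


def roundtrip(signal, half):
    """Analyze and resynthesize with 50% overlap-add (half assumed even)."""
    window = sine_window(half)
    tail = half + (-len(signal)) % half          # pad to a whole number of hops
    padded = [0.0] * half + list(signal) + [0.0] * tail
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * half + 1, half):
        frame = [padded[start + i] * window[i] for i in range(2 * half)]
        rebuilt = imdct(mdct(frame))
        for i, y in enumerate(rebuilt):
            out[start + i] += y * window[i]      # aliasing cancels in the overlap
    return out[half:half + len(signal)]


signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(40)]
rebuilt = roundtrip(signal, half=8)
print("max reconstruction error:", max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

Each output sample needs two overlapping frames before it is complete, which is one reason transform codecs with long frames carry more delay than the speech codecs above.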
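Speech Codec Spectrum
• Full Band (20 kHz): more than 15 kHz bandwidth
– Example codec: AAC-LD
– Applications deployed: Presence (Video Conf)
• Super Wideband: 14 kHz bandwidth
– Example codecs: G.722.1C (Siren14), SILK
– Applications deployed: VoIP, Audio Conf
• Wideband: 7 kHz bandwidth
– Example codecs: G.722.2 (AMR-WB), SVOPC
– Applications deployed: BB VoIP & Audio Chat
• Narrowband: 3.5 kHz bandwidth
– Example codecs: G.729, G.723.1, G.711, iSAC
– Applications deployed: PSTN & VoIP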
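ITU and 3GPP codec roadmap
[Figure: timeline of codecs by year, grouped into narrowband, wideband and super wideband; the legend distinguishes 3GPP, ITU, and joint 3GPP & ITU codecs. Years as shown on the slide:]
• 1984: G.726
• 1987: GSM-FR
• 1988: G.722
• 1992: G.728
• 1994: GSM-HR
• 1995: GSM-EFR, G.729
• 1999: G.722.1, AMR-NB
• 2002: G.722.2 / AMR-WB
• 2007: G.729.1
• 2008: EV-VBR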
Embedded Speech Codecs
• ITU Super WB
– Provides extended bandwidth and stereo capabilities
– 16 kHz audible bandwidth
– Stereo extension
– Generic extension applicable to wideband codecs, e.g. ITU
G.729.1 & EV-VBR
• 3GPP-EPS (evolved packet system) (aka LTE)
– ITU EV-VBR is well positioned to meet future EPS requirements
– Interoperable with 3GPP AMR-WB.
• Open Codecs
– Speex (4 to 42 kbit/s)
• Royalty-free, but limited to non-patented techniques (so no ACELP,
for example)
Music Codecs
• MPEG-1 Layer III (aka MP3)
– Built on top of Layers I and II
– First-generation, very inefficient
• AAC
– Second generation, much better than MP3
– Flexible, kitchen-sink type of approach
– Tons of tools and partially incompatible profiles
– Variants: AAC-LC, AAC-LD, AAC-HE, ...
• Vorbis
– Second-generation, similar quality to AAC
– Open-source, royalty-free (Xiph.Org Foundation)
Future of codecs
• Improving quality
– Super-wideband, coding of music
– The gap between speech and music codecs is closing
– AMR-WB+, G.722.1x moving to music, higher quality
– AAC-LD moving to lower delay
• Reducing delay
• Increasing robustness
– Shift from bit-error robustness to packet loss
robustness
Improved Music Handling
• Background music is poorly handled
– Most speech codecs (AMR-NB, G.729, AMR-WB, Speex, etc.) are based on CELP
– CELP makes assumptions that are only valid for
speech (and single-note music)
– CELP does not perform well on music –
especially at low bit-rate
– Music codecs are not suitable for speech
Improved Music Handling
How do you improve the handling of
background music?
• Three strategies:
1. Increase the bit-rate
2. Dual-mode codecs (e.g. AMR-WB+)
3. Use non-CELP codecs (AAC-LD, G.722.1x,
G.711.1, CELT, …)
Wideband Extension (WEx)
as an interim solution
How do you provide a wideband experience when
linking a wideband-capable client to the PSTN?
• Current solution: up-sample the narrowband
speech to 16 kHz
• Better solution: Create wideband “artificially”
from the narrowband speech
• Support is becoming available
– WEx-capable handsets (Philips, for example)
– WEx-enabled media gateways (Vocallo, for example)
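One textbook way to create high-band content from a narrowband signal is spectral folding: inserting a zero after every sample doubles the sampling rate and mirrors the 0-4 kHz spectrum into 4-8 kHz. The sketch below only demonstrates that mirroring on a test tone. It is an assumption-laden illustration, not the method used by the WEx products named above, and the helper names are hypothetical.

```python
import math


def zero_stuff(samples):
    """Double the sampling rate by inserting a zero after every sample."""
    out = []
    for s in samples:
        out.extend((s, 0.0))
    return out


def bin_magnitude(samples, freq_hz, rate_hz):
    """Magnitude of a single DFT bin at freq_hz, normalized by signal length."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / rate_hz)
             for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / rate_hz)
             for n, s in enumerate(samples))
    return math.hypot(re, im) / len(samples)


# 20 ms of a 1 kHz tone sampled at 8 kHz (narrowband), then zero-stuffed to 16 kHz.
narrow = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(160)]
wide = zero_stuff(narrow)

for freq in (1000, 3000, 5000, 7000):
    print(f"{freq} Hz: {bin_magnitude(wide, freq, 16000):.3f}")
```

The tone reappears at 7 kHz, the mirror image of 1 kHz around 4 kHz. Plain up-sampling filters that image out, leaving 4-8 kHz empty, whereas an extension scheme shapes some form of synthesized high band so that it sounds natural.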
The Implementations
a.k.a The Role of the Media Gateway
Wideband VoIP DECT France Telecom
[Diagram: network elements shown are Mobile Platform, Access Platform (IAD), IP Network, IMS GW, DLC, TDM Network]
Wide Band Extension (WBE)
• Expand the signal to create the impression of wideband
[Diagram: network elements shown are Mobile Platform, IP Network, IMS GW, IP/DLC, DLC, TDM Network, Access Platform (IAD); processing blocks shown are AEC, ANR, NLE, WBE, LEC]
Improving the User Experience
• Wideband Lite Acoustic Echo Canceller (AEC)
– Acts as a complement to a badly designed handset
• Wideband Adaptive Noise Reduction (ANR)
– Reduces noise from the mobile handset environment
• Wideband Natural Level Enhancement (NLE)
– Uses the intensity of the voice and the SNR to compensate for the talker’s loud environment
[Diagram: network elements shown are Mobile Platform, IP Network, IMS GW, IP/DLC, DLC, TDM Network, Access Platform (IAD); processing blocks shown are AEC, ANR, NLE]
The role of the MGW
• When selecting MGW solutions:
– Don’t just look for a checklist of codecs!
– Look for solutions that provide wideband
extension, wideband ECAN, ANR, etc.
– Select solutions that incur low latency when
transcoding IP-to-IP communications
Summary
• Clear benefit to the users
– Skype changed expectation levels
• Technology enablers already in place
– VoIP deployment
– Codecs
– WB-enabled end-points and MGWs available