Speech Quality
Overcoming VoIP Quality Challenges
Dr. Jan Linden, VP of Engineering
Global IP Solutions
3
Outline
VoIP Quality Challenges
Latency
Codec Choice
Conferencing
How to Measure Speech Quality
4
VoIP Design Considerations
Speech Quality
Time to Market
Ease of Use
Flexibility
Network Impairments
Power Consumption
Cost
Quality
Cost
Signaling
Infrastructure
Features
Device
Considerations
5
Major Challenges for VoIP End-point Design
Both Sides of the Call Need to be Considered
[Figure: VoIP Design Challenges]
Codec: Speech Codec
Hardware: Hardware Issues (Processor, OS, Acoustics, etc.)
Network: Coping with Network Degradation
Power: Power Consumption
Echo: Echo Cancellation
Voice: Additional Voice Processing Components
Environment: Background Noise, Room Acoustics, etc.
6
Delay
Major effect is “stepping on each other’s talk”
Usage scenario affects annoyance factor – higher
delay can be tolerated for mobile devices
Long delays make echo more annoying
Impact of IP Networks
Packet Loss
Smooth concealment necessary
Network Jitter
Jitter buffer necessary to ensure continuous playout
Trade-off between delay and quality
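The delay/quality trade-off can be made concrete with a small sketch. It uses the running interarrival jitter estimate defined in RFC 3550; the playout rule (a fixed minimum plus a multiple of the jitter), the default values, and all function names are illustrative assumptions, not the method of any particular jitter buffer.

```python
def update_jitter(jitter_ms, transit_prev_ms, transit_now_ms):
    """RFC 3550 running jitter estimate: J += (|D| - J) / 16."""
    d = abs(transit_now_ms - transit_prev_ms)
    return jitter_ms + (d - jitter_ms) / 16.0


def playout_delay_ms(jitter_ms, min_delay_ms=20.0, factor=4.0):
    """Hypothetical rule: buffer a fixed minimum plus a multiple of jitter."""
    return min_delay_ms + factor * jitter_ms


def simulate(transit_times_ms):
    """Take per-packet transit times (arrival minus send time) and
    return the playout delay chosen after each packet from the second on."""
    jitter = 0.0
    delays = []
    for prev, now in zip(transit_times_ms, transit_times_ms[1:]):
        jitter = update_jitter(jitter, prev, now)
        delays.append(playout_delay_ms(jitter))
    return delays
```

A larger `factor` loses fewer late packets but adds delay; a smaller one does the opposite.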
7
Sources of Latency
Codec
Capture
Playout
Network delay
Jitter buffer
OS interaction
Transcoding
[Figure: end-to-end signal path. Sender: A/D, Preprocessing, Speech encoding, IP interface. IP Network. Receiver: IP interface, Jitter buffer, Speech decoding, Postprocessing, D/A.]
8
Impact of Delay on Voice Quality
[Figure: Mean Opinion Score versus one-way transmission time in ms, time axis marked at 250, 500 and 750. Data from ITU-T G.114.]
ITU-T (G.114) recommends:
– Less than 150 ms one-way delay for most applications (up to 400
ms acceptable in special cases)
Users have got used to longer delays
– Still, low delay very important for high quality
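As a rough illustration of how the latency sources add up against the G.114 figures, the sketch below sums a one-way delay budget. Every per-stage value is an assumed example number, not a measurement, and the function name is hypothetical.

```python
G114_PREFERRED_MS = 150   # one-way target for most applications
G114_LIMIT_MS = 400       # upper bound, acceptable only in special cases


def classify_one_way_delay(stage_delays_ms):
    """Sum per-stage delays and compare the total against the G.114 figures."""
    total = sum(stage_delays_ms.values())
    if total < G114_PREFERRED_MS:
        verdict = "within preferred range"
    elif total <= G114_LIMIT_MS:
        verdict = "acceptable in special cases only"
    else:
        verdict = "too high"
    return total, verdict


# Hypothetical budget; every figure here is an assumed example value.
example_budget_ms = {
    "capture": 10,
    "codec": 25,
    "OS interaction": 10,
    "network delay": 60,
    "jitter buffer": 40,
    "playout": 10,
}
```

With these assumed values the total is 155 ms, just over the preferred limit even though no single stage looks large.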
9
Speech Codec
Many conflicting parameters affect choice of codec
Determines upper limit of quality
Support of several codecs necessary
– Interoperability
– Usage scenario
IPR issues a significant concern
[Figure: parameters surrounding the speech codec: Complexity, Memory, Delay, Input Signal Robustness, Bit-rate, Packet-loss Robustness, Quality, Sampling Rate]
10
Audio Spectrum
Better than PSTN quality is
achievable in VoIP
– Utilizing full 0 – 4 kHz
band in narrowband
– Wideband coding offers more natural and crisper voice
Telephony band
11
Audio Spectrum vs. Speech Quality
[Figure: speech quality versus audio bandwidth. Categories shown: Narrowband Speech (PSTN), Wideband Speech, Super Wideband Speech, CD. Frequency axis marked at 4 kHz, 8 kHz, 10 kHz, 16 kHz and 22.1 kHz.]
12
Speech Codec Design for VoIP
Many standard codecs designed for bit errors, not
packet loss
– Error propagation issue for CELP codecs
Variable bit rate attractive for IP networks
Packet overhead significant (5 – 32 kb/s)
– Makes low bit rate codecs less attractive
Packet loss concealment a must
Jitter buffer design has significant impact on quality
Alternatives to standards
– De-facto standards like iSAC
– Open source like Speex
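The 5 – 32 kb/s overhead range is consistent with a 40-byte IPv4/UDP/RTP header sent once every 60 ms down to once every 10 ms. The sketch below does that arithmetic. It assumes IPv4, a fixed 12-byte RTP header, no header compression, and no link-layer framing; the packet intervals are assumptions, not taken from the slide.

```python
IPV4_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12  # fixed RTP header, no extensions or CSRC entries


def header_overhead_kbps(packet_interval_ms):
    """Header bit rate for one packet sent every packet_interval_ms."""
    header_bits = 8 * (IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES)
    return header_bits / packet_interval_ms  # bits per ms equals kb/s


def total_bitrate_kbps(codec_kbps, packet_interval_ms):
    """Codec payload rate plus header overhead."""
    return codec_kbps + header_overhead_kbps(packet_interval_ms)
```

For example, an 8 kb/s codec sent in 20 ms packets occupies 24 kb/s on the wire under these assumptions, two thirds of it headers, which is why very low bit rate codecs gain less than their nominal rate suggests.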
Echo Cancellation
High delay in VoIP makes echo problem more prominent
Network/Line echo cancellation for gateways
Acoustic echo cancellation
– Hands-free/speakerphone
– Small devices
Biggest challenge is AEC for PC
– Acoustic setup unknown and changing
– Wideband speech
– Very few solutions on the market
14
Effects of Transcoding
Transcoding occurs when the endpoints are using different codecs
– Every transcoding introduces distortion
– Low bit-rate codecs very sensitive to transcoding
Transcoding between networks
VoIP to PSTN
– Limited quality degradation since G.711 used on the PSTN side
VoIP to Cellular
– Severe quality degradation common since low bit-rate codecs typically used on both sides
VoIP to VoIP
– Usually occurs in Session Border Controllers
– Can normally be avoided
Transcoding in conferencing
– Mixing done in decoded domain results in transcoding
15
How to Make the VoIP Software Robust?
Very Quick Jitter Buffer Adaptation – Conditions Change Very Rapidly (on a millisecond basis)
Minimize Delay Everywhere – every millisecond counts
Spot Jitter Patterns
Increase Delay to Keep Good Quality when Unavoidable
Packet Loss Concealment – Capable of Handling Several Lost Packets in a Row
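To show what handling several lost packets in a row means at its simplest, here is a minimal sketch of repetition-based concealment with fading. It is a deliberately basic technique chosen for illustration; production concealment is considerably more sophisticated. The function name, the frame representation and the attenuation value are all assumptions.

```python
def conceal(frames, attenuation=0.5):
    """frames: list of sample lists, with None marking a lost packet.
    Returns frames with every loss replaced by a faded copy of the
    most recent output frame (silence if nothing was received yet)."""
    output = []
    last = None
    for frame in frames:
        if frame is None:
            if last is None:
                # Nothing to repeat yet: emit silence of the usual frame size.
                frame_len = next((len(f) for f in frames if f is not None), 0)
                frame = [0.0] * frame_len
            else:
                # Repeat the previous output, faded; consecutive losses
                # therefore decay further with each missing packet.
                frame = [s * attenuation for s in last]
        output.append(frame)
        last = frame
    return output
```

Because each concealed frame is derived from the previous output, a burst of losses fades towards silence instead of repeating the same frame at full level.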
16
Measuring Voice Quality
Subjective Methods
Test the “right thing”, i.e. subjective
quality
Takes all types of degradation into
account
Time consuming and costly
Lack of repeatability
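The repeatability problem can be quantified by reporting a confidence interval alongside the mean opinion score. The sketch below assumes listener ratings on a 1 to 5 scale and uses a normal approximation for the interval, which is rough for small panels; the function name is hypothetical.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    mos = statistics.mean(ratings)
    # Normal approximation (z = 1.96); an assumption, rough for small panels.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width
```

The interval narrows only with the square root of the number of listeners, which is one reason subjective tests are time consuming and costly.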
Objective Methods
Simple and affordable
Inaccurate but repeatable results
Sensitive to any processing (nonlinear filtering, echo cancellation,
time warping etc.)
– Time synchronization major
challenge not yet solved
Sensitive to background and
equipment impairments
One step behind development of
codecs and error concealment
Next generation algorithm in
standardization process (P.OLQA)
Audio Conferencing
Design includes a trade-off between quality and scalability
Client based or server based
– Server based offers better scalability than client based
– Can be combined
Transcoding often unavoidable
Two strategies:
– Mix incoming signals to form one output signal
– Only relay packets and mix at client side
Multi-codec support
– In relay mode all endpoints need to support all codecs
Narrowband and wideband
– Both can be present in a conference
– Narrowband participant will hear everything in narrowband
– Wideband participant hears others in narrowband or wideband
[Figure: mixing for five participants A to E. Each participant receives the sum of the other four: A receives B+C+D+E, B receives A+C+D+E, C receives A+B+D+E, D receives A+B+C+E, E receives A+B+C+D.]
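The mixing strategy, in which each participant hears the sum of everyone else, can be sketched as follows. This is a minimal illustration on already decoded samples; it assumes equal-length frames and leaves out gain control and clipping protection, and the function name is hypothetical.

```python
def mix_minus(streams):
    """streams: dict mapping participant name to a list of samples
    (all the same length). Returns, for each participant, the sum of
    everyone else's samples."""
    names = list(streams)
    length = len(next(iter(streams.values()))) if streams else 0
    # Sum all participants once, then subtract each one's own signal.
    total = [sum(streams[n][i] for n in names) for i in range(length)]
    return {
        n: [total[i] - streams[n][i] for i in range(length)]
        for n in names
    }
```

Since the mix is formed on decoded samples and then re-encoded for each listener, this strategy implies transcoding, whereas relaying packets pushes the mixing work to the clients.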
18
Conclusions
Latency has a significant impact on the perceived quality
in VoIP
– Low latency, high quality (e.g. NetEQ) jitter buffer necessary
Choose the right codec for the usage scenario
– Or a codec that can adapt like iSAC
Transcoding should be avoided, if possible
Significantly better quality than PSTN possible
– Wideband coding
No good objective measure for speech quality exists
– Always combine with subjective evaluation