ICA 2004, Kyoto, Japan, April 4 - EE-IITB


ICA 2004, Kyoto, April 4-9, 2004 / Session: HRS43 Paper No. 00413 (Tu.5.X2.5)
Evaluation of Speech Processing Schemes Using Binaural Dichotic Presentation to Reduce the Effect of Masking in Hearing-Impaired Listeners
By
A.N. Cheeran
P.C. Pandey
IIT Bombay, India
http://www.ee.iitb.ac.in
 • Intro. • Proc. Schemes • Evaluation • Results • Conclusion 
Outline of the presentation
● Introduction
● Processing schemes
● Evaluation
● Test results
● Conclusion
Introduction (1/9)
Peripheral auditory system
[Figure: cross-section of the ear. Labeled regions: outer ear, middle ear, inner ear. Labeled parts: pinna, external canal, eardrum (tympanic membrane), malleus, incus, stapes, oval window, round window, Eustachian tube, nasal cavity, cochlea, cochlear nerve, vestibular apparatus with semicircular canals, vestibular nerve.]
Hearing impairments
• Conductive
• Sensorineural
• Central
• Functional
Introduction (2/9)
Causes of Sensorineural Loss
• Damage to hair cells in the cochlea
• Degeneration of auditory nerve fibers
Sensorineural Loss Characteristics
• Increased hearing thresholds
• Reduced dynamic range, loudness recruitment
• Poor freq. selectivity, increased spectral masking
• Poor time resolution, increased temporal masking
Introduction (3/9)
Effects of Increased Spectral Masking
Broader auditory filters
⇒ Smearing of spectral peaks & valleys
⇒ Reduction in internal spectral contrast
⇒ Reduced discrimination of consonantal place feature
Effects of Increased Temporal Masking
Forward & backward masking of weak segments by strong ones
⇒ Reduced ability to discriminate sub-phonemic segments like noise bursts, voice-onset time, formant transitions
⇒ Reduced discrimination of duration & other consonantal features
Introduction (4/9)
Dichotic Presentation for Bilateral Residual Hearing
• Masking at the peripheral level.
• Information integration from the two ears at higher levels.
• Binaural dichotic presentation for bilateral residual hearing: speech signal split into a complementary form.
• Signal components likely to mask each other presented to different ears.
• Improved speech recognition due to reduction in the effects of masking.
Introduction (5/9)
Binaural Presentation Schemes
Spectral Splitting
Filtering by two complementary comb filters:
⇒ Sensory cells corresponding to alternate filter bands on the basilar membrane not stimulated
⇒ Reduction in spectral masking
⇒ Better consonantal place reception
Temporal Splitting
Gating by two complementary fading functions:
⇒ Cyclic stimulation and relaxation of all the sensory cells
⇒ Reduction in temporal masking
⇒ Better duration and place reception
Introduction (6/9)
Binaural Presentation Schemes (contd.)
Combined Splitting
Processing by two time-varying complementary comb filters
⇒ Sensory cells in alternate bands in a given ear stimulated at a time. All the sensory cells on the basilar membrane get periodic relaxation from stimulation.
⇒ Reduction in spectral and temporal masking
⇒ Better reception of consonantal duration, place, and other features
⇒ Better overall speech recognition & reduction in load on perception process
Introduction (7/9)
Earlier Work
Lunner et al. [Linkoping, 1993].
• Comb filters using filter bank with constant BW (700 Hz) filters & band gain adjustment: improvement of 2 dB in SNR.
• Combined splitting by symmetrical switching (at 50 Hz) of comb filter bands: no further improvement. Poor sound quality due to switching.
Chaudhari & Pandey [IIT Bombay, 1998].
• Comb filters with auditory critical bands and sharp transitions. Improvement in reception of primarily the consonantal place feature. Reduction in response time.
• Individually adjustable frequency response → further improvement, but not distinctly for place.
Introduction (8/9)
Earlier Work (contd.)
Jangamashetti & Pandey [IIT Bombay, 2003]
• Temporal splitting using symmetrical inter-aural switching of 20 ms along with overlap and various fading functions. Improvement mainly in the reception of the duration feature.
Jangamashetti, Cheeran & Pandey [IIT Bombay, 2003]
• Combined splitting using time-varying comb filters with constant sweep cycle of 20 ms and with 2, 4, 8, & 16 shiftings. Improvement in reception of both duration & place features.
Introduction (9/9)
Objective of the present study
Implementation of binaural dichotic presentation schemes & evaluation through listening tests
• Spectral splitting with perceptually balanced comb filters
• Temporal splitting with trapezoidal fading functions, with inter-aural switching periods of 20, 40, 80 ms
• Combined splitting with cyclically swept comb filters with sweep cycle of 20, 40, 80 ms and 4, 8, and 16 shiftings
Processing schemes (1/3)
Spectral Splitting with Perceptually
Balanced Comb Filters
Parameters:
Transition width: 78 Hz – 117 Hz, Inter-band crossover gains: -4 to -6 dB,
Pass band ripple < 1 dB.
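The filter-response plot from this slide is not reproduced in the transcript. As a rough illustration of the idea of spectral splitting, the sketch below divides a signal between two ears using ideal complementary band masks applied in the FFT domain. The band edges, the function name `split_spectrally`, and the brick-wall masks are all assumptions made for illustration; the perceptually balanced comb filters of the study have finite transition widths and specified crossover gains, which this sketch does not model.

```python
import numpy as np

def split_spectrally(x, fs, band_edges_hz):
    """Split x into two signals that carry alternate frequency bands.

    band_edges_hz is a hypothetical, increasing list of band edges in Hz.
    Band 0 is everything below the first edge, band k lies between
    edge k-1 and edge k. Even-numbered bands go to the left ear,
    odd-numbered bands to the right ear.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Band number of every FFT bin.
    band_index = np.searchsorted(band_edges_hz, freqs, side="right")
    left_mask = (band_index % 2 == 0)
    left = np.fft.irfft(spectrum * left_mask, n=len(x))
    right = np.fft.irfft(spectrum * ~left_mask, n=len(x))
    return left, right

# Illustrative use with made-up band edges (not the bands used in the study).
fs = 10000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 300 * t) + np.sin(2 * np.pi * 500 * t)
left, right = split_spectrally(x, fs, band_edges_hz=[200, 400, 800, 1600, 3200])
```

With these made-up edges the 300 Hz tone falls in an odd-numbered band and goes to the right ear, while the 500 Hz tone goes to the left ear. Because the masks are exact complements, the two outputs add back up to the input.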
Processing schemes (2/3)
Temporal Splitting with Trapezoidal Fading
Parameters:
Inter-aural switching period (N): 20, 40, 80 ms
Duty cycle (L): 70 % of N
Transition duration (M): 3 ms
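The fading-function plot from this slide is not reproduced in the transcript. The sketch below generates one plausible pair of trapezoidal gating functions from the three parameters N, L, and M. It assumes that the on-time L includes both ramps and that the second ear's gate is the first ear's gate delayed by half a period; neither detail is stated on the slide, and the function name `trapezoidal_gates` is illustrative.

```python
import numpy as np

def trapezoidal_gates(n_samples, fs, period_ms=20.0, duty=0.7, transition_ms=3.0):
    """Return (left, right) gain arrays with values in [0, 1].

    Assumptions (not from the slides): the on-time L = duty * N includes
    the rising and falling ramps of M ms each, and the right-ear gate is
    offset from the left-ear gate by half a switching period.
    """
    period_n = int(round(period_ms * fs / 1000.0))  # N in samples
    on_n = duty * period_n                          # L in samples
    ramp_n = transition_ms * fs / 1000.0            # M in samples
    n = np.arange(n_samples)

    def gate(sample_index):
        tau = np.mod(sample_index, period_n)
        rising = tau / ramp_n
        falling = (on_n - tau) / ramp_n
        return np.clip(np.minimum(rising, falling), 0.0, 1.0)

    left = gate(n)
    right = gate(n + period_n // 2)
    return left, right

# 200 ms of gating at a 10 kHz sampling rate, N = 20 ms.
left_gain, right_gain = trapezoidal_gates(2000, 10000)
# The two ear signals would then be speech * left_gain and speech * right_gain.
```

Under these assumptions a 70 % duty cycle makes the two gates overlap, so at every instant at least one ear receives the signal at more than half gain.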
Processing schemes (3/3)
Combined Splitting with Time-varying
Comb Filters
Parameters:
Sweep cycle duration: 20, 40, 80, 120, 160 ms
No. of shiftings (m): 4, 8, 16
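One way to read the two parameters together: if the comb-filter pair steps through m shifted positions during each sweep cycle, each position is held for (sweep cycle) / m. That reading is an assumption, since the slide lists only the parameter values. The sketch below tabulates the resulting hold times and computes which position would be in use at a given time; `dwell_time_ms` and `shift_index` are illustrative names.

```python
def dwell_time_ms(sweep_cycle_ms, n_shifts):
    """Time spent at each comb-filter position, assuming equal steps."""
    return sweep_cycle_ms / n_shifts

def shift_index(t_ms, sweep_cycle_ms, n_shifts):
    """Index (0 .. n_shifts-1) of the comb-filter position in use at t_ms."""
    position_in_cycle = t_ms % sweep_cycle_ms
    return int(position_in_cycle // dwell_time_ms(sweep_cycle_ms, n_shifts))

# Hold time for every combination of the parameter values listed above.
for sweep in (20, 40, 80, 120, 160):
    for m in (4, 8, 16):
        print(f"sweep {sweep} ms, {m} shiftings: "
              f"{dwell_time_ms(sweep, m):.2f} ms per position")
```

For example, a 20 ms sweep with 4 shiftings would hold each position for 5 ms, and an 80 ms sweep with 16 shiftings would also hold each position for 5 ms.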
Evaluation
Listening Tests for Evaluation
Type of evaluation: Open-set
Test material: Phonetically balanced set of monosyllables (used at AYJNIHH, Mumbai) in the first language of the subjects.
Subjects & listening condition:
• Seven normal-hearing subjects, with loss simulated by adding Gaussian noise at short-time SNRs of ∞, 3, 0, -3, -6, -9 dB. Presented at MCL (70-75 dB SPL).
Performance measurement:
• Response time (indication of load on perceptual processing)
• Recognition scores
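The simulated loss relies on mixing Gaussian noise with the speech at a controlled SNR. The sketch below is a simplified version that sets the SNR over the whole signal; the slides specify short-time SNRs, which would require scaling the noise frame by frame, and that step is not attempted here. The function name `add_noise_at_snr` is an assumption for illustration.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise at the requested SNR (dB).

    Simplification: the SNR is set over the whole signal, not over
    short-time frames. An infinite SNR returns the signal unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    if np.isinf(snr_db):
        return signal.copy()
    noise = rng.standard_normal(signal.shape)
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise *= np.sqrt(target_noise_power / noise_power)
    return signal + noise

# Stand-in signal (a tone) in place of a recorded monosyllable.
fs = 10000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 440 * t)
noisy = add_noise_at_snr(speech, snr_db=-6, rng=np.random.default_rng(0))
```

The noise is rescaled after it is drawn, so the realized SNR matches the target exactly instead of only on average.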
Results (1/4)
Listening test results
Averaged scores for the seven normal-hearing subjects with simulated loss
Results (2/4)
Averaged Recogn. Scores vs SNR
Results (3/4)
Relative improvements in recogn. scores
Unprocessed (% rec. score) = 23.9 at -9 dB, 39.9 at -6 dB & 66.3 at -3 dB
(SS: spectral splitting; TS-N: temporal splitting with switching period N ms; CS-N/m: combined splitting with sweep cycle N ms and m shiftings)
Rel. imp. in rec. score (%)
Processing scheme   -9 dB SNR   -6 dB SNR   -3 dB SNR
SS                      142.8        93.2        30.1
TS-20                    31.8        26.9         8.1
TS-40                    23.4        25.4         4.6
TS-80                   -30.4       -21.3        -9.5
CS-20/4                  82.2        54.5        10.5
CS-20/8                  83.8        61.4        12.5
CS-20/16                 90.3        54.1        17.4
CS-40/4                  97.5        69.3        17.4
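The slides do not define "relative improvement". Assuming the conventional definition, 100 × (processed − unprocessed) / unprocessed, the absolute recognition score behind any table entry can be recovered from the unprocessed score. The sketch below shows the arithmetic; both function names are illustrative.

```python
def relative_improvement(processed, unprocessed):
    """Relative improvement in percent (assumed definition)."""
    return 100.0 * (processed - unprocessed) / unprocessed

def processed_score(unprocessed, rel_improvement_pct):
    """Invert the assumed definition to get the processed score in percent."""
    return unprocessed * (1.0 + rel_improvement_pct / 100.0)

# Spectral splitting at -9 dB SNR: unprocessed 23.9 %, relative improvement 142.8 %.
ss_minus9 = processed_score(23.9, 142.8)
print(f"Implied SS recognition score at -9 dB: {ss_minus9:.1f} %")
```

Under this assumption, the 142.8 % relative improvement for spectral splitting at -9 dB corresponds to a recognition score of about 58 %.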
Results (4/4)
Relative improvements in recogn. Scores (contd.)
Unprocessed (% rec. score) = 23.9 at -9 dB, 39.9 at -6 dB & 66.3 at -3 dB
Rel. imp. in rec. score (%)
Processing scheme   -9 dB SNR   -6 dB SNR   -3 dB SNR
CS-40/8                  88.6        64.0        16.4
CS-40/16                110.6        68.1        21.6
CS-80/4                 106.8        71.6        27.1
CS-80/8                 130.2        75.4        26.3
CS-80/16                118.2        81.0        27.4
CS-120/8                100.9        70.1        22.6
CS-120/16               108.5        65.0        24.6
CS-160/8                104.3        79.0        24.8
CS-160/16               103.4        73.1        22.8
Conclusion (1/2)
Summary of results
• Spectral splitting showed the highest relative improvement at all SNR conditions
• For a 60 % recognition score, spectral splitting gave an improvement of ≈ 5 dB in SNR
• Combined splitting at sweep cycle 80 ms with 8 and 16 shiftings showed improvements close to spectral splitting
• Results similar for 8 and 16 shiftings
• Temporal splitting was best with inter-aural switching of 20 ms
• Inter-aural switching of 80 ms showed degradation
Conclusion (2/2)
Further work
• Evaluation by conducting listening tests on persons with moderate bilateral sensorineural hearing impairment.
• Combining audiogram-dependent filtering and multi-band compression with dichotic presentation.
• Implementation of dichotic presentation schemes in wearable hearing aids, with personalized parameter setting.