SPEECH PROCESSING FOR
BINAURAL HEARING AIDS
Dr P. C. Pandey
EE Dept., IIT Bombay
Feb’03
1
R&D activities
in SPI Lab, EE Dept, IIT Bombay
• Speech & hearing
• Healthcare instrumentation
• Impedance cardiography
• Industrial instrumentation
2
Speech & hearing
• Speech processing for improving perception by persons with
sensorineural hearing loss:
- Consonantal enhancement (with Prof SD Agashe)
- Binaural dichotic presentation
• Vocal tract shape estimation for speech training of deaf children
• Speech synthesis and study of phonemic features using HNM
• Cancellation of background noise in alaryngeal speech using
spectral subtraction
3
Healthcare instrumentation
• Low cost diagnostic audiometer
• Impedance glottograph for voice pitch
• Impedance cardiograph for sports medicine.
• Intravenous drip rate indicator
• Communicator for children with cerebral palsy (with Prof GG Ray)
• Non-invasive ultrasonic thermometry system (with Prof T Anjaneyulu)
• Myoelectric hand (with Prof SR Devasahayam & R Lal)
4
Impedance cardiography
Signal processing for improving the estimation of stroke
volume from impedance cardiogram
Industrial Instrumentation
Noninvasive measurement of single-phase fluid flow using ultrasonic
cross-correlation technique (with Prof T Anjaneyulu)
Online measurement of dielectric dissipation factor for
condition monitoring of high voltage insulation (with Prof SV Kulkarni)
5
Speech Processing for Binaural Hearing Aids
Hearing system
Outer ear → Middle ear → Inner ear → Cochlear nerve → Brain
Hearing impairments
• Conductive
• Sensorineural
• Central
• Functional
Sensory Aids for the hearing impaired
• Hearing aids
• Cochlear prosthesis
• Visual & tactile aids
11
Causes of sensorineural loss
• Loss of sensory hair cells in cochlea
• Degeneration of auditory nerve fibers
Characteristics of sensorineural loss
• Frequency dependent shifts in hearing thresholds
• Reduced dynamic range, loudness recruitment
• Poor frequency selectivity & increased spectral masking
• Reduced temporal resolution & increased temporal masking
12
Effects of increased spectral masking
• Smearing of spectral peaks and valleys due to broader
auditory filters
• Reduction of internal spectral contrast
• Reduced discrimination of consonantal place feature
Effects of increased temporal masking
• Forward and backward masking of weak segments by
strong ones
• Reduced ability to discriminate sub-phonemic segments
like noise bursts, voice-onset-time, and formant transitions
13
Speech processing for dichotic presentation for
binaural hearing aids to reduce the effects of masking
Masking takes place at the peripheral level of the auditory system
Information from the two ears gets integrated at higher levels in the
perception process
Binaural dichotic presentation for persons with bilateral residual hearing:
- Speech signal split in a complementary form,
- Signal components likely to mask each other presented to different ears,
- Information integrated at higher levels, for better speech perception
14
Binaural dichotic presentation schemes
Spectral splitting
Filtering by 2 complementary comb filters: better place reception
Temporal splitting
Gating by 2 complementary fading functions: better duration
reception
Combined splitting
Processing by 2 time-varying comb filters
All the sensory cells of the basilar membrane get periodic
relaxation from stimulation: better perception of consonantal
duration, place, and other features
15
TEMPORAL SPLITTING WITH TRAPEZOIDAL FADING
[Figure: s(n) gated by the fading functions w1(n) and w2(n) to give s1(n) and s2(n).]
Temporal splitting of the signal for dichotic
presentation using w1(n) and w2(n)
Inter-aural switching period = 20 ms,
Duty cycles = 70%,
Transition durations = 0, 1, 2, 3 ms
[Figure: w1(n) and w2(n) versus n, with segment durations labeled L, M, and N.]
Inter-aural fading with trapezoidal
transition and inter-aural overlap
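A minimal sketch of the temporal splitting described above, not the authors' implementation. Assumptions: a 10 kHz sampling rate, a duty cycle that counts the two ramps as part of the on-time, and w2(n) taken as w1(n) delayed by half the switching period. The names `fading_window` and `temporal_split` are hypothetical.

```python
import numpy as np

def fading_window(n, fs, period_ms=20.0, duty=0.7, transition_ms=2.0, shift_ms=0.0):
    """Periodic trapezoidal gain in [0, 1] for n samples at rate fs."""
    period = round(period_ms * fs / 1000)    # samples per switching period
    on = round(duty * period)                # on-time, ramps included (assumed)
    ramp = round(transition_ms * fs / 1000)  # samples per linear transition
    shift = round(shift_ms * fs / 1000)
    phase = (np.arange(n) - shift) % period
    if ramp == 0:
        # Zero transition duration: plain rectangular gating
        return (phase < on).astype(float)
    # Rising ramp, flat top, falling ramp, then zero until the next period
    return np.clip(np.minimum(phase / ramp, (on - phase) / ramp), 0.0, 1.0)

def temporal_split(s, fs, period_ms=20.0, duty=0.7, transition_ms=2.0):
    """Return (s1, s2): the signal gated by w1(n) and by w2(n)."""
    s = np.asarray(s, dtype=float)
    w1 = fading_window(len(s), fs, period_ms, duty, transition_ms)
    # w2 is w1 delayed by half a period, so the two ears alternate
    w2 = fading_window(len(s), fs, period_ms, duty, transition_ms,
                       shift_ms=period_ms / 2)
    return s * w1, s * w2
```

With a 70% duty cycle the two windows overlap around every switch, so under these assumptions at least one ear receives the signal at full gain at every instant.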
16
Investigations with spectral splitting
• Auditory filter bandwidth based comb filters
18 bands over 5 kHz, 256 coefficient linear phase filters, designed using
frequency sampling technique
• Listening tests with hearing impaired subjects:
improvement in response time, recognition scores, & reception of place
feature
• Better results with perceptually balanced filters
1 dB ripple, 30 dB attenuation, 4-6 dB crossover
• Filters with personalized frequency response
Overall improvement, but not particularly for place
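A rough sketch of a complementary comb filter pair designed by plain frequency sampling, using the 256-coefficient linear phase structure mentioned above. Assumptions: a 10 kHz sampling rate, illustrative band edges loosely following published critical-band edges (not the authors' values), and ideal 0/1 band gains with no transition samples, so the ripple, attenuation, and crossover figures quoted above are not reproduced. All names are hypothetical.

```python
import numpy as np

FS = 10000      # assumed sampling rate (Hz)
N_TAPS = 256    # number of coefficients, as stated on the slide
# Illustrative edges of 18 bands over 0-5 kHz (assumed, not from the source)
BAND_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 5000]

def comb_amplitudes(band_edges_hz, fs, n_taps):
    """Desired gains at the frequency samples k*fs/n_taps, k = 0..n_taps/2 - 1."""
    freqs = np.arange(n_taps // 2) * fs / n_taps
    band = np.searchsorted(band_edges_hz, freqs, side="right") - 1
    a1 = (band % 2 == 0).astype(float)   # bands 1, 3, 5, ... go to filter 1
    return a1, 1.0 - a1                  # filter 2 takes the remaining bands

def frequency_sampling_fir(amps, n_taps):
    """Even-length linear phase FIR whose DFT magnitude matches amps."""
    n = np.arange(n_taps)
    alpha = (n_taps - 1) / 2             # group delay in samples
    k = np.arange(1, n_taps // 2)
    cosines = np.cos(2 * np.pi * np.outer(n - alpha, k) / n_taps)
    return (amps[0] + 2 * cosines @ amps[1:]) / n_taps

a1, a2 = comb_amplitudes(BAND_EDGES_HZ, FS, N_TAPS)
h1 = frequency_sampling_fir(a1, N_TAPS)  # comb filter for one ear
h2 = frequency_sampling_fir(a2, N_TAPS)  # complementary comb filter
```

Filtering the input with `h1` for one ear and `h2` for the other (for example with `np.convolve`) gives the two dichotic signals. The response is exact only at the frequency samples; controlling it between samples is what the iterative design is for.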
17
Combined splitting with time-varying filters
[Figure: block diagram. s(n) feeds time-varying comb filter 1, giving s1(n), and time-varying comb filter 2, giving s2(n); each filter selects from sets of filter coefficients (labels 1, 2, ..., m and m/2, m/2 + 1, m/2 + 2). Inset: magnitude versus frequency.]
Sweep cycle duration = 20 ms.
With m shiftings, each pair of comb
filters processes for 20/m ms
18
[Figure: two panels, frequency (0 to 5 kHz) versus time (0 to 30 ms), intensity scale 0 to -40 dB.]
An idealized representation of magnitude response of the pair of time-varying comb filters using 4 shiftings for the (a) left ear (b) right ear.
19
[Figure: magnitude (dB) versus normalized frequency, with curves labeled 1 to 4.]
20
Time-varying comb filters
Set of linear phase 256-coefficient FIR filters with pre-calculated coefficients
(designed by iterative application of the frequency sampling technique).
Comb filter responses optimized for min. perceived spectral distortion:
low passband ripple & high stopband attenuation, inter-band crossover
gains adjusted for loudness balance.
Pass band ripple < 1 dB, Stop band attenuation > 30 dB
Gain at inter-band crossovers: -4 to -6 dB
Sweep cycle duration : 20 ms
Number of shiftings: 2, 4, 8, 16
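The sweep timing above can be sketched as a lookup from sample index to the active pair of coefficient sets. This is an illustration only: the 10 kHz sampling rate and the function name `active_filter_pair` are assumptions, and the mapping from pair index to the actual coefficient sets for each ear is not specified here.

```python
def active_filter_pair(n, fs=10000, sweep_ms=20.0, m=4):
    """0-based index of the comb filter pair in use at sample n.

    The sweep cycle is divided into m equal slots, so each pair is
    active for sweep_ms / m milliseconds before the next one takes over.
    """
    sweep = round(sweep_ms * fs / 1000)   # samples per sweep cycle
    return (n % sweep) * m // sweep

# Pair index at a few sample positions for m = 4 (slots of 5 ms each)
schedule = [active_filter_pair(n) for n in (0, 49, 50, 199, 200)]
print(schedule)  # [0, 0, 1, 3, 0]
```

The integer arithmetic also handles slot lengths that are not a whole number of samples, as with 16 shiftings at this sampling rate (12.5 samples per slot).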
21
Listening tests for evaluation of the schemes
Test material: Closed set of 12 VCV syllables, formed with
consonants / p, t, k, b, d, g, m, n, s, z, f, v / and vowel / a /
Subjects & listening condition:
• Normal hearing subjects with loss simulated by Gaussian noise
with short-time (~10 ms) SNRs from 6 to -15 dB. Presentation at the most comfortable level (MCL, 70–75 dB SPL)
• Hearing impaired subjects with bilateral sensorineural loss. MCL.
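One possible reading of "short-time SNR" is that the noise level tracks the speech level so that the SNR is the same in every ~10 ms frame. The sketch below implements that reading only; the frame-wise scaling, the seed, and the name `add_noise_short_time_snr` are assumptions rather than the procedure used in the tests.

```python
import numpy as np

def add_noise_short_time_snr(s, fs, snr_db, frame_ms=10.0, seed=0):
    """Add Gaussian noise scaled so every frame has the requested SNR."""
    rng = np.random.default_rng(seed)
    s = np.asarray(s, dtype=float)
    frame = max(1, round(frame_ms * fs / 1000))   # samples per frame
    noise = rng.standard_normal(len(s))
    for start in range(0, len(s), frame):
        seg = slice(start, start + frame)
        p_sig = np.mean(s[seg] ** 2)
        p_noise = np.mean(noise[seg] ** 2)
        # Scale this frame's noise to the target SNR; silent frames get no noise
        gain = np.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10))) if p_noise > 0 else 0.0
        noise[seg] *= gain
    return s + noise, noise
```

Calling it once per SNR condition, for example `add_noise_short_time_snr(s, 10000, snr_db=-3.0)`, gives the noisy stimulus and the noise that was added.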
Performance measurement
• Response time statistics
• Stimulus-response confusion matrix
• Recognition scores
• Relative information transmission of consonantal features
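The score-based measures above can be computed from a stimulus-response confusion matrix roughly as follows. This is a sketch assuming the standard definition of relative information transmitted (mutual information between stimulus and response divided by stimulus entropy). The voicing grouping is ordinary phonetics for the 12 consonants listed, but the feature sets actually analyzed may differ, and all function names are hypothetical.

```python
from math import log2

CONSONANTS = ["p", "t", "k", "b", "d", "g", "m", "n", "s", "z", "f", "v"]
# Illustrative feature grouping: 1 = voiced, 0 = unvoiced
VOICING = {c: int(c in "bdgmnzv") for c in CONSONANTS}

def recognition_score(conf):
    """Percentage of correct responses (counts on the diagonal)."""
    total = sum(map(sum, conf))
    return 100.0 * sum(conf[i][i] for i in range(len(conf))) / total

def relative_info_transmitted(conf):
    """Mutual information of stimulus and response over stimulus entropy."""
    total = sum(map(sum, conf))
    row = [sum(r) / total for r in conf]          # stimulus probabilities
    col = [sum(c) / total for c in zip(*conf)]    # response probabilities
    mutual = sum((x / total) * log2((x / total) / (row[i] * col[j]))
                 for i, r in enumerate(conf) for j, x in enumerate(r) if x > 0)
    h_stim = -sum(p * log2(p) for p in row if p > 0)
    return mutual / h_stim

def collapse(conf, labels, feature):
    """Merge rows and columns of conf that share the same feature value."""
    classes = sorted(set(feature.values()))
    idx = {c: i for i, c in enumerate(classes)}
    out = [[0] * len(classes) for _ in classes]
    for i, stim in enumerate(labels):
        for j, resp in enumerate(labels):
            out[idx[feature[stim]]][idx[feature[resp]]] += conf[i][j]
    return out
```

For a feature such as voicing, the 12 x 12 matrix is first collapsed with `collapse(conf, CONSONANTS, VOICING)` and the result is passed to `relative_info_transmitted`.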
22
Listening test set-up
[Figure: block diagram. Labels: PC with PCL208 D/A ports; two channels, s1(t) and s2(t), each through a lowpass filter and audio amplifier to the subject in an acoustically isolated chamber; subject terminal connected to the PC over RS232C.]
23
Conclusions
• All three schemes improve response time, recognition scores, &
relative information transmission, both overall and for various speech features.
• Extent of improvement with a scheme related to nature of the loss
- Severe high frequency hearing loss :
Max. improvement with temporal splitting (17.9%).
- Symmetrical low frequency hearing loss and symmetrical
sloping high frequency hearing loss: max. improvement with spectral
splitting (17.5%) & combined splitting with 8 shiftings (20.5%).
- Asymmetrical high frequency loss: temporal splitting (7.6%) &
combined splitting (7.6%)
(contd.)
24
• Spectral splitting more effective in reducing perceptual load.
• Overall max improvement in rec. scores with combined splitting
with 8 shiftings.
• Temporal splitting mainly improved the duration feature perception.
• Spectral splitting mainly improved the place feature perception.
• Combined splitting with 8 shiftings improved perception of both duration and place.
• Reception of the relatively robust consonantal features
(voicing, manner, and nasality) not adversely affected by splitting.
• Personalized filter response gives additional improvement
25
Next
• Listening tests with a larger number of subjects to establish
relationship between processing parameters & nature of loss.
• Individualized multi-band compression.
• Implementation of the processing schemes as part of wearable
hearing aids, with personalized parameter setting.
• Effect of binaural dichotic listening on non-speech signals &
source localization to be investigated.
• Investigations with combination of consonant enhancement with
dichotic presentation.
26
THANK YOU
27