pcp_binaural

SPEECH PROCESSING FOR
BINAURAL HEARING AIDS
Dr P. C. Pandey
EE Dept., IIT Bombay
Feb’03
1
R&D activities
in SPI Lab, EE Dept, IIT Bombay
• Speech & hearing
• Healthcare instrumentation
• Impedance cardiography
• Industrial instrumentation
2
Speech & hearing
• Speech processing for improving perception by persons with
sensori-neural hearing loss:
- Consonantal enhancement (with Prof SD Agashe)
- Binaural dichotic presentation
• Vocal tract shape estimation for speech training of deaf children
• Speech synthesis and study of phonemic features using HNM
• Cancellation of background noise in alaryngeal speech using
spectral subtraction
3
Healthcare instrumentation
• Low cost diagnostic audiometer
• Impedance glottograph for voice pitch
• Impedance cardiograph for sports medicine.
• Intravenous drip rate indicator
• Communicator for children with cerebral palsy (with Prof GG Ray)
• Non-invasive ultrasonic thermometry system (with Prof T Anjaneyulu)
• Myoelectric hand (with Prof SR Devasahayam & R Lal)
4
Impedance cardiography
• Signal processing for improving the estimation of stroke
volume from impedance cardiogram
Industrial instrumentation
• Noninvasive measurement of single-phase fluid flow using ultrasonic
cross-correlation technique (with Prof T Anjaneyulu)
• Online measurement of dielectric dissipation factor for
condition monitoring of high voltage insulation (with Prof SV Kulkarni)
5
Speech Processing for Binaural Hearing Aids
Hearing system
Outer ear → Middle ear → Inner ear → Cochlear nerve → Brain
Hearing impairments
• Conductive
• Sensorineural
• Central
• Functional
Sensory Aids for the hearing impaired
• Hearing aids
• Cochlear prosthesis
• Visual & tactile aids
11
Causes of sensorineural loss
• Loss of sensory hair cells in cochlea
• Degeneration of auditory nerve fibers
Characteristics of sensorineural loss
• Frequency dependent shifts in hearing thresholds
• Reduced dynamic range, loudness recruitment
• Poor frequency selectivity & increased spectral masking
• Reduced temporal resolution & increased temporal masking
12
Effects of increased spectral masking
• Smearing of spectral peaks and valleys due to broader
auditory filters
• Reduction of internal spectral contrast
• Reduced discrimination of consonantal place feature
Effects of increased temporal masking
• Forward and backward masking of weak segments by
strong ones
• Reduced ability to discriminate sub-phonemic segments
like noise bursts, voice-onset-time, and formant transitions
13
Speech processing for dichotic presentation for
binaural hearing aids to reduce the effects of masking
• Masking takes place at the peripheral level of the auditory system
• Information from the two ears gets integrated at higher levels in the
perception process
• Binaural dichotic presentation for persons with bilateral residual hearing:
- Speech signal split in a complementary form,
- Signal components likely to mask each other presented to different ears,
- Information integrated at higher levels, for better speech perception
14
Binaural dichotic presentation schemes
• Spectral splitting
Filtering by 2 complementary comb filters: better place reception
• Temporal splitting
Gating by 2 complementary fading functions: better duration
reception
• Combined splitting
Processing by 2 time-varying comb filters
• All the sensory cells of the basilar membrane get periodic
relaxation from stimulation: better perception of consonantal
duration, place, and other features
15
TEMPORAL SPLITTING WITH TRAPEZOIDAL FADING
Figure: temporal splitting of the signal for dichotic presentation. The input s(n) is gated by the fading functions w1(n) and w2(n) to give s1(n) and s2(n).
Figure: inter-aural fading with trapezoidal transition and inter-aural overlap; w1(n) and w2(n) plotted against n, with segment lengths marked L, M, and N.
• Inter-aural switching period = 20 ms
• Duty cycles = 70%
• Transition durations = 0, 1, 2, 3 ms
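The gating described on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. It assumes that the 70% duty cycle is measured from the start of the rising ramp to the end of the falling ramp, and that w2(n) is w1(n) delayed by half a switching period. The function names `trapezoid_gate` and `temporal_split` are hypothetical.

```python
def trapezoid_gate(n_samples, fs, period_ms=20.0, duty=0.7,
                   transition_ms=2.0, shift_samples=0):
    """Periodic trapezoidal gain w(n) with values in [0, 1]."""
    period = round(period_ms * fs / 1000)      # samples per switching period
    trans = round(transition_ms * fs / 1000)   # samples per linear ramp
    on = round(duty * period)                  # assumed: start of rise to end of fall
    if 2 * trans > on:
        raise ValueError("ramps do not fit inside the on-time")
    w = []
    for n in range(n_samples):
        k = (n - shift_samples) % period       # position within the current period
        if k < trans:
            w.append(k / trans)                # rising ramp
        elif k < on - trans:
            w.append(1.0)                      # flat top
        elif k < on:
            w.append((on - k) / trans)         # falling ramp
        else:
            w.append(0.0)                      # this ear rests
    return w


def temporal_split(s, fs, **gate_args):
    """Gate s(n) with w1(n) and w2(n); w2 is w1 delayed by half a period (assumed)."""
    period = round(gate_args.get("period_ms", 20.0) * fs / 1000)
    w1 = trapezoid_gate(len(s), fs, **gate_args)
    w2 = trapezoid_gate(len(s), fs, shift_samples=period // 2, **gate_args)
    s1 = [x * g for x, g in zip(s, w1)]
    s2 = [x * g for x, g in zip(s, w2)]
    return s1, s2
```

With a duty cycle above 50%, the two gates overlap, so under these assumptions at least one ear receives the signal at full gain at every instant. Setting `transition_ms=0` gives the rectangular (0 ms) case from the slide.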
16
Investigations with spectral splitting
• Auditory filter bandwidth based comb filters:
- 18 bands over 5 kHz
- 256-coefficient linear phase filters, designed using the frequency sampling technique
• Listening tests with hearing impaired subjects: improvement in response time, recognition scores, & reception of place feature
• Better results with perceptually balanced filters: 1 dB ripple, 30 dB attenuation, 4-6 dB crossover
• Filters with personalized frequency response: overall improvement, but not particularly for place
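A pair of complementary comb filters of the kind listed above can be sketched with a single pass of the frequency sampling method. This is only an illustration under stated assumptions. The band edges in `EDGES` are Zwicker-style critical-band edges with the top edge moved to 5 kHz, not the edges used in the study. The 10 kHz sampling rate is also assumed. There is no iterative tuning of ripple or crossover gain. All names are hypothetical.

```python
import math

# Assumed band edges in Hz: Zwicker-style critical bands, with the top edge
# moved to 5 kHz so that 18 bands cover 0-5 kHz. Illustrative only.
EDGES = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
         1720, 2000, 2320, 2700, 3150, 3700, 5000]


def band_index(f):
    """Index (0..17) of the band containing frequency f in Hz."""
    for b in range(len(EDGES) - 1):
        if f < EDGES[b + 1]:
            return b
    return len(EDGES) - 2


def comb_pair(n_taps=256, fs=10000.0):
    """Two complementary linear-phase FIR comb filters (frequency sampling)."""
    half = n_taps // 2
    # Desired amplitude at DFT bins k = 0 .. n_taps/2 - 1; alternate bands
    # go to filter 1 and filter 2. The Nyquist bin is zero for an even length.
    a1 = [1.0 if band_index(k * fs / n_taps) % 2 == 0 else 0.0
          for k in range(half)]
    a2 = [1.0 - a for a in a1]
    alpha = (n_taps - 1) / 2                   # group delay in samples

    def design(amp):
        h = []
        for n in range(n_taps):
            acc = amp[0]
            for k in range(1, half):
                acc += 2.0 * amp[k] * math.cos(
                    2 * math.pi * k * (n - alpha) / n_taps)
            h.append(acc / n_taps)
        return h

    return design(a1), design(a2)


def magnitude(h, f, fs=10000.0):
    """Magnitude response of FIR filter h at frequency f in Hz."""
    re = sum(c * math.cos(2 * math.pi * f * n / fs) for n, c in enumerate(h))
    im = -sum(c * math.sin(2 * math.pi * f * n / fs) for n, c in enumerate(h))
    return math.hypot(re, im)
```

The response matches the target exactly only at the sampled frequencies. Between them it ripples, which is presumably why the slides mention perceptually balanced filters with controlled ripple, attenuation, and crossover gain.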
17
Combined splitting with time-varying filters
Figure: block diagram. The input s(n) is processed by time-varying comb filter 1 to give s1(n) and by time-varying comb filter 2 to give s2(n); each filter selects from sets of filter coefficients numbered 1 to m. Inset: magnitude against frequency.
Sweep cycle duration = 20 ms.
With m shiftings, each pair of comb filters processes for 20/m ms.
18
Figure: an idealized representation of the magnitude response of the pair of time-varying comb filters using 4 shiftings for the (a) left ear and (b) right ear. Axes: time (0 to 30 ms) against frequency (0 to 5 kHz), with intensity from 0 to -40 dB.
19
Figure: magnitude (dB) against normalized frequency.
20
Time-varying comb filters
Set of linear phase 256-coeff. FIR filters with pre-calculated coefficients
(designed using iterative use of frequency sampling technique).
Comb filter responses optimized for min. perceived spectral distortion:
low passband ripple & high stopband attenuation, inter-band crossover
gains adjusted for loudness balance.
Pass band ripple < 1 dB, Stop band attenuation > 30 dB
Gain at inter-band crossovers: -4 to -6 dB
Sweep cycle duration : 20 ms
Number of shiftings: 2, 4, 8, 16
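The timing of the sweep follows from the parameters above: with a 20 ms sweep cycle and m shiftings, each pair of filters is active for 20/m ms. The sketch below works this out and shows one possible way to pick the active coefficient sets for a given sample. It assumes that the right-ear filter runs m/2 sets ahead of the left-ear filter. The function names and the 0-based indexing are hypothetical.

```python
def dwell_ms(m, sweep_ms=20.0):
    """Time each pair of comb filters stays active within one sweep cycle."""
    return sweep_ms / m


def active_sets(n, fs, m, sweep_ms=20.0):
    """Coefficient-set indices (0-based) used at sample n for the two ears.

    Assumption: the right-ear filter runs half a sweep (m/2 sets) ahead of
    the left-ear filter, so that the pair stays complementary.
    """
    sweep = round(sweep_ms * fs / 1000)        # samples per sweep cycle
    left = ((n % sweep) * m) // sweep          # steps 0, 1, ..., m-1 over the sweep
    right = (left + m // 2) % m
    return left, right


# Dwell times for the numbers of shiftings listed on the slide.
dwell_table = {m: dwell_ms(m) for m in (2, 4, 8, 16)}
# {2: 10.0, 4: 5.0, 8: 2.5, 16: 1.25}
```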
21
Listening tests for evaluation of the schemes
Test material: Closed set of 12 VCV syllables, formed with
consonants / p, t, k, b, d, g, m, n, s, z, f, v / and vowel / a /
Subjects & listening condition:
• Normal hearing subjects, with loss simulated by adding Gaussian noise
at short-time (~10 ms) SNRs from 6 dB down to -15 dB. Presented at the most comfortable level (MCL, 70–75 dB SPL).
• Hearing impaired subjects with bilateral sensorineural loss. Presented at MCL.
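One possible reading of "short-time SNR" is that the noise level is set frame by frame, so that every ~10 ms frame has the same signal-to-noise ratio. The sketch below follows that reading. It is an assumption about the procedure, not a description of the set-up actually used, and the function name and `seed` parameter are hypothetical.

```python
import math
import random


def add_noise_short_time_snr(s, fs, snr_db, frame_ms=10.0, seed=0):
    """Add Gaussian noise scaled so that each frame has the requested SNR."""
    rng = random.Random(seed)
    frame = max(1, round(frame_ms * fs / 1000))    # samples per frame
    out = []
    for start in range(0, len(s), frame):
        seg = s[start:start + frame]
        noise = [rng.gauss(0.0, 1.0) for _ in seg]
        p_sig = sum(x * x for x in seg) / len(seg)
        p_noise = sum(v * v for v in noise) / len(seg)
        # Gain that sets this frame's signal-to-noise power ratio to snr_db;
        # a silent frame (p_sig == 0) gets no noise.
        if p_noise > 0:
            gain = math.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10)))
        else:
            gain = 0.0
        out.extend(x + gain * v for x, v in zip(seg, noise))
    return out
```

Because the gain tracks the signal power in each frame, weak segments receive proportionally weak noise, which differs from adding stationary noise at a fixed long-term SNR.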
Performance measurement
• Response time statistics
• Stimulus-response confusion matrix
• Recognition scores
• Relative information transmission of consonantal features
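The last three measures can all be computed from a stimulus-response confusion matrix. The sketch below uses the standard definition of relative information transmitted: the mutual information between stimulus and response, divided by the stimulus entropy. The function names are hypothetical, and `conf[i][j]` is taken to be the number of times stimulus i drew response j.

```python
import math


def recognition_score(conf):
    """Percentage of responses on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return 100.0 * correct / total


def relative_info_transmitted(conf):
    """Mutual information between stimulus and response over stimulus entropy."""
    total = sum(sum(row) for row in conf)
    p_in = [sum(row) / total for row in conf]
    p_out = [sum(row[j] for row in conf) / total for j in range(len(conf[0]))]
    trans = 0.0
    for i, row in enumerate(conf):
        for j, count in enumerate(row):
            if count:
                p = count / total
                trans += p * math.log2(p / (p_in[i] * p_out[j]))
    h_in = -sum(p * math.log2(p) for p in p_in if p > 0)
    return trans / h_in


def collapse(conf, groups):
    """Pool a confusion matrix by feature class; groups[i] is the class of item i."""
    k = max(groups) + 1
    out = [[0] * k for _ in range(k)]
    for i, row in enumerate(conf):
        for j, count in enumerate(row):
            out[groups[i]][groups[j]] += count
    return out
```

For a feature such as voicing, the 12-consonant matrix would first be pooled with `collapse`, using class 0 for the unvoiced consonants (p, t, k, s, f) and class 1 for the voiced ones. `relative_info_transmitted` is then applied to the resulting 2 x 2 matrix.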
22
Listening test set-up
Figure: block diagram. A PC with a PCL208 card (D/A ports) drives two channels, s1(t) and s2(t), each through a lowpass filter and audio amplifier, to the subject seated in an acoustically isolated chamber. A subject terminal is linked to the PC over RS232C.
23
Conclusions
• All three schemes improve response time, recognition scores, and
relative information transmission, both overall and for various speech features.
• Extent of improvement with a scheme is related to the nature of the loss:
- Severe high frequency hearing loss:
max. improvement with temporal splitting (17.9%).
- Symmetrical low frequency hearing loss and symmetrical
sloping high frequency hearing loss: max. improvement with spectral
splitting (17.5%) & combined splitting with 8 shiftings (20.5%).
- Asymmetrical high frequency loss: temporal splitting (7.6%) &
combined splitting (7.6%).
(contd.)
24
• Spectral splitting more effective in reducing perceptual load.
• Overall max improvement in rec. scores with combined splitting
with 8 shiftings.
• Temporal splitting mainly improved the duration feature perception.
• Spectral splitting mainly improved the place feature perception.
• Combined splitting with 8 shiftings improved perception of both duration and place.
• Reception of the relatively robust consonantal features
(voicing, manner, and nasality) not adversely affected by splitting.
• Personalized filter response gives additional improvement
25
Next
• Listening tests with a larger number of subjects to establish the
relationship between processing parameters & nature of loss.
• Individualized multi-band compression.
• Implementation of the processing schemes as part of wearable
hearing aids, with personalized parameter setting.
• Effect of binaural dichotic listening on non-speech signals &
source localization to be investigated.
• Investigations with combination of consonant enhancement with
dichotic presentation.
26
THANK YOU
27