ICA 2010 : 20th Int. Congress on Acoustics, 23-27 August 2010, Sydney, Australia
[Mon, 23rd Aug, R.101, Physiological Acoustics 2, 17:00]
Simulation of Increased Masking in
Sensorineural Hearing Loss for a Preliminary
Evaluation of Speech Processing Schemes
D. S. Jangamashetti
A. N. Cheeran
P. N. Kulkarni
P. C. Pandey
[email protected], [email protected], {pnkulkarni,pcpandey}@ee.iitb.ac.in,
http://www.ee.iitb.ac.in/~spilab
IIT Bombay, India
♠♠
1.Intro 2. List. tests 3. Results
4 Concl.
♥♥
◄◄
►►
1/17
OUTLINE
1. Introduction
2. Listening tests
3. Results & discussion
4. Conclusion
Intro.1/5
1 INTRODUCTION
Sensorineural hearing loss
▪ Increased hearing thresholds
▪ Reduced dynamic range of hearing & loudness recruitment
▪ Increased temporal & spectral masking
Speech processing for reducing the effects of
increased spectral masking
▪ Spectral contrast enhancement
▪ Multi–band frequency compression
▪ Dichotic binaural presentation
Intro.2/5
Evaluation of speech processing schemes
▪ Listening tests on hearing-impaired S’s, for the different
combinations of processing parameters: time consuming, tedious,
may cause fatigue.
▪ Preliminary evaluation for assessing the effects of processing
parameters through listening tests conducted on normal-hearing
S’s, with simulation of specific characteristics of hearing loss.
Intro. 3/5
Some earlier studies on simulation of hearing loss
Villchur (1974): Loudness recruitment simulated by 3–band
dynamic range expansion with different ratios. Tested on 4 S’s
with unilateral loss (processed stimuli to the normal ear,
unprocessed stimuli to the impaired ear).
Leek et al. (1987): Elevated hearing thresholds simulated by
addition of broad-band noise.
ter Keurs et al. (1992): Reduced freq. resolution simulated by
smoothing of the short-time spectral envelope by convolving with a
Gaussian-shaped filter, followed by spectral smearing. Tested by
presenting processed stimuli to the normal ear and unprocessed
stimuli to the impaired ear.
Intro. 4/5
Dubno & Schaefer (1992): Addition of spectrally shaped broad-band
noise. Comparison of the scores of h.i. S’s with the scores of
noise-masked n.h. S’s.
Moore & Glasberg (1993): Speech signal split into 13 freq. bands,
envelope in each band processed to simulate loudness recruitment.
Nejime & Moore (1997): Loudness recruitment & reduced frequency
selectivity simulated by filtering the speech signal into different
bands and raising the temporal envelope of the filtered signal to a power
greater than one, followed by smearing of the short-time power
spectrum.
Intro. 5/5
Present study
Effect of increased masking simulated by adding broad-band
noise, band-limited to speech frequency range, at a specific SNR
with respect to short-time (10 ms) energy of the signal (no noise
during silence).
Evaluation by conducting consonant recognition tests on n.h. S’s with
different levels of masking noise and on S’s with moderate
sensorineural loss.
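A minimal sketch of the noise-addition step described above, given for illustration only. It assumes the noise sequence passed in has already been band-limited to the speech frequency range. The function name, the argument names, and the 10 kHz default sampling rate are hypothetical choices, not taken from the study. Each 10 ms frame of noise is scaled so that the ratio of signal energy to noise energy in that frame equals the requested SNR, so frames with zero signal energy receive no noise.

```python
import math


def add_noise_at_short_time_snr(signal, noise, snr_db, fs=10000, frame_ms=10.0):
    """Add `noise` to `signal` with a constant SNR in every short-time frame.

    `noise` is assumed to be band-limited already (an assumption of this sketch).
    `fs` and `frame_ms` are illustrative defaults, not values from the slides
    apart from the 10 ms frame duration.
    """
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as signal")
    frame_len = max(1, int(round(fs * frame_ms / 1000.0)))
    energy_ratio = 10.0 ** (snr_db / 10.0)  # desired signal energy / noise energy
    out = []
    for start in range(0, len(signal), frame_len):
        s = signal[start:start + frame_len]
        n = noise[start:start + len(s)]
        e_s = sum(x * x for x in s)  # short-time signal energy
        e_n = sum(x * x for x in n)  # short-time noise energy before scaling
        if e_s == 0.0 or e_n == 0.0:
            gain = 0.0  # silent frame: no noise is added
        else:
            gain = math.sqrt(e_s / (energy_ratio * e_n))
        out.extend(x + gain * v for x, v in zip(s, n))
    return out
```

Lowering `snr_db` raises the noise level in every frame by the same amount, which is how different degrees of masking would be obtained in this sketch.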
List. tests 1/3
2 LISTENING TESTS
Consonant recognition tests
Phonetically balanced (PB) words
Modified rhyme test (MRT) with CVC monosyllabic words
VCV utterances with vowel /a/
Presentation level: most comfortable level (MCL) of individual S’s.
Performance measures: recognition score (RS), response time (RT)
List. tests 2/3
Phonetically balanced (PB) words
▪ Three sets, each having 50 - 60 words of approximately the same
intensity.
▪ 7 n.h. S’s, masking noise with SNR of 3, 0, -3, -6, -9 dB.
▪ 13 S’s with moderate bilateral loss.
Modified rhyme test (MRT)
▪ 300 words, embedded in a carrier phrase “would you write …”,
presented randomly in six test lists of 50 words.
▪ 6 n.h. S’s, masking noise with SNR of 6, 3, 0, -3, -6, -9, -12, -15 dB.
▪ 11 S’s with moderate bilateral loss.
List. tests 3/3
VCV utterances
▪ 12 consonants / p, b, t, d, k, g, f, v, s, z, m, n / in
VCV context with vowel /a/.
▪ 5 n.h. S’s, masking noise with SNR of 6, 3, 0, -3, -6, -9, -12, -15 dB.
▪ 5 h.i. S’s.
▪ Calculation of relative transmission of information for different
features.
Features and consonant groups
Voicing (2)
Unvoiced: / p t k s f /
Voiced: / b d g m n z v /
Place (3)
Front: / p b m f v /
Middle: / t d n s z /
Back: / k g /
Manner (3)
Oral stop: / p b t d k g /
Fricative: / s z f v /
Nasal: / m n /
Nasality (2)
Oral: / p b t d k g s z f v /
Nasal: / m n /
Frication (2)
Stop: / p b t d k g m n /
Fricative: / s z f v /
Duration (2)
Short: / p b t d k g m n f v /
Long: / s z /
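The relative information transmitted for a feature can be sketched as follows. This is an illustration that assumes the usual mutual-information definition applied to a stimulus-response confusion matrix; the slides do not state the formula, and the function names and the dictionary layout of the confusion matrix are hypothetical. The consonant matrix is first pooled into the feature groups listed above (only voicing and place are shown), and the transmitted information is then divided by the stimulus entropy.

```python
import math

CONSONANTS = ["p", "b", "t", "d", "k", "g", "f", "v", "s", "z", "m", "n"]

# Feature groupings copied from the slide; each string lists one-letter consonants.
VOICING = {"unvoiced": "ptksf", "voiced": "bdgmnzv"}
PLACE = {"front": "pbmfv", "middle": "tdnsz", "back": "kg"}


def collapse(confusions, groups):
    """Pool a confusion matrix {(stimulus, response): count} into feature groups."""
    label = {c: g for g, members in groups.items() for c in members}
    pooled = {}
    for (stim, resp), count in confusions.items():
        key = (label[stim], label[resp])
        pooled[key] = pooled.get(key, 0) + count
    return pooled


def relative_information(confusions):
    """Return transmitted information divided by stimulus entropy (0 to 1)."""
    total = sum(confusions.values())
    p_stim, p_resp = {}, {}
    for (s, r), c in confusions.items():
        p_stim[s] = p_stim.get(s, 0.0) + c / total
        p_resp[r] = p_resp.get(r, 0.0) + c / total
    transmitted = sum(
        (c / total) * math.log2((c / total) / (p_stim[s] * p_resp[r]))
        for (s, r), c in confusions.items()
        if c > 0
    )
    stimulus_entropy = -sum(p * math.log2(p) for p in p_stim.values() if p > 0)
    return transmitted / stimulus_entropy if stimulus_entropy > 0 else 0.0
```

For example, `relative_information(collapse(matrix, VOICING))` would give the voicing score for a hypothetical `matrix` of response counts. Multiplying by 100 gives a percentage comparable to the values reported in the results.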
Results 1/5
3 RESULTS
I. PB test results
Mean (7 n.h. S’s) response time & recognition score
RT increased from 2.09 s for no-noise to 2.83 s at -9 dB SNR.
RS decreased from 99.8 % for no-noise to 23.9 % at -9 dB SNR.
Hearing-impaired subjects
RT: 2.1 – 6.6 s, mean: 3.05 s.
RS: 20.6 – 90.1 %, mean: 62.7 %.
Results 2/5
II. MRT Results
Mean (6 n.h. S’s) response time & recognition score
RT increased from 2.64 s for no-noise to 3.45 s at -15 dB SNR.
RS decreased from 97.1 % for no-noise to 45.3 % at -15 dB SNR.
Hearing-impaired subjects
RT: 3.47 – 4.10 s, mean: 3.80 s.
RS: 50.3 – 67.3 %, mean: 60.8 %.
Results 3/5
III. VCV test results
Mean (5 n.h. S’s) response time (RT),
recognition score (RS), and relative
information transmitted, overall
and for feature groupings
Results 4/5
RS vs. SNR
for three types of
test material
▪ Scores for PB words lower than those for VCV and MRT.
▪ Matching of mean RS of h.i. S’s and n.h. S’s
PB words: at SNR = -3 dB.
MRT & VCV: at SNR = -9 dB.
Results 5/5
Equivalent SNR for VCV test
(SNR for matching the n.h. score to the h.i. score)
                 RS     Relative information transmitted (%)
                 (%)    Ov    Vo    Pl    Mn    Na    Fri   Du
Avg. H.I.        81     84    85    58    71    94    56    49
Eqt. SNR (dB)    -9     -8    -15   -7    -12   -12   -11   -9
Ov: overall, Vo: voicing, Pl: place, Mn: manner, Na: nasality, Fri: frication, Du: duration.
Effects of masking on the reception of features
• Maximal on place and duration.
• Moderate on nasality and manner.
• Negligible on voicing.
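One way to obtain an equivalent SNR is to interpolate the n.h. score-versus-SNR curve and read off the SNR at which it equals the h.i. score. The sketch below assumes piecewise-linear interpolation between the tested SNR values; the slides do not say how the matching was done, and the function name is hypothetical.

```python
def equivalent_snr(snr_db, nh_scores, hi_score):
    """Return the SNR (dB) at which the interpolated n.h. score equals hi_score.

    snr_db and nh_scores are parallel sequences of tested SNRs and mean n.h.
    scores. Returns None if hi_score lies outside the measured n.h. range.
    Linear interpolation is an assumption of this sketch.
    """
    points = sorted(zip(snr_db, nh_scores))  # order by increasing SNR
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        low, high = min(y0, y1), max(y0, y1)
        if low <= hi_score <= high:
            if y1 == y0:
                return x0  # flat segment: report its lower-SNR end
            return x0 + (hi_score - y0) * (x1 - x0) / (y1 - y0)
    return None
```

The same function can be applied to the recognition score or to the relative information transmitted for any single feature, using the corresponding n.h. curve.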
Concl. 1/1
4 CONCLUSION
Addition of broad-band noise with constant SNR on a short-time
(10 ms) basis simulated the effect of increased temporal &
spectral masking.
Simulation may be useful for preliminary evaluation and for
optimizing the processing parameters when developing speech
processing schemes for improving speech perception by
persons with sensorineural hearing loss.
THANK YOU
D. S. Jangamashetti, A. N. Cheeran, P. N. Kulkarni, P. C. Pandey, “Simulation of increased masking in
sensorineural hearing loss for a preliminary evaluation of speech processing schemes”, Proc. 20th
International Congress on Acoustics (ICA 2010), 23-27 August 2010, Sydney, Australia.
Abstract -- Sensorineural loss is characterized by increased hearing threshold, reduction in the
dynamic range of hearing and loudness recruitment, and increased temporal and spectral masking,
resulting in degraded speech perception. Several techniques including spectral contrast
enhancement, multi-band frequency compression, and dichotic binaural presentation have been
investigated for reducing the adverse effects of increased masking. Assessment of speech
processing techniques and optimization of processing parameters involves listening tests on
hearing-impaired listeners. These tests are time consuming and may cause fatigue, particularly in
elderly subjects. A simulation of hearing loss, by processing the speech signal through a model of
the loss characteristics, is useful in conducting the listening tests on normal-hearing subjects, for a
preliminary evaluation of the schemes and particularly for selecting the processing parameters. The
present study used addition of broad-band noise, band-limited to speech frequency range, at a
specific SNR with respect to short-time (10 ms) energy of the signal. Different levels of loss were
simulated by varying the SNR. In this simulation, no noise gets added during silence segments.
Listening tests to assess the loss simulation were conducted using three types of test material:
vowel-consonant-vowel (VCV) utterances, phonetically balanced word lists, and modified rhyme test.
Recognition score from subject responses was used as a measure of speech intelligibility and
response time was used as a measure of load on the perception process. For all three test
materials, the decrease in recognition scores and increase in response times for normal-hearing
subjects showed the same pattern as the corresponding results for subjects with moderate-to-severe
sensorineural loss. A relative information transmission analysis of the stimulus-response confusion
matrices for VCV utterances showed that the simulated loss did not affect reception of voicing and
nasality features and it had maximum adverse effect on the reception of place and duration features,
indicating that the addition of broadband noise with constant SNR with respect to short-time signal
energy simulated an increased spectral and temporal masking.
REFERENCES
1. B. C. J. Moore, An Introduction to the Psychology of Hearing, 4th ed. (Academic, London, 1997).
2. J. M. Pickett, The Acoustics of Speech Communication: Fundamentals, Speech Perception Theory, and Technology (Allyn & Bacon, Boston, Mass., 1999).
3. B. R. Glasberg and B. C. J. Moore, “Auditory filter shapes in subjects with unilateral and bilateral cochlear impairments,” J. Acoust. Soc. Am. 79, 1020–1033 (1986).
4. J. R. Dubno and A. B. Schaefer, “Comparison of frequency selectivity and consonant recognition among hearing-impaired and masked normal-hearing listeners,” J. Acoust. Soc. Am. 91 Pt. 1, 2110–2121 (1992).
5. H. T. Bunnell, “On enhancement of spectral contrast in speech for hearing-impaired listeners,” J. Acoust. Soc. Am. 88, 2546–2556 (1990).
6. T. Baer, B. C. J. Moore, and S. Gatehouse, “Spectral contrast enhancement of speech in noise for listeners with sensorineural hearing impairment: effects on intelligibility, quality, and response times,” J. Rehabil. Res. Dev. 30, 49–72 (1993).
7. I. Cohen, “Speech spectral modeling and spectral enhancement based on autoregressive conditional heteroscedasticity models,” Signal Processing 86, 698–709 (2006).
8. K. Yasu, K. Kobayashi, K. Shinohara, M. Hishitani, T. Arai, and Y. Murahara, “Frequency compression of critical band for digital hearing aids,” Proc. China-Japan Joint Conf. Acoustics, 159–162 (2002).
9. P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, “Multi-band frequency compression for reducing the effects of spectral masking,” Int. J. Speech Tech. 10, 219–227 (2007).
10. P. E. Lyregaard, “Frequency selectivity and speech intelligibility in noise,” Scand. Audiol. Suppl. 15, 113–122 (1982).
11. T. Lunner, S. Arlinger, and J. Hellgren, “8-channel digital filter bank for hearing aid use: preliminary results in monaural, diotic, and dichotic modes,” Scand. Audiol. Suppl. 38, 75–81 (1993).
12. D. S. Chaudhari and P. C. Pandey, “Dichotic presentation of speech signal with critical band filtering for improving speech perception,” Proc. IEEE ICASSP, 3601–3604 (1998).
13. A. Murase, F. Nakajima, S. Sakamoto, Y. Suzuki, and T. Kawase, “Effect and sound localization with dichotic-listening hearing aids,” Proc. 18th Int. Congress Acoust. (ICA), Kyoto, Japan, II-1519–1522 (2004).
14. E. Villchur, “Simulation of the effect of recruitment on loudness relationships in speech,” J. Acoust. Soc. Am. 56, 1601–1611 (1974).
15. E. Villchur, “Electronic models to simulate the effect of sensory distortions on speech perception by the deaf,” J. Acoust. Soc. Am. 62, 665–674 (1977).
16. B. C. J. Moore and B. R. Glasberg, “Simulation of the effects of loudness recruitment and threshold elevation on the intelligibility of speech in quiet and in a background of speech,” J. Acoust. Soc. Am. 94, 2050–2062 (1993).
17. M. ter Keurs, J. M. Festen, and R. Plomp, “Effect of spectral envelope smearing on speech reception. I,” J. Acoust. Soc. Am. 91, 2872–2880 (1992).
18. Y. Nejime and B. C. J. Moore, “Simulation of the effect of threshold elevation and loudness recruitment combined with reduced frequency selectivity on the intelligibility of speech in noise,” J. Acoust. Soc. Am. 102, 603–615 (1997).
19. L. E. Humes, B. Espinoza-Varas, and C. S. Watson, “Modelling sensorineural hearing loss-I. Model and retrospective evaluation,” J. Acoust. Soc. Am. 83, 188–202 (1988).
20. M. R. Leek, M. F. Dorman, and Q. Summerfield, “Minimum spectral contrast for vowel identification by normal-hearing and hearing-impaired listeners,” J. Acoust. Soc. Am. 81, 148–154 (1987).
21. D. A. Nelson, S. J. Chargo, J. G. Kopun, and R. L. Freyman, “Effect of forward-masked psychophysical tuning curves in quiet and noise,” J. Acoust. Soc. Am. 88, 2143–2151 (1990).
22. W. Jesteadt, Modeling Sensorineural Hearing Loss (Lawrence Erlbaum, Mahwah, New Jersey, 1997).
23. S. Gordon-Salant and P. J. Fitzgibbons, “Temporal factors and speech recognition performance in young and elderly listeners,” J. Speech Hear. Res. 36, 1276–1285 (1993).
24. G. A. Miller and P. E. Nicely, “An analysis of perceptual confusions among some English consonants,” J. Acoust. Soc. Am. 27, 338–352 (1955).
25. W. D. Voiers, “Evaluating processed speech using the diagnostic rhyme test,” Speech Tech. 55, 30–39 (1983).
26. A. S. House, C. E. Williams, M. H. L. Hecker, and K. D. Kryter, “Articulation testing methods: consonantal differentiation with closed-response set,” J. Acoust. Soc. Am. 37, 158–166 (1965).
27. K. D. Kryter and E. C. Whitman, “Some comparisons between rhyme and PB-word intelligibility tests,” J. Acoust. Soc. Am. 37, 1146 (1965).
28. S. Gatehouse and J. Gordon, “Response times to speech stimuli as measures of benefit from amplification,” Br. J. Audiol. 24, 63–68 (1990).
29. F. Apoux, O. Crouzet, and C. Lorenzi, “Temporal envelope expansion of speech in noise for normal-hearing and hearing-impaired listeners: Effects on identification performance and response times,” Hear. Res. 153, 123–131 (2001).
30. A. N. Cheeran, Speech processing with dichotic presentation for binaural hearing aids for moderate bilateral sensorineural loss, Ph.D. Thesis, School of Biosciences and Bioengineering, Indian Institute of Technology Bombay, India (2005).
31. D. S. Jangamashetti, Binaural dichotic presentation to reduce the effects of temporal and spectral masking due to sensorineural hearing loss, Ph.D. Thesis, Dept. of Elect. Engg., Indian Institute of Technology Bombay, India (2003).