19-Vocal-Tract

Auditory Remnants
April 5, 2012
Equal Loudness Curves
• Perceived loudness also depends on frequency.
Audiograms
• When an audiologist tests your hearing, they determine
your hearing threshold at several different frequencies.
• They then chart how much your hearing threshold differs
from that of a “normal” listener at those frequencies in an
audiogram.
• Noise-induced hearing loss tends to affect higher frequencies first.
• (especially around 4000 Hz)
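The audiogram described above comes down to a per-frequency subtraction. A minimal sketch in Python, assuming thresholds are given in dB. The function name and every number below are invented for illustration; they are not measured data.

```python
def hearing_level(listener_db, normal_db):
    """Return, per frequency, how far the listener's threshold lies above normal (dB)."""
    return {freq: listener_db[freq] - normal_db[freq] for freq in listener_db}

# Hypothetical thresholds (dB) at a few test frequencies (Hz); values are made up.
normal = {500: 10, 1000: 5, 2000: 5, 4000: 8}
listener = {500: 15, 1000: 10, 2000: 15, 4000: 48}

audiogram = hearing_level(listener, normal)
# Frequency with the largest loss relative to the "normal" listener.
worst = max(audiogram, key=audiogram.get)
```

With these made-up values the largest difference falls at 4000 Hz, which is the pattern the slide describes for noise-induced loss.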
Age
• Sensitivity to higher frequencies also diminishes with
age. (“Presbycusis”)
Note: the
“teen buzz”
Otitis Media
• Kids often get ear infections, which are technically
known as otitis media.
• = fluid fills the middle ear
• This leads to a form of conduction deafness, in which
sound is not transmitted as well to the cochlea.
• Auditorily, frequencies from 500 to 1000 Hz tend to drop
out.
Check out a Praat
demo.
Loudness
• The perceived loudness of a sound is measured in units
called sones.
• The sone scale also exhibits a non-linear relationship
with respect to absolute pressure values.
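One way to see the non-linearity is the commonly cited rule of thumb that loudness in sones doubles for every 10-phon increase above 40 phons, where 1 sone is the loudness of a 1000 Hz tone at 40 dB. The sketch below assumes that rule; it is not taken from the lecture, and the function name is hypothetical.

```python
def phons_to_sones(phons):
    """Approximate loudness in sones for a loudness level in phons.

    Assumes the doubling-per-10-phons rule, which is usually applied
    from about 40 phons upward.
    """
    return 2 ** ((phons - 40) / 10)

# A 20-phon increase (40 -> 60) quadruples perceived loudness.
quiet = phons_to_sones(40)
louder = phons_to_sones(60)
```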
Masking
• Another scale for measuring auditory frequency
emerged in the 1960s.
• This scale was inspired by the phenomenon of
auditory masking.
• One sound can “mask”, or obscure, the perception of
another.
• Unmasked:
• Masked:
• Q: How narrow can we make the bandwidth of the
noise, before the sinewave becomes perceptible?
• A: Masking bandwidth is narrower at lower frequencies.
Critical Bands
• Using this methodology, researchers eventually
determined that there were 24 critical bands of hearing.
• The auditory system integrates all acoustic energy
within each band.
•  Two tones within the same critical band of
frequencies sound like one tone
• Ex: critical band #9 ranges from 920-1080 Hz
•  F1 and F2 for
might merge together
• Each critical band ≈ 0.9 mm on the basilar membrane.
•  The auditory system consists of 24 band-pass filters.
• Each filter corresponds to one unit on the Bark scale.
Bark Scale of Frequency
• The Bark scale converts acoustic frequencies into
numbers for each critical band
Bark Table
Band  Center (Hz)  Range (Hz)     Band  Center (Hz)  Range (Hz)
1     50           20-100         13    1850         1720-2000
2     150          100-200        14    2150         2000-2320
3     250          200-300        15    2500         2320-2700
4     350          300-400        16    2900         2700-3150
5     450          400-510        17    3400         3150-3700
6     570          510-630        18    4000         3700-4400
7     700          630-770        19    4800         4400-5300
8     840          770-920        20    5800         5300-6400
9     1000         920-1080       21    7000         6400-7700
10    1170         1080-1270      22    8500         7700-9500
11    1370         1270-1480      23    10500        9500-12000
12    1600         1480-1720      24    13500        12000-15500
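The table can be used directly as a lookup from frequency to band number. A minimal sketch in Python, using only the band edges listed in the Bark table. The function names are hypothetical, and treating each lower edge as inclusive is an assumption.

```python
from bisect import bisect_right

# Upper edge (Hz) of each of the 24 critical bands, copied from the Bark table.
UPPER_EDGES = [100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
               2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500,
               12000, 15500]
LOWEST_EDGE = 20  # lower edge of band 1 (Hz)

def critical_band(freq_hz):
    """Return the band number (1-24) containing freq_hz, or None if out of range."""
    if freq_hz < LOWEST_EDGE or freq_hz >= UPPER_EDGES[-1]:
        return None
    # Count how many upper edges are <= freq_hz; the next band contains it.
    return bisect_right(UPPER_EDGES, freq_hz) + 1

def same_band(f1_hz, f2_hz):
    """True if both frequencies fall inside the same critical band."""
    band = critical_band(f1_hz)
    return band is not None and band == critical_band(f2_hz)
```

For example, `critical_band(1000)` gives band 9, and `same_band(950, 1050)` is true because both tones lie within 920-1080 Hz.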
Spectral Differences
• Acoustic vs. auditory spectra of
F1 and F2
Cochleagrams
• Cochleagrams are spectrogram-like representations
which incorporate auditory transformations for both pitch
and loudness perception
• Acoustic spectrogram vs. auditory cochleagram
representation of Cantonese word
• Check out Peter’s vowels in Praat.
Hearing Aids et al.
• Generally speaking, a hearing aid is simply an
amplifier.
• Old style: amplifies all frequencies
• New style: amplifies specific frequencies, based
on a listener’s particular hearing capabilities.
• More recently, profoundly deaf listeners may regain
some hearing through the use of a cochlear implant
(CI).
• For listeners with nerve deafness.
• However, CIs can only transmit a degraded signal to
the inner ear.
Cochlear Implants
A Cochlear Implant artificially stimulates the nerves
which are connected to the cochlea.
Nuts and Bolts
• The cochlear implant chain of events:
1. Microphone
2. Speech processor
3. Electrical stimulation
• What the CI user hears is entirely determined by the
code in the speech processor.
• The number of electrodes stimulating the cochlea ranges
from 8 to 22.
• Result: poor frequency resolution
• Also: cochlear implants cannot stimulate the low
frequency regions of the auditory nerve.
Noise Vocoding
• The speech processor operates like a series of critical
bands.
• It divides up the frequency scale into 8 (or 22) bands and
stimulates each electrode according to the average
intensity in each band.
This results in what sounds (to us) like a highly degraded
version of natural speech.
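The band-averaging step can be sketched as follows. This is a simplified illustration in Python, not the code of any real speech processor: it assumes equal-width bands up to the Nyquist frequency and uses a naive DFT on one short frame, whereas real devices choose their own band edges and filters. It covers only the analysis step (average intensity per band), not the resynthesis with noise. The function name is hypothetical.

```python
import cmath
import math

def band_intensities(frame, sample_rate, n_bands=8):
    """Average spectral power of one frame in n_bands equal-width bands."""
    n = len(frame)
    nyquist = sample_rate / 2
    band_width = nyquist / n_bands
    sums = [0.0] * n_bands
    counts = [0] * n_bands
    # Naive DFT over bins 1 .. n/2 - 1 (the DC bin is skipped).
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        freq = k * sample_rate / n
        band = min(int(freq // band_width), n_bands - 1)
        sums[band] += abs(coeff) ** 2
        counts[band] += 1
    # One average power value per band; each would drive one electrode.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

# Usage: a 1000 Hz sine, 64 samples at 8000 Hz, lands in the third band (index 2).
frame = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
levels = band_intensities(frame, 8000)
```

Everything inside a band is reduced to a single number, which is why fine spectral detail within a band is lost.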
What CIs Sound Like
• Check out some nursery rhymes which have been
processed through a CI simulator:
CI Perception
• One thing that is missing from vocoded speech is F0.
• …It only encodes spectral change.
• A former honors student, Aaron Byrnes, put together an
experiment testing intonation perception in CI-simulated
speech for his honors thesis.
• Tested: discrimination of questions vs. statements
• And identification of most prominent word in a
sentence.
• 8 channels:
• 22 channels:
The Findings
• CI User:
• Excellent identification of the most prominent word.
• At chance (50%) when distinguishing between
statements and questions.
• Normal-hearing listeners (hearing simulated speech):
• Good (90-95%) identification of the prominent word.
• Not too shabby (75%) at distinguishing statements
and questions.
• Conclusion 1: F0 information doesn’t get through the CI.
• Conclusion 2: Noise-vocoded speech might not be a
completely accurate CI simulation.
Mitigating Factors
• The amount of success with Cochlear Implants is highly
variable.
• Works best for those who had hearing before they
became deaf.
• Depends a lot on the person
• Possibly because of reorganization of the brain
• Works best for (in order):
• Environmental Sounds
• Speech
• Speaking on the telephone (bad)
• Music (really bad)
Critical Period?
• For congenitally deaf users, the Cochlear Implant
provides an unusual test of the “forbidden experiment”.
• The “critical period” is extremely early:
• They perform best, the earlier they receive the implant
(12 months old is the lower limit)
• Steady drop-off in performance thereafter
• Difficult to achieve natural levels of fluency in speech.
• Depends on how much they use the implant.
• Partially due to early sensory deprivation.
• Also due to degraded auditory signal.
Practical Considerations
• It is largely unknown how well anyone will perform with a
cochlear implant before they receive it.
• Possible predictors:
• lipreading ability
• rapid cues for place are largely obscured by the
noise vocoding process.
• fMRI scans of brain activity during presentation of
auditory stimuli.
One Last Auditory Thought
• Frequency coding of sound is found all the way up in
the auditory cortex.
• Also: some neurons only fire when sounds change.
Vocal Tract Physiology
April 5, 2012
The Toolkit
• There are four primary active articulators in speech.
• (articulators we can move around)
1. The lips
2. The lower jaw (mandible)
3. The tongue
4. The velum
• The pharynx can also be constricted, to some extent.
• Separate sets of muscles control each articulator...
Articulatory Speed
• The gold medal goes to the tongue tip...
• which is capable of 7.2 - 9.6 movements per
second.
• The rest:
• Mandible: 5.9 - 8.4 movements per second
• Back of tongue: 5.4 - 8.9
• Velum: 5.2 - 7.8
• Lips: 5.7 - 7.7
• Note: lips can be raised and lowered faster than they
can be protruded and rounded.
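To get a feel for these rates, they can be converted into time per movement. A small sketch in Python using the figures listed above; the names are hypothetical.

```python
# Rates in movements per second (low, high), copied from the list above.
RATES = {
    "tongue tip": (7.2, 9.6),
    "mandible": (5.9, 8.4),
    "back of tongue": (5.4, 8.9),
    "velum": (5.2, 7.8),
    "lips": (5.7, 7.7),
}

def ms_per_movement(rate_per_second):
    """Duration of a single movement in milliseconds."""
    return 1000.0 / rate_per_second

# (fastest, slowest) duration per articulator, in milliseconds.
durations = {name: (ms_per_movement(high), ms_per_movement(low))
             for name, (low, high) in RATES.items()}
```

At its fastest, the tongue tip therefore needs a little over 100 ms per movement.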
1. The Lips
• The orbicularis oris
muscle surrounds the lips.
• Contraction compresses
and rounds the lips.
• A muscle called the
mentalis also protrudes
the lips.
• Contraction of the
risorius muscle retracts
the corners of the lips...
• and spreads them.
By the way...
• The vowel [i] is typically produced with active lip
spreading.
• “Say cheese!”
• What acoustic effect would this have?
• Lips Normal:
• Lips Spread:
• Check ‘em out in Praat.
2. The Jaw
• Several different muscles are used to both lower and
raise the mandible.
• Primary raisers:
• Masseter
• Temporalis
• Internal pterygoid
• Lowerers:
• Anterior belly of the digastricus
• Geniohyoid
• Mylohyoid
• Note: in lowering, the mandible also retracts.
Articulatory Control
• People can produce vowels perfectly fine even when
a bite block holds their jaws open. (Lindblom, 1979)
• Adults get the formants right, right from the start...
• But kids need a little time to adjust.
• Abbs et al. (1984) experimented with pulling down
people’s jaws...
• when they had to say sequences like [aba] and
[afa]!
Abbs et al. EMG data
• Lip muscles adjust immediately for the sudden jaw
lowering...
• Adjustment happens faster than electrical signals can
travel to the motor cortex and back!
3. The Tongue
• The muscles controlling the tongue consist of:
1. Intrinsic muscles
• (completely within the tongue)
2. Extrinsic muscles
• (connect the tongue to outside structures)
• The intrinsic muscles include:
1. The superior longitudinal muscle
2. The inferior longitudinal muscle
3. Transverse muscles
4. Vertical muscles
Tongue: Sagittal View
• The superior
longitudinal muscle
pulls the tongue tip up
and back.
• Instrumental in
producing alveolars
and retroflexes.
• The inferior
longitudinal muscle
pulls the tongue tip
down and back.
• Helps with tongue
blade articulations.
Tongue: Coronal View
• The transverse muscles pull in the edges of the
tongue, and also lengthen the tongue to some extent.
• Helpful in producing laterals.
• Contraction of the vertical muscles flattens the tongue.
• Interdentals?
Extrinsic #1: Genioglossus
• The genioglossus
connects the tongue to
both the mandible and the
hyoid.
• Contraction of the
posterior genioglossus
moves the whole tongue
up and forwards.
• Crucial in palatals.
• Contraction of the
anterior genioglossus
curls the tongue tip down
and back.
Gene-ioglossus
Gene Simmons, of the rock band KISS, is
famous for his use of the genioglossus muscle.
Extrinsic #2: Styloglossus
• The styloglossus
connects the tongue to the
“styloid process” in front of
the ear.
• Pulls the tongue up and
back.
• ...for velar articulations.
• May also help groove
(sulcalize) the tongue.
Extrinsic #3: Hyoglossus
• The hyoglossus
connects the tongue to the
hyoid bone.
• Pulls the tongue down
and back.
• = pharyngeals
• Can also pull the sides of
the tongue down.
Extrinsic #4: Palatoglossus
• The palatoglossus connects
the tongue to the soft palate.
• Can be used to raise the back
of the tongue.
• And also to lower the
velum!
• Lowering the back of the
tongue may inadvertently pull
the velum down...
• leading to passive
nasalization of low vowels.
• Note: Great Lakes vowel shift
Chain Shifting
• The Great Lakes Shift is called a chain shift, because
first one vowel moves...
• And then a series of others follow.
• In this case, the first shift was:
• Theory: vowels have to stay distinct from one another.
• So listeners can understand what’s being said.
Back to the Shift
• The Great Lakes Shift was first noticed in the 1960s.
The Shift, Diagrammed
4. Velar Muscles
• The levator palatini
raises the velum.
• (connects the velum to
the temporal bone)
• The velum is lowered by
both the palatoglossus and
the palatopharyngeus...
• which connects the
palate to the pharynx.