26-Motor-Theory

The Motor Theory of Speech Perception
April 1, 2013
Palatography Preparations!
• First off: turn in your course project reports!
• Secondly: we will do the palatography demo on
Wednesday.
• We’ve already gotten volunteers for speakers
• We also need someone to volunteer for:
• Photography
• Transcription
• I’ll bring the goodies!
• Now: let’s watch some dogs playing the piano.
The Next Level
• Interestingly, categorical perception is not found for
non-speech stimuli.
• Miyawaki et al. tested perception of an F3 continuum
between /r/ and /l/.
The Next Level
• They also tested perception of the F3 transitions in
isolation.
• Listeners did not perceive these transitions categorically.
The Implications
• Interpretation: we do not perceive speech in the same
way we perceive other sounds.
• “Speech is special”…
• and the perception of speech is modular.
• A module is a special processor in our minds/brains
devoted to interpreting a particular kind of environmental
stimulus.
Module Characteristics
• You can think of a module as a “mental reflex”.
• A module of the mind is defined as having the following
characteristics:
1. Domain-specific
2. Automatic
3. Fast
4. Hard-wired in brain
5. Limited top-down access (you can’t “unperceive”)
• Example: the sense of vision operates modularly.
A Modular Mind Model
• Central processes: judgment, imagination, memory, attention
• Modules: vision, hearing, touch, speech
• Transducers: eyes, ears, skin, etc.
• External, physical reality
Remember this stuff?
• Speech is a “special” kind of sound because it exhibits
spectral change over time.
• → it’s processed by the speech module, not by the
auditory module.
SWS Findings
• The uninitiated either hear sinewave speech as speech or
as “whistles”, “chirps”, etc.
• Claim: once you hear it as speech, you can’t go back.
• The speech module takes precedence
• (Limited top-down access)
• Analogy: it’s impossible to not perceive real speech as
speech.
• We can’t hear the individual formants as whistles,
chirps, etc.
• Motor theory says: we don’t perceive the “sounds”, we
perceive the gestures which shape the spectrum.
More Evidence for Modularity
• It has also been observed that speech is perceived
multi-modally.
• i.e.: we can perceive it through vision, as well as
hearing (or some combination of the two).
• → We’re perceiving “gestures”
• …and the gestures are abstract.
• Interesting evidence: McGurk Effect
McGurk Effect, revealed
Audio      Visual      Perceived
ba    +    ga     →    da
ga    +    ba     →    bga
• Some interesting facts:
• The McGurk Effect is exceedingly robust.
• Adults show the McGurk Effect more than children.
• Americans show the McGurk Effect more than
Japanese.
Original McGurk Data
• Stimulus: Auditory ba-ba + Visual ga-ga
• Response types:
• Auditory: ba-ba
• Visual: ga-ga
• Fused: da-da
• Combo: gabga, bagba
Age      Auditory   Visual   Fused   Combo
3-5      19%        0%       81%     0%
7-8      36%        0%       64%     0%
18-40    2%         0%       98%     0%
Original McGurk Data
• Stimulus: Auditory ga-ga + Visual ba-ba
• Response types:
• Auditory: ga-ga
• Visual: ba-ba
• Fused: da-da
• Combo: gabga, bagba
Age      Auditory   Visual   Fused   Combo
3-5      57%        10%      0%      19%
7-8      36%        21%      11%     32%
18-40    11%        31%      0%      54%
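One way to read the two tables against the claim that adults show the McGurk Effect more than children is to ask how often each age group gave something other than the purely auditory response. The sketch below is a minimal illustration, not part of the original study: it copies the Auditory column from the tables above, and the dictionary and function names are made up for this example.

```python
# Percentage of "Auditory" responses by age group, copied from the two tables above.
# Keys name the stimulus as (auditory, visual); all variable names are hypothetical.
auditory_pct = {
    ("ba-ba", "ga-ga"): {"3-5": 19, "7-8": 36, "18-40": 2},
    ("ga-ga", "ba-ba"): {"3-5": 57, "7-8": 36, "18-40": 11},
}


def non_auditory_pct(table):
    """Share of responses that were not purely auditory, per age group."""
    return {age: 100 - pct for age, pct in table.items()}


influenced = {stim: non_auditory_pct(tbl) for stim, tbl in auditory_pct.items()}

for (audio, visual), by_age in influenced.items():
    print(f"audio {audio} + visual {visual}: {by_age}")
```

In both conditions the 18-40 group gives the fewest purely auditory responses (2% and 11%), which is the pattern behind the adults-versus-children claim.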
Audio-Visual Sidebar
• Visual cues affect the perception of speech in non-mismatched conditions, as well.
• Scientific studies of lipreading date back to the early
twentieth century
• The original goal: improve the speech perception
skills of the hearing-impaired
• Note: visual speech cues often complement audio
speech cues
• In particular: place of articulation
• However, training people to become better lipreaders
has proven difficult…
• Some people get it; some people don’t.
Sumby & Pollack (1954)
• First investigated the influence of visual information on the
perception of speech by normal-hearing listeners.
• Method:
• Presented individual word tokens to listeners in noise,
with simultaneous visual cues.
• Task: identify spoken word
• Clear:
• +10 dB SNR:
• +5 dB SNR:
• 0 dB SNR:
Sumby & Pollack data
[Figure: Auditory-Only vs. Audio-Visual conditions]
• Visual cues provide an intelligibility boost equivalent to
a 12 dB increase in signal-to-noise ratio.
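To get a feel for the size of a 12 dB boost, recall that decibels are logarithmic: a difference of d dB corresponds to a power ratio of 10^(d/10) and an amplitude ratio of 10^(d/20). The sketch below just does that arithmetic. The function names are hypothetical, and it assumes the signal-to-noise ratio is defined as a power ratio in the usual way.

```python
def db_to_power_ratio(db):
    """Convert a level difference in dB to a ratio of powers."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a ratio of amplitudes."""
    return 10 ** (db / 20)


BOOST_DB = 12  # the visual-cue benefit quoted above

power_ratio = db_to_power_ratio(BOOST_DB)
amplitude_ratio = db_to_amplitude_ratio(BOOST_DB)

print(f"{BOOST_DB} dB = power ratio of about {power_ratio:.1f}")
print(f"{BOOST_DB} dB = amplitude ratio of about {amplitude_ratio:.1f}")
```

Under these assumptions, seeing the talker is worth roughly as much as making the speech about 16 times more powerful relative to the noise.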
Tadoma Method
• Some deaf-blind people learn to perceive speech
through the tactile modality, by using the Tadoma
method.
Audio-Tactile Perception
• Fowler & Dekle tested the ability of (naive) college students
to perceive speech through the Tadoma method.
• Presented synthetic stops auditorily
• Combined with mismatched tactile information:
• Ex: audio /ga/ + tactile /ba/
• Also combined with mismatched orthographic information:
• Ex: audio /ga/ + orthographic /ba/
• Task: listeners reported what they “heard”
• Tactile condition biased listeners more towards “ba”
responses
Fowler & Dekle data
[Figure: responses in the orthographic mismatch condition (read “ba”) and the tactile mismatch condition (felt “ba”)]
Another Piece of the Puzzle
• Another interesting finding which has been used to
argue for the “speech is special” theory is duplex
perception.
• Take an isolated F3 transition and present it to one ear…
Do the Edges First!
• While presenting this spectral frame to the other ear:
Two Birds with One Spectrogram
• The resulting combo is perceived in duplex fashion:
• One ear hears the F3 “chirp”;
• The other ear hears the combined stimulus as “da”.
Duplex Interpretation
• Check out the spectrograms in Praat.
• Mann and Liberman (1983) found:
• Discrimination of the F3 chirps is gradient when
they’re in isolation…
• but categorical when combined with the spectral
frame.
• (Compare with the F3 discrimination experiment with
Japanese and American listeners)
• Interpretation: the “special” speech processor puts
the two pieces of the spectrogram together.
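The contrast between gradient and categorical discrimination can be illustrated with the classic labeling-based prediction for ABX discrimination: if listeners can only compare category labels, the predicted proportion correct for two stimuli is 0.5 + 0.5 * (pA - pB)^2, where pA and pB are the probabilities of giving each stimulus the same label. The sketch below is only an illustration under stated assumptions: the eight-step continuum, the logistic labeling function, and its boundary and slope values are all hypothetical, not data from Mann and Liberman.

```python
import math

BOUNDARY = 4.5  # hypothetical category boundary on an 8-step continuum
SLOPE = 2.0     # hypothetical steepness of the labeling function


def p_label(step):
    """Probability of labeling a continuum step as the first category (assumed logistic)."""
    return 1.0 / (1.0 + math.exp(SLOPE * (step - BOUNDARY)))


def predicted_abx(step_a, step_b):
    """Predicted ABX proportion correct when only category labels can be compared."""
    diff = p_label(step_a) - p_label(step_b)
    return 0.5 + 0.5 * diff ** 2


# Two-step comparisons along the continuum: 1 vs 3, 2 vs 4, ..., 6 vs 8.
predictions = {(a, a + 2): predicted_abx(a, a + 2) for a in range(1, 7)}

for pair, p in predictions.items():
    print(pair, round(p, 2))
```

Pairs that straddle the assumed boundary are predicted to be discriminated well above chance, while within-category pairs stay near 50%. Gradient discrimination, as with the isolated chirps, would show no such peak.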
fMRI data
• Benson et al. (2001)
• Non-Speech stimuli = notes, chords, and chord
progressions on a piano
fMRI data
• Benson et al. (2001)
• Difference in activation for natural speech stimuli versus
activation for sinewave speech stimuli
Mirror Neurons
• In the 1990s, researchers in Italy discovered what they
called mirror neurons in the brains of macaques.
• Macaques had been trained to make grasping motions
with their hands.
• Researchers recorded the activity of single neurons
while the monkeys were making these motions.
• Serendipity:
• the same neurons fired when the monkeys saw the
researchers making grasping motions.
• → a neurological link between perception and action.
• Motor theory claim: the same links exist in the human brain
for the perception of speech gestures.
Motor Theory, in a nutshell
• The big idea:
• We perceive speech as abstract “gestures”, not
sounds.
• Evidence:
1. The perceptual interpretation of speech differs
radically from the acoustic organization of speech
sounds
2. Speech perception is multi-modal
3. Direct (visual, tactile) information about gestures can
influence/override indirect (acoustic) speech cues
4. Limited top-down access to the primary, acoustic
elements of speech