Music and Speech
Human Cognition: Decoding Perceived,
Attended, Imagined Acoustic Events and
Human-Robot Interfaces
The Team
• Adriano Claro Monteiro
• Alain de Cheveigné
• Anahita Mehta
• Byron Galbraith
• Dimitra Emmanouilidou
• Edmund Lalor
• Deniz Erdogmus
• Jim O’Sullivan
• Mehmet Ozdas
• Lakshmi Krishnan
• Malcolm Slaney
• Mike Crosse
• Nima Mesgarani
• Jose L. “Pepe” Contreras-Vidal
• Shihab Shamma
• Thusitha Chandrapala
The Goal
• To determine a reliable measure of imagined
audition using electroencephalography (EEG).
• To use this measure to communicate.
What types of imagined audition?
• Speech:
– Short (~3-4s) sentences
• “The whole maritime population of Europe and America.”
• “Twinkle-twinkle little star.”
• “London Bridge is falling down, falling down, falling down.”
• Music
– Short (~3-4s) phrases
• Imperial March from Star Wars.
• Simple sequence of tones.
• Steady-State Auditory Stimulation
– 20 s trials
• Broadband signal amplitude modulated at 4 or 6 Hz
The Experiment
• 64-channel EEG system (Brain Vision LLC – thanks!)
• 500 samples/s
• Each “trial” consisted of the presentation of the actual
auditory stimulus (“perceived” condition) followed (2 s later)
by the subject imagining hearing that stimulus again
(“imagined” condition).
The Experiment
• Careful control of experimental timing.
• Trial structure: Perceived ...2 s... Imagined ...2 s... (×5) ... Break ... next stimulus
[Figure: on-screen countdown (4, 3, 2, 1) and fixation cross (+) between events.]
Data Analysis - Preprocessing
• Filtering
• Independent Component Analysis (ICA)
• Time-Shift Denoising Source Separation (DSS)
– Looks for reproducibility over stimulus repetitions (see the sketch below)
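The slide's DSS step is the time-shift variant; the sketch below shows only the core, shift-free idea: find linear combinations of channels whose trial-averaged (repeatable) power is maximal relative to total power, via a generalized eigendecomposition. Array shapes and function names are assumptions, not the project's code.

```python
# Minimal sketch of (shift-free) DSS; `epochs` is assumed to be a NumPy
# array of shape (n_trials, n_channels, n_times).
import numpy as np
from scipy.linalg import eigh

def dss(epochs):
    """Return an unmixing matrix whose components are ordered by
    trial-to-trial reproducibility."""
    n_trials, n_channels, n_times = epochs.shape
    # Baseline covariance: pooled over all single trials.
    X = epochs.transpose(1, 0, 2).reshape(n_channels, -1)
    C0 = X @ X.T / X.shape[1]
    # Biased covariance: covariance of the trial average, which keeps
    # only activity that repeats across stimulus repetitions.
    avg = epochs.mean(axis=0)
    C1 = avg @ avg.T / n_times
    # Directions maximizing the ratio of repeatable power to total power.
    evals, evecs = eigh(C1, C0)
    order = np.argsort(evals)[::-1]   # most repeatable component first
    return evecs[:, order]            # (n_channels, n_components)

# Component time series per trial:
# components = np.einsum('ck,tcs->tks', dss(epochs), epochs)
```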
Data Analysis: Hypothesis-driven.
• The hypothesis:
– EEG recorded while people listen to (actual) speech varies in a
way that relates to the amplitude envelope of the presented
(actual) speech.
– EEG recorded while people IMAGINE speech will vary in a way
that relates to the amplitude envelope of the IMAGINED
speech.
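For reference, a minimal sketch of the quantity the hypothesis refers to: the amplitude envelope of a speech waveform via the Hilbert transform, resampled to the 500 samples/s EEG rate. `audio` and `fs` are assumed inputs, not names from the project code.

```python
import numpy as np
from scipy.signal import hilbert, resample_poly

def amplitude_envelope(audio, fs, eeg_fs=500):
    envelope = np.abs(hilbert(audio))                # instantaneous amplitude
    return resample_poly(envelope, eeg_fs, int(fs))  # match the EEG sample rate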
Data Analysis: Hypothesis-driven.
• Phase consistency over trials...
• EEG from same sentence
imagined over several trials
should show consistent phase
variations.
• EEG from different imagined
sentences should not show
consistent phase variations.
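A standard measure of this consistency is inter-trial phase coherence (ITPC): band-pass the EEG, take the analytic phase of each trial, and average the unit phasors across trials; identical phase on every trial gives 1, random phase gives roughly 0. A minimal sketch, assuming `epochs` of shape (n_trials, n_channels, n_times):

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def itpc(epochs, fs=500, band=(4.0, 8.0)):
    """Inter-trial phase coherence in a frequency band."""
    b, a = butter(4, band, btype='bandpass', fs=fs)
    analytic = hilbert(filtfilt(b, a, epochs, axis=-1), axis=-1)
    phasors = analytic / np.abs(analytic)    # unit phase vectors per trial
    return np.abs(phasors.mean(axis=0))      # (n_channels, n_times), in [0, 1]
```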
Data Analysis: Hypothesis-driven.
• Actual speech: consistency in the theta (4-8 Hz) band.
• Imagined speech: consistency in the alpha (8-14 Hz) band.
Data Analysis: Hypothesis-driven.
• Red line – perceived music
• Green line – imagined music
Data Analysis - Decoding
[Figure: original vs. reconstructed stimulus for two imagined songs. London Bridge: r = 0.30, p = 3e-5; Twinkle Twinkle: r = 0.19, p = 0.01.]
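Reconstructions like those in the figure are typically produced by a backward (decoding) model: ridge regression from time-lagged EEG channels to the stimulus envelope, trained on some trials and correlated with the true envelope on held-out trials. A minimal sketch under that assumption; the shapes, lag count, and regularizer are illustrative:

```python
import numpy as np

def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of EEG (n_times, n_channels) as features."""
    n_times, n_channels = eeg.shape
    X = np.zeros((n_times, n_channels * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * n_channels:(lag + 1) * n_channels] = eeg[:n_times - lag]
    return X

def reconstruct(eeg_train, env_train, eeg_test, n_lags=32, lam=1e3):
    Xtr, Xte = lag_matrix(eeg_train, n_lags), lag_matrix(eeg_test, n_lags)
    # Ridge solution: w = (X'X + lam*I)^-1 X'y
    w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(Xtr.shape[1]),
                        Xtr.T @ env_train)
    return Xte @ w

# r = np.corrcoef(reconstruct(...), env_test)[0, 1]
```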
Data Analysis - SSAEP
[Figure: responses at the 4 Hz and 6 Hz modulation rates, perceived and imagined conditions.]
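The SSAEP analysis comes down to measuring spectral power at the stimulus modulation rate (4 or 6 Hz) over a 20 s trial. A minimal sketch for one channel; the function and variable names are illustrative:

```python
import numpy as np

def power_at(eeg_trial, fs, freq):
    """Power at `freq` (Hz) for a single-channel trial via the DFT."""
    spectrum = np.abs(np.fft.rfft(eeg_trial)) ** 2
    freqs = np.fft.rfftfreq(eeg_trial.size, d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]

# e.g. compare power_at(trial, 500, 4.0) against power_at(trial, 500, 6.0)
```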
Data Analysis
• Data Mining/Machine Learning Approaches:
SVM Classifier Input
• EEG data (channels × time):
  $EEG = \begin{bmatrix} e(t)_1 \\ \vdots \\ e(t)_{64} \end{bmatrix}$
• Concatenate channels into a single row vector per trial:
  $EEG = [\, e(t)_1 \;\cdots\; e(t)_{64} \,]$
• Group N trials:
  $X = \begin{bmatrix} EEG_1 \\ \vdots \\ EEG_N \end{bmatrix}$
• Input covariance matrix:
  $C_X = X X^T$
• Class labels: one binary label (0 or 1) per trial identifying the imagined stimulus; the classifier returns predicted labels in the same form.
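A minimal sketch of this input pipeline feeding an SVM with cross-validated decoding accuracy, using scikit-learn; `epochs`, `labels`, the linear kernel, and the fold count are assumptions, not details from the slides:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def svm_decoding_accuracy(epochs, labels):
    # Concatenate channels: one long feature vector per trial.
    X = epochs.reshape(len(epochs), -1)
    clf = make_pipeline(StandardScaler(), SVC(kernel='linear'))
    return cross_val_score(clf, X, labels, cv=5).mean()
```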
SVM Classifier Results
• Decoding imagined speech and music:
– Mean DA (decoding accuracy) = 87%
– Mean DA = 90%
– Mean DA = 90%
DCT Processing Chain
• Raw EEG signal (500 Hz data) → DSS output (look for repeatability) → DCT output (reduce dimensionality)
[Figure panels: mean inputs 1 and 2, mean DSS results for outputs 1 and 2, and DCT models for classes 1 and 2.]
DCT Classification Performance
[Figure: classification accuracy (percent).]
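A minimal sketch of the "reduce dimensionality" step: keep only the low-order DCT coefficients of each DSS component, which capture its slow, repeatable shape. The input shape and the number of retained coefficients are assumptions:

```python
import numpy as np
from scipy.fftpack import dct

def dct_features(components, n_coeffs=10):
    """`components`: (n_trials, n_components, n_times) DSS outputs."""
    # Type-II DCT along time; low-order coefficients summarize the slow
    # waveform shape and discard fine temporal detail.
    coeffs = dct(components, type=2, axis=-1, norm='ortho')
    return coeffs[..., :n_coeffs].reshape(len(components), -1)
```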
Data Analysis
• Data Mining/Machine Learning Approaches:
– Linear Discriminant Analysis on Different Frequency Bands
• Music vs. Speech
• Speech 1 vs. Speech 2
• Music 1 vs. Music 2
• Speech vs. Rest
• Music vs. Rest
– Results: ~50-66% accuracy (see the sketch below)
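A minimal sketch of LDA on band-power features, consistent with the description above; the band choices, array shapes, and names are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

BANDS = {'theta': (4, 8), 'alpha': (8, 14), 'beta': (14, 30)}

def band_power_features(epochs, fs=500):
    """`epochs`: (n_trials, n_channels, n_times) -> log band power."""
    feats = []
    for lo, hi in BANDS.values():
        b, a = butter(4, (lo, hi), btype='bandpass', fs=fs)
        filtered = filtfilt(b, a, epochs, axis=-1)
        feats.append(np.log((filtered ** 2).mean(axis=-1)))
    return np.concatenate(feats, axis=1)   # (n_trials, n_channels * n_bands)

# acc = cross_val_score(LinearDiscriminantAnalysis(),
#                       band_power_features(epochs), labels, cv=5).mean()
```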
Summary
• Both hypothesis-driven and machine-learning approaches indicate
that it is possible to decode/classify imagined audition
• Many very encouraging results that align with our original
hypothesis
• More data needed!!
• In a controlled environment!!
• To be continued...