Neural Encoding of Speech in Auditory Cortex Jonathan Z. Simon Department of Biology Department of Electrical & Computer Engineering Institute for Systems Research University of Maryland University College London, 22 June 2015 http://www.isr.umd.edu/Labs/CSSL/simonlab
Neural Representation of Speech: Temporal
[Diagram: MEG Responses ... → Decoder → Speech Envelope]
Reconstruction accuracy comparable to single unit & ECoG recordings (up to ~10 Hz)
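The decoder in this slide maps multichannel MEG responses back to the speech envelope. A minimal sketch of one common way to build such a linear decoder follows, assuming ridge-regularized least squares over time-lagged sensor data. The function names, the lag count, the ridge value, and the synthetic data are all illustrative assumptions, not the method or data from the talk.

```python
import numpy as np

def lagged(meg, n_lags):
    """Stack meg[t], meg[t+1], ..., meg[t+n_lags-1] as the features for time t.

    meg has shape (n_times, n_channels). The neural response follows the
    stimulus, so the decoder looks at MEG samples at and after time t.
    """
    n_times, n_ch = meg.shape
    X = np.zeros((n_times, n_ch * n_lags))
    for k in range(n_lags):
        X[:n_times - k, k * n_ch:(k + 1) * n_ch] = meg[k:]
    return X

def fit_decoder(meg, envelope, n_lags, ridge):
    """Ridge-regularized least squares from lagged MEG to the envelope."""
    X = lagged(meg, n_lags)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ envelope)

def reconstruct(meg, weights, n_lags):
    return lagged(meg, n_lags) @ weights

# Synthetic stand-in data (assumption): each "channel" is a delayed copy
# of the envelope plus independent noise.
rng = np.random.default_rng(0)
n_times, n_ch, n_lags = 4000, 8, 5
envelope = rng.standard_normal(n_times)
delays = rng.integers(0, n_lags, n_ch)
meg = np.stack([np.roll(envelope, d) for d in delays], axis=1)
meg = meg + rng.standard_normal(meg.shape)

# Fit on the first half, evaluate on the held-out second half.
half = n_times // 2
weights = fit_decoder(meg[:half], envelope[:half], n_lags, ridge=1.0)
recon = reconstruct(meg[half:], weights, n_lags)
accuracy = np.corrcoef(recon, envelope[half:])[0, 1]
print(f"held-out reconstruction accuracy (correlation): {accuracy:.2f}")
```

Reconstruction accuracy is scored here as the correlation between the reconstructed and actual envelope on held-out data, which matches the "correlation" axis used in the figures later in the deck.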
Speech in Noise
Ding & Simon, J Neuroscience (2013)
Speech in Noise: Results
[Figure, panel A: Neural Reconstruction of Underlying Speech Envelope, shown at +6 dB and -6 dB (scale bar: 1 s)]
[Figure, panel B: Reconstruction Accuracy; correlation (0 to 0.2) vs. SNR (dB): Q, +6, +2, −3, −6, −9]
[Figure, panel C: Correlation with Intelligibility across Subjects; reconstruction accuracy (0 to 0.3) vs. intelligibility (%, 0 to 100)]
Ding & Simon, J Neuroscience (2013)
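The SNR conditions above (+6 to −9 dB) are power ratios between speech and noise. A minimal sketch of how a mixture at a target SNR could be built, assuming SNR is defined on mean power over the whole signal. The function name `mix_at_snr` and the random stand-in signals are hypothetical and are not the stimuli from the study.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db.

    Returns the mixture and the scaled noise.
    """
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise

# Random stand-ins for a speech waveform and a noise waveform (assumption).
rng = np.random.default_rng(1)
speech = rng.standard_normal(16000)
noise = rng.standard_normal(16000)

for snr_db in (6, 2, -3, -6, -9):
    mixture, scaled_noise = mix_at_snr(speech, noise, snr_db)
    achieved = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))
    print(f"target {snr_db:+d} dB, achieved {achieved:+.2f} dB")
```

At −6 dB the noise carries roughly four times the power of the speech, since 10^(6/10) ≈ 3.98.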
Noise-Vocoded Speech
Ding, Chatterjee & Simon, NeuroImage (2014)
“in noise” = +3 dB SNR
Noise-Vocoded Speech: Results
• Cortical entrainment to natural speech robust to noise
• Cortical entrainment to vocoded speech is not
• Not explainable by passive envelope tracking mechanisms
  - noise vocoding does not directly affect the stimulus envelope
Cortical Speech Representations
• Neural Representations: Encoding & Decoding
• Linear models: Useful & Robust
• Speech Envelope only (as seen by MEG)
• Envelope Rates: ~ 1 - 10 Hz
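The envelope referred to above is the slow amplitude modulation of the speech waveform. A minimal sketch of extracting a 1 to 10 Hz envelope, assuming a broadband Hilbert envelope followed by a brick-wall FFT band limit. Both choices are simplifications made for illustration, and the function names and the amplitude-modulated test tone are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive frequencies."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def slow_envelope(x, fs, lo=1.0, hi=10.0):
    """Broadband envelope, band-limited to [lo, hi] Hz with an FFT mask."""
    envelope = np.abs(analytic_signal(x))
    spectrum = np.fft.rfft(envelope)
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(envelope))

# Test tone (assumption): 1 kHz carrier modulated at 4 Hz and 40 Hz.
fs = 8000
t = np.arange(2 * fs) / fs
modulator = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * t) + 0.3 * np.sin(2 * np.pi * 40 * t)
signal = modulator * np.sin(2 * np.pi * 1000 * t)

env = slow_envelope(signal, fs)
# Only the 4 Hz modulation falls inside the 1-10 Hz band; the constant
# offset and the 40 Hz modulation are removed.
```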
Alex Katz, The Cocktail Party
The Cocktail Party
Experiments
[Diagram: speech and competing speech]
Experiments in Progress
[Diagrams: speech and competing speech with reverberation; with an older listener; with additional competing speech]
Two Competing Speakers
Selective Neural Encoding
Unselective vs. Selective Neural Encoding
Stream-Specific Representation
[Figure: attended speech envelopes reconstructed from MEG, for a representative subject and for the grand average over subjects, while attending to speaker 1 vs. attending to speaker 2. Identical stimuli!]
Ding & Simon, PNAS (2012)
Single Trial Speech Reconstruction
Ding & Simon, PNAS (2012)
Overall Speech Reconstruction
[Figure: correlation (0 to 0.2) of the attended speech reconstruction and of the background reconstruction with the attended speech and the background]
Distinct neural representations for different speech streams
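One way to make use of stream-specific reconstructions on a single trial is to correlate the reconstruction with each speaker's envelope and pick the larger correlation. A minimal sketch under that assumption follows. The function `decode_attended`, the mixing weights, and the synthetic envelopes are hypothetical, and this is not claimed to be the analysis from the cited paper.

```python
import numpy as np

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def decode_attended(reconstruction, envelope_1, envelope_2):
    """Return the speaker (1 or 2) whose envelope correlates best with the
    reconstruction, along with both correlations."""
    r1 = pearson(reconstruction, envelope_1)
    r2 = pearson(reconstruction, envelope_2)
    return (1 if r1 >= r2 else 2), r1, r2

# Synthetic stand-ins (assumption): two independent envelopes, and a noisy
# reconstruction dominated by speaker 2, with a weak trace of speaker 1.
rng = np.random.default_rng(2)
n = 3000
env1 = rng.standard_normal(n)
env2 = rng.standard_normal(n)
recon = 0.3 * env2 + 0.05 * env1 + rng.standard_normal(n)

speaker, r1, r2 = decode_attended(recon, env1, env2)
print(f"decoded attended speaker: {speaker} (r1={r1:.2f}, r2={r2:.2f})")
```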
Invariance Under Relative Loudness Change?
Invariance under Relative Loudness Change
[Figure: correlation (0.1 to 0.2) for attended and background speech vs. Speaker Relative Intensity (dB): -8, -5, 0, 5, 8]
Neural Results
• Neural representation invariant to relative loudness change
• M100_STRF strongly modulated by attention, but not M50_STRF
[Figure: STRFs for Attended and Background speech; frequency (kHz): 0.2, 0.5, 1, 3 vs. time (ms): 0 to 200]
Neural Sources
[Figure: source locations of M50_STRF, M100_STRF, and M100 in the left and right hemispheres; axes: anterior, posterior, medial; scale bar: 5 mm]
• M100_STRF source near (same as?) M100 source: Planum Temporale (PT)
• M50_STRF source is anterior and medial to M100 (same as M50?): Heschl’s Gyrus (HG)
• PT strongly modulated by attention, but not HG
Cortical Object-Processing Hierarchy
[Figure: Attentional Modulation; attended vs. background, time (ms): 0 to 400]
[Figure: Influence of Relative Intensity; clean, -5 dB, -8 dB, 0 dB, 5 dB, 8 dB, time (ms): 0 to 400]
• M100_STRF strongly modulated by attention, but not M50_STRF.
• M100_STRF invariant against acoustic changes.
• Objects are well represented neurally at 100 ms, but not at 50 ms.
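The STRF (spectro-temporal response function) behind the M50_STRF and M100_STRF components is a linear forward model: the response at time t is a weighted sum of the stimulus spectrogram over frequency and recent time lags. A minimal sketch of estimating one follows, with ridge regression used for brevity, which is not necessarily the estimator used in the cited work. The sampling rate, the 50 ms and 100 ms components, their signs and sizes, and all names are illustrative assumptions.

```python
import numpy as np

def lagged_spectrogram(spec, n_lags):
    """spec has shape (n_times, n_freqs). Feature block k holds the
    spectrogram delayed by k samples, so the response at time t can
    depend on the stimulus at times t, t-1, ..., t-n_lags+1."""
    n_times, n_freqs = spec.shape
    X = np.zeros((n_times, n_freqs * n_lags))
    for k in range(n_lags):
        X[k:, k * n_freqs:(k + 1) * n_freqs] = spec[:n_times - k]
    return X

def fit_strf(spec, response, n_lags, ridge):
    """Ridge estimate of the STRF, returned with shape (n_freqs, n_lags)."""
    X = lagged_spectrogram(spec, n_lags)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(A, X.T @ response)
    return w.reshape(n_lags, spec.shape[1]).T

# Synthetic example (assumption): 100 Hz sampling, so lag 5 = 50 ms and
# lag 10 = 100 ms; a hypothetical STRF with one component at each latency.
rng = np.random.default_rng(3)
n_times, n_freqs, n_lags = 5000, 4, 20
true_strf = np.zeros((n_freqs, n_lags))
true_strf[1, 5] = 0.5
true_strf[1, 10] = -1.0

spec = rng.standard_normal((n_times, n_freqs))
response = lagged_spectrogram(spec, n_lags) @ true_strf.T.reshape(-1)
response = response + 0.5 * rng.standard_normal(n_times)

est_strf = fit_strf(spec, response, n_lags, ridge=1.0)
peak = np.unravel_index(np.argmax(np.abs(est_strf)), est_strf.shape)
print("largest estimated component at (frequency band, lag):", peak)
```

Comparing STRFs estimated separately for the attended and the background speaker is what allows a statement such as "modulated by attention at 100 ms but not at 50 ms".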