Speech perception as a window into language processing: Real-time spoken word recognition, specific language impairment, and CIs. Bob McMurray, Dept. of Psychology, Dept. of Communication Sciences and Disorders


Jan 14, 2016


Thanks to Richard N. Aslin, Michael K. Tanenhaus, Meghan Clayards, Jennifer ...
Transcript
  • Speech perception as a window into language processing:

    Real-time spoken word recognition, specific language impairment, and CIs. Bob McMurray, Dept. of Psychology, Dept. of Communication Sciences and Disorders

  • Thanks to Joe Toscano, Keith Apfelbaum, Gwyn Rost, Ashley Farris-Trimble, Marcus Galle, Cheyenne Munson, Dan McEchron, Joel Dennhardt, Jessica Walker, Lea Greiner

  • Introductions: I do "ba"s and "pa"s.

  • Cognitive Science (Psychology, Linguistics, Philosophy, Computer Science, Neuroscience): diverse fields are united by their commitment to understanding the basic mechanisms or processes that underlie perception, cognition, and language, wherever they occur.

  • Language disorders: a useful way to justify basic research to NIH...

  • Introductions: Developmental Science. [Diagram: genes, behavior, and the physical/social environment interacting across levels.]

    Development is multiply determined: the product of interactions between levels of analysis, characterized by non-obvious causation, and with no single end-state.

  • Language disorders: a useful way to justify basic research to NIH... A useful reminder of the multi-potential nature of language development, but not a rigorous way to approach basic theoretical questions.

  • Carl Seashore, Cora Busey Hillis & Beth Wellman, Boyd McCandless & Charlie Spiker, Wendell Johnson, E.F. Lindquist, Bruce Tomblin


  • Individual differences (including disorders) in language development and outcomes:

    Reveal the range of variation that our theories must account for. Allow us to examine the consequences of variation in the internal structure of the system.

  • Braitenberg (1984) Vehicles. [Diagram: light sensor, wire, wheel/motor; a light-seeking and a light-avoiding vehicle.] Simple mechanisms give rise to complex behavior, but many such mechanisms are possible. It is easier to understand a mechanism by building outward than by observing inward. Disordered language users allow us to observe the consequences of a change in mechanism.
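The Braitenberg point can be made concrete in a few lines. This is a hypothetical sketch, not from the talk: the sensor geometry, intensity fall-off, and constants are all illustrative assumptions. Crossed sensor-motor wiring yields a light-seeker; uncrossed wiring yields a light-avoider.

```python
# Hypothetical sketch of Braitenberg (1984) vehicles: two light sensors
# driving two motors. Crossed wiring seeks the light; uncrossed avoids it.
# Geometry, fall-off function, and constants are illustrative assumptions.
import math

def light_intensity(pos, light):
    """Intensity falls off with squared distance from the light source."""
    d2 = (pos[0] - light[0]) ** 2 + (pos[1] - light[1]) ** 2
    return 1.0 / (1.0 + d2)

def step(x, y, heading, light, crossed, dt=0.1, axle=0.2):
    """One update: each sensor drives one motor. With crossed wiring the
    left sensor drives the right motor, turning the vehicle toward light."""
    left = (x + 0.1 * math.cos(heading + 0.5), y + 0.1 * math.sin(heading + 0.5))
    right = (x + 0.1 * math.cos(heading - 0.5), y + 0.1 * math.sin(heading - 0.5))
    s_left, s_right = light_intensity(left, light), light_intensity(right, light)
    m_left, m_right = (s_right, s_left) if crossed else (s_left, s_right)
    heading += (m_right - m_left) / axle * dt        # differential-drive turn
    x += (m_left + m_right) / 2 * math.cos(heading) * dt
    y += (m_left + m_right) / 2 * math.sin(heading) * dt
    return x, y, heading

def distance_after(crossed, n=400, light=(2.0, 1.0)):
    """Distance to the light after n steps from a fixed start."""
    x, y, heading = 0.0, 0.0, 0.0
    for _ in range(n):
        x, y, heading = step(x, y, heading, light, crossed)
    return math.hypot(x - light[0], y - light[1])
```

Comparing `distance_after(True)` with `distance_after(False)` contrasts the seeker with the avoider: the same mechanism with one wire swapped produces opposite behavior.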


  • Psychology, Linguistics, Philosophy, Computer Science, Neuroscience. The Future of Cognitive Science, UC Merced, May 2008.

  • Psychology, Linguistics, Philosophy, Computer Science, Neuroscience, Education, Anthropology, Mathematics, Biology, Robotics, Speech Pathology, Clinical Psych, Movement Sci., Psychiatry.

  • But simultaneously: a detailed understanding of the process of language use and development may enable us to better understand disorders.

  • A process-oriented approach to individual differences (example word: beach).

    1) Define the process: what steps does the brain/mind/language system/child take to get from some clearly defined input to some clearly defined output?

    2) How can we measure this process as it happens?

    3) Identify a population: what will we relate variation in process to?

    4) What dimensions can vary within that process? Which covary with outcome variables?

  • Goal today: show how understanding the real-time (and developmental) processes that underlie language in normal listeners can offer an important (complementary) viewpoint on individual differences.

    But first: I have to show you what those processes look like (and dispel a few misconceptions about speech perception).

    Disclaimer: much of this work examines adults or older kids. It is easier to measure real-time processing using more complex tasks, and easier to conceptualize the process without having to worry about development (as much). Consequently, we take an individual differences approach rather than a developmental one (but ask me about development).

  • Overview

    1) Speech perception as a language process: problems of speech and word recognition; fine-grained detail and word recognition; revisiting categorical perception; using acoustic detail over time; the beginnings of a comprehensive approach.

    2) Individual differences: a process view of individual differences; Case study 1: SLI; eye-movement methods for individual differences; Case study 2: cochlear implants.

  • The Domain: Speech & Words (example word: beach)

    Speech perception, word recognition and their development are an ideal domain for these questions.

    Excellent understanding of the input: the acoustics of a single word; the statistical properties of a language.

  • Major theoretical issue: lack of invariance.

    Acoustic cues do not directly distinguish categories due to: talker variation (Allen & Miller, 2003; Jongman, Wayland & Wang, 2000; Peterson & Barney, 1955); the influence of neighboring phonemes (coarticulation) (Fowler & Smith, 1986; Delattre, Liberman & Cooper, 1955); speaking rate variation (Miller, Green & Reeves, 1986; Summerfield, 1981); and dialect variation (Clopper, Pisoni & De Jong, 2006).

  • Allen & Miller, 1999

  • [Figure: vowel space, F1 (Hz) by F2 (Hz). Cole, Linebaugh, Munson & McMurray, 2010, J. Phon]


  • The Domain: Speech & Words (example word: beach)

    Speech perception, word recognition and their development are an ideal domain for these questions.

    Excellent understanding of the input (the acoustics of a single word; the statistical properties of a language), but a difficult problem to solve.

    Tractable output units.

  • The Domain: Speech & Words. [Diagram: extract acoustic cues -> identify phonemes -> activate words.]

    But phonemes: have no meaning in isolation; are theoretically controversial (Port, 2007; Pisoni, 1997); and are hard to measure directly (e.g., Norris, McQueen & Cutler, 2000; Pisoni & Tash, 1974; Schouten, Gerrits & Van Hessen, 2003), particularly in populations with poor phoneme awareness or metalinguistic ability, and particularly in a way that gives online (moment-by-moment) measurement.

  • The Domain: Speech & Words. [Diagram: extract acoustic cues -> identify phonemes -> activate words.]

  • The Domain: Speech & Words. [Diagram: activate words -> meaning (semantics), sentence processing (syntax), reference (pragmatics).]

    Words are functionally relevant: crucial for semantics, sentence processing, and reference. Most everyone agrees on them (but see Elman, 2008, SRCLD).

  • Online Word Recognition. Major theoretical issue in word recognition: time.

    Information arrives sequentially. At early points in time, the signal is temporarily ambiguous; later-arriving information disambiguates the word.

  • Online Word Recognition. If the input is phonemic, word recognition is characterized by: immediacy; activation-based, parallel processing; and competition. [Diagram: input "s... a... n... d... l" unfolding over time, activating "sandal" and competitors such as "soup."]

  • Measuring Temporal Dynamics. How do we measure unfolding activation? Eye-movements in the Visual World Paradigm (Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995; Allopenna, Magnuson & Tanenhaus, 1998).

    Subjects hear spoken language and manipulate objects in a visual world. The visual world includes a set of objects with interesting linguistic properties: a sandal, a sandwich, a candle, and an unrelated item. Eye-movements to each object are monitored throughout the task.

  • Task: a moment to view the items.

  • Task: "Sandal."

  • Task: "Bear." Repeat 200-1000 times (new words, locations, etc.).

  • Why use eye-movements and the visual world paradigm?

    It is a relatively natural task, easy to use with clinical populations: children with dyslexia (Desroches, Joanisse, & Robertson, 2006); autistic children (Brock, Norbury, Einav, & Nation, 2008; Campana, Silverman, Tanenhaus, Bennetto, & Packard, 2005); people with aphasia (Yee, Blumstein, & Sedivy, 2004, 2008); children with SLI (Nation, Marshall, & Altmann, 2003).

  • Why use eye-movements and the visual world paradigm? It is a relatively natural task, easy to use with clinical populations.

    Eye-movements are generated very fast (within 200 ms of the first bit of information) and are time-locked to speech. Subjects aren't aware of their eye-movements. Fixation probability maps onto lexical activation. It measures a functional language ability.

  • Eye movement analysis: Target = sandal; Cohort = sandwich; Rhyme = candle; Unrelated = necklace. [Figure: fixations across trials 1-5, with a 200 ms marker.]
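The analysis sketched on this slide, turning per-trial fixation records into fixation-proportion curves, can be illustrated with a minimal (hypothetical) function; the trial encoding used here is an assumption, not the labs' actual data format.

```python
# Hypothetical sketch of the basic visual-world analysis: for each time
# sample after word onset, compute the proportion of trials fixating each
# object type. The trial encoding is an illustrative assumption.
def fixation_proportions(trials, n_samples):
    """trials[i][t] is the object fixated on trial i at sample t
    ('target', 'cohort', 'rhyme', 'unrelated', or None for no fixation)."""
    objects = ("target", "cohort", "rhyme", "unrelated")
    curves = {obj: [0.0] * n_samples for obj in objects}
    for t in range(n_samples):
        for trial in trials:
            if trial[t] in curves:
                curves[trial[t]][t] += 1
        for obj in objects:
            curves[obj][t] /= len(trials)
    return curves
```

Averaging over hundreds of trials in this way produces the smooth target/cohort/rhyme/unrelated curves shown on the following slides.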

  • [Figure: fixation curves for "sandal." Allopenna, Magnuson & Tanenhaus, 1998; McMurray, Samelson, Lee & Tomblin, 2010]

  • [Figure: fixation proportion (0-0.9) over time (0-2000 ms).]

  • Measuring speech perception through the lens of spoken word recognition. [Diagram: activate words -> meaning (semantics), sentence processing (syntax), reference (pragmatics).]

    Words are functionally relevant: crucial for semantics, sentence processing, and reference. Most everyone agrees on them (but see Elman, 2009, SRCLD). They are easy to measure directly (e.g., Tanenhaus, Spivey-Knowlton, Sedivy & Eberhard, 1995; Allopenna, Magnuson & Tanenhaus, 1998), even in populations with poor phoneme awareness or metalinguistic ability, and in a way that gives online (moment-by-moment) data.

    This ensures that whatever differences we find matter for the next level up; it is theoretically more grounded; and it provides a multi-dimensional online measure.


  • The Domain: Speech & Words (example word: beach)

    Speech perception, word recognition and their development are an ideal domain for these questions.

    Excellent understanding of the input (the acoustics of a single word; the statistical properties of a language), but a difficult problem to solve.

    Tractable output units (spoken word recognition), but the problem of time.

    Associated with many impairments.

  • Auditory, speech, or lexical deficits have been reported in a variety of clinical populations: specific / non-specific language impairment; dyslexia / struggling readers; autism; cerebellar damage; Broca's aphasia; Down syndrome; hard of hearing; cochlear implant users; cognitive decline; schizophrenia.

  • So what's the process? How do listeners map a highly variable acoustic input onto lexical candidates as the input unfolds over time?

  • Variance Reduction in Speech. [Diagram: extract acoustic cues (normalization, warping perceptual space) -> identify phonemes (categorical perception) -> activate words (competition, graded activation).]

  • Problems with Variance Reduction. Continuous detail could be useful (Martin & Bunnell, 1981; Gow, 2001; McMurray et al., 2009). [Examples: "bak," "bas," "bar."]

  • Problems with Variance Reduction. Continuous detail could be useful (Martin & Bunnell, 1981; Gow, 2001; McMurray et al., 2009). Some useful variation is not phonemic (Salverda, Dahan & McQueen, 2003; Gow & Gordon, 1995). Acoustic cues are spread out over time: how do you know when you are done with a phoneme and ready for word recognition?

  • [Figure: d/g/n coarticulation; Fowler, 1984; Cole, Linebaugh, Munson & McMurray, 2010.]

  • The alternative: fine-grained detail can bias lexical activation; let lexical competition sort it out.

    Advantages: it helps with invariance, by not making a firm commitment on any given cue (the lexicon may offer more support); and it helps with time, by using fine-grained detail to make earlier commitments.

    But this stands in stark contrast to findings of categorical perception (Liberman, Harris, Hoffman & Griffith, 1957).

  • Categorical Perception: subphonemic variation in VOT is discarded in favor of a discrete symbol (phoneme).

  • Categorical Perception: evidence against categorical perception comes from discrimination task variants (Schouten, Gerrits & Van Hessen, 2003; Carney, Widin & Viemeister, 1977), training studies (Carney et al., 1977; Pisoni & Lazarus, 1974), and rating tasks (Massaro & Cohen, 1983). But there was no evidence that this fine-grained detail actually affects higher-level (lexical) processes.

  • Overview

    1) Speech perception as a language process: problems of speech and word recognition; fine-grained detail and word recognition; revisiting categorical perception; using acoustic detail over time; the beginnings of a comprehensive approach.

    2) Individual differences: a process view of individual differences; Case study 1: SLI; eye-movement methods for individual differences; Case study 2: cochlear implants.

  • Integrating speech and words. My intuition: word recognition mechanisms can cope with variability, and sensitivity to gradient acoustic detail can help solve the problem of time.

    But only if word recognition and perception are continuously coupled:

    If activation for lexical candidates gradiently reflects continuous acoustic detail,

    then these mechanisms can help sort it out.

  • Does activation for lexical competitors reflect continuous detail during online recognition?

    Needed: tiny acoustic gradations; an online, temporal word recognition task.

    See also: Andruski, Blumstein & Burton (1994); Utman, Blumstein & Burton (2002); McMurray, Tanenhaus & Aslin (2002); McMurray, Tanenhaus, Aslin & Spivey (2003).

  • Gradations in the Signal: beach/peach, bale/pale, bear/pear, bump/pump, bomb/palm, butter/putter.

  • Task: "Bear." Repeat 1080 times.

  • Identification Results. Category boundary: by subject, 17.25 +/- 1.33 ms; by item, 17.24 +/- 1.24 ms. High agreement across subjects and items for the category boundary. [Figure: proportion /p/ responses as a function of VOT (ms), from B to P.]
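A boundary like the 17.25 ms value above is typically obtained by fitting a logistic function to the identification data and taking its 50% point. A minimal sketch; the grid-search fit and the toy response data are illustrative assumptions, not the study's procedure or numbers:

```python
# Hypothetical sketch of category-boundary estimation: fit a 2-parameter
# logistic to proportion-/p/ responses over a VOT continuum and read off
# the 50% crossover. Data and fitting method are illustrative.
import math

def logistic(vot, boundary, slope):
    """Proportion of /p/ responses predicted at a given VOT."""
    return 1.0 / (1.0 + math.exp(-slope * (vot - boundary)))

def fit_boundary(vots, prop_p):
    """Coarse grid search over (boundary, slope) minimizing squared error."""
    best, best_err = None, float("inf")
    for b10 in range(0, 401):            # boundary: 0.0 .. 40.0 ms
        b = b10 / 10.0
        for s10 in range(1, 51):         # slope: 0.1 .. 5.0
            s = s10 / 10.0
            err = sum((logistic(v, b, s) - p) ** 2
                      for v, p in zip(vots, prop_p))
            if err < best_err:
                best, best_err = (b, s), err
    return best

# Toy identification data over a 0-40 ms VOT continuum.
vots = [0, 5, 10, 15, 20, 25, 30, 35, 40]
prop_p = [0.01, 0.02, 0.08, 0.35, 0.70, 0.93, 0.98, 0.99, 1.0]
boundary, slope = fit_boundary(vots, prop_p)
```

For these toy data the 50% point lands near 17 ms, in the range the slide reports for the real subjects.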

  • [Figure: fixation proportion over time (ms); VOT = 0.] More looks to the competitor than to unrelated items.
  • Gradiency? Given that the subject heard "bear" and clicked on "bear": how often was the subject looking at the pear? [Figure: predicted categorical results; target and competitor fixations as a function of VOT, alongside the identification function (% /p/).]

  • Gradiency? Given that the subject heard "bear" and clicked on "bear": how often was the subject looking at the pear? [Figure: categorical results vs. a gradient effect; target and competitor fixations as a function of VOT.]

  • [Figure: competitor fixations over time since word onset (ms), VOT = 0 ms vs. 5 ms, by response.] Long-lasting gradient effect: seen throughout the timecourse of processing.

  • [Figure: competitor fixations as a function of VOT (ms), relative to the category boundary, split by response (looks to B vs. looks to P). Reported statistics: p = .017; p = .014, p = .001; linear trend of VOT for unambiguous tokens: p = .009, p = .007.]

  • Summary: subphonemic acoustic differences in VOT have a gradient effect on lexical activation (a gradient effect of VOT on looks to the competitor). This refutes strong forms of categorical perception: fine-grained information in the signal is not discarded prior to lexical activation.

  • Summary: subphonemic acoustic differences in VOT have a gradient effect on lexical activation. This extends to vowels, l/r, d/g, b/w, s/z (Clayards, Toscano, McMurray, Tanenhaus & Aslin, in prep; Galle & McMurray, in prep); does not work with a phoneme decision task (McMurray, Aslin, Tanenhaus, Spivey & Subik, 2008); and is found in 8.5-month-old infants (McMurray & Aslin, 2005) and color categories (Huette & McMurray, 2010).

  • [Diagram: extract acoustic cues (normalization, warping perceptual space) -> identify phonemes (categorical perception) -> activate words (competition, graded activation).]

  • [Diagram, revised: extract acoustic cues (normalization, warping perceptual space) -> identify phonemes -> activate words (competition, graded activation); categorical perception removed.]

  • Overview

    1) Speech perception as a language process: problems of speech and word recognition; fine-grained detail and word recognition; revisiting categorical perception; using acoustic detail over time; the beginnings of a comprehensive approach.

    2) Individual differences: a process view of individual differences; Case study 1: SLI; eye-movement methods for individual differences; Case study 2: cochlear implants.

  • Variance Reduction in Speech. [Figure: psychological response as a function of a continuous perceptual cue (e.g., VOT).] Categorical perception predicts a warping in the sensory encoding of the stimulus.

  • Variance Reduction in Speech. [Figure: psychological response as a function of a continuous perceptual cue (e.g., VOT).] Continuous perception allows the system to veridically encode what was heard. How can we measure the perceptual encoding of continuous cues?

  • Encoding continuous cues. [Diagram: encoding of continuous cues -> discrete categories -> behavior.] It is difficult to measure cue encoding behaviorally (Pisoni, 1973; Pisoni & Tash, 1974). Solution: go directly to the brain, with event-related potentials.

  • The Electroencephalogram (EEG). Systematic fluctuations in voltage over time can be measured at the scalp (Berger, 1929); these are related to underlying brain activity (though with a lot of filtering and scattering).

  • Event-Related Potentials (ERPs). Consistent patterns of EEG are triggered by a stimulus and are embedded in the overall EEG. [Figure: EEG epochs for Stim 1 ... Stim N are averaged into an ERP waveform; voltage over time (0-600 ms), with components P1, N1, P2, N2, P3.]
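The averaging logic of the ERP technique can be sketched directly; the simulated N1-like component and noise level below are illustrative assumptions, not the study's recordings.

```python
# Hypothetical sketch of ERP extraction: averaging many stimulus-locked EEG
# epochs cancels random background activity, leaving the event-related
# component. The simulated N1-like deflection is purely illustrative.
import math, random

def average_erp(epochs):
    """Pointwise mean of equal-length, stimulus-locked voltage traces."""
    n = len(epochs)
    return [sum(e[t] for e in epochs) / n for t in range(len(epochs[0]))]

random.seed(0)
n_samples = 100
# A fixed negative deflection peaking at sample 30 (an N1-like component)...
component = [-math.exp(-((t - 30) ** 2) / 50.0) for t in range(n_samples)]
# ...embedded in much larger random "EEG" on each of 500 trials.
epochs = [[component[t] + random.gauss(0, 2.0) for t in range(n_samples)]
          for _ in range(500)]
erp = average_erp(epochs)
```

On any single trial the component is invisible under the noise; in the average it re-emerges, which is why the N1 can be read out trial-averaged even though raw EEG looks random.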

  • Perception vs. Categorization. The auditory N1: reflects low-level auditory processes; is generated in Heschl's gyrus (auditory cortex / STG); responds to pure tones and speech; responds to change.

  • How does the auditory N1 respond to continuous changes in VOT? (Toscano, McMurray, Dennhardt & Luck, 2010, Psychological Science)

  • The N1 (auditory encoding) shows a linear effect of VOT.

  • The linear effect of VOT is not an artifact of averaging across subjects; it is affected by place of articulation; and there is no effect of target type or response.

  • Experiment 1 Summary: encoding continuous cues -> categories.

    Early brain responses encode speech cues veridically; the N1 shows that low-level encoding is not affected by categories at all.

    Veridical encoding leads to graded categorization; the eye-movement results show that categories are graded.

    Gradiency in the input is preserved throughout the processing stream.

  • Variance Reduction in Speech. [Diagram: extract acoustic cues (normalization, warping perceptual space) -> identify phonemes -> activate words (competition, graded activation).]

  • Overview

    1) Speech perception as a language process: problems of speech and word recognition; fine-grained detail and word recognition; revisiting categorical perception; using acoustic detail over time; the beginnings of a comprehensive approach.

    2) Individual differences: a process view of individual differences; Case study 1: SLI; eye-movement methods for individual differences; Case study 2: cochlear implants.

  • Why Speech Perception? [Figure: d/g/n; Cole, Linebaugh, Munson & McMurray, 2010.] Problems: continuous detail could be useful (Martin & Bunnell, 1981; Gow, 2001; McMurray et al., 2009); some useful variation is not phonemic (Salverda, Dahan & McQueen, 2003; Gow & Gordon, 1995); acoustic cues are spread out over time: how do you know when you are done with a phoneme and ready for word recognition?

  • [Diagram: extract acoustic cues -> identify phonemes -> activate words (competition, graded activation).] Is phoneme recognition done before word recognition begins?

  • Temporal Integration. McMurray, Clayards, Tanenhaus & Aslin (2008, PB&R); Toscano & McMurray (submitted).

  • Buffer model. [Diagram: cues held in a buffer over time before contacting the lexicon.] Problems: vowel length may not be available until the end of the word; how do you know when the buffer has enough information?; what about early lexical commitments?

  • Integration at the lexicon vs. the buffer model. [Diagram: VOT and vowel length arriving at different times, integrated either in a buffer or directly at the lexicon.]

  • When do effects on lexical activation occur?

    If VOT effects co-occur with vowel-length effects: buffered integration.

    If VOT effects precede vowel-length effects: lexical integration. (McMurray, Clayards, Tanenhaus & Aslin, 2008, PB&R; Toscano & McMurray, submitted)

  • The usual task: 1080 trials; 9-step VOT continua (0-40 ms) x 2 vowel lengths; beach/peach, beak/peak, bees/peas.

  • Mouse click results

  • [Figure: competitor fixations as a function of distance from the boundary (VOT).] Compute two effect sizes at each 20 ms time slice. VOT effect: the regression slope of competitor fixations as a function of VOT.

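The time-slice analysis described above can be sketched as follows; the toy fixation data are illustrative assumptions (in the real analysis, each slice's values come from the eye-movement record).

```python
# Hypothetical sketch of the time-slice analysis: in each 20 ms slice,
# regress competitor fixations on VOT distance-from-boundary; the slope is
# that slice's effect size. Toy data only.
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def slice_effects(vot_distances, fixations_by_slice):
    """One regression slope per time slice; fixations_by_slice[t][i] is the
    competitor-fixation proportion in slice t for VOT condition i."""
    return [ols_slope(vot_distances, fx) for fx in fixations_by_slice]

vots = [-30, -25, -20, -15, -10, -5, 0]            # distance from boundary (ms)
early = [0.05] * 7                                  # no VOT effect yet
late = [0.02, 0.04, 0.06, 0.08, 0.10, 0.12, 0.14]  # gradient effect present
effects = slice_effects(vots, [early, late])
```

Plotting these slopes over slices gives the timecourse of the VOT (and, analogously, vowel-length) effect, which is how onset times like those on the next slide can be compared.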

  • Voicing: VOT effect onset at 228 ms; vowel-length effect onset at 548 ms.

  • Temporal Integration Summary. VOT is used as soon as it is available: this replicates with b/w, replicates for natural continua (Toscano & McMurray, submitted), and is also shown when the primary cue comes after the secondary cue (Galle & McMurray, in prep).

    Preliminary decisions cascade all the way to lexical processes.

    Make a partial (lexical) commitment; update as new information arrives. Lexical competition processes are primary.

  • Variance Reduction in Speech. [Diagram: extract acoustic cues -> identify phonemes -> activate words.] Lexical activation is sensitive to information that should have been lost during categorization.

    Integrating low-level material seems to occur at the lexical level.

    What is the role of phonemic representations in speech perception?

  • Overview

    1) Speech perception as a language process: problems of speech and word recognition; fine-grained detail and word recognition; revisiting categorical perception; using acoustic detail over time; the beginnings of a comprehensive approach.

    2) Individual differences: a process view of individual differences; Case study 1: SLI; eye-movement methods for individual differences; Case study 2: cochlear implants.

  • How do we approach the lack of invariance?

    Hedge our bets: make graded commitments and wait for more information.

    Use multiple sources of information.How far can this get us?

  • McMurray & Jongman (2011, Psychological Review): collected 2880 recordings of the 8 fricatives (20 speakers, 6 vowels); measured 24 different cues for each token; humans classified a subset of 240 tokens.

  • [Diagram: 24 cues -> logistic regression -> 8 categories.] Is the information present in the input sufficient to distinguish the categories?

    All cues reported in the literature (plus 5 new ones); an overly powerful learning model.

    An asymptotic statistical learning model.

  • Human performance: 91.2% correct. Model performance across cue sets: 74.5%, 83.3%, 79.2%, 85.0%. More information yields better performance, but still not as good as listeners.

    Why shouldn't it be? We measured everything: supervised learning, an optimal statistical classifier.

  • We still need to compensate for variability in cues due to speaker and vowel. [Figure: raw cue values. Cole, Linebaugh, Munson & McMurray (2010); see also Fowler & Smith (1986), Gow (2003).]

    A simple compensation scheme: the listener identifies the speaker and vowel, then recodes cues relative to expectations for that speaker/vowel. Crucially, this maintains a continuous representation and does not discard fine-grained detail.
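The compensation scheme described above can be sketched as subtracting each context's expected cue value; treating the expectation as the (talker, vowel) mean is an illustrative assumption.

```python
# Hypothetical sketch of the compensation scheme: recode each cue as its
# observed value minus the value expected for that talker/vowel context.
# Using the context mean as the expectation is an illustrative assumption.
def expected_values(tokens):
    """Mean cue value per (talker, vowel) context."""
    sums, counts = {}, {}
    for t in tokens:
        key = (t["talker"], t["vowel"])
        sums[key] = sums.get(key, 0.0) + t["cue"]
        counts[key] = counts.get(key, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}

def compensate(tokens):
    """Continuous, context-relative cue values: no detail is discarded."""
    exp = expected_values(tokens)
    return [t["cue"] - exp[(t["talker"], t["vowel"])] for t in tokens]
```

After recoding, a token's cue value says how it deviates from what this talker saying this vowel would normally produce, which is exactly the kind of fine-grained, continuous information the earlier slides argue must be preserved.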


  • Measurements were used as input to a logistic regression classifier, which matched human performance on the same recordings (91.2% correct): all 24 cues, 79.2%-85.0%; with compensation, 87.0%-92.9%.


  • We can match human performance with a simple model as long as:

    1) The system codes many sources of information: no single cue is crucial; redundancy is the key.

    2) Cues are encoded veridically and continuously: we need to preserve as much information as possible.

    3) Cues are encoded relative to expected values derived from context (e.g., speaker and vowel).

  • Speech and Word Recognition. So what is the process of speech perception?

    1) Early perceptual processes are continuous: many, many cues are used, and cues are coded relative to expectations about the talker, neighboring phonemes, etc.

    2) Make a graded commitment at the lexical level; update when more information arrives.

    3) Competition between lexical items sorts it out: language processes are essential for speech perception.

  • Online Word Recognition. [Diagram: input "s... a... n... d... l" unfolding over time, activating "sandal" and "soup."]

  • Overview

    1) Speech perception as a language process: problems of speech and word recognition; fine-grained detail and word recognition; revisiting categorical perception; using acoustic detail over time; the beginnings of a comprehensive approach.

    2) Individual differences: a process view of individual differences; Case study 1: SLI; eye-movement methods for individual differences; Case study 2: cochlear implants.

  • A process-oriented approach to individual differences (example word: beach).

    1) Define the process: what steps does the brain/mind/language system/child take to get from some clearly defined input to some clearly defined output?

    2) How can we measure this process as it happens?

    3) Identify a population: what will we relate variation in process to?

    4) What dimensions can vary within that process? Which covary with outcome variables?

  • The Domain: Speech & Words. What type of individual differences should we be studying?

    Variation that is: widespread; related to broad-based language skills; empirically correlated with speech perception.

  • Language Impairment. Specific language impairment (SLI) has often been associated with phonological deficits (Bishop & Snowling, 2004; Joanisse & Seidenberg, 2003; Sussman, 1993).

    Generalized language deficits (morphology, word learning, perception) without any obvious causal factors:

    normal non-verbal IQ; no speech motor problems; no hearing impairment; no developmental disorder; no neurological problems.

  • Affects 7-8% of children. Remarkably stable over development.

  • A wealth of evidence suggests a perceptual / phonological deficit associated with SLI: impaired categorical perception (Godfrey et al., 1981; Thibodeau & Sussman, 1987; Werker & Tees, 1987; Leonard & McGregor, 1992; Manis et al., 1997; Nittrouer, 1999; Blomert & Mitterer, 2001; Serniclaes et al., 2001; Sussman, 2001; Van Alphen et al., 2004; Serniclaes et al., 2004; but see Coady, Kluender & Evans, 2005; Gupta & Tomblin, in prep).


  • Dimensions of Individual Differences. But given the evidence against categorical perception as an organizing principle of speech perception, what does this mean?

  • Dimensions of Individual Differences. Candidate dimensions:

    1) Auditory processes responsible for encoding cues. But the signal is highly redundant, and listeners don't rely on any single cue (or type of cue), so an auditory disruption would have to be massive.

    2) The processes of gradually committing to a word, updating activation as new information arrives, and competition between words.


  • Methods (McMurray, Samelson, Lee & Tomblin, 2010, Cognitive Psychology)

    41 sets. Words known to our subjects (familiarity survey). All items appear as targets. Natural recordings.

  • An individual differences approach: separate the effects of language impairment and cognitive impairment by crossing language ability with performance IQ. Groups: SLI, N = 20; NLI, N = 17; SCI, N = 16; Controls, N = 40.
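The 2x2 grouping can be sketched as crossing two continuous measures; the z-score cutoff and the expansion of the group labels below are illustrative assumptions, not the study's actual inclusion criteria.

```python
# Hypothetical sketch of the 2x2 grouping: cross language ability with
# performance IQ. The cutoff and label glosses are illustrative, not the
# study's actual criteria.
def group(language_z, iq_z, cutoff=-1.0):
    low_lang = language_z <= cutoff
    low_iq = iq_z <= cutoff
    if low_lang and low_iq:
        return "NLI"      # low language, low IQ
    if low_lang:
        return "SLI"      # low language, normal IQ
    if low_iq:
        return "SCI"      # normal language, low IQ
    return "Control"
```

Crossing the two dimensions this way is what lets the later analyses attribute an effect to language ability specifically, rather than to general cognitive level.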

  • Accuracy and reaction time by group:
    Normal: 99.2% correct, RT 1429 ms
    SCI: 99.0% correct, RT 1493 ms
    SLI: 98.2% correct, RT 1450 ms
    NLI: 98.2% correct, RT 1635 ms

  • Normal Subjects

  • NLI (language + cognition impaired). Overall:

    All four groups perform well in the task. All four groups show incremental processing, with parallel activation of cohorts/rhymes.


  • Effects by predictor:
    Slope: Language p = .002; IQ n.s.
    Asymptote: Language p = .004; IQ n.s.
    Cross-over: Language n.s.; IQ n.s.

  • The effects on the target were unexpected. Why would subjects fixate the target less, given that they correctly identified it?

    It is not due to the calibration accuracy of the eye-tracker, knowledge of the target words, or an inability to recognize the competitors.

    This suggests the target may be less active.


  • Onset slope: Language n.s.; IQ n.s. Peak location: n.s.; n.s. Peak: n.s.; n.s. Offset slope: Language p = .005; IQ n.s. Baseline: Language p = .064+; IQ n.s.

  • Onset slope: Language n.s.; IQ n.s. Peak location: n.s.; n.s. Peak: n.s.; n.s. Offset slope: n.s.; n.s. Baseline: Language p = .045; IQ n.s.

  • Summary

    IQ showed few effects. Target: lower peak fixations/activation for LI. Cohort: higher peak fixations for LI. Rhyme: higher peak fixations for LI.

    What computational differences could account for this timecourse of activation?

  • TRACE

  • Fixation probability maps onto lexical activation, transformed via a simple linking hypothesis (Allopenna, Magnuson & Tanenhaus, 1998; Dahan, Magnuson & Tanenhaus, 2001; McMurray, Samelson, Lee & Tomblin, 2010). [Figure: probability of fixation against activations in TRACE.] TRACE activations account for 99% of the variance.
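A linking hypothesis of this kind can be sketched as a Luce-style choice rule over activations; the exact transformation and the constant used in the cited papers may differ, so treat this as illustrative.

```python
# Hypothetical sketch of a linking hypothesis: map lexical activations to
# predicted fixation probabilities with a Luce-style choice rule. The
# sharpening constant k and the exact form are illustrative assumptions.
import math

def fixation_probs(activations, k=7.0):
    """Normalized exponentiated activations: more active items draw
    disproportionately more predicted fixations."""
    exps = [math.exp(k * a) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

# e.g., target, cohort, rhyme, unrelated activations at one time step:
probs = fixation_probs([0.8, 0.4, 0.2, 0.05])
```

Applying this transformation to the model's activation timecourses yields predicted fixation curves that can be compared directly with the eye-movement data.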

  • TRACE parameters

    Perceptual:    Input Noise, Feature Spread, Feature Decay
    Phonological:  Phoneme Inhibition, Feature->Phoneme, Phoneme Decay
    Lexical:       Lexical Inhibition, Phoneme->Word, Lexical Decay
    Global:        Maximum Activation, # of known words

  • Strategy:

    Global:        Generalized slowing, # of known words
    Perceptual:    Input Noise, Feature Spread, Feature Decay
    Phonological:  Phoneme Inhibition, Feature->Phoneme, Phoneme Decay
    Lexical:       Lexical Inhibition, Phoneme->Word, Lexical Decay

    Vary each parameter.
    Does it yield the same kind of variability we observed in SLI?

    Summary: most parameters failed.
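The vary-a-parameter-and-check-the-fit strategy can be sketched with a toy stand-in for TRACE: a leaky accumulator whose decay term plays the role of lexical decay. This is an illustrative sketch, not jTRACE; the decay values, gain, and "observed" curve are all made up:

```python
import math

def activation_curve(decay, steps=60, gain=0.05):
    """Toy leaky accumulator standing in for TRACE lexical dynamics:
    activation grows with input and is pulled down by decay, so
    higher decay lowers late activation (an LI-like profile)."""
    a, trace = 0.0, []
    for _ in range(steps):
        a += gain - decay * a
        trace.append(a)
    return trace

def rms(xs, ys):
    """Root-mean-square error between a model curve and data."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# "Observed" LI-like curve: generated here with decay = 0.09.
observed = activation_curve(decay=0.09)

# Sweep the decay parameter and keep the best-fitting value.
fits = {d: rms(activation_curve(decay=d), observed)
        for d in (0.05, 0.07, 0.09, 0.11)}
best = min(fits, key=fits.get)  # the sweep recovers decay = 0.09
```

The same loop applies to any parameter: regenerate the model's timecourse under each candidate value and ask whether the resulting error pattern matches the group differences seen in the data.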

  • Phoneme activation (robustness of phonology).

  • Phoneme inhibition (categorical perception).

    Higher-level processes (e.g., word recognition) are largely immune to variation in phoneme processing.

  • [Figure: fixation probability (0-1) over time (frames, 0-2000) under varied lexical decay.]

  • [Figures: model fit (RMS error) for each varied TRACE parameter: lexical size, phoneme inhibition, phoneme activation, general inhibition, general slowing, feature spread, lexical activation, feature decay, lexical inhibition, input noise, phoneme decay, lexical decay.]

  • Robust deficit in lexical competition processes associated with SLI.
    Late in processing.
    Too much competitor activation / not enough target.

    TRACE modeling indicates a lexical, not perceptual, locus.
    Dynamics / stability of lexical activation over time.

  • This provides indirect evidence against a speech perception deficit in accounting for word recognition deficits in SLI.

    Can we ask more directly?

    Are SLI listeners' speech categories structured gradiently (like normals')?

    Are SLI listeners overly sensitive, or insensitive, to within-category detail?

  • Munson, McMurray & Tomblin (submitted)

  • Poorly structured phonological categories? [Figure: competitor fixations as a function of VOT (ms).]

  • Improperly tuned lexical competition? [Figure: competitor fixations as a function of VOT (ms).]

  • Subjects (IQ uncontrolled): 42 normal, 35 language impaired.
    Subjects run in a mobile lab at their homes and schools.

  • Normal-looking identification (mouse-click) functions.
    Few observable differences.

  • [Figure: competitor fixations by VOT (ms), -30 to +30 relative to the category boundary.]

    LI: more looks to the competitor.
    No effect on sensitivity to VOT.

    The problem is not:
    Sensitivity to VOT.
    The nature of phonetic categories.

    Rather, lexical candidates show heightened competition.

  • Summary

    Robust deficit in lexical competition associated with SLI.
    Late in processing.
    Too much competitor activation / not enough target.

    TRACE modeling indicates a lexical, not perceptual, locus.
    Dynamics / stability of lexical activation over time.

    LI listeners do not show unique differences in their response to phonetic cues (as reflected in lexical activation).

    What is the source of their deficit? Are they just developmentally delayed?

  • Development (McMurray, Walker & Greiner, in preparation)

    Do the changes in lexical activation dynamics over development match the changes with SLI?
    N=17 TD adolescents.
    Target/Cohort/Rhyme/Unrelated paradigm.

  • Summary

    Robust deficit in lexical competition associated with SLI (see also Dollaghan, 1998; Montgomery, 2000; Mainela-Arnold, Evans & Coady, 2008).
    Late in processing.
    Too much competitor activation / not enough target.

    TRACE modeling indicates a lexical, not perceptual, locus.
    Dynamics / stability of lexical activation over time.

    LI listeners do not show unique differences in their response to phonetic cues (as reflected in lexical activation).

    There is still development in basic word recognition processes between 9 and 16.
    But: development affects the speed of target activation and early competitor activation. This is different from LI.

  • Overview

    1) Speech perception as a language process
       Problems of speech and word recognition.
       Fine-grained detail and word recognition.
       Revisiting categorical perception.
       Using acoustic detail over time.
       The beginnings of a comprehensive approach.

    2) Individual differences
       A process view of individual differences.
       Case study 1: SLI.
       Eye-movement methods for individual differences.
       Case study 2: Cochlear implants.

  • Reliability

    Work on SLI and TD adolescents suggests that measuring the timecourse of word recognition can be sensitive to different causes of differences.
    Listeners can get to the same outcome (the word) via different routes.

    1) To what extent is this measure reliable across tests?

    Farris-Trimble & McMurray (in preparation)

  • Reliability

    1) To what extent is this measure reliable across tests?
    2) To what extent is this measure about fixations and visual processes?

    Design: Test 1, then Test 2 one week later, then Test 3 another week later, with the same items (e.g., sandal) at each test.

  • Variance Reduction in Speech: target and cohort parameters, variance explained (R²) by predictor

    Parameter    Auditory   Visual
    Cross-over   .63**      .30**
    Slope        .43**      .01
    Max          .28**      .18**
    Peak         .52**      .43**
    Peak time    .37**      .11*
    Baseline     .35**      .17**

  • [Figure: test-retest scatterplots of the Max parameter (Auditory test 1 vs. Auditory test 2).]

  • [Figures: test-retest scatterplots of the Slope parameter (Auditory test 1 vs. Auditory test 2; Visual test 1 vs. Auditory test 2).]

  • Summary

    Work on SLI and TD adolescents suggests that measuring the timecourse of word recognition can be sensitive to different profiles of online processing.
    Listeners can get to the same outcome (the word) via different routes.

    1) This measure is reliable across tests.
    Some components had correlations upward of .8.

    2) Visual processes (eye movements, visual search, decision making) account for some of this.
    But some is uniquely due to auditory/lexical processes.
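Test-retest reliability of this kind is just a correlation between the parameter estimates obtained in two sessions. A minimal sketch; the per-subject values below are hypothetical, not data from the study:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two sessions' parameter estimates;
    values near 1 indicate a reliable measure across tests."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-subject asymptote estimates from test 1 and test 2.
test1 = [0.82, 0.91, 0.75, 0.88, 0.79]
test2 = [0.80, 0.93, 0.72, 0.90, 0.81]
r = pearson_r(test1, test2)
```

The same computation, run separately for each curve parameter, is what produces a table of per-component reliabilities like the one above.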

  • Overview

    1) Speech perception as a language process
       Problems of speech and word recognition.
       Fine-grained detail and word recognition.
       Revisiting categorical perception.
       Using acoustic detail over time.
       The beginnings of a comprehensive approach.

    2) Individual differences
       A process view of individual differences.
       Case study 1: SLI.
       Eye-movement methods for individual differences.
       Case study 2: Cochlear implants.

  • Speech and Word Recognition

    Candidate dimensions for individual differences in processing:

    1) Auditory processes responsible for encoding cues.
       But: the signal is highly redundant.
       Listeners don't rely on any single cue (or type of cue).
       An auditory disruption would have to be massive.

    2) Processes of:
       Gradually committing to a word.
       Updating activation as new information arrives.
       Competition between words.

    SLI. Cochlear implants?

  • Speech and Word Recognition

  • Speech and Word Recognition

    Cochlear implant users:
    Should show a deficit in spoken word recognition (Helms et al., 1997; Balkany et al., 2007; Sommers, Kirk & Pisoni, 1996).
    The temporal dynamics of lexical activation may follow a different profile of online activation.

    In addition: to what extent are differences driven by
    Poor signal encoding?
    Adapting / learning to cope with the implant?

  • Speech and Word Recognition (Farris-Trimble & McMurray, submitted)

    29 adult CI users (postlingually deafened).
    26 NH listeners.

    29 word sets x 5 reps: 580 trials.

  • [Figure: fixations to target (0-1) over time (ms, 0-2000), CI adults vs. NH adults.]

    Significant differences in:
    Slope (p=.001)
    Peak location (p=.004)
    Offset slope (p=.007)
    Baseline (p …)
  • 31 NH listeners: normal words (N=15) or 8-channel CI simulation (N=16).

    CI listeners show effects both early and late in the timecourse.
    They are delayed in getting going: they require more information to start activating words.
    They maintain competitor activation more than NH listeners.

    Which of these is driven by the poor signal, and which by adaptation?

  • Significant differences in Cross-over / Delay (p …)

  • Significant differences in Slope (p=.015).
    Marginal effects in Peak location (p=.067) and Baseline (p=.058).
    No effect on offset slope (T …)
  • CI Adult Summary

    CI listeners show effects both early and late in the timecourse.
    They require more information to start activating words.
    They maintain competitor activation more than NH listeners.

    1) The degraded signal slows the growth of activation for targets and competitors.
    It also increases the chance of misidentifying segments.

    2) Listeners adapt by keeping competitors around,
    in case they need to revise due to later material.

    What about pediatrically deafened child users?
    They face an additional problem: learning language with a degraded signal.

  • Ongoing work: CI kids, looks to target

                CI users     NH controls
    N           24           13
    Age         17 (12-26)   15.5 (12-17)

    Same effects as adults: slower, later, lower.
    Cross-over / Delay (p …)
  • CI kids: looks to cohorts

    Similar to adults, but with reduced peak fixation.

    Slope (p …)

  • CI Summary

    Degraded input affects early portions of the timecourse of processing:
    A delay in getting started.
    Slower activation growth.

    Adaptation to the input affects later components:
    Increased competitor activation (hedging your bets).

    Children show all these effects in the extreme,
    and with reduced competitor activation.

  • Conclusions

    Basic speech perception findings:
    Fine-grained detail is crucial for word recognition.
    It is available in the sensory encoding of cues.
    It is preserved up to the level of lexical activation.
    Compensating for speaker/coarticulation in a way that preserves it allows for excellent speech recognition.
    Perception is not about coping with irrelevant variation.

  • Conclusions

    Basic speech perception findings:
    Fine-grained detail is crucial for word recognition.
    Perception is not about coping with irrelevant variation.

    2) Lexical activation makes a graded commitment on the basis of partial information and waits for more.

    Do people need to make a discrete phoneme decision as a precursor to word recognition?

  • Conclusions

    Basic speech perception findings:
    Fine-grained detail is crucial for word recognition.
    Perception is not about coping with irrelevant variation.

    2) Lexical activation makes a graded commitment on the basis of partial information and waits for more.

    3) Speech perception must harness massively redundant sources of information.
    Only by harnessing 24 cues + compensation could we achieve listener performance on fricative categorization.

  • Conclusions

    Basic speech perception findings:
    Fine-grained detail is crucial for word recognition.
    Perception is not about coping with irrelevant variation.

    2) Lexical activation makes a graded commitment on the basis of partial information and waits for more.

    3) Speech perception must harness massively redundant sources of information.

    Implications for impairment:
    Single-cue explanations of SLI don't make sense.
    Impairments in categorical perception may be impairments in the ability to do that task.
    Are phonological representations causally related to word recognition?

  • Conclusions

    Specific Language Impairment:
    1) SLI (functional language outcomes) is more related to lexical deficits than to perceptual/phonological ones.
    Consistent with work challenging a causal role for phonology in word recognition.
    Different effects than for development.

    2) Could this have effects downstream (e.g., syntax/morphology/learning)?
    If word recognition is not outputting a single candidate, this would make parsing much harder (and see Levy, Bicknell, Slattery & Rayner, 2008).
    A generalized deficit in decay/maintaining activation in multiple components of the system.

  • Conclusions

    Cochlear Implants:
    1) The timecourse of word recognition is shaped by:
    Degraded input.
    Listeners' adaptation to that input at a lexical level.
    Development.

    2) CI outcomes are as much a cognitive (e.g., lexical) issue as a perceptual one (see Conway, Pisoni & Kronenberger, 2009).

    3) Cascading processes can have unexpected consequences:
    Child CI users activate words so slowly they appear to have less competition!

  • Conclusions

    Individual differences more broadly:
    Different populations (typical development, language impairment, cochlear implants) get to the same outcome via vastly different mechanisms.

    We need to use measures sensitive to online processing (in conjunction with speech and language outcomes).
    We need to consider how children accomplish a language goal, rather than language as a measurable outcome.

  • Conclusions

    Individual differences more broadly:
    Different populations get to the same outcome via vastly different mechanisms.

    Gradations in the dynamics of lexical activation / competition can be a good way to describe individual differences at a mechanistic level.

    What's the developmental cause of such differences?

  • Conclusions

    Individual differences more broadly:
    Different populations get to the same outcome via vastly different mechanisms.

    Gradations in the dynamics of lexical activation / competition can be a good way to describe individual differences at a mechanistic level.

    3) Multiple-population work can reveal broader mechanisms at play over language development:
    SLI: known language deficit, maybe a perceptual deficit.
    CI users: known perceptual deficit, maybe language.
    Child CI users: both?

  • Conclusions

    By looking at how children use language in real time, we might understand better how language develops.

    Development or processing? Only by studying both can we form a model of change we can believe in.