Speaker and Language Recognition: A Guided Safari
Doug Reynolds
2008 Odyssey Workshop
This work was sponsored by the Department of Defense under Air Force contract F19628-00-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the United States Government.
MIT Lincoln Laboratory
Odyssey 2008
Roadmap
• The odyssey from 1994 to 2008
• The scenic route through NIST speaker and language recognition evaluations
• The expedition into future territories
The Odyssey: Martigny, Switzerland – April 5-7, 1994
• First workshop focused solely on speaker recognition
  – Helped form working relationships among international SID community
The Expedition: Evaluations
• The evaluation paradigm has clearly helped propel speaker and language R&D forward
  – Common focus
  – Comparable results and repeatable experiments
  – Collaboration
• But there are some issues to consider
  – Proliferation of tasks and conditions can dilute and fragment community effort
  – Evaluations are application-dependent
    • The tasks, conditions, and data are representative of some application(s)
    • Are these being set in a meaningful way?
  – Performance numbers need context
    • Time-pressed, less-technical potential users want yes/no to “will it or won’t it work for my application?”
  – Speaker and language recognition research increasingly relies on data-driven discovery
    • Does performance depend on highly matched dev data?
    • Are performance gains due to technology or data?
The Expedition: Research
• Speaker and language research are built on three core areas
  – Speech Science: Understanding how speaker/language information is conveyed in the speech signal and how to robustly extract measures of this information
  – Pattern Recognition: Techniques and algorithms to effectively represent and compare salient patterns in data
  – Data-Driven Discovery: Effectively using data to apply, refine, and improve systems built from above
• Current speaker/language research is heavily weighted toward data-driven discovery
  – Cure or curse?
  – Are we discovering underlying problems to address in research, or just where we want more data?
Speaker and Language Recognition: A Guided Safari
Roadmap
The Odyssey: Martigny, Switzerland – April 5-7, 1994
The Odyssey: Avignon, France – April 20-23, 1998
The Odyssey: Crete, Greece – June 18-22, 2001
The Odyssey: Toledo, Spain – May 31-June 3, 2004
The Odyssey: San Juan, Puerto Rico – June 28-30, 2006
The Odyssey: Stellenbosch, South Africa – January 21-24, 2008
Roadmap
NIST Speaker/Language Recognition Evaluations
NIST SRE/LRE: Pre-history
NIST SRE/LRE: Formal Start
NIST SRE/LRE: Steady Progress
NIST SRE/LRE: New Directions
NIST SRE/LRE: Current Period
NIST SRE: How are we doing?
SRE Performance Trends 2001-2007: Lincoln Systems
LRE Performance Trends 1996-2007: Lincoln Systems
Roadmap
The Expedition: Evaluations
The Expedition: Research
The Odyssey: Avignon, France – April 20-23, 1998
• Focus on forensic and commercial applications
• 40 papers, 5 keynotes
• 78 participants
• Technologies: More emphasis on statistical approaches (HMM, GMM, AHS)
• Corpora: Still diverse, small set (less home-grown); more TIMIT and SPIDRE (SWB)
  – Europeans showing lead in common corpora/experiments (POLYCOST, VERIVOX, CAVE)
• Increasing buzz about dot-com speech/speaker companies
• Some lasting themes in talks
  – Doddington: getting to know the speaker
  – Champod: LRs as evidence in Bayesian framework
• Some friction between automatic speaker recognition community and expert human speaker examiner community
  – ASR crowd pressed for measured error rate
  – Examiner crowd pressed for transparency and explanation in results
The Odyssey: Crete, Greece – June 18-22, 2001
• Start of official “Odyssey” workshop series
  – Originally set for Tel-Aviv, Israel
• 40 papers, 3 keynotes
• 75 participants
• Technologies: More papers on new
The Odyssey: Toledo, Spain – May 31-June 3, 2004
• Co-occurrence with NIST SRE 2004 workshop
• 61 papers, 4 keynotes
• 147 participants
• Technologies: GMM, SVM, NAP, LFA, high-level features, adaptation, audio-video, LID
• Corpora: SRE corpora/protocol dominant for TI-Telephone; RT, BNEWS data for diarization; TNO/NFI field forensic corpus
• Text-dependent work focusing more on user phrases (less digit strings)
The Odyssey: San Juan, Puerto Rico – June 28-30, 2006
• Co-occurrence with NIST SRE 2006 workshop
  – Followed LRE 2005 in December
• 60 papers, 1 keynote
• 103 participants
• Technologies: GMM-SVM, NAP/LFA, GMM-MMI, high-level features, robustness
• Corpora: Dominated by SRE and LRE corpora/protocol
The Odyssey: Stellenbosch, South Africa – January 21-24, 2008
• Expect to see continued trends in
  – Common corpora/evaluations
  – High-quality papers and novel topics
• More fish pictures …
NIST Speaker/Language Recognition Evaluations
• Recurring NIST evaluations of speaker/language recognition technology
• Aim: Provide a common paradigm for comparing technologies
NIST SRE/LRE: Pre-history
1992 1993
Rutgers Summer Workshop
1994: Informal LRE
• 4 sites: OGI, MITLL, MIT, ITT
• OGI 12-language corpus
NIST SRE/LRE: Steady Progress
Avignon Workshop
1997 1998 1999
SRE 3
• 8 sites
• Pitch features, handset mic detector/comp, using more dev data
• SWB2p1
• All speakers act as tgts and imposters (current paradigm)
• Train: 2 min - 1 sess, 1 handset, 2 handset
• Test: 3s, 10s, 30s
• No cross-sex trials; matched and mismatched test phone
• DET, DCF
SRE 4
• 12 sites
• Phone sequences (BBN), sequence models (Dragon)
• SWB2p2
• Train: 2 min - 1 sess, 2 sess, all - 2 sess
• Test: 3s, 10s, 30s
• SN/DN and HS type side knowledge
• Human performance 3s
SRE 5
• 13 sites
• T-norm, system fusion
• SWB2p3
• Train: 2 min - 2 sess
• Test: varying duration (0-15, 15-30, 30-45, >45), diff number
• New tasks: 2-spkr test, speaker tracking
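T-norm appears in the SRE 5 list above without explanation. As a rough sketch of the usual idea (test normalization), the test segment is scored against a set of cohort (impostor) models as well as the claimed speaker's model. The claimed-speaker score is then shifted and scaled by the mean and standard deviation of the cohort scores. The function name and all score values below are hypothetical and are not taken from any system described in the slides.

```python
from statistics import mean, pstdev


def t_norm(raw_score, cohort_scores):
    """Normalize one trial score with cohort scores from the same test segment.

    raw_score: score of the test segment against the claimed speaker model.
    cohort_scores: scores of the same segment against impostor (cohort) models.
    """
    if len(cohort_scores) < 2:
        raise ValueError("need at least two cohort scores")
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # population standard deviation of the cohort
    if sigma == 0:
        raise ValueError("cohort scores have zero spread")
    return (raw_score - mu) / sigma


# Hypothetical example: one test segment, four cohort models.
cohort = [0.0, 2.0, 0.0, 2.0]
normalized = t_norm(3.0, cohort)
print(normalized)
```

Because the cohort statistics are computed per test segment, the normalized scores of different segments become more comparable against a single decision threshold.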
NIST SRE/LRE: New Directions
Odyssey Workshop
JHU SuperSID Workshop
2000 2001
SRE 6
• 12 sites (First shark sighting)
• SMS
• SWB2p1p2, AHUMADA
• Train: 2 min - 1 sess
• Test: variable 0-60
• New tasks: 2-spkr
NIST SRE/LRE: Current Period
Odyssey Workshop
Odyssey Workshop
2004 2005 2006 2007
LRE 4
• 21 sites
• SVM-GSV, ho ngrams, fLFA/fNAP
• Mixer5, OHSU
• 14 languages 5
NIST SRE: How are we doing?
[Figure: DCF (y-axis, 0 to 0.08) by year (x-axis, 1996 to 2006) for a sampling of SRE tasks. Tasks shown: Landline 1sp, 2 min train, 30 sec test; Cellular 1sp, 2 min train, 30 sec test; Landline 2-speaker detection; Ahumada (Spanish); Multimodal (FBI); Landline 1sp (40 target speaker paradigm); Cellular/land 2-speaker detection; Cell/Land 1sp, 8-conv train, 1-conv test; Cross-mic, 1-conv train (tel), 1-conv test (mic); Cross-language. Corpora labeled: Swb1, Swb2p1, Swb2p2, Swb2p3, Swb2p4, Swb2p5, Mixer1, Mixer2, Mixer3, MMSR.]
• Sampling of tasks shown; 28 in SRE04
SRE Performance Trends 2001-2007: Lincoln Systems
[Figure: EER (%) and minDCF x100 by year (2001 to 2006) for the 1conv4w/1conv4w and 8conv4w/1conv4w conditions, on SWB1, SWB2, and MIXER2-3 data.]
• Consistent and steady improvement for data/task focus
• New data sets designed to be more challenging
• New features, classifiers, and compensations drive error rates down over time
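The two measures plotted on this slide, EER and minDCF, can both be computed from lists of target and non-target trial scores. The sketch below sweeps a decision threshold and records the miss and false-alarm rates at each point. The default cost parameters (C_miss = 10, C_fa = 1, P_target = 0.01) are an assumption, since the slides do not state them; they are the values commonly associated with NIST SREs of this period. The function names and scores are hypothetical.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Miss and false-alarm rates when trials with score >= threshold are accepted."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return p_miss, p_fa


def eer_and_min_dcf(target_scores, nontarget_scores,
                    c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Approximate EER and minimum (unnormalized) DCF over all score thresholds.

    The cost parameters are assumed defaults, not values given in the slides.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(thresholds[-1] + 1.0)  # operating point that rejects every trial
    best_gap, eer, min_dcf = float("inf"), None, float("inf")
    for t in thresholds:
        p_miss, p_fa = error_rates(target_scores, nontarget_scores, t)
        dcf = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
        min_dcf = min(min_dcf, dcf)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # EER estimate: the point where miss and false-alarm rates are closest
            best_gap, eer = gap, (p_miss + p_fa) / 2.0
    return eer, min_dcf


# Hypothetical scores for four target and four non-target trials.
targets = [2.0, 3.0, 4.0, 5.0]
nontargets = [0.0, 1.0, 2.5, 1.5]
eer, min_dcf = eer_and_min_dcf(targets, nontargets)
print(f"EER (%): {100 * eer:.1f}, minDCF x100: {100 * min_dcf:.2f}")
```

With so few trials the EER is only a coarse estimate; real evaluations use many thousands of trials and typically interpolate along the DET curve.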
Roadmap
• The odyssey from 1994 to 2008
• The scenic route through NIST speaker and language recognition evaluations
• The expedition into future territories
The Expedition: Evaluations
• The evaluation paradigm has clearly helped propel speaker and language R&D forward
  - Common focus
  - Comparable results and repeatable experiments
  - Collaboration
• But there are some issues to consider
  - Proliferation of tasks and conditions can dilute and fragment community effort
  - Evaluations are application-dependent
    The tasks, conditions, and data are representative of some application(s)
    Are these being set in a meaningful way?
  - Performance numbers need context
    Time-pressed, less-technical potential users want yes/no to "will it or won't it work for my application?"
  - Speaker and language recognition research increasingly relies on data driven discovery
    Does performance depend on highly matched dev data? Are performance gains due to technology or data?
The Expedition: Research
• Speaker and language research are built on three core areas
  - Speech Science: Understanding how speaker/language information is conveyed in the speech signal and how to robustly extract measures of this information
  - Pattern Recognition: Techniques and algorithms to effectively represent and compare salient patterns in data
  - Data Driven Discovery: Effectively using data to apply, refine, and improve systems built from above
• Current speaker/language research is heavily weighted toward data driven discovery
  - Cure or curse?
  - Are we discovering underlying problems to address in research, or just where we want more data?