Outline: Where have we been and where are we going?
• We’re making consistent progress, or
• We’re running around in circles, or
– 1950s: Empiricism (Information Theory, Behaviorism)
– 1970s: Rationalism (AI, Cognitive Psychology)
– 1990s: Empiricism (Data Mining, Statistical NLP, Speech)
– 2010s: Rationalism (TBD)
• We’re going off a cliff…
– Don’t worry; be happy
[Figure: % Statistical Papers by ACL Meeting, x-axis 1985 to 2005, y-axis 0% to 100%; labeled: Bob Moore, Fred Jelinek]
No matter what happens, it’s goin’ be great!
Rising tide of data lifts all boats
An Intelligence Bonanza
• Some companies are collecting information with technology designed to monitor incoming calls for service quality.
• Last summer, Continental Airlines Inc. installed software from Witness Systems Inc. to monitor the 5,200 agents in its four reservation centers.
• But the Houston airline quickly realized that the system, which records customer phone calls and information on the responding agent's computer screen, also was an intelligence bonanza, says André Harris, reservations training and quality-assurance director.
Speech Data Mining
• Label calls as success or failure based on some subsequent outcome (sale/no sale)
• Extract features from speech
• Find patterns of features that can be used to predict outcomes
• Hypotheses:
– Customer: “I’m not interested” → no sale
– Agent: “I just want to tell you…” → no sale
Inter-ocular effect (hits you between the eyes)
Don’t need a statistician to know which way the wind is blowing
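The label / extract / find-patterns recipe above can be sketched in a few lines. This is only an illustration under stated assumptions: the transcripts in `CALLS` are made up, plain substring matching stands in for real feature extraction from speech, and the function name `sale_rate_by_phrase` is hypothetical. It compares the sale rate of calls that contain a phrase with the rate of calls that do not.

```python
# Hypothetical toy data: (transcript, outcome) pairs. In practice the text
# would come from a recognizer and the label from a later outcome.
CALLS = [
    ("agent: i just want to tell you about our offer customer: i'm not interested", "no sale"),
    ("agent: how can i help customer: i'd like to book a flight", "sale"),
    ("agent: i just want to tell you one thing customer: ok go ahead", "no sale"),
    ("agent: how can i help customer: i'm not interested in upgrades but book the flight", "sale"),
    ("agent: can i help customer: i'm not interested", "no sale"),
]

# Phrase features taken from the two hypotheses on the slide.
PHRASES = ["i'm not interested", "i just want to tell you"]


def sale_rate_by_phrase(calls, phrases):
    """Return {phrase: (sale rate if present, sale rate if absent)}.

    A rate is None when no call falls in that group.
    """
    rates = {}
    for phrase in phrases:
        # stats[present] = [number of sales, number of calls]
        stats = {True: [0, 0], False: [0, 0]}
        for transcript, outcome in calls:
            present = phrase in transcript  # crude stand-in for a speech feature
            stats[present][0] += outcome == "sale"
            stats[present][1] += 1
        rates[phrase] = tuple(
            stats[p][0] / stats[p][1] if stats[p][1] else None
            for p in (True, False)
        )
    return rates


if __name__ == "__main__":
    for phrase, (with_phrase, without) in sale_rate_by_phrase(CALLS, PHRASES).items():
        print(f"{phrase!r}: sale rate with = {with_phrase}, without = {without}")
```

When the gap between the two rates is as large as the slide suggests, no significance test is needed to notice it; with subtler features, a proper test on far more calls would be.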
Outline
• We’re making consistent progress, or
• We’re running around in circles, or
• We’re going off a cliff…
– Don’t worry; be happy
According to unnamed sources:
Speech Winter Language Winter
Dot Boom Dot Bust
Sample of 20 Survey Questions (Strong Emphasis on Applications)
• When will
– More than 50% of new PCs have dictation on them, either at purchase or shortly after.
– Most telephone Interactive Voice Response (IVR) systems accept speech input.
– Automatic airline reservation by voice over the telephone is the norm.
– TV closed-captioning (subtitling) is automatic and pervasive.
– Telephones are answered by an intelligent answering machine that converses with the calling party to determine the nature and priority of the call.
– Public proceedings (e.g., courts, public inquiries, parliament, etc.) are transcribed automatically.
• Two surveys of ASRU attendees: 1997 & 2003
Hockey Stick Business Case
[Figure: $ vs. t, x-axis 2003, 2004, 2005; labeled: Last Year, This Year, Next Year]
2003 Responses ≈ 1997 Responses + 6 Years
(6 years of hard work → No progress)
Wrong Apps?
• New Priorities
– Increase demand for space >> Data entry
• New Killer Apps
– Search >> Dictation
  • Speech Google!
– Data mining
• Old Priorities
– Dictation app dates back to days of dictation machines
– Speech recognition has not displaced typing
  • Speech recognition has improved
  • But typing skills have improved even more
– My son will learn typing in 1st grade
– Secretaries rarely take dictation
– Dictation machines are history
  • My son may never see one
  • Museums have slide rules and steam trains
– But dictation machines?
Great Challenge: Annotating Data
• Produce annotated data with minimal supervision
• Active learning
– Identify reliable labels
– Identify best candidates for annotation
• Co-training
• Bootstrap (project) resources from one application to another
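One common way to act on the two active-learning sub-bullets is uncertainty sampling. The slide does not name a method, so this is a swapped-in illustration: the function name `split_by_confidence`, the `threshold` and `budget` parameters, and the example probabilities are all assumptions. Items the current model is confident about keep their automatic label; the least confident items are sent to a human annotator.

```python
def split_by_confidence(probs, threshold=0.9, budget=2):
    """Split items into auto-labeled ones and candidates for annotation.

    probs: {item_id: P(label is positive)} from some current model
           (hypothetical; any classifier that outputs probabilities works).
    Returns (auto_labels, to_annotate).
    """
    auto_labels = {}
    uncertain = []
    for item, p in probs.items():
        confidence = max(p, 1.0 - p)  # confidence in the more likely label
        if confidence >= threshold:
            auto_labels[item] = p >= 0.5  # treat as a reliable label
        else:
            uncertain.append((confidence, item))
    uncertain.sort()  # least confident first
    to_annotate = [item for _, item in uncertain[:budget]]
    return auto_labels, to_annotate


if __name__ == "__main__":
    # Made-up model outputs for five unlabeled calls.
    probs = {"call_1": 0.97, "call_2": 0.52, "call_3": 0.08,
             "call_4": 0.70, "call_5": 0.45}
    auto, ask = split_by_confidence(probs)
    print("auto-labeled:", auto)
    print("send to annotator:", ask)
```

The threshold trades annotation cost against label noise: a lower threshold accepts more automatic labels but lets more of the model's mistakes into the training data.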
Borrowed Slide: Jelinek (LREC)
Self-organizing “Magic” ≠ Error Analysis
Great Strategy Success
Grand Challenges
ftp://ftp.cordis.lu/pub/ist/docs/istag040319-draftnotesofthemeeting.pdf