in collaboration with
Hualin Gao, Richard Duncan, Julie A. Baca, Joseph Picone
Human and Systems Engineering
Center for Advanced Vehicular Systems
Mississippi State University
SIGNAL PROCESSING TOOLS FOR SPEECH RECOGNITION
Presented by Richard Duncan
Tablet PC
Microsoft Corporation
Page 2 of 38 - Signal Processing Tools for Speech Recognition
WHICH TWO ARE THE SAME PHONEME?
We need to extract meaningful information from the signal for a speech recognition system to model.
a: “ow”, b: “aa”, c: “ow” (a and c are the same phoneme)
WHAT IS AN ACOUSTIC FRONT-END?
It encapsulates the signal processing of a speech recognition system.
It computes a sequence of feature vectors from an audio stream.
These vectors are then processed by HMMs, neural networks, or other classifiers.
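To make "a sequence of feature vectors from an audio stream" concrete, here is a minimal sketch in standard C++. It is not the ISIP front-end: the function name computeFeatures, the non-overlapping frames, and the two features (log energy and zero-crossing rate) are assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One feature vector per frame: {log energy, zero-crossing rate}.
// Both features are illustrative choices, not taken from the slides.
using FeatureVector = std::vector<double>;

// Hypothetical helper: turns a stream of samples into a sequence of
// feature vectors, one per non-overlapping frame of frameLength samples.
std::vector<FeatureVector> computeFeatures(const std::vector<double>& samples,
                                           std::size_t frameLength) {
    assert(frameLength > 0);
    std::vector<FeatureVector> features;
    // A trailing partial frame is dropped.
    for (std::size_t start = 0; start + frameLength <= samples.size();
         start += frameLength) {
        double energy = 0.0;
        std::size_t crossings = 0;
        for (std::size_t i = start; i < start + frameLength; ++i) {
            energy += samples[i] * samples[i];
            // Count sign changes between neighbouring samples in the frame.
            if (i > start && (samples[i] < 0.0) != (samples[i - 1] < 0.0)) {
                ++crossings;
            }
        }
        // A small floor avoids log(0) on silent frames.
        const double logEnergy = std::log(energy + 1e-10);
        const double zcr = static_cast<double>(crossings) /
                           static_cast<double>(frameLength);
        features.push_back({logEnergy, zcr});
    }
    return features;
}
```

The returned sequence is what a downstream classifier such as an HMM would consume; a real front-end would typically use richer features than these two.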
WHY REINVENT THE WHEEL?
A front-end has many areas of complexity:
• Run-time efficiency
• File I/O
• Data management (framing)
• DSP algorithm complexity
• Algorithm re-use
Our system shields the researcher or student from these mundane issues so that he or she can focus on the algorithms.
DATA FRAMING
[Figure: frame n and frame n+1, with window n and window n+1; regions labeled "New data" and "Shared data".]
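One common way to realize this kind of framing is sketched below in standard C++. It assumes that each window is centered on its frame and is longer than the frame, so that adjacent windows share samples; the name frameSignal, the zero padding at the signal edges, and the dropped partial frame are all assumptions, not the ISIP implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns one analysis window per frame.
// Frame n covers samples [n*frameLength, (n+1)*frameLength). Window n is
// centered on frame n and extends (windowLength - frameLength)/2 samples
// on each side. Samples outside the signal are treated as zero.
std::vector<std::vector<double>> frameSignal(const std::vector<double>& signal,
                                             std::size_t frameLength,
                                             std::size_t windowLength) {
    assert(frameLength > 0 && windowLength >= frameLength);
    assert((windowLength - frameLength) % 2 == 0);
    const std::size_t pad = (windowLength - frameLength) / 2;
    const std::size_t numFrames = signal.size() / frameLength;  // partial frame dropped

    std::vector<std::vector<double>> windows;
    windows.reserve(numFrames);
    for (std::size_t n = 0; n < numFrames; ++n) {
        std::vector<double> window(windowLength, 0.0);
        for (std::size_t k = 0; k < windowLength; ++k) {
            // shifted equals (signal index + pad), which keeps the
            // arithmetic in unsigned integers.
            const std::size_t shifted = n * frameLength + k;
            if (shifted >= pad && shifted - pad < signal.size()) {
                window[k] = signal[shifted - pad];
            }
        }
        windows.push_back(window);
    }
    return windows;
}
```

With a frame length of 2 and a window length of 4, each window contributes 2 new samples and shares 2 samples with the window that follows it.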
FEATURES OF ISIP FOUNDATION CLASSES
• Efficient memory management and tracking;
• System and I/O libraries that abstract details of the operating system;
• Math classes that provide basic linear algebra and efficient matrix manipulations;
• Generic data structures;
• Built-in unit tests to verify component correctness.
DESIGN REQUIREMENTS
• A library of standard algorithms provides basic digital signal processing (DSP) functions;
• New algorithms can be added without modifying existing classes;
• A block diagram tool allows rapid prototyping without programming or recompiling;
• The same system is used for offline feature extraction, recognition, and general DSP work.
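The requirement that new algorithms can be added without modifying existing classes is often met with a common abstract interface. The sketch below shows that general pattern in standard C++; the names Algorithm, SumOfSquares, ToDecibels, and runChain are hypothetical and are not the ISIP class names or API.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <vector>

// Hypothetical common interface that every algorithm implements.
class Algorithm {
public:
    virtual ~Algorithm() = default;
    virtual std::vector<double> apply(const std::vector<double>& input) const = 0;
};

// An existing "library" algorithm: sum of squared samples.
class SumOfSquares : public Algorithm {
public:
    std::vector<double> apply(const std::vector<double>& input) const override {
        double sum = 0.0;
        for (double v : input) sum += v * v;
        return {sum};
    }
};

// A new algorithm added later; the classes above are left untouched.
// Note: a zero input value yields -infinity.
class ToDecibels : public Algorithm {
public:
    std::vector<double> apply(const std::vector<double>& input) const override {
        std::vector<double> out;
        for (double v : input) out.push_back(10.0 * std::log10(v));
        return out;
    }
};

// Runs the blocks in order, in the spirit of a block diagram whose
// layout is decided at run time rather than compiled in.
std::vector<double> runChain(const std::vector<std::unique_ptr<Algorithm>>& chain,
                             std::vector<double> data) {
    for (const auto& block : chain) {
        assert(block != nullptr);
        data = block->apply(data);
    }
    return data;
}
```

Because runChain depends only on the interface, a chain can be rearranged or extended with new blocks without recompiling the algorithms themselves.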
BASIC DIGITAL PROCESSING FUNCTIONS
This example shows how to use the basic digital signal processing functions. It computes the energy of the input vector in dB using the SUM algorithm:
// declare an Energy object, input vector, and output vector
Energy egy;
VectorFloat output;
VectorFloat input(L"0, 1, 2");