PRESENTATION ON SPEECH RECOGNITION
Submitted To: Ms. Saiyma Aisha, Professor, CSE Department
Submitted By: Anshu Agrawal (k10778), B.Tech, CS, 7th sem.
CONTENT
• Introduction
• Principle
• Types
• Key terms
• Flow process
• How do humans do it?
• Applications
• Future scope
• Example
• Key challenges
INTRODUCTION
• Speech recognition is also known as automatic speech recognition, computer speech recognition, or voice recognition.
• It means that the computer understands a voice and performs the required task.
• A user gives a predefined voice instruction to the system through the microphone; the system understands the command and executes the required function.
• It lets the user operate Windows by voice, without using a keyboard or mouse.
PRINCIPLE OF SR
The smallest unit of spoken language is known as a phoneme.
The English language contains 44 phonemes, representing all the vowels and consonants that we use for speech.
Take a typical word such as moon, which can be broken down into three phonemes: m, ue, n.
To create a speech recognition engine, a large database of models is created to match each phoneme.
When a comparison is performed, the most likely match between the spoken phoneme and the stored one is determined, and further computations are then performed.
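The matching step above can be sketched as a nearest-model lookup. This is only an illustration: the two-number feature vectors, the `PHONEME_MODELS` table, and the function names are assumptions made up for the example, and real engines use much richer features and statistical models.

```python
import math

# Hypothetical stored models: one made-up 2-D feature vector per phoneme.
PHONEME_MODELS = {
    "m": (0.2, 0.9),
    "ue": (0.8, 0.3),
    "n": (0.3, 0.7),
}

def closest_phoneme(features, models=PHONEME_MODELS):
    """Return the stored phoneme whose model is nearest to `features`."""
    return min(models, key=lambda p: math.dist(features, models[p]))

def recognize(frames):
    """Match each spoken frame to its most likely stored phoneme."""
    return [closest_phoneme(frame) for frame in frames]

# Three invented frames standing in for the spoken word "moon".
spoken = [(0.22, 0.88), (0.79, 0.35), (0.31, 0.72)]
print(recognize(spoken))  # ['m', 'ue', 'n']
```

Euclidean distance is used here only because it is the simplest notion of "most likely match"; a probability-based score would take its place in a real system.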
TYPES OF SR SYSTEM
• Speaker-dependent SR system: works by learning the unique characteristics of a single person’s voice and depends on the speaker for training. Users have to read a few pages of text to the computer before they can use the speech recognition software. Dictation software is of this type.
• Speaker-independent SR system: designed to recognize anyone’s voice, so no training is involved. This makes it the only real option for applications such as interactive voice response systems.
KEY TERMS
• Speaking modes (isolated words, continuous speech)
• Signal analyzer
• Acoustic model
• Language model
• Digitization
• Phonetics
• Phonology
• Semantics & pragmatics
• Lexicology & syntax
SIGNAL ANALYZER: Analyses the speech signal and removes background noise, thus focusing only on the speaker’s speech.
ACOUSTIC MODEL: Identifies phonemes from the speech sample using a probability-based mathematical model.
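The noise-removal role of the signal analyzer can be illustrated with a very simple energy-based noise gate. This is a sketch under stated assumptions, not how any particular product works: the frame size, the threshold, and the function names are invented for the example, and practical systems use more sophisticated filtering.

```python
def frame_energy(frame):
    """Average squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def noise_gate(samples, frame_size, threshold):
    """Silence every frame whose average energy is below `threshold`."""
    cleaned = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if frame_energy(frame) >= threshold:
            cleaned.extend(frame)                 # loud enough: keep as speech
        else:
            cleaned.extend([0.0] * len(frame))    # quiet: treat as background noise
    return cleaned

# Two quiet "noise" samples followed by two loud "speech" samples.
print(noise_gate([0.01, -0.02, 0.5, -0.6], frame_size=2, threshold=0.01))
# [0.0, 0.0, 0.5, -0.6]
```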
LANGUAGE MODEL: Identifies the words, and thus the sentences, uttered by the speaker from the phonemes by making use of a dictionary file and a grammar file.
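A toy version of this lookup is sketched below. The `DICTIONARY` and `BIGRAM_SCORE` tables stand in for the dictionary file and grammar file; their contents, the phoneme spelling, and the homophone "pane" are assumptions chosen for illustration only.

```python
# Hypothetical dictionary file: phoneme sequence -> candidate words.
DICTIONARY = {
    ("p", "ai", "n"): ["pain", "pane"],
}

# Hypothetical grammar scores: how plausible a word is after the previous word.
BIGRAM_SCORE = {
    ("back", "pain"): 0.9,
    ("back", "pane"): 0.01,
    ("window", "pane"): 0.9,
    ("window", "pain"): 0.01,
}

def pick_word(phonemes, previous_word):
    """Choose the candidate word that best fits the preceding word."""
    candidates = DICTIONARY.get(tuple(phonemes), [])
    if not candidates:
        return None  # phoneme sequence not in the dictionary
    return max(candidates,
               key=lambda w: BIGRAM_SCORE.get((previous_word, w), 0.0))

print(pick_word(["p", "ai", "n"], "back"))    # pain
print(pick_word(["p", "ai", "n"], "window"))  # pane
```

The same phonemes map to different words depending on context, which is why the acoustic model alone is not enough.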
DIGITIZATION: Analogue-to-digital conversion.
• Sampling converts a continuous signal into a discrete signal.
• Quantizing approximates a continuous range of values by a finite set of discrete values.
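The two digitization steps above can be sketched as follows. The 440 Hz test tone, the 8000 Hz sample rate, and the 8-bit depth are arbitrary assumptions for the example, as are the function names.

```python
import math

def sample(signal, duration_s, sample_rate_hz):
    """Sampling: read a continuous-time signal at evenly spaced instants."""
    n = int(round(duration_s * sample_rate_hz))
    return [signal(i / sample_rate_hz) for i in range(n)]

def quantize(samples, bits):
    """Quantizing: map each sample in [-1, 1] to the nearest of 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    codes = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))   # keep the value inside [-1, 1]
        codes.append(round((clipped + 1.0) / step))
    return codes

def tone(t):
    """A 440 Hz sine wave standing in for the analogue microphone signal."""
    return math.sin(2 * math.pi * 440 * t)

samples = sample(tone, duration_s=0.001, sample_rate_hz=8000)
codes = quantize(samples, bits=8)
print(len(samples), codes)
```

Each entry of `codes` is an integer from 0 to 255; more bits give finer steps and a closer approximation of the original signal.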
PHONETICS: The study of speech sounds, including the variability in human speech.
PHONOLOGY: Recognizing individual sound distinctions; it is the systematic use of sound to encode meaning in any spoken human language.
SEMANTICS & PRAGMATICS:
• Semantics tells the meaning.
• Pragmatics is concerned with bridging the explanatory gap between sentence meaning and speaker’s meaning.
LEXICOLOGY & SYNTAX:
• Lexicology is the part of linguistics that studies words, their nature, and their meaning.
• Syntax describes the arrangement of words and phrases to create well-formed sentences.
BASIC FLOW PROCESS
HOW DO HUMANS DO IT?
First, articulation produces sound waves, which the ear conveys to the brain for processing.
APPLICATIONS
• Military (high-performance aircraft, helicopters)
• People with disabilities
• Dyslexic people
• Computer & video games (Microsoft Xbox and Sony PS2 consoles offer games with speech input and output)
• Medical transcription
• Mobile phone devices
• Voice security systems
FUTURE SCOPE
• Accuracy will continue to improve.
• Small hand-held writing tablets for speech recognition dictation and data entry will be developed as faster processors and more memory become available.
• Greater use will be made of “intelligent systems” that attempt to guess what the speaker intended to say, rather than what was actually said, since people often misspeak and make unintentional mistakes.
• Microphones and sound systems will be designed to adapt more quickly to changing background noise levels and different environments, with better recognition of extraneous material to be discarded.
EXAMPLE
[Diagram: spoken word "Pain" → ACOUSTIC MODEL → candidate matches (pain, pain, pain) → LANGUAGE MODEL → correct word → TEXT OUTPUT]
KEY CHALLENGES
SR systems have to deal with a large number of challenges, such as:
• The speaker’s voice is often accompanied by surrounding noise, which makes accurate recognition difficult.
• A speaker may use a large number of different words, all of which have to be accurately recognized.
• Accents vary from person to person, which is a major challenge.
• A speaker may speak very quickly, and each of the words spoken still has to be recognized accurately.
THANK YOU..!