
PRESENTATION ON

SPEECH RECOGNITION

Submitted To: Ms. Saiyma Aisha, Professor, CSE Department

Submitted By: Anshu Agrawal (k10778), B.Tech, CS, 7th sem.

CONTENT

• Introduction

• Principle

• Types

• Key terms

• Flow process

• How do humans do it?

• Applications

• Future scope

• Example

• Key challenges

INTRODUCTION

• It is also known as automatic speech recognition, computer speech recognition, or voice recognition.

• It means that the computer understands spoken input and performs the required task.

• The user gives a predefined voice instruction to the system through a microphone; the system understands the command and executes the required function.

• It lets the user operate Windows by voice, without using a keyboard or mouse.

PRINCIPLE OF SR

The smallest unit of spoken language is known as a phoneme.

The English language contains 44 phonemes representing all the vowels and consonants that we use for speech.

For example, the word "moon" can be broken down into three phonemes: m, ue, n.

To create a speech recognition engine, a large database of models is created to match each phoneme.

Each spoken phoneme is compared with the stored models and the most likely match is determined; further computations are then performed on the result.
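The matching step described above can be sketched roughly as follows. This is a minimal illustration, not how a production engine works: the two-number feature vectors, the `PHONEME_MODELS` table and the function names are all assumptions made up for this example, and real engines use much richer statistical models.

```python
import math

# Hypothetical stored models: one made-up feature vector per phoneme.
PHONEME_MODELS = {
    "m": (0.2, 0.1),
    "ue": (0.8, 0.4),
    "n": (0.3, 0.9),
}

def closest_phoneme(features):
    """Return the stored phoneme whose model is nearest to the spoken features."""
    return min(PHONEME_MODELS, key=lambda p: math.dist(features, PHONEME_MODELS[p]))

def recognize(frames):
    """Match each spoken frame against the stored models, one phoneme per frame."""
    return [closest_phoneme(frame) for frame in frames]

# Three made-up frames, standing in for the sounds of the word "moon".
spoken = [(0.25, 0.15), (0.75, 0.35), (0.28, 0.85)]
print(recognize(spoken))
```

Here "most likely match" is approximated by the smallest distance between the spoken features and each stored model.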

TYPES OF SR SYSTEM

• Speaker-dependent SR system: works by learning the unique characteristics of a single person’s voice, so it depends on the speaker for training. The user has to read a few pages of text to the computer before using the speech recognition software. Dictation software is typically of this type.

• Speaker-independent SR system: designed to recognize anyone’s voice, so no training is involved. It is the only real option for applications such as interactive voice response systems.

KEY TERMS

• Speaking modes: isolated words, continuous speech

• Signal analyzer

• Acoustic model

• Language model

• Digitization

• Phonetics

• Phonology

• Semantics & pragmatics

• Lexicology & syntax


SIGNAL ANALYZER: Analyzes the speech signal and removes background noise, so that processing focuses only on the speaker’s speech.

ACOUSTIC MODEL: Identifies phonemes from the speech sample using a probability-based mathematical model.


LANGUAGE MODEL: Identifies the words, and thus the sentences, uttered by the speaker from the phonemes, using a dictionary file and a grammar file.
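One way to picture the dictionary file and grammar file is as two lookup tables. The sketch below is only an illustration under stated assumptions: the `DICTIONARY` and `GRAMMAR` contents, the phoneme spellings and the function names are hypothetical, and the phonemes are assumed to be already grouped word by word.

```python
# Hypothetical dictionary file: phoneme sequence -> word (entries are made up).
DICTIONARY = {
    ("m", "ue", "n"): "moon",
    ("n", "ue"): "new",
}

# Hypothetical grammar file: the word sequences the system accepts.
GRAMMAR = {
    ("new", "moon"),
}

def words_from_phonemes(phoneme_groups):
    """Look up each group of phonemes in the dictionary; unknown groups become None."""
    return [DICTIONARY.get(tuple(group)) for group in phoneme_groups]

def is_valid_sentence(words):
    """Check the recognized word sequence against the grammar."""
    return tuple(words) in GRAMMAR

groups = [["n", "ue"], ["m", "ue", "n"]]
words = words_from_phonemes(groups)
print(words, is_valid_sentence(words))
```

The dictionary turns phonemes into words, and the grammar decides whether those words form an acceptable sentence.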

DIGITIZATION: Analogue-to-digital conversion.

• Sampling converts a continuous signal into a discrete signal.

• Quantizing approximates a continuous range of values with a finite set of discrete values.
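The two digitization steps can be sketched as follows. This is a minimal illustration: the 1 Hz test tone, the sampling rate of 8 samples per second, the 5 quantization levels and the function names are assumptions chosen to keep the numbers small, not values used by any real recognizer.

```python
import math

def sample(signal, duration_s, rate_hz):
    """Sampling: read the continuous signal at evenly spaced instants."""
    count = int(duration_s * rate_hz)
    return [signal(i / rate_hz) for i in range(count)]

def quantize(value, levels, lo=-1.0, hi=1.0):
    """Quantizing: snap a value in [lo, hi] to the nearest of `levels` evenly spaced values."""
    step = (hi - lo) / (levels - 1)
    index = round((value - lo) / step)
    index = max(0, min(levels - 1, index))  # clamp out-of-range input
    return lo + index * step

def tone(t):
    """A stand-in 'continuous' signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)

samples = sample(tone, duration_s=1.0, rate_hz=8)
digital = [quantize(s, levels=5) for s in samples]
print(digital)
```

Raising `rate_hz` captures the signal at more points in time, while raising `levels` reduces the rounding error of each captured value.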


PHONETICS: The study of speech sounds, including the variability in human speech.

PHONOLOGY: Recognizing individual sound distinctions. It is the systematic use of sound to encode meaning in any spoken human language.

SEMANTICS & PRAGMATICS:

• Semantics deals with meaning.

• Pragmatics is concerned with bridging the explanatory gap between sentence meaning and the speaker’s meaning.


LEXICOLOGY & SYNTAX:

• Lexicology is the part of linguistics that studies words, their nature and meaning.

• Syntax describes the arrangement of words and phrases to create well-formed sentences.

BASIC FLOW PROCESS

HOW DO HUMANS DO IT?

First, articulation produces sound waves, which the ear conveys to the brain for processing.

APPLICATIONS

• Military (high-performance aircraft, helicopters)

• People with disabilities

• Dyslexic people

• Computer and video games (Microsoft Xbox and Sony PS2 consoles offer games with speech input and output)

• Medical transcription

• Mobile phone devices

• Voice security systems

FUTURE SCOPE

Accuracy will continue to improve.

Small hand-held writing tablets for speech recognition dictation and data entry will be developed as faster processors and more memory become available.

Greater use will be made of “intelligent systems” that attempt to guess what the speaker intended to say, rather than what was actually said, since people often misspeak and make unintentional mistakes.

Microphones and sound systems will be designed to adapt more quickly to changing background noise levels and different environments, with better recognition of extraneous material to be discarded.

EXAMPLE

[Figure: flow diagram for the spoken word "pain", with the labels ACOUSTIC MODEL, LANGUAGE MODEL, CORRECT and TEXT OUTPUT.]
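The interplay between the acoustic model and the language model for the word "pain" might be sketched like this. Everything numeric here is an assumption for illustration: the candidate words "pane" and "pen", both score tables and the function name are made up and are not taken from the slide.

```python
# Hypothetical acoustic scores: how well each candidate word matches the sound.
ACOUSTIC_SCORES = {"pain": 0.40, "pane": 0.40, "pen": 0.20}

# Hypothetical language-model scores: how plausible each word is in context.
LANGUAGE_SCORES = {"pain": 0.70, "pane": 0.05, "pen": 0.25}

def best_word(acoustic, language):
    """Pick the candidate with the highest combined (acoustic x language) score."""
    return max(acoustic, key=lambda word: acoustic[word] * language.get(word, 0.0))

text_output = best_word(ACOUSTIC_SCORES, LANGUAGE_SCORES)
print(text_output)
```

In this sketch the acoustic model alone cannot separate "pain" from "pane", since they sound the same; the language model's context score is what selects the correct text output.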

KEY CHALLENGES

SR systems have to deal with a large number of challenges, such as:

The speaker’s voice is often accompanied by surrounding noise, which makes accurate recognition difficult.

A speaker may use a large number of different words, all of which have to be recognized accurately.

Accent varies from person to person, which is a major challenge.

A speaker may speak very quickly, yet each word spoken still has to be recognized accurately.

THANK YOU..!
