Seminar PPT on Interacting with Computers

8/7/2019


Krishnananda Prabhu



Objective:

• Practical application of interacting with computers using voice



Key terms:

• Automatic Speech Recognition (ASR) technique: the technique used to convert speech to text.
• Text-to-Speech (TTS) conversion technique: the technique used to convert text to speech.



[Contd.] Hidden Markov Model:

• Modern general-purpose speech recognition systems are based on Hidden Markov Models. These are statistical models which output a sequence of symbols or quantities.
• Used in real-time applications.



Introduction:

• Transfer of information between human and machine is normally accomplished via one's senses.
• To communicate with our environment, we send out signals or information visually, auditorily, and through gestures.



[contd.]

• Human-computer interactions often use a mouse and keyboard as machine input, and a computer screen or printer as output.
• One can read text and understand images much more quickly on a two-dimensional (2-D) computer screen than when listening to a one-dimensional (1-D) speech signal.



[contd.]

• However, most people can speak more quickly than they can type, and are much more comfortable speaking than typing.
• Hence we arrive at the technique of interacting with computers using voice.



Model for Speech Recognition



Automatic Speech Recognition (ASR):

Definition: This is the technique used to convert speech to text.

Automatic speech recognition is useful, among other things, in situations where an operator is entering data into a computer while using his hands for other tasks.

The recognition strategy can, in short, be described as the extraction of a number of speech parameters from the acoustic speech signal for each word.



[contd.]

• In a training phase, the operator reads all the words in the vocabulary of the current application. The word patterns are stored. Later, when a word is to be recognised, its pattern is compared to the stored patterns, and the word that gives the best correspondence is selected. This technique is generally referred to as pattern recognition.
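
The training and matching steps described above can be sketched as a nearest-template lookup. This is only an illustration under stated assumptions: the slides do not say how patterns are compared, so Euclidean distance over fixed-length vectors is assumed here, and the three-value patterns and the function names are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train(examples):
    """Training phase: store one pattern per vocabulary word."""
    return dict(examples)

def recognise(pattern, templates):
    """Return the stored word whose pattern is closest to the input."""
    return min(templates, key=lambda word: distance(pattern, templates[word]))

# Hypothetical 3-value patterns; a real system would use many more parameters.
templates = train([
    ("yes", [0.9, 0.1, 0.3]),
    ("no", [0.2, 0.8, 0.5]),
    ("stop", [0.4, 0.4, 0.9]),
])
print(recognise([0.85, 0.15, 0.25], templates))  # prints "yes"
```

A word outside the stored vocabulary is still mapped to its nearest template, which is one reason such a recogniser cannot cope with unknown words.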



[contd]



DESCRIPTION OF THE HARDWARE BOARDS

• The hardware of the word recogniser consists of a general microcomputer and a signal processor for the acoustic analysis of the speech signal.
• The microcomputer board consists of the Motorola MC-68000 microprocessor and also has facilities for the input and output of data, and memory-management circuits for the memory cards (to store the vocabulary).
• The speech analysis board implements a spectrum analyser in the form of a 16-channel filter bank.
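
As a rough software analogue of the 16-channel filter bank, the sketch below sums spectral power in 16 equal-width frequency bands. The slides describe a hardware board and give no band edges, so the equal-width bands, the plain DFT, and the function name are all assumptions made for illustration.

```python
import cmath
import math

def filter_bank_energies(frame, n_channels=16):
    """Approximate a filter bank: sum DFT power in n_channels equal-width bands."""
    n = len(frame)
    half = n // 2  # bins 0..half-1 cover 0 Hz up to just below Nyquist
    power = []
    for k in range(half):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    # Assumes half is a multiple of n_channels; leftover bins would be ignored.
    band_width = half // n_channels
    return [sum(power[c * band_width:(c + 1) * band_width]) for c in range(n_channels)]

# Hypothetical test signal: a sine with 20 cycles per 128-sample frame.
frame = [math.sin(2 * math.pi * 20 * t / 128) for t in range(128)]
energies = filter_bank_energies(frame)
print(energies.index(max(energies)))  # prints 5 (bin 20 falls in band 20 // 4)
```

The 16 band energies of each frame form the kind of compact speech parameters that a word pattern can be built from.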



Text-to-Speech Conversion (TTS):



[contd.]

• The TTS is based on speech synthesis by diphone concatenation and consists of the following three modules together with the user interface module:
• Diphone database
• Text processing module
• Speech synthesiser



• The TTS system consists of three main parts:

1. Preprocessing module
2. Text analysis module
3. Synthesizer module
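
The three parts listed above can be pictured as a simple pipeline. The sketch below is a toy under stated assumptions: the slides do not describe what each module does internally, so the abbreviation table, digit expansion, lexicon, phoneme symbols, and function names are all hypothetical, and the synthesizer stage returns a phoneme string rather than audio.

```python
import re

# Hypothetical lookup tables; a real system would need far larger ones.
ABBREVIATIONS = {"dr.": "doctor", "no.": "number"}
DIGITS = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
LEXICON = {"doctor": ["d", "o", "k", "t", "er"], "two": ["t", "uw"]}

def preprocess(text):
    """Preprocessing: lower-case, expand abbreviations, spell out digits."""
    words = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token.isdecimal():
            words.extend(DIGITS[int(d)] for d in token)
        else:
            words.append(re.sub(r"[^a-z']", "", token))
    return [w for w in words if w]

def analyse(words):
    """Text analysis: map each word to phonemes, falling back to its letters."""
    return [LEXICON.get(w, list(w)) for w in words]

def synthesise(phoneme_groups):
    """Synthesizer stand-in: returns a flat phoneme string instead of audio."""
    return " ".join(p for group in phoneme_groups for p in group)

def text_to_speech(text):
    """Run the three stages in order."""
    return synthesise(analyse(preprocess(text)))

print(text_to_speech("Dr. 2"))  # prints "d o k t er t uw"
```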



TTS representation using concatenation:

Concatenation: A process/technique for producing sound from text. It builds the output from a set of basic sound elements.
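
A minimal sketch of concatenation, assuming the basic sound elements are two-phoneme units (commonly called diphones) stored as short lists of samples. The database contents, the silence marker `_`, and the function names are hypothetical, and real systems also smooth the joins between units, which is omitted here.

```python
def to_diphones(phonemes, silence="_"):
    """Pair each phoneme with its successor, padding both ends with silence."""
    padded = [silence] + list(phonemes) + [silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]

def concatenate(diphones, database):
    """Join the stored sample lists of each diphone into one waveform."""
    waveform = []
    for name in diphones:
        waveform.extend(database[name])  # raises KeyError if a unit is missing
    return waveform

# Hypothetical database: each diphone maps to a few made-up sample values.
database = {
    "_-h": [0, 1],
    "h-i": [2, 3, 4],
    "i-_": [5, 0],
}
units = to_diphones(["h", "i"])
print(units)                          # prints ['_-h', 'h-i', 'i-_']
print(concatenate(units, database))   # prints [0, 1, 2, 3, 4, 5, 0]
```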



Hidden Markov Model:

Definition: These are statistical models which output a sequence of symbols or quantities.

• Modern general-purpose speech recognition systems are based on Hidden Markov Models.
• HMMs are used in speech recognition because a speech signal can be viewed as a piecewise stationary signal or a short-time stationary signal.
• HMMs are popular because they can be trained automatically and are simple and computationally feasible to use.
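
To make "a statistical model which outputs a sequence of symbols" concrete, the sketch below scores a symbol sequence with the standard forward algorithm. The two-state model and its probabilities are made up for illustration, and discrete symbols are used for simplicity, whereas a speech recogniser would score real-valued feature vectors.

```python
def forward(observations, start, trans, emit):
    """Probability that the HMM outputs the given non-empty symbol sequence."""
    states = list(start)
    # Initialise with the first symbol, then fold in one symbol at a time.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: sum(alpha[p] * trans[p][s] for p in states) * emit[s][symbol]
            for s in states
        }
    return sum(alpha.values())

# Hypothetical two-state model over the symbols "x" and "y".
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.1, "y": 0.9}}
print(forward(["x", "y"], start, trans, emit))  # approximately 0.2156
```

A recogniser can compute this score for each candidate word model and pick the word whose model gives the highest value.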



[contd.]

• In speech recognition, the hidden Markov model would output a sequence of n-dimensional real-valued vectors (with n being a small integer, such as 10), outputting one of these every 10 milliseconds.
• The vectors would consist of cepstral coefficients, which are obtained by taking a Fourier transform of a short time window of speech and decorrelating the spectrum using a cosine transform, then taking the first (most significant) coefficients.
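
The cepstral computation described above can be sketched for a single frame as follows. Two points are assumptions rather than statements from the slides: the logarithm of the magnitude spectrum is taken before the cosine transform (the usual definition of a cepstrum), and windowing and any perceptual frequency warping are left out. The test frame and function name are hypothetical.

```python
import cmath
import math

def cepstral_coefficients(frame, n_coeffs=10, floor=1e-10):
    """Log-magnitude spectrum of one frame, followed by a DCT-II."""
    n = len(frame)
    n_bins = n // 2 + 1  # non-redundant DFT bins of a real signal
    log_spectrum = []
    for k in range(n_bins):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # The floor avoids log(0) for an empty frequency bin.
        log_spectrum.append(math.log(max(abs(coeff), floor)))
    # DCT-II of the log spectrum; keep only the first n_coeffs values.
    return [
        sum(v * math.cos(math.pi * q * (j + 0.5) / n_bins) for j, v in enumerate(log_spectrum))
        for q in range(n_coeffs)
    ]

# Hypothetical 64-sample frame standing in for a short window of speech.
frame = [math.sin(0.37 * t) + 0.5 * math.cos(1.9 * t) + 0.01 * t for t in range(64)]
features = cepstral_coefficients(frame)
print(len(features))  # prints 10
```

Repeating this for a new frame every 10 milliseconds yields the sequence of 10-dimensional vectors mentioned above. Making the frame louder shifts the whole log spectrum by a constant, which changes only the first coefficient.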



[contd.]

• A hidden Markov model for a sequence of words or phonemes is made by concatenating the individual trained hidden Markov models for the separate words and phonemes.
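
One way to picture this concatenation, assuming each phoneme model is a left-to-right chain of states, is to join the chains so that leaving the last state of one model enters the first state of the next. The topology, the self-loop probabilities, and the function name below are hypothetical, and the output distributions of each state are omitted to keep the sketch short.

```python
def concatenate_hmms(models):
    """Chain left-to-right HMMs into one transition matrix.

    Each model is a list of self-loop probabilities, one per state. A state
    stays put with that probability and otherwise moves to the next state,
    which for the last state of a model is the first state of the next model.
    The extra final row and column form an absorbing end state.
    """
    loops = [p for model in models for p in model]
    size = len(loops) + 1
    matrix = [[0.0] * size for _ in range(size)]
    for i, p in enumerate(loops):
        matrix[i][i] = p
        matrix[i][i + 1] = 1.0 - p
    matrix[-1][-1] = 1.0
    return matrix

# Hypothetical phoneme models with made-up self-loop probabilities.
phoneme_models = {"k": [0.5, 0.6], "ae": [0.7, 0.7, 0.8], "t": [0.4]}
word_model = concatenate_hmms([phoneme_models[p] for p in ["k", "ae", "t"]])
print(len(word_model))  # prints 7: six phoneme states plus the end state
```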



Advantages and Disadvantages:

Advantages:

1.) Physically disabled persons can also interact with computers using this technique.
2.) Text can be entered faster than by typing, without the physical effort.
3.) We can ask the computer to make an announcement, e.g. reading an e-mail.

Disadvantages:

1.) This mechanism does not help if the given word is outside its vocabulary.
2.) It doesn't take multiple inputs, i.e. when two persons are talking simultaneously.
3.) It doesn't work properly when there is a change of accent in the given word.



Conclusions:

• The possible ways of interacting with a computer using voice are discussed in the paper.
• The automatic speech recognition (ASR) technique and text-to-speech conversion techniques are used for its implementation.
• The concept of the Hidden Markov Model is used to realise this real-time application.
