Page 1: Silent sound interface

SEMINAR BY: JEEVITHA R (1EC08EC018)
GUIDED BY: Ms VIDYA S BENNUR

Page 2: Silent sound interface

CONTENTS
• Introduction
• What is speech?
• Sources of information
• Brain computer interface (BCI)
• Speech synthesis
• Speech synthesis technologies
• Block diagram
• Features
• Methods of producing: electromyography, image processing
• Applications
• In fiction
• References

Page 3: Silent sound interface

Introduction

• You are in a theatre, a noisy restaurant, or a bus; the noise all around is a big problem while talking on a mobile phone. In future this problem will be eliminated with "Silent Sound Technology", a new technology unveiled at the CeBIT fair. It transforms lip movements into a computer-generated voice for the listener at the other end of the line.
• A silent speech interface is a device that allows speech communication without using the sound made when people vocalize their speech sounds. As such, it is a type of electronic lip reader. It works by the computer identifying the phonemes that an individual pronounces from non-auditory sources of information about their speech movements. These are then used to recreate the speech using speech synthesis.

Page 4: Silent sound interface

• The device uses electromyography, monitoring the tiny muscular movements that occur when we speak and converting them into electrical pulses that can be turned into speech, without a sound being uttered. It also uses an image processing technique that converts digital data into a film image with minimal corrections and calibrations.

Page 5: Silent sound interface

SPEECH: Speech is the vocalized form of human communication. It is based upon the syntactic combination of lexical items and names that are drawn from very large vocabularies (usually about 10,000 different words). A gestural form of human communication exists for the deaf in the form of sign language. Speech in some cultures has become the basis of a written language, often one that differs in its vocabulary, syntax and phonetics from its associated spoken one, a situation called diglossia.

Page 6: Silent sound interface

Sources of information:
• Vocal tract
• Bone conduction

Page 7: Silent sound interface

VOCAL TRACT:

The vocal tract is the cavity in human beings and in animals where the sound produced at the sound source (larynx in mammals; syrinx in birds) is filtered.

[Video: The Human Voice System.mp4]

Page 8: Silent sound interface

Bone conduction

Bone conduction is the conduction of sound to the inner ear through the bones of the skull. Some hearing aids employ bone conduction, achieving an effect equivalent to hearing directly by means of the ears. A headset is ergonomically positioned on the temple and cheek, and the electromechanical transducer, which converts electric signals into mechanical vibrations, sends sound to the internal ear through the cranial bones. Likewise, a microphone can be used to record spoken sounds via bone conduction. The first description, in 1923, of a bone conduction hearing aid was Hugo Gernsback's "Osophone", which he later elaborated on with his "Phonosone".

Page 9: Silent sound interface

Categories:
• Ordinary products
• Hearing aids
• Specialized communication products

Advantages:
• Ears remain free
• High sound clarity in very noisy environments
• Can give a perception of stereo sound

Disadvantages:
• Some implementations require more power than headphones.
• Less clear recording and playback than headphones.

Page 10: Silent sound interface

Brain computer interface:

A brain computer interface (BCI), often called a mind-machine interface (MMI) or sometimes a direct neural interface, is a direct communication pathway between the brain and an external device.

The field of BCI research and development has since focused primarily on neuroprosthetics applications that aim at restoring damaged hearing, sight and movement. Thanks to the remarkable cortical plasticity of the brain, signals from implanted prostheses can, after adaptation, be handled by the brain like natural sensor or effector channels. Following years of animal experimentation, the first neuroprosthetic devices implanted in humans appeared in the mid-1990s.

Page 11: Silent sound interface

Speech synthesis:

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and it can be implemented in software or hardware. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity.

Page 12: Silent sound interface

Speech synthesizing process:

The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written works on a home computer. Many computer operating systems have included speech synthesizers since the early 1980s.
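As an illustration of the built-in synthesizers mentioned above, the third-party pyttsx3 package (an assumption here, not part of the seminar) drives the platform's own TTS engine, such as SAPI5 on Windows or eSpeak on Linux:

```python
# A minimal sketch of invoking an operating system's speech synthesizer
# through the pyttsx3 package; the rate value is illustrative.
import pyttsx3

engine = pyttsx3.init()                  # select the platform's TTS backend
engine.setProperty("rate", 150)          # speaking rate, words per minute
engine.say("Silent sound interface demo.")
engine.runAndWait()                      # block until speech finishes
```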

Page 13: Silent sound interface

Synthesizer Technologies:

The most important qualities of a speech synthesis system are naturalness and intelligibility. Naturalness describes how closely the output sounds like human speech, while intelligibility is the ease with which the output is understood.

There are eight types of synthesizing technologies:

a) Concatenative synthesis
b) Unit selection synthesis
c) Diphone synthesis
d) Domain-specific synthesis
e) Formant synthesis
f) Articulatory synthesis
g) HMM-based synthesis
h) Sine wave synthesis

Page 14: Silent sound interface

CONCATENATIVE SYNTHESIS: Concatenative synthesis is based on the concatenation (or stringing together) of segments of recorded speech. Generally, concatenative synthesis produces the most natural-sounding synthesized speech.
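A minimal sketch of the concatenation step, assuming a hypothetical database of prerecorded word waveforms stored as NumPy arrays at a common sample rate (real systems store phones, diphones, or larger units):

```python
# Concatenative synthesis sketch: look up recorded units and join them.
import numpy as np

SAMPLE_RATE = 16_000  # Hz

# Hypothetical unit database: word -> recorded waveform.
# Random arrays stand in for actual recordings here.
unit_db = {
    "hello": np.random.randn(SAMPLE_RATE // 2),
    "world": np.random.randn(SAMPLE_RATE // 2),
}

def synthesize(words, gap_s=0.05):
    """Concatenate stored units, inserting short silences between them."""
    gap = np.zeros(int(gap_s * SAMPLE_RATE))
    pieces = []
    for w in words:
        pieces.append(unit_db[w])  # fetch the recorded segment
        pieces.append(gap)
    return np.concatenate(pieces)

waveform = synthesize(["hello", "world"])
```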

UNIT SELECTION SYNTHESIS: Unit selection synthesis uses large databases of recorded speech. During database creation, each recorded utterance is segmented into some or all of the following: individual phones, diphones, half-phones, syllables, morphemes, words, phrases, and sentences.

DIPHONE SYNTHESIS: Diphone synthesis uses a minimal speech database containing all the diphones (sound-to-sound transitions) occurring in a language. The number of diphones depends on the phonotactics of the language: for example, Spanish has about 800 diphones and German about 2500. In diphone synthesis, only one example of each diphone is contained in the speech database.
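A sketch of the diphone lookup step, assuming a hypothetical phoneme transcription and a database that holds exactly one recorded example per diphone (the file names are placeholders):

```python
# Split a phoneme sequence into consecutive sound-to-sound pairs and
# fetch the single stored example of each diphone.

phonemes = ["h", "e", "l", "o"]  # hypothetical transcription of "hello"

diphone_db = {
    ("h", "e"): "h-e.wav",
    ("e", "l"): "e-l.wav",
    ("l", "o"): "l-o.wav",
}

def to_diphones(seq):
    """Pair each phoneme with its successor."""
    return list(zip(seq, seq[1:]))

units = [diphone_db[d] for d in to_diphones(phonemes)]
print(units)  # ['h-e.wav', 'e-l.wav', 'l-o.wav']
```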

Page 15: Silent sound interface

DOMAIN-SPECIFIC SYNTHESIS: Domain-specific synthesis concatenates prerecorded words and phrases to create complete utterances. It is used in applications where the variety of texts the system will output is limited to a particular domain, like transit schedule announcements or weather reports.

FORMANT SYNTHESIS: Formant synthesis does not use human speech samples at runtime. Instead, the synthesized speech output is created using additive synthesis and an acoustic model (physical modeling synthesis). Parameters such as fundamental frequency, voicing, and noise levels are varied over time to create a waveform of artificial speech. This method is sometimes called rules-based synthesis.
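A simplified additive-synthesis sketch of the formant idea: harmonics of a fundamental are summed, with each harmonic weighted by its closeness to assumed formant peaks. Real rule-based synthesizers vary these parameters over time; the frequencies and bandwidths here are illustrative only:

```python
# Formant synthesis sketch: build a static vowel-like tone additively.
import numpy as np

SR = 16_000          # sample rate (Hz)
F0 = 120.0           # fundamental frequency (Hz)
FORMANTS = [(700, 80), (1200, 90), (2600, 120)]  # (center Hz, bandwidth)

t = np.arange(int(0.5 * SR)) / SR  # 0.5 s of audio

def formant_gain(f):
    """Weight a harmonic by its distance to the nearest formant peak."""
    return max(np.exp(-((f - fc) / bw) ** 2) for fc, bw in FORMANTS)

wave = sum(
    formant_gain(k * F0) * np.sin(2 * np.pi * k * F0 * t)
    for k in range(1, int(SR / 2 / F0))  # all harmonics below Nyquist
)
wave /= np.abs(wave).max()  # normalize to [-1, 1]
```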

Page 16: Silent sound interface

ARTICULATORY SYNTHESIS:Articulatory synthesis refers to computational techniques

for synthesizing speech based on models of the human vocal tract and the articulation processes occurring there. Until recently, articulatory synthesis models have not been incorporated into commercial speech synthesis systems.

HMM-BASED SYNTHESIS: HMM-based synthesis is a synthesis method based on hidden Markov models, also called statistical parametric synthesis. In this system, the frequency spectrum (vocal tract), fundamental frequency (vocal source), and duration (prosody) of speech are modeled simultaneously by HMMs. Speech waveforms are generated from the HMMs themselves based on the maximum likelihood criterion.
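A toy sketch of the statistical-parametric idea: each HMM state holds a Gaussian over an acoustic parameter plus a duration, and under a maximum-likelihood criterion (ignoring dynamic features) the generated trajectory is simply each state's mean repeated for its duration. Real systems model full spectra, F0, and durations jointly; these states and numbers are purely illustrative:

```python
# HMM-based synthesis sketch: generate a parameter trajectory from
# per-state Gaussians and modeled durations.
from dataclasses import dataclass

@dataclass
class State:
    mean: float      # Gaussian mean of the acoustic parameter
    var: float       # Gaussian variance (unused in this ML trajectory)
    frames: int      # modeled duration in frames

# Hypothetical 3-state model for one phone.
model = [State(0.2, 0.01, 5), State(0.8, 0.02, 8), State(0.4, 0.01, 5)]

# Most likely per-frame value is the current state's mean.
trajectory = [s.mean for s in model for _ in range(s.frames)]
print(len(trajectory), trajectory[:6])
```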

Page 17: Silent sound interface

SINE WAVE SYNTHESIS: Sine wave synthesis is a technique for synthesizing speech by replacing the formants (main bands of energy) with pure tone whistles.
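A minimal sketch of that replacement: one pure-tone whistle per formant. Real sine-wave speech tracks time-varying formants estimated from a recording; the fixed frequencies and amplitudes here are illustrative:

```python
# Sine wave synthesis sketch: three pure tones stand in for formants.
import numpy as np

SR = 16_000
t = np.arange(int(0.5 * SR)) / SR
formant_tracks = [500.0, 1500.0, 2500.0]  # illustrative formant centers (Hz)

# Sum one whistle per formant; amplitude falls off for higher formants.
wave = sum(
    (0.5 ** i) * np.sin(2 * np.pi * f * t)
    for i, f in enumerate(formant_tracks)
)
wave /= np.abs(wave).max()
```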

Page 18: Silent sound interface

BLOCK DIAGRAM:

Page 19: Silent sound interface

FEATURES: AUDIO SPOTLIGHT:

The Audio Spotlight transmitters generate a column of sound between three and five degrees wider than the transmitter. It converts ordinary audio into high-frequency ultrasonic signals that are outside the range of normal hearing. As these sound waves push out from the source, they interact with air pressure to create audible sounds.

Sound field distribution is shown with equal loudness contours for a standard 1 kHz tone. The center area is loudest at 100% amplitude, while the sound level just outside the illustrated beam area is less than 10%.

Audio Spotlight systems are much less sensitive to listener distance than traditional loudspeakers, but maximum performance is attained at roughly 1-2 m (3-6 feet) from the listener. Typical levels are 80 dB SPL at 1 kHz for the AS-16 and 85 dB SPL for the AS-24 model. The larger AS-24 can output about twice the power and twice the low-frequency range.
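A hedged sketch of the underlying idea: the audio signal is amplitude-modulated onto an ultrasonic carrier, and nonlinear propagation in air demodulates it back into audible sound. This simple double-sideband AM ignores the predistortion that real systems apply; the carrier frequency and modulation depth are assumptions:

```python
# Audio-on-ultrasound sketch: modulate a 1 kHz tone onto a 40 kHz carrier.
import numpy as np

SR = 192_000          # high sample rate to represent ultrasound
CARRIER = 40_000.0    # assumed ultrasonic carrier frequency (Hz)

t = np.arange(int(0.1 * SR)) / SR
audio = np.sin(2 * np.pi * 1_000.0 * t)        # a 1 kHz test tone

m = 0.8                                        # modulation depth
ultrasonic = (1 + m * audio) * np.sin(2 * np.pi * CARRIER * t)
```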

Page 20: Silent sound interface

This simulation uses a fixed source size (0.4 m / 16 in) with varying wavelength. From the statements above, we expect to see a nearly omnidirectional response when the wavelength is large relative to the source, and higher directivity as the wavelength decreases.
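This behavior can be reproduced with the standard directivity formula for a baffled circular piston, D(θ) = |2·J1(ka·sin θ)/(ka·sin θ)| with k = 2π/λ and a the source radius (0.2 m for a 0.4 m aperture). The sketch below assumes that model; it is not the seminar's own simulation code:

```python
# Piston-source directivity: long wavelengths radiate almost
# omnidirectionally, short wavelengths form a narrow beam.
import numpy as np
from scipy.special import j1

a = 0.2                                   # source radius (m)
theta = np.linspace(1e-6, np.pi / 2, 500)  # off-axis angles (rad)

def directivity(wavelength):
    ka = 2 * np.pi / wavelength * a
    x = ka * np.sin(theta)
    return np.abs(2 * j1(x) / x)

for lam in (1.0, 0.1, 0.01):              # wavelengths in meters
    d = directivity(lam)
    level = d[np.searchsorted(theta, np.pi / 6)]
    print(f"lambda={lam} m: relative level at 30 deg = {level:.3f}")
```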

Page 21: Silent sound interface

METHODS OF PRODUCING:
• Electromyography
• Image processing

Page 22: Silent sound interface

ELECTROMYOGRAPHY: Electromyography (EMG) is a technique for evaluating and recording the electrical activity produced by skeletal muscles. EMG is performed using an instrument called an electromyograph to produce a record called an electromyogram. An electromyograph detects the electrical potential generated by muscle cells when these cells are electrically or neurologically activated.

Page 23: Silent sound interface

Electromyographic sensors attached to the face record the electrical signals produced by the facial muscles and compare them with prerecorded signal patterns of spoken words.

When there is a match, that sound is transmitted to the other end of the line, and the person at the other end hears the spoken words.
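A minimal sketch of that matching step: an incoming EMG feature vector is compared against prerecorded templates and the closest word, if close enough, is the one transmitted. The feature extraction and the templates themselves are hypothetical stand-ins:

```python
# EMG template matching sketch: nearest prerecorded pattern wins.
import numpy as np

# Hypothetical prerecorded EMG feature templates, one per word.
templates = {
    "yes": np.array([0.9, 0.1, 0.4]),
    "no": np.array([0.2, 0.8, 0.5]),
}

def match_word(features, threshold=0.5):
    """Return the closest template word, or None if nothing is close."""
    best_word, best_dist = None, np.inf
    for word, tpl in templates.items():
        dist = np.linalg.norm(features - tpl)  # Euclidean distance
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word if best_dist < threshold else None

print(match_word(np.array([0.85, 0.15, 0.45])))  # -> 'yes'
```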

Page 24: Silent sound interface

For such an interface, four kinds of transducers are used:
1. Vibration sensors
2. Pressure sensors
3. Electromagnetic sensors
4. Motion sensors

IMAGE PROCESSING:
• The simplest form of image processing converts the data tape into a film image with minimal corrections and calibrations.

Page 25: Silent sound interface

[Block diagram: digital image processing flow]

Digital data → Pre-processing → Feature extraction → Image enhancement / Selection of training data → Manual interpretation / Decision and classification (unsupervised or supervised, supported by ancillary data) → Classification output → Post-processing operation → Accuracy assessment → Maps and imagery, reports, data
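A hedged sketch of the classification fork in the diagram above: unsupervised clustering when no training data is selected, supervised classification when it is. The per-pixel features and labels are random stand-ins, not real imagery:

```python
# Unsupervised vs. supervised classification of image pixels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(0)
pixels = rng.random((1000, 3))             # hypothetical per-pixel features

# Unsupervised branch: group pixels into spectral clusters.
clusters = KMeans(n_clusters=4, n_init=10).fit_predict(pixels)

# Supervised branch: a few analyst-labeled training pixels classify the rest.
train_x = pixels[:100]
train_y = rng.integers(0, 4, 100)          # stand-in training labels
classes = NearestCentroid().fit(train_x, train_y).predict(pixels)
```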

Page 26: Silent sound interface

APPLICATIONS:

• As we know, in space there is no medium for sound to travel, so this technology can be best utilized by astronauts.
• We can make silent calls even while standing in a crowded place.
• This technology is helpful for people without vocal cords or those suffering from aphasia (a speaking disorder).
• This technology can be used for communication in noisy environments.
• Telling a secret PIN or credit card number over the phone becomes easy, as no one can eavesdrop anymore.
• Since the electrical signals are universal, they can be translated into any language before being sent to the other side. Hence the speech can be delivered in any language of choice, currently German, English and French.

Page 27: Silent sound interface

RESTRICTIONS:

• Translation into the majority of languages is possible, but in languages such as Chinese, different tones carry different meanings while the facial movements remain the same; hence this technology is difficult to apply in such situations.
• From a security point of view, recognizing who you are talking to becomes complicated.
• Differentiating between people and emotions cannot be done, which means you will always feel you are talking to a robot.
• The device presently needs nine leads to be attached to the face, which is quite impractical for everyday use.

Page 28: Silent sound interface

FUTURE PROSPECTS:

• Silent sound technology opens the way to a bright future for speech recognition: everything from simple voice commands to memoranda dictated over the phone becomes feasible in noisy public places.
• Future versions may work without electrodes hanging all around the face.
• It may have features like lip reading based on image recognition and processing rather than electromyography.
• Nanotechnology will be a notable step towards making the device handy.

Page 29: Silent sound interface

CONCLUSION: Engineers claim that the device works with 99 percent efficiency.

It is difficult to compare SSI technologies directly in a meaningful way. Since many of the systems are still preliminary, it would not make sense, for example, to compare speech recognition scores or synthesis quality at this stage.

With a few abstractions, however, it is possible to shed light on the range of applicability and the potential for future commercialization of the different methods.

Page 30: Silent sound interface

IN FICTION:

Page 31: Silent sound interface
Page 32: Silent sound interface