1180 Datasheet


Email: [email protected] or sunrom@gmail

Visit us at http://www.sunrom

Document: Datasheet
Date: 2-Oct-09
Model #: 1180
Product’s Page: www.sunrom.com/p-762.html

Speech Recognition System

The speech recognition system is a completely assembled and easy-to-use programmable speech recognition circuit. It is programmable in the sense that you train the words (or vocal utterances) you want the circuit to recognize. This board allows you to experiment with many facets of speech recognition technology. It has an 8-bit data output that can be interfaced with any microcontroller for further development. Possible interfacing applications include controlling home appliances, robotic movement, speech-assisted technologies, speech-to-text translation, and many more.

Features

• Self-contained stand alone speech recognition circuit

• User programmable

• Up to 20 word vocabulary, each word up to two seconds long

• Multi-lingual

• Non-volatile memory backup with an onboard 3 V battery, which keeps the speech recognition data in memory even after power-off

• Easily interfaced to control external circuits and appliances

Specification

Parameter | Value | Note

Input Voltage | 9 to 15 V DC | Use a commonly available 12 V, 500 mA DC adapter

Output Data | 8 bits at 5 V logic level | Any microcontroller such as an 8051, PIC or AVR can be interfaced to the data port to interpret the output and implement specialized applications

Applications

There are several areas for application of voice recognition technology.

• Speech controlled appliances and toys

• Speech assisted computer games

• Speech assisted virtual reality

• Telephone assistance systems

• Voice recognition security

• Speech to speech translation







Sunrom Technologies Your Source for Embedded Systems Visit us at www.sunrom.com


Introduction

Speech recognition will become the method of choice for controlling appliances, toys, tools and computers. At its most basic level, speech-controlled appliances and tools allow the user to perform parallel tasks (i.e. while hands and eyes are busy elsewhere) while working with the tool or appliance.

The heart of the circuit is the HM2007 speech recognition IC. The IC can recognize 20 words, each up to 1.92 seconds long.

Complete Schematic of System

[Schematic, sheet 1180 rev 1, "Speech Recognition", Sunrom Technologies, Wednesday, February 11, 2009. Components shown: HM2007 (U1) with a 3.579 MHz crystal (Y1); HY6264 SRAM (U4); 3 V battery (BT1) with BAT85 diodes (D2, D3); 74HC573 latch (U2); two CD4511B display decoders (U3, U5) with 220R segment resistors (R2, R3, R6 to R17); keypad switches SW1 to SW12, including the TRAIN and CLEAR keys; microphone (M1); READY LED (D1); DATA OUT connector (CN1, SIP10) carrying DB0 to DB7; LM7805 regulator (U6) fed from the 9-12V DC socket (CN2).]

Using the System

The keypad and digital display are used to communicate with and program the HM2007 chip. The keypad is made up of 12 normally open momentary contact switches. When the circuit is turned on, “00” is shown on the digital display, the red LED (READY) is lit and the circuit waits for a command.

Training Words for Recognition

Press “1” on the keypad (the display will show “01” and the LED will turn off), then press the TRAIN key (the LED will turn on) to place the circuit in training mode for word one. Say the target word clearly into the onboard microphone (near the LED). The circuit signals acceptance of the voice input by blinking the LED off then on. The word (or utterance) is now identified as the “01” word. If the LED did not flash, start over by pressing “1” and then the TRAIN key.

You may continue training new words in the circuit. Press “2” then TRAIN to train the second word, and so on. The circuit will accept and recognize up to 20 words (numbers 1 through 20). It is not necessary to train all word spaces. If you only require 10 target words, that’s all you need to train.

Testing Recognition:

Repeat a trained word into the microphone. The number of the word should be displayed on the digital display. For instance, if the word “directory” was trained as word number 20, saying the word “directory” into the microphone will cause the number 20 to be displayed.

Error Codes:

The chip provides the following error codes.

55 = word too long
66 = word too short
77 = no match

Clearing Memory

To erase all words in memory, press “99” and then “CLR”. The numbers will quickly scroll by on the digital display as the memory is erased.

Changing & Erasing Words

Trained words can easily be changed by overwriting the original word. For instance, suppose word six was the word “Capital” and you want to change it to the word “State”. Simply retrain the word space by pressing “6” then the TRAIN key and saying the word “State” into the microphone.

To erase a word without replacing it with another word, press the word number (in this case six) then press the CLR key. Word six is now erased.

Simulated Independent Recognition

The speech recognition system is speaker dependent, meaning that the voice that trained the system has the highest recognition accuracy. But you can simulate independent speech recognition. To make the recognition system simulate speaker independence, use more than one word space for each target word. Here we use four word spaces per target word, and so obtain four different enunciations of each target word. The word spaces 01, 02, 03 and 04 are allocated to the first target word. We continue to do this for the remaining word spaces. For instance, the second target word will use the word spaces 05, 06, 07 and 08. We continue in this manner until all the words are programmed.

If you are experimenting with speaker independence, use different people when training a target word. This will enable the system to recognize different voices, inflections and enunciations of the target word. The more system resources that are allocated for independent recognition, the more robust the circuit will become.

If you are experimenting with designing the most robust and accurate system possible, train target words using one voice with different inflections and enunciations of the target word.
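The four-spaces-per-word allocation described in this section means the host microcontroller has to fold several word spaces back into one target word. A minimal sketch of that mapping follows. The function name target_for_slot and the constants are illustrative and not part of the board's firmware, and the sketch assumes the word space number has already been converted to an ordinary integer from 1 to 20.

```c
#include <stdint.h>

#define SLOTS_PER_TARGET 4   /* assumption: four word spaces per target word, as in the text */
#define MAX_SLOT         20  /* the board stores up to 20 word spaces */

/* Map a recognized word space (1..20) to its target word (1..5).
 * Returns 0 for any value outside 1..20. */
uint8_t target_for_slot(uint8_t slot)
{
    if (slot < 1 || slot > MAX_SLOT) {
        return 0;
    }
    return (uint8_t)((slot - 1) / SLOTS_PER_TARGET + 1);
}
```

With this allocation, word spaces 01 to 04 all report target word 1 and word spaces 05 to 08 report target word 2, so a 20-space memory holds five target words.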







Homonyms

Homonyms are words that sound alike. For instance, the words cat, bat, sat and fat sound alike. Because of their like-sounding nature they can confuse the speech recognition circuit. When choosing target words for your system, do not use homonyms.

The Voice with Stress & Excitement

Stress and excitement alter one’s voice. This affects the accuracy of the circuit’s recognition. For instance, assume you are sitting at your workbench and you program target words like fire, left, right, forward, etc., into the circuit. Then you use the circuit to control a flight simulator game, Doom or Duke Nukem. When you’re playing the game you’ll likely be yelling “FIRE! …Fire! ...FIRE!! ...LEFT …go RIGHT!”. In the heat of the action your voice will sound much different than when you were sitting down relaxed and programming the circuit. To achieve higher word recognition accuracy, one needs to mimic the excitement in one’s voice when programming the circuit.

These factors should be kept in mind to achieve the highest accuracy possible from the circuit. This becomes increasingly important when the speech recognition circuit is taken out of the lab and put to work in the outside world.

Error Codes

When interfacing an external circuit through the data bus, the decoding circuit must distinguish word numbers from error codes. So the circuit must be designed to recognize error codes 55, 66 and 77 and not confuse them with word spaces 5, 6 and 7.
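One way to keep word numbers and error codes apart in firmware is to classify each byte before acting on it. The sketch below is illustrative only: it assumes the data bus carries the two display digits as packed BCD (tens in the high nibble, ones in the low nibble), which is what the sample code's comparisons against 0x55, 0x66 and 0x77 suggest. The names speech_kind and classify_speech_byte are hypothetical.

```c
#include <stdint.h>

typedef enum {
    SPEECH_WORD,       /* a trained word number, 1..20 */
    SPEECH_TOO_LONG,   /* error code 55 */
    SPEECH_TOO_SHORT,  /* error code 66 */
    SPEECH_NO_MATCH,   /* error code 77 */
    SPEECH_UNKNOWN     /* anything else, e.g. the idle "00" */
} speech_kind;

/* Classify one byte read from the data bus.
 * Assumption: the byte is packed BCD, so word 5 reads as 0x05 and
 * error 55 reads as 0x55. On SPEECH_WORD, *word receives 1..20. */
speech_kind classify_speech_byte(uint8_t raw, uint8_t *word)
{
    uint8_t tens = (uint8_t)(raw >> 4);
    uint8_t ones = (uint8_t)(raw & 0x0F);
    uint8_t value;

    /* Check the error codes first so 55 is never mistaken for word 5. */
    switch (raw) {
    case 0x55: return SPEECH_TOO_LONG;
    case 0x66: return SPEECH_TOO_SHORT;
    case 0x77: return SPEECH_NO_MATCH;
    default:   break;
    }

    if (tens > 9 || ones > 9) {
        return SPEECH_UNKNOWN;          /* not a valid BCD digit pair */
    }
    value = (uint8_t)(tens * 10 + ones);
    if (value >= 1 && value <= 20) {
        *word = value;
        return SPEECH_WORD;
    }
    return SPEECH_UNKNOWN;
}
```

Comparing the whole byte, rather than only the low digit, is what prevents error 55 from being treated as word space 5.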

Voice Security System

This circuit isn’t designed for a voice security system in a commercial application, but that should not prevent anyone from experimenting with it for that purpose. A common approach is to use three or four keywords that must be spoken and recognized in sequence in order to open a lock or allow entry.
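The keyword-sequence idea can be tried out with a small state machine on the host microcontroller. The following is a minimal sketch for experimentation, not a secure design. The voice_lock type and its functions are hypothetical names, and the word numbers fed to it are assumed to be already decoded integers.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *sequence;  /* expected word numbers, in order */
    size_t length;            /* number of keywords in the sequence */
    size_t position;          /* how many keywords have matched so far */
} voice_lock;

void voice_lock_init(voice_lock *lock, const uint8_t *sequence, size_t length)
{
    lock->sequence = sequence;
    lock->length = length;
    lock->position = 0;
}

/* Feed one recognized word number.
 * Returns 1 when the full sequence has just been completed, otherwise 0.
 * A wrong word resets progress, so the whole sequence must be repeated. */
int voice_lock_feed(voice_lock *lock, uint8_t word)
{
    if (lock->length == 0) {
        return 0;
    }
    if (word != lock->sequence[lock->position]) {
        lock->position = 0;
        return 0;
    }
    lock->position++;
    if (lock->position == lock->length) {
        lock->position = 0;   /* re-arm for the next attempt */
        return 1;
    }
    return 0;
}
```

For example, with the sequence 3, 7, 12 the lock opens only after those three words are recognized in that order with no other word in between.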

Aural Interfaces

It’s been found that mixing visual and aural information is not effective. Products that require visual confirmation of an aural command grossly reduce efficiency. To create an effective AUI, products need to understand (recognize) commands given in an unstructured and efficient manner, the way in which people typically communicate verbally.

Learning to Listen

The ability to listen to one person speak among several at a party is beyond the capabilities of today’s speech recognition systems. Speech recognition systems cannot (as of yet) separate and filter out what should be considered extraneous noise.

Speech recognition is not understanding speech. Understanding the meaning of words is a higher intellectual function. Because a circuit can respond to a vocal command doesn’t mean it understands the command spoken. In the future, voice recognition systems may have the ability to distinguish nuances of speech and meanings of words, to “Do what I mean, not what I say!”

Speaker Dependent / Speaker Independent

Speech recognition is divided into two broad processing categories: speaker dependent and speaker independent.







Speaker dependent systems are trained by the individual who will be using the system. These systems are capable of achieving a high command count and better than 95% accuracy for word recognition. The drawback to this approach is that the system responds accurately only to the individual who trained it. This is the most common approach employed in software for personal computers.

A speaker independent system is trained to respond to a word regardless of who speaks. Therefore the system must respond to a large variety of speech patterns, inflections and enunciations of the target word. The command word count is usually lower than for a speaker dependent system; however, high accuracy can still be maintained within processing limits. Industrial applications more often require speaker independent voice recognition systems.

Recognition Style

In addition to the speaker dependent/independent classification, speech recognition systems also differ in the style of speech they can recognize. There are three styles of speech: isolated, connected and continuous.

Isolated: Words are spoken separately or isolated. This is the most common speech recognition system available today. The user must pause between each word or command spoken.

Connected: This is a halfway point between isolated word and continuous speech recognition. It permits users to speak multiple words. The HM2007 can be set up to identify words or phrases 1.92 seconds in length. This reduces the word recognition dictionary number to 20.

Continuous: This is the natural conversational speech we use in everyday life. It is extremely difficult for a recognizer to sift through the sound as the words tend to merge together. For instance, "Hi, how are you doing?" sounds to a computer like "Hi,.howyadoin". Continuous speech recognition systems are on the market and are under continual development.

More On The HM2007 Chip

The HM2007 is a CMOS voice recognition LSI (Large Scale Integration) circuit. The chip contains an analog front end, voice analysis, regulation, and system control functions. The chip may be used in stand-alone or CPU-connected mode.

Features:

• Single chip voice recognition CMOS LSI

• Speaker dependent

• External RAM support

• Maximum 40 word recognition (0.96 second)

• Maximum word length 1.92 seconds (20 word)

• Microphone support

• Manual and CPU modes available

• Response time less than 300 milliseconds

• 5V power supply

More information on the HM2007 chip is available in the HM2007 data booklet (DS-HM2007), which can be downloaded below.

http://www.sunrom.com/files/HM2007.pdf






Interfacing external circuits through data bus

This sample project shows how a circuit can be interfaced through the data bus of the speech recognition circuit. It shows messages and error codes on an LCD. It also operates four relays as per data from the speech circuit.
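The relay behaviour of this project can be checked on a PC before it is moved to the microcontroller. The sketch below is a host-side illustration only, not the project's firmware: it keeps the four relay states in one bitmask and toggles the relay that matches a recognized word. The name toggle_relay and the bit assignment (bit 0 for relay 1) are assumptions made for the example.

```c
#include <stdint.h>

#define RELAY_COUNT 4  /* the demo project drives four relays */

/* Return the new relay bitmask after a recognized word.
 * Bit 0 stands for relay 1 ... bit 3 for relay 4 (illustrative mapping).
 * Words outside 1..4, including error codes, leave the mask unchanged. */
uint8_t toggle_relay(uint8_t relays, uint8_t word)
{
    if (word < 1 || word > RELAY_COUNT) {
        return relays;
    }
    return (uint8_t)(relays ^ (1u << (word - 1)));
}
```

Saying word 1 once switches relay 1 on, and recognizing it again switches relay 1 off, which matches the toggle logic in the sample code of this project.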

Schematic of interfacing project

[Schematic, sheet 1180A rev 1, "Demo Project of Speech Recognition", Sunrom Technologies, Thursday, February 26, 2009. Components shown: AT89S52 microcontroller (U3) with an 11.0592 crystal (Y1) and 33p capacitors (C13, C14); 16x2 LCD (U1) with a 50K display contrast preset (PR1); ULN2803 driver (U2); four relays (LS1 to LS4, signals RLY1 to RLY4) with PBT2 terminal connectors; LEDs with series resistors; 10K resistor array (RN1); SIP10 connector (CN6) carrying DB0 to DB7 from the speech board; LM7805 regulator (U6) fed from the 9-12V DC socket (CN2).]

Sample Code of interfacing project

//main.c

#include <REGX51.H> // standard 8051 defines

// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// -=-=-=-=- Include files -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
#include "lcd.h"
#include "utils.h"

// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// -=-=-=-=- Hardware Defines -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
sfr DATA = 0x80;  // Port 0 (P0): data bus from speech board
sbit OUT1 = P3^4; // outputs toggled by words 1 to 4
sbit OUT2 = P3^5;
sbit OUT3 = P3^6;
sbit OUT4 = P3^7;

// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// -=-=-=-=- Variables -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
char buf[20];
char code M1[] ="SPEECH: ONE";
char code M2[] ="SPEECH: TWO";
char code M3[] ="SPEECH: THREE";
char code M4[] ="SPEECH: FOUR";

// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// -=-=-=-=- Main Program -=-=-=-=-=-=-=
// -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
void main()
{
    unsigned char lastdata, datanow;

    // -=-=- Initialize outputs (all off) -=-=-=
    OUT1 = 0;
    OUT2 = 0;
    OUT3 = 0;
    OUT4 = 0;

    // -=-=- Initialise -=-=-=
    lcdInit();

    // -=-=- Welcome LCD Message -=-=-=
    lcdClear();
    lcdGotoXY(0,0); // 1st Line of LCD
    lcdPrint("Speech Test");
    lcdGotoXY(0,1); // 2nd Line of LCD
    lcdPrint("System");
    delayms(5000); // 5 sec
    lcdClear();
    lcdGotoXY(0,0); // 1st Line of LCD
    lcdPrint("Train: 1-4 key >");
    lcdGotoXY(0,1); // 2nd Line of LCD
    lcdPrint("Train>Speak Now");

    // -=-=- Program Loop -=-=-=
    lastdata=0xff;
    while(1)
    {
        datanow=DATA; // read data from speech board
        if(lastdata!=datanow) // if there is new data then,
        {
            lastdata=datanow;
            switch(lastdata)
            {
                case 0x55: // error: word too long
                    lcdClear();
                    lcdGotoXY(0,0); // 1st Line of LCD
                    lcdPrint("Speech too Long");
                    lcdGotoXY(0,1); // 2nd Line of LCD
                    lcdPrint("Try Again!");
                    break;
                case 0x66: // error: word too short
                    lcdClear();
                    lcdGotoXY(0,0); // 1st Line of LCD
                    lcdPrint("Speech too Short");
                    break;
                case 0x77: // error: no match
                    lcdClear();
                    lcdGotoXY(0,0); // 1st Line of LCD
                    lcdPrint("No Match");
                    break;
                case 0x01: // word 1: toggle output 1
                    if(OUT1==1)
                        OUT1 = 0;
                    else
                        OUT1 = 1;
                    lcdClear();
                    lcdGotoXY(0,0); // 1st Line of LCD
                    lcdPrint(M1);
                    break;
                case 0x02: // word 2: toggle output 2
                    if(OUT2==1)
                        OUT2 = 0;
                    else
                        OUT2 = 1;
                    lcdClear();
                    lcdGotoXY(0,0); // 1st Line of LCD
                    lcdPrint(M2);
                    break;
                case 0x03: // word 3: toggle output 3
                    if(OUT3==1)
                        OUT3 = 0;
                    else
                        OUT3 = 1;
                    lcdClear();
                    lcdGotoXY(0,0); // 1st Line of LCD
                    lcdPrint(M3);
                    break;
                case 0x04: // word 4: toggle output 4
                    if(OUT4==1)
                        OUT4 = 0;
                    else
                        OUT4 = 1;
                    lcdClear();
                    lcdGotoXY(0,0); // 1st Line of LCD
                    lcdPrint(M4);
                    break;
            }
        }
    }
}


