Identification of voices in disguised speech Jessica Clark* & Paul Foulkes** * University of York ** University of York & JP French Associates pf11@york.ac.uk
Dec 16, 2015

Identification of voices in disguised speech

Jessica Clark* & Paul Foulkes**

* University of York

** University of York & JP French Associates

pf11@york.ac.uk

IAFPA, Göteborg 2006


0.1 outline

• experiment to test ability of lay listeners to identify disguised familiar voices

• voices have been disguised artificially, as with commercially available voice changers
– pitch modified


0.2 structure

1. introduction
– rationale for experiment

2. experimental design
– speakers
– listeners
– Control condition
– Experimental conditions

3. results

4. discussion & conclusion


1. Introduction


1. Introduction

• technical speaker identification is the most frequent task for the forensic phonetician

• lay identification is also common in legal cases

• many previous studies have thus examined lay listeners’ ability to identify voices and the factors which affect their ability


1.1 previous studies

• identification is not automatic or flawless

• listeners can make errors even with highly familiar voices
– Ladefoged did not recognise his mother from a short sample (Ladefoged & Ladefoged 1980)
– flatmates scored only 68% with 10 second samples (Foulkes & Barron 2000)


1.1 previous studies

• identification may be affected by [Bull & Clifford 1984]
– type of exposure (active/passive)
– length of sample
– nature of sample (phone, direct, shouting etc)
– delay between exposure and test
– age of listener
– hearing ability
– sightedness
– natural variability across individual listeners
– specific features of voice
– degree of familiarity
– nature and extent of any disguise


1.2 degree of familiarity

• all things equal, more familiar voices are easier to identify

• e.g. Hollien, Majewski & Doherty (1982)
– listening tests with 10 male voices

listener group    N     % correct (normal condition)
familiar          10    98
trained           47    40
unfamiliar        14    27


1.3 disguise

• all things equal, disguised voices are harder to identify

• e.g. Hollien, Majewski & Doherty (1982)
– various forms of disguise used

listener group    N     % correct (normal)    % correct (disguised)
familiar          10    98                    79
trained           47    40                    21
unfamiliar        14    27                    18

machine approach (LTAS) 30


1.3 disguise

• previous studies have examined various types of disguise
– whisper, pencils between teeth, hypernasality, dialect change, rate change, professional mimics

• but little if any work on voice changers
– hardware based
– software based
– easily available


www.maplin.co.uk

www.crimebusters911.com

www.blazeaudio.com


1.3 disguise

• in our study we chose not to use real voice changers, in favour of total control over effects

• pitch shift chosen as a universal function


2. Experimental design


2.1 design outline

• simple design

• listeners asked to identify samples of familiar voices

• Control condition: unmodified stimuli

• 4 Experimental conditions: modified stimuli


2.1 design outline

• degree of familiarity known to affect rate of successful identification

• thus we trained listeners to identify a group of speakers
– controls degree of familiarity
– all listeners had exactly the same exposure in terms of length & quality of samples
– identification task carried out under same conditions


2.2 speakers

• 4 male speakers
– 16-18 years old

• taken from IViE corpus (Grabe, Post & Nolan 2001)
– Leeds dialect (nearest to York)
– reading text of Cinderella story

IViE speaker    Experimental name
JP              Edward
JW              Matthew
MD              Harry
RP              David


2.2 speakers

• training materials created for each speaker
– c. 90 seconds of Cinderella (302 words)
– edited out disfluencies, non-speech sounds, long pauses
– samples normalised for amplitude with Audacity 1.2.5


2.3 listeners

• 36 listeners
• variety of regional/social backgrounds
• York residents
• age range 19-55
• 10 male, 26 female


2.4 Control condition

• all 36 listeners
– 4 voices * 90 seconds = c. 6 minutes
– presented by PowerPoint with speakers’ names
– Toshiba laptop
– Aiwa A170 headphones
– individually in quiet room

1. training phase

2. break

3. listening test


2.4 Control condition

• all 36 listeners

– 10 minutes

1. training phase

2. break

3. listening test

2.4 Control condition

• all 36 listeners

– 8 stimuli (2 per speaker)
– duration c. 10 seconds
– 5 second gap between
– extracts from other parts of Cinderella story
– normalised for amplitude with Audacity 1.2.5
– answer sheet with names

1. training phase

2. break

3. listening test


2.5 Experimental conditions

• 4 Experimental conditions

• listening tests same format as Control condition

• but stimuli modified for pitch

• Sound Forge 8.0
– pitch shift effect
– accuracy setting ‘high’
– speech 1 mode
– preserved durations


2.5 Experimental conditions

(i) +8 semitones

(ii) +4 semitones

(iii) -4 semitones

(iv) -8 semitones

pitch shift > 8 semitones unnatural and partly incomprehensible
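For reference, a shift of n semitones corresponds to multiplying frequency by 2^(n/12) in equal temperament, so the four conditions scale pitch by roughly 0.63 (-8), 0.79 (-4), 1.26 (+4) and 1.59 (+8). The sketch below only illustrates that arithmetic; it does not reproduce Sound Forge's pitch shift algorithm, and the 120 Hz starting value is a hypothetical example, not a measurement of any speaker in the study.

```python
# Nominal frequency scaling for the four pitch-shift conditions.
# Assumption: equal-tempered semitones (12 per octave).


def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` semitones."""
    return 2.0 ** (semitones / 12.0)


def shifted_f0(f0_hz: float, semitones: float) -> float:
    """Fundamental frequency after a shift of `semitones` semitones."""
    return f0_hz * semitone_ratio(semitones)


CONDITIONS = (-8, -4, 4, 8)   # semitone shifts used in the experiment
EXAMPLE_F0_HZ = 120.0         # hypothetical F0, for illustration only

if __name__ == "__main__":
    for st in CONDITIONS:
        ratio = semitone_ratio(st)
        print(f"{st:+d} semitones: ratio {ratio:.3f}, "
              f"{EXAMPLE_F0_HZ:.0f} Hz -> {shifted_f0(EXAMPLE_F0_HZ, st):.1f} Hz")
```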


2.5 Experimental conditions

listener group    N     conditions (semitones)
A                 18    -8, +4
B                 18    -4, +8


2.5 Experimental conditions

• listening test 16-92 days after Control test
– no clear effects for length of delay

• same training as in Control condition
• 10 minute break
• 2 stimuli for familiarisation

• 8 experimental stimuli per condition
– consecutive runs for + and - stimuli
– order reversed for half of each group, but no effect


3. Results


3.1 Control condition

• average correct identification = 4.8/8 (60%)

[Figure: average N correct (0-8) by condition (Minus 8, Minus 4, Control, Plus 4, Plus 8)]


3.1 Control condition

• individuals’ range 8 to 0
• 29/36 performed better than chance

[Figure: Control condition, N correct (0-8) for each listener]
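The chance level can be made concrete under one simple assumption, which the slides do not spell out: a listener who guesses uniformly among the four names on each of the 8 stimuli, independently, would get 2 of 8 correct on average. The sketch below computes that baseline and the probability of reaching a given score by guessing alone.

```python
from math import comb

N_SPEAKERS = 4   # from the slides: four trained voices
N_STIMULI = 8    # from the slides: eight test stimuli per condition

# Assumption: pure guessing picks each name with probability 1/4.
P_GUESS = 1 / N_SPEAKERS

# Expected number correct for a listener who only guesses.
expected_by_chance = N_STIMULI * P_GUESS


def p_at_least(k: int, n: int = N_STIMULI, p: float = P_GUESS) -> float:
    """Binomial probability of k or more correct answers out of n by guessing."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    print(f"expected score by guessing: {expected_by_chance} / {N_STIMULI}")
    for k in range(N_STIMULI + 1):
        print(f"P(score >= {k}) = {p_at_least(k):.4f}")
```

On this reading, "better than chance" would mean a score above 2 out of 8; the authors' own criterion may differ.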


3.2 Experimental conditions

• ** sig. lower than in Control (p < .005, Wilcoxon)

• trend (n.s.) for higher scores in + conditions

[Figure: average N correct (0-8) by condition (Minus 8, Minus 4, Control, Plus 4, Plus 8), with ** significance markers]
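As a reminder of what the Wilcoxon comparison involves, the sketch below computes the signed-rank sums for paired scores (each listener's Control score against the same listener's score in one Experimental condition). It is a minimal illustration, not the authors' analysis: the function name is hypothetical, the scores in the demo are made up rather than taken from the study, and no p-value is computed (a statistics package such as SciPy's `scipy.stats.wilcoxon` would normally be used for that).

```python
def signed_rank_sums(control, experimental):
    """Return (W_plus, W_minus) for paired scores.

    Zero differences are dropped; tied absolute differences share
    the average of the ranks they occupy.
    """
    diffs = [c - e for c, e in zip(control, experimental) if c != e]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal absolute differences.
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    # Made-up scores out of 8 for five imaginary listeners.
    toy_control = [6, 5, 4, 7, 8]
    toy_shifted = [4, 3, 5, 7, 3]
    print(signed_rank_sums(toy_control, toy_shifted))
```

A large imbalance between the two sums indicates that scores moved consistently in one direction between the paired conditions.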


[Figure: four panels (-8, +8, +4 and -4 semitones), each showing N correct (0-8) for each listener]

• variability in listener performance, esp. ±4
• majority perform above chance except -8


3.3 variation by listener sex

• women sig. better in Control (p = .008, Mann-Whitney)
– trend (n.s.) maintained in Experimental tests
– same pattern reported by Bull & Clifford (1984)

[Figure: N correct (0-8) by condition (Minus 8, Minus 4, Control, Plus 4, Plus 8), Male vs Female listeners, with a ** significance marker]


3.4 summary

• as predicted, identification rates were lower with disguised voices
– lowest scores with most extreme form of disguise (±8 semitones)

• identification rates slightly better when pitch shifted up than down

• trend for women to perform better than men

• variability across listeners


4. Discussion & conclusion


4. discussion & conclusion

• tests reported here were not forensically realistic

• results may be affected by e.g.
– degree of familiarity with voice
– content of sample (vocabulary, syntax etc)
– conditions of exposure (stress etc)
– specific form of artificial disguise
  • software, hardware system
  • combination of effects


4. discussion & conclusion

• considerable variation in listeners’ scores
– courts should not assume all witnesses are equally good at such tasks
– supports broader principle that lay witnesses should be tested in their ability to identify a voice


4. discussion & conclusion

• but even marked disguise was not catastrophic for listeners

• a broadly positive conclusion for lay speaker identification
– a reasonable chance of identifying familiar voices


4. discussion & conclusion

• but a less positive conclusion with respect to the use of voice changers as a means of protecting vulnerable witnesses giving evidence

• more extreme forms of modification may affect intelligibility & naturalness

• less extreme forms of modification may render witness’s voice recognisable

• different modifications for different voices?


4. discussion & conclusion

• as ever…

• more work is needed


thanks

tack 

thanks to Peter French, Phil Harrison, Robin How


References

Bull, R. & Clifford, B. (1984) Earwitness voice recognition accuracy. In G. Wells & E. Loftus (eds.) Eyewitness Testimony: Psychological Perspectives. Cambridge: CUP. pp. 92-123.

Foulkes, P. & Barron, A. (2000) Telephone speaker recognition amongst members of a close social network. Forensic Linguistics 7: 181-198.

Grabe, E., Post, B. & Nolan, F. (2001) English intonation in the British Isles: the IViE corpus. Final report to UK ESRC R000 237145. www.phon.ox.ac.uk/IViE

Hollien, H., Majewski, W. & Doherty, E. (1982) Perceptual identification of voices under normal, stress and disguise speaking conditions. Journal of Phonetics 10: 139-148.

Ladefoged, P. & Ladefoged, J. (1980) The ability of listeners to identify voices. UCLA Working Papers in Phonetics 49: 43-51.