Forensic Speaker Recognition

Jan 23, 2022


Amy Neustein • Hemant A. Patil
Editors

Forensic Speaker Recognition

Law Enforcement and Counter-Terrorism


ISBN 978-1-4614-0262-6
e-ISBN 978-1-4614-0263-3
DOI 10.1007/978-1-4614-0263-3
Springer New York Dordrecht Heidelberg London

Library of Congress Control Number: 2011939216

© Springer Science+Business Media, LLC 2012

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Editors

Amy Neustein
Linguistic Technology Systems
800 Palisade Avenue, Suite 1809
Fort Lee, New Jersey 07024
[email protected]

Hemant A. Patil
Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT)
Near Indroda Circle
Room 4103, Faculty Block 4
Gandhinagar, Gujarat
[email protected]


Foreword

Collection and storage of speech has become a very common phenomenon in recent times, thanks to the availability of the necessary electronic devices such as microphones and memory. All personal computers, cell phones and the like come equipped with these devices. With literally billions of people over the world having mobile phones, audio records are rapidly getting built up, sometimes without the knowledge of the user. In fact, many business and financial transactions are carried out over the phone without any authenticating documentation, thus creating a host of new legal problems. However, if this new mode of business is in the future likely to replace (with some degree of regularity) the conventional signed paperwork, we will need a robust authentication method for voice.

Yet, even with the increased use of voice technology, it seems highly unlikely at the moment that courts of law will accept the current speaker recognition technology as forensic evidence on par with signed documents, fingerprints, or DNA. The reason is that compared to a fingerprint or DNA or even a handwritten signature, voice has far greater variability. In addition, fatigue, the common cold, and emotions, among other factors, can change the voice sample, sometimes beyond recognition. In fact, in everyday life we as humans can sometimes incorrectly identify a speaker, so imagine how difficult it is for a machine to consistently identify a speaker accurately. In spite of such limitations, which undoubtedly reduce the evidentiary weight of speaker identification and verification findings that are presented to the court, speaker recognition can still play a significant role as a prime investigative tool in criminal prosecutions.

One of the goals in bringing this science to a higher level of performance must be to broaden the field of speaker recognition, or more aptly “voice” recognition, so that the scientific work of identifying a speaker would effectively incorporate the speaker’s vocal tract characteristics into the identification process.

In addition to speaker recognition from good quality recordings, extensive research is needed to handle situations where the signal-to-noise ratio is poor, or where many persons are speaking at the same time (e.g., multi-speaker speech in a meeting or a conference). In truth, recording levels can fluctuate over a large range, and some parts of speech may even take the form of whispers. The latter may significantly alter voice characteristics, resulting in little or no voicing, a shift of the lower formants, and changes in energy and duration characteristics and in spectral slope, among other features.

One should not neglect the need for improved transcription either. Good transcription is important for improving the clarity of the record so that it can be listened to with ease. Swedish professor Anders Eriksson alerts readers to the importance of accurate transcription in his chapter titled “Aural/acoustic versus automatic methods in forensic phonetic case work”. What we learn from this is that while enhancement undoubtedly makes the voice sharp and clear for the purpose of transcription, some sounds, which perhaps are not voice-like at all, could be lost or distorted altogether, and these may be just the ones which hold vital clues for an important forensic investigation.

In the last analysis, forensic work requires a multi-disciplinary approach. We benefit from many years of research in developing digital signal processing techniques and in understanding the acoustics and aero-acoustics of the production of the speech (voice) signal. Moreover, as the signal processing does not have to be in real time and can be done repeatedly, a larger variety of approaches, particularly non-linear methods such as the Teager Energy Operator and computationally intensive methods (especially those involving searches of large databases), could find their place in the mainstream of forensic research.

The current volume brings together an excellent set of location markers, signposts and starting points for an interesting journey ahead.

Prof. S. C. Sahasrabudhe (IEEE Fellow), Director
Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT)
Gandhinagar, India


Preface

Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of thirty-five speaker recognition experts from around the world. The book provides a multidimensional look at the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crime. Given the serious consequences for the suspect, who may go to jail or even (in the most extreme cases involving terrorism or murder) face the death penalty, a rigorous and reliable science must be used in finding a match between a suspect’s voice and the speech samples collected in the forensic crime lab. The United States Supreme Court’s ruling in Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993) established a test for the legal admissibility of scientific evidence which requires that the theory and method upon which the evidence is based be testable, generally accepted, and peer reviewed, and, where applicable, have a known or potential error rate (for speaker recognition systems, this is commonly summarized by the equal error rate, EER). Similar standards for the validity and reliability of scientific evidence are used in other countries that have taken up the recommendations of the National Academy of Sciences (USA) and the Law Commission (UK).
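As an illustration of the equal error rate mentioned above, the following is a minimal sketch of one simple way an EER might be estimated from verification scores. The score lists and the function name `equal_error_rate` are hypothetical and are not taken from any chapter of this book; the sketch sweeps each observed score as a decision threshold and reports the point where the false acceptance and false rejection rates are closest.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER by trying every observed score as a threshold.

    genuine  -- scores from same-speaker comparisons (hypothetical data)
    impostor -- scores from different-speaker comparisons (hypothetical data)
    Returns (eer, threshold), where eer is the mean of the false acceptance
    and false rejection rates at the threshold where they are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: same-speaker trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: different-speaker trials at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Invented scores, for illustration only.
genuine_scores = [2.1, 1.7, 0.9, 2.5, 1.2]
impostor_scores = [-0.5, 0.3, 1.0, -1.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With only a handful of trials the estimate is coarse; evaluations on real corpora typically interpolate between operating points rather than picking the nearest observed threshold.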

Standards like these place a heavy burden on the expert who offers testimony in court. Each time forensic testimony is entered into evidence, the expert witness must prove ab initio that his or her science is reliable and valid in the case that is before the court before the testimony can properly qualify for admissibility. It follows that if forensic scientific methods are to be useful in legal contexts, they must hold up under judicial scrutiny, since in the end the question of admissibility will be decided by a trial judge.

Hence the consistent stress on bringing speaker authentication methods into line with the strict standards of legal admissibility is exactly what the reader will find in the work of this volume’s diverse group of forensic speech scientists, whether they work side by side with investigators in crime labs, provide services to private companies that specialize in the design of speaker verification systems, or teach in university settings where they study (among other things) the effects of speech signal degradation on the quality of forensic speech samples. Forensic speaker recognition, as a probative science, must competently assist criminal investigators in minimizing the occurrence of both a “false positive,” in which the speech sample related to the commission of a crime or terrorist act is matched to the wrong suspect, and a “false negative,” in which the real culprit’s voice fails to match the crime lab’s speech sample that originated from him or her.

Although divided into eighteen chapters, addressing such varied topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this book’s material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The book’s contributors are among the most eminent scientists in speech engineering and signal processing; their work represents the best to be found at universities, research institutes for police science, law enforcement agencies and speech companies, in such diverse countries as Switzerland, Sweden, Italy, France, Japan, India and the United States.

Forensic Speaker Recognition opens with an historical and procedural overview of forensic speaker recognition as a science. Following this is a fascinating exposition by Professor Andrzej Drygajlo of the Swiss Federal Institute of Technology in Lausanne, whose chapter focuses on “the research advances in forensic automatic speaker recognition (FASR), including data-driven tools and related methodology that provide a coherent way of quantifying and presenting recorded voice as biometric evidence.” Professor Drygajlo furnishes the reader with an in-depth discussion of the evaluation campaign of the European Network of Forensic Science Institutes (ENFSI), conducted through a fake (simulated) case organized by the Netherlands Forensic Institute, as an example where an automatic method using Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework was implemented for the forensic speaker recognition task.
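To make the pairing of GMMs with a Bayesian interpretation more concrete, here is a deliberately simplified sketch, not the method used in the ENFSI campaign or in Professor Drygajlo’s chapter. It assumes one-dimensional features and two hand-written mixture models (all parameter values are invented), and computes a log-likelihood ratio comparing the hypothesis that the questioned recording comes from the suspect against the hypothesis that it comes from another speaker in a reference population.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log density of a scalar x under a 1-D Gaussian mixture model."""
    terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp over the mixture components, for numerical stability.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_likelihood_ratio(features, suspect_gmm, population_gmm):
    """Sum over frames of log p(x | suspect) - log p(x | population).

    Treating frames as independent is a simplifying assumption.
    """
    return sum(
        gmm_log_likelihood(x, *suspect_gmm) - gmm_log_likelihood(x, *population_gmm)
        for x in features
    )


# Hypothetical models: (weights, means, variances). Values are invented.
suspect_gmm = ([0.6, 0.4], [1.0, 3.0], [0.5, 0.8])
population_gmm = ([0.5, 0.5], [0.0, 4.0], [2.0, 2.0])
questioned = [1.1, 0.8, 2.9, 1.3]  # invented 1-D "features"

llr = log_likelihood_ratio(questioned, suspect_gmm, population_gmm)
print(f"log likelihood ratio = {llr:.3f}")
```

A positive value indicates that the invented evidence is better explained by the suspect model than by the population model. Real systems work with multi-dimensional spectral features and models trained on data, and they calibrate the resulting scores before reporting them.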

This first section, aptly titled “Forensic Case Work,” is further enriched by the investigations of Swedish professor Anders Eriksson (of the University of Gothenburg) into the specific challenges of forensic case work. Drawing on a substantial number of investigations performed for the Swedish police, the author inspects in painstaking detail the differences between the aural/acoustic and the automatic methods in forensic case work, focusing on what works and what doesn’t in real-life settings. The section concludes with a fascinating study of speaker profiling, based on the characteristics associated with speaker dialect. Manisha Kulshreshtha, a Haskins Laboratories (Yale University) researcher, together with C. P. Singh of the Forensic Science Laboratory, Government of NCT of Delhi, and Professor R. M. Sharma of Punjab University, show that from a sample size of 210 speakers, acoustic features associated with lexical tone and sentence intonation, along with vowel quality and vowel duration, serve potentially to identify the speaker’s particular dialect. Where dialect is an important element of identification, this method helps investigators to appreciably narrow the pool of potential suspects to those who reside in the region where that particular dialect is spoken.


The second section of the book, titled “Speech Signal Degradation: Managing Problematic Conditions Affecting Probative Speech Samples,” devotes considerable attention to the stubborn problem of speech signal degradation that impedes the gathering of probative speech samples (that is, samples gathered for use in court processes) that are clear and audible. Since criminals, in their zeal to cover their tracks, often lower their voices even to a whisper, make calls from public places where there is loud noise in the background, or use VoIP networks, the quality of the voice recording is often poor. Thus, much of the speech data available for forensic analysis are degraded by several factors such as background noise, transmission and channel impairments, microphone variability, multi-party conversations, whispered speech, and VoIP artifacts. As a result, speech scientists, as part of their efforts to manage the problematic conditions affecting the quality of probative speech samples, must carefully isolate and measure the effects of all such factors on speech signal degradation.

The authors presented in this section have met that challenge head-on. They bring to the discussion the results of years of careful study of the effects of degraded speech on the performance of an automatic speaker recognition (ASR) system, concentrating on the following problems and, where available, their possible solutions:

1. speech under stress and the “Lombard Effect”;

2. the wide range of artifacts of VoIP (speech codec, packet loss, packet reordering, network jitter, foreign cross-talk or echo) and the effect of such artifacts on the performance of an ASR system;

3. session variability (“mismatched” environments for collection of speech samples) and the use of the non-linear modeling techniques of Teager Energy Operator-based Cepstral Coefficients (TEOCC) and amplitude modulation-frequency modulation (AM-FM) to improve speaker recognition in mismatched environments;

4. noisy environments and the use of speaker-specific prosodic features to improve speaker recognition;

5. noisy backgrounds and the use of various noise reduction filters (Noise Reduction, Noise Gate, Notch Filter, Bandpass, and Butterworth Filter) in enhancing the speech signal for speaker identification; and

6. whispered speech and the use of an algorithm for whispered speech detection as part of a seamless neutral/whisper mismatched closed-set speaker recognition system.
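The Teager Energy Operator named in item 3 can be sketched in a few lines. The discrete-time form used here, ψ[x(n)] = x(n)² − x(n−1)·x(n+1), is the standard definition; the test tone and variable names are illustrative assumptions, and the sketch does not attempt to reproduce the TEOCC feature pipeline described in the chapter.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[x(n)] = x(n)^2 - x(n-1) * x(n+1).

    Returns one value per interior sample, so the output is two samples
    shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input: a pure tone A*cos(omega*n). For such a signal the
# operator yields the constant A^2 * sin(omega)^2, so it tracks both
# amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]

energy = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
print(f"first value {energy[0]:.6f}, closed form {expected:.6f}")
```

Because the operator needs only three neighboring samples, it responds to rapid changes in the signal, which is one reason it is attractive for non-linear speech modeling.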

This section has far too many contributors to name each one individually. They include University of Texas Professor John H. L. Hansen, University of Minnesota Professor Keshab K. Parhi, Raghunath S. Holambe, professor at SGGS Institute of Engineering and Technology, Nanded, India, and Jiju P. V., Senior Scientific Officer at the Forensic Science Laboratory, Government of NCT of Delhi, among other distinguished speech signal experts.

The third section, titled “Methods and Strategies: Analyzing Features of Speaker Recognition to Optimize Voice Verification System Performance in Legal Settings,” presents the experimental research findings of some of the most innovative and forward-looking speech scientists who have isolated the important features of speaker recognition (some of which are appreciably less affected by signal degradation than others), and have carefully analyzed how such features may play an important role in improving forensic automatic speaker recognition.

The section begins with the experimental findings of Kanae Amino of the National Research Institute of Police Science in Japan (together with her research collaborators), showing that nasal sounds are effective for forensic speaker recognition despite the differences in speaker sets and recording channels. They show how “performance degradation caused by the channel difference, in this study of air- and bone-conduction … can be redressed by devising normalisation methods and acoustic parameters.”

Next, T. V. Ananthapadmanabha, CEO of Voice and Speech Systems in Bangalore, describes his careful studies of the volume-velocity airflow through the glottis (or the glottal airflow). In so doing, he has explored the significance of speech source characteristics by utilizing rigorous analytical results from the aerodynamic and acoustic theory of voice production. Much of this work was inspired by the author’s research collaboration with the late Professor Gunnar Fant at the Royal Institute of Technology, Stockholm in the early 1980s. “A good understanding of the theory guides one in appropriate modeling and interpretation of voice source,” he writes. In addition, Dr. Ananthapadmanabha contends that “habitually formed relative dynamic variations in voice source parameters are of greater significance in forensic speaker recognition.”

The section is further enhanced by the analytic insights of Leena Mary, professor at Rajiv Gandhi Institute of Technology, Kottayam, India, on the effectiveness of syllable-based prosodic features for speaker recognition. In her chapter, Professor Mary describes in painstaking detail a method for extracting prosodic features directly from the speech signal itself. “Applying this method,” she tells us, “speech is segmented into syllable-like regions using vowel onset points (VOP). The locations of VOPs (which entail Hilbert envelope of the linear prediction (LP) residual signal) serve as reference for extraction and representation of prosodic features.”

Significantly, Professor Mary deliberately chose to analyze prosody, which reflects the learned/acquired speaking habits of a person and therefore contributes to speaker recognition, inasmuch as prosodic features are less affected by channel mismatch and noise, which are common causes of speech signal degradation in probative speech samples. Thus, prosodic features are particularly well suited to speaker forensics, a field that demands accurate identification of suspects and therefore a minimum of obstacles to robust speaker recognition, such as those posed by channel transmission problems.

The section is rounded off by the study findings of C. Chandra Sekhar, professor at the Indian Institute of Technology (IIT), Chennai, India, and his graduate student assistant, A. D. Dileep. The authors meticulously show that when the performance of Intermediate Matching Kernel (IMK)-based Support Vector Machines (SVMs) is compared to that of state-of-the-art GMM-based approaches to speaker identification (using the 2002 and 2003 NIST speaker recognition corpora in evaluation of different approaches to speaker identification), the IMK-based SVMs performed significantly better than the GMM-based approaches for speaker identification tasks. From this comparison, the authors draw the conclusion that because IMK-based SVMs are well suited to the basic challenges of providing reliable scores for intra-speaker variation of suspects and for inter-speaker variation within a potential population, they can play an important role in serving the needs of law enforcement and counter-terrorism agencies in performing forensic speaker recognition.

The final section of the book, titled “Applications to Law Enforcement and Counter-Terrorism,” enlightens the reader about practical constraints in the use of forensic speaker recognition systems for the daily concerns of law enforcement and counter-terrorism agencies. The section begins with the research of V. Ramasubramanian, who serves as a senior member of the Siemens (Bangalore) technical staff, on automated telephony surveillance to detect if a person from a specific government watch-list is on the line at a given moment. As he points out:

[S]uch an automatic solution is of considerable interest in the context of homeland security, where a potentially large number of wire tapped conversations may have to be processed in parallel, in different deployment scenarios and demographic conditions, and with typically large watch-lists, all of which make manual lawful interception unmanageable, tedious and perhaps even impossible.

His chapter begins with the “basic framework for watch-list based speaker-spotting, namely, open-set speaker identification, subsequently refined into a ‘multi-target detection’ framework.” Dr. Ramasubramanian examines in detail “the main theoretical analysis available within the framework of multi-target identification, leading to performance predictions of such systems with respect to the watch-list size as the critical factor.”
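To give a feel for why watch-list size matters, the following is a rough sketch rather than the analysis presented in the chapter. The function and speaker names are hypothetical. The first function makes an open-set decision (report the best-scoring watch-list member only if the score clears a threshold); the second shows how the chance of at least one false alarm grows with the list size under the strong simplifying assumption that each target’s detector errs independently with the same probability.

```python
def spot_speaker(scores, threshold):
    """Open-set identification over a watch-list.

    scores -- mapping of watch-list member name to match score
              (names and values here are hypothetical)
    Returns the best-matching name, or None when even the best score
    falls below the threshold (the caller is judged not to be on the list).
    """
    name, best = max(scores.items(), key=lambda item: item[1])
    return name if best >= threshold else None


def watchlist_false_alarm(per_target_fa, watchlist_size):
    """P(at least one false alarm) for a non-target caller.

    Assumes independent, identically behaving per-target detectors,
    which is an illustrative simplification.
    """
    return 1.0 - (1.0 - per_target_fa) ** watchlist_size


call_scores = {"target_a": 0.2, "target_b": 1.4, "target_c": -0.7}
print(spot_speaker(call_scores, threshold=1.0))  # best score clears 1.0
print(spot_speaker(call_scores, threshold=2.0))  # nobody clears 2.0

for size in (1, 10, 100):
    print(size, round(watchlist_false_alarm(0.01, size), 3))
```

Even a 1% per-target false alarm rate compounds quickly as the list grows, which is consistent with the chapter’s emphasis on watch-list size as the critical factor.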

Taking an applications-oriented approach to forensic speaker recognition, he then outlines related speech topics (speaker change detection, speaker segmentation and speaker diarization) that can be useful in the design of automated telephony surveillance for border security and protecting critical infrastructure. These and other issues and concerns inhabit the broader context of homeland security. The author concludes with a summary of product-level solutions currently available in the context of surveillance and homeland security applications, while acknowledging the realistic challenges and limitations faced by automated speaker-spotting systems.

Next, Patrick Perrot of the Forensic Research Institute of the French Gendarmerie and Gérard Chollet of Telecom-Paris take up the fascinating topic of criminals who disguise their voices to hide their actual identity, sometimes even impersonating someone else. Such disguises typically occur when criminals make telephone threats, malicious calls, extortion or blackmail attempts, or terrorist demands. The authors point out that while

there are those cases when there are involuntary voice changes, as when there are alterations in voice characteristics due to poor transmission of telephonic communication … or even pathologies (both acute and chronic) that morph speech production … we limit this discussion to disguise which consists of a person who deliberately conceals his identity … as a means of misleading the human ear or even the automatic speaker recognition system.


Drs. Perrot and Chollet focus on specific voice characteristics to evaluate the recognition of a suspect’s voice in the presence of voice disguise. Their analyses of voice transformation are based both on an acoustic approach, which they use to measure specific changes in speech, and on an automatic approach, which is employed to detect voice disguise. The acoustic analysis of specific features reveals that the effect of the disguise on voice characteristics is dependent upon the kind of disguise that is used, while in the automatic experiment the authors performed, they found that parallel fusion and an SVM classifier provided the best results with a good level of discrimination.

The practical applications of these two French scientists’ work can be seen in the fact that a major part of their research into voice disguise has been devoted to the study of voice disguise reversibility. Their studies of voice disguise reversibility have revealed that

while it is not possible today to fully reverse a voice disguise in such a way that the resulting waveform would sound completely natural to a listener (mainly due to limitations with the quality of converted voice synthesis), our study demonstrates, nevertheless, that a disguised voice could be reversed to a relatively “normal” voice as evaluated by current state of the art speaker verification systems.

Thus, the authors see a more robust speaker recognition on a reverse-disguised voice—that is, a voice that has already been converted back from its disguised form to normal speech—as a future practical application of their research, as well as evaluation of the performance of speech applications in such contexts.

The last two chapters of the book are authored by speech experts at Nuance Communications and Loquendo. Chuck Buffum, Nuance’s Vice President, applies lessons learned from commercial voice biometric deployments to forensic applications, giving the reader a better understanding of the evolution of speaker verification systems in forensic settings. Mr. Buffum points out that “commercial deployments of voice biometrics have predictably focused primarily on automating the correct acceptance of true users for telephony self-service. However, over the past few years, a trend has developed within the financial institutions to begin using voice biometric technology to look for duplicate enrollments or to investigate suspicious transaction activity,” a trend that, he contends, “opens the discussion of bringing relevant techniques and experiences from commercial voice biometric deployments into the forensic voice biometric space.”

Avery Glasser, consulting architect for the Italian-based company Loquendo, closely examines the practical needs of anyone wishing to implement investigatory voice biometric technology, and how best to bridge the gap between creators and implementers of this technology. The author points out that “there are critical problems that only voice biometrics can solve, but getting the solutions well positioned requires a deep understanding of the nature of government implementations that seems to escape the grasp of too many vendors. The chapter,” according to Mr. Glasser’s exordium, “will explore a number of critical use cases and provide perspective on how technology creators can position their solutions to meet those needs.”


As the editors of this compendium, we have endeavored to bring together notable forensic speaker recognition experts who, by virtue of their meticulous research and keen attentiveness to the needs of law enforcement and counter-terrorism agencies, have both individually and collectively brought forensic automatic speaker recognition (FASR) technology to a new plane. It is our hope that this science will continue to evolve so that the admissibility of speaker recognition evidence will no longer present a Sisyphean challenge to prosecutors who have come to depend on voice verification systems to make a convincing case to the court about the identity of a criminal suspect.

Fort Lee, NJ, USA: Amy Neustein, Ph.D.
Gandhinagar, India: Hemant A. Patil, Ph.D.


Contents

Part I Forensic Case Work

1 Historical and Procedural Overview of Forensic Speaker Recognition as a Science
Kanae Amino, Takashi Osanai, Toshiaki Kamada, Hisanori Makinae and Takayuki Arai

2 Automatic Speaker Recognition for Forensic Case Assessment and Interpretation
Andrzej Drygajlo

3 Aural/Acoustic vs. Automatic Methods in Forensic Phonetic Case Work
Anders Eriksson

4 Speaker Profiling: The Study of Acoustic Characteristics Based on Phonetic Features of Hindi Dialects for Forensic Speaker Identification
Manisha Kulshreshtha, C. P. Singh and R. M. Sharma

Part II Speech Signal Degradation: Managing Problematic Conditions Affecting Probative Speech Samples

5 Speech Under Stress and Lombard Effect: Impact and Solutions for Forensic Speaker Recognition
John H. L. Hansen, Abhijeet Sangwan and Wooil Kim

6 Speaker Identification over Narrowband VoIP Networks
Hemant A. Patil, Aaron E. Cohen and Keshab K. Parhi


7 Noise Robust Speaker Identification: Using Nonlinear Modeling Techniques
Raghunath S. Holambe and Mangesh S. Deshpande

8 Robust Speaker Recognition in Noisy Environments: Using Dynamics of Speaker-Specific Prosody
Shashidhar G. Koolagudi, K. Sreenivasa Rao, Ramu Reddy, Vuppala Anil Kumar and Saswat Chakrabarti

9 Characterization of Noise Associated with Forensic Speech Samples
Jiju P. V., C. P. Singh and R. M. Sharma

10 Speech Processing for Robust Speaker Recognition: Analysis and Advancements for Whispered Speech
John H. L. Hansen, Chi Zhang and Xing Fan

Part III Methods and Strategies: Analyzing Features of Speaker Recognition to Optimize Voice Verification System Performance in Legal Settings

11 Effects of the Phonological Contents and Transmission Channels on Forensic Speaker Recognition
Kanae Amino, Takashi Osanai, Toshiaki Kamada, Hisanori Makinae and Takayuki Arai

12 Aerodynamic and Acoustic Theory of Voice Production
T. V. Ananthapadmanabha

13 Prosodic Features for Speaker Recognition
Leena Mary

14 Speaker Identification Using Intermediate Matching Kernel-Based Support Vector Machines
A. D. Dileep and C. Chandra Sekhar

Part IV Applications to Law Enforcement and Counter-Terrorism

15 Speaker Spotting: Automatic Telephony Surveillance for Homeland Security
V. Ramasubramanian

16 Helping the Forensic Research Institute of the French Gendarmerie to Identify a Suspect in the Presence of Voice Disguise or Voice Forgery
Patrick Perrot and Gérard Chollet


17 Applying Lessons Learned from Commercial Voice Biometric Deployments to Forensic Investigations
Chuck Buffum

18 Designing Better Speaker Verification Systems: Bridging the Gap between Creators and Implementers of Investigatory Voice Biometric Technologies
Avery Glasser

About the Editors

Index


Contributors

Kanae Amino, Ph.D. National Research Institute of Police Science, 6-3-1 Kashiwanoha, Kashiwa-shi, Chiba 277-0882, Japane-mail: [email protected]

T. V. Ananthapadmanabha, Ph.D. Voice and Speech Systems, 53, “Girinivas”, Temple Road, 13th Cross, Malleswaram, Bangalore 560003, Indiae-mail: [email protected], [email protected]

Takayuki Arai, Ph.D. Department of Electrical and Electronics Engineering, Sophia University, 7-1 Kioi-cho, Chiyoda-ku, Tokyo 102-8554, Japane-mail: [email protected]

Chuck Buffum, B.S. Nuance Communications, 1198 E. Arques Avenue, Sunnyvale, CA 94085, USAe-mail: [email protected]

Saswat Chakrabarti, Ph.D. G.S. Sanyal School of Telecommunications, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, Indiae-mail: [email protected]

Gérard Chollet, Ph.D. CNRS-LTCI, Telecom ParisTech, 46 rue Barrault, Paris 75013, Francee-mail: [email protected]

Aaron E. Cohen, Ph.D. Leanics Corporation, 1313 5t St. SE, Mail Unit 70, Minneapolis, MN 55414, USAe-mail: [email protected]

Mangesh S. Deshpande, M.E. Department of Electronics and Telecommunication Engineering, SRES’s College of Engineering, Kopargaon 423603, Maharashtra, India

A. D. Dileep, M. Tech. Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036, Tamilnadu, India

Andrzej Drygajlo, Ph.D. EPFL Speech Processing and Biometrics Group, UNIL School of Criminal Justice, Swiss Federal Institute of Technology Lausanne (EPFL), University of Lausanne (UNIL), Lausanne, Switzerland
e-mail: [email protected]

Anders Eriksson, Ph.D. Phonetics, Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg, Box 200, 40530 Gothenburg, Sweden
e-mail: [email protected]

Xing Fan, B.S.A. Center for Robust Speech Systems (CRSS), Department of Electrical Engineering, Erik Jonsson School of Engineering and Computer Science, The University of Texas at Dallas, Richardson, TX 75080-3021, USA

Avery Glasser, B.L.S. Loquendo S.p.A., a Telecom Italia Group Company, Via Arrigo Olivetti, 6, 10148 Torino, Italy
e-mail: [email protected]

John H. L. Hansen, Ph.D. Department of Electrical Engineering, Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering and Computer Science, The University of Texas at Dallas, Richardson, TX 75080-3021, USA
e-mail: [email protected]

Raghunath S. Holambe, Ph.D. Department of Instrumentation Engineering, SGGS Institute of Engineering and Technology, Nanded 431606, Maharashtra, India
e-mail: [email protected]

Toshiaki Kamada, B.E. National Research Institute of Police Science, 6-3-1 Kashiwanoha, Kashiwa-shi, Chiba 277-0882, Japan
e-mail: [email protected]

Wooil Kim, Ph.D. Department of Electrical Engineering, Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering and Computer Science, The University of Texas at Dallas, Richardson, TX 75080-3021, USA
e-mail: [email protected]

Shashidhar G. Koolagudi, M. Tech. School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
e-mail: [email protected]

Manisha Kulshreshtha, Ph.D. Haskins Laboratories, Yale University, 300 George St., Suite 900, New Haven, CT 06511, USA

Vuppala Anil Kumar, M. Tech. G.S. Sanyal School of Telecommunications, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
e-mail: [email protected]

Hisanori Makinae, Ph.D. National Research Institute of Police Science, 6-3-1 Kashiwanoha, Kashiwa-shi, Chiba 277-0882, Japan
e-mail: [email protected]

Leena Mary, Ph.D. Rajiv Gandhi Institute of Technology, Kottayam 686501, Kerala, India
e-mail: [email protected]

Takashi Osanai, Ph.D. National Research Institute of Police Science, 6-3-1 Kashiwanoha, Kashiwa-shi, Chiba 277-0882, Japan
e-mail: [email protected]

Keshab K. Parhi, Ph.D. Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA
e-mail: [email protected]

Hemant A. Patil, Ph.D. Dhirubhai Ambani Institute of Information and Communication Technology, DA-IICT, Gandhinagar, India
e-mail: [email protected]

Patrick Perrot, Ph.D. Gendarmerie Operational Unit, Gendarmerie Nationale, 3 rue de Dampierre, 17400 Saint Jean d’Angely, France
e-mail: [email protected]

Jiju P.V., M.Sc. Documents Division, Forensic Science Laboratory, Govt. of NCT of Delhi, Madhuban Chowk, Rohini, New Delhi 110085, India

V. Ramasubramanian, Ph.D. Siemens Corporate Research & Technologies—India, Bangalore 560100, India
e-mail: [email protected]

K. Sreenivasa Rao, Ph.D. School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
e-mail: [email protected]

Ramu Reddy, B. Tech. School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur 721302, West Bengal, India
e-mail: [email protected]

Abhijeet Sangwan, Ph.D. Department of Electrical Engineering, Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering and Computer Science, The University of Texas at Dallas, Richardson, TX 75080-3021, USA
e-mail: [email protected]

C. Chandra Sekhar, Ph.D. Speech and Vision Laboratory, Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036, Tamilnadu, India
e-mail: [email protected]

R. M. Sharma, Ph.D. Department of Forensic Science, Punjabi University, Patiala 147002, Punjab, India
e-mail: [email protected]

C. P. Singh, Ph.D. Physics Division, Forensic Science Laboratory, Government of NCT of Delhi, Madhuban Chowk, Rohini, New Delhi 110085, India

Chi Zhang, M.S.E.E. Center for Robust Speech Systems (CRSS), Department of Electrical Engineering, Erik Jonsson School of Engineering and Computer Science, The University of Texas at Dallas, Richardson, TX 75080-3021, USA
