MAJOR PROJECT FINAL PRESENTATION : TEXT PROMPTED REMOTE SPEAKER AUTHENTICATION Project Members: Ganesh Tiwari (75010) Madhav Pandey(75014) Manoj Shrestha(75018) Project Supervisor : Dr. Subarna Shakya Associate Professor Internal Examiner: Er. Manoj Ghimire External Examiner Er. Bimal Acharya Tribhuvan University Institute of Engineering Pulchowk Campus Department of Electronics and Computer Engineering
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System Final Presentation Slide
Joint speech and speaker recognition using a Hidden Markov Model with Vector Quantization (HMM/VQ) for speaker-independent speech recognition and a Gaussian Mixture Model (GMM) for text-independent speaker recognition. Features are MFCCs (Mel-Frequency Cepstral Coefficients) together with energy, delta and delta-delta terms, 39 coefficients in total. Developed in Java with a client/server architecture; the web interface was developed in Adobe Flex. This project was done at TU, IOE - Pulchowk Campus, Nepal. For more details visit http://ganeshtiwaridotcomdotnp.blogspot.com
ABSTRACT OF PROJECT
A biometric is a physical characteristic unique to each individual, which makes it very useful for authentication and access control. The designed system is a text-prompted version of voice biometrics. It combines a text-independent speaker verification system and a speaker-independent speech verification system, each implemented independently. The foundation for this joint system is that the speech signal conveys both the speech content and the speaker identity. Such systems are more secure against playback attacks, since the word to speak during authentication is not set in advance. During the course of the project, various digital signal processing and pattern classification algorithms were studied. Short-time spectral analysis was performed to obtain MFCCs, energy and their deltas as features. The feature extraction module is the same for both systems. Speaker modeling was done with a GMM, and a left-to-right discrete HMM with VQ was used for isolated word modeling. The results of both systems were combined to authenticate the user. The speech model for each word was pre-trained using utterances of 45 English words. The speaker model was trained on utterances of about 2 minutes each from 15 speakers. When individual words are uttered, the recognition rate of the speech recognition system is 92% and that of the speaker recognition system is 66%. For longer utterances (>5 sec), the recognition rate of the speaker recognition system improves to 78%.
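The abstract says the results of both systems are combined but does not state the combination rule. The following Java sketch assumes the simplest one: accept only when the recognized word equals the prompted word and the recognized speaker equals the claimed identity. The class name, method name and parameters are hypothetical and are not taken from the project code.

```java
import java.util.Locale;

/** Sketch of a joint decision: both subsystems must agree (assumed AND rule). */
final class JointDecision {

    /**
     * @param promptedWord      word the server asked the claimant to speak
     * @param recognizedWord    output of a (hypothetical) speech recognizer
     * @param claimedSpeaker    identity the user claims at login
     * @param recognizedSpeaker output of a (hypothetical) speaker recognizer
     * @return true only if the spoken text and the speaker both match
     */
    static boolean authenticate(String promptedWord, String recognizedWord,
                                String claimedSpeaker, String recognizedSpeaker) {
        // Compare words case-insensitively and ignore surrounding whitespace.
        String expected = promptedWord.trim().toLowerCase(Locale.ROOT);
        String actual = recognizedWord.trim().toLowerCase(Locale.ROOT);
        boolean textMatches = expected.equals(actual);

        // Speaker identifiers are compared exactly.
        boolean speakerMatches = claimedSpeaker.equals(recognizedSpeaker);

        return textMatches && speakerMatches;
    }
}
```

With this rule a recorded utterance of the right speaker fails whenever it does not contain the currently prompted word, and a correct word spoken by the wrong speaker fails as well.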
Transcript
INTRODUCTION
Voice biometric system
User login
Text-Prompted system
Claimant is asked to speak a prompted (random) text
Speech and Speaker Recognition
Why text-prompted?
Resists playback attacks: the word to speak is not set in advance
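A minimal Java sketch of the prompting step, assuming the server simply draws one word at random from its trained vocabulary for every login attempt. The class PromptPicker and its methods are hypothetical names used only for illustration.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;

/** Picks an unpredictable word from the vocabulary for each login attempt. */
final class PromptPicker {
    // SecureRandom makes the next prompt hard for an attacker to predict.
    private final SecureRandom random = new SecureRandom();
    private final List<String> vocabulary;

    PromptPicker(List<String> vocabulary) {
        if (vocabulary.isEmpty()) {
            throw new IllegalArgumentException("vocabulary must not be empty");
        }
        // Defensive copy so later changes by the caller do not affect prompts.
        this.vocabulary = new ArrayList<>(vocabulary);
    }

    /** Returns one word chosen uniformly at random from the vocabulary. */
    String nextPrompt() {
        return vocabulary.get(random.nextInt(vocabulary.size()));
    }
}
```

Because the prompt changes on every attempt, a recording made during one session is unlikely to contain the word requested in the next one.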
OUR SYSTEM
Feature: MFCC
Modeling and classification: both statistical
GMM: speaker modeling
HMM/VQ: speech modeling
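In an HMM/VQ setup, each feature vector is first mapped to the index of its nearest codeword so that a discrete HMM can work on a symbol sequence. The Java sketch below shows only that quantization step, assuming a codebook that has already been trained and squared Euclidean distance as the distortion measure. The class name and the tiny codebook in the usage note are illustrative, not the project's own.

```java
/** Maps feature vectors to the index of the nearest codeword (assumed Euclidean). */
final class VectorQuantizer {
    private final double[][] codebook; // [codeword][dimension], trained elsewhere

    VectorQuantizer(double[][] codebook) {
        if (codebook.length == 0) {
            throw new IllegalArgumentException("codebook must not be empty");
        }
        this.codebook = codebook;
    }

    /** Returns the index of the codeword closest to the given vector. */
    int quantize(double[] vector) {
        int best = 0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int k = 0; k < codebook.length; k++) {
            double distance = 0.0;
            for (int d = 0; d < vector.length; d++) {
                double diff = vector[d] - codebook[k][d];
                distance += diff * diff; // squared Euclidean distance
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = k;
            }
        }
        return best;
    }

    /** Converts a sequence of frames into the symbol sequence fed to a discrete HMM. */
    int[] quantizeAll(double[][] frames) {
        int[] symbols = new int[frames.length];
        for (int t = 0; t < frames.length; t++) {
            symbols[t] = quantize(frames[t]);
        }
        return symbols;
    }
}
```

For example, with the two-entry codebook {{0, 0}, {10, 10}}, the vector {1, 1} maps to symbol 0 and the vector {9, 8} maps to symbol 1.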
PROPERTIES OF SPEECH SIGNAL
Carries both Speech Content and Speaker identity
What makes a speech signal unique?
Each phoneme resonates at its own fundamental frequency
and its harmonics
Studied over a short period: short-time spectral analysis
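Short-time analysis starts by cutting the signal into short overlapping frames and tapering each frame with a window before its spectrum is computed. The Java sketch below assumes a Hamming window; the frame length and hop size are caller-supplied parameters because the slides do not state the values used in the project.

```java
/** Splits a signal into overlapping frames and applies a Hamming window. */
final class Framer {

    /**
     * @param signal      audio samples
     * @param frameLength samples per frame (assumed parameter, must be at least 2)
     * @param hop         samples between the starts of consecutive frames
     * @return windowed frames; empty if the signal is shorter than one frame
     */
    static double[][] frames(double[] signal, int frameLength, int hop) {
        if (frameLength < 2 || hop < 1) {
            throw new IllegalArgumentException("frameLength >= 2 and hop >= 1 required");
        }
        if (signal.length < frameLength) {
            return new double[0][];
        }
        // Only complete frames are kept; trailing samples are dropped.
        int count = 1 + (signal.length - frameLength) / hop;
        double[][] out = new double[count][frameLength];
        for (int i = 0; i < count; i++) {
            int start = i * hop;
            for (int n = 0; n < frameLength; n++) {
                // Hamming window: 0.54 - 0.46 * cos(2*pi*n / (N - 1))
                double w = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * n / (frameLength - 1));
                out[i][n] = signal[start + n] * w;
            }
        }
        return out;
    }
}
```

Each returned frame would then go through the spectral steps that produce the MFCC features.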
What is speaker-dependent information?
Fundamental frequency, primarily
a function of the dimensions and tension of the vocal cords
size and shape of the mouth, throat, nose, and teeth
Studied over a long period: all the variations from that speaker
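One way to use many frames from the same speaker is to average the per-frame log-likelihood under that speaker's GMM, so the score reflects the whole utterance rather than a single frame. The Java sketch below assumes diagonal covariance matrices, which the slides do not specify, and parameters that were trained elsewhere. The class name is hypothetical.

```java
/** GMM with diagonal covariances (an assumption) used to score feature frames. */
final class DiagonalGmm {
    private final double[] weights;     // mixture weights, expected to sum to 1
    private final double[][] means;     // [component][dimension]
    private final double[][] variances; // [component][dimension]

    DiagonalGmm(double[] weights, double[][] means, double[][] variances) {
        this.weights = weights;
        this.means = means;
        this.variances = variances;
    }

    /** Log-likelihood of one frame, computed with the log-sum-exp trick. */
    double logLikelihood(double[] x) {
        double[] logTerms = new double[weights.length];
        double max = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < weights.length; k++) {
            double logGauss = 0.0;
            for (int d = 0; d < x.length; d++) {
                double diff = x[d] - means[k][d];
                double var = variances[k][d];
                logGauss += -0.5 * (Math.log(2.0 * Math.PI * var) + diff * diff / var);
            }
            logTerms[k] = Math.log(weights[k]) + logGauss;
            max = Math.max(max, logTerms[k]);
        }
        double sum = 0.0;
        for (double term : logTerms) {
            sum += Math.exp(term - max); // subtract max to avoid underflow
        }
        return max + Math.log(sum);
    }

    /** Average log-likelihood over all frames of an utterance. */
    double averageLogLikelihood(double[][] frames) {
        if (frames.length == 0) {
            throw new IllegalArgumentException("at least one frame is required");
        }
        double total = 0.0;
        for (double[] frame : frames) {
            total += logLikelihood(frame);
        }
        return total / frames.length;
    }
}
```

In a recognition setting the utterance would be scored against each enrolled speaker's model and the highest average taken as the recognized speaker. Averaging over more frames makes the score less sensitive to any single frame, which is in line with the better speaker recognition rate reported for longer utterances.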