Detection of Alertness Based on Analysis of Speech Signal. Pulak Sarangi, Ojaswa Anand, Induja Sreekant, Bibek Kabi. Under the Guidance of Prof. Aurobinda Routray, Department of Electrical Engineering, Indian Institute of Technology Kharagpur
Jul 18, 2015
Objectives
• Design and develop a system capable of detecting the alertness of a person by analyzing the speech signal
• Implementation on GPU
• Implementation on the STM32E development board
• Implementation as an app on Android 4.2 (Jelly Bean, target API 17)
Work Plan
• Week 1: Literature survey
• Week 2: Formulation of algorithm
• Week 3: Algorithm testing in MATLAB
• Week 4: Conversion of MATLAB code to C code; conversion of MATLAB code to Java code
• Week 5: Implementation on GPU; implementation on STM32E; implementation on Android platform
1. Model As Implemented in MATLAB and C/C++
[Flowchart] Blocks as labeled in the figure:
• Recording
• Formation of Hankel Matrix
• Noise Removal using SVD
• De-framing for Enhanced Signal
• Framing & Windowing
• Extraction of Wavelet Features
• Selection of Wavelet Features
• Classification of voiced/silence parts based on energy
Signal labels: S(n), Enhanced Speech
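The Hankel matrix / SVD noise-removal blocks can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the slides: the function names, the choice of `rows` and `rank`, reading "de-framing" as anti-diagonal averaging of the low-rank matrix, and the use of power iteration with deflation as a stand-in for a full SVD routine (it keeps the example dependency-free).

```python
import math
import random

def hankel(frame, rows):
    """Hankel matrix with H[i][j] = frame[i + j] (hypothetical helper)."""
    cols = len(frame) - rows + 1
    return [[frame[i + j] for j in range(cols)] for i in range(rows)]

def dominant_right_vector(M, iters=200, seed=0):
    """Power iteration on M^T M; returns a unit vector, or None if M is ~0."""
    rng = random.Random(seed)
    rows, cols = len(M), len(M[0])
    v = [rng.uniform(-1.0, 1.0) for _ in range(cols)]
    for _ in range(iters):
        Mv = [sum(m * x for m, x in zip(row, v)) for row in M]
        w = [sum(M[i][j] * Mv[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            return None
        v = [x / norm for x in w]
    return v

def low_rank(H, rank):
    """Rank-limited approximation of H built by repeated deflation."""
    rows, cols = len(H), len(H[0])
    residual = [row[:] for row in H]
    approx = [[0.0] * cols for _ in range(rows)]
    for _ in range(rank):
        v = dominant_right_vector(residual)
        if v is None:
            break
        Rv = [sum(r * x for r, x in zip(row, v)) for row in residual]
        for i in range(rows):
            for j in range(cols):
                term = Rv[i] * v[j]      # component of the residual along v
                approx[i][j] += term
                residual[i][j] -= term
    return approx

def deframe(A):
    """Average each anti-diagonal to map the matrix back to a 1-D signal."""
    rows, cols = len(A), len(A[0])
    total = [0.0] * (rows + cols - 1)
    count = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            total[i + j] += A[i][j]
            count[i + j] += 1
    return [t / c for t, c in zip(total, count)]

def denoise_frame(frame, rows, rank):
    """Hankel matrix -> low-rank approximation -> enhanced frame."""
    return deframe(low_rank(hankel(frame, rows), rank))
```

A pure sinusoid gives a rank-2 Hankel matrix, so `denoise_frame(frame, rows, 2)` should return it essentially unchanged, while components outside the retained subspace are discarded. How many components to keep for speech is not stated in the slides.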
[Flowchart] Energy-based classification of the enhanced speech:
• Segmentation of speech signal into overlapping samples
• 6-level decomposition of signal using Daubechies wavelet
• Computation of ratio of 62.5-1000 Hz energy to the total energy, E(i)
• Comparison of E(i) with threshold: Voiced (> 0.8), Silence (< 0.3)
• E(i) input cases: single segment with same pre & post segment; series segment with same pre & post segment; single or series with different pre & post segment
• Classification
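The threshold comparison and the pre/post-segment cases can be sketched as follows. This is an illustration under assumptions: the slides give only the two thresholds (0.8 and 0.3) and name the three neighbour cases, so the rule used here (an undecided single segment or series takes the label of its neighbours when the pre and post segments agree, and stays undecided otherwise) is a guess, and all names are hypothetical.

```python
VOICED, SILENCE, UNDECIDED = "voiced", "silence", "undecided"

def initial_label(e, high=0.8, low=0.3):
    """Label one segment from its energy ratio E(i)."""
    if e > high:
        return VOICED
    if e < low:
        return SILENCE
    return UNDECIDED

def classify(ratios, high=0.8, low=0.3):
    """Threshold each E(i), then resolve undecided runs from their neighbours."""
    labels = [initial_label(e, high, low) for e in ratios]
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] != UNDECIDED:
            i += 1
            continue
        j = i
        while j < n and labels[j] == UNDECIDED:
            j += 1                       # j is one past the undecided run
        pre = labels[i - 1] if i > 0 else None
        post = labels[j] if j < n else None
        if pre is not None and pre == post:
            for k in range(i, j):        # same pre & post segment: adopt it
                labels[k] = pre
        # Assumption: different (or missing) pre/post leaves the run undecided.
        i = j
    return labels
```

For example, `classify([0.9, 0.5, 0.95])` labels the middle segment as voiced because both neighbours are voiced.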
PROGRESS
• Fully functional MATLAB and C/C++ code
• Fully functional Java code
• Literature survey for implementation of the C/Java code on the embedded and Android platforms and on GPU
2. Model As Implemented in MATLAB and C/C++
[Flowchart] Blocks as labeled in the figure:
• Recording
• Formation of Hankel Matrix
• Noise Removal using SVD
• De-framing for Enhanced Signal
• Framing & Windowing
• Feature Extraction (MFCCs, LPCCs)
• Classification of voiced/silence parts based on Generalized Eigenvalue
Signal label: S(n)
Observation
• After feature extraction, the covariance of the features was used instead of independent statistical properties such as mean, standard deviation, and kurtosis, which made processing much faster.
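As an illustration of the covariance property mentioned in the observation, the sketch below computes a sample covariance matrix from a list of per-frame feature vectors (for example MFCCs). The function name, the input layout (one row per frame), and the unbiased N-1 normalization are assumptions; the slides do not say how the covariance was formed or how it feeds the generalized eigenvalue step.

```python
def covariance_matrix(frames):
    """Sample covariance of feature vectors; frames is a list of equal-length rows."""
    n = len(frames)
    if n < 2:
        raise ValueError("need at least two frames")
    dim = len(frames[0])
    mean = [sum(row[d] for row in frames) / n for d in range(dim)]
    centered = [[row[d] - mean[d] for d in range(dim)] for row in frames]
    # cov[a][b] = sum over frames of centered[a] * centered[b], divided by n - 1
    return [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(dim)]
        for a in range(dim)
    ]
```

A single matrix of this kind summarizes how the features vary together across frames, in contrast to computing mean, standard deviation, and kurtosis for each feature separately.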
3. Model As Implemented in MATLAB and C/C++
[Flowchart] Blocks as labeled in the figure:
• Recording
• Formation of Hankel Matrix
• Noise Removal using SVD
• De-framing for Enhanced Signal
• Framing & Windowing
• Feature Extraction (MFCCs, LPCCs)
• Classification of voiced/silence parts by GMM, SVM classifier
Signal label: S(n)
PLAN FOR FURTHER WORK
• Implementation on GPU
• Implementation on STM32E development board
• Implementation as Android app for Android 4.2 (API 17, Jelly Bean)
• Comparison of results with other algorithms
Voiced and Unvoiced Sounds
• Fundamental difference:
o Vibrations of the vocal cords produce voiced sounds.
o The rate at which the vocal cords vibrate dictates the pitch of the sound.
o Unvoiced sounds do not rely on the vibration of the vocal cords.
o Unvoiced sounds are created by the constriction of the vocal tract.
o The vocal cords remain open and the constrictions of the vocal tract force air out to produce the unvoiced sounds.
• The fundamental frequency of voiced segments ranges from 60 to 500 Hz.
• The ratio of the energy in the bands between 62.5 Hz and 1000 Hz to the energy of all bands is computed and used in our algorithm as the fundamental parameter in formulating the voiced/unvoiced (V/UV) decision.
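The band-energy ratio can be sketched with a 6-level wavelet decomposition. Assumptions, none of which are stated in the slides: a sampling rate of 8000 Hz (under which the level 3 to 6 detail bands cover roughly 62.5-1000 Hz), the Haar wavelet (db1, the simplest Daubechies member) in place of whichever Daubechies order was actually used, and a segment length that is a multiple of 64. Function names are hypothetical.

```python
import math

def haar_step(x):
    """One level of the Haar (db1) DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    half = len(x) // 2
    approx = [(x[2 * k] + x[2 * k + 1]) / s for k in range(half)]
    detail = [(x[2 * k] - x[2 * k + 1]) / s for k in range(half)]
    return approx, detail

def band_energy_ratio(segment, levels=6, band_levels=(3, 4, 5, 6)):
    """Ratio of detail energy at band_levels to the total energy of the segment.

    With an assumed 8 kHz sampling rate, detail level j spans about
    4000 / 2**j to 8000 / 2**j Hz, so levels 3-6 cover 62.5-1000 Hz.
    """
    if len(segment) % (2 ** levels) != 0:
        raise ValueError("segment length must be a multiple of 2**levels")
    approx = list(segment)
    detail_energy = {}
    for level in range(1, levels + 1):
        approx, detail = haar_step(approx)
        detail_energy[level] = sum(d * d for d in detail)
    total = sum(a * a for a in approx) + sum(detail_energy.values())
    if total == 0.0:
        return 0.0                       # all-zero segment: no energy anywhere
    return sum(detail_energy[l] for l in band_levels) / total
```

Under these assumptions a low-frequency tone (for example 250 Hz) yields a ratio close to 1, while a 3000 Hz tone yields a small ratio. Haar filters leak noticeably between bands, so a higher-order Daubechies wavelet would separate the bands more cleanly.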