Voice Impairments Detection Library v.2.1.3.283
Issues
Concurrent Operation
alternative 1
alternative 2
List of Functions
CSST_SDK_API bool CSST_InitLib(void);
CSST_SDK_API void CSST_ReleaseLib(void);
CSST_SDK_API const wchar_t* CSST_GetVersion(void);
CSST_SDK_API CSContext * CSST_CreateProcessor(ESContextsTypes aCType);
CSST_SDK_API void ReleaseProcessor(CSContext * aPProcessor);
CSST_SDK_API long CSST_PutSound(CSContext * aPProcessor, short * aPSamples, long aNSamples);
CSST_SDK_API int CSST_GetFrameSize(CSContext * aPProcessor);
CSST_SDK_API long CSST_GetSampleRate(CSContext * aPProcessor);
CSST_SDK_API long CSST_GetNChannels(CSContext * aPProcessor);
CSST_SDK_API void CSST_SetSampleRate(CSContext * aPProcessor, long aSmplRate);
CSST_SDK_API void CSST_SetNChannels(CSContext * aPProcessor, long aNChannels);
CSST_SDK_API TSResult CSST_GetResult(CSContext * aPProcessor, long aChannel);
int GetLastErrorCode(void);
Usage Examples
Copyright © Sevana Ltd, 2012
Sevana Oy
Agricolankatu 11
00530 Helsinki
Finland
Phone: +358 9 2316 4165
Sevana Oü
Rohtlaane 12
76911 Huuru kula
Estonia (Harjumaa)
Phone: +372 53485178
Signal Noise Ratio Calculator (SNR)
Clipping detection
Echo detection
Sample code of the library usage
Sample source code
Delivery:
Compilation
The Voice Impairments Detection Library passively detects impairments in a speech signal that degrade
perceived voice quality. The library is built on digital signal processing algorithms packaged as
separate processors, which the user accesses through unified virtual classes with identical interfaces.
Issues
Concurrent Operation
alternative 1
Results ProcessSamples(List<short> samples)
{
    for (int n = 0; n < samples.size(); n++)
        processSample(samples.at(n));
    return finalAnalysisResult();
}
alternative 2
// Add decoded audio data of single RTP packet ...
void AddSamples(List<short> samples)
{
    for (int n = 0; n < samples.size(); n++)
        processSample(samples.at(n));
}
.
. // repeat several times
.
// Add decoded audio data of single RTP packet ...
void AddSamples(List<short> samples)
{
    for (int n = 0; n < samples.size(); n++)
        processSample(samples.at(n));
}
// Get results of waveform analysis
Results GetAnalysisResult()
{
    return finalAnalysisResult();
}
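The two alternatives differ only in whether the caller hands over the whole waveform at once or feeds it packet by packet and asks for the result later. A minimal C++ sketch of both styles follows. The class name StreamAnalyzer is hypothetical, and the "analysis" is only a peak absolute amplitude chosen to keep the sketch short; it is not what the library computes.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a processor that accepts audio incrementally.
// The placeholder analysis tracks the peak absolute amplitude seen so far.
class StreamAnalyzer {
public:
    // Alternative 2: add the decoded audio of a single RTP packet.
    void AddSamples(const std::vector<short>& samples) {
        for (short s : samples) processSample(s);
    }

    // Alternative 2: read the result accumulated so far.
    int GetAnalysisResult() const { return peak_; }

    // Alternative 1: process a complete waveform in one call.
    int ProcessSamples(const std::vector<short>& samples) {
        AddSamples(samples);
        return GetAnalysisResult();
    }

private:
    void processSample(short s) {
        const int a = std::abs(static_cast<int>(s));
        if (a > peak_) peak_ = a;
    }

    int peak_ = 0;
};
```

Both styles give the same answer for the same audio; the incremental form simply avoids buffering a whole call before analysis starts.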
List of Functions
The following functions are used to initialize and release the library:
CSST_SDK_API bool CSST_InitLib(void);
Initializes (loads) the library. Call this function before any other library function. It returns true if the
library was loaded successfully and false otherwise.
CSST_SDK_API void CSST_ReleaseLib(void);
Releases the library. Call this function when you have finished working with the library.
CSST_SDK_API const wchar_t* CSST_GetVersion(void);
Returns the library version string.
The following functions are used to work with processors:
CSST_SDK_API CSContext * CSST_CreateProcessor(ESContextsTypes aCType);
- creates a processor; the processor type is selected by the aCType parameter. ESContextsTypes lists the
possible processors, shown in the table below:
Identifier             Description
esctSNRCalculator      SNR calculation
esctClippingDetector   Clipping impairment detection
esctEchoDetector       Echo impairment detection
esctClickDetector      Clicking detection
esctStuckDetector      Stuck impairment detection
esctUnknownDetector    Unknown processor
On success the function returns a pointer to the unified processor. On error it returns NULL, and the error
code can be obtained with GetLastErrorCode.
CSST_SDK_API void ReleaseProcessor(CSContext * aPProcessor);
- removes processor and all associated data from memory.
CSST_SDK_API long CSST_PutSound(CSContext * aPProcessor, short * aPSamples, long aNSamples);
- adds sound data to processor aPProcessor. aPSamples is an array of signal samples and aNSamples is the
number of samples in it. For multichannel sound the samples are placed into aPSamples in the following order:
aPSamples[0] = sample 0 channel 1
aPSamples[1] = sample 0 channel 2
...
aPSamples[N-1] = sample 0 channel N
aPSamples[N] = sample 1 channel 1
aPSamples[N+1] = sample 1 channel 2
...
aPSamples[aNSamples-1] = sample M-1 channel N
Channel numbering starts with 1.
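The layout above is ordinary sample interleaving. As an illustration, the sketch below builds such a buffer from separate per-channel arrays. InterleaveChannels is a hypothetical helper and not part of the library; it assumes all channels have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Interleaves per-channel sample arrays into the layout described above:
// out[i * N + c] holds sample i of channel c + 1 (channels are 1-based).
// Hypothetical helper, not part of the library.
std::vector<short> InterleaveChannels(const std::vector<std::vector<short>>& channels) {
    if (channels.empty()) return {};
    const std::size_t nChannels = channels.size();
    const std::size_t nFrames = channels[0].size();
    for (const auto& ch : channels)
        if (ch.size() != nFrames)
            throw std::invalid_argument("all channels must have the same length");

    std::vector<short> out(nChannels * nFrames);
    for (std::size_t i = 0; i < nFrames; ++i)
        for (std::size_t c = 0; c < nChannels; ++c)
            out[i * nChannels + c] = channels[c][i];
    return out;
}
```

The size of the returned vector is the total sample count over all channels, which is the value the layout above implies for aNSamples.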
CSST_SDK_API int CSST_GetFrameSize(CSContext * aPProcessor);
- returns the size of the working data buffer of aPProcessor in samples. The buffer size is the minimal number
of samples the processor needs to perform analysis. CSST_PutSound has only one requirement: the number of input
samples must be a multiple of the number of channels; it does not have to match the frame size. Processors
split data into buffers automatically, but passing data frame by frame lets the user receive results for
every frame.
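Feeding data frame by frame can be sketched as a small chunking loop. In the sketch below the callback put stands in for a call such as CSST_PutSound followed by CSST_GetResult; FeedByFrames and its parameters are hypothetical names. Requiring the frame size to be a multiple of the channel count is an assumption made here so that every block satisfies the multiple-of-channels rule.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Feeds an interleaved buffer to `put` in blocks of at most `frameSize`
// samples. Returns false if the input breaks the multiple-of-channels rule.
// Hypothetical helper; `put` is a placeholder for the real library calls.
bool FeedByFrames(const std::vector<short>& samples, std::size_t frameSize,
                  std::size_t nChannels,
                  const std::function<void(const short*, std::size_t)>& put) {
    if (frameSize == 0 || nChannels == 0) return false;
    if (samples.size() % nChannels != 0) return false;
    if (frameSize % nChannels != 0) return false;  // keeps every block whole
    for (std::size_t pos = 0; pos < samples.size(); pos += frameSize) {
        const std::size_t n = std::min(frameSize, samples.size() - pos);
        put(samples.data() + pos, n);  // last block may be shorter than a frame
    }
    return true;
}
```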
CSST_SDK_API long CSST_GetSampleRate(CSContext * aPProcessor);
- returns signal sampling frequency value that aPProcessor is set to work with.
CSST_SDK_API long CSST_GetNChannels(CSContext * aPProcessor);
- returns number of speech channels the aPProcessor is set to work with.
CSST_SDK_API ESCodecsTypes CSST_GetCodecType(CSContext * aPProcessor);
- returns identifier of the initial signal coding algorithm.
Specifying the codec originally applied to the signal lets the system adapt more precisely to the signal
parameters, take the compression algorithm into account, and identify the various types of distortion more
accurately. Codecs known to the system are listed in ESCodecsTypes.
The following values are defined:
Identifier          Description
esctNotCoded        Samples without coding
esctG711_ALaw       G.711 A-Law
esctG711_ULaw       G.711 U-Law
esctUnKnownCodec    Any other codec
CSST_SDK_API void CSST_SetSampleRate(CSContext * aPProcessor, long aSmplRate);
- passes the sampling frequency of the signal (aSmplRate) to aPProcessor. The default sampling frequency is
8 kHz. Always set the actual sampling frequency: many algorithms depend on it, and a wrong value makes the
processing results unreliable.
CSST_SDK_API void CSST_SetNChannels(CSContext * aPProcessor, long aNChannels);
- passes the number of channels in the input signal (aNChannels) to aPProcessor.
CSST_SDK_API int CSST_SetCodecType(CSContext * aPProcessor, ESCodecsTypes aCType);
- passes the identifier of the initial coding algorithm to aPProcessor.
CSST_SDK_API TSResult CSST_GetResult(CSContext * aPProcessor, long aChannel);
- returns the results produced by aPProcessor for channel aChannel, combined in the following structure:
struct TSResult
{
    ESContextsTypes dRType;
    bool isValid;
    long dChannel;
    USUniResult dResult;
};
The structure has fields common to all processor types, such as SNR, Clipping and Echo. Field dRType contains
the processor type, which indicates which member of the dResult union to read.
Field isValid indicates whether processing succeeded, because a processor may be unable to provide a result
for a particular frame. If the result is invalid, the reason can be obtained with the function
GetLastErrorCode().
Field dChannel contains the channel number that the results are associated with.
For multichannel sound, a request for channel 0 returns the result averaged over all channels. The averaging
method depends on the processor type.
The union of the per-processor result types is defined as follows:
union USUniResult
{
    TSSNRResult dSNR;
    TSClippingResult dClipping;
    TSEchoResult dEcho;
    TSClickResult dClicking;
    TSStuckResult dStuck;
};
Each processor type corresponds to a certain result identifier:
dSNR – result of SNR processor;
dClipping – result of Clipping detection processor;
dEcho – result of Echo detection processor;
dClicking – result of Clicking detection processor;
dStuck – result of Stuck detection processor.
Channel numbering begins with one. A request for channel 0 returns the average result over all processed
channels; any other value returns the result for the requested channel.
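Reading a result therefore means checking isValid first and then selecting the union member that matches dRType. The sketch below shows that pattern with local stand-in types placed in a namespace so they cannot clash with the real header. The stand-ins are reduced to three processor types and one field each, and HeadlineValue is a hypothetical helper.

```cpp
#include <cassert>

namespace sketch {  // local stand-ins; the real definitions live in CSSpeechTool.h

enum ESContextsTypes { esctSNRCalculator, esctClippingDetector, esctEchoDetector };

struct TSSNRResult      { double dSNR; };        // reduced to one field each
struct TSClippingResult { double dClpLevel; };
struct TSEchoResult     { double dEchoLevel; };

union USUniResult {
    TSSNRResult      dSNR;
    TSClippingResult dClipping;
    TSEchoResult     dEcho;
};

struct TSResult {
    ESContextsTypes dRType;
    bool            isValid;
    long            dChannel;
    USUniResult     dResult;
};

// Reads the union member that matches dRType. Returns false when the
// result is invalid, in which case `value` is left untouched.
bool HeadlineValue(const TSResult& r, double& value) {
    if (!r.isValid) return false;
    switch (r.dRType) {
        case esctSNRCalculator:    value = r.dResult.dSNR.dSNR;           return true;
        case esctClippingDetector: value = r.dResult.dClipping.dClpLevel; return true;
        case esctEchoDetector:     value = r.dResult.dEcho.dEchoLevel;    return true;
    }
    return false;
}

}  // namespace sketch
```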
CSST_SDK_API int CSST_GetLastErrorCode(void);
- returns the code of the last error. The call resets the current error code, so a repeated call returns 0
unless another error has occurred in the meantime. Error codes are declared in the file CSpTErrors.h and are
described in the table below:
Code  Identifier                        Description
0     errNoErrors                       No error
1     errSpeechToolIsNotInitialized     Library is not initialized
2     errLicensesFileNameMissed         License file name is missing
3     errLicensesFileOpeningError       Error opening license file
4     errIncompatibleLicensesFile       Incompatible license file
5     errIncompatibleHost               License file was issued for another machine
6     errLibraryInUse                   The library has already been loaded and is in use
7     errLicensesTimeIsOver             License period has expired
8     errAllChannelsAreOpened           All channels are in use
9     errUnKnownProcessorType           Unknown processor type
10    errNotEqualSamplesNumForChannels  The number of samples is not a multiple of the number of channels
11    errIncorrectSampleRate            Algorithm cannot work with the provided sampling frequency
12    errCannelNumOutOfRange            Channel number is outside the range of sound channels of the processor
13    errTooFewOfData                   Processor received too little data
14    errTooFewOfDataForFlySNR          Too little data to calculate SNR mean value
15    errTooFewOfDataForFlyClipping     Too little data to calculate clipping mean value
16    errSetContextInValideCrashed      Crash in SetContextInValide
17    errInitLibCrashed                 Crash in CSST_InitLib
18    errGetLastErrorCodeCrashed        Crash in CSST_ReleaseLib
19    errPooreProcessorPointer          Invalid pointer to processor
20    errInValideProcessor              Processor validity flag is false
21    errCreateProcessorCrasched        Crash in CSST_CreateProcessor
22    errReleaseProcessorCrashed        Crash in CSST_ReleaseProcessor
23    errPutSoundCrashed                Crash in CSST_PutSound
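For logging it can be convenient to turn an error code into readable text. The sketch below maps a subset of the codes from the table to their descriptions. DescribeError is a hypothetical helper; the numeric values are copied from the table rather than taken from CSpTErrors.h.

```cpp
#include <cassert>
#include <cstring>

// Maps an error code from the table above to a short description.
// Only a subset is listed to keep the sketch short; hypothetical helper.
const char* DescribeError(int code) {
    switch (code) {
        case 0:  return "No error";
        case 1:  return "Library is not initialized";
        case 3:  return "Error opening license file";
        case 7:  return "License period has expired";
        case 9:  return "Unknown processor type";
        case 10: return "The number of samples is not a multiple of the number of channels";
        case 11: return "Algorithm cannot work with the provided sampling frequency";
        case 13: return "Processor received too little data";
        default: return "Unlisted error code";
    }
}
```

Because reading the error code resets it, a caller would read it once, keep the value in a variable, and pass that variable to a helper like this one.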
Usage Examples
Signal Noise Ratio Calculator (SNR)
The Signal Noise Ratio calculator (SNR processor) is created by calling CSST_CreateProcessor with the
parameter esctSNRCalculator. It processes sound in frames of 240 samples; the frame size may vary depending
on the sampling frequency used.
When passing data to the calculator there is no need to consider the frame size, because the SNR processor
stores the data and processes it as its buffer fills. Results are returned in the structure TSSNRResult:
struct TSSNRResult
{
    double dSNR;
    double dEnergy;
    double dFlySNR;
    double dFlyEnergy;
    long dNSamples;
    double dTime;
};
The processor calculates the signal energy level in dB for each frame. The energy level of the last frame is
stored in the dEnergy field. The dSNR field contains the signal-to-noise ratio from the beginning of the sound
stream up to the current moment, calculated as the difference between the maximal and minimal energy values.
The dFlySNR and dFlyEnergy fields contain the average signal-to-noise ratio and energy over the last 10 frames
of the signal. The dNSamples field holds the number of processed signal samples and the dTime field holds the
duration of the processed signal in seconds.
The SNR calculator returns the first result after receiving two valid frames of audio data. A frame is valid when:
The energy level of the frame exceeds the threshold value;
The difference between the minimum and the maximum value exceeds the threshold;
The active signal has started (a maximum power value that exceeds the threshold value has been found).
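The "maximal minus minimal frame energy" definition of dSNR can be illustrated with a short sketch. The energy formula used here (10*log10 of the mean square, with a small floor) is an assumption, the function names are hypothetical, and the validity thresholds listed above are not modelled because their values are not documented.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Energy of one frame in dB. Assumed formula: 10*log10(mean square),
// floored to avoid taking the logarithm of zero.
double FrameEnergyDb(const short* frame, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += double(frame[i]) * frame[i];
    const double meanSquare = (n > 0) ? sum / n : 0.0;
    return 10.0 * std::log10(std::max(meanSquare, 1e-12));
}

// SNR estimate: maximal frame energy minus minimal frame energy.
// Trailing samples that do not fill a whole frame are ignored.
double EstimateSnrDb(const std::vector<short>& samples, std::size_t frameSize) {
    if (frameSize == 0 || samples.size() < frameSize) return 0.0;
    double lo = 0.0, hi = 0.0;
    bool first = true;
    for (std::size_t pos = 0; pos + frameSize <= samples.size(); pos += frameSize) {
        const double e = FrameEnergyDb(samples.data() + pos, frameSize);
        if (first) { lo = hi = e; first = false; }
        lo = std::min(lo, e);
        hi = std::max(hi, e);
    }
    return hi - lo;
}
```

For example, a loud frame with amplitude 1000 followed by a quiet frame with amplitude 100 differs by a factor of 100 in power, which this sketch reports as 20 dB.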
Amplitude Clipping detection
The Amplitude Clipping detection processor is created by calling CSST_CreateProcessor with the parameter
value esctClippingDetector. It processes sound in frames of 80 samples; the frame size may vary depending on
the sampling frequency used.
When passing data to the processor there is no need to consider the frame size, because the processor stores
the data and processes it as its buffer fills. Results are returned in the structure TSClippingResult:
struct TSClippingResult
{
    double dFrameClpLevel;
    double dFrameClpLevelWide;
    double dClpLevel;
    double dClpLevelWide;
    double dFlyClpLevel;
    double dFlyClpLevelWide;
    long dNSamples;
    double dTime;
};
Field dFrameClpLevel contains the signal clipping level, which is the ratio of the number of clipped samples
in the current frame to the frame length.
Field dFrameClpLevelWide contains the clipping level of sequentially clipped samples, calculated as the ratio
of the clipped sequence length to the frame length.
Fields dClpLevel and dClpLevelWide contain the clipping level and the sequential clipping level from the
beginning of the audio stream up to the current moment.
Fields dFlyClpLevel and dFlyClpLevelWide contain the clipping level and the sequential clipping level
averaged over the 10 most recently processed frames.
Field dNSamples holds the number of processed signal samples and field dTime holds the duration of the
processed signal in seconds.
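The two per-frame ratios can be sketched as follows. Two points are assumptions rather than documented behaviour: a sample is treated as clipped when it sits at a 16-bit limit, and the "clipped sequence length" is read as the longest run of consecutive clipped samples. The names FrameClipping and MeasureClipping are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct FrameClipping {
    double level;      // clipped samples / frame length
    double levelWide;  // longest run of consecutive clipped samples / frame length
};

// Computes both ratios for one frame. The clipping criterion (sample at a
// 16-bit limit) and the longest-run reading are assumptions of this sketch.
FrameClipping MeasureClipping(const std::vector<short>& frame) {
    FrameClipping r{0.0, 0.0};
    if (frame.empty()) return r;
    std::size_t clipped = 0, run = 0, longest = 0;
    for (short s : frame) {
        if (s == 32767 || s == -32768) {
            ++clipped;
            ++run;
            longest = std::max(longest, run);
        } else {
            run = 0;
        }
    }
    r.level = double(clipped) / frame.size();
    r.levelWide = double(longest) / frame.size();
    return r;
}
```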
Echo detection
The Echo detection processor is created by calling CSST_CreateProcessor with the parameter value
esctEchoDetector. It processes sound in frames of 2000 samples; the frame size may vary depending on the
sampling frequency used.
When passing data to the processor there is no need to consider the frame size, because the processor stores
the data and processes it as its buffer fills. Results are returned in the structure TSEchoResult:
struct TSEchoResult
{
    double dEchoPower;
    double dSpeechPower;
    double dEchoLevel;
    long dNSamples;
    double dTime;
};
In the correlation-based variant, fields dEchoPower and dSpeechPower contain the minimal and maximal levels of
signal autocorrelation, and field dEchoLevel contains their difference.
The current echo detection algorithm is based on an echo canceller. In this variant fields dEchoPower and
dSpeechPower contain the energy values of the echo and of the initial signal, and field dEchoLevel is equal to
the ratio of these two fields multiplied by 100 (so that the value is a percentage).
Field dNSamples holds the number of processed signal samples and field dTime holds the duration of the
processed signal in seconds.
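For the echo-canceller variant, the percentage can be reproduced from the two energy fields. The sketch below assumes the ratio is echo energy divided by speech energy; the text does not state the direction explicitly, and EchoLevelPercent is a hypothetical helper.

```cpp
#include <cassert>

// Echo level as a percentage: 100 * echo energy / speech energy.
// The direction of the ratio is an assumption; returns 0 when the
// speech energy is not positive to avoid dividing by zero.
double EchoLevelPercent(double echoPower, double speechPower) {
    if (speechPower <= 0.0) return 0.0;
    return 100.0 * echoPower / speechPower;
}
```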
There are two variants of the library build, which implement different ways of detecting echo: one uses
correlation, the other an echo canceller. The first variant is faster but detects echo less accurately.
In addition, in the first variant the values returned by the detector are signal similarity coefficients,
whereas in the second they are signal energy values.
Sample code of the library usage
As an example we provide the source code of the main file /Sample/sptLibTest/mains/SPTTest.cpp, a batch
processor for sound files that detects different voice impairments and prints out where they occur in the
time domain.
// Standard header files
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/timeb.h>
#include <time.h>

#ifdef WIN32
#include <io.h>
#endif

#ifndef WIN32
#include <dirent.h>
#endif

// Threshold values to output impairments into a log file (user defined)
#define cmFlySNRThresh 10
#define cmFrameClpThresh 0
#define cmWideFrameClpThresh 0
#define cmEchoThresh 0.22

// Header files of the SpeechTool
#include "../../DLL/include/CSpTErrors.h"
#include "../../DLL/include/CSSpeechTool.h"

// Header files to work with audio recordings
#include "../../Wave/WavFiles.h"
#include "../../Wave/CSmtSamples.h"

// Array of processors names
const char * pCntNames[] =
{
    "SNRCalculator",    // SNR processor
    "ClippingDetector", // Clipping detection processor
    "EchoDetector",     // Echo detection processor
    "UnKnown",          // Unknown processor
};

// Structure for processor list element
struct TSContextsListItem
{
    CSContext * pContext;
    double dDuration;
    TSContextsListItem * pNext;
};

// List of connected processors
TSContextsListItem * pContextsList = NULL;

// Log file
FILE * iPRepFile;

// String transcoding from char into wchar_t
static wchar_t * char2wchar(const char * aPStr, wchar_t * aPWStr, int aLen)
{
    int i, l;
    if (!aPStr) return(NULL);
    l = strlen(aPStr);
    if (l >= aLen) return(NULL);
    for(i = 0; i < l + 1; i++) aPWStr[i] = wchar_t(aPStr[i]);
    return(aPWStr);
}

// Creating processor with set functional identifiers
int CreateProcessor(char * aPProcName)
{
    TSContextsListItem * iPNewLI;
    iPNewLI = new TSContextsListItem();
    iPNewLI->dDuration = 0.0;
    if (strcmp(aPProcName, "snr") == 0)
    {
        // SNR calculator
        iPNewLI->pContext = CSST_CreateProcessor(esctSNRCalculator);
        if (!iPNewLI->pContext)
        {
            delete(iPNewLI);
            return(2);
        }
        CSST_SetSampleRate(iPNewLI->pContext, 8000);
        CSST_SetNChannels(iPNewLI->pContext, 1);
        // Add the new processor to the head of the list
        iPNewLI->pNext = pContextsList;
        pContextsList = iPNewLI;
        return(0);
    }
    else if (strcmp(aPProcName, "clipp") == 0)
    {
        // Clipping detector
        iPNewLI->pContext = CSST_CreateProcessor(esctClippingDetector);
        if (!iPNewLI->pContext)
        {
            delete(iPNewLI);
            return(3);
        }
        CSST_SetSampleRate(iPNewLI->pContext, 8000);
        CSST_SetNChannels(iPNewLI->pContext, 1);
        // Add the new processor to the head of the list
        iPNewLI->pNext = pContextsList;
        pContextsList = iPNewLI;
        return(0);
    }
    else if (strcmp(aPProcName, "echo") == 0)
    {
        // Echo detector
        iPNewLI->pContext = CSST_CreateProcessor(esctEchoDetector);
        if (!iPNewLI->pContext)
        {
            delete(iPNewLI);
            return(4);
        }
        CSST_SetSampleRate(iPNewLI->pContext, 8000);
        CSST_SetNChannels(iPNewLI->pContext, 1);
        // Add the new processor to the head of the list
        iPNewLI->pNext = pContextsList;
        pContextsList = iPNewLI;
        return(0);
    }
    // Unknown processor name
    delete(iPNewLI);
    return(1);
}
// Defines minimal processing step
long CalculateMinStep(void)
{
    TSContextsListItem * iPNewLI;
    long iRet, iTmp;
    iRet = CSST_GetFrameSize(pContextsList->pContext);
    for(iPNewLI=pContextsList->pNext; iPNewLI; iPNewLI=iPNewLI->pNext)
    {
        iTmp = CSST_GetFrameSize(iPNewLI->pContext);
        if (iRet > iTmp) iRet = iTmp;
    }
    return(iRet);
}
// Sound processing
int ProcessWave(FILE * aPRepFile, short * aPSamples, long iSize)
{
    TSContextsListItem * iPNewLI;
    clock_t iStartTime;
    clock_t iFinishTime;
    double iDuration;
    for(iPNewLI=pContextsList; iPNewLI; iPNewLI=iPNewLI->pNext)
    {
        iStartTime = clock();
        if (CSST_PutSound(iPNewLI->pContext, aPSamples, iSize) != iSize)
            return(1);
        iFinishTime = clock();
        iDuration = (double)(iFinishTime - iStartTime) / CLOCKS_PER_SEC;
        iPNewLI->dDuration += iDuration;
    }
    return(0);
}
// Prints into log file at each frame if impairment was detected
int PrintfFlyReport(FILE * aPRepFile, long aPos, double aTime)
{
    TSContextsListItem * iPWrkLI;
    TSResult iPrRes;
    bool isPrinted = false;
    for(iPWrkLI=pContextsList; iPWrkLI; iPWrkLI=iPWrkLI->pNext)
    {
        iPrRes = CSST_GetResult(iPWrkLI->pContext, 0);
        // Skip processors that have no valid result for this frame
        if (iPrRes.isValid)
        {
            switch(iPrRes.dRType)
            {
                case esctSNRCalculator : // SNR calculator
                    if (iPrRes.dResult.dSNR.dFlySNR < cmFlySNRThresh)
                    {
                        if (!isPrinted) fprintf(aPRepFile, "Current File Position is %li (%lf sec)\n", aPos, aTime);
                        fprintf(aPRepFile, "dSNR = %lf\n", iPrRes.dResult.dSNR.dSNR);
                        fprintf(aPRepFile, "dFlySNR = %lf\n", iPrRes.dResult.dSNR.dFlySNR);
                        isPrinted = true;
                    }
                    break;
                case esctClippingDetector : // Clipping detection
                    if ((iPrRes.dResult.dClipping.dFrameClpLevel > cmFrameClpThresh)||
                        (iPrRes.dResult.dClipping.dFrameClpLevelWide > cmWideFrameClpThresh))
                    {
                        if (!isPrinted) fprintf(aPRepFile, "Current File Position is %li (%lf sec)\n", aPos, aTime);
                        fprintf(aPRepFile, "dFrameClpLevel = %lf\n", iPrRes.dResult.dClipping.dFrameClpLevel);
                        fprintf(aPRepFile, "dFrameClpLevelWide = %lf\n", iPrRes.dResult.dClipping.dFrameClpLevelWide);
                        isPrinted = true;
                    }
                    break;
                case esctEchoDetector : // Echo detector
                    if (iPrRes.dResult.dEcho.dEchoLevel > cmEchoThresh)
                    {
                        if (!isPrinted) fprintf(aPRepFile, "Current File Position is %li (%lf sec)\n", aPos, aTime);
                        fprintf(aPRepFile, "dEchoPower = %lf\n", iPrRes.dResult.dEcho.dEchoPower);
                        fprintf(aPRepFile, "dSpeechPower = %lf\n", iPrRes.dResult.dEcho.dSpeechPower);
                        fprintf(aPRepFile, "dEchoLevel = %lf\n", iPrRes.dResult.dEcho.dEchoLevel);
                        isPrinted = true;
                    }
                    break;
                case esctUnKnown : // Unknown processor
                default:
                    break;
            }
        }
    }
    return(0);
}
// Prints final report into log file
int PrintfFinalReport(FILE * aPRepFile)
{
    TSContextsListItem * iPWrkLI;
    TSResult iPrRes;
    fprintf(aPRepFile,"\n");
    for(iPWrkLI=pContextsList; iPWrkLI; iPWrkLI=iPWrkLI->pNext)
    {
        iPrRes = CSST_GetResult(iPWrkLI->pContext, 0);
        fprintf(aPRepFile, "Processor type : %i (%s)\n", iPWrkLI->pContext->dCType, pCntNames[iPWrkLI->pContext->dCType]);
        if (!iPrRes.isValid) fprintf(aPRepFile, "Not resulted (%i)\n", GetLastErrorCode());
        else
        {
            switch(iPrRes.dRType)
            {
                case esctSNRCalculator : // SNR Calculator
                    fprintf(aPRepFile, "dSNR = %lf\n", iPrRes.dResult.dSNR.dSNR);
                    fprintf(aPRepFile, "dEnergy = %lf\n", iPrRes.dResult.dSNR.dEnergy);
                    break;
                case esctClippingDetector : // Clipping detector
                    fprintf(aPRepFile, "dClpLevel = %lf\n", iPrRes.dResult.dClipping.dClpLevel);
                    fprintf(aPRepFile, "dClpLevelWide = %lf\n", iPrRes.dResult.dClipping.dClpLevelWide);
                    break;
                case esctEchoDetector : // Echo detector
                    fprintf(aPRepFile, "dEchoPower = %lf\n", iPrRes.dResult.dEcho.dEchoPower);
                    fprintf(aPRepFile, "dSpeechPower = %lf\n", iPrRes.dResult.dEcho.dSpeechPower);
                    fprintf(aPRepFile, "dEchoLevel = %lf\n", iPrRes.dResult.dEcho.dEchoLevel);
                    break;
                case esctUnKnown : // Unknown processor
                default:
                    break;
            }
        }
        fprintf(aPRepFile,"\n");
    }
    return(0);
}
// Prints processors performance
int PrintfTimingReport(FILE * aPRepFile, long aDataSize)
{
    TSContextsListItem * iPNewLI;
    double iDataLens;
    double iSpeedCoeff;
    fprintf(aPRepFile,"\n");
    // Duration of the audio in seconds at 8 kHz
    iDataLens = double(aDataSize) / 8000;
    for(iPNewLI=pContextsList; iPNewLI; iPNewLI=iPNewLI->pNext)
    {
        fprintf(aPRepFile, "Processor type : %i (%s)\n", iPNewLI->pContext->dCType, pCntNames[iPNewLI->pContext->dCType]);
        if (iPNewLI->dDuration > 0.000001)
        {
            // Ratio of audio duration to processing time
            iSpeedCoeff = iDataLens / iPNewLI->dDuration;
            fprintf(aPRepFile, "RealTime Coefficient = %lf\n", iSpeedCoeff);
        }
        else
            fprintf(aPRepFile, "Too fast to calculate!\n");
    }
    fprintf(aPRepFile,"\n\n\n");
    return(0);
}
// Release all processors
int ReleaseProcessors(void)
{
    TSContextsListItem * iPWrkLI;
    while(pContextsList)
    {
        iPWrkLI = pContextsList;
        pContextsList = pContextsList->pNext;
        ReleaseProcessor(iPWrkLI->pContext);
        delete(iPWrkLI);
    }
    return(0);
}
// Processing single audio file
int ProcessOneFile(char * aWaveName, int argc, char * argv[])
{
    CSmartSamples iInData;
    short * iPDataShort;
    long i, iPos, iStep, iEOFData, iSize;
    double iTime;
    fprintf(iPRepFile, "File : '%s'\n", aWaveName);
    // Reading audio
    if (iInData.ReadFromFile(aWaveName, 0) != 0)
    {
        iInData.Reset();
        printf("Cannot open input file!\n");
        fprintf(iPRepFile, "Cannot open input file!\n");
        return(0);
    }
    // Creating processors
    for(i=4; i<argc; i++)
    {
        if (CreateProcessor(argv[i]) != 0)
        {
            printf("Cannot create processor '%s'!\n", argv[i]);
            fprintf(iPRepFile, "Cannot create processor '%s'!\n", argv[i]);
            iInData.Reset();
            // The log file stays open: main() keeps using it and closes it
            return(0);
        }
    }
    // Preparing sound data
    iEOFData = iInData.GetNSamples();
    iPDataShort = (short *)iInData.GetPSamplesArray(esstShort);
    // Define minimal processing step
    iStep = CalculateMinStep();
    // Parse audio data
    for(iPos=0; iPos<iEOFData; iPos+=iStep)
    {
        // Define size of processed block
        if ((iPos + iStep) <= iEOFData) iSize = iStep;
        else iSize = iEOFData - iPos;
        iTime = double(iPos) / 8000.0;
        // Process sound
        ProcessWave(iPRepFile, iPDataShort + iPos, iSize);
        // Output of per frame report
        PrintfFlyReport(iPRepFile, iPos, iTime);
    }
    // Output of final report
    PrintfFinalReport(iPRepFile);
    // Print processors' performance
    PrintfTimingReport(iPRepFile, iEOFData);
    // Release everything
    ReleaseProcessors();
    iInData.Reset();
    return(0);
}
// Function main()
int main(int argc, char * argv[])
{
    wchar_t iLicName[2048];
    // Used by both the Windows and the Linux branch
    static char dFullPathString[4096];
    int dNFiles, done;
    // Checking number of arguments
    if (argc < 5)
    {
        printf("Usage:\n");
        printf("spttest <licfile> <pathtowavs> <repfile> <procid00> [[procid01] ...[procidNN]]\n");
        printf("<licfile> : Set the licenses file name;\n");
        printf("<pathtowavs> : Set the path to wave files;\n");
        printf("<repfile> : Set the output text file name;\n");
        printf("<procidXX> : Set the sound processors.\n");
        printf(" : snr\n");
        printf(" : clipp\n");
        printf(" : echo\n");
        return(0);
    }
    // Library initialization
    char2wchar(argv[1], iLicName, 2048);
    if (!CSST_InitLib(iLicName))
    {
        printf("Licenses data failed!\n");
        return(0);
    }
    // Preparing log file
    iPRepFile = fopen(argv[3], "w+t");
    if (!iPRepFile)
    {
        printf("Cannot open output file!\n");
        return(0);
    }
#ifdef WIN32
    //
    // Windows related
    //
    struct _finddata_t ffblk;
    int handle;
    // Filling in string with path
    sprintf(dFullPathString, "%s\\*.wav", argv[2]);
    // Take first .wav file
    dNFiles = done = 0;
    handle = _findfirst(dFullPathString, &ffblk);
    // Scanning folder containing .wav files
    while ((handle != -1)&&(done == 0))
    {
        if ((ffblk.attrib & _A_SUBDIR) == 0)
        {
            sprintf(dFullPathString, "%s\\%s", argv[2], ffblk.name);
            ProcessOneFile(dFullPathString, argc, argv);
            dNFiles++;
            printf("%i files are processed!\r", dNFiles);
        }
        done = _findnext(handle, &ffblk);
    }
    _findclose(handle);
#else
    //
    // Linux related
    //
    struct dirent * iDirEntP;
    DIR * iDirP;
    iDirP = opendir(argv[2]);
    if (iDirP == NULL) return(0);
    // Scanning folder with .wav files
    for(dNFiles = done = 0; ; done++)
    {
        iDirEntP = readdir(iDirP);
        if (iDirEntP == NULL) break;
        if (strstr(iDirEntP->d_name, ".wav"))
        {
            sprintf(dFullPathString, "%s/%s", argv[2], iDirEntP->d_name);
            ProcessOneFile(dFullPathString, argc, argv);
            dNFiles++;
        }
    }
    closedir(iDirP);
#endif
    printf("%i files are processed!\n", dNFiles);
    fclose(iPRepFile);
    // Releasing the library
    CSST_ReleaseLib();
    return(0);
}
Sample source code
Delivery:
Header files to work with audio (/Sample/Wave/). Classes implemented within the sample source code
are not allowed for commercial use and are not part of the product.
Header files of the library (/Sample/DLL/Include/). File CSpTErrors.h contains the error codes and file
CSSpeechTool.h contains the declarations of all library functions.
Batch file CompileSptLibTest.bat (/Sample/), included in the archive to ease compilation of the sample code
Library binary libSpeechTool_static.a (/Sample/)
Compiled sample application spt-lib-test (/Sample/)
Sound file for testing clean.wav (/Sample/)
Batch file for a test run 2.bat (/Sample/)
Compilation
The sample source code can be compiled by running the batch file CompileSptLibTest.bat, which contains the
following line:
g++ -O2 ./sptLibTest/mains/SPTTest.cpp ./libSpeechTool_static.a -o ./spt-lib-test
The resulting program locates, in audio files, the impairments that the library can detect. To run the
program use the following command line options:
spt-lib-test <lic-file> <path-to-wavs> <rep-file> <proc-id-00> [[proc-id-01] ... [proc-id-NN]]
where
<lic-file> - license file, which is delivered by the vendor
<path-to-wavs> - path to the folder containing .wav files
<rep-file> - log/report file name
<proc-id-00> [[proc-id-01] ... [proc-id-NN]] – list of processor identifiers; currently the library supports the
following processors:
snr – signal/noise ratio processor
clipp – clipping level calculating processor
echo – echo level calculating processor
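The sample program compares processor identifiers with strcmp, so they are case-sensitive. The sketch below shows the same lookup as a standalone function. ProcessorKind and ParseProcId are hypothetical names introduced for illustration and do not come from the library headers.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical local enum for the three identifiers the sample accepts.
enum ProcessorKind { kSnr, kClipping, kEcho, kUnknown };

// Maps a command line identifier to a processor kind.
// The comparison is case-sensitive, as in the sample program.
ProcessorKind ParseProcId(const char* id) {
    if (!id) return kUnknown;
    if (std::strcmp(id, "snr") == 0)   return kSnr;
    if (std::strcmp(id, "clipp") == 0) return kClipping;
    if (std::strcmp(id, "echo") == 0)  return kEcho;
    return kUnknown;
}
```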
To measure the performance of the different detectors, a program called Performance.cpp was developed.
The program can be compiled with either variant of the library (Performance-test-corr is built with the
library that uses the correlation-based algorithm, Performance-test-echo with the library that uses the echo
canceller).
The command line invocation of the program is:
Performance-test-XXXX <proc-id> <ch-1-wave> [ch-2-wave]
where:
<proc-id> : the ID of the sound processor:
: snr
: clipp
: echo
: click
: stuck
<ch-1-wave> : path of channel 1 wave file;
[ch-2-wave] : path of channel 2 wave file.
The parameters passed to the program are the ID of the detector to run and one or two audio data files.
Only the echo detector needs two audio data files; all other detectors operate on a single file.
After the program starts, it performs a series of performance measurements and calculates their mean and
variance. The resulting performance value applies to a single processor core of the computer that
executed the program.
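The mean and variance of a series of measurements can be computed as in the sketch below. Whether Performance.cpp uses the population or the sample variance is not stated; the population form is assumed here, and the names Stats and MeanAndVariance are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Stats {
    double mean;
    double variance;
};

// Mean and population variance of a series of measurements.
// Returns zeros for an empty series.
Stats MeanAndVariance(const std::vector<double>& xs) {
    Stats s{0.0, 0.0};
    if (xs.empty()) return s;
    for (double x : xs) s.mean += x;
    s.mean /= xs.size();
    for (double x : xs) s.variance += (x - s.mean) * (x - s.mean);
    s.variance /= xs.size();
    return s;
}
```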