Sevana Voice Impairments Detection Library

Voice Impairments Detection Library v.2.1.3.283

Issues

Concurrent Operation

alternative 1

alternative 2

List of Functions

CSST_SDK_API bool CSST_InitLib(void);

CSST_SDK_API void CSST_ReleaseLib(void);

CSST_SDK_API const wchar_t* CSST_GetVersion(void);

CSST_SDK_API CSContext * CSST_CreateProcessor(ESContextsTypes aCType);

CSST_SDK_API void ReleaseProcessor(CSContext * aPProcessor);

CSST_SDK_API long CSST_PutSound(CSContext * aPProcessor, short * aPSamples, long aNSamples);

CSST_SDK_API int CSST_GetFrameSize(CSContext * aPProcessor);

CSST_SDK_API long CSST_GetSampleRate(CSContext * aPProcessor);

CSST_SDK_API long CSST_GetNChannels(CSContext * aPProcessor);

CSST_SDK_API void CSST_SetSampleRate(CSContext * aPProcessor, long aSmplRate);

CSST_SDK_API void CSST_SetNChannels(CSContext * aPProcessor, long aNChannels);

CSST_SDK_API TSResult CSST_GetResult(CSContext * aPProcessor, long aChannel);

int GetLastErrorCode(void);

Usage Examples

Copyright © Sevana Ltd, 2012

Sevana Oy

Agricolankatu 11

00530 Helsinki

Finland

Phone: +358 9 2316 4165

Sevana Oü

Rohtlaane 12

76911 Huuru kula

Estonia (Harjumaa)

Phone: +372 53485178


Signal Noise Ratio Calculator (SNR)

Clipping detection

Echo detection

Sample code of the library usage

Sample source code

Delivery:

Compilation

The Voice Impairments Detection Library passively detects impairments in a speech signal that degrade perceived voice quality. The library is based on comprehensive digital signal processing algorithms, represented as separate processors, which the user accesses as unified virtual classes with identical interfaces.

Issues

Concurrent Operation

alternative 1

Results ProcessSamples(List<short> samples)
{
    for (int n = 0; n < samples.size(); n++)
        processSample(samples.at(n));
    return finalAnalysisResult();
}

alternative 2

// Add decoded audio data of single RTP packet ...
void AddSamples(List<short> samples)
{
    for (int n = 0; n < samples.size(); n++)
        processSample(samples.at(n));
}

.
. // repeat several times
.

// Add decoded audio data of single RTP packet ...
void AddSamples(List<short> samples)
{
    for (int n = 0; n < samples.size(); n++)
        processSample(samples.at(n));
}

// Get results of waveform analysis
Results GetAnalysisResult()
{
    return finalAnalysisResult();
}
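The difference between the two alternatives can be sketched in C++ as follows. The class name IncrementalAnalyzer and the toy analysis (tracking the peak absolute amplitude) are illustrative assumptions and not part of the library; the sketch only shows that feeding packets one at a time can yield the same final result as a single batch call.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a waveform analyzer; not a library class.
class IncrementalAnalyzer {
public:
    // Alternative 2: add the decoded audio data of a single RTP packet.
    void AddSamples(const std::vector<short>& samples) {
        for (short s : samples) processSample(s);
    }
    // Get the result accumulated over all packets added so far.
    int GetAnalysisResult() const { return peak_; }

private:
    // Toy per-sample analysis: track the peak absolute amplitude.
    void processSample(short s) {
        int a = std::abs(static_cast<int>(s));
        if (a > peak_) peak_ = a;
    }
    int peak_ = 0;
};

// Alternative 1: process the whole recording in one call.
int ProcessSamples(const std::vector<short>& samples) {
    IncrementalAnalyzer analyzer;
    analyzer.AddSamples(samples);
    return analyzer.GetAnalysisResult();
}
```

With alternative 2 the caller decides when to ask for the result, which suits streams whose length is not known in advance.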

List of Functions

The following functions are used to initialize and release the library:






CSST_SDK_API bool CSST_InitLib(void);

Initializes (loads) the library. Call this function before any other library function. It returns true if the library was loaded successfully and false on failure.

CSST_SDK_API void CSST_ReleaseLib(void);

Releases the library. Call this function when you have finished working with the library.

CSST_SDK_API const wchar_t* CSST_GetVersion(void);

Returns library version string.

The following functions are used to work with processors:

CSST_SDK_API CSContext * CSST_CreateProcessor(ESContextsTypes aCType);

- creates a processor; the processor type is selected by the aCType parameter. ESContextsTypes lists the available processors, shown in the table below:

Identifier              Description

esctSNRCalculator       SNR calculation
esctClippingDetector    Clipping impairment detection
esctEchoDetector        Echo impairment detection
esctClickDetector       Clicking detection
esctStuckDetector       Stuck impairment detection
esctUnknownDetector     Unknown processor

On success the function returns a pointer to the unified processor. On error it returns NULL, and function GetLastErrorCode returns the error code.

CSST_SDK_API void ReleaseProcessor(CSContext * aPProcessor);

- removes processor and all associated data from memory.

CSST_SDK_API long CSST_PutSound(CSContext * aPProcessor, short * aPSamples, long aNSamples);

- adds sound data to processor aPProcessor. aPSamples is an array of signal samples and aNSamples is the number of samples. For multichannel sound, samples are placed into aPSamples in the following order:



aPSamples[0] = sample 0 channel 1

aPSamples[1] = sample 0 channel 2

...

aPSamples[N-1] = sample 0 channel N

aPSamples[N] = sample 1 channel 1

aPSamples[N+1] = sample 1 channel 2

...

aPSamples[aNSamples-1] = sample M-1 channel N

Channel numbering starts with 1.
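The layout above is the usual interleaved order. A minimal sketch of building such a buffer from separate per-channel buffers is shown below; the helper name InterleaveChannels is hypothetical and not a library function, and it assumes every channel holds the same number of samples.

```cpp
#include <cstddef>
#include <vector>

// Interleave per-channel buffers into the order described for aPSamples:
// sample 0 of every channel, then sample 1 of every channel, and so on.
// Assumes all channels hold the same number of samples.
std::vector<short> InterleaveChannels(const std::vector<std::vector<short>>& channels) {
    std::vector<short> interleaved;
    if (channels.empty()) return interleaved;
    const std::size_t nChannels = channels.size();
    const std::size_t nPerChannel = channels[0].size();
    interleaved.reserve(nChannels * nPerChannel);
    for (std::size_t m = 0; m < nPerChannel; ++m)
        for (std::size_t c = 0; c < nChannels; ++c)
            interleaved.push_back(channels[c][m]);
    return interleaved;
}
```

The size of the resulting buffer is the value that would be passed as aNSamples, since it counts samples of all channels together.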

CSST_SDK_API int CSST_GetFrameSize(CSContext * aPProcessor);

- returns the size of the working data buffer (frame) of aPProcessor, in samples. The frame size is the minimal number of samples the processor needs to perform an analysis. Function CSST_PutSound has only one requirement on its input: the number of samples must be a multiple of the number of channels; it does not have to match the frame size. Processors split the data into frames automatically, but passing data one frame at a time lets the user obtain a result for every frame.
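The two rules above (sample count is a multiple of the channel count, data may be fed frame by frame) can be sketched as follows. Both helper names are hypothetical; for multichannel sound the sketch assumes the caller chooses a frameSize that is itself a multiple of the channel count.

```cpp
#include <vector>

// True when a buffer of nSamples holds the same number of samples
// for each of nChannels channels.
bool IsWholeNumberOfChannelSets(long nSamples, long nChannels) {
    return nChannels > 0 && nSamples >= 0 && nSamples % nChannels == 0;
}

// Split nSamples into consecutive block sizes of at most frameSize samples,
// so that a result could be queried after every block.
std::vector<long> SplitIntoFrames(long nSamples, long frameSize) {
    std::vector<long> blocks;
    if (frameSize <= 0) return blocks;
    for (long pos = 0; pos < nSamples; pos += frameSize) {
        long remaining = nSamples - pos;
        blocks.push_back(remaining < frameSize ? remaining : frameSize);
    }
    return blocks;
}
```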

CSST_SDK_API long CSST_GetSampleRate(CSContext * aPProcessor);

- returns signal sampling frequency value that aPProcessor is set to work with.

CSST_SDK_API long CSST_GetNChannels(CSContext * aPProcessor);

- returns number of speech channels the aPProcessor is set to work with.

CSST_SDK_API ESCodecsTypes CSST_GetCodecType(CSContext * aPProcessor);

- returns identifier of the initial signal coding algorithm.

Specifying the original compression algorithm lets the system adapt more precisely to the parameters of the signal, account for the compression algorithm used, and identify the various types of distortion more accurately. Codecs known to the system are listed in ESCodecsTypes.

The following values are defined:

Identifier          Description

esctNotCoded        samples without coding
esctG711_ALaw       G.711 A-Law
esctG711_ULaw       G.711 U-Law
esctUnKnownCodec    Any other codec

CSST_SDK_API void CSST_SetSampleRate(CSContext * aPProcessor, long aSmplRate);

- passes the sampling frequency of the signal (aSmplRate) to aPProcessor. The default sampling frequency is 8 kHz. It is important to set the actual sampling frequency, because many algorithms depend on it and an incorrect value affects the correctness of the processing results.

CSST_SDK_API void CSST_SetNChannels(CSContext * aPProcessor, long aNChannels);

- passes aPProcessor number of channels (aNChannels) in the input signal.

CSST_SDK_API int CSST_SetCodecType(CSContext * aPProcessor, ESCodecsTypes aCType);

- passes identifier of initial coding algorithm to aPProcessor.

CSST_SDK_API TSResult CSST_GetResult(CSContext * aPProcessor, long aChannel);

- Returns results of aPProcessor work for channel aChannel joined in the following structure:

struct TSResult
{
    ESContextsTypes dRType;
    bool            isValid;
    long            dChannel;
    USUniResult     dResult;
};

The structure has fields common to all processor types (SNR, Clipping, Echo). Field dRType contains the processor type, which makes it possible to select the correct member of the dResult union.

Field isValid contains the validity flag of the result, since in some cases a processor cannot provide a result for a certain frame. If the result is invalid, the reason can be obtained with function GetLastErrorCode().

Field dChannel contains the channel number that the results are associated with.

For multichannel sound, a request for channel 0 returns the result averaged over all channels. The averaging depends on the processor type.

The union of heterogeneous results is defined as follows:

union USUniResult
{
    TSSNRResult      dSNR;
    TSClippingResult dClipping;
    TSEchoResult     dEcho;
    TSClickResult    dClicking;
    TSStuckResult    dStuck;
};

Each processor type corresponds to a certain result identifier:

dSNR – result of SNR processor;

dClipping – result of Clipping detection processor;

dEcho – result of Echo detection processor;

dClicking – result of Clicking detection processor;

dStuck – result of Stuck detection processor.

Channel numbering begins with one, if you ask the result for channel 0, you get the average result for all

channels that are processed. In all other cases the return value corresponds to the requested channel.
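Reading a tagged union of this kind means checking the validity flag first and then touching only the member that matches the type field. The sketch below uses simplified, hypothetical mirror types (all Demo... names are assumptions) so that it compiles without the library headers; the real structures have more fields.

```cpp
#include <string>

// Simplified, hypothetical mirrors of the result types; the real definitions
// come from the library headers and have more fields.
enum DemoType { demoSNR, demoClipping, demoEcho };
struct DemoSNR { double dSNR; };
struct DemoClipping { double dClpLevel; };
struct DemoEcho { double dEchoLevel; };
union DemoUniResult { DemoSNR dSNR; DemoClipping dClipping; DemoEcho dEcho; };
struct DemoResult {
    DemoType dRType;
    bool isValid;
    long dChannel;
    DemoUniResult dResult;
};

// Read only the union member that matches dRType.
std::string Describe(const DemoResult& r) {
    if (!r.isValid) return "invalid";
    switch (r.dRType) {
        case demoSNR:      return "SNR " + std::to_string(r.dResult.dSNR.dSNR);
        case demoClipping: return "clipping " + std::to_string(r.dResult.dClipping.dClpLevel);
        case demoEcho:     return "echo " + std::to_string(r.dResult.dEcho.dEchoLevel);
    }
    return "unknown";
}
```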

CSST_SDK_API int CSST_GetLastErrorCode(void);

- returns the code of the last error. The call resets the current error code, so a repeated call returns 0 unless another error has occurred in the meantime. Error codes are declared in the CSpTErrors.h file and are described in the table below:

Code  Identifier                        Description

0     errNoErrors                       No error
1     errSpeechToolIsNotInitialized     Library is not initialized
2     errLicensesFileNameMissed         License file name is missing
3     errLicensesFileOpeningError       Error opening license file
4     errIncompatibleLicensesFile       Incompatible license file
5     errIncompatibleHost               License file was issued for another machine
6     errLibraryInUse                   The library has been already loaded and is in use
7     errLicensesTimeIsOver             License period has expired
8     errAllChannelsAreOpened           All channels are in use
9     errUnKnownProcessorType           Unknown processor type
10    errNotEqualSamplesNumForChannels  The number of samples is not a multiple of the number of channels
11    errIncorrectSampleRate            Algorithm cannot work with the provided sampling frequency
12    errCannelNumOutOfRange            Channel number is not in the range of sound channels of the processor
13    errTooFewOfData                   Processor received too few data
14    errTooFewOfDataForFlySNR          Too few data to calculate SNR mean value
15    errTooFewOfDataForFlyClipping     Too few data to calculate clipping mean value
16    errSetContextInValideCrashed      Crash in SetContextInValide
17    errInitLibCrashed                 Crash in CSST_InitLib
18    errGetLastErrorCodeCrashed        Crash in CSST_ReleaseLib
19    errPooreProcessorPointer          Poor pointer to Processor
20    errInValideProcessor              Processor validity flag is false
21    errCreateProcessorCrasched        Crash in CSST_CreateProcessor
22    errReleaseProcessorCrashed        Crash in CSST_ReleaseProcessor
23    errPutSoundCrashed                Crash in CSST_PutSound
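For logging it can be convenient to turn a numeric code into readable text. The helper below is a hypothetical sketch, not a library function; it covers only a subset of the codes, with the texts copied from the table above, and falls back to printing the number.

```cpp
#include <string>

// Map a few error codes from the table above to their descriptions.
// Only a subset is listed; codes and texts are copied from the table.
std::string DescribeErrorCode(int code) {
    switch (code) {
        case 0:  return "No error";
        case 1:  return "Library is not initialized";
        case 7:  return "License period has expired";
        case 9:  return "Unknown processor type";
        case 10: return "The number of samples is not a multiple of the number of channels";
        case 13: return "Processor received too few data";
        default: return "Error code " + std::to_string(code);
    }
}
```

Because the error code is reset when it is read, the value should be stored in a variable once and then passed to a helper like this, rather than queried twice.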

Usage Examples

Signal Noise Ratio Calculator (SNR)

The Signal Noise Ratio calculator (SNR processor) is created by calling function CSST_CreateProcessor with parameter esctSNRCalculator. It processes sound in frames of 240 samples; the frame size may vary depending on the sampling frequency used.

When passing data to the calculator, there is no need to consider the frame size: the SNR processor stores the data and processes it once enough has accumulated in its buffer. Results are returned in structure TSSNRResult:

struct TSSNRResult
{
    double dSNR;
    double dEnergy;
    double dFlySNR;
    double dFlyEnergy;
    long   dNSamples;
    double dTime;
};

The processor calculates the signal energy level in dB for each frame. The energy level of the last frame is stored in the dEnergy field. The dSNR field contains the signal-to-noise ratio from the beginning of the sound stream up to the current moment, calculated as the difference between the maximal and minimal energy values.
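The "maximal minus minimal energy" idea can be illustrated with a small sketch. The function names are hypothetical, and the energy formula (mean square of the samples on a 10*log10 scale) is an assumption; the text does not say how the library defines frame energy, nor does the sketch apply the library's validity thresholds.

```cpp
#include <cmath>
#include <vector>

// Energy of one frame in dB. ASSUMPTION: mean square of the samples on a
// 10*log10 scale; the library may define energy differently.
double FrameEnergyDb(const std::vector<short>& frame) {
    double sum = 0.0;
    for (short s : frame) sum += static_cast<double>(s) * s;
    double meanSquare = frame.empty() ? 0.0 : sum / frame.size();
    return 10.0 * std::log10(meanSquare + 1e-12);  // small offset avoids log10(0)
}

// SNR estimate as described in the text: maximal minus minimal frame energy.
double SnrFromEnergies(const std::vector<double>& energiesDb) {
    if (energiesDb.empty()) return 0.0;
    double lo = energiesDb[0], hi = energiesDb[0];
    for (double e : energiesDb) {
        if (e < lo) lo = e;
        if (e > hi) hi = e;
    }
    return hi - lo;
}
```

Under this assumption, a frame of constant amplitude 1000 has an energy of about 60 dB and a frame of constant amplitude 10 about 20 dB, so a stream containing both gives an estimate of about 40 dB.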



The dFlySNR and dFlyEnergy fields contain the signal-to-noise ratio and the energy averaged over the last 10 frames of the signal. The dNSamples field stores the number of processed signal samples, and the dTime field stores the duration of the processed signal in seconds.
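Averaging over the most recent frames is a sliding-window mean. The class below is a hypothetical sketch of that technique (SlidingMean is not a library type); how the library handles the first frames, before 10 values are available, is not stated in the text, so here the mean is simply taken over the values seen so far.

```cpp
#include <cstddef>
#include <deque>

// Running mean over the most recent `window` values (10 frames in the text).
class SlidingMean {
public:
    explicit SlidingMean(std::size_t window = 10) : window_(window ? window : 1) {}

    // Add the value of a new frame and return the mean of the retained values.
    double Add(double value) {
        values_.push_back(value);
        sum_ += value;
        if (values_.size() > window_) {
            sum_ -= values_.front();
            values_.pop_front();
        }
        return sum_ / values_.size();
    }

private:
    std::size_t window_;
    std::deque<double> values_;
    double sum_ = 0.0;
};
```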

The SNR calculator returns its first result after receiving two valid frames of audio data. A frame is valid when:

- the energy level of the frame exceeds the threshold value;
- the difference between the minimum and the maximum value exceeds the threshold;
- the active signal has started (a maximum power value that exceeds the threshold value has been found).

Amplitude Clipping detection

The amplitude clipping detection processor is created by calling function CSST_CreateProcessor with parameter value esctClippingDetector. It processes sound in frames of 80 samples; the frame size may vary depending on the sampling frequency used.

When passing data to the processor, there is no need to consider the frame size: the processor stores the data and processes it once enough has accumulated in its buffer. Results are returned in structure TSClippingResult:

struct TSClippingResult
{
    double dFrameClpLevel;
    double dFrameClpLevelWide;
    double dClpLevel;
    double dClpLevelWide;
    double dFlyClpLevel;
    double dFlyClpLevelWide;
    long   dNSamples;
    double dTime;
};

Field dFrameClpLevel contains the signal clipping level, that is, the ratio of the number of clipped samples in the current frame to the frame length.

Field dFrameClpLevelWide contains the clipping level of sequentially clipped samples, calculated as the ratio of the clipped sequence length to the frame length.

Fields dClpLevel and dClpLevelWide contain the clipping level and the sequential clipping level from the beginning of the audio stream up to the current moment.

Fields dFlyClpLevel and dFlyClpLevelWide store the clipping level and the sequential clipping level averaged over the 10 most recently processed frames.

Field dNSamples stores the number of processed signal samples, and field dTime stores the duration of the processed signal in seconds.
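The two per-frame ratios can be illustrated with a short sketch. It rests on two assumptions that the text does not confirm: a sample is treated as clipped when it sits at the 16-bit limits, and the "clipped sequence length" is taken to be the longest run of consecutive clipped samples. The names FrameClipping and MeasureClipping are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct FrameClipping {
    double level;      // clipped samples / frame length
    double levelWide;  // longest run of consecutive clipped samples / frame length
};

// ASSUMPTION: a sample counts as clipped when it sits at the 16-bit limits.
// Interpreting "clipped sequence length" as the longest run is also an assumption.
FrameClipping MeasureClipping(const std::vector<short>& frame) {
    FrameClipping r = {0.0, 0.0};
    if (frame.empty()) return r;
    std::size_t clipped = 0, run = 0, longestRun = 0;
    for (short s : frame) {
        if (s == 32767 || s == -32768) {
            ++clipped;
            ++run;
            if (run > longestRun) longestRun = run;
        } else {
            run = 0;
        }
    }
    r.level = static_cast<double>(clipped) / frame.size();
    r.levelWide = static_cast<double>(longestRun) / frame.size();
    return r;
}
```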

Echo detection

The echo detection processor is created by calling function CSST_CreateProcessor with parameter value esctEchoDetector. It processes sound in frames of 2000 samples; the frame size may vary depending on the sampling frequency used.

When passing data to the processor, there is no need to consider the frame size: the processor stores the data and processes it once enough has accumulated in its buffer. Results are returned in structure TSEchoResult:

struct TSEchoResult
{
    double dEchoPower;
    double dSpeechPower;
    double dEchoLevel;
    long   dNSamples;
    double dTime;
};

In the correlation-based variant of the library, fields dEchoPower and dSpeechPower contain the minimal and maximal levels of signal autocorrelation, and field dEchoLevel contains their difference.

The current echo detection algorithm is based on an echo compensator. In this variant, fields dEchoPower and dSpeechPower contain the energy values of the echo and of the initial signal, and field dEchoLevel is equal to the ratio of these two fields multiplied by 100 (so that the value is a percentage).

Field dNSamples stores the number of processed signal samples, and field dTime stores the duration of the processed signal in seconds.

There are two builds of the library, which implement different ways of detecting echo. One uses correlation, the other an echo canceller. The first option is faster but detects echo less accurately. In addition, in the first variant the values returned by the detector are coefficients of signal similarity, whereas in the second they are signal energy values.

Sample code of the library usage

As an example we provide the source code of the main file /Sample/sptLibTest/mains/SPTTest.cpp, a batch processor for detecting voice impairments in sound files. It detects different impairments and prints out where they occur in the time domain.

// Standard header files
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/timeb.h>
#include <time.h>

#ifdef WIN32
#include <io.h>
#endif

#ifndef WIN32
#include <dirent.h>
#endif

// Threshold values to output impairments into a log file (user defined)
#define cmFlySNRThresh 10
#define cmFrameClpThresh 0
#define cmWideFrameClpThresh 0
#define cmEchoThresh 0.22

// Header files of the SpeechTool
#include "../../DLL/include/CSpTErrors.h"
#include "../../DLL/include/CSSpeechTool.h"

// Header files to work with audio recordings
#include "../../Wave/WavFiles.h"
#include "../../Wave/CSmtSamples.h"

// Array of processors names
const char * pCntNames[] =
{
    "SNRCalculator",    // SNR processor
    "ClippingDetector", // Clipping detection processor
    "EchoDetector",     // Echo detection processor
    "UnKnown",          // Unknown processor
};

// Structure for processor list element
struct TSContextsListItem
{
    CSContext * pContext;
    double dDuration;
    TSContextsListItem * pNext;
};

// List of connected processors
TSContextsListItem * pContextsList = NULL;

// Log file
FILE * iPRepFile;

// String transcoding from char into wchar_t
static wchar_t * char2wchar(const char * aPStr, wchar_t * aPWStr, int aLen)
{
    int i, l;

    if (!aPStr) return(NULL);
    l = (int)strlen(aPStr);
    if (l >= aLen) return(NULL);
    for(i = 0; i < l + 1; i++) aPWStr[i] = wchar_t(aPStr[i]);
    return(aPWStr);
}

// Creating processor with set functional identifiers
int CreateProcessor(char * aPProcName)
{
    TSContextsListItem * iPNewLI;

    iPNewLI = new TSContextsListItem();
    iPNewLI->dDuration = 0.0;
    if (strcmp(aPProcName, "snr") == 0)
    {
        // SNR calculator
        iPNewLI->pContext = CSST_CreateProcessor(esctSNRCalculator);
        if (!iPNewLI->pContext)
        {
            delete(iPNewLI);
            return(2);
        }
        CSST_SetSampleRate(iPNewLI->pContext, 8000);
        CSST_SetNChannels(iPNewLI->pContext, 1);
        //
        iPNewLI->pNext = pContextsList;
        pContextsList = iPNewLI;
        return(0);
    }
    else if (strcmp(aPProcName, "clipp") == 0)
    {
        // Clipping detector
        iPNewLI->pContext = CSST_CreateProcessor(esctClippingDetector);
        if (!iPNewLI->pContext)
        {
            delete(iPNewLI);
            return(3);
        }
        CSST_SetSampleRate(iPNewLI->pContext, 8000);
        CSST_SetNChannels(iPNewLI->pContext, 1);
        //
        iPNewLI->pNext = pContextsList;
        pContextsList = iPNewLI;
        return(0);
    }
    else if (strcmp(aPProcName, "echo") == 0)
    {
        // Echo detector
        iPNewLI->pContext = CSST_CreateProcessor(esctEchoDetector);
        if (!iPNewLI->pContext)
        {
            delete(iPNewLI);
            return(4);
        }
        CSST_SetSampleRate(iPNewLI->pContext, 8000);
        CSST_SetNChannels(iPNewLI->pContext, 1);
        //
        iPNewLI->pNext = pContextsList;
        pContextsList = iPNewLI;
        return(0);
    }
    delete(iPNewLI);
    return(1);
}

// Defines minimal processing step
long CalculateMinStep(void)
{
    TSContextsListItem * iPNewLI;
    long iRet, iTmp;

    iRet = CSST_GetFrameSize(pContextsList->pContext);
    for(iPNewLI=pContextsList->pNext; iPNewLI; iPNewLI=iPNewLI->pNext)
    {
        iTmp = CSST_GetFrameSize(iPNewLI->pContext);
        if (iRet > iTmp) iRet = iTmp;
    }
    return(iRet);
}

// Sound processing
int ProcessWave(FILE * aPRepFile, short * aPSamples, long iSize)
{
    TSContextsListItem * iPNewLI;
    clock_t iStartTime;
    clock_t iFinishTime;
    double iDuration;

    for(iPNewLI=pContextsList; iPNewLI; iPNewLI=iPNewLI->pNext)
    {
        iStartTime = clock();
        if (CSST_PutSound(iPNewLI->pContext, aPSamples, iSize) != iSize)
            return(1);
        iFinishTime = clock();
        iDuration = (double)(iFinishTime - iStartTime) / CLOCKS_PER_SEC;
        iPNewLI->dDuration += iDuration;
    }
    return(0);
}

// Prints into log file at each frame if impairment was detected
int PrintfFlyReport(FILE * aPRepFile, long aPos, double aTime)
{
    TSContextsListItem * iPWrkLI;
    TSResult iPrRes;
    bool isPrinted = false;

    for(iPWrkLI=pContextsList; iPWrkLI; iPWrkLI=iPWrkLI->pNext)
    {
        iPrRes = CSST_GetResult(iPWrkLI->pContext, 0);
        //
        if (iPrRes.isValid)
        {
            switch(iPrRes.dRType)
            {
                case esctSNRCalculator : // SNR calculator
                    if (iPrRes.dResult.dSNR.dFlySNR < cmFlySNRThresh)
                    {
                        if (!isPrinted) fprintf(aPRepFile, "Current File Position is %li (%lf sec)\n", aPos, aTime);
                        fprintf(aPRepFile, "dSNR = %lf\n", iPrRes.dResult.dSNR.dSNR);
                        fprintf(aPRepFile, "dFlySNR = %lf\n", iPrRes.dResult.dSNR.dFlySNR);
                        isPrinted = true;
                    }
                    break;
                case esctClippingDetector : // Clipping detection
                    if ((iPrRes.dResult.dClipping.dFrameClpLevel > cmFrameClpThresh)||(iPrRes.dResult.dClipping.dFrameClpLevelWide > cmWideFrameClpThresh))
                    {
                        if (!isPrinted) fprintf(aPRepFile, "Current File Position is %li (%lf sec)\n", aPos, aTime);
                        fprintf(aPRepFile, "dFrameClpLevel = %lf\n", iPrRes.dResult.dClipping.dFrameClpLevel);
                        fprintf(aPRepFile, "dFrameClpLevelWide = %lf\n", iPrRes.dResult.dClipping.dFrameClpLevelWide);
                        isPrinted = true;
                    }
                    break;
                case esctEchoDetector : // Echo detector
                    if (iPrRes.dResult.dEcho.dEchoLevel > cmEchoThresh)
                    {
                        if (!isPrinted) fprintf(aPRepFile, "Current File Position is %li (%lf sec)\n", aPos, aTime);
                        fprintf(aPRepFile, "dEchoPower = %lf\n", iPrRes.dResult.dEcho.dEchoPower);
                        fprintf(aPRepFile, "dSpeechPower = %lf\n", iPrRes.dResult.dEcho.dSpeechPower);
                        fprintf(aPRepFile, "dEchoLevel = %lf\n", iPrRes.dResult.dEcho.dEchoLevel);
                        isPrinted = true;
                    }
                    break;
                case esctUnKnown : // Unknown processor
                default:
                    break;
            };
        }
    }
    return(0);
}

// Prints final report into log file
int PrintfFinalReport(FILE * aPRepFile)
{
    TSContextsListItem * iPWrkLI;
    TSResult iPrRes;

    fprintf(aPRepFile,"\n");
    for(iPWrkLI=pContextsList; iPWrkLI; iPWrkLI=iPWrkLI->pNext)
    {
        iPrRes = CSST_GetResult(iPWrkLI->pContext, 0);
        fprintf(aPRepFile, "Processor type : %i (%s)\n", iPWrkLI->pContext->dCType, pCntNames[iPWrkLI->pContext->dCType]);
        if (!iPrRes.isValid) fprintf(aPRepFile, "Not resulted (%i)\n", GetLastErrorCode());
        else
        {
            switch(iPrRes.dRType)
            {
                case esctSNRCalculator : // SNR Calculator
                    fprintf(aPRepFile, "dSNR = %lf\n", iPrRes.dResult.dSNR.dSNR);
                    fprintf(aPRepFile, "dEnergy = %lf\n", iPrRes.dResult.dSNR.dEnergy);
                    break;
                case esctClippingDetector : // Clipping detector
                    fprintf(aPRepFile, "dClpLevel = %lf\n", iPrRes.dResult.dClipping.dClpLevel);
                    fprintf(aPRepFile, "dClpLevelWide = %lf\n", iPrRes.dResult.dClipping.dClpLevelWide);
                    break;
                case esctEchoDetector : // Echo detector
                    fprintf(aPRepFile, "dEchoPower = %lf\n", iPrRes.dResult.dEcho.dEchoPower);
                    fprintf(aPRepFile, "dSpeechPower = %lf\n", iPrRes.dResult.dEcho.dSpeechPower);
                    fprintf(aPRepFile, "dEchoLevel = %lf\n", iPrRes.dResult.dEcho.dEchoLevel);
                    break;
                case esctUnKnown : // Unknown processor
                default:
                    break;
            };
        }
    }
    return(0);
}

// Prints processors performance
int PrintfTimingReport(FILE * aPRepFile, long aDataSize)
{
    TSContextsListItem * iPNewLI;
    double iDataLens;
    double iSpeedCoeff;

    // Duration of the audio in seconds at 8000 samples per second
    iDataLens = double(aDataSize) / 8000;
    for(iPNewLI=pContextsList; iPNewLI; iPNewLI=iPNewLI->pNext)
    {
        fprintf(aPRepFile, "Processor type : %i (%s)\n", iPNewLI->pContext->dCType, pCntNames[iPNewLI->pContext->dCType]);
        if (iPNewLI->dDuration > 0.000001)
        {
            iSpeedCoeff = iDataLens / iPNewLI->dDuration;
            fprintf(aPRepFile, "RealTime Coefficient = %lf\n", iSpeedCoeff);
        }
        else
            fprintf(aPRepFile, "Too fast to calculate!\n");
    }
    fprintf(aPRepFile,"\n\n\n");
    return(0);
}

// Release all processors
int ReleaseProcessors(void)
{
    TSContextsListItem * iPWrkLI;

    while(pContextsList)
    {
        iPWrkLI = pContextsList;
        pContextsList = pContextsList->pNext;
        ReleaseProcessor(iPWrkLI->pContext);
        delete(iPWrkLI);
    }
    return(0);
}

// Processing single audio file
int ProcessOneFile(char * aWaveName, int argc, char * argv[])
{
    CSmartSamples iInData;
    short * iPDataShort;
    long i, iPos, iStep, iEOFData, iSize;
    double iTime;

    fprintf(iPRepFile, "File : '%s'\n", aWaveName);

    // Reading audio
    if (iInData.ReadFromFile(aWaveName, 0) != 0)
    {
        iInData.Reset();
        printf("Cannot open input file!\n");
        fprintf(iPRepFile, "Cannot open input file!\n");
        return(0);
    }

    // Creating processors
    for(i=4; i<argc; i++)
    {
        if (CreateProcessor(argv[i]) != 0)
        {
            printf("Cannot create processor '%s'!\n", argv[i]);
            fprintf(iPRepFile, "Cannot create processor '%s'!\n", argv[i]);
            // Release the processors created so far. The log file is left
            // open because main() still writes to it and closes it.
            ReleaseProcessors();
            iInData.Reset();
            return(0);
        }
    }

    // Preparing sound data
    iEOFData = iInData.GetNSamples();
    iPDataShort = (short *)iInData.GetPSamplesArray(esstShort);

    // Define minimal processing step
    iStep = CalculateMinStep();

    // Parse audio data
    for(iPos=0; iPos<iEOFData; iPos+=iStep)
    {
        // Define size of processed block
        if ((iPos + iStep) <= iEOFData) iSize = iStep;
        else iSize = iEOFData - iPos;
        iTime = double(iPos) / 8000.0;

        // Process sound
        ProcessWave(iPRepFile, iPDataShort + iPos, iSize);

        // Output of per frame report
        PrintfFlyReport(iPRepFile, iPos, iTime);
    }

    // Output of final report
    PrintfFinalReport(iPRepFile);

    // Print processors' performance
    PrintfTimingReport(iPRepFile, iEOFData);

    // Release everything
    ReleaseProcessors();
    iInData.Reset();
    return(0);
}

// Function main()
int main(int argc, char * argv[])
{
    wchar_t iLicName[2048];
    // Variables shared by the Windows and Linux folder scans
    static char dFullPathString[4096];
    int dNFiles, done;

    // Checking number of arguments
    if (argc < 5)
    {
        printf("Usage:\n");
        printf("spttest <licfile> <pathtowavs> <repfile> <procid00> [[procid01] ...[procidNN]]\n");
        printf("<licfile> : Set the licenses file name;\n");
        printf("<pathtowavs> : Set the path to wave files;\n");
        printf("<repfile> : Set the output text file name;\n");
        printf("<procidXX> : Set the sound processor.\n");
        printf(" : snr\n");
        printf(" : clipp\n");
        printf(" : echo\n");
        return(0);
    }

    // Library initialization
    char2wchar(argv[1], iLicName, 2048);
    if (!CSST_InitLib(iLicName))
    {
        printf("Licenses data failed!\n");
        return(0);
    }

    // Preparing log file
    iPRepFile = fopen(argv[3], "w+t");
    if (!iPRepFile)
    {
        printf("Cannot open output file!\n");
        CSST_ReleaseLib();
        return(0);
    }

#ifdef WIN32
    //
    // Windows related
    //
    struct _finddata_t ffblk;
    intptr_t handle; // _findfirst returns intptr_t in current C runtimes

    // Filling in string with path
    sprintf(dFullPathString, "%s\\*.wav", argv[2]);

    // Take first .wav file
    dNFiles = done = 0;
    handle = _findfirst(dFullPathString, &ffblk);

    // Scanning folder containing .wav files
    while ((handle != -1)&&(done == 0))
    {
        if ((ffblk.attrib & _A_SUBDIR) == 0)
        {
            sprintf(dFullPathString, "%s\\%s", argv[2], ffblk.name);
            ProcessOneFile(dFullPathString, argc, argv);
            dNFiles++;
            printf("%i files are processed!\r", dNFiles);
        }
        done = _findnext(handle, &ffblk);
    }
    if (handle != -1) _findclose(handle);
#else
    //
    // Linux related
    //
    struct dirent * iDirEntP;
    DIR * iDirP;

    iDirP = opendir(argv[2]);
    if (iDirP == NULL)
    {
        fclose(iPRepFile);
        CSST_ReleaseLib();
        return(0);
    }

    // Scanning folder with .wav files
    for(dNFiles = done = 0; ; done++)
    {
        iDirEntP = readdir(iDirP);
        if (iDirEntP == NULL) break;
        if (strstr(iDirEntP->d_name, ".wav"))
        {
            sprintf(dFullPathString, "%s/%s", argv[2], iDirEntP->d_name);
            ProcessOneFile(dFullPathString, argc, argv);
            dNFiles++;
        }
    }
    closedir(iDirP);
#endif

    printf("%i files are processed!\n", dNFiles);
    fclose(iPRepFile);

    // Releasing the library
    CSST_ReleaseLib();
    return(0);
}

Sample source code

Delivery:

- Header files to work with audio (/Sample/Wave/). Classes implemented within the sample source code are not allowed for commercial use and are not part of the product.

- Header files of the library (/Sample/DLL/Include/). File CSpTErrors.h contains the error codes and file CSSpeechTool.h contains the declarations of all library functions.

- Batch file CompileSptLibTest.bat (/Sample/), included in the archive to ease compilation of the sample code.

- Library binary libSpeechTool_static.a (/Sample/).

- Compiled sample application spt-lib-test (/Sample/).

- Sound file for testing, clean.wav (/Sample/).

- Batch file for a test run, 2.bat (/Sample/).

Compilation

The sample source code can be compiled by invoking the batch file CompileSptLibTest.bat, which contains the following line:

g++ -O2 ./sptLibTest/mains/SPTTest.cpp ./libSpeechTool_static.a -o ./spt-lib-test

The compilation produces a program that finds the places in audio files where impairments detectable by the library occur. Run the program with the following command line:

spt-lib-test <lic-file> <path-to-wavs> <rep-file> <proc-id-00> [[proc-id-01] ... [proc-id-NN]]



where

<lic-file> - license file, which is delivered by the vendor

<path-to-wavs> - path to the folder containing .wav files

<rep-file> - log/report file name

<proc-id-00> [[proc-id-01] ... [proc-id-NN]] - list of processor identifiers; currently the library supports the following processors:

snr - signal/noise ratio processor

clipp - clipping level calculating processor

echo - echo level calculating processor
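The sample compares processor identifiers with strcmp, so they are case sensitive and must be given in lower case. A minimal sketch of that mapping is shown below; the enum DemoProcessor and the function ParseProcessorId are hypothetical names used only for illustration.

```cpp
#include <cstring>

// Hypothetical local enum for the identifiers accepted on the command line.
enum DemoProcessor { demoProcSnr, demoProcClipping, demoProcEcho, demoProcUnknown };

// Exact, case-sensitive match, as strcmp does in the sample.
DemoProcessor ParseProcessorId(const char* id) {
    if (!id) return demoProcUnknown;
    if (std::strcmp(id, "snr") == 0) return demoProcSnr;
    if (std::strcmp(id, "clipp") == 0) return demoProcClipping;
    if (std::strcmp(id, "echo") == 0) return demoProcEcho;
    return demoProcUnknown;
}
```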


To measure the performance of the different detectors, a program called Performance.cpp was developed.

The program can be compiled with either variant of the library (Performance-test-corr is built with the library that uses the correlation-based algorithm, Performance-test-echo with the library that uses the echo canceller).

The command line invocation of the software is:

Performance-test-XXXX <proc-id> <ch-1-wave> [ch-2-wave]

where:

<proc-id>   : the ID of the sound processor:
            : snr
            : clipp
            : echo
            : click
            : stuck
<ch-1-wave> : path of channel 1 wave file;
[ch-2-wave] : path of channel 2 wave file.

The parameters passed to the program are the ID of the detector to run and one or two audio data files. Only the echo detector needs two audio data files; all other detectors operate on a single file.

After the program starts, it performs a series of performance measurements and calculates their mean and variance. The resulting performance value applies to a single processor core of the computer that executed the software.
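The statistics involved can be sketched as follows. The real-time coefficient follows the sample's timing report (audio duration divided by processing time); the function names are hypothetical, and the use of population variance is an assumption, since the text does not say which variance Performance.cpp computes.

```cpp
#include <vector>

struct Stats { double mean; double variance; };

// Real-time coefficient as in the sample's timing report:
// duration of the audio divided by the time spent processing it.
double RealTimeCoefficient(long nSamples, double sampleRate, double processingSeconds) {
    return (nSamples / sampleRate) / processingSeconds;
}

// Mean and population variance of a series of measurements.
// ASSUMPTION: the text does not say whether population or sample variance is used.
Stats MeanAndVariance(const std::vector<double>& values) {
    Stats s = {0.0, 0.0};
    if (values.empty()) return s;
    for (double v : values) s.mean += v;
    s.mean /= values.size();
    for (double v : values) s.variance += (v - s.mean) * (v - s.mean);
    s.variance /= values.size();
    return s;
}
```

For example, 80000 samples at 8000 Hz processed in 0.5 seconds give a coefficient of 20, meaning the processor runs 20 times faster than real time on that core.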

