Functional Data Analysis for Speech Research Michele Gubian Radboud University Nijmegen The Netherlands London, March 24 th 2010 Cambridge, March 26 th 2010

Dec 22, 2015

Page 1:

Functional Data Analysis for Speech Research

Michele Gubian
Radboud University Nijmegen, The Netherlands
London, March 24th 2010
Cambridge, March 26th 2010

Page 2:

Content

What and why: Functional Data Analysis (FDA)

Motivation

Case study 1

Case study 2 – pitch re-synthesis

How to use FDA

Using the R package ‘fda’

Page 3:

Motivation

Page 4:

Analyzing curves

PCA

ANOVA

Linear models

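[Figure: sampled curve (x marks); table of extracted features with columns dur (58, 48, 98) and ext (2.8, 3.8, 2.9); plot with axes dur and ext]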

Page 5:

Problems

[Figure: sampled curve (x marks); plot with axes dur and ext]

Decide what the important features of a curve are, using

models

intuition / trial and error

However

Those features may not capture all the relevant dynamic aspects

e.g. concavity/convexity

long-range correlations

Page 6:

Problems (2)

[Figure: sampled curve (x marks); plot with axes dur and ext]

Identify those feature points

manually

(semi)automatically

However

The identification may be hard, even ill-posed

time consuming

risk of subjective judgment

Page 7:

Analyzing curves with FDA

[Figure: sampled curve (x marks); plot with axes dur and ext; label "Functional Data Analysis"]

Page 8:

Analyzing curves with FDA

All the information contained in the curve (dynamics) is used

No need to reduce a curve to a set of significant features

No need to introduce assumptions about what is relevant in a curve's shape and what is not

FDA provides both VISUAL and QUANTITATIVE results

input is curves, output is also curves

plus classic statistical output like p-values, confidence intervals

Page 9:

Functional Data Analysis: an extension of (some) statistical techniques to the domain of functions

Example

CLASSIC

Ask people: How old are you? How much do you earn?

Each data point is a point in 2D

[Figure: scatter plot of salary against age]

FDA

Record people's salaries through the years

Each “data point” is a whole CURVE

[Figure: salary curves plotted against age]

Page 10:

Case study

Page 11:

Diphthong vs. hiatus in Spanish

/ja/ vs. /i.a/ contrast is unstable in European Spanish

Diachronically, in Romance languages /i.a/ becomes /ja/

Diatopically, in Latin American Spanish the contrast seems to be lost

It is not present in orthography (“ia” in either case)

No strict minimal pairs

Investigate

Consistent realization of the contrast

Inter-speaker variation

Cues used in the realization

Page 12:

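Cues        DIPHTHONG /ja/    HIATUS /i.a/
Duration    short             long
Formants    f1, f2            f1, f2
Pitch       f0                f0

(The formant and pitch cells held sketches of the f1/f2 and f0 trajectories for each category.)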

Page 13:

Example diphthong

Page 14:

Example hiatus

Page 15:

Dataset

Read speech

Diphthong

‘Emiliana no, …’ /e.mi.lja.na#no#.../ (‘Not Emiliana, …’)

Hiatus

‘Mi liana no, …’ /mi#li.a.na#no#.../ (‘Not my liana, …’)

9 speakers (gender balanced)

20 repetitions per speaker per type

In total 365 utterances

Page 16:

Duration

Page 17:

Pitch

Pitch was extracted from the beginning of /l/ to the end of the rising gesture

In Spanish, the peak of the pitch rise falls beyond the accented syllable

[Figure: pitch contours aligned with the segments "lja" and "li a"]

Page 18:

The raw data

[Figure: raw curves by speaker, /ja/ vs /i.a/]

Page 19:

FDA data preparation

Each sampled curve has to be turned into a function

Decide how much detail to retain (smoothing)

Page 20:

FDA data preparation (2)

All functions will be obtained as a combination of so-called basis functions, usually B-splines

All functions will be linearly stretched in time so that they have equal duration

[Figure: functional representation of a sampled curve, built from B-spline basis functions]
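The two data-preparation steps above (linear time stretching and representation in a B-spline basis) can be sketched as follows. This is only an illustration in plain NumPy, not the API of the R package 'fda' used in the talk. The function names (`bspline_basis`, `fit_curve`, `evaluate`), the toy f0 values and the choice of 8 cubic basis functions are all assumptions. Here the amount of detail retained is controlled simply by `n_basis`; fewer basis functions give a smoother curve.

```python
import numpy as np

def bspline_basis(t, n_basis, degree=3):
    """Evaluate n_basis clamped B-splines on points t in [0, 1] (Cox-de Boor)."""
    t = np.asarray(t, dtype=float)
    interior = np.linspace(0.0, 1.0, n_basis - degree + 1)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    # Degree 0: indicator of each knot interval (right end closed at t == 1).
    B = np.zeros((t.size, knots.size - 1))
    for i in range(knots.size - 1):
        B[:, i] = (knots[i] <= t) & (t < knots[i + 1])
    B[t == 1.0, n_basis - 1] = 1.0
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        B_next = np.zeros((t.size, B.shape[1] - 1))
        for i in range(B.shape[1] - 1):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                B_next[:, i] += (t - knots[i]) / left * B[:, i]
            if right > 0:
                B_next[:, i] += (knots[i + d + 1] - t) / right * B[:, i + 1]
        B = B_next
    return B

def fit_curve(times, values, n_basis=8):
    """Stretch time linearly to [0, 1], then least-squares fit B-spline coefficients."""
    t = np.asarray(times, dtype=float)
    t_norm = (t - t[0]) / (t[-1] - t[0])
    B = bspline_basis(t_norm, n_basis)
    coef, *_ = np.linalg.lstsq(B, np.asarray(values, dtype=float), rcond=None)
    return coef

def evaluate(coef, grid):
    """Evaluate the fitted function on a grid of normalised times."""
    return bspline_basis(grid, len(coef)) @ coef

# Hypothetical toy data: two noisy pitch contours of different duration.
rng = np.random.default_rng(0)
t_short = np.linspace(0.0, 0.18, 30)   # seconds
t_long = np.linspace(0.0, 0.31, 50)
f0_short = 120 + 30 * np.sin(np.pi * t_short / 0.18) + rng.normal(0, 1, 30)
f0_long = 120 + 30 * np.sin(np.pi * t_long / 0.31) + rng.normal(0, 1, 50)

grid = np.linspace(0.0, 1.0, 101)      # common normalised time axis
curves = np.vstack([
    evaluate(fit_curve(t_short, f0_short), grid),
    evaluate(fit_curve(t_long, f0_long), grid),
])
```

After this step every curve lives on the same normalised time axis, so a set of utterances becomes a matrix with one row per curve, which is the form the later sketches assume.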

Page 21:

Classic Principal Component Analysis (PCA)

[Figure: scatter plot of salary against age (25 to 65) with principal axes PC1 and PC2]

Page 22:

Functional PCA on pitch contours

Page 23:

Functional PCA on pitch contours

PCA does not know about labels !!
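A minimal sketch of why the labels play no role: functional PCA only sees the curves themselves. The example below assumes the curves have already been evaluated on a common, uniform grid of normalised time, and approximates functional PCA by an SVD of the centred curve matrix. That discretisation, the function name `functional_pca` and the toy contours are assumptions; the talk itself relies on the R package 'fda'.

```python
import numpy as np

def functional_pca(curves, n_components=2):
    """PCA of curves sampled on a common uniform grid (rows = curves)."""
    curves = np.asarray(curves, dtype=float)
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of Vt are orthonormal "principal component curves".
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    harmonics = Vt[:n_components]
    scores = centered @ harmonics.T          # one score per curve per PC
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return mean_curve, harmonics, scores, explained

# Hypothetical toy contours: two groups whose peak occurs early or late.
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 101)

def toy_contour(peak):
    return 100 + 40 * np.exp(-((grid - peak) / 0.2) ** 2)

early = [toy_contour(0.4 + rng.normal(0, 0.02)) for _ in range(20)]
late = [toy_contour(0.7 + rng.normal(0, 0.02)) for _ in range(20)]
curves = np.array(early + late)              # note: no label column anywhere

mean_curve, harmonics, scores, explained = functional_pca(curves)
```

Each principal component is itself a curve, and `mean_curve + c * harmonics[0]` for positive and negative `c` shows what kind of shape change PC1 describes. The group labels are only needed afterwards, to colour the scores.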

Page 24:

Functional PCA on pitch contours

PC1

Page 25:

Functional PCA on pitch contours

PC1

Page 26:

Functional PCA on pitch contours

PC2

Page 27:

Functional PCA on pitch contours

PC2

Page 28:

Functional PCA on formants

[Figure: panels labelled PC2, PC1, f1 and f2]

Page 29:

Functional PCA on formants

[Figure: two panels, each labelled PC1]
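One common way to run PCA on two-dimensional curves such as (f1, f2) is to concatenate the two sampled curves of each utterance into one long vector, so that each principal component has an f1 part and an f2 part that vary together. The sketch below assumes that approach, which the slides do not spell out. The toy formant tracks and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical toy formant tracks on a common normalised time grid.
rng = np.random.default_rng(2)
n_curves, n_points = 30, 51
grid = np.linspace(0.0, 1.0, n_points)
weight = rng.normal(0, 1, size=(n_curves, 1))   # shared source of variation
f1 = 300 + 400 * grid + 50 * weight * grid + rng.normal(0, 5, (n_curves, n_points))
f2 = 2200 - 800 * grid - 100 * weight * grid + rng.normal(0, 5, (n_curves, n_points))

# Concatenate f1 and f2 so that each row describes one two-dimensional curve.
joint = np.hstack([f1, f2])                     # shape (n_curves, 2 * n_points)
centered = joint - joint.mean(axis=0)
_, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Split the first principal component back into its f1 and f2 parts.
pc1_f1, pc1_f2 = Vt[0, :n_points], Vt[0, n_points:]
pc1_scores = centered @ Vt[0]
```

In this toy data f1 rises whenever f2 falls, so the two halves of PC1 have opposite signs. With real formants measured in Hz, f2 usually varies more than f1, so some rescaling of the two dimensions before concatenation may be worth considering.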

Page 30:

Cues coordination

[Figure: two panels, "Duration vs formants" and "Duration vs pitch"]

Page 31:

Summary

FDA provides tools to extract relevant dynamic characteristics of a set of curves

Traditional tools like PCA (and linear regression) are extended to curves

Functional PCA revealed the main dynamic cues used in the realization of a (weak) contrast in Spanish

Without using the label information

Without extracting features from the curves (e.g. peaks)

Combining multi-dimensional curves (formants) without effort

Page 32:

References

Functional Data Analysis website: www.functionaldata.org

Books: [book covers shown on the slide; titles not recoverable from the transcript]

Software: a bilingual (R and MATLAB) tool is freely available online

Page 33:

Appendix

Page 34:

Functional linear models

y(t) = a(t) + b(t) x

diphthong, x = 0

hiatus, x = 1

Confidence intervals for a(t) and b(t)

R²(t) = percentage of explained variance
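The model y(t) = a(t) + b(t) x with a 0/1 predictor can be illustrated by fitting an ordinary least-squares regression separately at every time sample. This pointwise fit is an assumption made to keep the sketch short and is not necessarily how the talk's software estimates the model. With x = 0 for diphthong and x = 1 for hiatus, a(t) is the mean diphthong curve and b(t) is the difference between the hiatus and diphthong means. The function name, the toy curves and the pointwise standard error `se_b` (from which a rough band such as b(t) ± 2·se_b(t) could be drawn) are all hypothetical.

```python
import numpy as np

def functional_linear_model(curves, x):
    """Pointwise least-squares fit of y(t) = a(t) + b(t) * x on a common grid."""
    Y = np.asarray(curves, dtype=float)             # (n_curves, n_points)
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(x.size), x])       # intercept and predictor
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)    # (2, n_points)
    a, b = coef
    residuals = Y - X @ coef
    sse = (residuals ** 2).sum(axis=0)
    sst = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    r2 = 1.0 - sse / sst                            # explained variance at each t
    # Pointwise standard error of b(t) from the usual OLS formula.
    sigma2 = sse / (x.size - 2)
    se_b = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return a, b, r2, se_b

# Hypothetical toy data: 20 curves with x = 0 and 20 curves with x = 1.
rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 101)
x = np.array([0] * 20 + [1] * 20)
base = 100 + 30 * np.sin(np.pi * grid)
shift = 15 * grid                                   # groups diverge over time
curves = base + np.outer(x, shift) + rng.normal(0, 3, size=(40, 101))

a, b, r2, se_b = functional_linear_model(curves, x)
```

In the toy data the two groups start together and drift apart, so b(t) grows along the curve and R²(t) is low at the start and high at the end. That is the kind of time-localised answer the functional model adds over a single scalar test.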