Transcript
Page 1: Classifying Motion Picture Audio

Classifying Motion Picture Audio

Eirik Gustavsen
07.06.07

Page 2: Classifying Motion Picture Audio

Outline

• Motivation
• Thesis
• State of the Art
• Proposed system
• Experimental setup
• Results
• Future work
• Conclusion

Page 3: Classifying Motion Picture Audio

Motivation

• Most projects classify clear classes or classes with noise.

• Few clear boundaries in motion picture audio
• Subjective descriptions of movies
• Difficult to compare movie content

Page 4: Classifying Motion Picture Audio

Thesis

It is possible to automatically create a table of contents of a motion picture, based on its audio track only.

Page 5: Classifying Motion Picture Audio

Research questions

• Find best LLDs to classify motion picture audio

• Detect boundaries between audio classes within complex audio segments

• Automatically create a TOC based on the audio track only

Page 6: Classifying Motion Picture Audio

Pre-Processing

44100 Hz sample rate
Mono
16 bits

30 ms windows (LW)
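The windowing step can be sketched as follows. This is a minimal illustration, assuming consecutive non-overlapping windows; the slides do not say whether the windows overlap, and `frame_signal` is a hypothetical helper name.

```python
def frame_signal(samples, sample_rate=44100, window_ms=30):
    """Split a mono sample sequence into consecutive windows.

    Assumption: windows do not overlap and a trailing partial
    window is dropped. The slides do not specify either choice.
    """
    window_len = int(sample_rate * window_ms / 1000)  # 1323 samples at 44.1 kHz
    n_frames = len(samples) // window_len
    return [samples[i * window_len:(i + 1) * window_len] for i in range(n_frames)]


# One second of silent audio gives 33 full 30 ms windows.
frames = frame_signal([0] * 44100)
```

Each window then becomes one observation for the descriptor extraction stage.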

Page 7: Classifying Motion Picture Audio

Low Level Descriptors

Time domain Frequency domain

Page 8: Classifying Motion Picture Audio

Low Level Descriptors

• Total of 23 low level descriptors

TIME DOMAIN

• Audio Power
• Audio Wave Form
• Root-Mean Square
• Short Time Energy
• Low Short Time Energy Ratio
• Zero-Crossing Rate
• High Zero-Crossing Rate Ratio

FREQUENCY DOMAIN

• Audio Spectrum Centroid
• Fundamental Frequency
• 10 Mel-Frequency Cepstral Coefficients
• Spectrum Flux
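A few of the descriptors listed above can be computed per window as in the sketch below. These are common textbook formulations, not necessarily the exact definitions used in the thesis; in particular, `spectrum_centroid` here is a plain magnitude-weighted mean frequency on a linear scale, whereas the MPEG-7 Audio Spectrum Centroid uses a log-frequency power spectrum. All function names are illustrative.

```python
import numpy as np


def short_time_energy(frame):
    """Mean squared amplitude of one window (frame is a float array)."""
    return float(np.mean(frame ** 2))


def root_mean_square(frame):
    """Square root of the mean squared amplitude."""
    return float(np.sqrt(np.mean(frame ** 2)))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(frame)
    signs[signs == 0] = 1  # assumption: treat exact zeros as positive
    return float(np.mean(signs[1:] != signs[:-1]))


def spectrum_centroid(frame, sample_rate=44100):
    """Magnitude-weighted mean frequency in Hz (simplified, linear scale)."""
    magnitudes = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = magnitudes.sum()
    return float((freqs * magnitudes).sum() / total) if total > 0 else 0.0


# A 1 kHz sine in one 30 ms window (1323 samples at 44.1 kHz).
t = np.arange(1323) / 44100
frame = np.sin(2 * np.pi * 1000 * t)
centroid = spectrum_centroid(frame)
```

For the pure tone above, the centroid lands close to 1000 Hz and the RMS close to 0.707, which is a quick sanity check before running the functions on real audio.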

Page 9: Classifying Motion Picture Audio

Dimensionality reduction

Principal components analysis (PCA) is a technique used to reduce multidimensional data sets to lower dimensions for analysis.

f(1), f(2), f(3), f(4), f(5), ..., f(23) → PCA → d(1), d(2), d(3)
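The reduction from 23 descriptor values to 3 components can be sketched with a singular value decomposition. This is a generic PCA illustration on placeholder random data, not the thesis implementation; `pca_reduce` is a hypothetical name.

```python
import numpy as np


def pca_reduce(features, n_components=3):
    """Project rows of `features` (n_frames x n_descriptors)
    onto the top `n_components` principal directions."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 23))  # placeholder data, not real descriptors
reduced = pca_reduce(features)         # shape (200, 3)
```

Because descriptors such as energy and frequency live on very different scales, normalizing each column before the projection usually matters; the system diagram places a normalization step ahead of PCA.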

Page 10: Classifying Motion Picture Audio

K Nearest Neighbors
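A K nearest neighbors classifier assigns a window the majority label among its k closest training points. The sketch below assumes Euclidean distance and k = 3; the slides state neither, and the training points are invented for illustration.

```python
from collections import Counter
import math


def knn_classify(query, training_points, training_labels, k=3):
    """Return the majority label among the k nearest training points.

    Assumptions: Euclidean distance and k=3 are illustrative choices.
    """
    distances = sorted(
        (math.dist(query, point), label)
        for point, label in zip(training_points, training_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 3-D points, e.g. the three components left after PCA.
points = [(0.0, 0.1, 0.0), (0.2, 0.0, 0.1), (0.1, 0.2, 0.0),
          (5.0, 5.1, 4.9), (5.2, 4.8, 5.0), (4.9, 5.0, 5.1)]
labels = ["speech", "speech", "speech", "music", "music", "music"]
predicted = knn_classify((0.1, 0.1, 0.1), points, labels)
```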

Page 11: Classifying Motion Picture Audio

Proposed system

Pre-processing → LLD → Norm → PCA → KNN → Post-processing → TOC generation

Page 12: Classifying Motion Picture Audio

Classifying Audio

Speech

Noise (white)

Music

”Silence”

Mixed audio classes

Page 13: Classifying Motion Picture Audio

Class Boundary Detection

Page 14: Classifying Motion Picture Audio

Class Boundary Detection

Page 15: Classifying Motion Picture Audio

Class Boundary Detection

Page 16: Classifying Motion Picture Audio

Finding most suitable LLDs

Most Suitable:

• ASC
• AWF
• RMS
• HZCRR

Page 17: Classifying Motion Picture Audio

Sample Results

Music with low volume

Clear speech

Speech with background environmental sounds

Fading between music and speech

Speech with Background music

Jingle

”Some mistakes”

Page 18: Classifying Motion Picture Audio

Future Work

• To be done in this thesis
  – Post processing
  – TOC

• Open research questions for future work
  – New motion picture audio classes
  – Detecting sound objects
  – Speech recognition

Page 19: Classifying Motion Picture Audio

Conclusion

• Pre-processing makes it possible to classify motion picture audio correctly

• Using the right combination of LLDs improves the classification result

Page 20: Classifying Motion Picture Audio

Questions

?

