Digital Music & Music Processing
George Tzanetakis
PostDoctoral Fellow, Computer Science Department, Carnegie Mellon University
[email protected] http://www.cs.cmu.edu/~gtzan
Copyright Nov. 2002, George Tzanetakis
Dec 19, 2015
Overview
Music Information Retrieval (MIR) and Computer Audition
Motivation
Techniques
Applications
Computer Music and Sound Synthesis
Examples, demos
MIR Music History
Timeline: 9000 B.C., 1000, 1700, 1877, 1960, 2002
Music
4 million recorded CD tracks
4000 CDs / month
Mp3 bandwidth %
Global
Pervasive
Persistent
Why?
The future of MIR
Library of all recorded music
Tasks: organize, search, retrieve, classify, recommend, browse, listen, annotate
Examples:
Audio MIR Pipeline
Hearing: Representation (Signal Processing)
Understanding: Analysis (Machine Learning)
Reacting: Interaction (Human Computer Interaction)
Traditional Music Representations
Time domain waveform (axes: time, pressure)
Decompose into building blocks (axes: time, frequency)
MIDI
Musical Instrument Digital Interface
Hardware interface
File format
Note events: duration, discrete pitch, “instrument”
Extensions: General MIDI
Notation, OMR, continuous pitch
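A note event as listed above can be pictured as a small record. The sketch below is only an illustration: the field names and the NoteEvent class are hypothetical and do not reproduce the byte layout of the MIDI file format. The pitch-to-frequency conversion assumes the usual convention that MIDI note 69 is A4 at 440 Hz in equal temperament.

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    """Hypothetical in-memory note event (not the MIDI wire format)."""
    onset_s: float      # start time in seconds
    duration_s: float   # note length in seconds
    pitch: int          # discrete pitch as a MIDI note number, 0..127
    instrument: int     # program number, 0..127

def midi_to_hz(pitch):
    """Equal-tempered frequency, assuming note 69 = A4 = 440 Hz."""
    return 440.0 * 2 ** ((pitch - 69) / 12)

middle_c = NoteEvent(onset_s=0.0, duration_s=0.5, pitch=60, instrument=0)
middle_c_hz = midi_to_hz(middle_c.pitch)   # roughly 261.63 Hz
```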
Symbolic vs Audio MIR
Audio -> Polyphonic Transcription -> Symbolic Representation (MIDI) -> MIR
Audio -> Computer Audition -> MIR
Machine Learning Models
Feature extraction
Timbral Texture
Timbre = what differentiates sounds of the same loudness and pitch
Timbral Texture = what differentiates mixtures of sounds
Global, statistical and fuzzy properties
Spectrum
Figure: spectra M at successive frames t and t+1
Fourier Transform
P = 1/f (period P, frequency f)
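As a small worked illustration of P = 1/f, the sketch below computes a direct discrete Fourier transform of a test tone and reads the period off the strongest bin. The sample rate, the tone frequency, and the helper name dft_magnitudes are assumptions chosen for the example, not values from the slides.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitude of DFT bins 0..N/2 for a real signal x (direct O(N^2) sum)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

sample_rate = 8000   # Hz (assumed)
n_samples = 400      # bin spacing = 8000 / 400 = 20 Hz
tone_hz = 440.0      # assumed test tone; falls exactly on bin 22

signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(n_samples)]
mags = dft_magnitudes(signal)

peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * sample_rate / n_samples   # frequency f of the strongest bin
period_s = 1.0 / peak_hz                       # P = 1/f
```

A real system would use a fast Fourier transform rather than this direct sum; the direct form is kept here only because it mirrors the definition.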
Short Time Fourier Transform
STFT Filterbank interpretation
Figure labels: Filters, Oscillators, Amplitude, Frequency, output
Short Time Fourier Transform II
Figure: spectra M at successive frames t and t+1
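The idea of one spectrum per frame can be sketched as a short-time Fourier transform: slide a window over the signal and take the magnitude spectrum of each windowed frame. The Hann window, the window size, the hop size, and the test signal below are assumptions for illustration only.

```python
import cmath
import math

def stft_magnitudes(x, win_size, hop):
    """List of magnitude spectra (bins 0..win_size/2), one per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / win_size) for i in range(win_size)]
    frames = []
    for start in range(0, len(x) - win_size + 1, hop):
        seg = [x[start + i] * window[i] for i in range(win_size)]
        frames.append([
            abs(sum(seg[t] * cmath.exp(-2j * math.pi * k * t / win_size)
                    for t in range(win_size)))
            for k in range(win_size // 2 + 1)
        ])
    return frames

sample_rate = 8000   # Hz (assumed)
# Test signal: 500 Hz for the first 256 samples, then 1000 Hz.
x = [math.sin(2 * math.pi * (500 if t < 256 else 1000) * t / sample_rate)
     for t in range(512)]

frames = stft_magnitudes(x, win_size=64, hop=32)   # bin spacing = 125 Hz
first_peak = max(range(33), key=frames[0].__getitem__)    # bin of 500 Hz
last_peak = max(range(33), key=frames[-1].__getitem__)    # bin of 1000 Hz
```

The strongest bin moves from the first frame to the last, which is the time-varying information a single Fourier transform of the whole signal would blur together.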
Formants
From “Real Sound Synthesis for Interactive Applications”, P. Cook, A K Peters Press, used by permission
Linear Prediction Coefficients
Source (impulses @ f0, or white noise) -> Filter (lossless tubes) -> Speech
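One common way to estimate the filter part of a source-filter model is the autocorrelation method solved with the Levinson-Durbin recursion. The slides do not say which method is meant, so the sketch below is an assumption; the function name lpc and the toy signal are illustrative. Coefficients follow the convention that sample x[n] is predicted as the sum of a[j] * x[n-j].

```python
def lpc(x, order):
    """Predictor coefficients a[1..order] and residual energy (Levinson-Durbin)."""
    r = [sum(x[n] * x[n + k] for n in range(len(x) - k)) for k in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err

# Toy signal: impulse response of a one-pole filter, x[n] = 0.9 * x[n-1].
x = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(x, order=2)   # expect roughly [0.9, 0.0]
```

The recovered coefficients describe the filter; the source (impulse train or noise) is what remains after the prediction is subtracted.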
MPEG Audio Coding (mp3)
Analysis Filterbank
Psychoacoustic Model
Available bits
32 linearly spaced bands
Encoder: Slower, Complicated
Decoder: Faster, Simpler
Perceptual Audio Coding
Spectral Shape
Figure: spectrum M at frame t
Centroid, Rolloff, Flux, RMS, Moments…
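The spectral shape features named above can be sketched on a single magnitude spectrum. The definitions below follow common usage and are assumptions rather than the exact formulas behind the slides; in particular the 85% rolloff fraction and the normalization inside flux are illustrative choices.

```python
import math

def centroid(mag):
    """Magnitude-weighted mean bin index (center of gravity of the spectrum)."""
    total = sum(mag)
    return sum(k * m for k, m in enumerate(mag)) / total if total else 0.0

def rolloff(mag, fraction=0.85):
    """Lowest bin below which `fraction` of the total magnitude lies."""
    target = fraction * sum(mag)
    running = 0.0
    for k, m in enumerate(mag):
        running += m
        if running >= target:
            return k
    return len(mag) - 1

def flux(mag, prev_mag):
    """Squared difference between successive sum-normalized spectra."""
    def normalize(v):
        s = sum(v)
        return [a / s for a in v] if s else list(v)
    return sum((a - b) ** 2 for a, b in zip(normalize(mag), normalize(prev_mag)))

def rms(samples):
    """Root mean square of time-domain samples (a loudness proxy)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

prev_mag = [0.0, 1.0, 0.0, 0.0]   # toy spectrum at frame t-1
mag = [0.0, 0.0, 1.0, 1.0]        # toy spectrum at frame t
features = [centroid(mag), rolloff(mag), flux(mag, prev_mag), rms([1.0, -1.0, 1.0, -1.0])]
```

Collecting these numbers per frame gives the kind of feature vector the following slides build on.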
Summary of Timbral Texture Features
Time-Frequency analysis
Signal Processing (STFT, DWT)
Source-filter (LPC)
Perceptual (MP3)
Spectral Shape to feature vector
Pitch Content
Harmony-melody = pitch concepts
Music theory score = music
Bridge to symbolic MIR
Automatic music transcription
Non-transcriptive arguments
Automatic Pitch Detection
P = 1/f
Time-domain
Frequency-domain
Perceptual
Zero crossings
Autocorrelation analysis = peaks of the function correspond to dominant pitches
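Two of the time-domain ideas above can be sketched for a single clean tone: counting zero crossings, and picking the lag with the largest autocorrelation value. The search range, sample rate, and test tone are assumptions; real music needs more care (multiple pitches, noise, octave errors).

```python
import math

def zero_crossing_pitch(x, sample_rate):
    """Pitch estimate from sign changes: two crossings per period."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a < 0) != (b < 0))
    return crossings * sample_rate / (2.0 * len(x))

def autocorrelation_pitch(x, sample_rate, fmin=50.0, fmax=1000.0):
    """Pitch in Hz from the lag with the largest autocorrelation value."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(x) - 1)
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        val = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag   # f = 1/P with P = best_lag / sample_rate

sample_rate = 8000   # Hz (assumed)
# 200 Hz test tone with a small phase offset so no sample sits exactly on zero.
tone = [math.sin(2 * math.pi * 200 * t / sample_rate + 0.3) for t in range(800)]

zc_hz = zero_crossing_pitch(tone, sample_rate)
ac_hz = autocorrelation_pitch(tone, sample_rate)
```

Both estimates agree on a pure tone; on mixtures the autocorrelation function shows several peaks, one per dominant pitch.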
Pitch Histograms
Jazz, Irish
Chroma - folded
Height - unfolded
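The folded and unfolded views can be sketched by converting detected pitches to MIDI note numbers and counting them twice: once per note (height) and once per pitch class, note number modulo 12 (chroma). The conversion assumes A4 = 440 Hz = note 69; the function names and input frequencies are illustrative.

```python
import math

def hz_to_midi(f):
    """Nearest MIDI note number, assuming A4 = 440 Hz = note 69."""
    return round(12 * math.log2(f / 440.0) + 69)

def pitch_histograms(frequencies_hz):
    """Return (unfolded, folded) histograms of detected pitches."""
    unfolded = [0] * 128   # one bin per MIDI note (height)
    folded = [0] * 12      # one bin per pitch class (chroma)
    for f in frequencies_hz:
        n = hz_to_midi(f)
        if 0 <= n < 128:
            unfolded[n] += 1
            folded[n % 12] += 1
    return unfolded, folded

# Assumed detections: A4, A5, and middle C.
unfolded, folded = pitch_histograms([440.0, 880.0, 261.63])
```

Both A notes land in the same folded bin (pitch class 9) but in different unfolded bins (69 and 81).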
Automatic Music Transcription
Original, Transcribed
Mixture signal
Noise suppression
Predominant Pitch Estimation
Remove detected sound
Estimate # of voices
Rhythm
Movement in time
Origins in Poetry (iambic, trochaic)
Foot tapping definition
Hierarchical semi-periodic structure at multiple levels of detail
Links to motion, dance
Running vs global
Self similarity
DWT -> Envelope Extraction -> Autocorrelation -> Peak Picking -> Beat Histograms
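A simplified sketch of the envelope, autocorrelation, and peak-picking stages follows. It skips the DWT band decomposition entirely and works on the full-band signal, so it is a stand-in for the idea rather than the method on the slide; the block size, BPM range, and click-track input are assumptions.

```python
def envelope(x, block):
    """Full-wave rectify, then average each block (crude low-pass + downsample)."""
    env = [sum(abs(s) for s in x[i:i + block]) / block
           for i in range(0, len(x) - block + 1, block)]
    mean = sum(env) / len(env)
    return [e - mean for e in env]   # remove the mean before autocorrelation

def beat_histogram(env, env_rate, bpm_min=40, bpm_max=200, n_peaks=3):
    """Map the strongest autocorrelation peaks of the envelope to BPM bins."""
    min_lag = int(60 * env_rate / bpm_max)
    max_lag = int(60 * env_rate / bpm_min)
    ac = {lag: sum(env[i] * env[i + lag] for i in range(len(env) - lag))
          for lag in range(min_lag, max_lag + 1)}
    # Peak picking: positive local maxima, strongest first.
    peaks = [lag for lag in range(min_lag + 1, max_lag)
             if ac[lag] > 0 and ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]]
    peaks.sort(key=ac.get, reverse=True)
    hist = {}
    for lag in peaks[:n_peaks]:
        bpm = round(60 * env_rate / lag)
        hist[bpm] = hist.get(bpm, 0.0) + ac[lag]
    return hist

sample_rate = 1000   # Hz (assumed, kept low so the example runs quickly)
# Click track: one click every 0.5 s, i.e. 120 beats per minute, for 8 s.
clicks = [1.0 if t % 500 == 0 else 0.0 for t in range(8000)]

env = envelope(clicks, block=10)                    # envelope rate = 100 Hz
hist = beat_histogram(env, env_rate=sample_rate / 10)
strongest_bpm = max(hist, key=hist.get)
```

The histogram shows the main tempo and its half-tempo multiple, which is the hierarchical periodicity the Rhythm slide describes.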
Beat Histograms
Analysis
Classification
Segmentation
Similarity Retrieval
Clustering
Thumbnailing
Fingerprinting
Analysis Overview
Musical Piece
Trajectory
Point
Query-by-example Content-based Retrieval
Ranked list of k nearest neighbors
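A ranked list of nearest neighbors can be sketched once every piece is reduced to one feature vector. The Euclidean distance, the track names, and the two-dimensional vectors below are assumptions made up for the example.

```python
import math

def k_nearest(query, collection, k):
    """Return the k (name, distance) pairs closest to `query`, nearest first."""
    scored = [(name, math.dist(query, vec)) for name, vec in collection.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]

# Hypothetical collection: one feature vector per track.
collection = {
    "track_a": [0.0, 0.0],
    "track_b": [1.0, 0.0],
    "track_c": [5.0, 5.0],
}
ranked = k_nearest([0.9, 0.1], collection, k=2)
ranked_names = [name for name, _ in ranked]
```

The query itself is an audio example run through the same feature extraction, which is what makes the retrieval content-based.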
QBE examples
Rock: Beatles
Jazz: Bobby Hutcherson
Funk: Mano Negra
World: Tibetan singer
Computer Music: Paul Lansky
Query Match
Automatic Musical Genre Classification
Categorical music descriptions created by humans
Fuzzy boundaries
Statistical properties: timbral texture, rhythmic structure, harmonic content
Automatic musical genre classification
Evaluate musical content features
Structure audio collections
Genregram demo
Dynamic real-time visualization for classification of radio signals
Audio segmentation
Detect changes of audio texture
Multifeature automatic segmentation methodology
Time series of feature vector v(t)
Detect abrupt changes in trajectory
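Detecting abrupt changes in the trajectory can be sketched as thresholding the distance between successive feature vectors. The Euclidean distance and the threshold of three times the mean distance are assumptions chosen for the example, not the methodology's actual distance measure or peak selection.

```python
import math

def segment_boundaries(features, threshold_factor=3.0):
    """Frame indices t where v(t) jumps far from v(t-1)."""
    dists = [math.dist(features[t], features[t - 1]) for t in range(1, len(features))]
    mean = sum(dists) / len(dists)
    return [t for t, d in enumerate(dists, start=1) if d > threshold_factor * mean]

# Toy trajectory: ten frames near (0, 0), then ten frames near (5, 5).
features = [[0.01 * (t % 2), 0.0] for t in range(10)]
features += [[5.0 + 0.01 * (t % 2), 5.0] for t in range(10, 20)]

boundaries = segment_boundaries(features)   # the texture change at frame 10
```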
Context & Content Aware User Interfaces
Automatic results not perfect
Music listening is personal and subjective
Browsing vs retrieval
“Overview, zoom and filter, details on demand”, Shneiderman mantra
Adapt UI to music content and context
Computer Audition
Visualization
Content and Context
Content ~ file: genre, male voice, saxophone
Context ~ file, collection: similarity, slow-fast
Multiple visualizations: same content, different context
Timbregrams
Content & Context
Similarity + Time Structure
Principal Component Analysis
Map feature vectors to color
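Mapping feature vectors to color through Principal Component Analysis can be sketched by projecting each frame onto the first principal component and scaling the result to a gray level. Using only the first component, the power-iteration solver, and the 0 to 255 gray scale are all assumptions for illustration.

```python
import math

def first_principal_component(vectors, iterations=100):
    """Centered data and the dominant covariance eigenvector (power iteration)."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centered = [[v[j] - mean[j] for j in range(dim)] for v in vectors]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(dim)]
           for a in range(dim)]
    # Start vector of ones; this fails if it is orthogonal to the answer.
    w = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[a][b] * w[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        w = [x / norm for x in w]
    return centered, w

def timbregram_gray(vectors):
    """One gray level (0..255) per frame from the first principal component."""
    centered, w = first_principal_component(vectors)
    proj = [sum(c[j] * w[j] for j in range(len(w))) for c in centered]
    lo, hi = min(proj), max(proj)
    span = (hi - lo) or 1.0
    return [round(255 * (p - lo) / span) for p in proj]

# Toy feature vectors: two frames of one texture, two of another.
frames = [[0.0, 0.0], [0.0, 0.0], [4.0, 2.0], [4.0, 2.0]]
gray = timbregram_gray(frames)
```

Frames with similar content receive similar shades, so drawing the values left to right as stripes shows both similarity and time structure.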
Timbrespaces
Islands of Music
Auditory Scene Analysis