Birdsong Recognition
Chang-Hsing Lee (李建興), Professor, Department of Computer Science and Information Engineering, Chung Hua University
Automatic Classification of Bird Species From Their Sounds Using Two-Dimensional Cepstral Coefficients
Chang-Hsing Lee, Chin-Chuan Han, and Ching-Chien Chuang
IEEE Trans. on Audio, Speech, and Language Processing, Vol. 16, No. 8, Nov. 2008, pp. 1541-1550.
System Framework
Training: training syllable → feature extraction → PCA → LDA → prototype vector generation → feature database
Classification: test syllable → feature extraction → PCA transformation → LDA transformation → classification → classified bird species s_c
Feature Extraction
Two-dimensional Mel-frequency cepstral coefficients (TDMFCC)
[Figure: the MFCC vectors of successive frames (MFCC index vs. time) are transformed by a DCT into the TDMFCC matrix.]
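A minimal sketch of the idea, assuming the TDMFCC matrix is obtained by applying a DCT along the time axis of a syllable's MFCC sequence. The function names, the toy input, and the unnormalized DCT-II are assumptions made for illustration; the slide does not state the exact scaling or which coefficients are kept.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II of a real sequence x (scaling is an assumption)."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def tdmfcc(mfcc_frames):
    """mfcc_frames[t][q] is MFCC q of frame t.

    Returns out[q][k]: the k-th DCT coefficient of the time trajectory
    of cepstral index q.
    """
    n_coeffs = len(mfcc_frames[0])
    out = []
    for q in range(n_coeffs):
        trajectory = [frame[q] for frame in mfcc_frames]
        out.append(dct_ii(trajectory))
    return out


# Toy syllable: 4 frames, 2 MFCCs per frame (hypothetical values).
frames = [[1.0, 0.5], [1.0, -0.5], [1.0, 0.5], [1.0, -0.5]]
features = tdmfcc(frames)
```

In this sketch the k = 0 entry of each row is proportional to the time average of that coefficient, and larger k captures faster variation within the syllable.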
Feature Extraction (cont.)
Dynamic two-dimensional MFCC (DTDMFCC)
$$a_i(j) = \frac{\sum_{n=1}^{n_0} n\,\bigl(E_{i+n}(j) - E_{i-n}(j)\bigr)}{\sum_{n=-n_0}^{n_0} n^{2}}$$
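A minimal sketch of a regression (delta) computation of this kind, assuming the standard form in which the coefficient for frame i sums n · (E[i+n] − E[i−n]) over n = 1..n0 and divides by the sum of n² over n = −n0..n0. The function name, the default n0 = 2, and the clamping of indices at the sequence edges are assumptions of the sketch, not details taken from the slide.

```python
def delta_coefficients(E, n0=2):
    """E[i][j]: value j of frame i. Returns regression coefficients a[i][j]."""
    num_frames = len(E)
    num_values = len(E[0])
    # Sum of n^2 over n = -n0..n0 equals twice the sum over n = 1..n0.
    denom = 2 * sum(n * n for n in range(1, n0 + 1))
    deltas = []
    for i in range(num_frames):
        row = []
        for j in range(num_values):
            acc = 0.0
            for n in range(1, n0 + 1):
                # Assumption: frames beyond the edges repeat the edge frame.
                ahead = E[min(i + n, num_frames - 1)][j]
                behind = E[max(i - n, 0)][j]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


# A linear ramp has slope 1, so interior frames should give exactly 1.0.
ramp = [[float(i)] for i in range(6)]
d = delta_coefficients(ramp, n0=2)
```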
Prototype Vector Generation
Gaussian mixture model (GMM) vs. Vector quantization (VQ)
Acoustic Model Selection – Bayesian information criterion (BIC)
Component Number Selection – self-splitting Gaussian mixture learning (SGML)
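A minimal sketch of BIC-based model selection, assuming the common form BIC = −2 ln L + k ln N (lower is better), where L is the model likelihood, k the number of free parameters, and N the number of training samples. The paper may use a different sign or scaling convention, and the candidate names, log-likelihoods, and parameter counts below are hypothetical numbers for illustration only.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """Bayesian information criterion in the -2 ln L + k ln N form."""
    return -2.0 * log_likelihood + num_params * math.log(num_samples)


def select_model(candidates, num_samples):
    """candidates maps a model name to (log_likelihood, num_params)."""
    scores = {
        name: bic(ll, k, num_samples) for name, (ll, k) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits of two acoustic models to the same 200 syllables.
candidates = {"VQ": (-1250.0, 20), "GMM": (-1235.0, 41)}
best, scores = select_model(candidates, num_samples=200)
```

The penalty term k ln N is what lets a simpler model win when the richer model's likelihood gain is small.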
Experimental Results
28 bird species
Training set: 3143 syllables, from two Yushan National Park CDs
    Sound of the Mountain IV: The Songs of Wild Birds
    Sound of the Mountain V: The Songs of Wild Birds
Test set: 646 syllables, downloaded from the website of National Fonghuanggu Bird Park
Experimental Results (cont.)
[Figure: comparison of classification results for different PCA thresholds.]
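A minimal sketch of how a PCA threshold can fix the reduced dimension, assuming the threshold is the fraction of total variance (cumulative eigenvalue ratio) that the retained components must reach. This interpretation, the function name, and the eigenvalues are assumptions for illustration; the slide does not define the threshold.

```python
def components_for_threshold(eigenvalues, threshold=0.97):
    """Smallest number of leading components whose variance share >= threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Hypothetical covariance eigenvalues of a feature set.
eigs = [5.0, 3.0, 1.5, 0.3, 0.2]
k = components_for_threshold(eigs, 0.97)
```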
Experimental Results (cont.)
Summary of classification accuracy (CA), selected model (EVQ or GMM), and cluster number (Ns) for each bird species using SDTDMFCC with PCA threshold = 0.97

| Subject code | Bird name | CA (%) | Ns | Selected model |
|---|---|---|---|---|
| 1 | Crested Serpent Eagle | 100.00 | 2 | EVQ |
| 2 | Bronzed Drongo | 86.49 | 5 | EVQ |
| 3 | Gray-headed Pygmy Woodpecker | 0.00 | 1 | EVQ |
| 4 | Blue Shortwing | 72.41 | 4 | EVQ |
| 5 | Streak-breasted Scimitar Babbler | 54.55 | 3 | GMM |
| 6 | Taiwan Firecrest | 100.00 | 3 | EVQ |
| 7 | Taiwan Sibia | 100.00 | 6 | EVQ |
| 8 | White-throated Laughing Thrush | 94.59 | 3 | EVQ |
| 9 | White-breasted Water Hen | 100.00 | 4 | EVQ |
| 10 | Beavan's Bullfinch | 100.00 | 3 | EVQ |
| 11 | Gray-sided Laughing Thrush | 100.00 | 3 | EVQ |
| 12 | Alpine Accentor | 71.70 | 1 | EVQ |
| 13 | Green-backed Tit | 7.14 | 5 | EVQ |
| 14 | Taiwan Yuhina | 100.00 | 3 | EVQ |
| 15 | Red-headed Tit | 100.00 | 2 | EVQ |
| 16 | Collared Bush Robin | 94.44 | 9 | EVQ |
| 17 | Taiwan Bulbul | 83.33 | 5 | EVQ |
| 18 | Taiwan Hill Partridge | 88.89 | 6 | EVQ |
| 19 | Verreaux's Bush Warbler | 100.00 | 4 | EVQ |
| 20 | Oriental Cuckoo | 95.56 | 3 | GMM |
| 21 | Taiwan Tit | 96.30 | 7 | EVQ |
| 22 | Vivid Niltava | 100.00 | 5 | EVQ |
| 23 | Coal Tit | 100.00 | 4 | EVQ |
| 24 | Crested Goshawk | 100.00 | 3 | EVQ |
| 25 | Gould's Fulvetta | 33.33 | 1 | EVQ |
| 26 | Collared Pygmy Owlet | 100.00 | 1 | EVQ |
| 27 | Swinhoe's Pheasant | 100.00 | 3 | EVQ |
| 28 | Steere's Liocichla | 80.00 | 3 | EVQ |
Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features
Chang-Hsing Lee, Sheng-Bin Hsu, Jau-Ling Shih, and Chih-Hsun Chou
IEEE Trans. on Multimedia, Vol. 15, No. 2, Feb. 2013, pp. 454-463.
System Framework
[Figure: system framework.]
Feature Extraction
• Angular Radial Transformation (ART) feature
Feature Extraction (cont.)
Step 1: Spectrogram generation
[Figure: the waveform (zoomed in) is decomposed into overlapping frames, and spectrum analysis is applied to each frame.]
[Figure: frame decomposition, with the spectrum of each frame laid out along the frequency axis.]
[Figure: a waveform and its spectrogram.]
[Figure: example spectrograms for Crested Goshawk (鳳頭蒼鷹), Taiwan Firecrest (火冠戴菊鳥), Taiwan Sibia (白耳畫眉), and Vivid Niltava (黃腹琉璃).]
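A minimal sketch of spectrogram generation as outlined in Step 1: split the waveform into overlapping frames, window each frame, and take the magnitude spectrum. The frame length, hop size, Hann window, and the plain DFT loop are assumptions chosen to keep the example self-contained; the slides do not give these settings.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Returns spec[frame][bin]: magnitude spectrum of each overlapping frame."""
    # Hann window (an assumption; any analysis window could be used).
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * k / (frame_len - 1))
        for k in range(frame_len)
    ]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + k] * window[k] for k in range(frame_len)]
        spectrum = []
        # Direct DFT over the non-negative frequency bins.
        for b in range(frame_len // 2 + 1):
            coeff = sum(
                frame[k] * cmath.exp(-2j * math.pi * b * k / frame_len)
                for k in range(frame_len)
            )
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames


# Test tone: 1000 Hz sampled at 8000 Hz, which falls on bin 1000 * 64 / 8000.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda b: spec[0][b])
```

A production version would use an FFT routine instead of the direct DFT, which is quadratic in the frame length.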
Feature Extraction (cont.)
Step 2: Recognition window segmentation
Feature Extraction (cont.)
Step 3: Sector image generation
[Equations: coordinate mapping between the spectrogram (time t, frequency f) and the sector image (u, v). With u' = u − 256 and v' = v − 256, f is obtained from the radial distance sqrt(u'^2 + v'^2) and t from the angle tan^{-1}(u'/v').]
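Feature Extraction (cont.)
Step 4: ART feature extraction
The ART coefficient of order n and m of the image I_S(ρ, θ) is
$$F(n,m) = \langle V_{n,m}(\rho,\theta),\, I_S(\rho,\theta) \rangle = \int_{0}^{2\pi}\!\int_{0}^{1} V_{n,m}^{*}(\rho,\theta)\, I_S(\rho,\theta)\, \rho \, d\rho \, d\theta$$
V_{n,m}(ρ, θ) is the ART basis function of order n and m, which is separable along the angular and radial directions:
$$V_{n,m}(\rho,\theta) = A_m(\theta)\, R_n(\rho)$$
where
$$A_m(\theta) = \frac{1}{2\pi} e^{jm\theta}$$
$$R_n(\rho) = \begin{cases} 1, & n = 0 \\ 2\cos(\pi n \rho), & n \neq 0 \end{cases}$$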
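A minimal sketch of ART coefficient extraction, assuming the basis V(n,m) = A_m(θ) · R_n(ρ) with A_m(θ) = exp(jmθ) / (2π), R_0(ρ) = 1, and R_n(ρ) = 2 cos(πnρ) for n ≠ 0, and approximating the double integral by a Riemann sum. The polar sampling grid, the function names, the small orders N = 3 and M = 4, and the use of coefficient magnitudes as features are assumptions made for illustration.

```python
import cmath
import math


def radial(n, rho):
    """Radial part R_n(rho) of the ART basis."""
    return 1.0 if n == 0 else 2.0 * math.cos(math.pi * n * rho)


def art_coefficient(image, n, m):
    """image[r][a] samples the image at rho = (r + 0.5) / num_r and
    theta = 2 * pi * a / num_a (a polar grid, assumed for this sketch)."""
    num_r = len(image)
    num_a = len(image[0])
    d_rho = 1.0 / num_r
    d_theta = 2.0 * math.pi / num_a
    total = 0j
    for r in range(num_r):
        rho = (r + 0.5) * d_rho
        for a in range(num_a):
            theta = a * d_theta
            basis = radial(n, rho) * cmath.exp(1j * m * theta) / (2.0 * math.pi)
            # Conjugated basis times image, with the polar area element rho.
            total += basis.conjugate() * image[r][a] * rho * d_rho * d_theta
    return total


def art_features(image, N=3, M=4):
    """Magnitudes |F(n, m)| for n < N and m < M."""
    return [
        [abs(art_coefficient(image, n, m)) for m in range(M)]
        for n in range(N)
    ]


# Constant image: only the m = 0 terms survive, and F(0, 0) equals 1/2.
flat = [[1.0] * 32 for _ in range(16)]
feats = art_features(flat)
```

Because the angular part is a complex exponential, rotating the image about the centre changes only the phase of F(n, m), so the magnitudes used here are unaffected by such a rotation.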
Feature Extraction (cont.)
Step 4: ART feature extraction (cont.)
[Figure: the 12 × 12 (N = 12 and M = 12) complex ART basis functions: (a) real parts; (b) imaginary parts.]
Experimental Results
Common and Latin names of the bird species in the birdsong database, and the number of birdsong segments in the training set (NTr) and test set (NTe) for birdsong segments of different durations (D)

| Common name | Latin name | NTr (D = 3 s) | NTe (D = 3 s) | NTr (D = 5 s) | NTe (D = 5 s) |
|---|---|---|---|---|---|
| Crested Serpent Eagle | Spilornis cheela | 107 | 5 | 105 | 3 |
| Bronzed Drongo | Dicrurus aeneus | 128 | 10 | 126 | 8 |
| Gray-headed Pygmy Woodpecker | Dendrocopos canicapillus | 50 | 9 | 48 | 7 |
| Blue Shortwing | Brachypteryx montana | 172 | 6 | 170 | 4 |
| Streak-breasted Scimitar Babbler | Pomatorhinus ruficollis | 147 | 16 | 145 | 4 |
| Taiwan Firecrest | Regulus goodfellowi | 92 | 10 | 90 | 8 |
| Taiwan Sibia | Heterophasia auricularis | 97 | 5 | 95 | 3 |
| White-throated Laughing Thrush | Garrulax albogularis | 61 | 8 | 59 | 6 |
| White-breasted Water Hen | Amaurornis phoenicurus | 83 | 6 | 81 | 4 |
| Beavan's Bullfinch | Pyrrhula erythaca | 104 | 3 | 102 | 1 |
| Gray-sided Laughing Thrush | Garrulax caerulatus | 77 | 79 | 75 | 77 |
| Alpine Accentor | Prunella collaris | 62 | 9 | 60 | 7 |
| Green-backed Tit | Parus monticolus | 127 | 4 | 125 | 2 |
| Taiwan Yuhina | Yuhina brunneiceps | 62 | 6 | 60 | 4 |
| Red-headed Tit | Aegithalos concinnus | 98 | 9 | 96 | 7 |
| Collared Bush Robin | Erithacus johnstoniae | 147 | 5 | 145 | 3 |
| Taiwan Bulbul | Pycnonotus taivanus Styan | 58 | 8 | 56 | 6 |
| Taiwan Hill Partridge | Arborophila crudigularis | 141 | 10 | 139 | 8 |
| Verreaux's Bush Warbler | Cettia acanthizoides | 72 | 8 | 70 | 6 |
| Oriental Cuckoo | Cuculus saturatus | 124 | 10 | 122 | 8 |
| Taiwan Tit | Parus holsti | 116 | 7 | 114 | 5 |
| Vivid Niltava | Niltava vivida | 91 | 8 | 89 | 6 |
| Coal Tit | Parus ater | 105 | 10 | 103 | 8 |
| Crested Goshawk | Accipiter trivirgatus | 113 | 11 | 111 | 9 |
| Gould's Fulvetta | Alcippe brunnea | 41 | 7 | 39 | 5 |
| Collared Pygmy Owlet | Glaucidium brodiei | 59 | 16 | 57 | 9 |
| Swinhoe's Pheasant | Lophura swinhoii | 92 | 5 | 90 | 3 |
| Steere's Liocichla | Liocichla steerii | 57 | 6 | 55 | 4 |
| Total number of birdsong segments | | 2683 | 296 | 2627 | 225 |
Experimental Results (cont.)
[Figure: comparison of classification accuracy for different numbers of GMM Gaussian components (G) and different PCA thresholds, using 6 × 24 ART basis functions, for the recognition of birdsong segments of different durations (D).]
Experimental Results (cont.)
[Figure: comparison of classification accuracy for different numbers of ART basis functions (N × M) for the classification of birdsong segments of different durations (D), with a fixed number of GMM components (G = 5).]
Experimental Results (cont.)
Comparison of various feature descriptors in terms of classification accuracy (CA)

| Descriptor | CA (%), D = 3 s | (G, PCA threshold), D = 3 s | CA (%), D = 5 s | (G, PCA threshold), D = 5 s |
|---|---|---|---|---|
| LPCC | 30.41 | (50, 0.98/0.99) | 40.00 | (30, 0.99) |
| MFCC | 46.62 | (35, 0.98/0.99) | 56.89 | (45, 0.95/0.96/0.97) |
| TDMFCC | 69.86 | (10, 0.96) | 77.13 | (5, 0.95) |
| DTDMFCC | 76.03 | (5, 0.99) | 83.86 | (10, 0.99) |
| SDTDMFCC | 73.63 | (10, 0.95) | 79.82 | (10, 0.95/0.96) |
| ART | 86.30 | (5, 0.97/0.98) | 94.62 | (5, 0.95/0.97) |
Thanks!