Machine Learning Tools and Particle Swarm Optimization for Content-Based Search in Big Multimedia Databases
Moncef Gabbouj, Academy of Finland Professor
Tampere University of Technology, Tampere, Finland
Nov 13, 2014
OUTLINE
v Big Data
v How to explore Big Data
v Prescriptive Analytics
v Future Trends and Policies
v Conclusions and Recommendations
19/05/14 Gabbouj – GCC 2013 2
OUTLINE
v Big Data
v How to explore Big Data
v Prescriptive Analytics
v Future Trends and Policies
v Conclusions and Recommendations
Big Data Sources
Source: King et al., IEEE BD 2013
What is Big Data?
Big Data refers to datasets that grow so large and complex that it is no longer possible to capture, store, manage, share, analyze and visualize them within current computational, display and storage capacity.
Source: King et al., IEEE BD 2013
The 4Vs of Big Data: Volume, Velocity, Variety and Veracity
Big Data in Science (1/2)
• 10 PB/year at start, 1000 PB in 10 years!
Big Data in Science (2/2)
Large Synoptic Survey Telescope (Chile): ~5-10 PB/year at start in 2012, ~100 PB by 2025
Pan-STARRS (Hawaii) – now: 800 TB/year – soon: 4 PB/year
Big Data in Business Sectors
Big Data Generated from Smart Grids
OUTLINE
v Big Data
v How to explore Big Data?
v Prescriptive Analytics
v Future Trends and Policies
v Conclusions and Recommendations
How to Explore Big Data?
Source: AYATA Media
OUTLINE
v Big Data
v How to explore Big Data
v Prescriptive Analytics
v Future Trends and Policies
v Conclusions and Recommendations
Descriptive Analytics
§ Classic descriptors
§ Advanced representations and tools
§ Optimization: PSO
§ Evolutionary Neural Networks
§ Advanced Clustering: CNBC
§ Feature synthesis
§ Big tools for Big Data
Content-Based Image Retrieval Scenario
An Automatic Object Extraction Method Based on Multi-scale Sub-segment Analysis over Edge Field
[Figure: original image and its Canny edge fields at scales 1-3; segmentation into a scale-map, CL segment and sub-segments]
Object Extraction Examples
[Figure: object extraction examples (a)-(h) with the number of closed loops in each: (a) N_CL=2, (b) N_CL=2, (c) N_CL=1, (d) N_CL=3, (e) N_CL=2, (f) N_CL=1, (g) N_CL=2, (h) N_CL=1]
Quantum Mechanics Principles for Automatic Object Extraction
Goal: Apply principles of Quantum Mechanics, through solving the time-independent Schrödinger equation

Ĥψ = Eψ,

to extract objects through an innovative and multi-disciplinary research track.
[Figure: object segmentation examples with the tunneling effect. Red arrows indicate the regions where tunneling occurs.]
2D Walking Ant Histogram
[Block diagram: frames from the multimedia database are resampled with polyphase filters (interpolation/decimation over N_S scales); bilateral range-and-domain filtering (σ_r, σ_d) precedes Canny edge detection (non-maximum suppression, hysteresis, parameters σ, thr_low, thr_high) at scales 1-3; noisy-edge thinning, junction filtering and decomposition, sub-segment formation and scale-map formation feed the relevance model and feature extraction (FeX). The 2D WAH is computed separately for branches and corners, with N_S = 20.]
2D WAH Corner Detection
[Figure: original images and the output of the proposed corner detector]
2D WAH Image Retrieval
[Figure: retrieval examples for the classes Stamps, Stop Sign, Tower and Pyramid]
M-MUVIS Retrieval on Nokia 9500
Query Image
11 best-matched retrieved images
Lessons Learned (the hard way)
Clustering helps!
Special type of classifiers for media content:
– Efficient (optimized)
– Scalable
– Dynamic (incremental)
Prescriptive Analytics
§ Classic signal and image processing and analysis tools
§ Optimization: PSO
§ Evolutionary Neural Networks
§ Advanced Clustering: CNBC
§ Improved Features: EFS
§ Big tools for Big Data
Optimization
• Weak definition: search for a minimum or maximum of a function, system or surface.
• Deterministic greedy descent methods:
– Function minimization: Gradient Descent methods
– Feed-forward ANN training: Back-Propagation (BP)
– GMM training: Expectation-Maximization (EM)
– Data clustering: K-means (K-medians, FCM, etc.)
– ...
• They are very efficient for uni-modal functions or surfaces: fast, guaranteed convergence, simple.
• What about multi-modal functions or surfaces?
[Figure: benchmark surfaces – Griewank, De Jong, Rosenbrock, Sphere, Giunta, Rastrigin]
DSP Requires Optimization, but how to do it?
Greedy Descent Methods: Problems
• They converge to the nearest local optimum.
• Random initialization → random convergence.
• Results are unreliable, unrepeatable and sub-optimal.
• They only “work” for simple problems.
• Take e.g. K-means clustering: how to choose K?
How does Nature Optimize?
• We wish to design something, and we want the best possible (or at least a very good) design.
• The set S of all possible designs is much too large to search through one by one, yet we want to find good examples in S.
• In nature, this problem seems to be solved wonderfully well, again and again, by evolution.
• Nature has designed millions of extremely complex machines, each almost ideal for its task, using evolution as the only mechanism.
Swarm Intelligence
• How do swarms of birds, fish, etc. manage to move so well as a unit? How do ants manage to find the best sources of food in their environment? Answers to these questions have led to some very powerful new optimisation methods that differ from EAs, including Ant Colony Optimisation (ACO) and Particle Swarm Optimisation (PSO).
• Also, only by studying how real swarms work are we able to simulate realistic swarming behaviour.
Evolutionary Computation Algorithms
1. Initialize the population.
2. Calculate the fitness of each individual in the population.
3. Reproduce selected individuals to form a new generation, e.g. in GA, perform evolutionary operations such as crossover and mutation.
4. Loop to step 2 until some condition is met.
ü The rule: survival of the fittest.
Evolutionary Computation Paradigms
• Genetic algorithms (GAs) – John Holland
• Evolutionary programming (EP) – Larry Fogel
• Evolution strategies (ES) – I. Rechenberg
• Genetic programming (GP) – John Koza
• Particle swarm optimization (PSO) – Kennedy & Eberhart (1995)
SWARMS
• Coherence without choreography
• Particle swarms: “... the behavior of a single organism in a swarm is often insignificant, but their collective and social behavior is of paramount importance”
Some swarms
Intelligent Swarm
• A population of interacting individuals that optimizes a function or goal by collectively adapting to the local and/or global environment
• Swarm intelligence ≅ collective adaptation
• A “swarm” is an apparently disorganized collection (population) of moving individuals that tend to cluster together, while each individual seems to be moving in a random direction
• We also use “swarm” to describe a certain family of social processes
Introduction to Particle Swarm Optimization (PSO)
• A concept for optimizing nonlinear functions
• Has roots in artificial life and evolutionary computation
• Developed by Kennedy and Eberhart (1995)
• Simple in concept
• Easy to implement
• Computationally efficient
• Effective on a variety of problems
Features of Particle Swarm Optimization
• The population is initialized by assigning random positions and velocities; potential solutions are then flown through hyperspace.
• Each particle keeps track of its “best” (highest-fitness) position in hyperspace:
– pbest for an individual particle,
– gbest for the best in the population.
• At each time step, each particle stochastically accelerates toward its pbest and gbest (or lbest).
Particle Swarm Optimization Process
1. Initialize the population in hyperspace.
2. Evaluate the fitness of individual particles.
3. Modify velocities based on the previous best and global (or neighborhood) best.
4. Terminate on some condition.
5. Go to step 2.
Velocity Update Equation for a PSO Particle
• Basic version:

v_d ← w·v_d + c1·rand()·(pbest_d − x_d) + c2·Rand()·(gbest_d − x_d)

where d is the dimension, c1 and c2 are positive constants, rand and Rand are random functions, and w is the inertia weight.

New v = (particle inertia) + (cognitive term) + (social term)
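The velocity and position updates above can be sketched in a few lines. This is a minimal illustrative implementation, not the authors' code; the parameter values (w = 0.72, c1 = c2 = 1.49) are common choices from the PSO literature, assumed here for the sketch.

```python
import random

def bpso(fitness, dim, n_particles=20, iters=200, w=0.72, c1=1.49, c2=1.49,
         lo=-5.0, hi=5.0, seed=0):
    """Minimal basic PSO minimizer using the velocity update
    v = w*v + c1*rand()*(pbest - x) + c2*Rand()*(gbest - x)."""
    rng = random.Random(seed)
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                 # personal best positions (pbest)
    pf = [fitness(x) for x in X]          # personal best fitness values
    g = min(range(n_particles), key=lambda i: pf[i])
    G, gf = P[g][:], pf[g]                # global best (gbest)
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                V[i][d] = (w * V[i][d]
                           + c1 * rng.random() * (P[i][d] - X[i][d])
                           + c2 * rng.random() * (G[d] - X[i][d]))
                X[i][d] += V[i][d]
            f = fitness(X[i])
            if f < pf[i]:
                P[i], pf[i] = X[i][:], f
                if f < gf:
                    G, gf = X[i][:], f
    return G, gf

# Sphere function: unimodal, with optimum 0 at the origin.
best, best_f = bpso(lambda x: sum(v * v for v in x), dim=5)
```

On a unimodal surface such as Sphere the swarm contracts quickly around the optimum; the shortcomings discussed next appear on multi-modal surfaces.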
Basic PSO (bPSO)
bPSO ..
Shortcomings of PSO
• The dimensionality of the solution space must be fixed in advance.
• Premature convergence to local minima.
• Degeneracy of the search space in case of high dimensionality: particle velocities lapse into degeneracy in such a way that the successive range is restricted to a sub-plane of the full search hyper-plane.
Extending PSO to Work on Varying Dimensionality: MD PSO Algorithm
• Instead of operating at a fixed dimensionality N, the MD PSO algorithm is designed to seek both positional and dimensional optima within a dimensionality range (Dmin < N < Dmax).
• To do this, each particle is iterated through two interleaved PSO processes:
– a regular positional PSO, i.e. the traditional velocity update and the corresponding positional shift in the N-dimensional search (solution) space,
– a dimensionality PSO, which allows the particle to navigate through different dimensionalities.
MD PSO Algorithm (1)
• Each particle keeps track of its last position, velocity and personal best position (pbest) in each particular dimension, so that when it re-visits the same dimension at a later time, it can perform its regular “positional” fly using this information.
• The dimensional PSO process of each particle may then move the particle to another dimension where it will remember its positional status and keep “flying” within the positional PSO process in this dimension, and so on.
MD PSO Algorithm (2)
• The swarm keeps track of the gbest particle in each dimensionality, indicating the best (global) position so far achieved there (used in the regular velocity update equation for that dimensionality).
• Similarly, the dimensionality PSO process of each particle uses its personal best dimensionality, in which the personal best fitness score has so far been achieved.
• Finally, the swarm keeps track of the global best dimension, dbest, among all the personal best dimensionalities.
• The gbest particle in dimension dbest represents the optimum solution and the optimum dimensionality.
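The interleaved bookkeeping described above can be sketched as follows. This is a toy illustration, not the paper's exact update rules: the dimensional velocity update is deliberately simplified, and the fitness function is a made-up one whose optimal dimensionality (5) is known by construction.

```python
import random

# Toy fitness: minimized by the zero vector in dimensionality 5.
def fitness(x):
    return sum(v * v for v in x) + (len(x) - 5) ** 2

DMIN, DMAX, S, ITERS = 2, 10, 16, 300
rng = random.Random(1)

particles = [{"d": rng.randint(DMIN, DMAX), "vd": 0.0,
              "pos": {}, "vel": {}, "pbest": {}, "pbest_f": {}}
             for _ in range(S)]
gbest, gbest_f, dbest = {}, {}, None   # per-dimension global bests + dbest

def ensure(p, d):
    # Lazily create the particle's state the first time it visits dimension d.
    if d not in p["pos"]:
        p["pos"][d] = [rng.uniform(-5, 5) for _ in range(d)]
        p["vel"][d] = [0.0] * d
        p["pbest"][d] = p["pos"][d][:]
        p["pbest_f"][d] = fitness(p["pos"][d])

for _ in range(ITERS):
    for p in particles:
        d = p["d"]
        ensure(p, d)
        g = gbest.get(d, p["pbest"][d])
        for k in range(d):   # regular positional PSO in the current dimension
            p["vel"][d][k] = (0.72 * p["vel"][d][k]
                              + 1.49 * rng.random() * (p["pbest"][d][k] - p["pos"][d][k])
                              + 1.49 * rng.random() * (g[k] - p["pos"][d][k]))
            p["pos"][d][k] += p["vel"][d][k]
        f = fitness(p["pos"][d])
        if f < p["pbest_f"][d]:
            p["pbest"][d], p["pbest_f"][d] = p["pos"][d][:], f
        if f < gbest_f.get(d, float("inf")):
            gbest[d], gbest_f[d] = p["pos"][d][:], f
            if dbest is None or f < gbest_f[dbest]:
                dbest = d
        # Simplified dimensional PSO: drift toward the personal-best and
        # global-best dimensionalities, clamped to [DMIN, DMAX].
        pb_d = min(p["pbest_f"], key=p["pbest_f"].get)
        p["vd"] = (0.72 * p["vd"] + rng.random() * (pb_d - d)
                   + rng.random() * ((dbest or d) - d))
        p["d"] = max(DMIN, min(DMAX, round(d + p["vd"])))
```

After the run, `gbest[dbest]` plays the role of the optimum solution at the optimum dimensionality described on the slide.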
MD PSO illustration..
Multimedia Group – Profs. Moncef Gabbouj and Serkan Kiranyaz
[Figure: MD PSO illustration – particle 7 flying at dimension xd_7(t) = 2 toward gbest(2), particle 9 at xd_9(t) = 3 toward gbest(3); particle a moves between dimensions d = 2 and d = 3 following dbest.]
MD PSO Algorithm (4)
MD PSO Algorithm (5)
MD PSO Algorithm (6)
A Second Extension of PSO: Fractional Global Best Formation (FGBF)
• Motivation: both PSO and MD PSO may suffer from premature convergence (i.e. convergence to a local optimum).
• Idea: can we provide a better “guide” than the swarm's global best?
• Proposal: introduce a new particle to the swarm whose jth component is the corresponding swarm's best component (i.e. the component-wise best). This new particle is called an artificial GB particle (aGB) and the process is called Fractional GB formation (FGBF).
FGBF (2)
[Figure: FGBF example in 2D – aGB combines the x-coordinate of particle 3 and the y-coordinate of particle 8, i.e. aGB: (x_3, y_8), landing closer to the target (x_T, y_T) than gbest.]
FGBF (3)
• aGB can be, and usually is, better than gbest, especially at the beginning of the iteration.
• aGB has the advantage of assessing each dimension of every particle in the swarm individually, and uses the most promising (or simply the best) components among them.
• Using the available diversity among individual dimensional components, FGBF can prevent the swarm from being trapped in a local optimum through its ongoing and varying operations.
• At each iteration, FGBF is performed after the assignment of the swarm's gbest particle, and the better of the two becomes the GB particle used in the swarm's velocity updates, i.e., the swarm is always guided by the best (winner) GB particle at any time.
• Limitation of FGBF: it requires a component-wise evaluation of the fitness function, i.e. it is problem-dependent.
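For a fitness function that decomposes into per-dimension costs (the case where component-wise evaluation is possible), the aGB construction can be sketched directly. The swarm positions below are made-up values for illustration.

```python
# Sketch of Fractional Global Best Formation (FGBF) for a separable fitness.
def component_cost(v):
    return v * v            # per-dimension cost of the sphere function

def fitness(x):
    return sum(component_cost(v) for v in x)

# A tiny hypothetical swarm of three 3-D particles.
swarm = [[2.0, -1.0, 3.0],
         [0.5,  4.0, -2.0],
         [-3.0, 0.1,  0.2]]

gbest = min(swarm, key=fitness)

# aGB takes, for each dimension, the component with the lowest cost in the swarm.
aGB = [min((p[d] for p in swarm), key=component_cost) for d in range(3)]

# The winner between gbest and aGB guides the swarm's velocity updates.
GB = min([gbest, aGB], key=fitness)
```

Here no single particle is good in all three dimensions, so the component-wise aGB (0.5, 0.1, 0.2) beats every whole particle, which is exactly the diversity argument made above.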
Experimental Results: 1. Function Minimization
[Figure: benchmark surfaces – Griewank, De Jong, Rosenbrock, Sphere, Giunta, Rastrigin]
(Uni-modal) De Jong Function: MD-PSO vs. Basic PSO
[Plots: fitness score vs. iteration number and dimension vs. iteration number for both methods]
Red curves trace the performance of the GB particle, which can be either the new gbest or aGB when FGBF is used, whereas the blue curves (backward) trace the behavior of the gbest particle when the termination criterion is met.
Unimodal Sphere: MD PSO with vs. without FGBF
[Plots: fitness score vs. iteration number and dimension vs. iteration number for both variants]
Multimodal Giunta: MD PSO with vs. without FGBF
[Plots: fitness score vs. iteration number and dimension vs. iteration number for both variants]
MD PSO with and without FGBF on Schwefel
FGBF guidance in run-time
Effects of dimension and swarm size
[Plots: Griewank and Rastrigin results for swarm sizes S = 80 and S = 320 and initial dimensions d0 = 20 and d0 = 80]
2. Application to Data Clustering
• In clustering, as in other PSO applications, each particle represents a potential solution at a given time t: the swarm is ξ = {x_1, ..., x_a, ..., x_S}, and particle a is formed as x_a(t) = {c_{a,1}, ..., c_{a,j}, ..., c_{a,K}}, where c_{a,j} is the jth (potential) cluster centroid in the N-dimensional data space and K is the number of clusters, fixed in advance.
Application to Data Clustering
• Note that, contrary to the nonlinear function minimization of the earlier section, the data-space dimension N now differs from the solution-space dimension K. Furthermore, the fitness function f to be optimized is formed with respect to two widely used criteria in clustering:
• Compactness: Data items in one cluster should be similar or close to each other in N dimensional space and different or far away from the others when belonging to different clusters.
• Separation: Clusters and their respective centroids should be distinct and well-separated from each other.
The compactness term is measured by the K-means quantization error,

Δ_Kmeans = Σ_{k=1}^{K} Σ_{x_p ∈ c_k} ||x_p − c_k||²,

and the overall clustering fitness f(x_a, Z) is a weighted sum (weights w_1, w_2, w_3) combining the maximum intra-cluster distance, the minimum inter-centroid separation, and a quantization term Q(x_a) computed over the K centroids of particle a.
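The compactness and separation criteria can be illustrated with a small clustering-fitness sketch. The particular combination below (mean intra-cluster distance divided by minimum inter-centroid distance) is a simplified stand-in for the weighted fitness described above, not the paper's exact formula; the point sets are made up.

```python
import math

def dist(a, b):
    return math.dist(a, b)   # Euclidean distance (Python 3.8+)

def assign(points, centroids):
    # Assign each point to its nearest candidate centroid.
    clusters = [[] for _ in centroids]
    for p in points:
        j = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[j].append(p)
    return clusters

def clustering_fitness(points, centroids):
    clusters = assign(points, centroids)
    # Compactness: mean distance of each point to its centroid.
    compact = sum(dist(p, c) for cl, c in zip(clusters, centroids)
                  for p in cl) / len(points)
    # Separation: minimum distance between any two centroids.
    sep = min(dist(a, b) for i, a in enumerate(centroids)
              for b in centroids[i + 1:])
    return compact / sep     # compact, well-separated clusters -> small value

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [(0.33, 0.33), (10.33, 10.33)]   # centroids near the true clusters
bad = [(5, 5), (6, 5)]                   # both centroids between the clusters
```

A particle encoding the `good` centroids scores a much lower (better) fitness than one encoding `bad`, which is the behavior MD PSO exploits when flying through candidate centroid sets.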
MD PSO & FGBF for Data Clustering
• Particle a in the swarm has the positional component

x_a^{xd_a(t)}(t) = {c_{a,1}, ..., c_{a,j}, ..., c_{a,xd_a(t)}},

which represents a potential solution (the cluster centroids) for xd_a(t) clusters, where the jth component is the jth cluster centroid.
Data Clustering in 2D: Some Synthetic Examples
Standalone (MD) PSO clustering.. (OK for easy datasets)
S. Kiranyaz, T. Ince, A. Yildirim and M. Gabbouj, “Fractional Particle Swarm Optimization in Multi-Dimensional Search Space”, IEEE Transactions on Systems, Man, and Cybernetics – Part B, vol. 40, no. 2, pp. 298-319, April 2010.
S. Kiranyaz, T. Ince and M. Gabbouj, “Stochastic Approximation Driven Particle Swarm Optimization with Simultaneous Perturbation (Who Will Guide the Guide?)”, Applied Soft Computing, vol. 11, no. 2, pp. 2334-2347, 2011.
Dominant Color Extraction Based on Dynamic Clustering by Multi-Dimensional Particle Swarm Optimization
[Figure: Median-Cut (original), MPEG-7 DCD and proposed results]
S. Kiranyaz, S. Uhlmann, T. Ince and M. Gabbouj, “Perceptual Dominant Color Extraction by Multi-Dimensional Particle Swarm Optimization”, EURASIP Journal on Advances in Signal Processing, vol. 2009, Article ID 451638, 13 pages, 2009. doi:10.1155/2009/451638
Experimental Results
• We have made comparative evaluations against MPEG-7 DCD over a sample database of 110 images, selected from the Corel database in such a way that the prominent colors (DCs) can be identified by ground truth.
[Figure 4: Number of DCs from three different MPEG-7 DCD settings (Ts=15, Ta=1%; Ts=25, Ta=1%; Ts=25, Ta=5%) over the sample database. Note how the number of DCs is strictly dependent on the parameters used and can vary significantly, e.g. between 2 and 25 even for a particular image.]
[Figures: further Median-Cut (original) vs. MPEG-7 DCD vs. proposed comparisons. The Median-Cut algorithm produces up to 256 colors, so its output is almost identical to the original image.]
OUTLINE
• Optimization Tools (PSO and extensions)
• Applications in function minimization, data clustering and image retrieval
• Machine Learning tools
– Evolving NNs with MD PSO
– Novel Classifiers (CNBC)
– Evolutionary feature synthesis
• Applications in CBIR
• Conclusions
Unsupervised Design of Artificial Neural Networks via Multi-Dimensional Particle Swarm Optimization
S. Kiranyaz, T. Ince, A. Yildirim and M. Gabbouj, “Evolutionary Artificial Neural Networks by Multi-Dimensional Particle Swarm Optimization”, Neural Networks, vol. 22, pp. 1448-1462, Dec. 2009. (5th most downloaded paper in the journal since 2009)
Artificial Neural Networks (ANNs)
• Neural networks are computer programs designed to recognize patterns and learn like the human brain.
• Used for prediction and classification; the best weights are determined iteratively (input/hidden/output layers).
• After the introduction of simplified neurons by McCulloch and Pitts in 1943, ANNs have been applied widely to many application areas, most using feed-forward ANNs, the so-called multi-layer perceptrons (MLPs), with the Back-Propagation (BP) training algorithm.
• For training ANNs, many researchers have reported that Evolutionary Algorithms (EAs), such as genetic algorithms, evolutionary programming and PSO, can outperform BP, especially for large networks. In addition, EAs are population-based stochastic processes and can avoid being trapped in a local optimum.
• Evolutionary ANNs can be automatically designed (internal structure and parameters) according to the problem.
Introduction
" A novel technique for automatic design of Artificial Neural Networks (ANNs) by evolving to the optimal network configuration(s) within an architecture space.
• With the proper encoding of the network configurations and parameters into particles, MD PSO can then seek for positional optimum in the error space and dimensional optimum in the architecture space.
• The efficiency and performance of the proposed technique is demonstrated over one of the hardest synthetic problems. The experimental results show that MD PSO evolves to optimum or near-optimum networks in general.
MD PSO for evolving ANNs
• MD PSO negates the need to fix the dimension of the solution space in advance; we adapt the MD PSO technique to design (near-)optimal ANNs.
• The focus is on automatic design of feed-forward ANNs, with the search carried out over all possible network configurations within the specified architecture space.
Main idea:
• All potential network configurations are transformed into a hash (dimension) table with a proper hash function, where the indices represent the solution-space dimensions of the particles; MD PSO can then seek both positional and dimensional optima in an interleaved PSO process.
• The optimum dimension found naturally corresponds to a distinct ANN architecture, whose network parameters (connections, weights and biases) can be resolved from the positional optimum reached at that dimension.
Architecture Space Definition over MLPs
• Layers: {L_min, L_max}
• Neurons: {N_min^l, N_max^l} for L_min ≤ l ≤ L_max, giving the two extreme configurations
R_min = {N_I, N_1^min, ..., N_{L_max−1}^min, N_O} and R_max = {N_I, N_1^max, ..., N_{L_max−1}^max, N_O}
• MLPs: let F be the activation function applied over the weighted inputs plus a bias:

y_k^{p,l} = F(s_k^{p,l}), where s_k^{p,l} = Σ_j w_{jk}^{l−1} y_j^{p,l−1} + θ_k^l

• The training MSE is formulated as

MSE = (1 / (P·N_O)) Σ_{p∈T} Σ_{k=1}^{N_O} (t_k^p − y_k^{p,O})²
Dim.  Configuration    Dim.  Configuration
1     9 x 2            22    9 x 5 x 2 x 2
2     9 x 1 x 2        23    9 x 6 x 2 x 2
3     9 x 2 x 2        24    9 x 7 x 2 x 2
4     9 x 3 x 2        25    9 x 8 x 2 x 2
5     9 x 4 x 2        26    9 x 1 x 3 x 2
6     9 x 5 x 2        27    9 x 2 x 3 x 2
7     9 x 6 x 2        28    9 x 3 x 3 x 2
8     9 x 7 x 2        29    9 x 4 x 3 x 2
9     9 x 8 x 2        30    9 x 5 x 3 x 2
10    9 x 1 x 1 x 2    31    9 x 6 x 3 x 2
11    9 x 2 x 1 x 2    32    9 x 7 x 3 x 2
12    9 x 3 x 1 x 2    33    9 x 8 x 3 x 2
13    9 x 4 x 1 x 2    34    9 x 1 x 4 x 2
14    9 x 5 x 1 x 2    35    9 x 2 x 4 x 2
15    9 x 6 x 1 x 2    36    9 x 3 x 4 x 2
16    9 x 7 x 1 x 2    37    9 x 4 x 4 x 2
17    9 x 8 x 1 x 2    38    9 x 5 x 4 x 2
18    9 x 1 x 2 x 2    39    9 x 6 x 4 x 2
19    9 x 2 x 2 x 2    40    9 x 7 x 4 x 2
20    9 x 3 x 2 x 2    41    9 x 8 x 4 x 2
21    9 x 4 x 2 x 2
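One plausible enumeration consistent with the hash table above (no hidden layer first, then one hidden layer of 1-8 neurons, then two hidden layers of 1-8 and 1-4 neurons) can be sketched as:

```python
# Sketch: enumerate the MLP architecture space R_min = {9,1,1,2},
# R_max = {9,8,4,2} into a hash (dimension) table, indexed from 1.
def architecture_space(n_in=9, n_out=2, h1_max=8, h2_max=4):
    table = {}
    dim = 1
    table[dim] = (n_in, n_out)                   # single-layer perceptron
    for h1 in range(1, h1_max + 1):              # one hidden layer
        dim += 1
        table[dim] = (n_in, h1, n_out)
    for h2 in range(1, h2_max + 1):              # two hidden layers
        for h1 in range(1, h1_max + 1):
            dim += 1
            table[dim] = (n_in, h1, h2, n_out)
    return table

table = architecture_space()
```

With this ordering, dimension 1 maps to 9 x 2, dimension 22 to 9 x 5 x 2 x 2, and dimension 41 to 9 x 8 x 4 x 2, matching the table.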
MD PSO for Evolving MLPs
• At time t, particle a in the swarm ξ = {x_1, ..., x_a, ..., x_S} has the positional component

x_a^{xd_a(t)}(t) = { {w_{jk}^0}, {w_{jk}^1}, {θ_k^1}, {w_{jk}^2}, {θ_k^2}, ..., {w_{jk}^{O−1}}, {θ_k^{O−1}}, {θ_k^O} }

where {w_{jk}^l} and {θ_k^l} are the sets of weights and biases of layer l. Note that the input layer (l = 0) contains only weights, whereas the output layer (l = O) has only biases. By means of such a direct encoding scheme, particle a represents all potential network parameters of the MLP architecture at dimension (hash index) xd_a(t).
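Under this encoding, the particle's solution-space dimension at a given hash index is simply the number of free parameters of that MLP: weights between consecutive layers plus biases for every layer except the input. A small sketch (the configuration tuples are the ones from the hash table above):

```python
# Sketch: number of parameters encoded in a particle's positional component
# for an MLP configuration given as a tuple of layer sizes.
def n_parameters(config):
    weights = sum(a * b for a, b in zip(config, config[1:]))  # layer-to-layer weights
    biases = sum(config[1:])                                  # biases, input layer has none
    return weights + biases

# e.g. the dimension-1 architecture 9 x 2 and the largest one, 9 x 8 x 4 x 2
smallest = n_parameters((9, 2))
largest = n_parameters((9, 8, 4, 2))
```

So the positional search space ranges here from 20 parameters for 9 x 2 up to 126 for 9 x 8 x 4 x 2.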
The Two-Spiral Problem
Many attempts exist, e.g. Jia and Chua, IEEE International Conference on Neural Networks, 1995, who studied the effect of input data representation on the performance of a back-propagation neural network in solving the highly nonlinear two-spiral problem.
Results over the Two-Spiral Problem
• Given the following architecture space with 1-, 2- and 3-layer MLPs:

R_min^1 = {N_I, 1, 1, N_O}, R_max^1 = {N_I, 8, 4, N_O}

[Figure 1: Error (MSE) statistics from exhaustive BP training (top) and dbest histogram from 100 MD PSO evolutions (bottom) for the two-spiral problem.]
Automated Patient-specific Classification of ECG Data
T. Ince, S. Kiranyaz and M. Gabbouj, “A Generic and Robust System for Automated Patient-specific Classification of Electrocardiogram Signals”, IEEE Transactions on Biomedical Engineering, vol. 56, no. 5, pp. 1415-1426, May 2009.
System Overview
[Block diagram: for patient X, data acquisition and beat detection feed morphological feature extraction (TI-DWT) with dimension reduction (PCA), plus temporal features; patient-specific data (first 5 min of beats), common data (200 beats) and expert labeling provide training labels per beat; MD PSO performs evolution and training over the ANN architecture space and outputs a beat class type.]
• Experimental Results – MD PSO Optimality Evaluation
[Figures: Error (MSE) statistics from exhaustive BP training (top) and dbest histograms from 100 MD PSO evolutions (bottom) for patient records 222 and 214.]
Performance Evaluation

Method                                 | Acc  | Normal Sen | Normal Pp | PVC Sen | PVC Pp | Other Sen | Other Pp
DWT / ANN (Inan et al.)                | 95.2 | 98.1       | 97        | 85.2    | 92.4   | 87.4      | 94.5
(DWT+PCA) / MD PSO – ENN (proposed)    | 97.0 | 99.4       | 98.9      | 93.4    | 93.3   | 87.5      | 97.8
(all values in %)

For PVC detection, the following beat types are considered: normal, PVC, LBBB, RBBB, aberrated atrial premature, atrial premature contraction, and supraventricular premature beats.
A “Divide & Conquer” Classifier Topology: Collective Network of (Evolutionary) Binary Classifiers
For CBIR, the key questions:
1) How to select certain features so as to achieve the highest discrimination over certain classes?
2) How to combine them in the most effective way?
3) Which distance metric to apply?
4) How to find the optimal classifier configuration for the classification problem at hand?
5) How to scale/adapt the classifier if a large number of classes/features are incrementally introduced?
6) How to train the classifier efficiently to maximize the classification accuracy?
Objectives:
• Evolutionary search: seek the optimum network architecture among a collection of configurations (the so-called Architecture Space, AS).
• Feature/class scalability: support a varying number of features and classes. A new feature/class can be dynamically integrated into the framework without requiring a full-scale initialization and re-evolution.
• High efficiency of the evolution (or training) process: use classifiers in the AS that are as compact and simple as possible.
• Online (incremental) evolution: continuous online/incremental training (or evolution) sessions can be performed to improve the classification accuracy.
• Parallel processing: classifiers can be evolved using several processors working in parallel.
The CNBC framework..
• Each NBC corresponds to a unique semantic class and contains an indefinite number of evolutionary binary classifiers (BCs) in the input layer, where each BC performs binary classification over an individual feature.
• Each BC in an NBC in time learns the significance of the individual dimensions of the corresponding feature vector for the discrimination of its class.
• Finally, a “fuser” BC in the output layer fuses the binary outputs of all BCs in the input layer into a single binary output, indicating the relevance of each media item to its class.
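The topology described above can be sketched structurally. This is a toy stand-in: each binary classifier is just a callable returning a score in [0, 1], and the two classes, features and thresholds below are made up for illustration.

```python
class NBC:
    """One network of binary classifiers: one BC per feature plus a fuser."""
    def __init__(self, feature_bcs, fuser):
        self.feature_bcs = feature_bcs    # one binary classifier per feature type
        self.fuser = fuser                # fuses the per-feature BC outputs

    def relevance(self, feature_vectors):
        outputs = [bc(fv) for bc, fv in zip(self.feature_bcs, feature_vectors)]
        return self.fuser(outputs)

class CNBC:
    def __init__(self):
        self.nbcs = {}                    # one NBC per semantic class

    def add_class(self, name, nbc):       # scalability: new class -> new NBC
        self.nbcs[name] = nbc

    def classify(self, feature_vectors):
        scores = {c: nbc.relevance(feature_vectors) for c, nbc in self.nbcs.items()}
        return max(scores, key=scores.get)

# Toy usage with two one-dimensional features and two hypothetical classes.
mean = lambda xs: sum(xs) / len(xs)
sunset = NBC([lambda fv: fv[0], lambda fv: 1 - fv[0]], mean)
forest = NBC([lambda fv: 1 - fv[0], lambda fv: fv[0]], mean)
cnbc = CNBC()
cnbc.add_class("sunset", sunset)
cnbc.add_class("forest", forest)
```

Adding a class is just `add_class`, and adding a feature would append one BC per NBC, which is the scalability property discussed on the next slides.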
The overview of the CNBC framework.
[Diagram: feature vectors FV_0 ... FV_{N−1} feed BC_0 ... BC_{N−1} inside each of NBC_0 ... NBC_{C−1}; each NBC's fuser produces the class vector CV for its class.]
Class/Feature Scalability
• The proposed CNBC framework makes the system scalable to any number of classes: whenever a new semantic class becomes available (user defined), the system simply creates and trains a new NBC for that class, so the overall system dynamically adapts to the user's demands for semantic classes.
• CNBC is also scalable with respect to features: whenever a new feature is extracted, a new BC is created, trained and inserted into each NBC of the system using the available relevance feedback, while keeping the other BCs unchanged.
Training & Evolution
• We apply a “long-term” learning strategy: previous relevance feedback (RF) logs are stored and used for continuous, offline (“idle-time”) training of the entire system, in order to improve the overall classification performance.
• Evolution is applied over an architecture space, not as training of a single configuration. The architecture space containing the best possible BCs (with respect to given criteria) is always kept intact; with each ongoing RF session, each BC configuration therefore “evolves” to a better state, while the best among all at a given time is used for classification and retrieval.
Training & Evolution
[Diagram: training feature and class vectors flow from the architecture spaces for the BCs into each NBC (BC_0 ... BC_{N−1}, one per feature vector FV_0 ... FV_{N−1}, plus a fuser), driven by per-class target class vectors CV_0 ... CV_{C−1}.]
CNBC Evolution Phase 1 (Evolution of BCs in the 1st Layer)
CNBC Evolution Phase 2 (Evolution of Fuser BCs)
[Diagram: the input-layer BCs of each NBC are evolved first against the target class vectors, then the fuser BCs; the best (so far) classifiers in the architecture spaces are retained.]
OUTLINE
• Optimization Tools (PSO and extensions)
• Applications in function minimization, data clustering and image retrieval
• Machine Learning tools
– Evolving NNs with MD PSO
– Novel Classifiers (CNBC)
– Evolutionary feature synthesis
• Applications in CBIR
• Conclusions
CNBC for Polarimetric SAR Image Classification
S. Kiranyaz, T. Ince, S. Uhlmann and M. Gabbouj, “Collective Network of Binary Classifier Framework for Polarimetric SAR Image Classification: An Evolutionary Approach”, IEEE Transactions on Systems, Man, and Cybernetics – Part B (in press).
The CNBC test-bed application GUI showing a sample user-defined ground truth set over the San Francisco Bay area.
[Figures: classification maps CET-1, CET-2 and CET-3 for the classes Water, Urban, Forest, Flat Zones and Mountain/Rock.]
Retrieval Results: With and Without CNBC
[Figures: 4x2 sample queries in the Corel_10 (qA and qB) and Corel_Caltech_30 (qC and qD) databases, comparing traditional retrieval with CNBC-based retrieval; the top-left image in each panel is the query.]
Evolutionary Feature Synthesis
[Figure: EFS separating class-1, class-2 and class-3]
Why do we need it?
• Discriminative features are essential for classification, retrieval, etc.
• Semantic gap:
– Low-level features cannot fully match the human perception of similarity.
– A higher level of understanding is necessary.
• Using the experience/knowledge of human similarity perception, highly discriminative features can be synthesized from low-level features.
Evolutionary Feature Synthesis by MD PSO
[Figures: toy synthesis examples – a 1D → 1D mapping sin(2πfx) separating class-1 from class-2 (FS-1), and a 2D → 3D mapping {x_1, x_2} → {x_1², x_2², 2x_1x_2} (FS-2).]
[Block diagram: the image database and FeX produce the original FV; R cascaded MD-PSO based feature synthesis stages produce synthesized FVs Synt.FV(1) ... Synt.FV(R), with the fitness evaluated as (1 − AP) against the ground truth.]
Each particle encodes, for every synthesized feature y_j (j ∈ [0, d−1]), the selected original feature indices A_i ∈ [0, N), real-valued weights B_i, and operator indices θ_i ∈ [1, F] for i = 1, ..., K: the selected features x_{α_i} are scaled by weights w_{β_i} and combined through operators Θ_1, ..., Θ_K, mapping the original N-dimensional FV x_0, ..., x_{N−1} to a d-dimensional synthesized FV y_0, ..., y_{d−1}.
Overview of the Evolutionary Feature Synthesizer
§ We perform an evolutionary search which, for each new feature:
• selects K+1 original (or synthesized) features, f_0, ..., f_K,
• scales the selected features using proper weights, w_0, ..., w_K,
• selects K operators, Θ_1, ..., Θ_K, to be performed over the (selected and scaled) features,
• bounds the result using a non-linear operator (the hyperbolic tangent, tanh).
§ If the application of a specific operator Θ_i to features f_a and f_b is denoted Θ_i(f_a, f_b), the synthesis formula used to form each new feature is

y_j = tanh( Θ_K( w_K f_K, Θ_{K−1}( ..., Θ_2( w_2 f_2, Θ_1( w_1 f_1, w_0 f_0 ) ) ... ) ) )
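The nested synthesis formula can be sketched directly. The operator set below (add, sub, mul) is an assumed example, not the paper's actual operator library.

```python
import math

# Hypothetical operator library for the sketch.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def synthesize(features, weights, ops):
    """Compute y = tanh(Θ_K(w_K f_K, Θ_{K-1}(..., Θ_1(w_1 f_1, w_0 f_0)))).

    features: selected f_0..f_K; weights: w_0..w_K; ops: names of Θ_1..Θ_K.
    """
    acc = weights[0] * features[0]                # innermost term w_0 f_0
    for f, w, op in zip(features[1:], weights[1:], ops):
        acc = OPS[op](w * f, acc)                 # apply Θ_i(w_i f_i, acc)
    return math.tanh(acc)                         # bound the result

# e.g. y = tanh(mul(0.5*f2, add(1.0*f1, 2.0*f0)))
y = synthesize([0.2, 0.3, 0.4], [2.0, 1.0, 0.5], ["add", "mul"])
```

MD PSO's role is then to evolve the selections (which features, which weights, which operators), with the synthesized feature evaluated by a fitness function as discussed next.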
Some Fitness Functions
Ø It is practically not possible to use any direct retrieval measure (e.g. ANMRR)
Ø We originally used clustering validity index (CVI) combined with the number of false positives
Ø The retrieval results were not always improving even though the fitness measure was greatly improved
Ø We adopted an approach similar to ANNs, but instead of 1-of-c coding we used output codes inspired by ECOC
Ø The fitness measure is the MSE to the target output vector (divided by the output dimensionality)
f(Z_j) = FP(Z_j) + d_mean(Z_j, c_j) / d_min(c_j, c_i)
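The MSE-to-target-code fitness described above can be sketched as follows. The sample outputs and 2-D target codes are toy values standing in for the ECOC-inspired output codes:

```python
def efs_fitness(outputs, targets):
    """Mean squared error between synthesized outputs and the target
    output codes, normalized by the output dimensionality."""
    d = len(targets[0])
    per_sample = [
        sum((o - t) ** 2 for o, t in zip(out, tgt)) / d
        for out, tgt in zip(outputs, targets)
    ]
    return sum(per_sample) / len(per_sample)

# Two samples, 2-D ECOC-style target codes (illustrative values only):
outputs = [[0.9, -0.8], [-0.7, 0.6]]
targets = [[1.0, -1.0], [-1.0, 1.0]]
fit = efs_fitness(outputs, targets)  # lower is better
```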
Experimental Results - Setup
§ 1000-image Corel database with 10 distinct classes
§ Low-level features used: RGB histogram, YUV histogram, LBP, Gabor features
EFS Retrieval Results: RGB color histogram (4×4×4). (Figure: retrieval results comparing Original Features, EFS Run-1, and EFS Runs 2 & 3.)
Conclusions
Ø MD-PSO is a powerful optimization tool applicable in several fields, including function minimization, clustering and CBIR
Ø CNBC represents the core clustering mechanism used in the MUVIS CBIR search engine
Ø The EFS framework shows promising performance
Ø MUVIS (with MD-PSO, CNBC and EFS) is a step forward towards accomplishing descriptive analytics on "Big" data
Particle Swarm Optimization
(Figure: MD-PSO illustration. Particles make positional moves inside their current dimensions, e.g. x_d^7(t) = 2 in d = 2 with gbest(2), and x_d^9(t) = 3 in d = 3 with gbest(3), plus dimensional moves towards the current best dimension: "Go to d = 23", where dbest = 23.)
Multi-Dimensional PSO (MD-PSO) is a recent optimization algorithm based on particle swarms which finds the optimal solution at the optimal dimension; it applies to multi-dimensional optimization problems where the dimension of the solution space is not known a priori.
S. Kiranyaz, T. Ince, A. Yildirim and M. Gabbouj, “Fractional Particle Swarm Optimization in Multi-Dimensional Search Space”, IEEE Trans. on Systems, Man, and Cybernetics – Part B, pp. 298 – 319, vol. 40, No. 2, April 2010.
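A greatly simplified sketch of the idea in Python. The toy objective and the probabilistic dimensional move are illustrative assumptions standing in for the paper's fractional dimensional velocity; each particle keeps a position, velocity and personal best per candidate dimension:

```python
import random

# Toy objective: the optimum (value 0) lies at x = 0 in dimension 3,
# so the swarm must find both the right dimension and the right point.
def objective(x):
    return sum(xi * xi for xi in x) + 10 * (len(x) - 3) ** 2

D_MIN, D_MAX = 1, 6          # range of candidate dimensions
N_PART, ITERS = 30, 200
W, C1, C2 = 0.72, 1.5, 1.5   # inertia and acceleration coefficients

random.seed(0)
dims = range(D_MIN, D_MAX + 1)

class Particle:
    def __init__(self):
        self.x = {d: [random.uniform(-5, 5) for _ in range(d)] for d in dims}
        self.v = {d: [0.0] * d for d in dims}
        self.pbest = {d: (objective(self.x[d]), list(self.x[d])) for d in dims}

swarm = [Particle() for _ in range(N_PART)]
gbest = {d: min((p.pbest[d] for p in swarm), key=lambda t: t[0]) for d in dims}
dbest = min(gbest, key=lambda d: gbest[d][0])   # best dimension found so far

for _ in range(ITERS):
    for p in swarm:
        # Dimensional move (simplified): mostly follow dbest, sometimes explore.
        d = dbest if random.random() < 0.8 else random.choice(list(dims))
        # Positional move: standard PSO update inside dimension d.
        for i in range(d):
            r1, r2 = random.random(), random.random()
            p.v[d][i] = (W * p.v[d][i]
                         + C1 * r1 * (p.pbest[d][1][i] - p.x[d][i])
                         + C2 * r2 * (gbest[d][1][i] - p.x[d][i]))
            p.x[d][i] += p.v[d][i]
        f = objective(p.x[d])
        if f < p.pbest[d][0]:
            p.pbest[d] = (f, list(p.x[d]))
        if f < gbest[d][0]:
            gbest[d] = (f, list(p.x[d]))
    dbest = min(gbest, key=lambda d: gbest[d][0])

print("best dimension:", dbest, "best fitness:", gbest[dbest][0])
```

The swarm settles in dimension 3 with a near-zero fitness, returning both the optimal dimension and the optimal point in it.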
Evolutionary Artificial Neural Networks
Goal: Design optimal neural networks through an evolutionary optimization process based on MD-PSO.
S. Kiranyaz, T. Ince, A. Yildirim and M. Gabbouj, “Evolutionary Artificial Neural Networks by Multi-Dimensional Particle Swarm Optimization”, Neural Networks, vol. 22, pp. 1448 – 1462, Dec. 2009. 8th “most-cited” paper in the Journal of Neural Networks since 2008.
At dimension d = da(t), particle a's position encodes all weights and biases of the corresponding network configuration:

x_{da(t)}^a(t) = { {w_{jk}^0}, {θ_k^1}, {w_{jk}^1}, {θ_k^2}, {w_{jk}^2}, …, {w_{jk}^{O−1}}, {θ_k^O} }
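Decoding such a flat particle position back into per-layer weights and biases can be sketched as follows (a hypothetical helper, using nested lists instead of the paper's notation):

```python
def decode_particle(pos, arch):
    """Unpack a flat particle position into per-layer MLP weight matrices
    and bias vectors for architecture arch = [n_0, ..., n_O]."""
    weights, biases, k = [], [], 0
    for n_in, n_out in zip(arch, arch[1:]):
        # n_out rows of n_in input weights for this layer
        W = [pos[k + j * n_in : k + (j + 1) * n_in] for j in range(n_out)]
        k += n_in * n_out
        b = pos[k : k + n_out]          # one bias per output neuron
        k += n_out
        weights.append(W)
        biases.append(b)
    assert k == len(pos), "particle length must match the architecture"
    return weights, biases

# A 2-3-1 network needs 2*3 + 3 + 3*1 + 1 = 13 parameters:
w, b = decode_particle(list(range(13)), [2, 3, 1])
```

Since MD-PSO searches across dimensions, each candidate architecture simply corresponds to a different particle length.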
Divide and Conquer: Collective Network of Binary Classifiers (CNBC) Framework
(Architecture: the incoming feature vectors FV_0, …, FV_{N−1} feed, for each class c ∈ {0, …, C−1}, a network of binary classifiers NBC_c containing one BC per feature vector (BC_0, …, BC_{N−1}); each NBC's fuser combines the BC outputs into the class vector CV_c.)
Goal: Design an efficient classifier for multimedia databases which is highly scalable and whose kernel is continuously updated with the aid of the evolutionary MD-PSO technique.
S. Kiranyaz, T. Ince, S. Uhlmann, and M. Gabbouj, “Collective Network of Binary Classifier Framework for Polarimetric SAR Image Classification: An Evolutionary Approach”, IEEE Trans. on Systems, Man, and Cybernetics – Part B, pp. 1169-1186, August 2012.
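The divide-and-conquer topology can be sketched in Python. The fixed linear binary classifiers and the averaging fuser below are illustrative stand-ins for the evolved classifiers and fusers of the framework:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class BinaryClassifier:
    """Stand-in for an evolved binary classifier: a fixed linear model
    over one feature type (the weights here are illustrative, not evolved)."""
    def __init__(self, w, b=0.0):
        self.w, self.b = w, b
    def score(self, fv):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, fv)) + self.b)

class NBC:
    """Network of binary classifiers for one class: one BC per feature
    vector, combined by a simple averaging fuser."""
    def __init__(self, bcs):
        self.bcs = bcs
    def fuse(self, fvs):
        return sum(bc.score(fv) for bc, fv in zip(self.bcs, fvs)) / len(self.bcs)

def classify(nbcs, fvs):
    """Pick the class whose NBC gives the highest fused score."""
    scores = [nbc.fuse(fvs) for nbc in nbcs]
    return scores.index(max(scores))

# One item described by two feature vectors (e.g. two feature types):
fvs = [[1.0, 0.0], [0.0, 1.0]]
nbc_class0 = NBC([BinaryClassifier([2.0, 0.0]), BinaryClassifier([0.0, 2.0])])
nbc_class1 = NBC([BinaryClassifier([-2.0, 0.0]), BinaryClassifier([0.0, -2.0])])
label = classify([nbc_class0, nbc_class1], fvs)
```

Because each class owns its own bank of per-feature BCs, new classes or new feature types only add classifiers rather than forcing a global retrain, which is what makes the scheme scalable.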
Retrieval Examples
How to Explore Big Data?
Evolutionary Feature Synthesis
EFS Retrieval Results
(Figure: retrieval results comparing Original Features, EFS Run-1, and EFS Runs 2 & 3.)
Patient Specific EEG Segmentation and Classification
(Pipeline: data acquisition from Patient X → feature extraction → normalized feature vectors → EEG CNBC (NBC_0, NBC_1, …, NBC_17, each containing BC_0, …, BC_{N−1} and producing class vectors CV_0, …, CV_17) → EEG classification. Expert labeling of the early EEG records provides the labels used for evolution + training.)
Patient Specific ECG Segmentation and Classification
(Pipeline: data acquisition from Patient X → beat detection → morphological feature extraction (TI-DWT) → dimension reduction (PCA), combined with temporal features → ANN space, evolved and trained by MD-PSO from common data (200 beats) and patient-specific data (beats from the first 5 min) with expert training labels per beat → beat class type.)
Prescriptive Analytics
§ Classic signal and image processing and analysis tools
§ Optimization: PSO
§ Evolutionary Neural Networks
§ Advanced Clustering: CNBC
§ Improved Features: EFS
§ Big tools for Big Data
Cloud CNBC for Big Data
(Architecture: the multimedia database feature vectors FV-1, FV-2, …, FV-N pass through a self-organized binary EFS cloud, which produces the synthesized feature vectors. For each class, an NBC cloud (banks of NBCs, each holding BC_0, …, BC_{N−1} per feature vector and emitting class vectors CV_0, …, CV_17) feeds a per-class master fuser BC, from class 0 up to class C−1; the master fusers output the final class vectors CV_0, …, CV_{C−1}.)
OUTLINE
v Big Data
v How to explore Big Data
v Prescriptive Analytics
v Future Trends and Policies
v Conclusions and Recommendations
Future Trends
IP Traffic Growth
EU Big Data Policies
The European Data Forum 2013 of EC projects:
• BIG: Build a self-sustainable industrial community around Big Data in Europe
• LOD2: Linked open data Web
• PlanetData: Large-scale open-data sets management
• Optique: Efficient Big Data access
• Envision: Environmental services
• TELEIOS: Earth observation Big Data
• EUCLID: Professional training for Big Data practitioners
Cloud Computing and Cloud Enterprise
OUTLINE v Big Data
v How to explore Big Data
v Prescriptive Analytics
v Future Trends and Policies
v Conclusions and Recommendations
Conclusions and Recommendations
o Big Data is everywhere
o Requires Big Tools and proper training
o Engineering education landscape is changing
o Big Data will transform our lives - a new generation
Will Big Data change our lives?