METISS
Modélisation et Expérimentation pour le Traitement des Informations et des Signaux Sonores
(Modelling and Experimentation for the Processing of Information and Sound Signals)
Scientific leader: Frédéric BIMBOT
Audio & speech processing
Overview of activities 2002-2005
INRIA-Rennes
INRIA Evaluation, 17-18 May 2006
to design generic, robust, fast and flexible approaches to a variety of problems in speech and audio segmentation, detection and classification, operating in the probabilistic framework
to investigate the theoretical properties and practical applications of adaptive representations and sparseness criteria, with the aim of advanced processing and structured description of audio signals
to extend and adapt approaches classically used in the context of speech processing to other classes of signals and problems
to study convergence between statistical approaches and adaptive decomposition within a common framework embedding signal representations and classification
If a sparse representation is sparse enough, then it is the sparsest one
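One standard way to make this statement precise is the coherence-based uniqueness condition, given here as an illustration only: the slide does not say which condition is meant. The notation is an assumption: a dictionary of unit-norm atoms $g_k$, a signal $s$, a coefficient vector $x$, and the coherence $\mu$.

```latex
\mu = \max_{k \neq l} \left| \langle g_k, g_l \rangle \right| ,
\qquad
s = \sum_k x_k \, g_k
\ \text{ with } \
\|x\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu}\right)
\;\Longrightarrow\;
x \text{ is the unique sparsest representation of } s .
```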
Matching Pursuit made tractable
Gribonval, Krstulovic et al.
MPTK: C++ toolkit, GPL licence
for a 1-hour audio signal, processing time reduced from 20 h to 0.25 h
flexible operation
reproducible results
usable in other fields: medical signals, seismology, etc.
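For readers unfamiliar with the algorithm, here is a minimal sketch of plain Matching Pursuit in Python with NumPy. It is not MPTK's code or API, and it contains none of the optimisations that give the speed-up quoted above; the function name `matching_pursuit` and the toy Dirac + DCT dictionary are assumptions chosen for illustration.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy Matching Pursuit (illustrative sketch, not MPTK).

    dictionary: (n_samples, n_atoms) array whose columns are unit-norm atoms.
    Returns the coefficient vector and the final residual.
    """
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        correlations = dictionary.T @ residual        # inner products with all atoms
        best = int(np.argmax(np.abs(correlations)))   # most correlated atom
        coeffs[best] += correlations[best]
        residual -= correlations[best] * dictionary[:, best]
    return coeffs, residual

# Hypothetical toy dictionary: Dirac basis plus an orthonormal DCT-II basis.
n = 64
k = np.arange(n)
dct = np.cos(np.pi * (k[:, None] + 0.5) * k[None, :] / n)
dct /= np.linalg.norm(dct, axis=0)
dictionary = np.hstack([np.eye(n), dct])

# A signal made of one Dirac atom and one DCT atom.
signal = 3.0 * dictionary[:, 5] - 2.0 * dictionary[:, n + 10]
coeffs, residual = matching_pursuit(signal, dictionary, n_iter=20)
```

Each iteration costs one full correlation of the residual with the dictionary, which is the step a practical implementation has to make cheap for long signals.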
Source separation (with primary focus on underdetermined problems)
Statistical schemes and adaptive training for single-channel separation
Source separation approaches using multi-channel Matching Pursuit in the underdetermined case
Contributions in evaluation methodology : task definition & performance measurements
Speech "denoising" using underdetermined source separation techniques
Dictionary design methods for source separation [ongoing]
DEMIX : a robust algorithm to estimate the number of sources using clustering techniques [ongoing]
Single sensor audio source separation
[Diagram: Observed signal (voice + music) → Wiener filter → Estimated voice signal; the filter is driven by a factorial GMM built from a voice GMM and a music GMM]
Benaroya, Bimbot, Gribonval, Ozerov (with FTR&D)
innovative scheme for underdetermined source separation
compatibility with speech processing state of the art
strong links with sparse decomposition problems
versatile and efficient for a range of audio description tasks
Use of a factorial GMM to build a time-varying Wiener filter
Article in IEEE Trans. SAP, 2006
+ new results to come
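The idea of a time-varying Wiener filter driven by a factorial GMM can be sketched as follows in Python with NumPy. This is a simplified illustration, not the authors' published algorithm: each GMM state is reduced to a power spectral density (PSD), state priors are assumed uniform, and a single most likely pair of states is picked per frame. The function name `factorial_gmm_wiener` and all toy values are hypothetical.

```python
import numpy as np

def factorial_gmm_wiener(power_spec, voice_psd, music_psd):
    """Per-frame Wiener gain for the voice (illustrative sketch).

    power_spec: (n_frames, n_bins) observed short-time power spectrum |X|^2.
    voice_psd:  (n_voice_states, n_bins) variance (PSD) of each voice state.
    music_psd:  (n_music_states, n_bins) variance (PSD) of each music state.
    State priors are taken as uniform (a simplifying assumption).
    """
    # Variance of the mixture for every (voice state, music state) pair.
    pair_psd = voice_psd[:, None, :] + music_psd[None, :, :]   # (Kv, Km, F)
    gains = np.empty_like(power_spec, dtype=float)
    for t, frame in enumerate(power_spec):
        # Log-likelihood (up to a constant) of the frame under a zero-mean
        # Gaussian model with the pair's variance in each frequency bin.
        loglik = -np.sum(np.log(pair_psd) + frame / pair_psd, axis=-1)
        kv, km = np.unravel_index(np.argmax(loglik), loglik.shape)
        # Wiener gain for the selected pair; it changes from frame to frame.
        gains[t] = voice_psd[kv] / (voice_psd[kv] + music_psd[km])
    return gains

# Hypothetical toy models: voice energy in low bins, music energy in high bins.
voice_psd = np.array([[4.0, 4.0, 0.01, 0.01],
                      [1.0, 1.0, 0.01, 0.01]])
music_psd = np.array([[0.01, 0.01, 4.0, 4.0]])
power_spec = np.vstack([voice_psd[0] + music_psd[0],
                        voice_psd[1] + music_psd[0]])
gains = factorial_gmm_wiener(power_spec, voice_psd, music_psd)
```

Multiplying the mixture's short-time spectrum by `gains` and inverting the transform would give the voice estimate; the music estimate uses the complementary gain `1 - gains`.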
Underdetermined stereophonic source separation using sparse methods
[Diagram labels: Mixing matrix; Separation (least squares, sparsity)]
Lesage, Gribonval et al.
Audio examples available
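The contrast between a least-squares and a sparsity-based separation, given a known mixing matrix, can be illustrated with a small Python/NumPy sketch. It is a toy example under strong assumptions (a known 2 x 3 mixing matrix and exactly one active source per sample), not the method of the work cited above; the function names are hypothetical.

```python
import numpy as np

def separate_least_squares(mixture, mixing):
    """Minimum-norm (pseudo-inverse) estimate: spreads energy over all sources."""
    return np.linalg.pinv(mixing) @ mixture

def separate_sparse(mixture, mixing):
    """Assume one active source per sample and pick the best-matching direction."""
    norms = np.linalg.norm(mixing, axis=0)
    directions = mixing / norms                      # unit-norm mixing directions
    projections = directions.T @ mixture             # (n_sources, n_samples)
    active = np.argmax(np.abs(projections), axis=0)  # best direction per sample
    cols = np.arange(mixture.shape[1])
    sources = np.zeros_like(projections)
    sources[active, cols] = projections[active, cols] / norms[active]
    return sources

# Hypothetical stereo mixture of three sources (underdetermined: 2 channels < 3 sources).
angles = np.array([0.2, 0.8, 1.4])
mixing = np.vstack([np.cos(angles), np.sin(angles)])   # (2, 3)

true_sources = np.zeros((3, 6))
true_sources[0, 0], true_sources[1, 1], true_sources[2, 2] = 1.0, -2.0, 0.5
true_sources[0, 3], true_sources[1, 4], true_sources[2, 5] = 3.0, 1.5, -1.0
mixture = mixing @ true_sources

ls_estimate = separate_least_squares(mixture, mixing)
sparse_estimate = separate_sparse(mixture, mixing)
```

Both estimates reproduce the mixture exactly, but only the sparse one recovers the sources here, because the toy data satisfy the one-active-source assumption.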
Collaborations, Dissemination and Visibility
Privileged cooperation with the TEXMEX group at IRISA (+ VISTA)
Consistent network of academic and industrial partners outside IRISA
Regular participation in collaborative projects (EU-IST, RNRT, bilateral partnerships, …)
Strong involvement in concerted research actions (ESTER, MathSTIC, GDR-ISIS, NIST evaluations, …)
Visible participation in and production of free software: ELISA platform, AudioSeg, MPTK, SIROCCO, BSS-EVAL
Sustained effort in publishing and disseminating the group's research results
Additional visibility through responsibilities taken in scientific societies, workshop organisation and editorial boards
Summary 2002-2005
Strategy and perspectives 2006-2010
Achievements 2002-2005 (1)
solid contributions to the state of the art on several topics related to speaker and audio class modelling and recognition
key extension, experimentation and validation of the Hidden Markov Model framework for joint audio and video modelling and structuring
major theoretical and experimental progress in the field of sparse representations and adaptive decomposition
pioneering work in mono- and multi-channel source separation in the underdetermined case
Achievements 2002-2005 (2)
strategic improvement in the efficiency of pursuit algorithms both in terms of search strategy and implementation
development of usable know-how in keyword spotting and speech recognition
sustained activities in assessment methodology, resource distribution and evaluation campaigns
scientific objective #4 needs consolidation
Strategy 2006-2010
To keep our position in our initial field of expertise: models, algorithms and tools for automatic processing of audio and speech signals
To push our advantage in the field of sparse representations, from both the theoretical and applied viewpoints
To extend our scope towards more powerful approaches for the representation and modelling of audio and multi-modal signals with an audio component
To step into and make progress in the area of compressing large-scale high-dimensional multi-modal data
Scientific challenges
Probabilistic multi-level multi-stream dependency models for the representation of multiple sources and the integration of heterogeneous levels of knowledge in audio(-visual) streams
→ Bayesian networks
Data-driven representations, model discovery and self-structuring of information in audio and audio-visual streams and contents
→ theoretical consolidation
Experimental platforms and numerically efficient algorithms for large-scale data and near real-time processing
→ engineering work
Deeper understanding of the links between theoretical concepts of adaptive representation, sparse decomposition, multi-scale analysis and practical implications in terms of robustness, separability and adaptability
→ potential links with SVM
Compressing large-scale high-dimensional multimodal data for storage, description and classification
→ compressed sensing