Blind Source Separation: Finding Needles in Haystacks
Scott C. Douglas
Department of Electrical Engineering
Southern Methodist University
where f(y) is a simple vector-valued nonlinearity.
Criterion: Density-based (Maximum Likelihood)
Complexity: about four multiply/adds per tap
Blind Source Separation Toolbox
• A MATLAB toolbox of robust source separation algorithms for noisy convolutive mixtures (developed under govt. contract)
• Allows us to evaluate relationships and tradeoffs between different approaches easily and rapidly
• Used to determine when a particular algorithm or approach is appropriate for a particular (acoustic) measurement scenario
Speech Enhancement Methods
• Classic (frequency selective) linear filtering
  – Only useful for the simplest of situations
• Single-microphone spectral subtraction:
  – Only useful if the signal is reasonably well-separated to begin with (> 5 dB SINR)
  – Tends to introduce “musical” artifacts
• Research Focus: How to leverage multiple microphones to achieve robust signal enhancement with minimal knowledge.
Novel Techniques for Speech Enhancement
• Blind Source Separation: Find all the talker signals in the room - loud and soft, high and low-pitched, near and far away … without knowledge of any of these characteristics.
• Multi-Microphone Signal Enhancement: Using only the knowledge of “target present” or “target absent” labels on the data, pull out the target signal from the noisy background.
• Data collection and processing entirely within MATLAB.
• Allows for careful characterization, fast evaluation, and experimentation with artificial and human talkers.
• Performance improvement: between 10 dB and 15 dB for “equal-level” mixtures, and even higher for unequal-level ones.
Blind Source Separation Example
[Figure: Talker 1 (MG) and Talker 2 (SCD) → Convolutive Mixing (Room) → Separation System (Code)]
Unequal Power Scenario Results
• Time-domain CBSS methods provide the greatest SIR improvements for weak sources; there is no significant improvement in SIR if the initial SIR is already large.
Multi-Microphone Speech Enhancement
[Figure: block diagram. Labels: Noise Source, Noise Source, Speech Source; Linear Processing; Adaptive Algorithm; signals y1, y2, y3, …, yn and z1, z2, z3, …, zn; output annotations “Contains most speech” and “Contains most noise”.]
Speech Enhancement via Iterative Multichannel Filtering
• System output at time k: a linear adaptive filter
• The filter coefficients are a sequence of (n x n) matrices at iteration k.
• Goal: Adapt the filter coefficients over time such that the multichannel output contains signals with maximum speech energy in the first output.
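The slide's own equation did not survive in this transcript, so the following is only a minimal sketch of what a multichannel linear filter of this kind computes. It assumes the microphone signals are called y(k), the outputs z(k), and the coefficient matrices W_l, with z(k) = Σ_l W_l y(k − l). Those names, the function names, and the zero initial conditions are assumptions made for illustration. The matrices are also held fixed here, whereas the slide's filter is adapted over time.

```python
def mat_vec(M, v):
    """Multiply an n x n matrix (list of rows) by an n-element vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def multichannel_filter(W, y):
    """Apply z(k) = sum_l W[l] y(k - l), taking y(k) = 0 for k < 0.

    W: list of L coefficient matrices, each n x n (hypothetical name).
    y: list of n-element input vectors, one per time index k.
    """
    n = len(y[0])
    z = []
    for k in range(len(y)):
        acc = [0.0] * n
        for l, W_l in enumerate(W):
            if k - l < 0:
                break  # no input samples before time 0
            contrib = mat_vec(W_l, y[k - l])
            acc = [a + c for a, c in zip(acc, contrib)]
        z.append(acc)
    return z


# Two-channel toy input and a two-tap filter: tap 0 passes the channels
# through, tap 1 adds the previous sample with the channels swapped.
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
y = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
z = multichannel_filter([identity, swap], y)
```

In an adaptive system the list `W` would be updated between calls; that update rule is not reproduced here because it is not recoverable from the transcript.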
Multichannel Speech Enhancement Algorithm
• A novel* technique for enhancing target speech in noise using two or more microphones via joint decorrelation
• Requires rough target identifier (i.e. when talker speech is present)
• Is adaptive to changing noise characteristics
• Knowledge of source locations, microphone positions, other characteristics not needed.
• Details in [Gupta and Douglas, IEEE Trans.
• Sensors
  – Omnidirectional Microphones (AT803b)
  – Linear array adjustable (4cm nominal spacing)
Audio Examples
• Acoustic Lab: Initial SIR = -10dB, 3-Mic System (before/after audio clips)
• Acoustic Lab: Initial SIR = 0dB, 2-Mic System (before/after audio clips)
• Conference Room: Initial SIR = -10dB, 3-Mic System (before/after audio clips)
• Conference Room: Initial SIR = 5dB, 2-Mic System (before/after audio clips)
Effect of Noise Segment Length on Overall Performance
Diffuse Noise Source Example
• Noise Source: SMU Campus-Wide Air Handling System
• Data was recorded using a simple two-channel portable M-Audio recorder (16-bit, 48kHz) with its associated “T”-shaped omnidirectional stereo array held at arm’s length, then downsampled to 8kHz.
Air Handler Data Processing
• Step 1: Spatio-Temporal GEVD Processing on a frame-by-frame basis with L = 256, where Rv(k) = Ry(k-1); that is, data was whitened to the previous frame.
• Step 2: Least-squares multichannel linear prediction was used to remove tones.
• Step 3: Log-STSA spectral subtraction was applied to the first output channel.
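To make Step 1 more concrete, here is a heavily simplified sketch of a generalized eigenvalue decomposition (GEVD) for a single pair of 2 x 2 real symmetric covariance matrices. It is spatial-only, so it omits the temporal dimension (L = 256) used on the slide, and it solves Ry w = λ Rv w in closed form rather than with a numerical library. The function name `gevd_2x2`, the example matrices, and the assumption of distinct eigenvalues are all illustrative choices, not taken from the talk.

```python
import math


def gevd_2x2(Ry, Rv):
    """Solve Ry w = lam * Rv w for real symmetric 2 x 2 Ry and
    positive-definite Rv. Assumes the two eigenvalues are distinct.
    Returns [(lam, w), ...] with the largest lam first and unit-norm w."""
    # det(Ry - lam * Rv) = a*lam^2 + b*lam + c
    a = Rv[0][0] * Rv[1][1] - Rv[0][1] ** 2
    b = -(Ry[0][0] * Rv[1][1] + Ry[1][1] * Rv[0][0]
          - 2.0 * Ry[0][1] * Rv[0][1])
    c = Ry[0][0] * Ry[1][1] - Ry[0][1] ** 2
    root = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    pairs = []
    for lam in ((-b + root) / (2.0 * a), (-b - root) / (2.0 * a)):
        m11 = Ry[0][0] - lam * Rv[0][0]
        m12 = Ry[0][1] - lam * Rv[0][1]
        m22 = Ry[1][1] - lam * Rv[1][1]
        # Null vector of the singular matrix [[m11, m12], [m12, m22]],
        # built from whichever row has the larger entries.
        if abs(m11) + abs(m12) >= abs(m12) + abs(m22):
            w = (m12, -m11)
        else:
            w = (m22, -m12)
        norm = math.hypot(w[0], w[1])
        pairs.append((lam, (w[0] / norm, w[1] / norm)))
    return pairs


# Hypothetical frame covariances: Rv plays the role of the previous frame.
Ry_frame = [[3.0, 1.0], [1.0, 2.0]]
Rv_frame = [[2.0, 0.5], [0.5, 1.0]]
pairs = gevd_2x2(Ry_frame, Rv_frame)
```

The eigenvector paired with the largest generalized eigenvalue maximizes the ratio (wᵀ Ry w) / (wᵀ Rv w), which is why the first output channel is the natural one to pass on to later steps.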
Complex Blind Source Separation
[Block diagram: s(k) → A → x(k) → B → y(k)]
• Signal Model: x(k) = A s(k)
• Both the si(k)’s in s(k) and the elements of A are complex-valued.
• Separating matrix B is complex-valued as well.
• It appears that there is little difference from the real-valued case…
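The signal model above can be sketched with a single complex-valued sample. This is not a blind method: the example cheats by setting B to the exact inverse of a known A, only to show the roles of A, B, x(k), and y(k). The numeric values and helper names are made up for illustration. A real BSS procedure would have to estimate B from x(k) alone and could recover the sources only up to ordering and complex scaling.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector, complex-valued."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inverse_2x2(M):
    """Closed-form inverse of a 2 x 2 complex matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


# Hypothetical complex mixing matrix and one sample of two complex sources.
A = [[1 + 1j, 0.5 - 0.2j],
     [0.3j, 2 - 1j]]
s = [1 - 1j, -1 + 0j]

x = mat_vec(A, s)      # observed mixture: x(k) = A s(k)
B = inverse_2x2(A)     # ideal separating matrix (known only in this toy case)
y = mat_vec(B, x)      # separated output: y(k) = B x(k)
```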
Complex Circular vs. Complex Non-Circular Sources
• (Second-Order) Circular Source: The energies of the real and imaginary parts of si(k) are the same.
• (Second-Order) Non-Circular Source: The energies of the real and imaginary parts of si(k) are not the same.
[Figure panels: Non-Circular, Circular, Circular]
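As a small illustration of the two definitions, the sketch below compares the real-part and imaginary-part energies of two standard constellations, QPSK and BPSK, which are chosen here as examples and do not come from the slides. It also computes |E[s²]| / E[|s|²], a commonly used measure of second-order non-circularity; for a zero-mean source it is zero when the real and imaginary parts have equal energy and are uncorrelated. The function name is hypothetical.

```python
import math


def second_order_stats(samples):
    """Return (real-part energy, imaginary-part energy, non-circularity),
    where non-circularity = |mean(s^2)| / mean(|s|^2)."""
    n = len(samples)
    real_energy = sum(z.real ** 2 for z in samples) / n
    imag_energy = sum(z.imag ** 2 for z in samples) / n
    variance = sum(abs(z) ** 2 for z in samples) / n
    pseudo_variance = sum(z * z for z in samples) / n
    return real_energy, imag_energy, abs(pseudo_variance) / variance


# Equiprobable symbols, so averaging over the constellation gives the
# exact second-order statistics.
qpsk = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
bpsk = [complex(1, 0), complex(-1, 0)]

qpsk_stats = second_order_stats(qpsk)   # equal energies: circular
bpsk_stats = second_order_stats(bpsk)   # all energy in the real part
```

QPSK splits its energy evenly between the real and imaginary parts, while BPSK has no imaginary-part energy at all, so the two sit at opposite ends of this measure.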
Why Complex Circularity Matters in Blind Source Separation
• Fact #1: It is possible to separate non-circular sources by decorrelation alone if their non-circularities differ [Eriksson and Koivunen, IEEE Trans. IT, 2006]
• Fact #2: The strong-uncorrelating transform is a unique linear transformation for identifying non-circular source subspaces using only covariance matrices.
• Fact #3: Knowledge of source non-circularity is required to obtain the best performance of a complex BSS procedure.
Complex Fixed Point Algorithm [Douglas 2007]
NOTE: The MATLAB code involves both transposes and Hermitian transposes… and no, those aren’t mistakes!
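The algorithm's MATLAB code is not reproduced in this transcript, so the sketch below only illustrates the distinction the note is pointing at. In MATLAB, `A.'` is the plain transpose and `A'` is the Hermitian (conjugate) transpose; for complex data the two give different results. The Python helpers and the example values are illustrative assumptions and are not the author's code.

```python
def transpose(M):
    """Plain transpose (MATLAB: M.')."""
    return [list(col) for col in zip(*M)]


def hermitian(M):
    """Conjugate transpose (MATLAB: M')."""
    return [[z.conjugate() for z in col] for col in zip(*M)]


M = [[1 + 2j, 3 - 1j],
     [0 + 1j, 4 + 0j]]
Mt = transpose(M)   # entries are moved but not conjugated
Mh = hermitian(M)   # entries are moved and conjugated

# For a complex vector w, the two "inner products" differ sharply.
w = [1j, 1 + 0j]
wT_w = sum(a * b for a, b in zip(w, w))               # w.' * w
wH_w = sum(a.conjugate() * b for a, b in zip(w, w))   # w' * w
```

Here `wH_w` is the squared norm of `w`, while `wT_w` is zero even though `w` is not. Covariance-type quantities use the Hermitian transpose and pseudo-covariance-type quantities use the plain transpose, which is one reason an algorithm for non-circular sources can legitimately need both.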