Digital Audio Signal Processing Version 2015-2016 Lecture-2: Microphone Array Processing p. 3
Introduction
• Directivity pattern of a microphone
 – A microphone (*) is characterized by a 'directivity pattern', which specifies the gain & phase shift that the microphone applies to a signal coming from a certain direction (i.e. 'angle-of-arrival')
 – In general the directivity pattern is a function of frequency (ω)
 – In a 3D scenario the 'angle-of-arrival' is an azimuth + elevation angle
 – Will consider only 2D scenarios for simplicity, with one angle-of-arrival (θ), hence directivity pattern is H(ω,θ)
 – Directivity pattern is fixed and defined by physical microphone design
(*) We do digital signal processing, so this includes front-end filtering/A-to-D/..
[Figure: polar plot of |H(ω,θ)| for 1 frequency]
p. 4
Introduction
• Virtual directivity pattern
 – By weighting or filtering (= frequency-dependent weighting) and then summing the signals from different microphones, a (software-controlled) virtual directivity pattern (= weighted sum of the individual patterns) can be produced
 – This assumes all microphones receive the same signals (so are all in the same position). However…
H_virtual(ω,θ) = Σ_{m=1}^{M} F_m(ω)·H_m(ω,θ)
[Figure: example virtual directivity pattern |H_virtual(ω,θ)| as a function of angle (deg) and frequency (Hz)]
[Figure: filter-and-sum structure: microphone signals y_1[k] … y_M[k] are filtered by F_1(ω) … F_M(ω) and summed into the output z[k]]
F_m(ω) = Σ_{n=0}^{N−1} f_{m,n}·e^{−jωn}
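The weighted sum above can be sketched in numpy. This is a minimal illustration, not from the slides: the function names and the two-microphone toy example are mine.

```python
import numpy as np

def fir_response(f_taps, omega):
    """F(omega) = sum_n f[n] * e^{-j*omega*n}  (omega in rad/sample)."""
    n = np.arange(len(f_taps))
    return np.sum(f_taps * np.exp(-1j * omega * n))

def virtual_pattern(H_mics, taps_per_mic, omega):
    """H_virtual(omega, theta) = sum_m F_m(omega) * H_m(omega, theta).

    H_mics: (M, n_angles) complex per-microphone patterns at this omega.
    """
    F = np.array([fir_response(t, omega) for t in taps_per_mic])
    return F @ H_mics  # weighted sum over microphones

# Toy example: M=2 omnidirectional mics (H_m = 1 for all angles),
# single-tap weights 0.5 each -> virtual pattern has gain 1 everywhere.
theta = np.linspace(0, np.pi, 181)
H_mics = np.ones((2, theta.size), dtype=complex)
taps = [np.array([0.5]), np.array([0.5])]
Hv = virtual_pattern(H_mics, taps, omega=0.1 * np.pi)
```

With directional (non-constant) per-microphone patterns, the same weighted sum shapes the virtual pattern in software.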
p. 5
[Figure: filter-and-sum beamformer applied to a uniform linear array; far-field source at angle θ, microphone M at distance d_M, extra path length d_M·cosθ]
Introduction
• However, in a microphone array the different microphones are in different positions/locations, hence they also receive different signals
• Example: uniform linear array, i.e. microphones placed on a line with uniform inter-microphone distances (d) and ideal microphone characteristics (p.9). For a far-field source signal (plane wavefronts), each microphone receives the same signal, up to an angle-dependent delay… (fs = sampling rate, c = propagation speed)
y_m[k] = y_1[k − τ_m(θ)]
H_virtual(ω,θ) = Σ_{m=1}^{M} F_m(ω)·e^{−jωτ_m(θ)}
τ_m(θ) = (d_m·cosθ / c)·f_s,  d_m = (m−1)·d
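The angle-dependent delays τ_m(θ) for a uniform linear array can be computed directly from the formula above; a small numpy sketch (function name and parameter values are mine):

```python
import numpy as np

def ula_delays(M, d, theta, fs=16000.0, c=340.0):
    """Per-microphone delays in samples, relative to microphone 1:
    tau_m(theta) = d_m * cos(theta) * fs / c, with d_m = (m-1)*d."""
    dm = np.arange(M) * d            # positions d_m = (m-1)*d
    return dm * np.cos(theta) * fs / c

taus = ula_delays(M=5, d=0.03, theta=0.0)         # end-fire: maximal delays
taus_bs = ula_delays(M=5, d=0.03, theta=np.pi/2)  # broadside: zero delay
```

For d = 3 cm and fs = 16 kHz, adjacent microphones at end-fire are about 1.4 samples apart, so fractional-delay filtering is needed in practice.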
p. 6
Introduction
• Beamforming = 'spatial filtering' based on microphone characteristics (directivity patterns) AND microphone array configuration ('spatial sampling')
• Classification:
 – Fixed beamforming: data-independent, fixed filters F_m (e.g. delay-and-sum, filter-and-sum)
 – Adaptive beamforming: data-dependent filters F_m (e.g. LCMV beamformer, generalized sidelobe canceler)
p. 7
Introduction
• Background/history: ideas borrowed from antenna array design and processing for radar & (later) wireless communications
• Microphone array processing is considerably more difficult than antenna array processing:
 – narrowband radio signals versus broadband audio signals
 – far-field (plane wavefronts) versus near-field (spherical wavefronts)
 – pure-delay environment versus multi-path environment
• Applications: voice controlled systems (e.g. Xbox Kinect), speech communication systems, hearing aids,…
p. 8
Data model and definitions
Data model: source signal in far-field (see p.13 for near-field)
• Microphone signals are filtered versions of the source signal S(ω) at angle θ:
  Y_m(ω,θ) = H_m(ω,θ)·e^{−jωτ_m(θ)}·S(ω)
  (dir. pattern)·(pos.-dep. phase shift)
• Stack all microphone signals (m=1..M) in a vector:
  Y(ω,θ) = d(ω,θ)·S(ω)
  with 'steering vector' d(ω,θ) = [ H_1(ω,θ)·e^{−jωτ_1(θ)} … H_M(ω,θ)·e^{−jωτ_M(θ)} ]^T
• Output signal after 'filter-and-sum' is
  Z(ω,θ) = F^H(ω)·Y(ω,θ) = Σ_{m=1}^{M} F_m*(ω)·Y_m(ω,θ) = F^H(ω)·d(ω,θ)·S(ω)
  (H instead of T for convenience (**))
p. 9
Data model and definitions
Data model: source signal in far-field
• If all microphones have the same directivity pattern H_0(ω,θ), the steering vector can be factored as
  Y(ω,θ) = d(ω,θ)·S(ω)
  d(ω,θ) = H_0(ω,θ)·[ 1  e^{−jωτ_2(θ)} … e^{−jωτ_M(θ)} ]^T
  (dir. pattern)·(spatial positions); microphone-1 is used as a reference (= arbitrary)
• Will often consider arrays with ideal omni-directional microphones: H_0(ω,θ) = 1. Example: uniform linear array, see p.5
p. 10
Data model and definitions
Definitions: (1)
• In a linear array (p.5): θ = 90° = broadside direction, θ = 0° = end-fire direction
• Array directivity pattern (compare to p.3) = 'transfer function' for source at angle θ ( −π < θ < π ):
  H(ω,θ) = Z(ω,θ)/S(ω) = F^H(ω)·d(ω,θ)
• Steering direction = angle θ with maximum amplification (for 1 frequency):
  θ_max(ω) = argmax_θ |H(ω,θ)|
• Beamwidth (BW) = region around θ_max with (e.g.) amplification > −3 dB (for 1 frequency)
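The array directivity pattern and the steering direction can be evaluated numerically; a sketch for an ideal omni ULA with delay-and-sum weights (function names and the 2 kHz example are mine):

```python
import numpy as np

def steering_vector(omega, theta, M, d, fs=16000.0, c=340.0):
    """Far-field steering vector for an ideal omni ULA (H_m = 1)."""
    tau = np.arange(M) * d * np.cos(theta) * fs / c   # delays in samples
    return np.exp(-1j * omega * tau)

def directivity_pattern(F, omega, thetas, M, d, fs=16000.0, c=340.0):
    """H(omega, theta) = F^H(omega) . d(omega, theta), per angle."""
    return np.array([np.vdot(F, steering_vector(omega, t, M, d, fs, c))
                     for t in thetas])

M, d = 5, 0.03
omega = 2 * np.pi * 2000 / 16000          # 2 kHz in rad/sample
thetas = np.linspace(0, np.pi, 361)
# Delay-and-sum steered to broadside: F = d(omega, 90deg) / M
F = steering_vector(omega, np.pi / 2, M, d) / M
H = directivity_pattern(F, omega, thetas, M, d)
theta_max = thetas[np.argmax(np.abs(H))]  # steering direction
```

Plotting |H| over θ shows the main lobe at broadside plus sidelobes; the −3 dB region around θ_max is the beamwidth.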
p. 11
Data model and definitions
Data model: source signal + noise
• Microphone signals are corrupted by additive noise:
  Y(ω,θ) = d(ω,θ)·S(ω) + N(ω),  N(ω) = [ N_1(ω)  N_2(ω) … N_M(ω) ]^T
• Define the noise correlation matrix as
  Φ_noise(ω) = E{ N(ω)·N^H(ω) }
• Will assume the noise field is homogeneous, i.e. all diagonal elements of the noise correlation matrix are equal:
  [Φ_noise(ω)]_ii = φ_noise(ω), ∀i
• Then the noise coherence matrix is
  Γ_noise(ω) = Φ_noise(ω) / φ_noise(ω)   (all diagonal elements equal to 1)
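In practice Φ_noise(ω) and Γ_noise(ω) are estimated per frequency bin by averaging outer products of noise-only snapshots; a small numpy sketch with synthetic spatially white noise (snapshot count and variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
M, n_snapshots = 4, 50000

# Spatially uncorrelated complex noise N(omega) at one frequency bin
N = (rng.standard_normal((M, n_snapshots))
     + 1j * rng.standard_normal((M, n_snapshots))) / np.sqrt(2)

# Correlation matrix Phi = E{ N N^H }, estimated by sample averaging
Phi = (N @ N.conj().T) / n_snapshots

# Homogeneous field: equal diagonal -> coherence Gamma = Phi / phi
phi = np.real(Phi.diagonal()).mean()
Gamma = Phi / phi
```

For spatially white noise Γ ≈ I; a diffuse field instead gives the sinc-shaped off-diagonal coherence of p. 12.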
p. 12
Data model and definitions
Definitions: (2)
• Array Gain = improvement in SNR for source at angle θ ( −π < θ < π ):
  G(ω,θ) = SNR_output / SNR_input = |F^H(ω)·d(ω,θ)|² / ( F^H(ω)·Γ_noise(ω)·F(ω) )
  (numerator = |signal transfer function|², denominator = |noise transfer function|²)
• White Noise Gain = array gain for spatially uncorrelated noise (e.g. sensor noise):
  WNG(ω,θ) = |F^H(ω)·d(ω,θ)|² / ( F^H(ω)·F(ω) )   (Γ_white noise = I)
  PS: often used as a measure for robustness
• Directivity = array gain for diffuse noise (= coming from all directions):
  DI(ω,θ) = |F^H(ω)·d(ω,θ)|² / ( F^H(ω)·Γ_diffuse noise(ω)·F(ω) )
  with Γ_ij^diffuse(ω) = sinc( ω·f_s·(d_i − d_j) / c )   (skip this formula)
• DI and WNG evaluated at θ_max are often used as a performance criterion
PS: ω is in rad/sample ( −π ≤ ω ≤ π ), ω·f_s is in rad/sec
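The WNG and DI ratios above are direct to evaluate in numpy; a sketch for an end-fire delay-and-sum ULA at 1 kHz (all numeric choices are mine; note np.sinc is sin(πx)/(πx), so the argument is divided by π to get the unnormalized sinc of the slide):

```python
import numpy as np

def diffuse_coherence(omega, positions, fs=16000.0, c=340.0):
    """Gamma_ij(omega) = sinc(omega * fs * (d_i - d_j) / c), sinc(x)=sin(x)/x."""
    dij = positions[:, None] - positions[None, :]
    return np.sinc(omega * fs * dij / (np.pi * c))

M, d, fs, c = 5, 0.03, 16000.0, 340.0
pos = np.arange(M) * d
omega = 2 * np.pi * 1000 / fs                 # 1 kHz in rad/sample

# Delay-and-sum steered to end-fire (theta = 0): F = d(omega, 0) / M
dvec = np.exp(-1j * omega * pos * np.cos(0.0) * fs / c)
F = dvec / M

num = np.abs(np.vdot(F, dvec)) ** 2           # |F^H d|^2
WNG = num / np.real(np.vdot(F, F))            # spatially white noise
Gamma = diffuse_coherence(omega, pos)
DI = num / np.real(np.vdot(F, Gamma @ F))     # diffuse noise
```

Delay-and-sum gives WNG = M; its DI is much smaller than M at low frequencies, where diffuse noise is strongly coherent across this closely spaced array.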
p. 13
PS: Near-field beamforming
• Far-field assumptions are not valid for sources close to the microphone array
 – spherical wavefronts instead of planar wavefronts
 – include attenuation of signals
 – 2 coordinates θ, r (= position q) instead of 1 coordinate θ (in the 2D case)
• Different steering vector (e.g. with H_m(ω,θ) = 1, m=1..M), replacing d(ω,θ):
  d(ω,q) = [ a_1(q)·e^{−jωτ_1(q)}  a_2(q)·e^{−jωτ_2(q)} … a_M(q)·e^{−jωτ_M(q)} ]^T
  τ_m(q) = ( ‖q − p_m‖ − ‖q − p_ref‖ )·f_s / c
  a_m(q) = ( ‖q − p_ref‖ / ‖q − p_m‖ )^e,  e = 1 (3D)…2 (2D)
  with q position of source, p_ref position of reference microphone, p_m position of m-th microphone
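The near-field steering vector with per-microphone gains and delays can be built directly from the geometry; a numpy sketch under the slide's assumptions H_m = 1 and e = 1 (the function name, array layout, and source position are mine):

```python
import numpy as np

def nearfield_steering(omega, q, mic_pos, ref=0, e=1.0, fs=16000.0, c=340.0):
    """d(omega, q) with gains a_m = (|q - p_ref| / |q - p_m|)^e and
    delays tau_m = (|q - p_m| - |q - p_ref|) * fs / c (in samples)."""
    r = np.linalg.norm(q - mic_pos, axis=1)   # distances |q - p_m|
    tau = (r - r[ref]) * fs / c
    a = (r[ref] / r) ** e
    return a * np.exp(-1j * omega * tau)

mics = np.column_stack([np.arange(4) * 0.03, np.zeros(4)])  # ULA, d = 3 cm
q = np.array([0.5, 0.5])                                    # source ~0.7 m away
dvec = nearfield_steering(omega=0.2 * np.pi, q=q, mic_pos=mics)
```

The reference microphone gets gain 1 and zero delay; microphones closer to the source get gains above 1, which is the attenuation information the far-field model discards.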
p. 14
PS: Multipath propagation
• In a multipath scenario, acoustic waves are reflected against walls, objects, etc..
• Every reflection may be treated as a separate source (near-field or far-field)
• A more realistic data model is then
  Y(ω,q) = d(ω,q)·S(ω) + N(ω)
  d(ω,q) = [ H_1(ω,q)  H_2(ω,q) … H_M(ω,q) ]^T
  with q the position of the source and H_m(ω,q) the complete transfer function from the source position to the m-th microphone (incl. microphone characteristic, position, and multipath propagation)
The 'beamforming' aspect vanishes here, see also Lecture-3 ('multi-channel noise reduction')
p. 15
Overview
• Introduction & beamforming basics
 – Data model & definitions
p. 28
• Directivity patterns for end-fire steering (ψ = 0):
The superdirective beamformer has the highest DI, but very poor WNG at low frequencies (where the diffuse noise coherence matrix becomes ill-conditioned), hence problems with robustness (e.g. sensor noise)!
[Figure: polar directivity patterns at f = 3000 Hz for the superdirective and the delay-and-sum beamformer, plus directivity (linear) and white noise gain (dB) versus frequency (0–8000 Hz); M = 5, d = 3 cm, fs = 16 kHz, ideal omni-directional microphones. Delay-and-sum: DI = WNG = M = 5; superdirective: DI → M² = 25 at low frequencies, with WNG dropping far below 0 dB there]
Maximum directivity = M·M, obtained for end-fire steering and for frequency → 0 (no proof)
PS: diffuse noise ≈ white noise for high frequencies (cfr. ω→π and c/fs = λ_min/2 ≈ min(d_j − d_i) in the diffuse noise coherence matrix)
Super-directive beamforming : DI maximization
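The DI-maximizing (superdirective) weights have the standard closed form F = Γ⁻¹d / (dᴴΓ⁻¹d); a numpy sketch comparing them to delay-and-sum at 1 kHz (the small regularization μ, which trades DI for WNG/robustness, and all function names are my additions):

```python
import numpy as np

def sd_weights(dvec, Gamma, mu=0.0):
    """Superdirective weights F = (Gamma + mu*I)^{-1} d / (d^H (Gamma + mu*I)^{-1} d).
    mu > 0 regularizes the ill-conditioned low-frequency case."""
    Gi = np.linalg.solve(Gamma + mu * np.eye(len(dvec)), dvec)
    return Gi / np.vdot(dvec, Gi)

M, d, fs, c = 5, 0.03, 16000.0, 340.0
pos = np.arange(M) * d
omega = 2 * np.pi * 1000 / fs
dij = pos[:, None] - pos[None, :]
Gamma = np.sinc(omega * fs * dij / (np.pi * c))   # diffuse coherence, sinc(x)=sin(x)/x
dvec = np.exp(-1j * omega * pos * fs / c)         # end-fire steering

F_sd = sd_weights(dvec, Gamma, mu=1e-6)
F_ds = dvec / M                                   # delay-and-sum reference

def di(F):
    return np.abs(np.vdot(F, dvec))**2 / np.real(np.vdot(F, Gamma @ F))
def wng(F):
    return np.abs(np.vdot(F, dvec))**2 / np.real(np.vdot(F, F))
```

Evaluating `di` and `wng` for both weight sets reproduces the trade-off on this slide: higher DI for the superdirective design, paid for with a lower WNG.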
p. 29
• First-order differential microphone = directional microphone: 2 closely spaced microphones, where one microphone's output is delayed (= hardware) and the two outputs are then subtracted from each other
• Array directivity pattern:
 – First-order high-pass frequency dependence
 – P(θ) = frequency-independent (!) directional response
 – 0 ≤ α_1 ≤ 1: P(θ) is a scaled cosine, shifted up by α_1 such that θ_max = 0° (= end-fire) and P(θ_max) = 1
[Figure: two microphones at distance d, one output delayed by τ and subtracted from the other]
H(ω,θ) = 1 − e^{−jω(τ + (d/c)·cosθ)}
For ωd/c ≪ π and ωτ ≪ π:
H(ω,θ) ≈ jω·(τ + (d/c)·cosθ) = jω·(τ + d/c)·P(θ)
  (high-pass!)·(angle dependence!)
with P(θ) = α_1 + (1 − α_1)·cosθ,  α_1 = τ / (τ + d/c)
Differential microphones : Delay-and-subtract
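The delay-and-subtract response and its low-frequency factorization can be checked numerically; a sketch for the cardioid case α_1 = 0.5 (parameter values and function names are mine):

```python
import numpy as np

d, c = 0.01, 340.0
alpha1 = 0.5                           # cardioid: null at theta = 180 deg
tau = alpha1 * (d / c) / (1 - alpha1)  # from alpha1 = tau / (tau + d/c), seconds

def H_exact(omega_rad_s, theta):
    """H(omega, theta) = 1 - exp(-j*omega*(tau + d*cos(theta)/c)), omega in rad/s."""
    return 1 - np.exp(-1j * omega_rad_s * (tau + d * np.cos(theta) / c))

def P(theta):
    """Frequency-independent directional response."""
    return alpha1 + (1 - alpha1) * np.cos(theta)

omega = 2 * np.pi * 100.0              # 100 Hz: omega*d/c << pi, omega*tau << pi
theta = np.linspace(0, 2 * np.pi, 361)
approx = 1j * omega * (tau + d / c) * P(theta)   # first-order approximation
```

At 100 Hz the exact and approximate responses agree to well below 1%, and both show the cardioid null at θ = 180° together with the jω high-pass factor.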
p. 30
– LCMV problem: min_f f^T·R_yy[k]·f  subject to  C^T·f = b,
  with f ∈ ℝ^{MN}, C ∈ ℝ^{MN×J}, b ∈ ℝ^J
– Parametrize all f's that satisfy the constraints (verify!), i.e. the filter f can be decomposed into a fixed part f_q and a variable part C_a·f_a:
  f = f_q − C_a·f_a
  f_q = C·(C^T·C)^{−1}·b
  C_a ∈ ℝ^{MN×(MN−J)} with C^T·C_a = 0,  f_a ∈ ℝ^{MN−J}
p. 35
Generalized sidelobe canceler
GSC = Adaptive filter formulation of the LCMV problem
Constrained optimisation is reformulated as a constraint pre-processing, followed by an unconstrained optimisation, leading to a simple adaptation scheme
– LCMV problem is
  min_f f^T·R_yy[k]·f  subject to  C^T·f = b   (f ∈ ℝ^{MN}, C ∈ ℝ^{MN×J}, b ∈ ℝ^J)
– With f = f_q − C_a·f_a (f_a ∈ ℝ^{MN−J}), unconstrained optimization of f_a (MN−J coefficients):
  min_{f_a} (f_q − C_a·f_a)^T·R_yy[k]·(f_q − C_a·f_a)
p. 36
Generalized sidelobe canceler
GSC (continued)
– min_{f_a} (f_q − C_a·f_a)^T·R_yy[k]·(f_q − C_a·f_a) = min_{f_a} E{ ( f_q^T·y[k] − f_a^T·C_a^T·y[k] )² } = min_{f_a} E{ ( d̃[k] − f_a^T·ỹ[k] )² }
– Hence the unconstrained optimization of f_a can be implemented as an adaptive filter (adaptive linear combiner), with filter inputs (= 'left-hand sides') equal to ỹ[k] = C_a^T·y[k] and desired filter output (= 'right-hand side') equal to d̃[k] = f_q^T·y[k]
– LMS algorithm:
  f_a[k+1] = f_a[k] + μ·ỹ[k]·( d̃[k] − ỹ^T[k]·f_a[k] )
           = f_a[k] + μ·C_a^T·y[k]·( y^T[k]·f_q − y^T[k]·C_a·f_a[k] )
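The LMS recursion above can be run as a short loop; a numpy sketch with an M = 2, N = 1 toy setup constructed so the optimal f_a is known exactly (the toy f_q, C_a, and signals are mine, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)

def gsc_lms(Y, fq, Ca, mu=0.01):
    """LMS on the GSC parametrization:
    fa[k+1] = fa[k] + mu * ytil[k] * (dtil[k] - fa^T ytil[k]),
    with ytil[k] = Ca^T y[k] (noise refs) and dtil[k] = fq^T y[k] (speech ref).
    Y: (n_samples, MN) rows are the stacked input vectors y[k]."""
    fa = np.zeros(Ca.shape[1])
    out = np.empty(len(Y))
    for k, y in enumerate(Y):
        dtil = fq @ y
        ytil = Ca.T @ y
        e = dtil - fa @ ytil          # beamformer output z[k]
        fa = fa + mu * ytil * e       # LMS update
        out[k] = e
    return fa, out

# Toy setup: dtil[k] = 2*x[k] + w[k] and ytil[k] = x[k], so optimal fa = 2
fq = np.array([0.5, 0.5])             # fixed beamformer: average of the mics
Ca = np.array([[1.0], [-1.0]])        # blocking matrix: mic difference
x = rng.standard_normal(5000)         # noise component visible in the noise ref
w = 0.1 * rng.standard_normal(5000)   # residual noise
Y = np.column_stack([2.5 * x + w, 1.5 * x + w])
fa, out = gsc_lms(Y, fq, Ca, mu=0.01)
```

The adaptive weight converges to 2 and the output variance drops from that of the speech reference (≈ 4) to that of the residual noise (≈ 0.01).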
p. 37
Generalized sidelobe canceler
GSC then consists of three parts:
• Fixed beamformer (cfr. f_q), satisfying the constraints (but not yet minimum variance), creating the 'speech reference' d̃[k]
• Blocking matrix (cfr. C_a), placing spatial nulls in the look-direction, creating the 'noise references' ỹ[k]
• Multi-channel adaptive filter (linear combiner): your favourite one, e.g. LMS
p. 38
Generalized sidelobe canceler
A popular & significantly cheaper GSC realization is as follows.
Note that some reorganization has been done: the blocking matrix now generates (typically) M−1 (instead of MN−J) noise references, and the multi-channel adaptive filter performs FIR filtering on each noise reference (instead of merely scaling in the linear combiner). The philosophy is the same, the mathematics are different (details on the next slide).
[Figure: microphone signals y_1 … y_M feeding a fixed beamformer and a blocking matrix with M−1 outputs, per-reference FIR adaptive filters, and a postprocessing stage]
p. 39
Generalized sidelobe canceler
• Math details: (for Δ's = 0)
  ỹ[k] = C_a^T·y[k] = C_a,permuted^T·y_permuted[k]
  with
  y_{1:M}[k] = [ y_1[k]  y_2[k] … y_M[k] ]^T
  y_permuted[k] = [ y_{1:M}^T[k]  y_{1:M}^T[k−1] … y_{1:M}^T[k−L+1] ]^T
  Select a 'sparse' blocking matrix such that
  C_a,permuted^T = blockdiag( C̃_a^T, C̃_a^T, …, C̃_a^T )   (= use C̃_a as blocking matrix now)
  Then
  ỹ[k] = [ ỹ_{1:M−1}^T[k]  ỹ_{1:M−1}^T[k−1] … ỹ_{1:M−1}^T[k−L+1] ]^T,  with ỹ_{1:M−1}[k] = C̃_a^T·y_{1:M}[k]
  (= input to the multi-channel adaptive filter)
p. 40
Generalized sidelobe canceler
• Blocking matrix C_a (cfr. scheme page 24)
 – creating (M−1) independent noise references by placing spatial nulls in the look-direction (broadside steering)
 – different possibilities
• Problems of GSC:
 – impossible to reduce noise from the look-direction
 – reverberation effects cause signal leakage into the noise references; the adaptive filter should therefore only be updated when no speech is present, to avoid signal cancellation