Multiuser Detection and Statistical Mechanics

Dongning Guo and Sergio Verdú
Dept. of Electrical Engineering
Princeton University
Princeton, NJ 08544, USA
email: {dguo,verdu}@princeton.edu
June 26, 2002

Abstract
We present a framework for analyzing multiuser detectors in the context of statistical mechanics. A multiuser detector is shown to be equivalent to a conditional mean estimator which finds the mean value of the stochastic output of a so-called Bayes retrochannel. The Bayes retrochannel is equivalent to a spin glass in the sense that the distribution of its stochastic output conditioned on the received signal is exactly the distribution of the spin glass at thermal equilibrium. In the large-system limit, the bit-error-rate of the multiuser detector is simply determined by the magnetization of the spin glass, which can be obtained using powerful tools developed in statistical mechanics. In particular, we derive the large-system uncoded bit-error-rate of the matched filter, the MMSE detector, the decorrelator and the optimal detectors, as well as the spectral efficiency of the Gaussian CDMA channel. It is found that all users with different received energies share the same multiuser efficiency, which uniquely determines the performance of a multiuser detector. A universal interpretation of multiuser detection relates the multiuser efficiency to the mean-square error of the conditional mean estimator output in the large-system limit.

Index Terms: Multiuser detection, statistical mechanics, code-division multiple access, spin glass, self-averaging property, free energy, replica method, multiuser efficiency.

1 Introduction
Multiuser detection is central to the fulfillment of the capabilities of code-division multiple access (CDMA), which is becoming the ubiquitous air-interface in future generation communication systems. In a CDMA system, all frequency and time resources are allocated to all users simultaneously. To distinguish between users, each user is assigned a user-specific spreading sequence on which the user's information symbol is modulated before transmission. By selecting mutually orthogonal spreading sequences for all users, each user can be separated completely by matched filtering to one's spreading sequence. It is not very realistic to maintain orthogonality in a mobile environment and hence multiple access interference (MAI) arises. The problem of demodulating in the presence of the MAI therefore becomes vital for a CDMA system.
Let $P_1, \ldots, P_K$ be the $K$ users' respective received energies per symbol. The received signal in the $n$th chip interval is then expressed as
$$r_n = \frac{1}{\sqrt{N}} \sum_{k=1}^{K} \sqrt{P_k}\, s_{nk}\, d_{0k} + \sigma_0 w_n, \quad n = 1, \ldots, N, \tag{2}$$
where $\{w_n\}$ are independent standard Gaussian random variables, and $\sigma_0^2$ the noise variance. Note that the spreading sequences are randomly chosen for each user and not dependent on the received energies.
We can normalize the averaged transmitted energy by absorbing a common factor into the noise variance. Without loss of generality, we assume
$$\frac{1}{K} \sum_{k=1}^{K} P_k = 1. \tag{3}$$
The SIR$^1$ of user $k$ under matched filtering in absence of interfering users is $P_k/\sigma_0^2$ and the average SIR of all users is $1/\sigma_0^2$. We assume that the energies are known deterministic numbers, and as $K \to \infty$, the empirical distributions of $\{P_k\}$ converge to a known distribution, hereafter referred to as the energy distribution.
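The characteristic of the Gaussian CDMA channel can be described as
$$p_0(r_n \mid \mathbf{d}_0, \mathbf{S}) = (2\pi\sigma_0^2)^{-1/2} \exp\left[ -\frac{1}{2\sigma_0^2} \left( r_n - \frac{1}{\sqrt{N}} \sum_{k=1}^{K} \sqrt{P_k}\, s_{nk}\, d_{0k} \right)^2 \right] \tag{4}$$
where the $N \times K$ matrix $\mathbf{S} = [\mathbf{s}_1, \ldots, \mathbf{s}_K]$. Letting $\mathbf{r} = [r_1, \ldots, r_N]^\top$, $\mathbf{A} = \mathrm{diag}(\sqrt{P_1}, \ldots, \sqrt{P_K})$ and $\mathbf{w} = [w_1, \ldots, w_N]^\top$, we have a compact form for (2) and (4):
$$\mathbf{r} = \mathbf{S}\mathbf{A}\mathbf{d}_0 + \sigma_0 \mathbf{w} \tag{5}$$
and
$$p_0(\mathbf{r} \mid \mathbf{d}_0, \mathbf{S}) = (2\pi\sigma_0^2)^{-N/2} \exp\left[ -\frac{1}{2\sigma_0^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}_0\|^2 \right] \tag{6}$$
where $\|\cdot\|$ denotes the norm of a vector.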
$^1$The SIR is defined as the energy ratio of the useful signal to the noise in the output. In contrast, the SNR of user $k$ is usually defined as $\mathrm{SNR}_k = P_k/(2\sigma_0^2)$.
2.2 Multiuser Detection: Known Results
Assume that all received energies and the noise variance are fixed and known. A multiuser detector observes a received signal vector $\mathbf{r}$ in each symbol interval and tries to recover the transmitted symbols using knowledge of the instantaneous spreading sequences $\mathbf{S}$. In general, the detector outputs a soft decision statistic for each user of interest, which is a function of $(\mathbf{r}, \mathbf{S})$,
$$\tilde{d}_k = f_k(\mathbf{r}, \mathbf{S}), \quad k \in \{1, \ldots, K\}. \tag{7}$$
Whenever the soft output can be separated as a useful signal component and an interference, their energy ratio gives the SIR. Usually, a hard decision is made according to the sign of the soft output,
$$\hat{d}_k = \mathrm{sgn}(\tilde{d}_k). \tag{8}$$
Assuming binary symmetric priors, the bit-error-rate for user $k$ is
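Another important performance index is the multiuser efficiency, which is the ratio between the energy that a user would require to achieve the same BER in absence of interfering users and the actual energy [1],
$$\eta_k = \frac{\sigma_0^2}{P_k} \cdot \left[ Q^{-1}(\mathsf{P}_k) \right]^2, \tag{10}$$
where $\mathsf{P}_k$ denotes the BER of user $k$ and $Q(\cdot)$ the standard Gaussian tail function. Immediately, the BER can be expressed in the multiuser efficiency as
$$\mathsf{P}_k = Q\left( \sqrt{\frac{P_k}{\sigma_0^2} \cdot \eta_k} \right). \tag{11}$$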
In this paper, we study the matched filter, the decorrelator, the MMSE detector and the optimal detectors. The BER and the SIR performance of these CDMA detectors have received considerable attention in the literature. In general, the performance is dependent on the system size $(K, N)$ as well as the instantaneous spreading sequences $\mathbf{S}$, and is therefore very hard to quantify. It turns out that this dependency vanishes in the so-called large-system limit, i.e., the user number $K$ and the spreading factor $N$ both tend to infinity but with their ratio $K/N$ converging to a constant $\beta$. In the following we briefly describe each of these detectors and present previously known large-system results.
2.2.1 The Single-user Matched Filter
The simplest detection is achieved by matched filtering using the desired user's spreading sequence. A soft decision is obtained for user $k$,
$$\tilde{d}_k^{(\mathrm{mf})} = \mathbf{s}_k^{H} \mathbf{r} \tag{12}$$
$$= \sqrt{P_k}\, d_{0k} + \sum_{j \neq k} \sqrt{P_j}\, \mathbf{s}_k^{H} \mathbf{s}_j\, d_{0j} + \sigma_0 z_k \tag{13}$$
where $z_k$ is a standard Gaussian random variable. The second term, the MAI, has a variance of $\beta$ as $K = \beta N \to \infty$. Hence the large-system SIR is simply
$$\mathrm{SIR}_k^{(\mathrm{mf})} = \frac{P_k}{\sigma_0^2 + \beta}. \tag{14}$$
It can be shown using the central limit theorem that the MAI converges to a Gaussian random variable in the large-system limit. Thus the BER$^2$ is
$$\mathsf{P}_k^{(\mathrm{mf})} = Q\left( \sqrt{\mathrm{SIR}_k^{(\mathrm{mf})}} \right). \tag{15}$$
The multiuser efficiency is the same for all users
$$\eta^{(\mathrm{mf})} = \frac{1}{1 + \beta/\sigma_0^2}. \tag{16}$$
Worth noting is that a single multiuser efficiency determines the matched filter performance for all users, since the SIR can be obtained as
$$\mathrm{SIR}_k^{(\mathrm{mf})} = \frac{P_k}{\sigma_0^2} \cdot \eta^{(\mathrm{mf})} \tag{17}$$
and then the BER by (15). This is a result of the inherent symmetry of the multiuser game. Indeed, from every user's point of view, the total interference from the rest of the users is statistically the same in the large-system limit. The only difference among the decision statistics is their own energies. By normalizing with respect to one's own energy, the multiuser efficiency is the same for every user.
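The matched filter results (14) to (17) are simple enough to evaluate directly. The following is a minimal sketch, assuming the Gaussian tail function $Q(x)$ is computed through the complementary error function; the function names are illustrative and not part of the paper.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def matched_filter_efficiency(beta: float, sigma0_sq: float) -> float:
    """Large-system multiuser efficiency of the matched filter, Eq. (16)."""
    return 1.0 / (1.0 + beta / sigma0_sq)


def matched_filter_ber(p_k: float, beta: float, sigma0_sq: float) -> float:
    """Large-system BER of a user with energy p_k: SIR by Eq. (17), BER by Eq. (15)."""
    sir = (p_k / sigma0_sq) * matched_filter_efficiency(beta, sigma0_sq)
    return q_function(math.sqrt(sir))


if __name__ == "__main__":
    # Users with different energies share one efficiency but see different BERs.
    for energy in (0.5, 1.0, 2.0):
        print(energy, matched_filter_ber(energy, beta=0.5, sigma0_sq=0.1))
```

Only the user's own energy enters `matched_filter_ber` besides the common efficiency, which mirrors the symmetry argument above.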
2.2.2 The MMSE Detector
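The MMSE detector is a linear filter which minimizes the mean-square error between the original data and its outputs:
$$\tilde{\mathbf{d}}^{(\mathrm{mmse})} = \mathbf{A}^{-1} \left[ \mathbf{S}^{H}\mathbf{S} + \sigma_0^2 \mathbf{A}^{-2} \right]^{-1} \mathbf{S}^{H} \mathbf{r}. \tag{18}$$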
$^2$Precisely, we refer to the BER in the large-system limit. Unless otherwise stated, all performance indexes such as BER, SIR, multiuser efficiency and spectral efficiency refer to large-system performance hereafter.
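The decision statistic for user $k$ can be described as
$$\tilde{d}_k^{(\mathrm{mmse})} = \psi_{kk}\, d_{0k} + \sum_{j \neq k} \psi_{kj}\, d_{0j} + \sigma_0 z_k \tag{19}$$
where $\psi_{kj}$ is the element of $\boldsymbol{\Psi} = \mathbf{A}^{-1} [\mathbf{S}^{H}\mathbf{S} + \sigma_0^2 \mathbf{A}^{-2}]^{-1} \mathbf{S}^{H}\mathbf{S}\mathbf{A}$ on the $k$th row and the $j$th column, and $z_k$ is a Gaussian random variable.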
In the case of equal-energy users, the large-system SIR of the MMSE detector was first obtained in [1] as
Equation (21) is generalized in [4] to the case of arbitrary energy distribution using random matrix theory, and the so-called Tse-Hanly equation is found. In the large-system limit, with probability 1, the output decision statistic given by (19) converges in distribution to a Gaussian random variable [9, 13]. Hence the BER is determined by the SIR by a simple expression similar to (15). Using (17), the Tse-Hanly equation is distilled to the following fixed-point equation for the multiuser efficiency in [10]
$$\eta + \beta\, \mathsf{E}_P\left\{ \frac{\eta P}{\sigma_0^2 + \eta P} \right\} = 1 \tag{22}$$
where $\mathsf{E}_P\{\cdot\}$ denotes the expectation taken over the subscript variable $P$, which denotes a random energy drawn according to the energy distribution here. Note that the subscript in an expectation expression is often omitted if no ambiguity arises. Again, due to the fact that the output MAI seen by each user has the same asymptotic distribution, the multiuser efficiency is the same for all users.
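The fixed-point equation (22) has no closed form for a general energy distribution, but its left side is increasing in $\eta$ on $[0, 1]$, so a bisection search is enough. The sketch below assumes a discrete energy distribution given as value/probability pairs; the function name, the discrete representation and the tolerance are illustrative choices, not the paper's.

```python
from typing import Sequence


def mmse_efficiency(beta: float, sigma0_sq: float,
                    energies: Sequence[float], weights: Sequence[float],
                    tol: float = 1e-12) -> float:
    """Solve eta + beta * E_P{eta P / (sigma0^2 + eta P)} = 1 by bisection."""

    def residual(eta: float) -> float:
        # Expectation over a discrete energy distribution (assumed representation).
        mai = sum(w * eta * p / (sigma0_sq + eta * p)
                  for p, w in zip(energies, weights))
        return eta + beta * mai - 1.0

    lo, hi = 0.0, 1.0  # residual(0) = -1 < 0 and residual(1) >= 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Equal-energy users versus a two-level energy distribution with mean 1.
    print(mmse_efficiency(0.5, 0.1, [1.0], [1.0]))
    print(mmse_efficiency(0.5, 0.1, [0.5, 1.5], [0.5, 0.5]))
```

For equal energies the equation reduces to a quadratic in $\eta$, which gives a convenient check on the numerical solution.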
2.2.3 The Decorrelator
The decorrelating detector, or the decorrelator, removes the MAI at the expense of enhanced thermal noise. Its output is
$$\tilde{\mathbf{d}}^{(\mathrm{dec})} = \mathbf{A}^{-1} [\mathbf{S}^{H}\mathbf{S}]^{+} \mathbf{S}^{H} \mathbf{r} \tag{23}$$
where $[\cdot]^{+}$ denotes the Moore-Penrose pseudo-inverse, which reduces to the normal matrix inverse for non-singular square matrices. In case of $\beta < 1$, the large-system multiuser efficiency is [1]
$$\eta^{(\mathrm{dec})} = 1 - \beta. \tag{24}$$
It is incorrectly claimed in [28] that the multiuser efficiency is 0 if $\beta > 1$. We find the correct answer in Section 4.
2.2.4 The Optimal Detectors
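The jointly optimal detector maximizes the joint posterior probability and results in
$$\hat{\mathbf{d}}^{(\mathrm{jo})} = \arg\max_{\mathbf{d} \in \{-1,1\}^K} p_0(\mathbf{d} \mid \mathbf{r}, \mathbf{S}) \tag{25}$$
where
$$p_0(\mathbf{d} \mid \mathbf{r}, \mathbf{S}) = \frac{p_0(\mathbf{r} \mid \mathbf{d}, \mathbf{S})\, p_0(\mathbf{d})}{\sum_{\mathbf{d}' \in \{-1,1\}^K} p_0(\mathbf{r} \mid \mathbf{d}', \mathbf{S})\, p_0(\mathbf{d}')}. \tag{26}$$
The individually optimal detector maximizes the marginal posterior probability and results in
$$\hat{d}_k^{(\mathrm{io})} = \arg\max_{d_k \in \{-1,1\}} p_0(d_k \mid \mathbf{r}, \mathbf{S}). \tag{27}$$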
Moreover, we show in Section 7 that the spectral efficiency of a CDMA channel without power control is
exactly (31) where the expectation is taken over the energy distribution.
New results culminate in a general and original interpretation of multiuser detection from the statistical
mechanics perspective in Section 6.
3 Conditional Mean Estimator and Spin Glass
3.1 Bayes Retrochannel and Conditional Mean Estimator
We study a general estimation problem as depicted in Fig. 1. A (vector) source symbol $\mathbf{d}_0$ is drawn according to a prior distribution $p_0(\mathbf{d}_0)$. The channel, upon an input $\mathbf{d}_0$, generates an output $\mathbf{r}$ according to a conditional probability distribution $p_0(\mathbf{r} \mid \mathbf{d}_0)$. We want to find an estimator that, upon receipt of $\mathbf{r}$, gives a good estimate of the originally transmitted data $\mathbf{d}_0$.

One interesting candidate is an adjunct channel with a conditional distribution $p(\mathbf{d} \mid \mathbf{r})$, which is induced by a postulated prior distribution $p(\mathbf{d})$ and a postulated conditional distribution $p(\mathbf{r} \mid \mathbf{d})$ using the Bayes formula
$$p(\mathbf{d} \mid \mathbf{r}) = \frac{p(\mathbf{r} \mid \mathbf{d})\, p(\mathbf{d})}{\sum_{\mathbf{d}'} p(\mathbf{r} \mid \mathbf{d}')\, p(\mathbf{d}')}. \tag{38}$$
We call this channel a Bayes retrochannel. If the postulated prior and conditional distributions are the same as the true ones, i.e., $p(\mathbf{d}) \equiv p_0(\mathbf{d})$ and $p(\mathbf{r} \mid \mathbf{d}) \equiv p_0(\mathbf{r} \mid \mathbf{d})$, then $p(\mathbf{d} \mid \mathbf{r})$ is exactly the posterior probability distribution corresponding to the true source and the true channel. In general, the postulated prior as well as the conditional distribution can be different from the true ones. In fact, $p(\mathbf{d})$ and $p_0(\mathbf{d})$ may even have different allowed symbol sets. In case the support of $p(\mathbf{d})$ is the Euclidean space instead of a discrete set, the sum in (38) shall be replaced by an integral. Clearly, the retrochannel output is different from the usual notion of an estimate of $\mathbf{d}_0$ since it is a random variable instead of a deterministic function of $\mathbf{r}$. Hence, the retrochannel can be regarded as a "stochastic estimator".
We also consider a so-called conditional mean estimator, which gives a deterministic output upon an input $\mathbf{r}$ as
$$\tilde{\mathbf{d}} = \langle \mathbf{d} \rangle = \mathsf{E}\{\mathbf{d} \mid \mathbf{r}\} \tag{39}$$
where, by definition, the operator $\langle \cdot \rangle$ gives the expectation taken over the distribution $p(\mathbf{d} \mid \mathbf{r})$, which depends on the postulated prior and conditional distributions assumed for the Bayes retrochannel. In this case, the output of the estimator is exactly the mean value of the stochastic estimate generated by the Bayes retrochannel.
Essentially, the conditional mean estimator is a soft decision detector with the freedom of choosing the postulated prior and conditional distributions. Interestingly, tuning the postulated distributions allows us to realize arbitrary detectors. In particular, if the source and channel symbols are both scalar, the prior is symmetric binary, $p_0(d_0 = 1) = p_0(d_0 = -1) = \frac{1}{2}$, and the postulated distributions are the same as the true ones, then the conditional mean estimate is
$$\langle d \rangle = p_0(d = 1 \mid r) - p_0(d = -1 \mid r) \tag{40}$$
whose sign is the maximum a posteriori detector output for the scalar channel.
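As a concrete instance of (40), consider a scalar Gaussian channel. The sketch below assumes the unit-energy model $r = d_0 + \sigma_0 w$ with equiprobable $d_0 \in \{-1, +1\}$, which is an illustrative special case rather than a model stated at this point of the paper. Under that assumption the conditional mean reduces to $\tanh(r/\sigma_0^2)$, which the code checks numerically.

```python
import math


def posterior_plus(r: float, sigma0_sq: float) -> float:
    """p_0(d = +1 | r) for equiprobable d in {-1, +1} and r = d + sigma_0 * w."""
    like_plus = math.exp(-(r - 1.0) ** 2 / (2.0 * sigma0_sq))
    like_minus = math.exp(-(r + 1.0) ** 2 / (2.0 * sigma0_sq))
    return like_plus / (like_plus + like_minus)


def conditional_mean(r: float, sigma0_sq: float) -> float:
    """Eq. (40): p_0(d = +1 | r) - p_0(d = -1 | r)."""
    p_plus = posterior_plus(r, sigma0_sq)
    return p_plus - (1.0 - p_plus)


if __name__ == "__main__":
    for r in (-1.5, -0.2, 0.0, 0.7):
        # The two columns agree: the soft output is tanh(r / sigma_0^2).
        print(r, conditional_mean(r, 0.5), math.tanh(r / 0.5))
```

The sign of the soft output equals the sign of $r$, which is the maximum a posteriori decision for this scalar channel.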
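We now study the conditional mean estimator in the Gaussian CDMA channel setting. Recall that the spreading sequence matrix is $\mathbf{S}$. Let the conditional distribution be
$$p(\mathbf{r} \mid \mathbf{d}, \mathbf{S}) = (2\pi\sigma^2)^{-N/2} \exp\left[ -\frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right] \tag{41}$$
which differs from the true channel law $p_0(\mathbf{r} \mid \mathbf{d}, \mathbf{S})$ by a positive control parameter $\sigma$, where in case $\sigma = \sigma_0$, $p_0(\mathbf{r} \mid \mathbf{d}, \mathbf{S}) \equiv p(\mathbf{r} \mid \mathbf{d}, \mathbf{S})$. The posterior probability distribution is then
$$p(\mathbf{d} \mid \mathbf{r}, \mathbf{S}) = Z^{-1}(\mathbf{r}, \mathbf{S})\, p(\mathbf{d}) \exp\left[ -\frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right] \tag{42}$$
where
$$Z(\mathbf{r}, \mathbf{S}) = \sum_{\mathbf{d}} p(\mathbf{d}) \exp\left[ -\frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right]. \tag{43}$$
The conditional mean estimator outputs the mean value of the posterior probability distribution,
$$\tilde{d}_k = \langle d_k \rangle = \mathsf{E}\{d_k \mid \mathbf{r}, \mathbf{S}\} \tag{44}$$
where the expectation is taken over $p(\mathbf{d} \mid \mathbf{r}, \mathbf{S})$. We identify a few choices for the prior distribution $p(\mathbf{d})$ and the control parameter $\sigma$ for the conditional distribution under which the conditional mean estimator becomes equivalent to each of the multiuser detectors discussed in Section 2.2.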
3.1.1 Linear Detectors
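We assume standard Gaussian priors,
$$p^{(l)}(\mathbf{d}) = \prod_{k=1}^{K} (2\pi)^{-1/2} \exp\left[ -\frac{d_k^2}{2} \right] = (2\pi)^{-K/2} \exp\left[ -\frac{1}{2} \|\mathbf{d}\|^2 \right]. \tag{45}$$
The posterior probability distribution is then
$$p^{(l)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S}) = \left[ Z^{(l)}(\mathbf{r}, \mathbf{S}) \right]^{-1} \exp\left[ -\frac{1}{2} \|\mathbf{d}\|^2 - \frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right] \tag{46}$$
where
$$Z^{(l)}(\mathbf{r}, \mathbf{S}) = \int \mathrm{d}\mathbf{d}\, \exp\left[ -\frac{1}{2} \|\mathbf{d}\|^2 \right] \exp\left[ -\frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right]. \tag{47}$$
Here we use the superscript (l) to denote linear detectors.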
Since the probability given by (46) is exponential in a quadratic function in $\mathbf{d}$, $\mathbf{d}$ is a Gaussian random vector conditioned on $\mathbf{r}$ and $\mathbf{S}$, i.e., given $\mathbf{r}$ and $\mathbf{S}$, the components of $\mathbf{d}$, $d_1, \ldots, d_K$, are jointly Gaussian. The conditional mean $\mathsf{E}\{\mathbf{d} \mid \mathbf{r}, \mathbf{S}\}$ therefore coincides with the vector that maximizes the posterior probability distribution $p^{(l)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S})$, as a property of Gaussian distributions. The extremum is the solution to
$$\frac{\partial p^{(l)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S})}{\partial \mathbf{d}} = \mathbf{0} \tag{48}$$
which yields
$$\langle \mathbf{d} \rangle_\sigma^{(l)} = \mathbf{A}^{-1} \left[ \mathbf{S}^{H}\mathbf{S} + \sigma^2 \mathbf{A}^{-2} \right]^{-1} \mathbf{S}^{H} \mathbf{r} \tag{49}$$
where we use the subscript $\sigma$ to denote an average over the posterior probability distribution with the control parameter $\sigma$.
By choosing different values for $\sigma$, we arrive at different linear detectors. If $\sigma \to \infty$, then
$$\sigma^2 \cdot \langle d_k \rangle_\sigma^{(l)} \to \sqrt{P_k}\, \mathbf{s}_k^{H} \mathbf{r}. \tag{50}$$
Hence the conditional mean estimate is consistent in sign with the matched filter output. If $\sigma = \sigma_0$, equation (49) gives exactly the soft output of the MMSE receiver as in (18). If $\sigma \to 0$, we get the soft output of the decorrelator as given by (23). We have seen that the control parameter can be used to tune a parameterized conditional mean estimator to a desired one in a set of detectors.
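The role of the control parameter can be checked numerically on a toy system. The sketch below evaluates (49) for real-valued spreading sequences with a small hand-written linear solver; the 4-chip, 2-user spreading matrix, the energies and all function names are illustrative assumptions. A very large $\sigma^2$ reproduces the scaled matched filter output of (50), and a very small $\sigma^2$ reproduces the decorrelator, which recovers the symbols exactly when the received signal is noise-free.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def matched_filter_outputs(s: Matrix, r: Vector) -> Vector:
    """S^T r: correlate r with each real-valued spreading sequence."""
    return [sum(s[n][k] * r[n] for n in range(len(s))) for k in range(len(s[0]))]


def conditional_mean_linear(s: Matrix, energies: Vector, r: Vector,
                            sigma_sq: float) -> Vector:
    """Eq. (49) for real S: A^{-1} [S^T S + sigma^2 A^{-2}]^{-1} S^T r."""
    n_users = len(s[0])
    gram = [[sum(row[k] * row[j] for row in s) for j in range(n_users)]
            for k in range(n_users)]
    for k in range(n_users):
        gram[k][k] += sigma_sq / energies[k]  # sigma^2 A^{-2}, with A^2 = diag(P_k)
    x = solve(gram, matched_filter_outputs(s, r))
    return [x[k] / math.sqrt(energies[k]) for k in range(n_users)]


if __name__ == "__main__":
    s = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, 0.5]]  # toy unit-norm sequences
    energies = [1.0, 2.0]
    d0 = [1.0, -1.0]
    r = [sum(s[n][k] * math.sqrt(energies[k]) * d0[k] for k in range(2))
         for n in range(4)]
    for sigma_sq in (1e-12, 0.1, 1e8):
        print(sigma_sq, conditional_mean_linear(s, energies, r, sigma_sq))
```

The solver is written out only to keep the example free of external dependencies; any linear algebra routine would serve the same purpose.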
3.1.2 The Optimal Detectors
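Let $p(\mathbf{d})$ be the true binary symmetric priors $p_0(\mathbf{d})$. The posterior probability distribution is then
$$p^{(o)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S}) = \left[ Z^{(o)}(\mathbf{r}, \mathbf{S}) \right]^{-1} \exp\left[ -\frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right], \quad \mathbf{d} \in \{-1, 1\}^K \tag{51}$$
where
$$Z^{(o)}(\mathbf{r}, \mathbf{S}) = \sum_{\mathbf{d} \in \{-1,1\}^K} \exp\left[ -\frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right]. \tag{52}$$
We use the superscript (o) for the optimal detectors.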
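Suppose that the control parameter takes the value of $\sigma_0$, then the conditional probability distribution is the true channel law. The conditional mean estimate is
$$\langle d_k \rangle_{\sigma_0}^{(o)} = p_0(d_k = 1 \mid \mathbf{r}, \mathbf{S}) - p_0(d_k = -1 \mid \mathbf{r}, \mathbf{S}). \tag{53}$$
Clearly, this soft output is consistent in sign with the hard decision of the IO detector as given by (27).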
Alternatively, if $\sigma \to 0$, most of the probability mass of the distribution $p^{(o)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S})$ is concentrated on a vector $\hat{\mathbf{d}}$ that achieves the minimum of $\|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|$, which also maximizes the posterior probability $p_0(\mathbf{d} \mid \mathbf{r}, \mathbf{S})$. The conditional mean estimator output
$$\lim_{\sigma \to 0} \langle d_k \rangle_\sigma^{(o)} \tag{54}$$
then singles out the $k$th component of this $\hat{\mathbf{d}}$ at the minimum. Therefore by letting $\sigma \to 0$ the conditional mean estimator is equivalent to the JO detector as given by (25).

Worth mentioning here is that if $\sigma \to \infty$, the conditional mean estimator reduces to the matched filter. This can be verified by noticing that
$$p^{(o)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S}) \propto 1 - \frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 + O(\sigma^{-4}). \tag{55}$$
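For a handful of users, the posterior (51) can be enumerated directly, which makes the three regimes of the control parameter easy to observe. The sketch below is a brute-force illustration for real-valued spreading sequences; the toy spreading matrix, the received vector and the function names are assumptions made for the example, and the enumeration over $2^K$ vectors is only practical for small $K$.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def squared_distance(s: Matrix, amps: Sequence[float], r: Sequence[float],
                     d: Sequence[float]) -> float:
    """||r - S A d||^2 for real-valued S and A = diag(amps)."""
    return sum((r[n] - sum(s[n][k] * amps[k] * d[k] for k in range(len(d)))) ** 2
               for n in range(len(r)))


def posterior_mean_binary(s: Matrix, energies: Sequence[float],
                          r: Sequence[float], sigma_sq: float) -> List[float]:
    """<d_k> under the posterior (51), by enumerating d in {-1, +1}^K."""
    amps = [math.sqrt(p) for p in energies]
    configs = list(itertools.product((-1.0, 1.0), repeat=len(energies)))
    dists = [squared_distance(s, amps, r, d) for d in configs]
    d_min = min(dists)  # shift exponents so the largest weight is exactly 1
    weights = [math.exp(-(x - d_min) / (2.0 * sigma_sq)) for x in dists]
    z = sum(weights)
    return [sum(w * d[k] for w, d in zip(weights, configs)) / z
            for k in range(len(energies))]


def jointly_optimal(s: Matrix, energies: Sequence[float],
                    r: Sequence[float]) -> Tuple[float, ...]:
    """JO decision (25) for equiprobable symbols: minimise ||r - S A d||."""
    amps = [math.sqrt(p) for p in energies]
    configs = itertools.product((-1.0, 1.0), repeat=len(energies))
    return min(configs, key=lambda d: squared_distance(s, amps, r, d))


if __name__ == "__main__":
    s = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, 0.5]]  # toy unit-norm sequences
    energies = [1.0, 2.0]
    r = [-0.2, 1.2, -0.25, -0.15]  # illustrative noisy observation
    print(jointly_optimal(s, energies, r))
    for sigma_sq in (1e-3, 0.1, 1e6):
        print(sigma_sq, posterior_mean_binary(s, energies, r, sigma_sq))
```

Subtracting the minimum distance before exponentiating does not change the posterior, since the common factor cancels in the normalization, but it avoids a zero partition sum when $\sigma$ is small.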
In all, at thermal equilibrium, the entropy is maximized and the free energy minimized. The system is found in a configuration with probability that is negative exponential in the associated energy of the configuration. The most probable configuration is the ground state which has the minimum associated energy.
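These statements can be illustrated on a system with a few discrete configurations. The sketch below assumes the standard Boltzmann form with an explicit temperature parameter and natural logarithms; the function names and the example energies are illustrative. It checks that the ground state is the most probable configuration and that the free energy equals the mean energy minus temperature times entropy.

```python
import math
from typing import List, Sequence


def boltzmann(energies: Sequence[float], temperature: float) -> List[float]:
    """Equilibrium probabilities, proportional to exp(-energy / temperature)."""
    e_min = min(energies)  # shifting by the minimum leaves the distribution unchanged
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def free_energy(energies: Sequence[float], temperature: float) -> float:
    """-T log Z, computed with the same shift for numerical stability."""
    e_min = min(energies)
    z = sum(math.exp(-(e - e_min) / temperature) for e in energies)
    return e_min - temperature * math.log(z)


if __name__ == "__main__":
    levels = [0.0, 1.0, 2.5]
    probs = boltzmann(levels, temperature=0.7)
    mean_energy = sum(p * e for p, e in zip(probs, levels))
    entropy = -sum(p * math.log(p) for p in probs)
    print(probs)
    print(free_energy(levels, 0.7), mean_energy - 0.7 * entropy)
```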
3.3 The Bayes Retrochannel and Spin Glass
A spin glass is a magnetic system consisting of many directional spins, in which the interaction of the spins is determined by the so-called quenched random variables, i.e., whose values are determined by the realization of the spin glass [30].$^3$ The energy, the entropy and the free energy of a spin glass are similarly defined as in Section 3.2 with $\mathcal{H}(\mathbf{d})$ replaced by a Hamiltonian $\mathcal{H}_{\mathbf{r}}(\mathbf{d})$ parameterized by some quenched random variable $\mathbf{r}$.

In the multiuser detection context, we study an artificial spin glass system which has a Hamiltonian $\mathcal{H}_{\mathbf{r},\mathbf{S}}(\mathbf{d})$, in which the received signal $\mathbf{r}$ and the spreading sequences $\mathbf{S}$ are regarded as quenched random variables. Let the energy $\mathcal{E}$ be such that the inverse temperature is 1. Then, at thermal equilibrium, the probability distribution of the spin glass system configuration is
$$p(\mathbf{d} \mid \mathbf{r}, \mathbf{S}) = Z^{-1}(\mathbf{r}, \mathbf{S}) \exp\left[ -\mathcal{H}_{\mathbf{r},\mathbf{S}}(\mathbf{d}) \right] \tag{66}$$
where
$$Z(\mathbf{r}, \mathbf{S}) = \sum_{\mathbf{d}} \exp\left[ -\mathcal{H}_{\mathbf{r},\mathbf{S}}(\mathbf{d}) \right]. \tag{67}$$
Compare (66) to the postulated posterior probability distributions associated with the corresponding Bayes retrochannels, $p^{(l)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S})$ and $p^{(o)}(\mathbf{d} \mid \mathbf{r}, \mathbf{S})$, given by (46) and (51) respectively. The configuration distribution $p(\mathbf{d} \mid \mathbf{r}, \mathbf{S})$ can be made to take each of the two posterior probability distributions if we choose the corresponding Hamiltonian as
$$\mathcal{H}^{(l)}_{\mathbf{r},\mathbf{S}}(\mathbf{d}) = \frac{1}{2} \|\mathbf{d}\|^2 + \frac{1}{2\sigma^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2, \quad \mathbf{d} \in \mathbb{R}^K \tag{68}$$

$^3$Imagine a system consisting of molecules with random magnetic spins that evolve over time, while the random positions of the molecules are fixed for each concrete instance as in a piece of glass.
Clearly, a macroscopic property may be obtained by way of calculating the free energy of the modified system
�� / 1 � � � B��%/ � G� �� (81)
using the replica method.
3.5 BER
3.5.1 Correlation
In the multiuser detection setting, we study the most important performance measure, the bit-error-rate. For the time being we consider equal-energy users, i.e., $P_k = 1$ for all $k$. Conditioned on $(\mathbf{r}, \mathbf{S})$, the fraction of erroneously detected bits in the current symbol interval is $\frac{1}{2}[1 - \mathcal{M}(\mathbf{r}, \mathbf{S})]$ where
$$\mathcal{M}(\mathbf{r}, \mathbf{S}) = \frac{1}{K} \sum_{k=1}^{K} d_{0k}\, \mathrm{sgn}(\langle d_k \rangle) \tag{82}$$
is the correlation of the detector output and the original transmitted symbols. Although not in the form of a posterior average $\langle \cdot \rangle$, the correlation is a macroscopic quantity and hence also satisfies the self-averaging principle. Hence, $\mathcal{M}(\mathbf{r}, \mathbf{S}) \to \mathcal{M}$ with probability 1 as $K \to \infty$ where the limit
$$\mathcal{M} = \lim_{K \to \infty} \mathsf{E}\{\mathcal{M}(\mathbf{r}, \mathbf{S})\} \tag{83}$$
is independent of the spreading sequences and the noise realization. The BER of every user converges in the large-system limit to
$$\mathsf{P} = \frac{1}{2}(1 - \mathcal{M}). \tag{84}$$
Due to symmetry of the spreading sequences, as far as the BER is concerned, we can assume that all transmitted symbols are $+1$, i.e., $d_{0k} = 1$ for all $k$. The average correlation is now
$$\mathsf{E}\left\{ \frac{1}{K} \sum_{k=1}^{K} \mathrm{sgn}(\langle d_k \rangle) \right\}. \tag{85}$$
If $d_k$ represents the local directional magnetization of a spin glass, then the average correlation is the overall magnetization of the system.
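The relation between the correlation (82) and the fraction of bit errors is a short identity that can be checked on any set of soft outputs. The sketch below is illustrative: the function names are assumptions, and the convention of mapping a zero soft output to $+1$ is an arbitrary tie-breaking choice not taken from the paper.

```python
from typing import Sequence


def sgn(x: float) -> int:
    """Sign function; mapping 0 to +1 is an arbitrary tie-breaking assumption."""
    return 1 if x >= 0 else -1


def correlation(d0: Sequence[int], soft: Sequence[float]) -> float:
    """Eq. (82): (1/K) * sum_k d0_k * sgn(<d_k>)."""
    return sum(d * sgn(x) for d, x in zip(d0, soft)) / len(d0)


def error_fraction(d0: Sequence[int], soft: Sequence[float]) -> float:
    """Fraction of erroneous hard decisions, (1 - correlation) / 2."""
    return 0.5 * (1.0 - correlation(d0, soft))


if __name__ == "__main__":
    transmitted = [1, -1, 1, 1]
    soft_outputs = [0.3, 0.2, -0.1, 2.0]  # illustrative conditional mean estimates
    print(correlation(transmitted, soft_outputs))
    print(error_fraction(transmitted, soft_outputs))
```

Each correct decision contributes $+1$ and each error contributes $-1$ to the sum, which is why the error fraction is $(1 - \mathcal{M})/2$.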
3.5.2 Arbitrary Energy Distribution
To obtain the correlation for a general received energy distribution, we consider first a system in which the $K$ users are divided into $J$ equal-energy groups. The number of users in the $j$th group, $K_j$, is $K_j = \alpha_j K$, and each user in this group takes the same energy $\bar{P}_j$. Let the users be numbered such that user 1 through $K_1$ form group 1, user $K_1 + 1$ through $K_1 + K_2$ form group 2, and so on. The parameters satisfy
$$\alpha_1 + \alpha_2 + \cdots + \alpha_J = 1, \tag{86a}$$
$$\alpha_1 \bar{P}_1 + \alpha_2 \bar{P}_2 + \cdots + \alpha_J \bar{P}_J = 1 \tag{86b}$$
so that the average energy is 1. We call such a distribution a "simple" energy distribution.
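The $K_j$ users in group $j$ share the same BER, which can be written as
$$\mathsf{P}_j = \frac{1}{2}(1 - \mathcal{M}_j) \tag{87}$$
where
$$\mathcal{M}_j = \lim_{K \to \infty} \mathsf{E}\left\{ \frac{1}{K_j} \sum_{k \,\in\, \text{group } j} \mathrm{sgn}(\langle d_k \rangle) \right\}. \tag{88}$$
The reason is that the correlation inside $\mathsf{E}\{\cdot\}$ is also self-averaging as $K_j = \alpha_j K \to \infty$.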
In case of an arbitrary energy distribution, we consider a sequence of "simple" distributions, $F_i$, $i = 1, 2, \ldots$, that converge to the given energy distribution. For each $i$, assume that the system consists of $J$ groups of equal-energy users.
� � � 5�78 J � �� � � � � � � � � � � O T d � T (112)
where the inner expectation is now over the distribution of the random vector $\mathbf{v} = [v_0, v_1, \ldots, v_u]$, which is dependent on the $d_{ak}$'s. Note that the $s_{nk}$'s are i.i.d. random chips. For fixed $d_{ak}$'s, each $v_a$ is a sum of $K$ weighted independent random chips, which converges to a Gaussian random variable as $K \to \infty$. Because of a generalization of the central limit theorem, the variables $\{v_a\}_{a=0}^{u}$ converge to a set of jointly Gaussian random variables, with zero mean and covariance matrix $\mathbf{Q}$. Interested readers may refer to [28, Appendix B] for a justification. Therefore the expectation over the $s_{nk}$'s is dependent on the symbols $\{d_{ak}\}$ only through $\mathbf{Q}$ in the large-system limit. Note that although not made explicit in the notation, $Q_{ab}$ is a function of $\{d_{ak}, d_{bk}\}_{k=1}^{K}$. Trivially, $Q_{00} = 1$.

It is clear that the outer integral in (112) can be regarded as an expectation taken over the distribution of the data symbols $\{d_{ak}\}$. Since the integrand is only dependent on $\mathbf{Q}$, it is easier to first take the expectation conditioned on $\mathbf{Q}$ and then take a second expectation with respect to the distribution of $\mathbf{Q}$ that results from varying the $d_{ak}$'s in (113).
The problem is now to evaluate � � and � � . An implication of Varadhan’s theorem is that only a single
subshell will contribute to the integral in the large-system limit. To find the extremum with respect to � ��� ’sdirectly is a prohibitive task. Instead, we assume replica symmetry, i.e., for � ��� � ��������� �
�, � �� � ,
� � � � � � � � � � ��� � (118a)
� ��� � � � � � � � ��� � (118b)
� ��� � ��� � ��� � � � (118c)
Replica symmetry is a convenient choice among all possible ansätze. It is valid in most cases of interest in the large-system limit due to symmetry over replicas. The readers are referred to [30, 28] for more discussion on the justification of this assumption. In Appendix 9, we evaluate (114) and obtain
Following the same procedure as that for the linear detectors, in Appendix 12 we obtain the correlation
for a group of equal energy users as
/ � �� �� ��� �� �
� 3��%� � � � ���
���� � � � � ( ��* +
� � � � � � � � � , ��� � (156)
Hence we have the same expression for the correlation as (131). The multiuser efficiency is given by the
same (139) where the parameters � and � are the solution to (155).
As shown in Section 3.1.2, various types of detectors can be achieved by tuning the control parameter $\sigma$.
In particular, the conditional mean estimator reduces to the matched filter if ' ) . In this case � � � ' and we get the matched filter efficiency (16) by (139).
In case of $\sigma = \sigma_0$, the conditional mean estimator is exactly the IO detector. Notice that for all ,
� � � / 1 �L� � B - ���0� � � ��� /21 �L� � B �0� � (177)
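Assume the spreading codes are known and fixed. By (6), $p_{\mathbf{r}|\mathbf{d}}$ is a Gaussian density, and hence
$$\mathsf{E}\{\log p_{\mathbf{r}|\mathbf{d}}(\mathbf{r} \mid \mathbf{d}_0)\} = \mathsf{E}\left\{ \log\left[ (2\pi\sigma_0^2)^{-N/2} \exp\left( -\frac{1}{2} \|\mathbf{w}\|^2 \right) \right] \right\} \tag{178}$$
$$= -\frac{N}{2} \log(2\pi e \sigma_0^2). \tag{179}$$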
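The step from (178) to (179) can be spelled out as follows. This is a sketch assuming natural logarithms and using only (5), (6) and the fact that $\mathbf{w}$ has $N$ independent standard Gaussian components; the intermediate lines are not part of the original derivation.

```latex
\begin{aligned}
\log p_{\mathbf{r}|\mathbf{d}}(\mathbf{r} \mid \mathbf{d}_0)
  &= -\frac{N}{2}\log(2\pi\sigma_0^2)
     - \frac{1}{2\sigma_0^2}\,\|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}_0\|^2
     && \text{by (6)} \\
  &= -\frac{N}{2}\log(2\pi\sigma_0^2) - \frac{1}{2}\,\|\mathbf{w}\|^2
     && \text{by (5)} \\
\mathsf{E}\{\log p_{\mathbf{r}|\mathbf{d}}(\mathbf{r} \mid \mathbf{d}_0)\}
  &= -\frac{N}{2}\log(2\pi\sigma_0^2) - \frac{N}{2}
     && \text{since } \mathsf{E}\|\mathbf{w}\|^2 = N \\
  &= -\frac{N}{2}\log(2\pi e\,\sigma_0^2).
\end{aligned}
```

The result depends neither on the spreading sequences nor on the transmitted symbols, because the conditional density only measures the noise.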
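The Gaussian prior is known to give the maximum of the conditional divergence $D(p_{\mathbf{r}|\mathbf{d}} \,\|\, p_{\mathbf{r}} \mid p_{\mathbf{d}})$. Let the $d_k$'s be independent standard Gaussian random variables so that the prior distribution is given by (45). Then $p_{\mathbf{r}}$ is given by
$$p_{\mathbf{r}}(\mathbf{r}) = \int \mathrm{d}\mathbf{d}\, (2\pi)^{-K/2} \exp\left[ -\frac{1}{2} \|\mathbf{d}\|^2 \right] (2\pi\sigma_0^2)^{-N/2} \exp\left[ -\frac{1}{2\sigma_0^2} \|\mathbf{r} - \mathbf{S}\mathbf{A}\mathbf{d}\|^2 \right] \tag{180}$$
$$= (2\pi)^{-K/2} (2\pi\sigma_0^2)^{-N/2}\, Z^{(l)}(\mathbf{r}, \mathbf{S}) \tag{181}$$
where $Z^{(l)}(\mathbf{r}, \mathbf{S})$ is given by (47) with $\sigma = \sigma_0$. Hence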