1053-587X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSP.2015.2486749, IEEE Transactions on Signal Processing
Two Stage Constant-Envelope Precoding for Low Cost Massive MIMO Systems
An Liu, Member IEEE, and Vincent Lau, Fellow IEEE
Abstract
Massive MIMO is a key technology to meet increasing capacity demands in 5G wireless systems. However, a base station (BS) equipped with M ≫ 1 antennas requires M radio frequency (RF) chains with linear power amplifiers, which are very expensive. In this paper, we propose a two stage constant-envelope (CE) precoding scheme to enable low-cost implementation of massive MIMO BS with S ≪ M RF chains and nonlinear power amplifiers. Specifically, the MIMO precoder at the BS is partitioned into an RF precoder and a baseband precoder. The RF precoder is adaptive to the slow timescale channel statistics to achieve the array gain. The baseband precoder is adaptive to the fast timescale low dimensional effective channel to achieve the spatial multiplexing gain. Both the RF and baseband precoders are subject to CE constraints to reduce the implementation cost and the peak-to-average power ratio of the transmit signal. The two stage CE precoding is a challenging non-convex stochastic optimization problem and we propose an online alternating optimization algorithm which can autonomously converge to a stationary solution without explicit knowledge of channel statistics. Simulations show that the proposed solution has many advantages over various baselines.
Index Terms
Massive MIMO, Constant-Envelope Precoding, Limited RF Chains, PAPR, Online Alternating Optimization
Copyright (c) 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].
This work was partially supported by NSFC Grant No. 61571383, and partially supported by RGC 614913 and Huawei.
An Liu and Vincent K. N. Lau are with the Department of ECE, The Hong Kong University of Science and Technology (email: [email protected]; [email protected]).
I. INTRODUCTION
Massive MIMO is regarded as the main contributor for spectrum efficiency gain in 5G wireless
systems [1]. While theoretical research and even some prototype demonstrations show considerable performance
advantages of massive MIMO, the practical deployment still faces two big challenges.
High RF Chain Cost: For a massive MIMO system with hundreds of antennas, it is very expensive to
have one RF chain behind every antenna. In practical massive MIMO systems, it is desirable to deploy
fewer RF chains than the number of antennas to reduce the hardware cost.
High Peak-to-Average Power Ratio (PAR): In today’s wideband systems, orthogonal frequency-
division multiplexing (OFDM) is widely used to deal with frequency selective channels. However, OFDM
combined with linear MIMO precoding yields transmit signals with very high PAR especially for massive
MIMO systems [2]–[4]. As a result, more expensive linear power amplifiers are required to avoid out-
of-band radiation and signal distortions. Therefore, a better transmission scheme which yields low-PAR
transmit signals would be desirable to enable low-cost and low-power BS implementations with nonlinear
power amplifiers.
There have been some works aiming at solving one of the above practical issues in massive MIMO
systems. For example, to reduce the number of RF chains, hybrid RF/baseband precoding has been
proposed for MIMO and mmWave systems [5]–[8], where the MIMO precoder is a cascade of a high
dimensional RF precoder followed by a low dimensional baseband precoder. However, the high PAR
issue in frequency selective fading channels is not addressed in [5]–[8]. A few low-PAR precoders for
massive MIMO systems have been proposed in [2]–[4]. In [3], [4], a constant envelope (CE) constraint is
imposed on the transmit signals of massive MIMO systems. In this case, extra power is required in order
to achieve the same sum-rate as without the CE constraint. However, the overall power consumption
may decrease because the amplifier back-off is reduced and thus the amplifier efficiency is improved [3],
[4]. Moreover, CE transmit signals are much more RF-friendly (which leads to cheaper and simpler BS
design) and can reduce the out-of-band radiation and signal distortions. However, the existing low-PAR
precoders in [3], [4] suffer from high RF chain cost. To the best of our knowledge, there is no systematic
method reported in the literature to simultaneously solve all of the above issues.
In this paper, we propose a novel two stage constant-envelope precoding architecture for massive
MIMO systems to simultaneously solve all of the aforementioned practical issues. Specifically, the MIMO
precoder at the BS is partitioned into an RF precoder and a baseband precoder as illustrated in Fig. 1. The
baseband precoder generates baseband transmit vectors with CE elements from the input data symbol
vectors. Then the CE baseband transmit vector is converted to an RF signal vector using S ≪ M RF chains
and the resulting RF signal vector is precoded using an RF precoder before being transmitted using the
M antennas. To facilitate low cost implementation of the RF precoder using RF phase shifting networks
[5], [6], the RF precoder adapts only to the channel statistics and is subject to a CE constraint. The proposed two
stage CE precoding has several advantages. First, since only S ≪ M RF chains are required at the BS,
the cost of RF chains can be significantly reduced. Second, the CE constraint on the baseband precoder
ensures that the baseband signal at the input of the RF chain has low PAR, which enables low-cost and
low-power BS with nonlinear power amplifiers. Third, the CSI signaling overhead can also be alleviated
since only low dimensional effective CSI is required at each BS. Finally, the frequency selective fading
can also be combated by the two stage CE precoding and there is no need to use OFDM, which greatly
reduces the complexity at user side. Therefore, the proposed two stage CE precoding architecture is a
good candidate for practical implementation of massive MIMO systems. However, these good features
of the proposed solution cannot be achieved by a naive combination of the existing techniques. There
are several new technical challenges associated with the design of two stage CE precoding.
• Two Stage Non-convex Stochastic Optimization: Due to the mixed timescale precoding structure
and the CE constraints, the design of two stage CE precoding is a two stage non-convex stochastic
optimization problem [9], which cannot be solved by the existing stochastic optimization algorithms
such as stochastic subgradient [10] or stochastic cutting plane [9].
• Infinite Dimensional Problem: When the CSI has a continuous distribution, the two stage CE
precoding problem becomes an infinite dimensional problem with an uncountably infinite number of
optimization variables. In this case, it is difficult to even find a stationary solution¹ of the problem,
because this involves solving a fixed point equation over the functional space.
In this paper, we propose a novel online alternating optimization (AO) algorithm to solve the two stage
non-convex stochastic optimization problem for two stage CE precoding design. The proposed online AO
solution does not require explicit knowledge of the channel statistics and can autonomously converge to a
stationary solution using observations of the (outdated) channel only. Analysis and simulations show that
the proposed two stage CE precoding with online AO optimization achieves the best tradeoff between
performance, hardware cost, power efficiency, CSI signaling overhead and computational complexity,
compared with various state-of-the-art baselines. Moreover, the proposed online AO method can be
potentially used to solve a general class of two stage non-convex stochastic optimization problems.
¹A stationary solution is a natural extension of a stationary point to infinite dimensional problems, as will be defined in Definition 1.
Figure 1: An illustration of the massive MIMO downlink under two stage CE Precoding. The blue / red blocks
represent long timescale / short timescale processes. The blue / red arrows represent long-term / short-term signaling.
II. MASSIVE MIMO SYSTEM WITH TWO STAGE CE PRECODING
A. Transmit Signal Model under Two Stage CE Precoding
Consider a massive MIMO downlink system where a BS serves K ≪ M single-antenna users as
illustrated in Fig. 1. The BS is equipped with M ≫ 1 antennas but only S ≪ M transmit RF chains to
reduce the hardware cost. The key components of the transmitter at the BS are elaborated below.
1) Constant-envelope Baseband Precoder: The CE baseband precoder is a mapping from a block of N data symbol vectors $\mathbf{u} \triangleq [\mathbf{u}^T[0], \ldots, \mathbf{u}^T[N-1]]^T \in \mathbb{C}^{NK}$ and the effective channel² $\tilde{\mathbf{h}} \in \mathbb{C}^{KLS}$ to a block of N baseband transmit vectors $\mathbf{x} \triangleq [\mathbf{x}^T[0], \ldots, \mathbf{x}^T[N-1]]^T \in \mathbb{C}^{NS}$, where $\mathbf{u}[n] = [u_1(n), \ldots, u_K(n)]^T \in \mathbb{C}^{K}$ is the data symbol vector and $u_k(n)$ with $E[|u_k(n)|^2] = 1$ is the data symbol of user k at time n, $\mathbf{x}[n] = [x_1[n], \ldots, x_S[n]]^T \in \mathbb{C}^{S}$ is the baseband transmit vector and $x_s[n]$ is the input signal of the s-th RF chain at time n. The baseband precoder satisfies the CE constraint:
$$|x_s[n]| = 1, \quad \forall n, s. \tag{1}$$
As a result of the CE constraint in (1), the baseband precoder can be specified by a phase angle vector $\boldsymbol{\theta} = [\theta_1, \ldots, \theta_{NS}]^T$. Specifically, given the phase angle vector $\boldsymbol{\theta}$, the baseband transmit vector is given by
$$x_s[n] = e^{j\theta_{nS+s}}, \quad \forall n, s.$$
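The mapping from the phase angle vector to the CE baseband block can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `ce_baseband_block`, the row-per-time-index array layout, and the random phases are hypothetical choices, not part of the paper.

```python
import numpy as np

def ce_baseband_block(theta, N, S):
    """Map a phase vector theta (length N*S) to N baseband vectors x[n] in C^S.

    Row n of the result is x[n]; entry (n, s-1) equals exp(j*theta_{nS+s})
    with the 1-based index s used in the text.
    """
    theta = np.asarray(theta, dtype=float)
    if theta.shape != (N * S,):
        raise ValueError("theta must have length N*S")
    return np.exp(1j * theta.reshape(N, S))

# Hypothetical example values, chosen only for illustration.
rng = np.random.default_rng(0)
N, S = 4, 3
theta = rng.uniform(-np.pi, np.pi, N * S)
X = ce_baseband_block(theta, N, S)
```

Every entry of `X` has unit modulus by construction, so constraint (1) holds for any choice of phases.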
²The definition of the effective channel will be given in (7).
Figure 2: An illustration of frame and block structures.
2) Insertion of Cyclic Prefix: To facilitate communication over a frequency selective fading channel, a cyclic prefix (CP) of length $N_c$ is inserted at the beginning of each block such that the transmit vectors from time 0 to time N − 1 are $\mathbf{x}[n]$, n = 0, 1, ..., N − 1, generated by the CE baseband precoder, and the transmit vectors from time $-N_c$ to time −1 are the CP generated according to
$$\mathbf{x}[n] = \mathbf{x}[N+n], \quad n = -N_c, \ldots, -1.$$
The CP is used to absorb the inter-symbol interference caused by frequency selective fading.
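As a small illustration of the CP rule x[n] = x[N + n], the sketch below prepends the last N_c transmit vectors of a block to its front. The helper name `add_cyclic_prefix` and the array layout (one row per time index) are assumptions made for this example.

```python
import numpy as np

def add_cyclic_prefix(X, Nc):
    """X has shape (N, S) with row n = x[n]; returns rows for times -Nc, ..., N-1."""
    N = X.shape[0]
    if not 0 <= Nc <= N:
        raise ValueError("Nc must satisfy 0 <= Nc <= N")
    # Row r of the output corresponds to time n = r - Nc, so the first Nc rows
    # are x[N-Nc], ..., x[N-1], i.e. x[n] = x[N+n] for n = -Nc, ..., -1.
    return np.vstack([X[N - Nc:], X])

# Hypothetical example values.
rng = np.random.default_rng(0)
N, S, Nc = 6, 2, 2
X = np.exp(1j * rng.uniform(-np.pi, np.pi, (N, S)))
X_cp = add_cyclic_prefix(X, Nc)
```

Because the prefix only copies existing CE samples, the extended block still satisfies the constant-envelope constraint.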
3) Constant-envelope RF Precoder: The RF precoder $\mathbf{F} \in \mathbb{C}^{M \times S}$ is used to convert the signal vector $\sqrt{P}\mathbf{x}[n]$ output from the S power amplifiers to a signal vector $\sqrt{P}\mathbf{F}\mathbf{x}[n] \in \mathbb{C}^{M}$, which is eventually transmitted from the M antennas, where P is the transmit power of each power amplifier. The RF precoder $\mathbf{F}$ also satisfies the CE constraints
$$F_{ms} = \frac{1}{\sqrt{M}} e^{j\phi_{(m-1)S+s}}, \quad \forall m, s, \tag{2}$$
where $F_{ms}$ denotes the (m, s)-th entry of $\mathbf{F}$. Hence, the RF precoder can be specified by a phase angle vector $\boldsymbol{\phi} = [\phi_1, \ldots, \phi_{MS}]$.
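A minimal sketch of (2), assuming a row-major layout of the phase vector so that entry (m, s) of F reads phase number (m − 1)S + s. The function name `ce_rf_precoder` and the numerical values are hypothetical.

```python
import numpy as np

def ce_rf_precoder(phi, M, S):
    """Build F in C^{M x S} with F_ms = exp(j*phi_{(m-1)S+s}) / sqrt(M), as in (2)."""
    phi = np.asarray(phi, dtype=float)
    if phi.shape != (M * S,):
        raise ValueError("phi must have length M*S")
    return np.exp(1j * phi.reshape(M, S)) / np.sqrt(M)

# Hypothetical example values.
rng = np.random.default_rng(0)
M, S, P = 8, 2, 4.0
F = ce_rf_precoder(rng.uniform(-np.pi, np.pi, M * S), M, S)
x = np.exp(1j * rng.uniform(-np.pi, np.pi, S))   # one CE baseband vector x[n]
tx = np.sqrt(P) * F @ x                          # signal sent from the M antennas
```

With the 1/√M scaling, every column of `F` has unit Euclidean norm, so the RF stage redistributes power across antennas without amplifying it.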
In the proposed two stage CE precoding, the time domain is divided into frames, where each frame
consists of Tf blocks. The frame and block structures are illustrated in Fig. 2.
B. Frequency Selective MIMO Channel Model
Consider a frequency selective channel which can be modeled as an FIR filter with L taps. The channel
is specified by L matrices $\mathbf{H}[l]$, l = 0, ..., L − 1, whose (k, m)-th elements $\{H_{km}[0], \ldots, H_{km}[L-1]\}$ form the impulse response from antenna m to user k. The received signal vector at time n is
$$\mathbf{y}[n] = \sqrt{P} \sum_{l=0}^{L-1} \mathbf{H}[l]\mathbf{F}\mathbf{x}[n-l] + \mathbf{w}[n],$$
where $\mathbf{w}[n] \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_K)$ is the AWGN vector. After removing the CP at the receiver, the input-output relation of the channel is given by [11]
$$\mathbf{y} = \sqrt{P}\,\mathcal{H}\mathcal{F}\mathbf{x} + \mathbf{w}, \tag{3}$$
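where $\mathcal{H} \in \mathbb{C}^{KN \times MN}$ is the composite channel for each block and it is a block-circulant matrix given by
$$\mathcal{H} = \begin{bmatrix}
\mathbf{H}[0] & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{H}[L-1] & \cdots & \mathbf{H}[1] \\
\mathbf{H}[1] & \mathbf{H}[0] & \mathbf{0} & \cdots & \cdots & \cdots & \mathbf{H}[2] \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
\mathbf{0} & \cdots & \cdots & \mathbf{0} & \mathbf{H}[L-1] & \cdots & \mathbf{H}[0]
\end{bmatrix}, \tag{4}$$
$\mathcal{F} = \mathrm{Block\ Diag}[\mathbf{F}, \ldots, \mathbf{F}] \in \mathbb{C}^{MN \times NS}$ is a block diagonal matrix, and $\mathbf{w} = [\mathbf{w}^T[0], \ldots, \mathbf{w}^T[N-1]]^T$.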
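The block-circulant structure in (4) can be checked numerically. The sketch below builds the composite channel from the L taps and compares the matrix form (3), without noise and with P = 1, against a direct circular convolution. It assumes N ≥ L and a CP long enough (N_c ≥ L − 1) for the circular model to apply; the helper name `composite_channel` and all dimensions are hypothetical.

```python
import numpy as np

def composite_channel(H_taps, N):
    """H_taps has shape (L, K, M); returns the KN x MN block-circulant matrix of (4)."""
    L, K, M = H_taps.shape
    if N < L:
        raise ValueError("this sketch assumes N >= L")
    Hc = np.zeros((K * N, M * N), dtype=complex)
    for i in range(N):            # block row (receive time)
        for j in range(N):        # block column (transmit time)
            l = (i - j) % N       # circular delay between the two times
            if l < L:
                Hc[i * K:(i + 1) * K, j * M:(j + 1) * M] = H_taps[l]
    return Hc

# Hypothetical example values.
rng = np.random.default_rng(1)
K, M, S, L, N = 2, 6, 3, 3, 5
H_taps = rng.standard_normal((L, K, M)) + 1j * rng.standard_normal((L, K, M))
F = np.exp(1j * rng.uniform(-np.pi, np.pi, (M, S))) / np.sqrt(M)
X = np.exp(1j * rng.uniform(-np.pi, np.pi, (N, S)))   # rows are x[n]

Hc = composite_channel(H_taps, N)
Fc = np.kron(np.eye(N), F)                  # Block Diag[F, ..., F]
y_matrix = Hc @ Fc @ X.reshape(-1)          # stacked [y[0]; ...; y[N-1]]

# Direct circular convolution: y[n] = sum_l H[l] F x[(n - l) mod N]
y_direct = np.concatenate([
    sum(H_taps[l] @ F @ X[(n - l) % N] for l in range(L)) for n in range(N)
])
```

The two results agree, which is the reason the CP lets the frequency selective channel be written as a single matrix product per block.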
For convenience, define the concatenated channel vector $\mathbf{h} \triangleq [\mathrm{Vec}^T(\mathbf{H}[0]), \ldots, \mathrm{Vec}^T(\mathbf{H}[L-1])]^T \in \mathbb{C}^{LKM}$. We consider a block fading model where the concatenated channel vector $\mathbf{h}(t)$ at block t
is generated according to a general distribution $H(\zeta(t))$ with a slow time-varying parameter $\zeta(t)$. The
parameter $\zeta(t)$ is called channel statistics and it can be used to model the large scale fading such as path
loss and shadow fading, which usually change at a much slower timescale compared to the duration of
a block. In other words, $H(\zeta)$ is the conditional distribution of $\mathbf{h}$ when the channel statistics is $\zeta$. This
channel model includes multipath Rayleigh and Rician fading channels as special cases.
C. Matched Filter Receiver
To reduce the complexity of the receiver at the user (mobile station), a simple matched filter detection
is employed at each user. Specifically, at user k, the baseband received signal after the matched filter is
scaled by a complex receive coefficient Gk to obtain the estimated data symbol. From (3), the estimated
data symbol vector can be expressed as
$$\hat{\mathbf{u}} = \mathcal{G}\mathbf{y} = \mathbf{u} + \boldsymbol{\xi} + \mathcal{G}\mathbf{w}, \tag{5}$$
where $\mathcal{G} = \mathrm{Block\ Diag}[\mathbf{G}, \ldots, \mathbf{G}] \in \mathbb{C}^{NK \times NK}$ is a block diagonal matrix with $\mathbf{G} = \mathrm{Diag}[G_1, \ldots, G_K]$, and $\boldsymbol{\xi} = \sqrt{P}\,\mathcal{G}\mathcal{H}\mathcal{F}\mathbf{x} - \mathbf{u}$ can be interpreted as the multiuser interference (MUI) vector. Assuming $u_k(n), \forall k, n$ to be i.i.d. complex Gaussian with zero mean and unit variance, and following similar analysis as that in [3], [4], it can be shown that the corresponding achievable rate is lower bounded as
$$r_k(\mathcal{H}) = \frac{I(\mathbf{u}_k; \hat{\mathbf{u}}_k \mid \mathcal{H})}{N} \ge \left[ \frac{-\log \left| E\left[\boldsymbol{\xi}_k \boldsymbol{\xi}_k^H \mid \mathcal{H}\right] + |G_k|^2 \mathbf{I} \right|}{N} \right]^{+}, \tag{6}$$
where $\mathbf{u}_k = [u_k[0], \ldots, u_k[N-1]]^T$ and $\boldsymbol{\xi}_k = [\xi_k[0], \ldots, \xi_k[N-1]]^T$.
III. PROBLEM FORMULATION FOR TWO STAGE CE-PRECODING
A. Two Timescale Optimization Variables
1) Long-term optimization variable (RF Precoder φ): The RF precoder φ is used to achieve the array
gain provided by the massive MIMO. It is adaptive to the channel statistics ζ only to reduce the signaling
overhead [8], [12].
2) Short-term optimization variables (Baseband Precoder θ and Receive Coefficients G): For fixed RF precoder $\boldsymbol{\phi}$, the effect of the concatenated channel $\mathbf{h}$ on the performance is completely characterized by the effective concatenated channel
$$\tilde{\mathbf{h}} \triangleq \left[\mathrm{Vec}^T(\mathbf{H}[0]\mathbf{F}), \ldots, \mathrm{Vec}^T(\mathbf{H}[L-1]\mathbf{F})\right]^T. \tag{7}$$
Hence, the baseband precoder $\boldsymbol{\theta}$ and the receive coefficients $\mathbf{G}$ are adaptive to the effective CSI $\tilde{\mathbf{h}}$ and data symbol vector $\mathbf{u}$ to achieve MUI mitigation under constant envelope constraints on the baseband transmit vector $\mathbf{x}$.
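To make the dimension reduction in (7) concrete, the following sketch forms the effective channel from the taps and an RF precoder and compares its length with that of the full channel vector. It assumes Vec(·) stacks columns (column-major order); `effective_channel` and the chosen sizes are hypothetical.

```python
import numpy as np

def effective_channel(H_taps, F):
    """H_taps: (L, K, M); F: (M, S). Returns the effective channel of (7), length K*L*S."""
    # Vec(.) is taken as column stacking, i.e. Fortran-order flattening.
    return np.concatenate([(H @ F).reshape(-1, order="F") for H in H_taps])

# Hypothetical example values.
rng = np.random.default_rng(2)
L, K, M, S = 3, 2, 16, 4
H_taps = rng.standard_normal((L, K, M)) + 1j * rng.standard_normal((L, K, M))
F = np.exp(1j * rng.uniform(-np.pi, np.pi, (M, S))) / np.sqrt(M)

h = np.concatenate([H.reshape(-1, order="F") for H in H_taps])   # length L*K*M
h_eff = effective_channel(H_taps, F)                             # length K*L*S
```

In this example the full channel has 96 complex entries while the effective channel has 24, a reduction by the factor M/S.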
B. Two-Timescale Stochastic Optimization Formulation
Define $\mathbf{z} \triangleq [\mathbf{h}^T, \mathbf{u}^T]^T$ as the concatenated channel-symbol vector, and $\Theta = \{\boldsymbol{\theta}(\mathbf{z}), \mathbf{G}(\mathbf{z}) : \forall \mathbf{z}\}$ as the collection of the baseband precoders and receive coefficients for all possible channel-symbol vectors. For a given realization of the channel statistics $\zeta$, the two stage CE precoding design is formulated as a two timescale stochastic optimization problem, referred to as Problem P0, which minimizes over $(\boldsymbol{\phi}, \Theta)$ the MSE $I(\boldsymbol{\phi}, \Theta)$ conditioned on $\zeta$, where
$$I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z}) \triangleq E\left[\|\hat{\mathbf{u}} - \mathbf{u}\|^2 \mid \mathbf{z}\right] = \left\|\sqrt{P}\,\mathcal{G}\mathcal{H}\mathcal{F}(\boldsymbol{\phi})\mathbf{x}(\boldsymbol{\theta}) - \mathbf{u}\right\|^2 + \mathrm{Tr}\left(\mathcal{G}\mathcal{G}^H\right)$$
is the MSE for given $\mathbf{z}$. Note that for clarity, we explicitly write $\mathcal{F}$ and $\mathbf{x}$ as functions of $\boldsymbol{\phi}$ and $\boldsymbol{\theta}$,
respectively.
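The per-realization MSE can be evaluated without forming the block diagonal receive matrix explicitly, since its diagonal is just G_1, ..., G_K repeated N times. The sketch below is an illustration only: `mse_given_z` is a hypothetical name, the composite effective channel (the product of the composite channel and the block diagonal RF precoder) is passed in directly, and a random dense matrix stands in for it.

```python
import numpy as np

def mse_given_z(Heff, x, u, G, P):
    """MSE for a given channel-symbol realization.

    Heff: (K*N, S*N) composite effective channel, x: (S*N,) CE transmit block,
    u: (K*N,) stacked data symbols, G: (K,) receive coefficients, P: transmit power.
    """
    K = G.size
    N = u.size // K
    g = np.tile(G, N)                             # diagonal of Block Diag[G, ..., G]
    resid = np.sqrt(P) * g * (Heff @ x) - u       # multiuser interference vector
    noise = N * np.sum(np.abs(G) ** 2)            # Tr(G G^H) for the block diagonal G
    return np.vdot(resid, resid).real + noise

# Hypothetical example values (a random matrix replaces the structured channel).
rng = np.random.default_rng(3)
K, S, N, P = 2, 3, 4, 2.0
Heff = rng.standard_normal((K * N, S * N)) + 1j * rng.standard_normal((K * N, S * N))
x = np.exp(1j * rng.uniform(-np.pi, np.pi, S * N))
u = (rng.standard_normal(K * N) + 1j * rng.standard_normal(K * N)) / np.sqrt(2)
G = rng.standard_normal(K) + 1j * rng.standard_normal(K)
value = mse_given_z(Heff, x, u, G, P)
```

The first term is the residual interference after receive scaling and the second is the noise power after scaling, matching the two terms of the expression above.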
Problem P0 is a very challenging non-convex stochastic optimization problem and it is highly non-trivial
even to design a sub-optimal algorithm which converges to a locally optimal solution. In this paper,
we will propose a low complexity online algorithm to find an ε-accurate stationary solution of Problem
P0 defined below.
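Definition 1 (ε-accurate stationary solution of P0). A solution $(\boldsymbol{\phi}^\star, \Theta^\star = \{\boldsymbol{\theta}^\star(\mathbf{z}), \mathbf{G}^\star(\mathbf{z}) : \forall \mathbf{z}\})$ is called an ε-accurate stationary solution of Problem P0 for some ε ≥ 0 if it satisfies the following conditions:
$$\nabla_{\boldsymbol{\theta}}^T I(\boldsymbol{\phi}^\star, \boldsymbol{\theta}^\star(\mathbf{z}), \mathbf{G}^\star(\mathbf{z}); \mathbf{z})\,(\boldsymbol{\theta} - \boldsymbol{\theta}^\star(\mathbf{z})) + \sum_{k=1}^{K} \frac{\partial I(\boldsymbol{\phi}^\star, \boldsymbol{\theta}^\star(\mathbf{z}), \mathbf{G}^\star(\mathbf{z}); \mathbf{z})}{\partial G_{Ik}} \left(G_{Ik} - G_{Ik}^\star(\mathbf{z})\right) + \sum_{k=1}^{K} \frac{\partial I(\boldsymbol{\phi}^\star, \boldsymbol{\theta}^\star(\mathbf{z}), \mathbf{G}^\star(\mathbf{z}); \mathbf{z})}{\partial G_{Qk}} \left(G_{Qk} - G_{Qk}^\star(\mathbf{z})\right) \ge -\epsilon, \quad \forall \boldsymbol{\theta} \in [-\pi, \pi)^{NS}, \mathbf{G} \tag{9}$$
for every $\mathbf{z}$ outside a set of probability zero, and
$$\nabla_{\boldsymbol{\phi}}^T I(\boldsymbol{\phi}^\star, \Theta^\star)\,(\boldsymbol{\phi} - \boldsymbol{\phi}^\star) \ge -\epsilon, \quad \forall \boldsymbol{\phi} \in [-\pi, \pi)^{MS}, \tag{10}$$
where $G_{Ik}$ and $G_{Qk}$ are the real and imaginary parts of $G_k$ respectively. A 0-accurate stationary solution
is simply called a stationary solution of P0.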
The stationary solution is a natural extension of the stationary point for a deterministic optimization
problem. The global optimal solution of P0 must be a stationary solution. However, the set of stationary
solutions may also contain local optimal solutions and a certain type of saddle points. In the simulations, it
is observed that the proposed algorithm always converges to a stationary solution with good performance.
The problem of finding a stationary solution of P0 is still highly non-trivial. First, there is no closed-
form characterization of the MSE I(φ,Θ). Furthermore, when the channel-symbol vector z has a
continuous distribution, finding a stationary solution of P0 involves solving a fixed point equation in
(9) and (10) with an uncountable number of variables.
IV. ONLINE ALTERNATING OPTIMIZATION FOR P0
In this section, we propose an online alternating optimization (AO) algorithm to find a stationary
solution of P0. Our target is to develop a robust online solution which does not require explicit knowledge
of the channel statistics ζ. The solution should autonomously converge to a stationary solution of P0
using observations of the (outdated) channel only.
Challenge 1 (Design an online algorithm for P0). Design an online AO algorithm to find an ε-accurate
stationary solution of the non-convex stochastic optimization problem P0 for arbitrarily small but fixed
ε > 0.
We shall first summarize the proposed online AO algorithm. Then we elaborate the key steps and prove
the convergence to an ε-accurate stationary solution.
A. Online AO Algorithm
The proposed online AO algorithm is summarized in Algorithm 1. The indices J and t refer to a frame
and a block, respectively.
In each frame, the BS obtains one channel-symbol vector and stores it in the memory. In the J-th frame (iteration), the BS has obtained J channel-symbol vectors $\mathbf{z}^{1:J} \triangleq \{\mathbf{z}^q = [\mathbf{h}^{qT}, \mathbf{u}^{qT}]^T, q = 1, \ldots, J\}$, with which the BS can construct an approximated problem of P0, denoted by $P^J$, in which the MSE $I(\boldsymbol{\phi}, \Theta)$ is replaced by its sample average over $\mathbf{z}^{1:J}$. For fixed RF precoder $\boldsymbol{\phi}$, $P^J$ decouples into J short-term control subproblems $P_S(\boldsymbol{\phi}; \mathbf{z}^q)$, q = 1, ..., J, each of which minimizes $I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z}^q)$ over $(\boldsymbol{\theta}, \mathbf{G})$; this subproblem is referred to as (11).
The subproblem $P_S$ is still non-convex. However, we can find an ε-accurate stationary point of $P_S(\boldsymbol{\phi}; \mathbf{z})$ (i.e., a point $(\boldsymbol{\theta}^\star, \mathbf{G}^\star)$ satisfying (9) with $\boldsymbol{\phi}^\star$ replaced by $\boldsymbol{\phi}$) using a low complexity procedure called Procedure PS, as will be elaborated in Section IV-B. For fixed $\Theta(\mathbf{z}^{1:J})$, $P^J$ reduces to the long-term RF precoding subproblem $P_L^J$ of optimizing $\boldsymbol{\phi}$.
Similarly, although the subproblem $P_L^J$ is non-convex, we can find an ε-accurate stationary point of $P_L^J$ using a low complexity procedure called Procedure PL, as will be elaborated in Section IV-C. This motivates us to use an AO-like update to solve $P^J$, as summarized in the J-th iteration of Algorithm 1. Specifically, in step 2 of the J-th iteration, Procedure PS is called to find an ε-accurate stationary point $(\boldsymbol{\theta}^J(\mathbf{z}^q), \mathbf{G}^J(\mathbf{z}^q))$ of $P_S(\boldsymbol{\phi}^{J-1}; \mathbf{z}^q)$ for q = 1, ..., J. For convenience, we express Procedure PS as a mapping $F_{PS}$ from some inputs to the short-term control as given in (13) in Algorithm 1. When solving $P_S(\boldsymbol{\phi}^{J-1}; \mathbf{z}^q)$, the input of Procedure PS includes the threshold ε, the q-th sample of the effective channel-symbol vector $\tilde{\mathbf{z}}^q \triangleq [\tilde{\mathbf{h}}^{qT}, \mathbf{u}^{qT}]^T$, and the initial baseband precoder $\boldsymbol{\theta}^{J-1}(\mathbf{z}^q)$, where $\tilde{\mathbf{h}}^q$ is the effective channel determined by the RF precoder $\boldsymbol{\phi}^{J-1}$ and the q-th channel sample $\mathbf{h}^q$ according to (7). Note that $\boldsymbol{\phi}^J$ and $\Theta^J(\mathbf{z}^{1:J}) = \{\boldsymbol{\theta}^J(\mathbf{z}^q), \mathbf{G}^J(\mathbf{z}^q), q = 1, \ldots, J\}$ denote the RF precoder and short-term control variables after the J-th iteration, respectively. In step 3 of the J-th iteration, Procedure PL is called to find an ε-accurate stationary point of $P_L^J$ with input ε, $\mathbf{z}^{1:J}$ (the J samples of channel-symbol vectors), $\Theta^J(\mathbf{z}^{1:J})$, and $\boldsymbol{\phi}^{J-1}$ (the initial RF precoder). Similarly, in (14) of Algorithm 1, Procedure PL is expressed as a mapping $F_{PL}$ from the inputs to the RF precoder.
In the (J + 1)-th iteration, the BS obtains a new sample $\mathbf{z}^{J+1}$ of channel-symbol vector to improve the
sample average approximation of the MSE $I(\boldsymbol{\phi}, \Theta)$. Then it performs one AO-like update on $P^{J+1}$ as
described above and enters the (J + 2)-th iteration. The AO-like iteration is carried out until convergence.
The overall solution is illustrated in Fig. 3. Intuitively, we can conjecture that Algorithm 1 converges to
a stationary solution of P0 as J →∞. However, the formal proof is quite involved as will be elaborated
in Section IV-D.
Figure 3: Summary of overall solution for P0.
Algorithm 1 Online Alternating Optimization Algorithm for Finding an ε-accurate Stationary Solution of P0
1: Initialization: Choose a fixed ε > 0.
2: Let J = 1. Choose an initial RF precoder $\boldsymbol{\phi}^0$.
3: Let $\mathbf{z}^0 = \mathbf{0}$. Choose an initial baseband precoder $\boldsymbol{\theta}^0(\mathbf{z}^0)$.
4: Step 1: Obtain one realization of the channel-symbol vector during frame J, indexed as $\mathbf{z}^J = [\mathbf{h}^{JT}, \mathbf{u}^{JT}]^T$. Initialize $\boldsymbol{\theta}^{J-1}(\mathbf{z}^J) = \boldsymbol{\theta}^{J-1}(\mathbf{z}^{q^J_{\mathbf{z}^J}})$ for $\mathbf{z}^J$, where $q^J_{\mathbf{z}^J} = 0$ if J = 1 and $q^J_{\mathbf{z}^J} = \arg\min_{q \in [1, J-1]} \|\mathbf{z}^J - \mathbf{z}^q\|$ otherwise.
5: Step 2 (Short-term control optimization):
6: For q = 1 to J, let
$$\left(\boldsymbol{\theta}^J(\mathbf{z}^q), \mathbf{G}^J(\mathbf{z}^q)\right) = F_{PS}\left(\epsilon, \tilde{\mathbf{z}}^q, \boldsymbol{\theta}^{J-1}(\mathbf{z}^q)\right). \tag{13}$$
7: Step 3 (Long-term RF precoder optimization):
8: Let
$$\boldsymbol{\phi}^J = F_{PL}\left(\epsilon, \mathbf{z}^{1:J}, \Theta^J(\mathbf{z}^{1:J}), \boldsymbol{\phi}^{J-1}\right). \tag{14}$$
9: Termination:
10: Let J = J + 1 and return to Step 1 until convergence.
Note that in the J-th frame, the updated control variables $\boldsymbol{\phi}^J$ and $\Theta^J(\mathbf{z}^{1:J})$ are output at the end of the J-th frame to allow sufficient time for the BS to obtain the sample $\mathbf{z}^J$ and calculate the control variables. Hence, in the J-th frame, the BS only has $\boldsymbol{\phi}^{J-1}$ and $\Theta^{J-1}(\mathbf{z}^{1:J-1})$, and $\boldsymbol{\phi}^{J-1}$ will be used as the RF precoder in the J-th frame for downlink transmission. However, the short-term control variables for each block $t \in [(J-1)T_f + 1, JT_f]$ in the J-th frame are still unknown because we usually have $\mathbf{z}(t) \notin \mathbf{z}^{1:J-1}$. Hence, at the beginning of each block $t \in [(J-1)T_f + 1, JT_f]$, the BS needs to call Procedure PS to calculate the optimized short-term control variables as
$$\left(\boldsymbol{\theta}^{J\star}(\mathbf{z}(t)), \mathbf{G}^{J\star}(\mathbf{z}(t))\right) = F_{PS}\left(\epsilon, \tilde{\mathbf{z}}(t), \boldsymbol{\theta}^{J-1}(\mathbf{z}^{q_z^J})\right), \tag{15}$$
using the input ε, $\tilde{\mathbf{z}}(t) = [\tilde{\mathbf{h}}^T(t), \mathbf{u}^T(t)]^T$ and $\boldsymbol{\theta}^{J-1}(\mathbf{z}^{q_z^J})$, where $\tilde{\mathbf{h}}(t)$ is the effective channel determined by the RF precoder $\boldsymbol{\phi}^{J-1}$ and the current channel $\mathbf{h}(t)$, $q_z^J = \arg\min_{q \in [1, J-1]} \|\mathbf{z}(t) - \mathbf{z}^q\|$, and $\mathbf{z}^{q_z^J}$ is the stored channel-symbol sample which has the minimum distance to $\mathbf{z}(t)$.
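The bookkeeping of Algorithm 1 (sample storage, nearest-sample warm start, short-term sweep, long-term update) can be sketched as follows. This is a skeleton under several assumptions, not the authors' implementation: `procedure_ps` and `procedure_pl` are caller-supplied stand-ins for Procedures PS and PL, the current RF precoder is passed to `procedure_ps` so it can form the effective channel itself, and a fixed number of frames replaces the "until convergence" test.

```python
import numpy as np

def nearest_index(z, samples):
    """1-based index q of the stored sample closest to z (0 if nothing is stored)."""
    if not samples:
        return 0
    return 1 + int(np.argmin([np.linalg.norm(z - zq) for zq in samples]))

def online_ao(sample_stream, procedure_ps, procedure_pl, phi0, theta0, eps, num_frames):
    """Skeleton of the online AO iterations; the two procedures are hypothetical callables."""
    phi = phi0
    samples, thetas, gains = [], [], []       # z^q, theta(z^q), G(z^q) for q = 1..J
    for J in range(1, num_frames + 1):
        z_new = next(sample_stream)           # Step 1: one new sample per frame
        q = nearest_index(z_new, samples)
        thetas.append(theta0 if q == 0 else thetas[q - 1])   # warm start
        gains.append(None)
        samples.append(z_new)
        for i, z in enumerate(samples):       # Step 2: short-term updates, cf. (13)
            thetas[i], gains[i] = procedure_ps(eps, z, phi, thetas[i])
        # Step 3: long-term RF precoder update, cf. (14)
        phi = procedure_pl(eps, samples, thetas, gains, phi)
    return phi, samples, thetas, gains

# Toy stand-ins that only exercise the control flow (not real precoder updates).
stream = iter([np.array([0.0]), np.array([1.0]), np.array([0.9])])
phi, samples, thetas, gains = online_ao(
    stream,
    procedure_ps=lambda eps, z, phi, theta: (theta + 1, 0.0),
    procedure_pl=lambda eps, zs, ths, gs, phi: phi + len(zs),
    phi0=0, theta0=0, eps=1e-3, num_frames=3,
)
```

The same `nearest_index` lookup serves the per-block call in (15), where the stored sample closest to the current channel-symbol vector supplies the initial baseband precoder.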
In (15), only the effective channel $\tilde{\mathbf{h}}(t)$ is required to calculate the optimized short-term control variables $\boldsymbol{\theta}^{J\star}(\mathbf{z}(t)), \mathbf{G}^{J\star}(\mathbf{z}(t))$. On the other hand, the update of the long-term RF precoder in step 3 of the J-th iteration only requires the outdated CSI $\{\mathbf{h}^q, q = 1, \ldots, J\}$, where $\mathbf{h}^q$ can be obtained in the q-th frame using downlink pilot training and uplink CSI feedback. Since only a low dimensional effective channel $\tilde{\mathbf{h}}(t) \in \mathbb{C}^{KLS}$ is required at each block and one outdated high dimensional channel $\mathbf{h}^q \in \mathbb{C}^{KLM}$ is required at each frame (recall that each frame contains many blocks), the CSI signaling overhead caused by Algorithm 1 is much smaller than that of the conventional single stage precoding schemes.
In the following, we elaborate Procedure PS for solving the short-term control subproblem $P_S$ and Procedure PL for solving the long-term RF precoding subproblem $P_L^J$.
B. Procedure PS for Solving PS
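The short-term control subproblem (11) is a deterministic non-convex optimization problem. For fixed $\boldsymbol{\theta}$, the optimal receive coefficients are given by
$$\mathbf{G}^\star = \arg\min_{\mathbf{G}} I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z}). \tag{16}$$
For fixed $\boldsymbol{\theta}_{-i} = [\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_{NS}]$ and $\mathbf{G}$, the optimal $\theta_i$ is given by
$$\theta_i^\star = \arg\min_{\theta_i \in [-\pi, \pi)} I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z}). \tag{17}$$
Problems (16) and (17) have closed form solutions, as will be given in Lemma 1. Before stating Lemma 1, we first define some notation. Let $\tilde{\mathcal{H}} \in \mathbb{C}^{KN \times SN}$ denote the composite effective channel obtained by replacing the channel matrices $\mathbf{H}[l]$, l = 0, ..., L − 1 in (4) with the effective channel matrices $\tilde{\mathbf{H}}[l] = \mathbf{H}[l]\mathbf{F}$, l = 0, ..., L − 1. Note that $\tilde{\mathcal{H}}$ can be determined by the effective channel $\tilde{\mathbf{h}}$. Let $\tilde{\mathcal{H}}_{-i} = [\tilde{\mathcal{H}}_1, \ldots, \tilde{\mathcal{H}}_{i-1}, \tilde{\mathcal{H}}_{i+1}, \ldots, \tilde{\mathcal{H}}_{SN}] \in \mathbb{C}^{KN \times (SN-1)}$ denote the matrix obtained by deleting the i-th column $\tilde{\mathcal{H}}_i$ of $\tilde{\mathcal{H}}$.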
Lemma 1 (Solutions of Short-term Subproblems). For given φ, z =[hT ,uT
]T and baseband precoder
θ, the solution of (16) is uniquely given by
G?k =
√P∑N−1
n=0 H∗θ [nK + k]uk (n)
P∑N−1
n=0
∣∣∣H∗θ [nK + k]∣∣∣2 +N
, ∀k (18)
where Hθ = Hx (θ), Hθ [nK + k] is the (nK + k)-th element of Hθ, and H is the composite effective
channel corresponding to φ,h.
1053-587X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TSP.2015.2486749, IEEE Transactions on Signal Processing
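For given $\boldsymbol{\phi}$, $\mathbf{z} = [\mathbf{h}^T, \mathbf{u}^T]^T$, $\boldsymbol{\theta}_{-i} = [\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_{NS}]$ and $\mathbf{G}$, the solution of (17) is uniquely given by
$$\theta_i^\star = \pi + \arg\left\{ \tilde{\mathbf{H}}_i^H \mathbf{G}^H \left( \sqrt{P}\, \mathbf{G} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i} - \mathbf{u} \right) \right\}, \tag{19}$$
when $\left| \tilde{\mathbf{H}}_i^H \mathbf{G}^H \left( \sqrt{P}\, \mathbf{G} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i} - \mathbf{u} \right) \right| \neq 0$, where $\mathbf{x}_{-i} = [e^{j\theta_1}, \ldots, e^{j\theta_{i-1}}, e^{j\theta_{i+1}}, \ldots, e^{j\theta_{SN}}]$. When $\left| \tilde{\mathbf{H}}_i^H \mathbf{G}^H \left( \sqrt{P}\, \mathbf{G} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i} - \mathbf{u} \right) \right| = 0$, the optimal $\theta_i^\star$ can take any value in $[-\pi, \pi)$.
Please refer to Appendix A for the proof.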
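Note that to calculate $\mathbf{G}^\star$ and $\theta_i^\star$ in (18) and (19), the BS only needs to know the effective channel-symbol vector $\tilde{\mathbf{z}}$, since $\boldsymbol{\phi}$ and $\mathbf{z}$ appear only through the composite effective channel $\tilde{\mathbf{H}}$.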
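The phase update (19) can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it treats the per-coordinate problem (17) as minimizing $\|\mathbf{a} + \mathbf{b} e^{j\theta}\|^2$, where `a` stands for $\sqrt{P}\,\mathbf{G}\tilde{\mathbf{H}}_{-i}\mathbf{x}_{-i} - \mathbf{u}$ and `b` for $\sqrt{P}\,\mathbf{G}\tilde{\mathbf{H}}_i$. The function name `optimal_phase`, the random test vectors and the grid search are assumptions made for the illustration.

```python
import numpy as np

def optimal_phase(a, b):
    """Return theta in [-pi, pi) minimizing ||a + b * exp(1j * theta)||^2.

    Illustrative stand-ins: a ~ sqrt(P) G H_{-i} x_{-i} - u, b ~ sqrt(P) G H_i.
    c = b^H a is a positive multiple of the argument of arg{.} in (19).
    """
    c = np.vdot(b, a)                      # vdot conjugates its first argument
    if abs(c) == 0.0:
        return 0.0                         # every phase is optimal; 0 is an arbitrary pick
    theta = np.pi + np.angle(c)            # closed form, as in (19)
    return float((theta + np.pi) % (2 * np.pi) - np.pi)   # wrap into [-pi, pi)

rng = np.random.default_rng(0)
a = rng.standard_normal(6) + 1j * rng.standard_normal(6)
b = rng.standard_normal(6) + 1j * rng.standard_normal(6)

def cost(t):
    return float(np.linalg.norm(a + b * np.exp(1j * t)) ** 2)

theta_star = optimal_phase(a, b)

# Brute-force reference over a fine grid of phases.
grid = np.linspace(-np.pi, np.pi, 20001)
grid_min = min(cost(t) for t in grid)
```

The closed-form phase should never be beaten by the grid search, and its cost should equal $\|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2|\mathbf{b}^H\mathbf{a}|$.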
The above analysis motivates us to use an alternating optimization (AO) method to solve (11). However, the standard AO method may not converge to a stationary point of a non-convex problem [13]. We need to address the following challenge.
Challenge 2 (AO algorithm to find an ε-accurate stationary point of (11)). Design an AO algorithm which is guaranteed to converge to an ε-accurate stationary point of (11) for arbitrarily small but fixed ε > 0.
To achieve this, the specific structure of the short-term control subproblems as characterized in Lemma 1 must be exploited in the design of the AO algorithm. Based on Lemma 1, we propose Procedure PS in Algorithm 2 to find an ε-accurate stationary point of (11) using a modified AO method, where in Line 7, we let $\theta_i^{(l)} = \theta_i^{(l-1)}$ whenever $\left| \tilde{\mathbf{H}}_i^H \mathbf{G}^{(l)H} \left( \sqrt{P}\, \mathbf{G}^{(l)} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i}^{(l)} - \mathbf{u} \right) \right| < \varepsilon$. This modification ensures convergence to an ε-accurate stationary point of (11), as proved in the following theorem.
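Theorem 1 (Convergence of Procedure PS). Every accumulation point $(\boldsymbol{\theta}^\star, \mathbf{G}^\star)$ of the sequence of iterates $(\boldsymbol{\theta}^{(l)}, \mathbf{G}^{(l)})$, $l = 1, 2, \ldots$, generated by Procedure PS is an ε-accurate stationary point of (11). Moreover, any accumulation point $(\boldsymbol{\theta}^\star, \mathbf{G}^\star)$ of $(\boldsymbol{\theta}^{(l)}, \mathbf{G}^{(l)})$, $l = 1, 2, \ldots$, is a stationary point of (11) if it satisfies the following condition:
$$\left| \tilde{\mathbf{H}}_i^H \mathbf{G}^{\star H} \left( \sqrt{P}\, \mathbf{G}^\star \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i}^\star - \mathbf{u} \right) \right| \geq \varepsilon, \quad \forall i, \tag{20}$$
where $\mathbf{x}_{-i}^\star = [e^{j\theta_1^\star}, \ldots, e^{j\theta_{i-1}^\star}, e^{j\theta_{i+1}^\star}, \ldots, e^{j\theta_{SN}^\star}]$.
Please refer to Appendix B for the proof. In practice, we can set ε to be a very small number. In this case, the probability that $\left| \tilde{\mathbf{H}}_i^H \mathbf{G}^{\star H} \left( \sqrt{P}\, \mathbf{G}^\star \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i}^\star - \mathbf{u} \right) \right| < \varepsilon$ is very small for a randomly generated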
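Algorithm 2 Procedure PS: Finding an ε-accurate stationary point of (11)
1: Input: $\varepsilon$, $\tilde{\mathbf{z}} = [\tilde{\mathbf{h}}, \mathbf{u}]^T$ and $\boldsymbol{\theta}^{(0)}$.
2: Initialization: Let $l = 1$.
3: Step 1 (Receive coefficients optimization):
4: Let
$$G_k^{(l)} = \frac{\sqrt{P} \sum_{n=0}^{N-1} \tilde{H}_{\theta^{(l-1)}}^{*}[nK+k]\, u_k(n)}{P \sum_{n=0}^{N-1} \left| \tilde{H}_{\theta^{(l-1)}}^{*}[nK+k] \right|^2 + N}, \quad \forall k, \tag{21}$$
   where $\tilde{\mathbf{H}}_{\theta^{(l-1)}} = \tilde{\mathbf{H}} \mathbf{x}(\boldsymbol{\theta}^{(l-1)})$ and $\tilde{\mathbf{H}}$ is the composite effective channel corresponding to $\tilde{\mathbf{h}}$.
5: Step 2 (Baseband precoder optimization):
6: For $i = 1$ to $NS$
7:   If $\left| \tilde{\mathbf{H}}_i^H \mathbf{G}^{(l)H} \left( \sqrt{P}\, \mathbf{G}^{(l)} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i}^{(l)} - \mathbf{u} \right) \right| < \varepsilon$, let $\theta_i^{(l)} = \theta_i^{(l-1)}$;
     Otherwise, let
$$\theta_i^{(l)} = \pi + \arg\left\{ \tilde{\mathbf{H}}_i^H \mathbf{G}^{(l)H} \left( \sqrt{P}\, \mathbf{G}^{(l)} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i}^{(l)} - \mathbf{u} \right) \right\}, \tag{22}$$
     where $\mathbf{x}_{-i}^{(l)} = [e^{j\theta_1^{(l)}}, \ldots, e^{j\theta_{i-1}^{(l)}}, e^{j\theta_{i+1}^{(l-1)}}, \ldots, e^{j\theta_{SN}^{(l-1)}}]$.
8: Termination:
9: Let $l = l + 1$ and return to Step 1 until convergence or $l \geq N_{PS}$.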
channel-symbol vector $\tilde{\mathbf{z}}$. In the simulations, we set $\varepsilon = 10^{-4}$ and Procedure PS is always observed to converge to a stationary point of (11), i.e., condition (20) is always satisfied.
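As an illustration of Algorithm 2, the following sketch runs the modified AO iteration on a random toy instance. It is not the authors' implementation. It assumes, consistently with (18) and (27), that $\mathbf{G}$ is diagonal with entry $nK+k$ equal to $G_k$ and that the objective is $\|\sqrt{P}\,\mathbf{G}\tilde{\mathbf{H}}\mathbf{x}(\boldsymbol{\theta}) - \mathbf{u}\|^2 + \mathrm{Tr}(\mathbf{G}\mathbf{G}^H)$. The matrix `H` is an unstructured random stand-in for the composite effective channel, and the names `procedure_ps`, `objective`, `tol` and `max_iter` are hypothetical.

```python
import numpy as np

def objective(H, u, theta, g, P, N):
    """I = ||sqrt(P) G H x(theta) - u||^2 + Tr(G G^H), with G = diag(tile(g, N))."""
    d = np.tile(g, N)                                  # diagonal of G, index n*K + k
    r = np.sqrt(P) * d * (H @ np.exp(1j * theta)) - u
    return float(np.linalg.norm(r) ** 2 + N * np.sum(np.abs(g) ** 2))

def procedure_ps(H, u, theta0, P, K, N, eps=1e-4, tol=1e-10, max_iter=50):
    theta = np.array(theta0, dtype=float)
    U = u.reshape(N, K)                                # row n, column k
    history = []
    for _ in range(max_iter):
        x = np.exp(1j * theta)
        # Step 1: receive coefficients, closed form (21).
        Hx = (H @ x).reshape(N, K)
        g = np.sqrt(P) * np.sum(Hx.conj() * U, axis=0) / (
            P * np.sum(np.abs(Hx) ** 2, axis=0) + N)
        # Step 2: cyclic phase updates, closed form (22) with the eps rule.
        GH = np.tile(g, N)[:, None] * H                # G @ H
        r = np.sqrt(P) * (GH @ x) - u                  # residual for current x
        for i in range(theta.size):
            col = np.sqrt(P) * GH[:, i]
            a = r - col * x[i]                         # sqrt(P) G H_{-i} x_{-i} - u
            c = np.vdot(GH[:, i], a)                   # H_i^H G^H (...)
            if abs(c) >= eps:                          # otherwise keep the old phase
                theta[i] = np.angle(-c)                # equals pi + arg(c) modulo 2*pi
                x[i] = np.exp(1j * theta[i])
            r = a + col * x[i]
        history.append(objective(H, u, theta, g, P, N))
        if len(history) > 1 and history[-2] - history[-1] < tol:
            break                                      # simple convergence test (assumption)
    return theta, g, history

rng = np.random.default_rng(0)
K, N, S, P = 2, 4, 3, 10.0
H = (rng.standard_normal((K * N, S * N)) + 1j * rng.standard_normal((K * N, S * N))) / np.sqrt(2)
u = (rng.choice([-1.0, 1.0], K * N) + 1j * rng.choice([-1.0, 1.0], K * N)) / np.sqrt(2)
theta0 = rng.uniform(-np.pi, np.pi, S * N)

theta, g, history = procedure_ps(H, u, theta0, P, K, N)
```

Because each block update is either an exact minimization or a no-op, the recorded objective values in `history` should be non-increasing, which is the monotonicity used at the start of the proof of Theorem 1.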
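C. Procedure PL for Solving $\mathcal{P}_L^J$
Similarly, we can use the AO method to solve the RF precoding subproblem in (12). For fixed $\boldsymbol{\phi}_{-i} = [\phi_1, \ldots, \phi_{i-1}, \phi_{i+1}, \ldots, \phi_{MS}]$, the optimal $\phi_i$ is given by
$$\phi_i^\star = \arg\min_{\phi_i \in [-\pi, \pi)} \frac{1}{J} \sum_{q=1}^{J} I\left( \boldsymbol{\phi}, \boldsymbol{\theta}(\mathbf{z}_q), \mathbf{G}(\mathbf{z}_q); \mathbf{z}_q \right). \tag{23}$$
In the following lemma, we give the closed-form solution of (23).
We first define some useful notation. Let $\mathbf{F}_{-ms}$ denote the matrix obtained by setting the $(m, s)$-th element of $\mathbf{F}$ to zero and let $\bar{\mathbf{F}}_{-ms} = \mathrm{BlockDiag}[\mathbf{F}_{-ms}, \ldots, \mathbf{F}_{-ms}] \in \mathbb{C}^{MN \times NS}$ denote the $(m, s)$-th extended RF precoder. Note that $\mathbf{F}_{-ms}$ and $\bar{\mathbf{F}}_{-ms}$ are determined by $\boldsymbol{\phi}_{-i} \triangleq [\phi_1, \ldots, \phi_{i-1}, \phi_{i+1}, \ldots, \phi_{MS}]$,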
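Algorithm 3 Procedure PL: Finding an ε-accurate stationary point of (12)
1: Input: $\varepsilon$, $\mathbf{z}_{1:J}$, $\Theta(\mathbf{z}_{1:J})$ and $\boldsymbol{\phi}^{(0)}$.
2: Initialization: Let $l = 1$.
3: For $i = 1$ to $MS$
4:   If $\left| \frac{1}{J} \sum_{q=1}^{J} g_i\left( \boldsymbol{\phi}_{-i}^{(l)}, \boldsymbol{\theta}(\mathbf{z}_q), \mathbf{G}(\mathbf{z}_q); \mathbf{z}_q \right) \right| < \varepsilon$, let $\phi_i^{(l)} = \phi_i^{(l-1)}$;
5:   Otherwise, let
$$\phi_i^{(l)} = \pi + \arg \frac{1}{J} \sum_{q=1}^{J} g_i\left( \boldsymbol{\phi}_{-i}^{(l)}, \boldsymbol{\theta}(\mathbf{z}_q), \mathbf{G}(\mathbf{z}_q); \mathbf{z}_q \right), \tag{26}$$
     where $\boldsymbol{\phi}_{-i}^{(l)} = [\phi_1^{(l)}, \ldots, \phi_{i-1}^{(l)}, \phi_{i+1}^{(l-1)}, \ldots, \phi_{MS}^{(l-1)}]$.
6: Termination:
7: Let $l = l + 1$ and return to Line 3 until convergence or $l \geq N_{PL}$.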
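where $i = (m-1)S + s$. For given $\boldsymbol{\phi}_{-i}$, $\mathbf{z} = [\mathbf{h}^T, \mathbf{u}^T]^T$ and $(\boldsymbol{\theta}, \mathbf{G})$, define a function
$$g_i\left( \boldsymbol{\phi}_{-i}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z} \right) \triangleq \sum_{n=0}^{N-1} \sqrt{P}\, e^{-j\theta_{nS+s}}\, \mathbf{H}_{nM+m}^H \mathbf{G}^H \left( \sqrt{P}\, \mathbf{G} \mathbf{H} \bar{\mathbf{F}}_{-ms} \mathbf{x}(\boldsymbol{\theta}) - \mathbf{u} \right), \tag{24}$$
where $m$ and $s$ are determined by $(m-1)S + s = i$, and $\mathbf{H}$ is the composite channel corresponding to $\mathbf{h}$.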
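Lemma 2 (Solution of Long-term Subproblem). For given $\boldsymbol{\phi}_{-i}$, a set of sampled channel-symbol vectors $\mathbf{z}_{1:J}$ and the corresponding short-term control variables $\Theta(\mathbf{z}_{1:J})$, the solution of (23) is uniquely given by
$$\phi_i^\star = \pi + \arg \frac{1}{J} \sum_{q=1}^{J} g_i\left( \boldsymbol{\phi}_{-i}, \boldsymbol{\theta}(\mathbf{z}_q), \mathbf{G}(\mathbf{z}_q); \mathbf{z}_q \right) \tag{25}$$
provided that $\left| \frac{1}{J} \sum_{q=1}^{J} g_i\left( \boldsymbol{\phi}_{-i}, \boldsymbol{\theta}(\mathbf{z}_q), \mathbf{G}(\mathbf{z}_q); \mathbf{z}_q \right) \right| \neq 0$. When $\left| \frac{1}{J} \sum_{q=1}^{J} g_i\left( \boldsymbol{\phi}_{-i}, \boldsymbol{\theta}(\mathbf{z}_q), \mathbf{G}(\mathbf{z}_q); \mathbf{z}_q \right) \right| = 0$, the optimal $\phi_i^\star$ can take any value in $[-\pi, \pi)$.
The proof is similar to that of Lemma 1.
Based on Lemma 2, we propose Procedure PL to solve the RF precoding subproblem in (12), as summarized in Algorithm 3. Similar to Theorem 1, it can be shown that Procedure PL converges to an ε-accurate stationary point of (12).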
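A single coordinate update of Procedure PL (Lines 4 and 5 of Algorithm 3) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: the array `g_samples` is a placeholder for the $J$ values $g_i(\boldsymbol{\phi}_{-i}, \boldsymbol{\theta}(\mathbf{z}_q), \mathbf{G}(\mathbf{z}_q); \mathbf{z}_q)$ from (24), which are drawn at random here instead of being computed from channel samples, and `rf_phase_update` is a hypothetical name. The check uses the fact that the $\phi_i$-dependent part of each sample cost has the form $2\,\mathrm{Re}(g_i e^{-j\phi_i})$, as in the proof of Lemma 1.

```python
import numpy as np

def rf_phase_update(g_samples, phi_prev, eps=1e-4):
    """Return the updated RF phase phi_i given J samples of g_i.

    If the sample average is smaller than eps in magnitude, the previous
    phase is kept (Line 4); otherwise the closed form (26) is applied.
    """
    g_bar = np.mean(g_samples)
    if abs(g_bar) < eps:
        return phi_prev
    return float(np.angle(-g_bar))         # equals pi + arg(g_bar) modulo 2*pi

rng = np.random.default_rng(1)
# Placeholder samples: a common mean plus per-sample fluctuation.
g_samples = (0.5 - 0.2j) + rng.standard_normal(200) + 1j * rng.standard_normal(200)

def avg_cost(phi):
    # phi-dependent part of the sample-average objective in (23)
    return float(np.mean(2.0 * np.real(g_samples * np.exp(-1j * phi))))

phi_star = rf_phase_update(g_samples, phi_prev=0.0)
grid = np.linspace(-np.pi, np.pi, 10001)
grid_min = min(avg_cost(p) for p in grid)
```

Only the averaged quantity enters the update, so the individual samples may point in very different directions without affecting the form of the solution.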
D. Convergence of the Online AO Algorithm
In this section, we address the following challenge.
Challenge 3 (Convergence to an ε-accurate stationary solution of $\mathcal{P}_0$). Establish an analytical proof of convergence of the online AO algorithm to an ε-accurate stationary solution of $\mathcal{P}_0$ for arbitrarily small but fixed ε > 0.
Compared to the convergence proof of the conventional AO algorithm for deterministic optimization problems, there are several new technical challenges in the convergence proof of the online AO algorithm.
• Non-monotonic property due to stochastic optimization: In the proposed online AO algorithm, due to the stochastic nature of Problem $\mathcal{P}_0$, the objective value no longer decreases monotonically after each online AO iteration. Hence, the techniques used to prove the convergence of the conventional AO algorithm cannot be applied to the online AO algorithm.
• Infinite dimensional problem: Another challenge is that $\mathcal{P}_0$ is an infinite dimensional problem which has an uncountably infinite number of optimization variables when the channel-symbol vector $\mathbf{z}$ has a continuous distribution. As a result, the convergence proof of the online AO algorithm is much more difficult than that of the conventional AO algorithm, which only works for a problem with a finite number of optimization variables.
Despite the above challenges, we prove that the online AO algorithm converges to an ε-accurate stationary solution of Problem $\mathcal{P}_0$, as summarized in the following theorem.
Theorem 2 (Convergence of Algorithm 1). Let $\left( \boldsymbol{\phi}^J, \Theta^J \triangleq \left\{ \boldsymbol{\theta}^{J\star}(\mathbf{z}), \mathbf{G}^{J\star}(\mathbf{z}) : \forall \mathbf{z} \right\} \right)$, $J = 1, 2, \ldots$, be the sequence of iterates generated by Algorithm 1, where $\boldsymbol{\theta}^{J\star}(\mathbf{z})$, $\mathbf{G}^{J\star}(\mathbf{z})$ are given in (15). Then any accumulation point $(\boldsymbol{\phi}^\star, \Theta^\star)$ of $(\boldsymbol{\phi}^J, \Theta^J)$, $J = 1, 2, \ldots$, is an ε-accurate stationary solution of $\mathcal{P}_0$ with probability 1.
Please refer to Appendix C for the detailed proof.
Since ε can be set to be an arbitrarily small (but fixed) positive number, we can say that Algorithm 1 converges to a stationary solution of $\mathcal{P}_0$ for all practical purposes.
V. SIMULATION RESULTS AND DISCUSSIONS
Consider the downlink of a multi-user massive MIMO cellular system operating in FDD mode. The
coverage area of the BS is a circle with a radius of 250 m. There are K = 8 users in total, 6 of
whom are clustered around 2 hotspots. The two hotspots and the other two users are uniformly distributed
within the cell. There are L = 4 resolvable paths in the frequency selective fading channel. The channel vector of user $k$ corresponding to the $l$-th resolvable path is modeled as $\mathbf{h}_k[l] = \boldsymbol{\zeta}_k^{1/2} \mathbf{h}_k^w$, $\forall k$, where $\mathbf{h}_k^w \in \mathbb{C}^M$ has i.i.d. complex entries of zero mean and unit variance, and $\boldsymbol{\zeta}_k = E\left[ \mathbf{h}_k[l] \mathbf{h}_k^H[l] \right]$, $\forall l$, is the spatial correlation matrix generated according to the local scattering model in [8]. The block length is N = 128 and the frame length is $T_f = 20$ blocks. The path gains $PL_k$ are generated using the “Urban Macro NLOS” model in [14]. We compare the performance of the proposed solution with the following 3 baselines.
• Baseline 1 (Single Stage ZF Precoding [15]): OFDM with N = 128 subcarriers is used to combat
frequency selective fading and conventional single stage MU-MIMO ZF precoding [15] is used at
each subcarrier.
• Baseline 2 (Single Stage CE Precoding [4]): There is no RF precoder and a CE constraint is
imposed on the baseband transmit signals. The block length is N = 128.
• Baseline 3 (Two Stage Precoding [16]): OFDM with N = 128 subcarriers is used to combat
frequency selective fading and the two stage precoding in [16] is used at each subcarrier. The
dimension of the pre-beamforming matrix in [16] is set as S for fair comparison.
In the simulations, both perfect CSI and outdated CSI are considered. Specifically, the outdated CSI is related to the actual CSI by the autoregressive model in [17] with the following parameters: the user speed is 6 km/h; the carrier frequency is 2 GHz; the CSI delay is $\frac{8 N_h}{M}$ ms, where $N_h$ is the dimension of the per-user CSI vector required at the BS (e.g., $N_h = S$ for the proposed solution).
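To give a feel for these parameters, the following sketch evaluates the CSI delay $8N_h/M$ ms for two values of $N_h$ and the maximum Doppler shift $f_d = v f_c / c$ implied by the stated speed and carrier frequency. It is a back-of-the-envelope illustration only: the choice $N_h = M$ for a scheme that needs the full per-user channel vector is an assumption made here, the value $S = 32$ is just one point on the simulated range, and the source does not say how [17] consumes the Doppler shift. The function names are hypothetical.

```python
C_LIGHT = 299_792_458.0          # speed of light in m/s

def csi_delay_ms(n_h, m_antennas):
    """CSI delay in milliseconds, 8 * N_h / M, as stated in the simulation setup."""
    return 8.0 * n_h / m_antennas

def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c (standard formula, not from the paper)."""
    return (speed_kmh / 3.6) * carrier_hz / C_LIGHT

M = 100
delay_two_stage = csi_delay_ms(n_h=32, m_antennas=M)   # N_h = S = 32
delay_full_csi = csi_delay_ms(n_h=M, m_antennas=M)     # assumption: N_h = M
f_d = max_doppler_hz(speed_kmh=6.0, carrier_hz=2e9)

print(f"delay with N_h = 32: {delay_two_stage:.2f} ms")
print(f"delay with N_h = M:  {delay_full_csi:.2f} ms")
print(f"max Doppler shift:   {f_d:.2f} Hz")
```

Under these assumptions, a smaller per-user CSI dimension translates directly into a proportionally shorter delay.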
We compare the ergodic sum rate of different schemes under different simulation setups. Note that the ergodic sum rate depends on the effective total transmit power $P_e = P_T \eta$, which in turn depends on the total transmit power $P_T$ and the power efficiency $\eta$ of the power amplifiers. Typically, a non-linear power amplifier is 4 to 6 times more power efficient than a highly linear power amplifier [18]. Since the transmit signals of Baseline 1 and Baseline 3 have high PAPR, they require linear power amplifiers. On the other hand, the proposed solution and Baseline 2 can use more efficient non-linear power amplifiers. In the simulations, we consider two cases. In the case with ideal power efficiency, we assume that the power efficiency of both linear and non-linear power amplifiers is 1. In the case with practical power efficiency, the power efficiency of non-linear power amplifiers is 4 times that of linear power amplifiers. In this case, the effective total transmit power of Baseline 1 and Baseline 3 is $10 \log_{10}(4) \approx 6$ dB smaller than that of the proposed solution and Baseline 2 under the same total transmit power.
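The 6 dB figure follows directly from $P_e = P_T \eta$. The short sketch below reproduces the arithmetic; the helper names are hypothetical, and the $-5$ dB reference level is the effective power used for Baseline 1 and Baseline 3 in the Fig. 4 setup.

```python
import math

def to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)

def effective_power_db(reference_db, efficiency_ratio):
    """Effective power of a scheme whose amplifiers are `efficiency_ratio`
    times more efficient than the reference scheme, at equal total power."""
    return reference_db + to_db(efficiency_ratio)

gap_db = to_db(4.0)                           # efficiency ratio of 4
ce_power_db = effective_power_db(-5.0, 4.0)   # linear-PA baselines sit at -5 dB

print(f"power gap: {gap_db:.2f} dB")
print(f"effective power with non-linear PAs: {ce_power_db:.2f} dB")
```

The result is about 6.02 dB, so the CE schemes operate at roughly 1 dB effective power when the linear-amplifier baselines operate at $-5$ dB.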
[Figure 4 plot. Horizontal axis: number of RF chains (S). Vertical axis: ergodic sum rate (bits/channel use). Curves: Proposed under ideal PE; Baseline 1 under ideal and practical PE; Baseline 2 under ideal PE; Baseline 3 under ideal and practical PE; Proposed under practical PE; Baseline 2 under practical PE.]
Figure 4: Ergodic sum rate versus S with M = 100 antennas and perfect CSI at the BS.
In Fig. 4, we plot the ergodic sum rate versus S with M = 100 antennas and perfect CSI at the BS. The effective total transmit power of Baseline 1 and Baseline 3 is fixed at −5 dB, and the total transmit power of all schemes is set to be identical (i.e., the effective total transmit power of the proposed solution and Baseline 2 is −5 dB under ideal power efficiency and 1 dB under practical power efficiency). With only S = 32 RF chains, the proposed solution already achieves performance similar to that of Baseline 2 (single stage CE precoding), which requires M = 100 RF chains. Under the same effective total transmit power, the two stage/single stage CE precoding schemes achieve a lower ergodic sum rate than their linear precoding counterparts (Baseline 3/Baseline 1) due to the more stringent CE constraint. However, under practical power efficiency, the two stage/single stage CE precoding schemes achieve a much higher ergodic sum rate (see the dotted curves) than their linear precoding counterparts due to the higher power efficiency of non-linear power amplifiers. In Fig. 5, we consider outdated CSI at the BS. In this case, with only S = 24 RF chains, the proposed solution already outperforms Baseline 2, which requires M = 100 RF chains. This is because in Baseline 2, the BS requires the full channel vector, which has a higher dimension than the effective channel vector. As a result, the CSI delay (and CSI error) of Baseline 2 is also larger. The other results are similar to those in Fig. 4. In summary, with S = 16 RF chains, the proposed solution achieves the best performance under the practical scenario with outdated CSI and non-ideal power efficiency of power amplifiers, as shown in Fig. 5.
[Figure 5 plot. Horizontal axis: number of RF chains (S). Vertical axis: ergodic sum rate (bits/channel use). Curves: Proposed under ideal PE; Baseline 1 under ideal and practical PE; Baseline 2 under ideal PE; Baseline 3 under ideal and practical PE; Proposed under practical PE; Baseline 2 under practical PE.]
Figure 5: Ergodic sum rate versus S with M = 100 antennas and outdated CSI at the BS.
VI. CONCLUSION
We propose a two stage CE precoding solution to resolve the key practical issues associated with the implementation of massive MIMO systems. While the proposed solution can potentially enable low cost massive MIMO BS implementations, the optimization of two stage CE precoding is a very challenging non-convex stochastic optimization problem which cannot be solved by existing algorithms. We propose an online AO algorithm which is guaranteed to converge to a stationary solution of this problem without requiring explicit knowledge of the channel statistics. Simulations show that the proposed solution can achieve the first order gain provided by the massive MIMO array under practical constraints such as a limited number of RF chains, nonlinear power amplifiers and limited resources for CSI signaling. Therefore, the proposed solution is a good candidate for practical implementation of massive MIMO systems.
APPENDIX
A. Proof of Lemma 1
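After some calculations, we have
$$I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z}) = P \left\| \mathbf{G} \tilde{\mathbf{H}} \mathbf{x}(\boldsymbol{\theta}) \right\|^2 + \|\mathbf{u}\|^2 + \mathrm{Tr}\left( \mathbf{G}\mathbf{G}^H \right) - 2\sqrt{P}\, \mathrm{Re}\left[ \mathbf{u}^H \mathbf{G} \tilde{\mathbf{H}} \mathbf{x}(\boldsymbol{\theta}) \right], \tag{27}$$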
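which is a strictly convex function of $\mathbf{G}$. Then the optimal receive coefficients in (16) can be obtained by solving the first order optimality condition of the convex optimization problem $\min_{\mathbf{G}} I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z})$.
It can be verified that
$$I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z}) = \mathrm{Tr}\left( \mathbf{G}\mathbf{G}^H \right) + \left\| \sqrt{P}\, \mathbf{G} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i} - \mathbf{u} \right\|^2 + P \left\| \mathbf{G} \tilde{\mathbf{H}}_i \right\|^2 + 2\, \mathrm{Re}\left( \sqrt{P}\, \tilde{\mathbf{H}}_i^H \mathbf{G}^H \left( \sqrt{P}\, \mathbf{G} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i} - \mathbf{u} \right) e^{-j\theta_i} \right). \tag{28}$$
Only the last term depends on $\theta_i$. Writing $c = \tilde{\mathbf{H}}_i^H \mathbf{G}^H \left( \sqrt{P}\, \mathbf{G} \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i} - \mathbf{u} \right)$, this term equals $2\sqrt{P}\, |c| \cos(\arg c - \theta_i)$, which for $|c| \neq 0$ is minimized exactly when $\theta_i = \pi + \arg c$ (modulo $2\pi$). Hence the solution in Lemma 1 minimizes $I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z})$ over $\theta_i \in [-\pi, \pi)$.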
B. Proof of Theorem 1
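For convenience, define $f(\mathbf{x}, \mathbf{G}) = I(\boldsymbol{\phi}, \boldsymbol{\theta}, \mathbf{G}; \mathbf{z})$, where $\boldsymbol{\theta} = \arg(\mathbf{x})$. Clearly, Procedure PS does not increase the objective $f(\mathbf{x}, \mathbf{G})$ at any iteration. Since $f(\mathbf{x}, \mathbf{G})$ is lower bounded by zero, the objective $f(\mathbf{x}, \mathbf{G})$ converges to some value $I^\star$. Let $\mathbf{x}^{(l)} = e^{j\boldsymbol{\theta}^{(l)}}$ and let $(\mathbf{x}^\star, \mathbf{G}^\star)$ denote an accumulation point of $(\mathbf{x}^{(l)}, \mathbf{G}^{(l)})$, $l = 1, 2, \ldots$. Then there exists a subsequence $l_k$, $k = 1, \ldots$, such that $\lim_{k\to\infty} \mathbf{G}^{(l_k)} = \mathbf{G}^\star$ and $\lim_{k\to\infty} \mathbf{x}^{(l_k)} = \mathbf{x}^\star$.
We first prove $\mathbf{G}^\star = \mathbf{G}^\circ \triangleq \arg\min_{\mathbf{G}} f(\mathbf{x}^\star, \mathbf{G})$. Since $\mathbf{G}^{(l_k+1)} = \arg\min_{\mathbf{G}} f(\mathbf{x}^{(l_k)}, \mathbf{G})$, we have $\lim_{k\to\infty} \mathbf{G}^{(l_k+1)} = \mathbf{G}^\circ$. If $\mathbf{G}^\star \neq \mathbf{G}^\circ$, we have
$$\lim_{k\to\infty} \left[ f\left( \mathbf{x}^{(l_k)}, \mathbf{G}^{(l_k)} \right) - f\left( \mathbf{x}^{(l_k)}, \mathbf{G}^{(l_k+1)} \right) \right] = f(\mathbf{x}^\star, \mathbf{G}^\star) - f(\mathbf{x}^\star, \mathbf{G}^\circ) \neq 0, \tag{29}$$
where the last inequality holds because $\min_{\mathbf{G}} f(\mathbf{x}^\star, \mathbf{G})$ has a unique solution $\mathbf{G}^\circ$. (29) contradicts $\lim_{l\to\infty} f(\mathbf{x}^{(l)}, \mathbf{G}^{(l)}) = I^\star$. Hence, we must have $\mathbf{G}^\star = \mathbf{G}^\circ$.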
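Second, we prove $x_1^\star = x_1^\circ \triangleq \arg\min_{|x_1|=1} f(\bar{\mathbf{x}}_0^\star, \mathbf{G}^\star)$, where $\bar{\mathbf{x}}_0^\star = [x_1, x_2^\star, \ldots, x_{NS}^\star]$. There are two cases.
Case 1: $\left| \tilde{\mathbf{H}}_1^H \mathbf{G}^{\star H} \left( \sqrt{P}\, \mathbf{G}^\star \tilde{\mathbf{H}}_{-1} \mathbf{x}_{-1}^\star - \mathbf{u} \right) \right| \geq \varepsilon$, where $\mathbf{x}_{-1}^\star = [x_2^\star, \ldots, x_{NS}^\star]$. In this case, if $x_1^\star \neq x_1^\circ$, we have
$$\lim_{k\to\infty} \left[ f\left( \mathbf{x}^{(l_k)}, \mathbf{G}^{(l_k+1)} \right) - f\left( \mathbf{b}_1^{(l_k+1)}, \mathbf{G}^{(l_k+1)} \right) \right] = f(\mathbf{x}^\star, \mathbf{G}^\star) - f(\mathbf{b}_1^\circ, \mathbf{G}^\star) \neq 0, \tag{30}$$
where $\mathbf{b}_1^{(l_k+1)} = [x_1^{(l_k+1)}, x_2^{(l_k)}, \ldots, x_{NS}^{(l_k)}]$, $\mathbf{b}_1^\circ = [x_1^\circ, x_2^\star, \ldots, x_{NS}^\star]$, the first equality holds because $\lim_{k\to\infty} \mathbf{G}^{(l_k+1)} = \mathbf{G}^\circ = \mathbf{G}^\star$ and $\lim_{k\to\infty} x_1^{(l_k+1)} = x_1^\circ$, and the last inequality holds because $\arg\min_{|x_1|=1} f(\bar{\mathbf{x}}_0^\star, \mathbf{G}^\star)$ has a unique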
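solution when $\left| \tilde{\mathbf{H}}_1^H \mathbf{G}^{\star H} \left( \sqrt{P}\, \mathbf{G}^\star \tilde{\mathbf{H}}_{-1} \mathbf{x}_{-1}^\star - \mathbf{u} \right) \right| \geq \varepsilon$. (30) contradicts $\lim_{l\to\infty} f(\mathbf{x}^{(l)}, \mathbf{G}^{(l)}) = I^\star$. Hence, $x_1^\star = x_1^\circ$.
Case 2: $\left| \tilde{\mathbf{H}}_1^H \mathbf{G}^{\star H} \left( \sqrt{P}\, \mathbf{G}^\star \tilde{\mathbf{H}}_{-1} \mathbf{x}_{-1}^\star - \mathbf{u} \right) \right| < \varepsilon$. It follows from (28) that $f(\mathbf{x}^\star, \mathbf{G}^\star) - \min_{|x_1|=1} f(\bar{\mathbf{x}}_0^\star, \mathbf{G}^\star) < \varepsilon$. Moreover, there exists a sufficiently large $k_0$ such that $\left| \tilde{\mathbf{H}}_1^H \mathbf{G}^{(l_k)H} \left( \sqrt{P}\, \mathbf{G}^{(l_k)} \tilde{\mathbf{H}}_{-1} \mathbf{x}_{-1}^{(l_k)} - \mathbf{u} \right) \right| < \varepsilon$, $\forall k \geq k_0$. By the definition of Procedure PS, $x_1^{(l_k+1)} = x_1^{(l_k)}$, $\forall k \geq k_0$, and thus we also have $\lim_{k\to\infty} x_1^{(l_k+1)} = x_1^\star$.
Repeating an analysis similar to that for $x_1^\star$, it can be shown that for $i = 1, \ldots, NS$, we have $x_i^\star = x_i^\circ \triangleq \arg\min_{|x_i|=1} f(\bar{\mathbf{x}}_{i-1}^\star, \mathbf{G}^\star)$ if $\left| \tilde{\mathbf{H}}_i^H \mathbf{G}^{\star H} \left( \sqrt{P}\, \mathbf{G}^\star \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i}^\star - \mathbf{u} \right) \right| \geq \varepsilon$, and $f(\mathbf{x}^\star, \mathbf{G}^\star) - \min_{|x_i|=1} f(\bar{\mathbf{x}}_{i-1}^\star, \mathbf{G}^\star) < \varepsilon$ otherwise, where $\bar{\mathbf{x}}_{i-1}^\star = [x_1^\star, \ldots, x_{i-1}^\star, x_i, x_{i+1}^\star, \ldots, x_{NS}^\star]$. Then Theorem 1 follows immediately from this result and the fact that $\mathbf{G}^\star = \mathbf{G}^\circ \triangleq \arg\min_{\mathbf{G}} f(\mathbf{x}^\star, \mathbf{G})$.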
C. Proof of Theorem 2
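Define $I^J(\boldsymbol{\phi}) = \frac{1}{J} \sum_{q=1}^{J} I\left( \boldsymbol{\phi}, \boldsymbol{\theta}^J(\mathbf{z}_q), \mathbf{G}^J(\mathbf{z}_q); \mathbf{z}_q \right)$ and $I_0^J(\boldsymbol{\phi}) = \frac{1}{J} \sum_{q=1}^{J} I\left( \boldsymbol{\phi}, \boldsymbol{\theta}^{J-1}(\mathbf{z}_q), \mathbf{G}^{J-1}(\mathbf{z}_q); \mathbf{z}_q \right)$. Let $\Delta I^J = I^J(\boldsymbol{\phi}^J) - I_0^J(\boldsymbol{\phi}^{J-1})$. Clearly, we have $\Delta I^J \leq 0$. On the other hand,
$$\Delta I^J = I\left( \boldsymbol{\phi}^J, \tilde{\Theta}^J \right) - I\left( \boldsymbol{\phi}^{J-1}, \tilde{\Theta}^{J-1} \right) + I^J\left( \boldsymbol{\phi}^J \right) - I\left( \boldsymbol{\phi}^J, \tilde{\Theta}^J \right) + I\left( \boldsymbol{\phi}^{J-1}, \tilde{\Theta}^{J-1} \right) - I_0^J\left( \boldsymbol{\phi}^{J-1} \right), \tag{31}$$
where $\tilde{\Theta}^J = \left\{ \boldsymbol{\theta}^J\left( \mathbf{z}_{q_z^{J+1}} \right), \mathbf{G}^J\left( \mathbf{z}_{q_z^{J+1}} \right) : \forall \mathbf{z} \right\}$, $J = 1, 2, \ldots$, $q_z^{J+1} = \arg\min_{q \in [1, J]} \left\| \tilde{\mathbf{z}} - \tilde{\mathbf{z}}_q \right\|$, and $\tilde{\mathbf{z}}$ is the effective channel-symbol vector corresponding to $\mathbf{z}$. According to the strong law of large numbers, we have
$$\lim_{J\to\infty} \left[ I^J\left( \boldsymbol{\phi}^J \right) - I\left( \boldsymbol{\phi}^J, \tilde{\Theta}^J \right) \right] = 0, \tag{32}$$
$$\lim_{J\to\infty} \left[ I_0^J\left( \boldsymbol{\phi}^{J-1} \right) - I\left( \boldsymbol{\phi}^{J-1}, \tilde{\Theta}^{J-1} \right) \right] = 0, \tag{33}$$
with probability 1. Combining (31) to (33), we have
$$\lim_{J\to\infty} \left[ \Delta I^J - \left( I\left( \boldsymbol{\phi}^J, \tilde{\Theta}^J \right) - I\left( \boldsymbol{\phi}^{J-1}, \tilde{\Theta}^{J-1} \right) \right) \right] = 0 \tag{34}$$
with probability 1. From (34), $\Delta I^J \leq 0$ and the fact that $I(\boldsymbol{\phi}^J, \tilde{\Theta}^J)$ is lower bounded by zero, we have $\lim_{J\to\infty} \Delta I^J = 0$ with probability 1, from which it follows that
$$\lim_{J\to\infty} \left[ I\left( \boldsymbol{\phi}^J, \tilde{\Theta}^J \right) - I\left( \boldsymbol{\phi}^{J-1}, \tilde{\Theta}^{J-1} \right) \right] = 0, \tag{35}$$
with probability 1. Based on the above analysis, we use contradiction to prove that any accumulation point $\left( \boldsymbol{\phi}^\star, \tilde{\Theta}^\star \triangleq \left\{ \tilde{\boldsymbol{\theta}}^\star(\mathbf{z}), \tilde{\mathbf{G}}^\star(\mathbf{z}) : \forall \mathbf{z} \right\} \right)$ of $(\boldsymbol{\phi}^J, \tilde{\Theta}^J)$, $J = 1, 2, \ldots$, satisfies the following conditions if (35) is true.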
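C1: For every $\mathbf{z}$ outside a set of probability zero, we have 1) $\tilde{\mathbf{G}}^\star(\mathbf{z}) = \arg\min_{\mathbf{G}} I\left( \boldsymbol{\phi}^\star, \tilde{\boldsymbol{\theta}}^\star(\mathbf{z}), \mathbf{G}; \mathbf{z} \right)$ and 2) if $\left| \tilde{\mathbf{H}}_i^H \tilde{\mathbf{G}}^{\star H}(\mathbf{z}) \left( \sqrt{P}\, \tilde{\mathbf{G}}^\star(\mathbf{z}) \tilde{\mathbf{H}}_{-i} \mathbf{x}_{-i}^\star(\mathbf{z}) - \mathbf{u} \right) \right| \geq \varepsilon$, then $\tilde{\theta}_i^\star(\mathbf{z}) = \arg\min_{\theta_i \in [-\pi, \pi)} I\left( \boldsymbol{\phi}^\star, \bar{\boldsymbol{\theta}}_{i-1}^\star(\mathbf{z}), \tilde{\mathbf{G}}^\star(\mathbf{z}); \mathbf{z} \right)$, where $\bar{\boldsymbol{\theta}}_{i-1}^\star(\mathbf{z}) = \left[ \tilde{\theta}_1^\star(\mathbf{z}), \ldots, \tilde{\theta}_{i-1}^\star(\mathbf{z}), \theta_i, \tilde{\theta}_{i+1}^\star(\mathbf{z}), \ldots, \tilde{\theta}_{NS}^\star(\mathbf{z}) \right]$ and $\mathbf{x}_{-i}^\star(\mathbf{z}) = \left[ e^{j\tilde{\theta}_1^\star(\mathbf{z})}, \ldots, e^{j\tilde{\theta}_{i-1}^\star(\mathbf{z})}, e^{j\tilde{\theta}_{i+1}^\star(\mathbf{z})}, \ldots, e^{j\tilde{\theta}_{SN}^\star(\mathbf{z})} \right]$; otherwise $I\left( \boldsymbol{\phi}^\star, \tilde{\boldsymbol{\theta}}^\star(\mathbf{z}), \tilde{\mathbf{G}}^\star(\mathbf{z}); \mathbf{z} \right) - \min_{\theta_i \in [-\pi, \pi)} I\left( \boldsymbol{\phi}^\star, \bar{\boldsymbol{\theta}}_{i-1}^\star(\mathbf{z}), \tilde{\mathbf{G}}^\star(\mathbf{z}); \mathbf{z} \right) < \varepsilon$.
C2: If $\left| E\left[ g_i\left( \boldsymbol{\phi}_{-i}^\star, \tilde{\boldsymbol{\theta}}^\star(\mathbf{z}), \tilde{\mathbf{G}}^\star(\mathbf{z}); \mathbf{z} \right) \right] \right| \geq \varepsilon$, then $\phi_i^\star = \arg\min_{\phi_i \in [-\pi, \pi)} I\left( \bar{\boldsymbol{\phi}}_{i-1}^\star, \tilde{\Theta}^\star \right)$, where $\boldsymbol{\phi}_{-i}^\star = [\phi_1^\star, \ldots, \phi_{i-1}^\star, \phi_{i+1}^\star, \ldots, \phi_{MS}^\star]$ and $\bar{\boldsymbol{\phi}}_{i-1}^\star = [\phi_1^\star, \ldots, \phi_{i-1}^\star, \phi_i, \phi_{i+1}^\star, \ldots, \phi_{MS}^\star]$; otherwise $I\left( \boldsymbol{\phi}^\star, \tilde{\Theta}^\star \right) - \min_{\phi_i \in [-\pi, \pi)} I\left( \bar{\boldsymbol{\phi}}_{i-1}^\star, \tilde{\Theta}^\star \right) < \varepsilon$.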
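The proof can be obtained by contradiction. Since $(\boldsymbol{\phi}^\star, \tilde{\Theta}^\star)$ is an accumulation point, there exists a subsequence $J_k$, $k = 1, \ldots$, such that $\lim_{k\to\infty} \boldsymbol{\phi}^{J_k} = \boldsymbol{\phi}^\star$ and $\lim_{k\to\infty} \tilde{\Theta}^{J_k} = \tilde{\Theta}^\star$. Suppose C1 and C2 are not satisfied. Using (32), (33) and following an analysis similar to that in Appendix B, it can be shown that $\lim_{k\to\infty} \left[ I\left( \boldsymbol{\phi}^{J_k}, \tilde{\Theta}^{J_k} \right) - I\left( \boldsymbol{\phi}^{J_k+1}, \tilde{\Theta}^{J_k+1} \right) \right] \neq 0$, which contradicts (35).
Next, we show that $\lim_{J\to\infty} \left[ \boldsymbol{\theta}^{J\star}(\mathbf{z}) - \boldsymbol{\theta}^J\left( \mathbf{z}_{q_z^{J+1}} \right) \right] = 0$ and $\lim_{J\to\infty} \left[ \mathbf{G}^{J\star}(\mathbf{z}) - \mathbf{G}^J\left( \mathbf{z}_{q_z^{J+1}} \right) \right] = 0$ for all $\mathbf{z}$, and thus any accumulation point $(\boldsymbol{\phi}^\star, \Theta^\star)$ of $(\boldsymbol{\phi}^J, \Theta^J)$, $J = 1, 2, \ldots$, is also an accumulation point $(\boldsymbol{\phi}^\star, \tilde{\Theta}^\star)$ of $(\boldsymbol{\phi}^J, \tilde{\Theta}^J)$, $J = 1, 2, \ldots$. Note that $\left( \boldsymbol{\theta}^{J\star}(\mathbf{z}), \mathbf{G}^{J\star}(\mathbf{z}) \right)$ is the output of Procedure PS with input $\tilde{\mathbf{z}}$ and $\left( \boldsymbol{\theta}^{J-1}\left( \mathbf{z}_{q_z^{J}} \right), \mathbf{G}^{J-1}\left( \mathbf{z}_{q_z^{J}} \right) \right)$, and $\left( \boldsymbol{\theta}^{J}\left( \mathbf{z}_{q_z^{J+1}} \right), \mathbf{G}^{J}\left( \mathbf{z}_{q_z^{J+1}} \right) \right)$ is the output of Procedure PS with input $\tilde{\mathbf{z}}_{q_z^{J+1}}$ and $\left( \boldsymbol{\theta}^{J-1}\left( \mathbf{z}_{q_z^{J}} \right), \mathbf{G}^{J-1}\left( \mathbf{z}_{q_z^{J}} \right) \right)$. It can be verified that Procedure PS is a continuous mapping from the input to the output and that $\lim_{J\to\infty} \left[ \tilde{\mathbf{z}}_{q_z^{J+1}} - \tilde{\mathbf{z}} \right] = 0$. As a result, we have $\lim_{J\to\infty} \left[ \boldsymbol{\theta}^{J\star}(\mathbf{z}) - \boldsymbol{\theta}^J\left( \mathbf{z}_{q_z^{J+1}} \right) \right] = 0$ and $\lim_{J\to\infty} \left[ \mathbf{G}^{J\star}(\mathbf{z}) - \mathbf{G}^J\left( \mathbf{z}_{q_z^{J+1}} \right) \right] = 0$.
Combining the above results, we have proved that any accumulation point $(\boldsymbol{\phi}^\star, \Theta^\star)$ of $(\boldsymbol{\phi}^J, \Theta^J)$, $J = 1, 2, \ldots$, satisfies C1 and C2 with probability 1, from which Theorem 2 follows.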
REFERENCES
[1] F. Rusek, D. Persson, B. K. Lau, E. Larsson, T. Marzetta, O. Edfors, and F. Tufvesson, “Scaling up MIMO: Opportunities
and challenges with very large arrays,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40–60, Jan. 2013.
[2] C. Studer and E. Larsson, “PAR-aware large-scale multi-user MIMO-OFDM downlink,” IEEE J. Select. Areas Commun., vol. 31,
no. 2, pp. 303–313, February 2013.
[3] S. Mohammed and E. Larsson, “Per-antenna constant envelope precoding for large multi-user MIMO systems,” IEEE
Trans. Commun., vol. 61, no. 3, pp. 1059–1071, March 2013.
[4] ——, “Constant-envelope multi-user precoding for frequency-selective massive MIMO systems,” IEEE Wireless Communications Letters, vol. 2, no. 5, pp. 547–550, October 2013.
[5] X. Zhang, A. Molisch, and S.-Y. Kung, “Variable-phase-shift-based RF-baseband codesign for MIMO antenna selection,”
IEEE Trans. Signal Processing, vol. 53, no. 11, pp. 4091–4103, 2005.
[6] P. Sudarshan, N. Mehta, A. Molisch, and J. Zhang, “Channel statistics-based RF pre-processing with antenna selection,”
[7] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. Heath, “Spatially sparse precoding in millimeter wave MIMO
systems,” IEEE Trans. Wireless Commun., 2014.
[8] A. Liu and V. K. N. Lau, “Phase only RF precoding for massive MIMO systems with limited RF chains,” IEEE Trans.
Signal Processing, vol. 62, no. 17, pp. 4505–4515, Sept 2014.
[9] J. R. Birge and F. Louveaux, Introduction to stochastic programming. Springer, 2011.
[10] S. Boyd and A. Mutapcic, “Stochastic subgradient methods,” 2008. [Online]. Available: http://see.stanford.edu/materials/lsocoee364b/04-stoch_subgrad_notes.pdf
[11] V. van Zelst and T. Schenk, “Implementation of a MIMO OFDM-based wireless LAN system,” IEEE Trans. Signal Processing,
vol. 52, no. 2, pp. 483–494, Feb 2004.
[12] A. Liu and V. Lau, “Hierarchical interference mitigation for massive MIMO cellular networks,” IEEE Trans. Signal
Processing, vol. 62, no. 18, pp. 4786–4797, Sept 2014.
[13] L. Grippo and M. Sciandrone, “On the convergence of the block nonlinear Gauss-Seidel method under convex constraints,”
Operat. Res. Lett., vol. 26, pp. 127–136, 2000.
[14] Technical Specification Group Radio Access Network; Further Advancements for E-UTRA Physical Layer Aspects, 3GPP
[15] T. Yoo and A. Goldsmith, “On the optimality of multiantenna broadcast scheduling using zero-forcing beamforming,” IEEE
Journal on Selected Areas in Communications, vol. 24, no. 3, pp. 528–541, Mar. 2006.
[16] A. Adhikary, J. Nam, J.-Y. Ahn, and G. Caire, “Joint spatial division and multiplexing - the large-scale array regime,”
IEEE Trans. Info. Theory, vol. 59, no. 10, pp. 6441–6463, Oct 2013.
[17] K. Baddour and N. Beaulieu, “Autoregressive modeling for fading channel simulation,” IEEE Trans. Wireless Commun.,
vol. 4, no. 4, pp. 1650–1662, 2005.
[18] S. C. Cripps, RF Power Amplifiers for Wireless Communications. Artech Publishing House, 1999.
An Liu (S’07–M’09) received the Ph.D. and B.S. degrees in Electrical Engineering from Peking
University, China, in 2011 and 2004, respectively.
From 2008 to 2010, he was a visiting scholar at the Department of ECEE, University of Colorado at
Boulder. From 2011 to 2013, he was a Postdoctoral Research Fellow with the Department of ECE, HKUST,
and he is currently a Research Assistant Professor. His research interests include wireless communication,
stochastic optimization and compressive sensing.
Vincent K. N. Lau (SM’04–F’12) obtained B.Eng (Distinction 1st Hons) from the University of Hong
Kong (1989-1992) and Ph.D. from the Cambridge University (1995-1997). He joined Bell Labs from
1997-2004 and the Department of ECE, Hong Kong University of Science and Technology (HKUST) in
2004. He is currently a Chair Professor and the Founding Director of Huawei-HKUST Joint Innovation
Lab at HKUST. His current research focus includes robust and delay-optimal cross layer optimization
for MIMO/OFDM wireless systems, interference mitigation techniques for wireless networks, massive