In this paper, we propose an iterative transceiver design for a multi-relay multiuser multiple-input multiple-output (MIMO) system. The design criterion is to minimize the sum mean squared error (SMSE) under a relay sum power constraint (RSPC) where only local channel state information (CSI) is available at each relay. Local CSI at a relay is defined as the CSI of the channel between the base station (BS) and that relay in the 1st-hop link, and the CSI of the channel between that relay and all users in the 2nd-hop link. By exploiting a BS transmitter structure that is concatenated with a block diagonalization (BD) precoder, each relay's precoder can be determined using only the local CSI at the relay. The proposed scheme is based on sequential iteration of two stages: stage 1 determines the BS transmitter and relay precoders jointly with SMSE duality, and stage 2 determines the user receivers. We verify that the proposed scheme outperforms simple amplify-and-forward (SAF), minimum mean squared error (MMSE) relaying, and the existing scheme of [13] in terms of both SMSE and sum-rate performance.
1. Introduction
Multiple-input multiple-output (MIMO) and relaying have been regarded as useful technologies that contribute greatly to beyond fourth generation (4G) wireless systems. In [1], various relay technologies embodied in the long term evolution-advanced (LTE-A) standard are described, and open research issues that should be resolved for relay technology to be practically incorporated into the 4G standard are presented. In a realistic network environment that suffers from significant path loss and shadowing, relays are essential to overcome severe signal power attenuation. Amplify-and-forward (AF) and decode-and-forward (DF) are conventional relaying protocols [2], which are also known as nonregenerative and regenerative relaying, respectively. From a practical implementation perspective, AF relaying has been preferred for its simplicity, since DF relaying incurs processing delay for full decoding. To improve the quality of service (QoS) of multiple end users, the deployment of multiple AF relays with multiple antennas can be considered. If all users are also equipped with multiple antennas, the properties of this all-MIMO network can be exploited to enhance the end users' performance. As a simple example, we can consider one-to-one coupling between relays and users, in which each user is served by a certain relay. Similarly, we can consider one-to-many coupling between relays and users. In these cases, the BS transmits each user's desired data streams to the user's corresponding relay over the 1st-hop channel. Each relay processes its received signal and transmits it to the users it serves over the 2nd-hop channel. Since users experience interference from the relays serving other user groups, interference management among relays should be considered in order to exploit the multiple antennas of relays and users. For the relay deployment scenarios above, we should design the linear processing matrices at the BS, relays, and users to maximize network performance in a given operating environment. To avoid confusion of terminology, we refer to the linear processing matrices at the BS, each relay, and each user as the transmitter, precoder, and receiver, respectively, for the rest of this paper. Thus, transceiver design for a multi-relay multiuser (MRMU) MIMO network means determining the BS transmitter, relay precoders, and user receivers.
Several studies have dealt with the multi-relay multiuser (MRMU) network. Zhao et al. [3] proposed cooperative relaying transmission schemes for a network with a single BS, two users, and two decode-and-forward (DF) relays, where relays and users are equipped with multiple antennas. The authors optimized the 1st-hop link and the 2nd-hop link separately rather than jointly. The well-known singular value decomposition (SVD) transmission scheme [4] is used for the 1st-hop channel, and the two relays cooperatively choose their beamforming vectors so that each user's desired data can be combined with maximum ratio combining (MRC) [4] for the 2nd-hop channel. However, these schemes cannot be easily extended to a network with arbitrary numbers of relays and users. Long et al. [5] proposed an energy-efficient relaying scheme for an MRMU network in which the BS has multiple antennas while each relay and each user has a single antenna. The design of [5] chooses the coefficient of each relay to mitigate inter-user interference at the users and to minimize the transmit power of the relays. It cannot be directly applied to a network with multiple-antenna relays. In addition, the authors of [5] assumed the BS transmitter to be a zero-forcing (ZF) precoder to make the problem more tractable, which is not necessarily optimal. Talebi et al. [6] proposed a relaying scheme for an MRMU network where the BS and each relay employ multiple antennas and each user employs a single antenna. It was designed to increase the sum rate under the assumption of DF relaying. The schemes in [6] assumed that the BS transmitter is fixed to a ZF dirty paper coding (ZF-DPC) precoder [7]. In summary, although several studies have investigated transceiver design for the MRMU network, they assume that relays or users have a single antenna and that the BS transmitter is fixed to a ZF or ZF-DPC precoder for simplicity of analysis. If all nodes are equipped with multiple antennas and the BS transmitter is not fixed to a predefined precoder, the transceiver design problem becomes much more involved. Many properties of the ultimate sum mean squared error (SMSE) and sum rate of MRMU MIMO networks have yet to be investigated, and many research problems remain open in the general scope of MRMU MIMO.
Precoder designs for a multipoint-to-multipoint network with multiple relays were addressed in [8], [9], and [10]. Phan et al. [8] considered a multi-relay-assisted multi-pair network in which all nodes have a single antenna and investigated the determination of relay coefficients (relay amplification factors), modeling the coefficients of the multiple relays as a beamforming vector. They presented three optimization problems: beamforming with minimum relay power, minimax optimization of individual beamforming power, and beamforming optimization with orthogonal source transmission. Nonsmooth optimization algorithms were proposed to solve the three problems. Oyman and Paulraj [9] introduced matched filter (MF), ZF, and linear minimum mean squared error (MMSE) relaying and evaluated those schemes in terms of the per-stream signal-to-interference ratio (SIR) distribution. Chalise and Vandendorpe [10] proposed a relay precoder design to satisfy each user's target signal-to-interference-plus-noise ratio (SINR). The focus of [9] and [10] was limited to the optimization of relay precoders without considering the source precoder structure. Moreover, each source is paired one-to-one with a destination. Thus our MRMU MIMO network, in which one BS serves multiple users, is not a special case of the multipoint-to-multipoint network.
Most transceiver designs for relay networks require all channel state information (CSI) in the system under the assumption of centralized processing. Xing et al. [11] provided a unified linear MMSE transceiver design framework based on quadratic matrix programming (QMP) for various wireless networks, including multi-source, relay, and multiuser MIMO networks. In this paper, we are more interested in relay precoder design with only local CSI at each relay, which is more practical in a relay network. We propose a transceiver design that minimizes the users' SMSE under a relay sum power constraint (RSPC) where only local CSI is available at each relay. Local CSI at a relay is defined as the CSI of the channel between the BS and that relay in the 1st-hop link, and the CSI of the channel between that relay and all users in the 2nd-hop link.
Fig. 1 illustrates the local channels of the relays. To make it possible to design the relay precoders with local CSI at the relays, we employ a BS transmitter structure composed of two concatenated matrices, of which the former part is fixed to a block diagonalization (BD) precoder [12]. The latter part of the BS transmitter is jointly optimized with the relay precoders. Thus, the proposed scheme has an advantage over the schemes in [5] and [6], which simply fix the entire BS transmitter to a ZF-based precoder. The proposed scheme is based on sequential iteration of two stages. In stage 1, the latter part of the BS transmitter and the relay precoders are jointly determined by applying SMSE duality to the 2nd-hop channel with fixed user receivers. In stage 2, the user receivers are set to MMSE receivers with the BS transmitter and relay precoders fixed. The algorithm iterates until it converges. We numerically verify that the proposed scheme outperforms simple AF (SAF), MMSE relaying [9], and BD-MMSE [13] in terms of both SMSE and sum-rate performance. The proposed algorithm is not yet fully practical. However, this work is expected to provide a foundation for examining practical designs for the MRMU MIMO network by introducing useful tools such as the BD precoder and SMSE duality and by showing how to approach the transceiver design problem. To make the problem more practical, a per-antenna power constraint (PAPC) or imperfect CSI may additionally be considered, which is left for future research.
Fig. 1. Multiple relay multiuser MIMO system
The following notation is used in the rest of the paper. We use $[\cdot]^{T}$, $[\cdot]^{H}$, and $\mathrm{blkdiag}[\cdot]$ for the transpose, the Hermitian transpose, and the block diagonal matrix composed of $[\cdot]$, respectively. $E[\cdot]$, $\|\cdot\|_{F}$, and $\mathrm{tr}[\cdot]$ denote the expectation, the Frobenius norm, and the trace operator, respectively. Boldface uppercase and lowercase letters denote matrices and vectors, respectively. Finally, $\mathbf{I}_{X}$ denotes the identity matrix of size $X$.
The remainder of this paper is organized as follows. The system model is specified in Section 2. In Section 3, we explain the theoretical foundation of transceiver design with only local CSI and provide the structure of the BS transmitter. The transceiver design procedure is described in Section 4. Numerical results are provided in Section 5. Finally, conclusions are drawn in Section 6.
2. System Model
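A downlink network consisting of a single BS, $R$ relays, and $K$ users is considered, as in Fig. 1. All nodes in the system are equipped with multiple antennas, and the numbers of antennas at the BS, each relay, and each user are denoted by $N_S$, $N_R$, and $N_U$, respectively. The number of the $k$-th user's desired data streams is denoted by $L_k$, and the BS transmits $L=\sum_{k=1}^{K}L_k$ data streams. The $k$-th user's data $\mathbf{x}_k\in\mathbb{C}^{L_k\times 1}$ is transmitted to a relay after being precoded by $\mathbf{T}_{BS,k}\in\mathbb{C}^{N_S\times L_k}$ at the BS. The received signal at the $r$-th relay is written as

$$\mathbf{y}_r=\mathbf{G}_r\sum_{k=1}^{K}\mathbf{T}_{BS,k}\mathbf{x}_k+\mathbf{n}_{1,r}$$

where $\mathbf{G}_r\in\mathbb{C}^{N_R\times N_S}$ denotes the 1st-hop channel matrix from the BS to the $r$-th relay, and $\mathbf{n}_{1,r}$ denotes the noise vector at the $r$-th relay. Two stacked matrices, $\mathbf{G}=[\mathbf{G}_1^T\ \mathbf{G}_2^T\ \cdots\ \mathbf{G}_R^T]^T\in\mathbb{C}^{N_RR\times N_S}$ and $\mathbf{T}_{BS}=[\mathbf{T}_{BS,1}\ \mathbf{T}_{BS,2}\ \cdots\ \mathbf{T}_{BS,K}]\in\mathbb{C}^{N_S\times L}$, are introduced. The received signal at all relays can be written as

$$\mathbf{y}=\mathbf{G}\mathbf{T}_{BS}\mathbf{x}+\mathbf{n}_1$$

where $\mathbf{x}=[\mathbf{x}_1^T\ \cdots\ \mathbf{x}_K^T]^T$, $\mathbf{y}=[\mathbf{y}_1^T\ \cdots\ \mathbf{y}_R^T]^T$, and $\mathbf{n}_1=[\mathbf{n}_{1,1}^T\ \cdots\ \mathbf{n}_{1,R}^T]^T$.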
The $r$-th relay weights $\mathbf{y}_r$ by $\mathbf{W}_r\in\mathbb{C}^{N_R\times N_R}$ and transmits it to the users. The stacked precoder matrix for all relays can be written as $\mathbf{W}=\mathrm{blkdiag}[\mathbf{W}_1\ \mathbf{W}_2\ \cdots\ \mathbf{W}_R]\in\mathbb{C}^{N_RR\times N_RR}$. The 2nd-hop MIMO channel between the $r$-th relay and the $k$-th user is denoted by $\mathbf{H}_{kr}\in\mathbb{C}^{N_U\times N_R}$. For notational convenience, we define three kinds of stacked matrices: relay-wide, user-wide, and system-wide 2nd-hop channel matrices.
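The $k$-th user's received signal $\mathbf{z}_k\in\mathbb{C}^{N_U\times 1}$ is given by

$$\mathbf{z}_k=\sum_{r=1}^{R}\mathbf{H}_{kr}\mathbf{W}_r\mathbf{y}_r+\mathbf{n}_{2,k}$$

where $\mathbf{n}_{2,k}$ is the additive noise at the receiver of the $k$-th user. With the stacked vectors $\mathbf{z}=[\mathbf{z}_1^T\ \cdots\ \mathbf{z}_K^T]^T$ and $\mathbf{n}_2=[\mathbf{n}_{2,1}^T\ \cdots\ \mathbf{n}_{2,K}^T]^T$, the received signal for all users can be written as

$$\mathbf{z}=\mathbf{H}\mathbf{W}\mathbf{y}+\mathbf{n}_2=\mathbf{H}\mathbf{W}\mathbf{G}\mathbf{T}_{BS}\mathbf{x}+\mathbf{H}\mathbf{W}\mathbf{n}_1+\mathbf{n}_2$$

where $\mathbf{H}$ denotes the system-wide 2nd-hop channel matrix whose $(k,r)$-th block is $\mathbf{H}_{kr}$. Finally, $\mathbf{z}_k$ is equalized by the linear filter $\mathbf{R}_k\in\mathbb{C}^{L_k\times N_U}$ to estimate the desired data, $\hat{\mathbf{x}}_k=\mathbf{R}_k\mathbf{z}_k$. The stacked receiver matrix is given by $\mathbf{R}=\mathrm{blkdiag}[\mathbf{R}_1\ \mathbf{R}_2\ \cdots\ \mathbf{R}_K]\in\mathbb{C}^{L\times N_UK}$. The error covariance matrix of the system can be represented as follows.

$$\mathbf{E}=E\left[(\mathbf{R}\mathbf{z}-\mathbf{x})(\mathbf{R}\mathbf{z}-\mathbf{x})^H\right]=(\mathbf{R}\mathbf{H}\mathbf{W}\mathbf{G}\mathbf{T}_{BS}-\mathbf{I}_L)(\mathbf{R}\mathbf{H}\mathbf{W}\mathbf{G}\mathbf{T}_{BS}-\mathbf{I}_L)^H+\sigma_1^2\mathbf{R}\mathbf{H}\mathbf{W}\mathbf{W}^H\mathbf{H}^H\mathbf{R}^H+\sigma_2^2\mathbf{R}\mathbf{R}^H$$

where $\mathbf{x}$ is assumed to be an uncorrelated Gaussian random vector with zero mean and identity covariance matrix. Both $\mathbf{n}_1$ and $\mathbf{n}_2$ are also assumed to be uncorrelated Gaussian random vectors with zero mean and covariance matrices $\sigma_1^2\mathbf{I}_{N_RR}$ and $\sigma_2^2\mathbf{I}_{N_UK}$, respectively.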
The proposed scheme is applicable to network configurations in which the number of users is larger than that of relays. In this case, the users are grouped into each relay's serving group, so that one relay may serve multiple users. If we denote the index of the relay that serves the $k$-th user by $r(k)$, then the BS transmits the $k$-th user's desired data to the $r(k)$-th relay. The other relays generate interference to the $k$-th user. For mathematical convenience in the derivation of the algorithm, we assume that the numbers of relays and users are the same ($R=K$). It is also assumed that $r(k)=k$, which means that the $k$-th user is served by the relay with the same index $k$. For the feasibility of the scenario, we assume that $L_k\le\min\{N_U,N_R\}$ for all $k$.
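The system model of this section can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes the standard two-hop linear model in which the stacked user signal is the system-wide 2nd-hop channel times the relay precoders applied to the relay signal, plus noise, with white noise of variances `var1` and `var2` at the relays and users. The function names (`block_diag`, `crandn`, `smse`) and the small configuration used below are hypothetical choices made only for this example.

```python
import numpy as np

def block_diag(blocks):
    """Place 2-D arrays on the diagonal of an otherwise zero complex matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r, c = r + b.shape[0], c + b.shape[1]
    return out

def crandn(rng, *shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def smse(G, H, T_bs, W, R, var1, var2):
    """Closed-form sum MSE tr[E] for x_hat = R (H W (G T_bs x + n1) + n2).

    Assumes unit-covariance data x and white noise with variances var1
    (relays) and var2 (users); these names are assumptions of this sketch.
    """
    A = R @ H @ W                                # end-to-end filter seen by relay noise
    D = A @ G @ T_bs - np.eye(T_bs.shape[1])     # signal distortion term
    E = D @ D.conj().T + var1 * (A @ A.conj().T) + var2 * (R @ R.conj().T)
    return float(np.real(np.trace(E)))

# Small illustrative configuration with R = K = 2 (not one from the paper).
rng = np.random.default_rng(0)
K, N_S, N_R, N_U, L_k = 2, 4, 2, 2, 1
G = crandn(rng, N_R * K, N_S)                    # stacked 1st-hop channel
H = crandn(rng, N_U * K, N_R * K)                # system-wide 2nd-hop channel
T_bs = crandn(rng, N_S, L_k * K)                 # arbitrary BS transmitter
W = block_diag([crandn(rng, N_R, N_R) for _ in range(K)])
R = block_diag([crandn(rng, L_k, N_U) for _ in range(K)])
value = smse(G, H, T_bs, W, R, var1=0.1, var2=0.1)
```

With all receivers set to zero the function returns the number of streams $L$, which is a quick sanity check that the expression reduces to the data power when nothing is estimated.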
3. The Structure of BS Transmitter
To solve the joint BS transmitter and relay precoder design problem for SMSE minimization, we formulate the following optimization problem.
where
P
_{BS}
denotes the maximum power of a BS,
denotes the maximum total power of all relays. The determination of the $r$-th relay precoder $\mathbf{W}_r$ generally requires the CSI of the 2nd-hop channels between the other relays and all users. This increases signaling overhead and makes practical implementation formidable. To understand why the other relays' CSI is needed, we rewrite the objective function of (7) as follows.
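Observe that $\mathbf{W}_r$ and the terms associated with the other relays $l\neq r$ are multiplied together in the 2nd term on the right-hand side of (8), which means that they jointly affect the SMSE. Thus the determination of $\mathbf{W}_r$ requires the 2nd-hop CSI of every other relay $l\neq r$.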
We introduce the BD precoder [12] to eliminate the 2nd term on the right-hand side of (8). The BD precoder is a well-known precoder that block-diagonalizes the downlink channel in a MIMO broadcast channel (BC) network, so that inter-user interference is perfectly eliminated. Let us consider a downlink multiuser MIMO network consisting of a BS and $K$ users. The number of BS antennas and the number of antennas of the $k$-th user are denoted by $n_T$ and $n_{R_k}$, respectively. Let $\mathbf{M}_k$ and $\mathbf{H}_k$ be the transmit matrix that precodes the $k$-th user's data streams and the channel matrix from the BS to the $k$-th user, respectively. We denote the channels from the BS to the other users by $\tilde{\mathbf{H}}_k=[\mathbf{H}_1^T\ \cdots\ \mathbf{H}_{k-1}^T\ \mathbf{H}_{k+1}^T\ \cdots\ \mathbf{H}_K^T]^T$. $\mathbf{M}_k$ projects the $k$-th user's data streams onto the null space of $\tilde{\mathbf{H}}_k$. When $\tilde{L}_k$ is the rank of $\tilde{\mathbf{H}}_k$, the SVD of $\tilde{\mathbf{H}}_k$ can be written as

$$\tilde{\mathbf{H}}_k=\tilde{\mathbf{U}}_k\tilde{\boldsymbol{\Sigma}}_k\left[\tilde{\mathbf{V}}_k^{(1)}\ \tilde{\mathbf{V}}_k^{(0)}\right]^H$$

where $\tilde{\mathbf{V}}_k^{(1)}$ contains the first $\tilde{L}_k$ right singular vectors and $\tilde{\mathbf{V}}_k^{(0)}$ contains the last $(n_T-\tilde{L}_k)$ right singular vectors. Since $\tilde{\mathbf{V}}_k^{(0)}$ forms an orthogonal basis for the null space of $\tilde{\mathbf{H}}_k$, we set $\mathbf{M}_k$ to be $\tilde{\mathbf{V}}_k^{(0)}$. With $\mathbf{M}_k$, the $k$-th user's desired data streams are nulled at the other users. Consequently, no user experiences interference from other users. We call the column-stacked matrix $\mathbf{M}=[\mathbf{M}_1\ \mathbf{M}_2\ \cdots\ \mathbf{M}_K]$ the BD precoder. The BD precoder effectively transforms the downlink channel into a block diagonal matrix.
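The null-space construction described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the implementation of [12]: the function name `bd_precoder`, the rank tolerance `tol`, and the randomly drawn channels are hypothetical, and at least two users are assumed.

```python
import numpy as np

def bd_precoder(channels, tol=1e-10):
    """Return one null-space precoder per user for block diagonalization.

    channels: list of K >= 2 arrays, the k-th of shape (n_Rk, n_T).
    For user k, the channels of all other users are stacked and the right
    singular vectors beyond the numerical rank are kept, so that the
    precoded signal of user k vanishes at every other user.
    """
    precoders = []
    for k in range(len(channels)):
        H_tilde = np.vstack([H for j, H in enumerate(channels) if j != k])
        _, s, Vh = np.linalg.svd(H_tilde)        # full SVD: Vh is n_T x n_T
        rank = int(np.sum(s > tol * s[0]))       # numerical rank of H_tilde
        precoders.append(Vh.conj().T[:, rank:])  # orthonormal null-space basis
    return precoders

# Example: 9 transmit antennas and 3 receivers with 3 antennas each.
rng = np.random.default_rng(0)
n_T, n_R, K = 9, 3, 3
channels = [
    (rng.standard_normal((n_R, n_T)) + 1j * rng.standard_normal((n_R, n_T))) / np.sqrt(2)
    for _ in range(K)
]
M = bd_precoder(channels)

# Largest residual leakage of any user's precoded signal into another user.
leak = max(
    np.linalg.norm(channels[j] @ M[k])
    for k in range(K) for j in range(K) if j != k
)
```

In this example each stacked interference channel is 6 x 9 with full row rank almost surely, so each precoder has 9 - 6 = 3 columns, and `leak` is at the level of floating-point round-off.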
We apply this BD precoder to block-diagonalize the 1st-hop channel of our network. For the feasibility of the BD precoder at the BS, we assume that $N_S>N_R(K-1)$ and $L_k\le N_S-N_R(K-1)$ for all $k$ [12]. The BS transmitter is assumed to have the following structure.

$$\mathbf{T}_{BS}=\mathbf{T}_{BD}\mathbf{T} \tag{9}$$

where $\mathbf{T}_{BD}=[\mathbf{T}_{BD,1}\ \mathbf{T}_{BD,2}\ \cdots\ \mathbf{T}_{BD,K}]$ is the BD precoder (the former part) and $\mathbf{T}$ is an additional precoder (the latter part). $\mathbf{T}_{BD,k}$ makes the $k$-th relay receive the $k$-th user's data streams without interference from the other users' data streams. Note that the $k$-th user is served by the $k$-th relay, as explained in the previous section. The rank of $\mathbf{G}\mathbf{T}_{BD,k}$, which is the effective 1st-hop channel for the $k$-th user's data streams, is $N_R$ if $N_S\ge N_RK$ and $N_S-N_R(K-1)$ otherwise. Thus $L_k$ never exceeds this rank.
The 1st-hop channel after precoding by $\mathbf{T}_{BD}$ is block-diagonalized. It is denoted by $\mathbf{G}_{eff}=\mathbf{G}\mathbf{T}_{BD}$, and the $k$-th block diagonal element of $\mathbf{G}_{eff}$ is $\mathbf{G}_k\mathbf{T}_{BD,k}$. We also assume that $\mathbf{T}$ has a block diagonal structure, where the $k$-th block diagonal element $\mathbf{T}_k$ precodes the $k$-th user's data streams. The determination of $\mathbf{T}_k$ is discussed in the next section. Inserting $\mathbf{T}_{BS}$ in (9) into (8), we obtain the following.
The detailed derivation is presented in the Appendix. Observe that the 2nd term on the right-hand side of (8) is perfectly eliminated by the BD precoder in (10).
Now, we can rewrite the BS power constraint in (7) using the following equation.
The detailed derivation is explained in the Appendix. Problem (7) can be written as follows.
To avoid notational confusion, we clarify the subscripts used in this paper. So far, we have used subscript $r$ for the relay index and $k$ for the user index. From the standpoint of the $r$-th relay, $l$ was used for the index of any relay other than the $r$-th relay itself. Since we assumed that $R=K$ and that the $k$-th user is served by the relay with the same index $k$, we use $k$ for both relay and user indices from now on. Problem (12) is still nonconvex over $\mathbf{T}_k$, $\mathbf{W}_k$, and $\mathbf{R}_k$ for all $k$, and thus it is hard to derive a jointly optimal solution. We investigate the solution of (12) in depth in the following section.
4. BS Transmitter and Relay Precoder Optimization
We derive a suboptimal solution based on a sequential iterative algorithm composed of two stages. In the 1st stage, $\mathbf{T}$ and $\mathbf{W}$ are determined with $\mathbf{R}$ fixed. Then, in the 2nd stage, $\mathbf{R}$ is determined with the resulting $\mathbf{T}$ and $\mathbf{W}$ fixed. The iteration repeats until the SMSE converges.
 4.1 The Structure of a Relay Precoder
The $k$-th relay precoder can be decomposed into two signal processing matrices as follows.

$$\mathbf{W}_k=\mathbf{F}_k\mathbf{B}_k \tag{13}$$

where $\mathbf{B}_k$ and $\mathbf{F}_k$ denote the receive matrix and the transmit matrix of the $k$-th relay, respectively. The overall receive matrix and transmit matrix of all relays are expressed as $\mathbf{B}=\mathrm{blkdiag}[\mathbf{B}_1\ \mathbf{B}_2\ \cdots\ \mathbf{B}_K]\in\mathbb{C}^{L\times N_RK}$ and $\mathbf{F}=\mathrm{blkdiag}[\mathbf{F}_1\ \mathbf{F}_2\ \cdots\ \mathbf{F}_K]\in\mathbb{C}^{N_RK\times L}$, respectively. $\mathbf{B}_k$ equalizes the $k$-th user's data streams and is set to an MMSE filter as follows.
After substituting $\mathbf{B}_k$ into (10), some mathematical manipulation yields
where
and
(See the Appendix for the detailed derivation)
The physical meaning of $\mathbf{Q}_k$ is the autocorrelation matrix of the $k$-th user's desired signal at the $k$-th relay. The correlation matrices of all users' data are originally identity matrices at the BS. However, the signal recovered by (14) at the relay is contaminated compared with the original $\mathbf{x}$, since inter-stream interference remains at each relay even though inter-relay interference is perfectly eliminated. Therefore the correlation matrix of the recovered signal is no longer an identity matrix. By the matrix inversion lemma, the following equation holds.
Fig. 2. Duality for the 2nd-hop link
Thus the 1st term in (15) represents the SMSE at all relays. We denote it by $\mathrm{tr}[\mathbf{E}_1]$. The latter part of (15) is denoted by $\mathrm{tr}[\mathbf{E}_2]$.
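The matrix inversion lemma step used above can be checked numerically. The sketch below assumes the standard linear MMSE form for a relay receive filter, since the exact expressions are not legible in this copy; `A` stands for a generic effective 1st-hop channel seen by one relay and `var1` for the relay noise variance, and both names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N_R, L_k, var1 = 3, 2, 0.5

# Hypothetical effective channel from L_k unit-power streams to N_R relay antennas.
A = (rng.standard_normal((N_R, L_k)) + 1j * rng.standard_normal((N_R, L_k))) / np.sqrt(2)

# Standard linear MMSE filter for y = A x + n with E[x x^H] = I, E[n n^H] = var1 * I.
B = A.conj().T @ np.linalg.inv(A @ A.conj().T + var1 * np.eye(N_R))

# Correlation of the recovered signal B y; it differs from the identity.
Q = B @ A

# MSE matrix computed directly and through the matrix inversion lemma:
# I - A^H (A A^H + var1 I)^{-1} A = (I + A^H A / var1)^{-1}.
mse_direct = np.eye(L_k) - Q
mse_lemma = np.linalg.inv(np.eye(L_k) + A.conj().T @ A / var1)
```

Both expressions give the same matrix, so the trace of the first form can be read as the MSE of the relay's MMSE estimate, which is the interpretation given to the 1st term in (15).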
With (14) and $\mathbf{Q}_k$, optimization problem (12) is rewritten as follows.
 4.2 SMSE Duality for the 2nd-Hop Channel
The dual uplink network is illustrated in Fig. 2. SMSE duality is used for the 2nd-hop channel in the 1st stage to determine $\mathbf{T}$ and $\mathbf{F}$ jointly. If SMSE duality were not exploited, $\mathbf{T}$ and $\mathbf{F}$ would have to be determined iteratively, that is, by updating $\mathbf{F}$ with fixed $\mathbf{T}$ and then updating $\mathbf{T}$ with fixed $\mathbf{F}$, which requires an additional iteration stage. SMSE duality refers to the property that the SMSE achieved in the downlink can also be achieved in the dual uplink, and vice versa, while the total transmit power of the relays in the downlink is kept the same as the total transmit power of the users in the dual uplink. For a given downlink network, we can imagine a network in which the communication direction is reversed. We call this network the dual uplink network. When the original downlink network is switched to the dual uplink network, the roles of transmitter and receiver are exchanged and the channel is flipped. If the channel of the original downlink network is denoted by $\mathbf{H}$, the flipped channel is denoted by $\mathbf{H}^H$. SMSE duality can be applied to various types of networks, including MIMO relay-aided networks. In this paper, we apply SMSE duality only to the 2nd-hop channel, as in Fig. 2. In the original downlink network, each relay acts as a receiver for the 1st-hop channel and as a transmitter for the 2nd-hop channel. In the dual uplink network, on the other hand, the roles of relays and users are exchanged in the 2nd-hop channel.
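In the dual uplink network, we denote the $k$-th user's transmitter by $\mathbf{P}_k$ and the $k$-th relay's receiver by $\mathbf{K}_k$. We also use two stacked matrices, $\mathbf{P}=\mathrm{blkdiag}[\mathbf{P}_1\ \mathbf{P}_2\ \cdots\ \mathbf{P}_K]\in\mathbb{C}^{N_UK\times L}$ and $\mathbf{K}=\mathrm{blkdiag}[\mathbf{K}_1\ \mathbf{K}_2\ \cdots\ \mathbf{K}_K]\in\mathbb{C}^{L\times N_RK}$. Noise at the relay is assumed to have zero mean and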
variance. The covariance matrix for SMSE in the dual uplink is defined as follows.
where we assume that the transmit signals from the users have zero mean and unit variance. SMSE duality provides the rule for determining $\mathbf{K}$ and $\mathbf{P}$ from given $\mathbf{F}$ and $\mathbf{R}$, respectively, with a constraint on
where $\mathbf{Q}=\mathrm{blkdiag}[\mathbf{Q}_1\ \mathbf{Q}_2\ \cdots\ \mathbf{Q}_K]\in\mathbb{C}^{L\times L}$. The following proposition explains the SMSE duality for the 2nd-hop channel and reveals the relationship among $\mathbf{F}_k$, $\mathbf{K}_k$, $\mathbf{R}_k$, and $\mathbf{P}_k$.
Proposition 1: Let us set $\mathbf{F}_k$, $\mathbf{K}_k$, $\mathbf{R}_k$, $\mathbf{P}_k$, and $\alpha$ to be
Then, two equalities hold.
Proof: Equation (22) is proved straightforwardly by mathematical manipulation (the detailed derivation is presented in the Appendix). Equation (23) is derived directly from (21). This completes the proof of the proposition.
Equation (23) implies that the total transmit power in the dual uplink is kept the same as that in the downlink. In summary, SMSE duality means that the SMSE in the dual uplink can be kept the same as the SMSE in the downlink, under the constraint that the sum power consumed by the users in the dual uplink equals the sum power consumed by the relays in the downlink.
Using the transformation in Proposition 1, optimization problem (19) can be reformulated as
Observe that the variables are changed from $\mathbf{T}_k$, $\mathbf{F}_k$, and $\mathbf{R}_k$ to $\mathbf{T}_k$, $\mathbf{K}_k$, and $\mathbf{P}_k$. We provide a method to solve for $\mathbf{T}_k$, $\mathbf{K}_k$, and $\mathbf{P}_k$ in the next subsection.
 4.3 Determination of BS Transmitter and Relay Precoders
In the 1st stage, $\mathbf{R}$ is assumed to be fixed. Consequently, $\mathbf{P}$ in the dual uplink is fixed in accordance with $\mathbf{R}$, and the dual uplink SMSE becomes a function of $\mathbf{K}_k$ for all $k$. The $k$-th relay filters its received signal from the users by $\mathbf{K}_k$ and recovers the signal of the $k$-th user. Then, $\mathbf{K}_k$ is set to the following MMSE filter.
Note that $\mathbf{K}_k$ is determined using only the local CSI of the $k$-th relay. $\mathbf{F}_k$ for all $k$ is then calculated by conversion from the uplink to the downlink in a way analogous to (21).
Now, in order to solve for $\mathbf{T}_k$ for all $k$, the variables of optimization problem (24) are unified to $\mathbf{T}_k$ for all $k$. Inserting $\mathbf{K}_k$ for all $k$ from (25) into the objective function, we obtain the following (see the Appendix for the detailed derivation).
To clarify the problem, we introduce the following substitutions.
Using (27), the objective function of (24) is simplified as
Since every $f_{0,k}(\mathbf{T}_k)$ does not depend on $\mathbf{T}_l$, $\forall l\neq k$, minimizing the objective decomposes into $K$ individual subproblems, each minimizing its own $f_{0,k}(\mathbf{T}_k)$. The next proposition provides the optimal structure for minimizing $f_{0,k}(\mathbf{T}_k)$.
Proposition 2: The optimal structure for minimizing $f_{0,k}(\mathbf{T}_k)$ is given by
,
where
Ꮩ
_{k}
is a diagonal matrix of size L
_{k}
,
and
and
U
_{2,k}
∈
C
^{Lk×Lk}
are unitary matrices which are generated from following eigenvalue decompositions
.
where
(
λ
_{1,k,1}
, ⋯
and
(
λ
_{2,k,1}
, ⋯
are arranged in descending order
.
Proof: Using (29), $f_{1,k}(\mathbf{T}_k)$ is written as follows.
To minimize $f_{1,k}(\mathbf{T}_k)$, we can consider the following structure without loss of generality [14].
Note that
for all
k
as mentioned in Section 3. Next, we rewrite $f_{2,k}(\mathbf{T}_k)$ as follows.
where
T
_{k}
which minimizes
f
_{2,k}
(
T
_{k}
) is known to have following structure
[14]
.
Since
minimizes both $f_{1,k}(\mathbf{T}_k)$ and $f_{2,k}(\mathbf{T}_k)$, it also minimizes $f_{0,k}(\mathbf{T}_k)$. This completes the proof.
It is left to calculate
Ꮩ
_{k}
. Inserting
T
_{k}
in (28) into
yields,
where Ꮩ_{k,l_k} is the $l_k$-th diagonal element of Ꮩ_{k}. (See the Appendix for the detailed derivation.) Finally, the reformulated optimization problem is written as
Problem (35) is a convex optimization problem over Ꮩ_{k,l_k} and is easily solved using the Karush-Kuhn-Tucker (KKT) conditions [15] or other well-known solvers. After $\mathbf{T}_k$ is determined, $\mathbf{B}_k$ is readily calculated by (14), and $\mathbf{W}_k$ then follows from (13).
In the 2nd stage, $\mathbf{R}_k$ for all $k$ is set to the following MMSE filter with $\mathbf{T}$ and $\mathbf{W}$ fixed.
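The 2nd-stage update can be sketched as follows, assuming the standard linear MMSE receiver for a user that observes all streams through an effective channel plus colored noise. This is a hedged illustration, not the paper's exact expression: the names `mmse_receiver`, `user_mse`, `C` (the effective end-to-end channel from all $L$ streams to one user), `cols` (the positions of that user's own streams), and `noise_cov` (the covariance of the forwarded relay noise plus the user's own noise) are hypothetical.

```python
import numpy as np

def mmse_receiver(C, cols, noise_cov):
    """Linear MMSE receiver for the streams `cols` of z = C x + v.

    Assumes E[x x^H] = I and E[v v^H] = noise_cov.
    """
    cov_z = C @ C.conj().T + noise_cov
    return C[:, cols].conj().T @ np.linalg.inv(cov_z)

def user_mse(R_k, C, cols, noise_cov):
    """MSE tr E[(R_k z - x_k)(R_k z - x_k)^H] of an arbitrary receiver R_k."""
    cov_z = C @ C.conj().T + noise_cov
    cross = R_k @ C[:, cols]
    E = R_k @ cov_z @ R_k.conj().T - cross - cross.conj().T + np.eye(R_k.shape[0])
    return float(np.real(np.trace(E)))

# Example: a user with 2 antennas, 3 streams in total, its own stream first.
rng = np.random.default_rng(2)
N_U, L = 2, 3
C = (rng.standard_normal((N_U, L)) + 1j * rng.standard_normal((N_U, L))) / np.sqrt(2)
noise_cov = 0.2 * np.eye(N_U)
cols = slice(0, 1)

R_opt = mmse_receiver(C, cols, noise_cov)
mse_opt = user_mse(R_opt, C, cols, noise_cov)
```

Because the MSE is a convex quadratic function of the receiver, any perturbation of `R_opt` can only increase `user_mse`, which is why this stage never increases the SMSE when the transmitter and relay precoders are held fixed.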
Fig. 3. Flow chart of transceiver design with local CSI at relays
In summary, $\mathbf{T}$ and $\mathbf{W}$ are updated with $\mathbf{R}$ fixed in the 1st stage. Subsequently, $\mathbf{R}$ is updated with $\mathbf{T}$ and $\mathbf{W}$ fixed in the 2nd stage. When $\mathbf{R}$ is fixed in the 1st stage, $\mathbf{P}$ is also fixed through the relation (21). Given $\mathbf{P}$, $\mathbf{K}_k$ for all $k$ is updated by (25), while $\mathbf{T}_k$ for all $k$ is updated by (28). Using the relation between $\mathbf{K}_k$ and $\mathbf{F}_k$ in (21), $\mathbf{F}_k$ is readily updated. At the same time, $\mathbf{B}_k$ is updated from $\mathbf{T}_k$ by (14). Since $\mathbf{W}_k=\mathbf{F}_k\mathbf{B}_k$ by (13), determining $\mathbf{F}_k$ and $\mathbf{B}_k$ for all $k$ readily yields $\mathbf{W}$. In the next stage, $\mathbf{R}_k$ for all $k$ is updated by (36) with the resulting $\mathbf{T}$ and $\mathbf{W}$ fixed. The 1st and 2nd stages repeat until the SMSE converges. The whole procedure is summarized in Fig. 3. We call the proposed algorithm joint BS and distributed multi-relay (JBDMR) for the rest of this paper. The matrix updates at each stage yield a nonincreasing SMSE value that is lower bounded by zero. Thus the SMSE is guaranteed to converge. However, convergence to the optimal point is not guaranteed since the primal problem is nonconvex.
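The convergence argument above (each stage cannot increase the objective, which is bounded below by zero) can be illustrated on a toy problem. The sketch below is not JBDMR: it is a simplified single-link alternating minimization in which the relay sum power constraint is replaced by a power penalty with weight `mu`, and all names (`objective`, `alternating_mmse`, `var`, `mu`, `eps`) are assumptions of this example. It only demonstrates the two-stage loop and the stopping rule based on the decrease of the objective.

```python
import numpy as np

def objective(H, T, R, var, mu):
    """Toy cost: ||R H T - I||_F^2 + var * ||R||_F^2 + mu * ||T||_F^2."""
    L = T.shape[1]
    return float(
        np.linalg.norm(R @ H @ T - np.eye(L)) ** 2
        + var * np.linalg.norm(R) ** 2
        + mu * np.linalg.norm(T) ** 2
    )

def alternating_mmse(H, var, mu, eps=1e-6, max_iter=500):
    """Alternate exact minimizations over the receiver R and the transmitter T."""
    n_r, n_t = H.shape
    L = min(n_r, n_t)
    T = np.eye(n_t, L, dtype=complex)            # simple initial transmitter
    history = []
    for _ in range(max_iter):
        # Receiver update with T fixed (exact minimizer of the cost in R).
        HT = H @ T
        R = HT.conj().T @ np.linalg.inv(HT @ HT.conj().T + var * np.eye(n_r))
        # Transmitter update with R fixed (exact minimizer of the cost in T).
        RH = R @ H
        T = np.linalg.inv(RH.conj().T @ RH + mu * np.eye(n_t)) @ RH.conj().T
        history.append(objective(H, T, R, var, mu))
        # Stop when the decrease between consecutive iterations is below eps.
        if len(history) > 1 and history[-2] - history[-1] < eps:
            break
    return T, R, history

rng = np.random.default_rng(3)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
T, R, history = alternating_mmse(H, var=0.1, mu=0.05)
```

Since each update is the exact minimizer of a convex quadratic in one block of variables, the recorded cost values form a nonincreasing sequence, mirroring the monotonicity argument used for the SMSE; as in the paper, this guarantees convergence of the cost but not global optimality.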
5. Numerical Results
We compare the SMSE and sum-rate performance of JBDMR with those of other relaying schemes in this section. The unit of sum rate is bits per second per hertz (bps/Hz).
is set to be equal to
P
_{BS}
and SNR in following figures is defined as
and
The network configuration is denoted by $(R,K,N_S,N_R,N_U)$. $L_k$ is set to one for all $k$, which can achieve the best sum rate for all relaying schemes in each network configuration. Both the 1st-hop and the 2nd-hop channels experience uncorrelated Rayleigh block fading with unit variance.
Fig. 4. SMSE comparison of various relaying schemes for (3,3,9,3,3) network
Fig. 5. SMSE comparison of various relaying schemes for (4,4,8,2,2) network
For both schemes, $\mathbf{T}_{BS}$ and $\mathbf{W}$ are initialized as
where $\mathbf{I}_X(1:L)$ denotes the first $L$ columns of $\mathbf{I}_X$. JBDMR terminates when the SMSE difference between consecutive iterations becomes smaller than a predefined accuracy $\varepsilon$, which is set to $10^{-6}$. All users are assumed to be served simultaneously by a two-hop relay-aided BS.
SAF relaying and MMSE relaying are introduced for comparison; they are the most frequently cited schemes in the literature on relay-aided networks. For SAF relaying, $\mathbf{T}_{BS}$ and $\mathbf{W}_k$ follow (37) and $\mathbf{R}_k$ is given below.
For MMSE relaying, the well-defined form of relay precoders from [9] is used. We also consider a more refined MMSE scheme, BD-MMSE [13]. BD-MMSE originally operates under a per-relay power constraint (PRPC). However, the performance of BD-MMSE modified for RSPC is also evaluated for a fair comparison.
 5.1 Comparisons of SMSE
Fig. 4 compares the SMSE performance in a (3,3,9,3,3) network. BD-MMSE shows approximately the same SMSE for both the RSPC and PRPC cases. JBDMR provides the best SMSE performance among all compared relaying schemes over the entire SNR range. JBDMR achieves gains of about 6.5 dB and 2.5 dB over MMSE relaying and BD-MMSE (RSPC), respectively, at an SMSE of $10^{-2}$. Fig. 5 illustrates the SMSE comparison in a (4,4,8,2,2) network. JBDMR shows a 3.2 dB gain over BD-MMSE (RSPC) at an SMSE of $0.5\times10^{-1}$. In this network configuration, the SMSE of MMSE relaying is significantly degraded over the entire SNR range, while the SMSE of BD-MMSE and JBDMR keeps decreasing with increasing SNR. For BD-MMSE and JBDMR, the BD precoder at the transmitter removes inter-relay interference, while SAF and MMSE relaying suffer severe inter-relay interference in the interference-limited region. With SAF and MMSE relaying, the residual degrees of freedom at each relay are sufficient to mitigate interference in the (3,3,9,3,3) network but insufficient in the (4,4,8,2,2) network. Moreover, owing to the additional optimization of $\mathbf{T}$ (the latter part of $\mathbf{T}_{BS}$), JBDMR outperforms BD-MMSE. Thus JBDMR significantly improves SMSE performance for all considered network configurations and SNRs.
Fig. 6. Sum rate comparison of various relaying schemes for (3,3,9,3,3) network
Fig. 7. Sum rate comparison of various relaying schemes for (4,4,8,2,2) network
 5.2 Comparison of Sum Rate
For the same network configurations as in the preceding subsection, we compare sum-rate performance in Fig. 6 and Fig. 7. The sum-rate results show characteristics similar to those of the SMSE in Fig. 4 and Fig. 5. JBDMR always outperforms SAF, MMSE relaying, and BD-MMSE in sum rate over the entire SNR range. More specifically, JBDMR outperforms SAF, MMSE relaying, and BD-MMSE (RSPC) in sum rate by about 76.7%, 24.6%, and 11.6%, respectively, at SNR = 20 dB in the (3,3,9,3,3) network, as shown in Fig. 6. Fig. 7 displays the sum-rate comparison in the (4,4,8,2,2) network. The sum rate of MMSE relaying is significantly degraded when the SNR exceeds 20 dB. On the other hand, the sum rates of BD-MMSE and JBDMR continue to increase even in that SNR region. JBDMR provides 194.24%, 58.45%, and 13.35% improvement over SAF, MMSE relaying, and BD-MMSE (RSPC), respectively, for the (4,4,8,2,2) network in Fig. 7. These results verify that JBDMR is superior to conventional schemes such as SAF and MMSE relaying in terms of sum rate for all considered network configurations and SNRs.
6. Conclusion
We proposed an iterative transceiver design, called JBDMR, that uses local CSI at the relays in an MRMU MIMO network. Constructing the BS transmitter as the product of a BD precoder and an individual precoder for each relay made it possible to design the transceiver with local CSI. The numerical results verified that JBDMR outperforms SAF, MMSE relaying, and BD-MMSE in terms of both SMSE and sum-rate performance. Although the proposed scheme is a step toward making MRMU MIMO systems more practical, many issues remain to be resolved. The most realistic power constraint would be PAPC, which often makes the optimum transceiver design difficult to determine. Although we assumed perfect local CSI, it is not available in practice because of noise and limited backhaul capacity. In a multicell environment, intercell interference from relays in other cells may degrade performance significantly. These problems will be addressed in future research.
BIO
YoungMin Cho received his B.S. degree in Electrical and Electronic Engineering from Yonsei University, Seoul, Korea, in 2007, where he has been working toward the Ph.D. degree since 2007. His main research interests include multicell multiuser MIMO, cooperative relaying, and energy-efficient communication networks.
Janghoon Yang received his Ph.D. in Electrical Engineering from University of Southern California, Los Angeles, USA, in 2001. He is currently an Assistant Professor at the Department of Newmedia, Korean German Institute of Technology, Seoul, Korea. From 2001 to 2006, he was with the Communication R&D Center, Samsung Electronics. From 2006 to 2009, he was a Research Assistant Professor at the Department of Electrical and Electronic Engineering, Yonsei University. He has been a Professor in the Department of Korean German Institute of Technology, Seoul, since 2010. He has published numerous papers in the area of multiantenna transmission and signal processing. His research interests include wireless systems and networks, artificial intelligence, neuroscience, and brain computer interfaces.
Dong Ku Kim received the B.Eng. degree from Korea Aerospace University, Korea, in 1983, and the M.Eng. and Ph.D. degrees from the University of Southern California, Los Angeles, in 1985 and 1992, respectively. He was a Research Engineer with the Cellular Infrastructure Group, Motorola, until 1994, and he has been a Professor in the School of Electrical and Electronic Engineering, Yonsei University, Seoul, since 1994. He was a Director of the Radio Communication Research Center at Yonsei University and has also been a Director of the Qualcomm Yonsei CDMA Joint Research Lab since 1999. His main research interests are next generation (5G) communication, small cell technology, interference alignment, cooperative relaying networks, and compressive sensing. Prof. Kim is currently a director of Journal of Communications and Networks.
References
[1] Loa K., Wu C. C., Sheu S. T., Yuan Y., Chion M., Huo D., Xu L., “IMT-advanced relay standards,” IEEE Communications Magazine, vol. 48, no. 8, pp. 40-48, 2010. DOI: 10.1109/MCOM.2010.5534586
[2] Laneman J. N., Tse D. N. C., Wornell G. W., “Cooperative diversity in wireless networks: efficient protocols and outage behavior,” IEEE Trans. Inf. Theory, vol. 50, no. 12, pp. 3062-3080, 2004. DOI: 10.1109/TIT.2004.838089
[3] Zhao J., Kuhn M., Wittneben A., Bauch G., “Cooperative transmission schemes for decode-and-forward relaying,” in Proc. of PIMRC '07.
[4] Tse D., Viswanath P., Fundamentals of Wireless Communication, Cambridge University Press, 2005.
[5] Long H., Ziang W., Zhang Y., Wang J., Wang W., “Multiuser precoding and energy-efficient relaying scheme in multirelay systems,” in Proc. of ICC 2011.
[6] Talebi A., Krzymien W. A., “Multiple-antenna multiple-relay system with precoding for multiuser transmission,” in Proc. of IWCMC 2009.
[7] Caire G., Shamai S., “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inf. Theory, vol. 49, no. 7, pp. 1691-1706, 2003. DOI: 10.1109/TIT.2003.813523
[8] Phan A. H., Tuan H. D., Kha H. H., Nguyen H. H., “Beamforming optimization in multiuser amplify-and-forward wireless relay networks,” IEEE Trans. Wireless Commun., vol. 11, no. 4, pp. 1510-1520, 2012. DOI: 10.1109/TWC.2012.021512.111040
[9] Oyman O., Paulraj A. J., “Design and analysis of linear distributed MIMO relaying algorithms,” IEE Proc. Commun., vol. 153, no. 4, pp. 565-572, 2006. DOI: 10.1049/ip-com:20050406
[10] Chalise B. K., Vandendorpe L., “Optimization of MIMO relays for multipoint-to-multipoint communications: nonrobust and robust designs,” IEEE Trans. Signal Process., vol. 58, no. 12, pp. 6355-6368, 2010. DOI: 10.1109/TSP.2010.2077632
[11] Xing C., Li S., Fei Z., Kuang J., “How to understand linear minimum mean-square-error transceiver design for multiple-input-multiple-output systems from quadratic matrix programming,” IET Commun., vol. 7, no. 12, pp. 1231-1242, 2013. DOI: 10.1049/iet-com.2012.0651
[12] Spencer Q. H., Swindlehurst A. L., Haardt M., “Zero-forcing methods for downlink spatial multiplexing in multiuser channels,” IEEE Trans. Signal Process., vol. 52, no. 2, pp. 461-471, 2004. DOI: 10.1109/TSP.2003.821107
[13] Cho Y.M., Jang S., Kim D. K., Yang J., “Minimum sum-MSE design for distributed multirelay aided multiuser MIMO network,” in Proc. of IEEE MWSCAS, Aug. 2011.
[14] Palomar D. P., Cioffi J. M., Lagunas M. A., “Joint Tx-Rx beamforming design for multicarrier MIMO channels: a unified framework for convex optimization,” IEEE Trans. Signal Process., vol. 51, no. 9, pp. 2381-2401, 2003. DOI: 10.1109/TSP.2003.815393
[15] Boyd S., Vandenberghe L., Convex Optimization, Cambridge University Press, 2004.