In this paper, we propose two novel user selection algorithms for multiuser multipleinput and multipleoutput downlink wireless systems, in which both a base station (BS) and mobile stations (MSs) are equipped with multiple antennas. Linear transmit beamforming at the BS and receive combining at the MSs are used to avoid interference between users and find a better sumrate capacity performance. An optimal technique for selecting users would entail an exhaustive search, which in practice becomes computationally complex for a realistic number of users. Suboptimal algorithms with low complexity are proposed for a coordinated beamforming scheme. Simulation results show that the performance of the proposed algorithms is better than that provided by previous algorithms and is very close to an optimal approach with reduced complexity.
I. Introduction
Singleuser multipleinput and multipleoutput communications promise large gains for both channel capacity and reliability, essentially via the use of space–time codes (diversity gain oriented) combined with stream multiplexed transmission (rate maximization oriented)
[1]
. The situation with multiuser multipleinput and multipleoutput (MUMIMO) techniques is radically different as these techniques imply the use of spatial sharing of the channel by users. The sumrate capacity in a multiuser broadcast channel is defined to be the maximum aggregation of all user data rates. The optimum transmit strategy for an MUMIMO downlink involves a theoretical interference precancellation technique, known as dirty paper coding (DPC)
[2]
, combined with an implicit user scheduling and power loading algorithm; DPC achieves the capacity region of MUMIMO broadcast channels
[3]
.
Since DPC has high implementation complexity, lowcomplexity linear multiuser methods have been proposed; for example, block diagonalization (BD) [4] and coordinated beamforming (CBF)
[5]
,
[6]
. A key limitation of these techniques is that, for CBF, the maximum number of users that can be supported cannot exceed the number of transmit antennas; whereas, for BD, the number of transmit antennas must exceed the aggregate number of receive antennas across all users. In the CBF method, the base station (BS) sends a single data stream per user and removes the restriction on the number of receive antennas, existing in the BD scheme. CBF performs a joint optimization of beamforming and combining weight vectors to completely remove the interference between users, assuming full channel state information (CSI) at the BS. CSI can be obtained in time division duplex (TDD) systems if the BS estimates forward channels from the reverse ones through channel sounding and reciprocity
[7]
.
Broadly speaking, when the total number of users in an MUMIMO system is larger than the maximum number of supportable users, a BS is required to select a set of users to maximize the data rate. An optimal selection of users can be achieved via a brute force approach; however, this inevitably carries with it a heavy computational burden. To solve this problem, several user selection algorithms with different cost metrics have been proposed, obtaining a tradeoff between complexity and performance
[8]
–
[16]
. In
[8]
, the authors proposed an algorithm that builds a set of semiorthogonal users, semiorthogonal user selection, for terminals with a single antenna. In
[9]
, two suboptimal algorithms for BD are proposed —
c
algorithm and
n
algorithm. In each step of the
c
algorithm, it is the user that maximizes the sum rate who is selected; whereas,
n
algorithm selects the user that maximizes the channel Frobenius norm.
Jin and others
[10]
proposed a volumebased user selection (VUS) algorithm that maximizes the product of diagonal elements of the upper triangular matrix
R
after performing QR factorization. A similar approach was presented in
[11]
. The difference between these last two approaches lies in the fact that the latter is based on a determinant of a matrix composed of users’ channel matrices, so that the orthogonality as well as the channel quality of the selected users is measured. Furthermore, the authors in
[12]
demonstrated that the maximum energy efficiency and the corresponding sum capacity upper bound are both increasing functions of
λ
, where
λ
= det(
HH
^{H}
) can be viewed as the volume of the channel matrix
H
. In
[13]
and
[14]
, the authors proposed the chordal distance as a measure of orthogonality between channel spaces for user selection; they managed to obtain a performance comparable to that of
[11]
and a further reduction in complexity. Recently, Gupta and Chaturvedi
[15]
used the total conditional differential entropy as a measure to select users iteratively, which can be seen as an extension of
[11]
. The authors in
[15]
showed that the average sumrate capacity of their conditional entropy algorithm is strictly better than those of the
n
algorithm, volumealgorithm, and chordalalgorithm; even though the sumrate plots of the
c
algorithm and conditional entropy–based algorithm overlap, the flop count of the latter algorithm is significantly lower. None of the proposed algorithms have been developed for the particular case of serving the same number of users as transmit antennas (which have an indistinct number of receive antennas, as in the case of the CBF scheme).
Our goal is to find lowcomplexity user selection techniques that come close to attaining the maximum possible sumrate for the practical case wherein the number of downlink users is the same as the number of transmit antennas at the BS and multiple antennas at the user side. We propose two lowcomplexity user selection algorithms for CBF systems. The proposed algorithms perform close to an optimal exhaustive search and outperform previous algorithms while maintaining low complexity.
Throughout this paper, uppercase and lowercase boldface letters are used to denote matrices and vectors, respectively. We use (.)
^{T}
, (.)
^{H}
, (.), (.)
_{F}
, and (.)
^{†}
to indicate the transpose, conjugate transpose, 2norm, Frobeniusnorm and pseudoinverse of (.), respectively, where (.) can be a matrix or a vector. The
l
th column vector of
A
is represented by
A
(:,
l
). The number of elements of set {.} is denoted by card({.}). The ceiling and expectation operations are denoted by ⌈.⌉ and E{.}, respectively.
II. System Model and Background
Consider an MUMIMO downlink system with
N
_{t}
transmit antennas at the BS and
K
users that are equipped with
N
_{r}
receive antennas. We assume highly loaded scenarios (that is,
K
≫
N
_{t}
) in which the BS cannot serve all users simultaneously, because of the fact that serving more than
N
_{t}
users will result in a decreased capacity. Therefore, the system requires user selection to simultaneously attend to a subset of users.
Consider a subset of users,
Γ
, which has been scheduled for transmission, where card(
Γ
) =
K
_{S}
is the total number of selected users to be served simultaneously. Let
x_{k}
be the transmit signal of user
k
(
k
∈
Γ
). Each selected user will obtain a single string of data, and a signal,
y_{k}
, after applying a combiner vector,
w
_{k}
, to user
k
will be given by
(1) y k = w k H H k f k x k + w k H H k ∑ l∈Γ,l≠k f l x l + w k H n k ,
where the first term on the righthand side of (1) is the desired signal for user
k
; the second term is the interference from other users signals, and
n
_{k}
is a Gaussian noise vector with zero mean and variance
σ n 2 .
The beamforming weight vector is represented by
f
_{k}
. The channel between the transmitter and the
k
th user is represented by the matrix
H
_{k}
∈ ℂ
^{Nr×Nt}
with independent and identically distributed (i.i.d.) complex Gaussian entries.
 1. CBF
In CBF
[5]
,
[6]
, the transmitter chooses
f
_{k}
such that the subspace spanned by its columns is in the null space of
w l H H l (∀l ≠ k);
that is,
w l H H l f k =0
for
l
=1, …,
k
−1,
k
+1, …,
K
subject to 
w
_{k}

^{2}
= 1(∀
k
). Therefore,
f
_{k}
causes zero interference with the
l
th user. After removing the interference term in (1), the received signal is represented by
(2) y k = w k H H k f k x k + w k H n k .
At the receiver side, using the technique of maximalratio combining (MRC), which means that
w
_{k}
=
H
_{k}
f
_{k}
/
H
_{k}
f
_{k}
, is a reasonable choice for reaching transmission rates close to interferencefree capacity.
To compute beamforming vector
f
_{k}
, the BS first initializes a combining vector equal to the right singular vector corresponding to the maximum singular value. Let the singular value decomposition (SVD) of
H
_{k}
be
H k = U k Σ k V k H ,
then the combining vector is given for
w
_{k}
(0) =
U
_{k}
(:,1). Notation (
i
) is used for specifying the
i
th iteration; thus,
w
_{k}
(0) is the initial combining vector. Then, the transmitter calculates the beamforming vector as
(3) f k (1)= q k ‖ q k ‖ ( ∀k),
where
(4) [ q 1 ,…, q K ]= [ w 1 H H 1 ⋮ w K H H K ] † .
For iterations
i
≥ 2, if the difference 
f
_{k}
(
i
)−
f
_{k}
(
i
−1) is smaller than a predefined constant,
ε
, then the algorithm provides the results; otherwise, it moves to the next iteration,
i
+ 1, where the combining vector is recalculated by MRC prior to going back to (3). The convergence of the iterative algorithm is not guaranteed, but the method converges in almost all test cases
[5]
. For this reason, a maximum number of iterations are used; for example,
i
max = 50 for the case of
N
_{r}
=
N
_{t}
=
K
_{S}
= 4. The signaltointerferenceplusnoise ratio (SINR) of user
k
can be expressed as
(5) SINR k = P k  w k H H k f k  2 σ n 2 ,
where
P_{k}
represents the transmit power allocated to the
k
th user. The CBF scheme achieves the following sumrate capacity:
(6) C=E{ ∑ k log 2 ( 1+ SINR k ) }.
 2. User Preselecting Stage (UPS)
To reduce the complexity of the user selection algorithm, in
[16]
, the authors introduced a metric to evaluate the correlation among two MIMO channels,
H
_{i}
and
H
_{j}
. They used the square of the Frobenius norm of the average crosscorrelation matrix as an estimate, as follows:
(7) ξ i,j = ‖ H ˜ i H i H j H H ˜ j ‖ F 2 ( N r,i N r,j ) ,
where
H ˜
is the reciprocal of the Euclidean norm of the row vector of matrix
H
∈ ℂ
^{Nr×Nt}
whose diagonal elements are given by
(8) H ˜ =diag{ ‖ [ H ] 1 ‖ −1 , … , ‖ [ H ] N r ‖ −1 }.
With the UPS, a large fraction of users can be dropped off at intermediate iterations, which results in reduced processing time when user selection algorithms are applied.
III. Proposed User Selection Algorithms
In this section, we present two novel lowcomplexity user selection schemes for MUMIMO downlink channels based on the CBF method mentioned in Section II1.
Let
Ω
be the set of users to be served in the system,
Γ
the set of selected users for simultaneous transmission (whose card(
Γ
) =
N
_{t}
), and
i
the
i
th iteration.
 1. Coordinated Effective Channel Gain–Based User Selection (CECUS)
The basic idea of the CECUS algorithm is to greedily maximize the sum of the coordinated effective channel gain corresponding to the selected user subset; however, instead of using the CBF method for computing the weight vectors (which is iterative and does not converge in some cases), we propose to calculate the beamforming and combining weight vectors in a single step within the user selection process. The detailed user selection process is provided below.

1) Initialization: leti= 1,Ωi= {1, 2, …,K},Γ= ∅Nt×1and computeπk=‖Hk‖F2,∀k∈Ωi.

2) Select the first useru1with the highest power gain asu1=arg max k∈Ωiπkand updatei=i+ 1,Ωi=Ωi−1− {ui−1},Γ=Γ+ {ui−1}.

3) Implement the UPS and obtain the new setΩi={k∈Ωiξui−1,k<α}.

4) Fork∈Ωi,i) LetΨbe a new provisional set of usersΨ=Γ+k, where card(Ψ) =L.ii) Find the combinerwl=Ul(:,1),∀l∈Ψ,and obtain[q1, … ,qL,]={[(w1HH1)T , … ,(wLHHL)T]T}†.Then, the normalized beamforming vector is given byfl=ql/ql. Compute the sum of the effective channel gain asEffk=∑u∈ΓwuHHufu2+wkHHkfk2.iii) Selectui=arg max k∈ΩiEffkand updatei=i+ 1,Ωi=Ωi−1− {ui−1},Γ=Γ+ {ui−1}; if card(Γ)

5) Finish the process of user selection and perform the CBF algorithm described in Section II2 on the selected user setΓ.
Our first proposal is inspired by the capacitybased suboptimal user selection presented in
[9]
, which was specially designed for BD as a precoding method. The first contrast with our algorithm is found in step 1, where instead of choosing the first user with the highest individual throughput, the first user is selected to maximize the overall energy of the channel; that is, the sum of the eigenvalues of
HH
^{H}
equals
‖ H ‖ F 2 .
The second difference is in step 3, where we apply the UPS mentioned in Section II. In this step, we propose to use a threshold value (a positive small number),
α
, so that
Ω_{i}
is nonempty, and card(
Γ
), the number of selected users, satisfies card(
Γ
) =
N
_{t}
. It should be taken into consideration that if
α
is too small, then the multiuser diversity gain decreases; on the other hand, if
α
is too large, then there is an increase in the complexity. More details are shown in Section V.
It is necessary to observe a change in the dimension of channel matrix
H
by using combiners ((
w
^{H}
H
) ∈ ℂ
^{1×Nt}
), so they can calculate beamforming vectors using the pseudoinverse, as in the case of
[8]
, for systems with a single antenna; moreover, the combiner vectors need to be calculated only once for the users that are in
Ω_{i}
.
 2. FrobeniusNorm Grouping–Based User Selection (FGUS)
In this section, to further reduce the complexity of the CECUS algorithm, we propose the FGUS algorithm. In the FGUS algorithm, users are sorted according to the energy of the channel, then rearranged to form a matrix
χ
containing a user index. The main difference between the algorithms is that instead of performing an iterative search among all remaining users, the searching process only involves users belonging to the
i
th column of
χ
, thereby reducing the complexity of the algorithm described in Section III1. Our algorithm is summarized by the following steps:

1) Initialization: leti= 1,Ωi= {1, 2, …,K},Γ= ∅Nt×1and computeπk=‖Hk‖F2,∀k∈Ωi;sortπkin descending order and store the index corresponding to each user inγ. Letχ=redim(γ,⌈K/Nt⌉,Nt), where redim(A,b,c) returns a returns a matrix of sizeb×crearranging the vectorA.

2) Select the first useru1with the highest power gain asu1=χ(1,1)and updatei=i+ 1,Γ=Γ+ {ui−1}

3) For allk∈χ(:,i),i) let a new set of users,Ψ=Γ+χ(k,i).ii) for alll∈ Ψ, find the combinerwl=Ul(:, 1) and obtain [(q1), …, (qL)]; then, the normalized beamforming vector isfl=ql/ql, whereL= card(Ψ).iii) Compute Effkjust as in the previous algorithm.

4) Select the index of userui=arg max k∈χ(:,i)Effkand updatei=i+ 1,Γ=Γ+ {ui−1}; if card(Γ) =Nt, then go to step 3; otherwise, terminate and go to step 5.

5) Repeat step 5 of the algorithm of Section III1.
In step 1, the algorithm computes the Frobeniusnorm for
K
users in the cell; users are then sorted in descending order according to the channel energy. The algorithm then divides users into
N
_{t}
groups. In step 2, the algorithm selects the user with the highest channel energy. In the algorithm presented in this section, the preselection is deleted, because the number of users in each iteration is ⌈
K
/
N
_{t}
⌉, thus avoiding the extensive work to find appropriate values of
α
. In step 3, the algorithm takes the already selected users and users in the
i
th group to form a new user set, then computes the effective channel gain of the new set. In step 4, the user that maximizes the effective channel gain in the
i
th group is selected. Steps 3 and 4 are repeated until the selected users are equal to
N
_{t}
. Finally, in step 5, the algorithm updates the beamforming and combiner vectors using the CBF algorithm described in Section II2.
IV. Complexity Analysis
For a complexvalued matrix
H
∈ ℂ
^{N×M}
, and vectors
w
∈ ℂ
^{N×1}
and
f
∈ ℂ
^{M×1}
, we evaluate the user selection algorithms quantitatively. The complexity is measured by the number of real floatingpoint operations, known as flops
[17]
and denoted by
F
. Matrixvector multiplication takes the form
F
_{Hf}
= 2
NM
. The sesquilinear form of
w
^{H}
Hf
is
F
_{wHHf}
= 2
NM
+
N
− 1. Each multiplication of nonsquare matrices
A
∈ ℂ
^{N×M}
and
B
∈ ℂ
^{M×P}
has
F
_{AB}
= 2
NMP
flops. The SVD complexity of a complete matrix is
F
_{svd(H)}
= 4
NM
^{2}
+ 8
N
^{2}
M
+ 9
N
^{3}
, which is between four and sixtimes higher than that of a real matrix
[18]
. A computationally simple and accurate way to calculate the pseudoinverse is by using the SVD; the computational cost for the pseudoinverse is dominated by the cost of the SVD, which is several times higher than that of matrixmatrix multiplication.
Table 1
summarizes the flop counts of the various aforementioned matrix operations.
Complexity analysis of matrix operations.
Matrix operations  Flop count approximate 
Frobenius norm $\left({\text{\Vert}H\text{\Vert}}_{\text{F}}^{2}\right)$  4MN 
Pseudoinverse, SVD  24NM^{2} + 48N^{2}M + 54N^{3} 
Multiplication by a vector  2MN 
Multiplication  2MNP 
Sesquilinear form  2NM + 2N − 1 
 1. Algorithm 1: CECUS
The complexity of the proposed algorithms is divided into three parts —
F
_{a}
corresponding to the first user selection,
F
_{b}
to the user selection, and
F
_{c}
to the extra
ξ_{i,j}
calculations needed for implementing the UPS strategy. For the CECUS algorithm, we first compute the square Frobenius norm of
K
users, assuming that after the preselection stage, we have card(
Ω_{i}
) =
β_{i}K
users, where 1 >
β
_{1}
> ⋯ >
β
_{KS}
> 0.
F Alg1 = F a + F b + F c ,
where
F a =4K N r N t ,
F b = ∑ i=2 K S { ( β i K−i+1 )×( 24(i) N t 2 +48 (i) 2 N t +54 (i) 3 +⋯+2 N r N t + N r −1 ) } +( β 1 K+1 )×( 24 N r N t 2 +48 N r 2 N t +⋯+ 54 N r 3 +2 N r N t ),
F c = ∑ i=2 K S ( β i−1 K−i−1 )× ( 8 N r 2 N t +8 N r N t −2 N r N t 2 ).
 2. Algorithm 2: FrobeniusNorm Grouping–Based User Selection
In the FGUS algorithm, we propose a subgrouping of users to further reduce the complexity of the user selection algorithm. In the CECUS algorithm,
F
_{a}
is the same as in the FGUS algorithm; nonetheless, in
F
_{b}
,
K
is replaced by
K ^ =⌈ K/ N t ⌉
and
F
_{c}
is not considered in the last algorithm as the UPS is not applied; thus, we have
β_{i}
= 1, ∀
i
, and
F
_{c}
= 0.
F Alg2 = F a + F b ,
where
F b = ∑ i=2 K S { ( K ^ −i+1 ) × ( 24 (i) 2 N t +48(i) N t 2 +54 (i) 3 +2 N r N t + N r −1 ) } +( K− K ^ +1 )×( 24 N r 2 N t + 48 N r N t 2 +54 N r 3 +2 N r N t ).
For completeness of the achievable results, in this section, we present the total flop count of lowcomplexity algorithms proposed in the literature; that is, for VUS
[11]
, chordal distance–based user selection (ChUS)
[14]
, and conditional entropy–based user selection (CEUS)
[15]
. The complexity of the volumebased algorithm is as follows:
(9) F VUS ≈ ∑ i=2 K s { ( 8 N t 2 N r +8 N t N r 2 + 4 3 N r 3 − 3 2 N r 2 + 13 6 N r ) × ( K−i+1 ) } +( 8 N t N r 2 + 4 3 N r 3 − 3 2 N r 2 + 19 6 N r )×K.
The complexity of the ChUS algorithm is expressed as
(10) F ChUS ≈ ∑ i=2 K s { [ 8 (i−1) 2 N r 2 N t −2(i−1) N r N t +7(i−1) N r N t 2 ]+ [ 8 N r 2 N t −2 N r N t +7 N r N t 2 +4 N t 2 ]×( K−i+1 ) } +4K N r N t .
The total flop count of the CEUS algorithm is
(11) F CEUS ≈ ∑ i=1 K s { F Ω ×( i−1 )+ [ 4 3 N r 3 − 3 2 N r 2 + 19 6 N r + 8 N t 2 N r +8 N t N r 2 ]×i } ×( K−i+1 )+K× F Ω ,
where
F_{Ω}
is defined in
[15]
as
(12) F Ω = 32 N t 2 N r +16 N t N r 2 +2 N t 2 + N r +4 N r 3 − 1 2 N r 2 − 3 2 N r .
V. Simulation Results
In this section, we show simulation results to compare the performance of the proposed algorithms. In our simulations we take into account the following assumptions: (i) channels between users and between each pair of antennas are considered independent; (ii) unitary total power and uniform distribution of power between users; (iii) perfect channel estimation at the receiver; and (iv) complete knowledge of user channels in the BS. Unlike the algorithm proposed in
[19]
for frequency division duplex systems, our proposal is designed for TDD systems that employ reciprocity to estimate forward channels from reverse channels. We compare the performance of the following algorithms:

■ Optimal user selection by complete search (optimalcbf)

■ CECUS and FGUS algorithms, described in Section III

■ Volumebased user selection (VUS)[11]

■ Chordal distance–based user selection (ChUS)[14]

■ Conditional entropy–based user selection (CEUS)[15]

■ Roundrobin algorithm forNtsimultaneous users (RUS, random user selection).
Figure 1
shows the results of the sumrate capacity against the values of alpha (
α
) for
N
_{r}
= 4,
N
_{t}
= 4, SNR = 20 dB and the total number of users,
K
, in the range of 8 to 100. We observe that if
α
is too small, then the sumrate decreases due to the loss in the diversity gain. In terms of sumrate, the optimal value of
α
for
K
≥ 40 is
α
> 0.26. However, remember that the higher the value of
α
, the higher the complexity in the user selection algorithm.
Sumrate vs. alpha values (α) when N_{r} = 4, N_{t} = 4.
Figure 2
shows the results of the sumrate capacity against the number of users when the number of transmit antennas is four and the number of receive antennas for each user terminal is four. The SNR is 20 dB and
K
_{S}
= 4. It is shown that the sumrate of the CECUS (
Alg1
for disambiguation) algorithm proposed in this paper is better than the CEUS, ChUS, and VUS algorithms. The FGUS algorithm (
Alg2
for disambiguation) has a lower performance than Alg1, but as the number of user’s increases, Alg2 approaches the CEUS algorithm. Unless otherwise stated, in the simulations of Alg1,
β_{i}
= 1, ∀
i
. In the case of Alg1 with UPS, different
α
values are taken for different numbers of users; in
Fig. 2
, we have
α
= {0.35, 0.32, 0.31, 0.30, 0.29, 0.285, 0.28, 0.27} for
K
= {8, 12, 16, 20, 40, 80, 100}. It is of interest to consider solutions to optimize the values of alpha. We defer this, however, to future work.
Sumrate vs. number of users when N_{r} = 4, N_{t} = 4.
Figure 3
shows the results of the sumrate capacity against the number of users when the number of transmit antennas is eight and the number of receive antennas for each user terminal is four. The SNR is 20 dB,
K
_{S}
= 8, and we have
α
= {0.18, 0.175, 0.17, 0.165, 0.16, 0.155, 0.15, 0.145} for
K
= {16, 24, 32, 40, 60, 80, 104, 128, 160}. It can be seen, again, that Alg1 outperforms CEUS, ChUS, and VUS. As the number of users increases, Alg2 outperforms the VUS algorithm and its performance approaches that of the CEUS algorithm.
Sumrate vs. number of users when N_{r} = 4, N_{t} = 8.
Figure 4
compares the number of flops of various lowcomplexity user selection algorithms for
N
_{r}
= 4,
N
_{t}
= 4. It is observed that both the ChUS and the VUS require fewer flops among the lowcomplexity scheduling algorithms; however, recall from
Fig. 2
that they present poor performance in terms of sumrate. Our proposed Alg1 and Alg2 require fewer flops than the CEUS algorithm. An interesting result is observed by Alg1 with UPS, where the number of flops approaches Alg2. This is because when increasing the number of users, the optimal value of
α
decreases (as seen from
Fig. 1
); thus, in this way, the complexity is decreased yet the algorithm still provides a good sumrate performance.
Number of flops vs. number of users when N_{r} = 4, N_{t} = 4.
Figure 5
shows the number of flops of the scheduling algorithms with respect to the total number of users in a system with eight transmit antennas and four receive antennas. It is observed that when
N
_{t}
>
N
_{r}
, the proposed algorithms achieve a better tradeoff between complexity and performance. It appears that the proposed Alg2 (FGUS) is a good option when the system has a large number of users and a strategy of low complexity is sought for practical implementation of nextgeneration wireless systems such as 3GPP LTE Advanced.
Number of flops vs. number of users when N_{r} = 4, N_{t} = 8.
VI. Conclusion
In this paper, two suboptimal user selection algorithms for MUMIMO systems based on a coordinated beamforming criterion were proposed. The core idea behind the proposed algorithms is to maximize the sum of the coordinated effective channel gain of a set of selected users. Simulation results show that the behavior of the proposed algorithms for a reasonable number of users (< 200) is comparable to the performance obtained by a full search, and the sumrate capacity is better than that provided by previous algorithms proposed in the literature; moreover, our proposal has some advantages — maintaining low complexity (it is also proved that a tradeoff between low complexity and high sumrate upon applying the UPS step can be found), no restrictions for selecting the number of receiving antennas, and the ability to cater to a number of users equal to the number of transmit antennas.
This research was supported by CONACYT Mexico.
BIO
Corresponding Author fmaciel@cicese.edu.mx
Fermín Marcelo MacielBarboza received his BS degree in communications and electronics engineering and his MEng degree from Universidad de Colima, Mexico, in 2010 and 2012, respectively. He is currently working toward his PhD degree in electronics and telecommunications at CICESE Research Center, Ensenada, Mexico. His research interests include multiuser MIMO communications and cooperative networks.
lsoriano@ucol.mx
Leonel SorianoEquigua received his BS degree in communications and electronics engineering from Universidad de Colima, México, in 1997 and his MS and PhD degrees in electronics and telecommunications from CICESE Research Center, Ensenada, Mexico, in 2000 and 2011, respectively. He was an exchange visitor in the ECE department at the University of Texas, Austin, TX, USA, in 2007 and 2013. Since 2000, he has been an assistant professor with the Faculty of Mechanical and Electric Engineering, Universidad de Colima. His area of interest is MIMO communications.
jasan@cicese.mx
Jaime SánchezGarcía received his BS degree in communications and electronics from Instituto Politécnico Nacional, Mexico City, Mexico, in 1976; his MS degree in electronics and telecommunications from CICESE, Ensenada, Mexico, in 1979; and his DS degree in electrical engineering, with a major in communications from the George Washington University, Washington, DC, USA, in 2001. Since 1979, he has held a research and faculty position with the Electronics and Telecommunications Department, CICESE. He was a visiting scholar at the University of Arizona, Tucson, USA, in 1997 and at the University of Texas, Austin, USA, in 2008. His current research interests include wireless networks, softwaredefined radio, channel modeling, multicarrier modulation (OFDM), and MIMOOFDM.
frcsoria@bianni.unistmo.edu.mx
Francisco Rubén CastilloSoria received his MS in telecommunications engineering from Instituto Politécnico NacionalESIME, Research and Graduate Section, Mexico City, Mexico, in 2004 and his PhD degree in electronics and telecommunications from CICESE Research Center, Ensenada, Mexico, in 2015. Since 2005, he has been a research professor at Unistmo University, Oaxaca, Mexico. His current research interests include spatial modulation, STF coding, MIMO multiplexing, and OFDM.
vcastillo@ucol.mx
Victor Hugo Castillo Topete received his PhD degree in computer science from CICESE Research Center, Ensenada, México, in 2010. He is currently a professor of computer science at FIME, University of Colima, Mexico. He has been involved with several experiences using knowledgebased systems for managing telecommunication processes.
Gesbert D.
2007
“Shifting the MIMO Paradigm,”
IEEE Signal Process. Mag.
24
(5)
36 
46
Peel C.B.
,
Hochwald B.M.
,
Swindlehurst A.L.
2005
“A Vector Perturbation Technique for NearCapacity Multiantenna Multiuser Communications,”
IEEE Trans. Commun.
53
(1)
195 
202
DOI : 10.1109/TCOMM.2004.840638
Spencer Q.
,
Swindlehurst A.L.
,
Haardt M.
2004
“Zero Forcing Methods for Downlink Spatial Multiplexing in Multiuser MIMO Channels,”
IEEE Trans. Signal Process.
52
(2)
462 
471
Chae C.B.
2008
“Coordinated Beamforming for the Multiuser MIMO Broadcast Channel with Limited Feedforward,”
IEEE Trans. Signal Process.
56
(12)
6044 
6056
DOI : 10.1109/TSP.2008.929869
SorianoEquigua L.
2011
“Noniterative Coordinated Beamforming for Multiuser MIMO Systems with Limited Feedforward,”
IEEE Signal Process. Lett.
18
(12)
701 
704
DOI : 10.1109/LSP.2011.2171483
Goldsmith A.
2005
“Wireless Communications,”
Cambridge University Press
New York, USA
422 
424
Yoo T.
,
Goldsmith A.
2006
“On the Optimality of Multiantenna Broadcast Scheduling Using ZeroForcing Beamforming,”
IEEE J. Sel. Areas Commun.
24
(3)
528 
541
DOI : 10.1109/JSAC.2005.862421
Shen Z.
2006
“Low Complexity User Selection Algorithms for Multiuser MIMO Systems with Block Diagonalization,”
IEEE Trans. Signal Process.
54
(9)
3658 
3663
DOI : 10.1109/TSP.2006.879269
Jin L.
,
Gu X.
,
Hu Z.
2011
“LowComplexity Scheduling Strategy for Wireless Multiuser MultipleInput MultipleOutput Downlink System,”
IET Commun.
5
(7)
990 
995
DOI : 10.1049/ietcom.2010.0358
Ko K.
,
Lee J.
2012
“DeterminantBased MIMO Scheduling with Reduced Pilot Overhead,”
EURASIP J. Wireless Commun. Netw.
2012
(1)
71 
DOI : 10.1186/16871499201271
Sun C.
2014
“Low Complexity User Scheduling Algorithm for EnergyEfficient Multiuser MultipleInput MultipleOutput Systems,”
IET Commun.
8
(3)
343 
350
DOI : 10.1049/ietcom.2013.0328
Zhou B.
“Chordal DistanceBased User Selection Algorithm for the Multiuser MIMO Downlink with Perfect or Partial CSI,”
Proc. IEEE AINA
Biopolis, Singapore
Mar. 22–25, 2011
77 
82
Youtuan Z.
,
Zhihua T.
,
Jinkang Z.
“An Improved NormBased User Selection Algorithm for Multiuser MIMO Systems with Block Diagonalization,”
Proc. IEEE Veh. Technol. Conf.
Baltimore, MD, USA
Sept. 30–Oct. 3, 2007
601 
605
Golub G.H.
,
Van Loan C.F.
1996
“Matrix Computations,”
3rd ed.
The John Hopkins Univ. Press
Baltimore, MD, USA
Bae Y.
,
Lee J.
2011
“Low Complexity Antenna Selection for VBLAST Systems with OSIC Detection,”
EURASIP J. Wireless Commun. Netw.
2011
(1)
6 
DOI : 10.1186/1687149920116
Kum D.
,
Kang D.
,
Choi S.
2014
“Novel SINRBased User Selection for an MUMIMO System with Limited Feedback,”
ETRI J.
36
(1)
62 
68
DOI : 10.4218/etrij.14.0113.0267