Optimal Power Allocation for Channel Estimation of OFDM Uplinks in Time-Varying Channels
ETRI Journal. 2015. Feb, 37(1): 11-20
Copyright © 2015, Electronics and Telecommunications Research Institute(ETRI)
  • Received : June 25, 2014
  • Accepted : September 11, 2014
  • Published : February 01, 2015
Rugui Yao
Yinsheng Liu
Geng Li
Juan Xu

Abstract
This paper deals with optimal power allocation for channel estimation of orthogonal frequency-division multiplexing uplinks in time-varying channels. In the existing literature, the estimation of time-varying channel response in an uplink environment can be accomplished by estimating the corresponding channel parameters. Accordingly, the optimal power allocation studied in the literature has been in terms of minimizing the mean square error of the channel estimation. However, the final goal for channel estimation is to enable the application of coherent detection, which usually means high spectral efficiency. Therefore, it is more meaningful to optimize the power allocation in terms of capacity. In this paper, we investigate capacity with imperfect channel estimation. By exploiting the derived capacity expression, an optimal power allocation strategy is developed. With this developed power allocation strategy, improved performance can be observed, as demonstrated by the numerical results.
I. Introduction
Orthogonal frequency-division multiplexing (OFDM) modulation transmits data in parallel by modulating a number of orthogonal subcarriers and has been widely used in modern communication systems [1]. In an OFDM system, the frequency-selective channel is converted into multiple flat-fading subchannels, which greatly simplifies the design of the equalizer in the receiver [2].
The optimization problem for the channel estimation of an OFDM system has been widely addressed in the existing literature [3]–[7]. For pilot-based channel estimation, an optimal interpolation can be achieved using a two-dimensional Wiener filter [3]. However, due to the complexity of its implementation, an optimal Wiener interpolator is impractical; thus, other interpolators have to be adopted [6], [7]. The optimal pilot pattern in terms of sampling efficiency has been suggested in [4] and further addressed in [5]. The optimal pilot design for capacity maximization has also been investigated [8]. However, since pilot-based channel estimation assumes that the channel is unchanged within one OFDM symbol, the corresponding optimization is not valid when the channel changes within one OFDM symbol.
For an even faster time-varying channel response, a series of basis expansion model (BEM)-based algorithms have been proposed in [9]–[12], where different kinds of basis functions are employed to model the time-varying channel. Mathematically, the BEM can be considered as an application of the rank reduction decomposition [13], and an optimal decomposition strategy has been developed in terms of minimizing the mean square error (MSE) [12]. Although BEM-based algorithms allow a channel to vary within one OFDM symbol, they have several drawbacks [14]. Due to the finite expansion order of the basis functions for the channel response, the inherent model error caused by truncation cannot be avoided. Moreover, the estimated model coefficients are only valid for one symbol duration; thus, they have to be re-estimated on a symbol-by-symbol basis.
Recently, algorithms for estimating time-varying channels in a macrocellular uplink environment have been proposed in [15]–[17]. Accordingly, the optimal power allocation ratio of training symbol to data symbol has been derived in [17] in terms of minimizing the MSE of the channel estimation. Although this kind of optimum power allocation strategy is useful for channel estimation, the final goal for channel estimation is to enable the application of coherent detection, which usually means high spectral efficiency. Therefore, the optimization is more meaningful if carried out in terms of maximizing the capacity rather than minimizing the MSE. As an extension of the research of [17], capacity is chosen as the performance indicator in this paper. Correspondingly, the power allocation strategy is re-derived in this paper, aiming to maximize the capacity for OFDM transmission with imperfect channel estimation.
The rest of this paper is organized as follows. The system model is described in Section II. The expression of capacity with imperfect channel estimation is derived in Section III. The practical capacity with the inverse channel detection (ICD) estimator is derived in Section IV. In Section V, the optimal power allocation strategy is discussed. Numerical results are presented in Section VI. Finally, the conclusion is given in Section VII.
II. System Model
- 1. Signal Model
For OFDM transmission in a time-varying channel, the detected signal can be represented in vector form as
$$\mathbf{y} = \mathbf{H}\mathbf{d} + \mathbf{w}, \tag{1}$$
where $\mathbf{d} = (d_0, d_1, \ldots, d_{N-1})^T$ denotes the frequency-domain symbols with covariance matrix $E(\mathbf{d}\mathbf{d}^H) = \sigma_d^2\mathbf{I}$, and $\mathbf{w} = (w_0, w_1, \ldots, w_{N-1})^T$ is the additive noise with covariance matrix $E(\mathbf{w}\mathbf{w}^H) = \sigma_w^2\mathbf{I}$. Here, $\mathbf{I}$ is an $N \times N$ identity matrix and $N$ is the number of subcarriers of the OFDM transmission. Usually, it is assumed that $\mathbf{d}$ and $\mathbf{w}$ have zero means. The symbol $\mathbf{H}$ represents the channel matrix, whose $(m, n)$th element is given by
$$[\mathbf{H}]_{(m,n)} = \frac{1}{N}\sum_{l=0}^{L-1}\sum_{k=0}^{N-1} h[l,k]\, e^{j\frac{2\pi(n-m)k}{N}}\, e^{-j\frac{2\pi nl}{N}}, \tag{2}$$
where $h[l,k]$ denotes the time-varying channel response for the current OFDM symbol. The variable $k$ denotes the discrete sample time, and $l$ denotes the index of the resolvable path. As shown in [17], $h[l,k]$ can be explicitly represented as
$$h[l,k] = h_l\, e^{j\frac{2\pi v_l k}{N}} \tag{3}$$
for the uplink environment, where $h_l$ and $v_l$ are the complex amplitude and Doppler shift of the $l$th path, respectively. This channel model is particularly suitable for uplink transmission, where the angular spread is small for the non-resolvable components of a channel tap. Therefore, the Doppler shifts of those non-resolvable components inside a channel tap are approximately the same and can thus be combined [18]. Similar models are also adopted in [19], [20].
It should be noted that due to the fast time variation scenario, the channel response cannot hold unchanged during one OFDM transmission; thus, H is not a diagonal matrix. This is essentially different from the slow time variation scenario, where the channel matrix is diagonal [21] .
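To make the non-diagonal structure concrete, the following minimal sketch builds $\mathbf{H}$ directly from (2) and (3) and compares a static channel with one that has non-zero Doppler shifts. The function names, the small size N = 8, and the path amplitudes are illustrative assumptions, not values from the paper; only the Doppler shifts 0.2, 0.1, and 0.1 are borrowed from Section VI.

```python
import cmath

def doppler_channel_matrix(N, amps, dopplers):
    """Build H from (2), using the path model (3): h[l,k] = h_l * exp(j*2*pi*v_l*k/N)."""
    L = len(amps)
    H = [[0j] * N for _ in range(N)]
    for m in range(N):
        for n in range(N):
            acc = 0j
            for l in range(L):
                for k in range(N):
                    h_lk = amps[l] * cmath.exp(2j * cmath.pi * dopplers[l] * k / N)
                    acc += (h_lk
                            * cmath.exp(2j * cmath.pi * (n - m) * k / N)
                            * cmath.exp(-2j * cmath.pi * n * l / N))
            H[m][n] = acc / N
    return H

def max_off_diagonal(H):
    """Largest magnitude among the off-diagonal entries of H."""
    N = len(H)
    return max(abs(H[m][n]) for m in range(N) for n in range(N) if m != n)

amps = [1.0, 0.5 + 0.5j, 0.3j]  # hypothetical complex path amplitudes h_l
H_static = doppler_channel_matrix(8, amps, [0.0, 0.0, 0.0])
H_fast = doppler_channel_matrix(8, amps, [0.2, 0.1, 0.1])

print("static channel, max off-diagonal:", max_off_diagonal(H_static))
print("time-varying channel, max off-diagonal:", max_off_diagonal(H_fast))
```

With all Doppler shifts set to zero, the inner sum over $k$ cancels for $m \neq n$ and only the diagonal survives; with non-zero shifts the off-diagonal entries (inter-carrier interference) remain.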
- 2. Channel Estimation
To recover the transmitted data symbols, the channel matrix $\mathbf{H}$ should be known to the receiver. With the method proposed in [17], $h[l,k]$ can be regenerated by estimating the complex amplitudes and Doppler shifts. Let $\hat{h}[l,k]$ and $\hat{\mathbf{H}}$ denote the regenerated channel response and the regeneration of $\mathbf{H}$, respectively. We then have
$$[\hat{\mathbf{H}}]_{(m,n)} = \frac{1}{N}\sum_{l=0}^{L-1}\sum_{k=0}^{N-1} \hat{h}[l,k]\, e^{j\frac{2\pi(n-m)k}{N}}\, e^{-j\frac{2\pi nl}{N}}. \tag{4}$$
Correspondingly, the estimation errors for $h[l,k]$ and $\mathbf{H}$ can be given, respectively, as $\tilde{h}[l,k] = \hat{h}[l,k] - h[l,k]$ and $\tilde{\mathbf{H}} = \hat{\mathbf{H}} - \mathbf{H}$. Likewise, a similar relation between $\tilde{h}[l,k]$ and $\tilde{\mathbf{H}}$ can be given as
$$[\tilde{\mathbf{H}}]_{(m,n)} = \frac{1}{N}\sum_{l=0}^{L-1}\sum_{k=0}^{N-1} \tilde{h}[l,k]\, e^{j\frac{2\pi(n-m)k}{N}}\, e^{-j\frac{2\pi nl}{N}}. \tag{5}$$
With the estimated channel matrix Ĥ , the transmitted data symbols can be recovered. However, due to the presence of the estimation error, the performance obtained is worse than that obtained with perfect channel estimation. Therefore, it is necessary to evaluate the system performance with imperfect channel estimation.
III. Capacity
In this paper, the capacity is chosen as the performance indicator since it is a good metric of the channel efficiency without having to consider implementation details.
- 1. Capacity with Perfect Channel Estimation
Assuming that the channel estimation is perfect, the capacity for OFDM transmission can be given as
$$C_{\mathrm{perf}} = \frac{1}{N}\max_{p_{\mathbf{d}}(\mathbf{d})}\{I(\mathbf{y};\mathbf{d})\} \quad \text{bits/symbol}, \tag{6}$$
where $p_{\mathbf{d}}(\cdot)$ denotes the probability density function of $\mathbf{d}$. The mutual information between $\mathbf{y}$ and $\mathbf{d}$ is denoted by $I(\mathbf{y};\mathbf{d}) = H(\mathbf{y}) - H(\mathbf{y}|\mathbf{d})$, where $H(\cdot)$ denotes the information entropy. Similar to the derivation for the multi-input and multi-output (MIMO) system in [21], the mutual information in (6) can achieve a maximum only when $\mathbf{d}$ is a Gaussian distributed random vector. In this case, the capacity for OFDM transmission with perfect channel estimation is given as [22]
$$C_{\mathrm{perf}} = \frac{1}{N}\log_2\det\left(\mathbf{I} + \frac{\sigma_d^2}{\sigma_w^2}\mathbf{H}\mathbf{H}^H\right), \tag{7}$$
where det(·) denotes the determinant of a matrix.
Note that the capacity representations for an OFDM system and MIMO system are actually the same. This is not surprising because both systems have the same system model in (1). However, the difference is that for a MIMO system, the elements in H can be considered as independently distributed, while for an OFDM system, the elements in H are actually correlated.
- 2. Capacity with Practical Channel Estimation
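Consider now that the estimate of $\mathbf{H}$ is imperfect with an estimation error $\tilde{\mathbf{H}} = \hat{\mathbf{H}} - \mathbf{H}$. Since $\hat{\mathbf{H}}$ is known, the received signal in (1) can be rewritten as
$$\mathbf{y} = \hat{\mathbf{H}}\mathbf{d} + \mathbf{u}, \tag{8}$$
where $\mathbf{u} = \mathbf{w} - \tilde{\mathbf{H}}\mathbf{d}$ can be considered as the additive noise. Bearing in mind that $\mathbf{d}$ has zero mean, the correlation matrix $\mathbf{R}_{uu} = E(\mathbf{u}\mathbf{u}^H)$ can be given as
$$\mathbf{R}_{uu} = \sigma_w^2\mathbf{I} + \sigma_d^2 E(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H). \tag{9}$$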
From (9), it is observed that the data symbol also contributes to the overall noise power due to the presence of channel estimation error.
With Ĥ known to the receiver, the capacity for OFDM transmission in the presence of channel estimation error, as shown in (8), can be given as
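$$C_{\mathrm{prac}} = \frac{1}{N}\max_{p_{\mathbf{d}}(\mathbf{d})}\{I(\mathbf{y};\mathbf{d})\} \quad \text{bits/symbol}, \tag{10}$$
where
$$I(\mathbf{y};\mathbf{d}) = H(\mathbf{d}) - H(\mathbf{d}|\mathbf{y}) \tag{11}$$
denotes the mutual information for the practical channel estimator.
To achieve the maximal mutual information, assume that the data vector $\mathbf{d}$ is Gaussian distributed; thus, we have [21]
$$H(\mathbf{d}) = \log_2\det\left(\pi e\,\sigma_d^2\mathbf{I}\right). \tag{12}$$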
Note that under this assumption, y and u are also Gaussian distributed since both vectors are a linear combination of Gaussian vectors.
Recalling the relation in (8), we find that the transmitted data symbol vector d can be estimated linearly if given y and Ĥ . Different kinds of linear receivers can be employed for this purpose. In this section, we adopt the minimum-MSE (MMSE) receiver [13] . As shown in [22] , the MMSE receiver is information-lossless; thus, the following equation can hold
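$$I(\mathbf{d};\mathbf{y}) = I(\mathbf{d};\hat{\mathbf{d}}_{\mathrm{MMSE}}), \tag{13}$$
where $\hat{\mathbf{d}}_{\mathrm{MMSE}}$ denotes the MMSE estimate of $\mathbf{d}$. From (13), it is easy to obtain $H(\mathbf{d}|\mathbf{y}) = H(\mathbf{d}|\hat{\mathbf{d}}_{\mathrm{MMSE}})$, the right-hand side of which can be further rewritten as $H(\mathbf{d}|\hat{\mathbf{d}}_{\mathrm{MMSE}}) = H(\tilde{\mathbf{d}}_{\mathrm{MMSE}})$, where $\tilde{\mathbf{d}}_{\mathrm{MMSE}} = \hat{\mathbf{d}}_{\mathrm{MMSE}} - \mathbf{d}$ denotes the estimation error of the MMSE estimator. Therefore, we can deduce that
$$H(\mathbf{d}|\mathbf{y}) = H(\tilde{\mathbf{d}}_{\mathrm{MMSE}}). \tag{14}$$
Noting that $\mathbf{y}$ is a Gaussian distributed vector, $\hat{\mathbf{d}}_{\mathrm{MMSE}}$ is also Gaussian distributed since it is a linear function of $\mathbf{y}$. Correspondingly, $\tilde{\mathbf{d}}_{\mathrm{MMSE}}$ is a Gaussian vector as well; thus, we have
$$H(\mathbf{d}|\mathbf{y}) = H(\tilde{\mathbf{d}}_{\mathrm{MMSE}}) = \log_2\det\left[\pi e\,E\left(\tilde{\mathbf{d}}_{\mathrm{MMSE}}\tilde{\mathbf{d}}_{\mathrm{MMSE}}^H\right)\right]. \tag{15}$$
For an MMSE estimator, the estimate of $\mathbf{d}$ can be represented as [13]
$$\hat{\mathbf{d}}_{\mathrm{MMSE}} = \mathbf{W}^H\mathbf{y}, \tag{16}$$
where
$$\mathbf{W} = \left(\hat{\mathbf{H}}\hat{\mathbf{H}}^H + \frac{\mathbf{R}_{uu}}{\sigma_d^2}\right)^{-1}\hat{\mathbf{H}}. \tag{17}$$
Therefore, the correlation matrix of the estimation error $\tilde{\mathbf{d}}_{\mathrm{MMSE}}$ can be given as
$$E\left(\tilde{\mathbf{d}}_{\mathrm{MMSE}}\tilde{\mathbf{d}}_{\mathrm{MMSE}}^H\right) = E\left[\left(\mathbf{W}^H\mathbf{y} - \mathbf{d}\right)\left(\mathbf{W}^H\mathbf{y} - \mathbf{d}\right)^H\right] = \sigma_d^2\mathbf{I} - \sigma_d^2\hat{\mathbf{H}}^H\left(\hat{\mathbf{H}}\hat{\mathbf{H}}^H + \frac{\mathbf{R}_{uu}}{\sigma_d^2}\right)^{-1}\hat{\mathbf{H}}; \tag{18}$$
thus, the capacity in (10) can be derived as
$$C_{\mathrm{prac}} = \frac{1}{N}\log_2\det\left[\mathbf{I} - \hat{\mathbf{H}}^H\left(\hat{\mathbf{H}}\hat{\mathbf{H}}^H + \frac{\mathbf{R}_{uu}}{\sigma_d^2}\right)^{-1}\hat{\mathbf{H}}\right]^{-1} = \frac{1}{N}\log_2\det\left(\mathbf{I} + \sigma_d^2\hat{\mathbf{H}}\hat{\mathbf{H}}^H\mathbf{R}_{uu}^{-1}\right), \tag{19}$$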
where the second equation is due to the matrix inversion lemma. Note that the derived capacity expression is similar to the representation derived in [23] for a MIMO system. This is not surprising because the OFDM transmission model is actually identical to the MIMO model.
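The step between the two forms of the capacity in (19) can be spelled out as follows. This is a sketch of one way to reach the result: it uses the Woodbury form of the matrix inversion lemma and, in addition, the determinant identity $\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$, which the text does not name and which is assumed here to be the intended argument.

$$\begin{aligned}
\mathbf{I} - \hat{\mathbf{H}}^H\left(\hat{\mathbf{H}}\hat{\mathbf{H}}^H + \frac{\mathbf{R}_{uu}}{\sigma_d^2}\right)^{-1}\hat{\mathbf{H}} &= \left(\mathbf{I} + \sigma_d^2\hat{\mathbf{H}}^H\mathbf{R}_{uu}^{-1}\hat{\mathbf{H}}\right)^{-1}, \\
\det\left[\left(\mathbf{I} + \sigma_d^2\hat{\mathbf{H}}^H\mathbf{R}_{uu}^{-1}\hat{\mathbf{H}}\right)^{-1}\right]^{-1} &= \det\left(\mathbf{I} + \sigma_d^2\hat{\mathbf{H}}^H\mathbf{R}_{uu}^{-1}\hat{\mathbf{H}}\right) = \det\left(\mathbf{I} + \sigma_d^2\hat{\mathbf{H}}\hat{\mathbf{H}}^H\mathbf{R}_{uu}^{-1}\right).
\end{aligned}$$

The first line is the inversion lemma applied to $\mathbf{I} + \hat{\mathbf{H}}^H(\mathbf{R}_{uu}/\sigma_d^2)^{-1}\hat{\mathbf{H}}$; the last equality applies the determinant identity twice to move $\hat{\mathbf{H}}^H$ and $\mathbf{R}_{uu}^{-1}$ to the right.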
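From Appendix A, we find that $E(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H)$ is a diagonal matrix, assuming $N$ is large enough, which can be represented as
$$E(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H) = \frac{L}{N}\left(\frac{\sigma_w^2}{\sigma_d^2} + \frac{2\sigma_w^2}{3\sigma_t^2}\right)\mathbf{I}, \tag{20}$$
where $\sigma_t^2$ is the power of the training symbol used in [17]. By substituting (20) into (19), the capacity for a practical channel estimator can be obtained as
$$C_{\mathrm{prac}} = \frac{1}{N}\log_2\det\left(\mathbf{I} + \varepsilon\hat{\mathbf{H}}\hat{\mathbf{H}}^H\right), \tag{21}$$
where
$$\varepsilon = \left[\frac{(N+L)\sigma_w^2}{N\sigma_d^2} + \frac{2L\sigma_w^2}{3N\sigma_t^2}\right]^{-1}. \tag{22}$$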
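As a quick sanity check, the factor $\varepsilon$ in (22) should equal $\sigma_d^2 / r$, where $\mathbf{R}_{uu} = r\mathbf{I}$ follows from inserting (20) into (9). The sketch below evaluates both forms; the function names and the particular power split are assumptions chosen for illustration (only N = 64 and L = 3 come from Section VI).

```python
def eps_from_eq22(N, L, sigma_w2, sigma_d2, sigma_t2):
    """Effective SNR factor written exactly as in (22)."""
    return 1.0 / ((N + L) * sigma_w2 / (N * sigma_d2)
                  + 2.0 * L * sigma_w2 / (3.0 * N * sigma_t2))

def eps_from_noise_power(N, L, sigma_w2, sigma_d2, sigma_t2):
    """Same factor as sigma_d^2 / r, with R_uu = r * I obtained from (9) and (20)."""
    err_power = (L / N) * (sigma_w2 / sigma_d2 + 2.0 * sigma_w2 / (3.0 * sigma_t2))
    r = sigma_w2 + sigma_d2 * err_power
    return sigma_d2 / r

# Illustrative values: the noise power and the power split are assumptions.
N, L = 64, 3
sigma_w2, sigma_d2, sigma_t2 = 0.1, 0.85, 0.15

e22 = eps_from_eq22(N, L, sigma_w2, sigma_d2, sigma_t2)
e_direct = eps_from_noise_power(N, L, sigma_w2, sigma_d2, sigma_t2)
perfect = sigma_d2 / sigma_w2  # factor that appears in (7) with perfect estimation

print("eps via (22):      ", e22)
print("eps via (9), (20): ", e_direct)
print("perfect-CSI factor:", perfect)
```

The two forms agree, and both are smaller than $\sigma_d^2/\sigma_w^2$ because the estimation error adds to the effective noise.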
IV. Extension to ICD Estimator
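In this section, the practical capacity for OFDM transmission is derived considering the ICD estimator. Because the ICD estimator is not information-lossless [22], its achievable capacity is degraded; that is,
$$I(\mathbf{d};\mathbf{y}) \ge I(\mathbf{d};\hat{\mathbf{d}}_{\mathrm{ICD}}), \tag{23}$$
referring to (13). For the detected symbol $\hat{\mathbf{d}}_{\mathrm{ICD}}$ with the ICD estimator, the mutual information in (11) can be further computed as
$$I(\mathbf{d};\hat{\mathbf{d}}_{\mathrm{ICD}}) = H(\mathbf{d}) - H(\mathbf{d}|\hat{\mathbf{d}}_{\mathrm{ICD}}) = H(\mathbf{d}) - H(\tilde{\mathbf{d}}_{\mathrm{ICD}}). \tag{24}$$
For an ICD estimator, the detection matrix, $\mathbf{W}_{\mathrm{ICD}}$, can be represented as [22]
$$\mathbf{W}_{\mathrm{ICD}} = \hat{\mathbf{H}}\left(\hat{\mathbf{H}}^H\hat{\mathbf{H}}\right)^{-1}, \tag{25}$$
and the correlation matrix of the estimation error $\tilde{\mathbf{d}}_{\mathrm{ICD}}$ can be given as
$$E\left(\tilde{\mathbf{d}}_{\mathrm{ICD}}\tilde{\mathbf{d}}_{\mathrm{ICD}}^H\right) = E\left[\left(\mathbf{W}_{\mathrm{ICD}}^H\mathbf{y} - \mathbf{d}\right)\left(\mathbf{W}_{\mathrm{ICD}}^H\mathbf{y} - \mathbf{d}\right)^H\right] = \mathbf{R}_{uu}\left(\hat{\mathbf{H}}^H\hat{\mathbf{H}}\right)^{-1}. \tag{26}$$
Like (15), the information entropy, $H(\tilde{\mathbf{d}}_{\mathrm{ICD}})$, in (24) can be computed from (26) as
$$H(\tilde{\mathbf{d}}_{\mathrm{ICD}}) = \log_2\det\left[\pi e\,E\left(\tilde{\mathbf{d}}_{\mathrm{ICD}}\tilde{\mathbf{d}}_{\mathrm{ICD}}^H\right)\right] = \log_2\det\left[\pi e\,\mathbf{R}_{uu}\left(\hat{\mathbf{H}}^H\hat{\mathbf{H}}\right)^{-1}\right]. \tag{27}$$
Substituting both (12) and (27) into (24), the practical capacity with the ICD estimator can be derived as
$$C_{\mathrm{prac}}^{\mathrm{ICD}} = \frac{1}{N}\log_2\det\left(\sigma_d^2\hat{\mathbf{H}}^H\hat{\mathbf{H}}\mathbf{R}_{uu}^{-1}\right). \tag{28}$$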
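The relation between the MMSE and ICD capacities can be illustrated per eigenvalue. The sketch below assumes $\mathbf{R}_{uu} = r\mathbf{I}$ (as implied by (9) and (20)), so that $\sigma_d^2\mathbf{R}_{uu}^{-1} = \varepsilon\mathbf{I}$, and a square invertible $\hat{\mathbf{H}}$, so that $\hat{\mathbf{H}}\hat{\mathbf{H}}^H$ and $\hat{\mathbf{H}}^H\hat{\mathbf{H}}$ share the eigenvalues $\lambda_n$. Under these assumptions (21) becomes an average of $\log_2(1 + \varepsilon\lambda_n)$ and (28) an average of $\log_2(\varepsilon\lambda_n)$. The eigenvalues and $\varepsilon$ values are hypothetical.

```python
import math

def capacity_mmse(eigs, eps):
    """(21) in eigenvalue form: (1/N) * sum of log2(1 + eps * lambda_n)."""
    return sum(math.log2(1.0 + eps * lam) for lam in eigs) / len(eigs)

def capacity_icd(eigs, eps):
    """(28) with R_uu = r * I: (1/N) * sum of log2(eps * lambda_n)."""
    return sum(math.log2(eps * lam) for lam in eigs) / len(eigs)

eigs = [2.5, 1.2, 0.6, 0.2]  # hypothetical eigenvalues of H_hat^H H_hat
gaps = []
for eps in (5.0, 50.0, 500.0):
    c_mmse = capacity_mmse(eigs, eps)
    c_icd = capacity_icd(eigs, eps)
    gaps.append(c_mmse - c_icd)
    print(f"eps={eps:6.1f}  MMSE={c_mmse:.3f}  ICD={c_icd:.3f}  gap={gaps[-1]:.3f}")
```

The gap equals the average of $\log_2(1 + 1/(\varepsilon\lambda_n))$, so it is always positive and shrinks as $\varepsilon$ grows, which matches the high-SNR behavior described for the ICD estimator in Section VI.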
V. Optimal Power Allocation
To maximize the capacity, we just consider the capacity with MMSE estimator in this section.
The optimal power allocation aims to maximize the channel capacity, subject to
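$$\sigma_t^2 + \sigma_d^2 = P, \tag{29}$$
where $P$ is the total power. A similar problem has been addressed in [17], where the optimization aims to minimize the MSE of the channel estimation. In this paper, our purpose is to maximize the capacity for a given $P$. Considering the random fading in a wireless channel, the capacity in (10) is not used directly. Instead, we adopt the average capacity as the objective function, which can be defined as
$$\bar{C}_{\mathrm{prac}} = E\left[\frac{1}{N}\log_2\det\left(\mathbf{I} + \varepsilon\hat{\mathbf{H}}\hat{\mathbf{H}}^H\right)\right]. \tag{30}$$
To find the optimal power allocation ratio that maximizes the average capacity above, we define $\sigma_t^2 = P\alpha$; thus, $\sigma_d^2 = P(1 - \alpha)$, where $\alpha \in (0, 1)$ is the power allocation ratio. Correspondingly, $\varepsilon$ is also a function of $\alpha$; that is, $\varepsilon = \varepsilon(\alpha)$ and $\bar{C}_{\mathrm{prac}} = \bar{C}_{\mathrm{prac}}(\alpha)$. Rewriting $\varepsilon$ as a function of the ratio $\alpha$, we have
$$\varepsilon(\alpha) = \frac{P}{\sigma_w^2}\cdot\frac{3N\alpha(1-\alpha)}{(3N+L)\alpha + 2L}. \tag{31}$$
Bearing in mind that $\hat{\mathbf{H}}\hat{\mathbf{H}}^H$ is a Hermitian matrix, its eigenvalue decomposition can be expressed as
$$\hat{\mathbf{H}}\hat{\mathbf{H}}^H = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}^H, \tag{32}$$
where $\mathbf{P}$ is a unitary matrix (not to be confused with the total power $P$), and $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{N-1})$, where $\lambda_n$ is the $n$th eigenvalue. For a Hermitian matrix, $\lambda_n$ is real for any $n \in \{0, \ldots, N-1\}$, and the eigenvalues are ordered as $\lambda_0 \ge \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{N-1}$. With the relation in (32), the average capacity can be derived as
$$\bar{C}_{\mathrm{prac}} = E\left(\frac{1}{N}\log_2\det\left\{\mathbf{P}\left[\mathbf{I} + \varepsilon(\alpha)\boldsymbol{\Lambda}\right]\mathbf{P}^H\right\}\right) = E\left\{\frac{1}{N}\log_2\det\left[\mathbf{I} + \varepsilon(\alpha)\boldsymbol{\Lambda}\right]\right\} = \frac{1}{N}\sum_{n=0}^{N-1}E\left\{\log_2\left[1 + \varepsilon(\alpha)\lambda_n\right]\right\}. \tag{33}$$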
Since log 2 ( x ) is an increasing function of x , an upper bound for (33) can be given as
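$$\bar{C}_{\mathrm{prac}} \le E\left\{\log_2\left[1 + \varepsilon(\alpha)\lambda_{\max}\right]\right\} \le \log_2\left[1 + \varepsilon(\alpha)\bar{\lambda}_{\max}\right], \tag{34}$$
where $\lambda_{\max}$ denotes the maximal eigenvalue and $\bar{\lambda}_{\max} = E(\lambda_{\max})$ denotes its mean. The second inequality in (34) is due to Jensen's inequality [21]. In Appendix B, it is shown that
$$\bar{\lambda}_{\max} = L\sigma_h^2 + \frac{L}{N}\left(\frac{\sigma_w^2}{\sigma_d^2} + \frac{2\sigma_w^2}{3\sigma_t^2}\right), \tag{35}$$
where
$$\sigma_h^2 = \frac{1}{L}\sum_{l=0}^{L-1}E\left(|h_l|^2\right). \tag{36}$$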
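The Jensen step in (34) can be checked numerically on any sample of non-negative values. In the sketch below the exponential distribution is only a stand-in for the unknown distribution of $\lambda_{\max}$, and the value of $\varepsilon$ is likewise an assumption; the inequality itself holds for any sample because $\log_2(1 + \varepsilon x)$ is concave in $x$.

```python
import math
import random

random.seed(0)
eps = 10.0  # hypothetical value of epsilon(alpha)

# Stand-in samples for lambda_max (exponential with mean 3, chosen arbitrarily).
samples = [random.expovariate(1.0 / 3.0) for _ in range(10000)]

mean_of_log = sum(math.log2(1.0 + eps * x) for x in samples) / len(samples)
log_of_mean = math.log2(1.0 + eps * sum(samples) / len(samples))

print("E{log2(1 + eps*lambda)} ~", mean_of_log)
print("log2(1 + eps*E{lambda}) ~", log_of_mean)
```

The first value never exceeds the second, which is the second inequality in (34) applied to the empirical distribution.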
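For an individual path, $E(|h_l|^2)$ depends on the channel delay profile. Note that $\bar{\lambda}_{\max}$ is also a function of $\alpha$; that is,
$$\bar{\lambda}_{\max}(\alpha) = L\sigma_h^2 + \frac{L\sigma_w^2}{NP}\cdot\frac{\alpha + 2}{3\alpha(1-\alpha)}. \tag{37}$$
Therefore, the upper bound for $\bar{C}_{\mathrm{prac}}$ can be written as
$$\bar{C}_{\mathrm{prac}}^{U}(\alpha) = \log_2\left[1 + \varepsilon(\alpha)\bar{\lambda}_{\max}(\alpha)\right]. \tag{38}$$
By solving the equation
$$\frac{d\bar{C}_{\mathrm{prac}}^{U}(\alpha)}{d\alpha} = 0, \tag{39}$$
the optimal power allocation ratio, $\alpha_{\mathrm{opt}}$, for maximizing the capacity upper bound can be obtained. As will be shown, $\bar{C}_{\mathrm{prac}}$ achieves its maximum at $\alpha_{\mathrm{opt}}$ as well.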
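Substituting (31) and (37) into (39), (39) can be rewritten as
$$\left[\frac{P}{\sigma_w^2}\cdot\frac{-3N\alpha^2 + 3N\alpha}{(3N+L)\alpha + 2L}\right]\frac{d}{d\alpha}\left[L\sigma_h^2 + \frac{L\sigma_w^2}{NP}\cdot\frac{\alpha+2}{-3\alpha^2+3\alpha}\right] + \frac{d}{d\alpha}\left[\frac{P}{\sigma_w^2}\cdot\frac{-3N\alpha^2 + 3N\alpha}{(3N+L)\alpha + 2L}\right]\left[L\sigma_h^2 + \frac{L\sigma_w^2}{NP}\cdot\frac{\alpha+2}{-3\alpha^2+3\alpha}\right] = 0, \tag{40}$$
which, after some algebraic manipulation, can be simplified as
$$a\alpha^2 + b\alpha + c = 0, \tag{41}$$
where $a = 3LNP\sigma_h^2(L + 3N)$, $b = 12L^2NP\sigma_h^2$, and $c = 6LN(\sigma_w^2 - LP\sigma_h^2)$. Equation (41) can be easily solved using the quadratic formula, and the solution is given as
$$\alpha_{\mathrm{opt}} = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \tag{42}$$
where $\alpha_{\mathrm{opt}}$ is a positive value.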
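The closed-form ratio in (42) can be cross-checked against a brute-force search over the upper bound (38). The sketch below is a minimal illustration: the function names and the grid resolution are assumptions, while N = 64, L = 3, $\sigma_h^2 = 1$, P = 1, and the two SNR values are the settings reported in Section VI.

```python
import math

def eps_alpha(alpha, N, L, P, sigma_w2):
    """epsilon(alpha) from (31)."""
    return (P / sigma_w2) * 3.0 * N * alpha * (1.0 - alpha) / ((3.0 * N + L) * alpha + 2.0 * L)

def lam_max_alpha(alpha, N, L, P, sigma_w2, sigma_h2):
    """Mean maximal eigenvalue as a function of alpha, from (37)."""
    return L * sigma_h2 + (L * sigma_w2 / (N * P)) * (alpha + 2.0) / (3.0 * alpha * (1.0 - alpha))

def upper_bound(alpha, N, L, P, sigma_w2, sigma_h2):
    """Capacity upper bound (38)."""
    return math.log2(1.0 + eps_alpha(alpha, N, L, P, sigma_w2)
                     * lam_max_alpha(alpha, N, L, P, sigma_w2, sigma_h2))

def alpha_opt(N, L, P, sigma_w2, sigma_h2):
    """Positive root of the quadratic (41), as given in (42)."""
    a = 3.0 * L * N * P * sigma_h2 * (L + 3.0 * N)
    b = 12.0 * L ** 2 * N * P * sigma_h2
    c = 6.0 * L * N * (sigma_w2 - L * P * sigma_h2)
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

N, L, P, sigma_h2 = 64, 3, 1.0, 1.0
results = {}
for snr_db in (10, 30):
    sigma_w2 = P / 10 ** (snr_db / 10)  # SNR = P / sigma_w^2
    closed = alpha_opt(N, L, P, sigma_w2, sigma_h2)
    # Grid search over alpha in (0, 1) with step 1e-4 (resolution is an arbitrary choice).
    grid = max((i / 10000 for i in range(1, 10000)),
               key=lambda a: upper_bound(a, N, L, P, sigma_w2, sigma_h2))
    results[snr_db] = (closed, grid)
    print(f"SNR={snr_db} dB  closed-form={closed:.4f}  grid-search={grid:.4f}")
```

For both SNR values the two estimates agree to within the grid step and fall between 0.14 and 0.15, in line with the value $\alpha_{\mathrm{opt}} \approx 0.15$ quoted in the numerical results.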
VI. Numerical Results
Numerical results are shown in this section to demonstrate the efficiency of the derived power allocation ratio. An OFDM transmission consisting of $N = 64$ subcarriers is investigated in the numerical analysis. For the wireless channel, $L = 3$ paths are considered, with power delay profile $E(|h_l|^2) = 1$ for $l = 0, 1, 2$ and Doppler shifts of 0.2, 0.1, and 0.1. The algorithm proposed in [17] is adopted for estimating the channel response. The SNR is defined as $\mathrm{SNR} = P/\sigma_w^2$.
The numerical results and theoretical upper bounds are shown in Fig. 1. It is observed that the capacity, as well as the upper bound, is a concave function of $\alpha$. Both the numerical results and the upper bounds achieve their respective maxima at $\alpha_{\mathrm{opt}} \approx 0.15$, which coincides with the theoretical prediction of (42) for $N = 64$, $L = 3$, $\sigma_h^2 = 1$, and $P = 1$.
Fig. 1. Capacity vs. power allocation ratio for SNR = 10 dB and SNR = 30 dB. Theoretical upper bounds are also presented for comparison.
The capacities for different power allocation ratios are presented in Fig. 2. For $\alpha = 0.15$, the theoretical upper bound and the capacity with perfect channel estimation, that is, (7), are also shown for comparison. As can be seen from Fig. 2, the performance differs significantly across values of $\alpha$. The worst performance is observed for $\alpha = 0.85$, whereas $\alpha = 0.5$ performs better. This can be explained by referring to Fig. 1, where $\alpha = 0.85$ is farther from the stationary point ($\alpha \approx 0.15$) than $\alpha = 0.5$ is. This accounts for the different performances observed, since $\bar{C}_{\mathrm{prac}}$ is a concave function of $\alpha$. Also, from Fig. 2, we can observe a 1 dB performance gap between $\bar{C}_{\mathrm{prac}}$ and $\bar{C}_{\mathrm{perf}}$ when $\alpha = 0.15$. This suggests that the ideal capacity with perfect channel estimation can be almost achieved by adopting the derived optimal power allocation ratio. Meanwhile, a 2.5 dB performance gap between $\bar{C}_{\mathrm{prac}}$ with $\alpha = 0.15$ and the theoretical upper bound can be observed due to the inequality in (34).
Fig. 2. Average capacities for α = 0.15, 0.5, and 0.85 in terms of SNR. The upper bound and average capacity with perfect channel estimation are also presented.
In Fig. 3, the cumulative distribution functions (CDFs) of the capacities are also shown. Three cases, where $\alpha$ takes the values 0.15, 0.5, and 0.85, are investigated. Similar to the situation in Fig. 2, an obvious performance improvement can be observed by adopting the optimal power allocation ratio, and the performance gets worse as $\alpha$ moves further away from the stationary point. Also, we find that the probability of achieving a capacity of more than 2.5 bits/symbol is 0.9 at SNR = 10 dB. At SNR = 30 dB, the capacity exceeds 8.5 bits/symbol with a probability of 0.9. For both situations, the capacity achieved using optimal power allocation is very close to the capacity with perfect channel estimation, demonstrating the efficiency of the proposed power allocation strategy.
Fig. 3. CDFs for SNR = 10 dB and SNR = 30 dB. CDFs of the capacity with perfect channel estimation are also presented.
The average capacities of the proposed algorithm and the algorithms in [15]–[17] are compared in Fig. 4. The average capacity of the proposed algorithm with the ICD estimator is also presented for comparison. As shown in Fig. 4, due to noise enhancement and information loss, the ICD estimator performs worse than the MMSE estimator at lower SNR. As the SNR increases, the capacity with the ICD estimator approaches that of the MMSE estimator. Note that the training symbol and data symbols are allocated equal power in the algorithms in [15] and [16]. For the algorithm in [17], the optimal power allocation is based on minimizing the MSE of the channel estimation. As can be seen from Fig. 4, our proposed algorithm outperforms all the algorithms in [15]–[17]. Even at high SNR, there is a larger gap between the proposed algorithm and that in [15], which is explained by the average MSE results in Fig. 5.
Fig. 4. Comparison of average capacities for the proposed algorithm and the algorithms in [15]–[17]. Orders of the Taylor expansions in [15] and [16] are represented by P and Q, respectively.
Fig. 5. Average MSEs for the proposed algorithm and the algorithms in [15]–[17] at different SNRs.
The average MSEs for the algorithms in [15]–[17] and for the algorithm in this paper are compared in Fig. 5. Because less power is allocated to the training symbol, our proposed algorithm presents worse MSE performance; however, it achieves the best capacity, as shown in Fig. 4. To minimize the MSE of the channel estimation, the algorithm in [17] allocates more power to the training symbol. Nevertheless, this scheme is not efficient in terms of capacity performance, as can be seen from Fig. 4. The algorithm in [16] has almost the same MSE as the algorithm in [17]. The algorithm in [15] achieves an MSE comparable to those in [16] and [17] at low SNR; however, at high SNR, it does not achieve good MSE performance, which in turn contributes to its poor capacity performance, as shown in Fig. 4. This results from the large intrinsic estimation error of the algorithm in [15].
VII. Conclusion
In this paper, the optimal power allocation for channel estimation of OFDM uplinks in time-varying channels has been investigated. As an extension of the research of [17], this paper derives the optimal power allocation ratio in terms of maximizing the capacity, instead of minimizing the MSE of the channel estimation, as was done in [17]. Based on the derived upper bound expression for the capacity with imperfect channel estimation, the optimal power allocation strategy is obtained in closed form by allocating the total power between the training symbol and data symbols. Numerical results demonstrate the efficiency of the derived power allocation strategy.
Appendix A: Derivation of (20)
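Recalling the representation of $\tilde{\mathbf{H}}$ in (5), it can be written as
$$[\tilde{\mathbf{H}}]_{(m,n)} = \frac{1}{N}\sum_{k=0}^{N-1} e^{-j\frac{2\pi mk}{N}}\,\tilde{\mathbf{h}}_k^T\mathbf{q}_k(n), \tag{A.1}$$
where
$$\tilde{\mathbf{h}}_k = \left(\tilde{h}[0,k],\ \tilde{h}[1,k],\ \ldots,\ \tilde{h}[L-1,k]\right)^T \tag{A.2}$$
and
$$\mathbf{q}_k(n) = \left(e^{j\frac{2\pi kn}{N}},\ e^{j\frac{2\pi(k-1)n}{N}},\ \ldots,\ e^{j\frac{2\pi(k-L+1)n}{N}}\right)^T. \tag{A.3}$$
Define
$$\tilde{\mathbf{H}}^T = \left(\tilde{\mathbf{H}}_0,\ \tilde{\mathbf{H}}_1,\ \ldots,\ \tilde{\mathbf{H}}_{N-1}\right),$$
where $\tilde{\mathbf{H}}_m$ denotes the $m$th column of matrix $\tilde{\mathbf{H}}^T$. Then, $\tilde{\mathbf{H}}_m^T$ can be represented as
$$\tilde{\mathbf{H}}_m^T = \frac{1}{N}\sum_{k=0}^{N-1} e^{-j\frac{2\pi mk}{N}}\,\tilde{\mathbf{h}}_k^T\mathbf{Q}_k,$$
where
$$\mathbf{Q}_k = \left[\mathbf{q}_k(0),\ \mathbf{q}_k(1),\ \ldots,\ \mathbf{q}_k(N-1)\right]. \tag{A.5}$$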
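With the definition of $\tilde{\mathbf{H}}$, the $(m, n)$th element of $E(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H)$ can be given as
$$E\left(\tilde{\mathbf{H}}_m^T\tilde{\mathbf{H}}_n^*\right) = \frac{1}{N^2}\sum_{k=0}^{N-1}\sum_{t=0}^{N-1} e^{j\frac{2\pi(nt-mk)}{N}}\,E\left(\tilde{\mathbf{h}}_k^T\mathbf{M}_{k,t}\tilde{\mathbf{h}}_t^*\right), \tag{A.6}$$
where $\mathbf{M}_{k,t} = \mathbf{Q}_k\mathbf{Q}_t^H$. The term $E(\tilde{\mathbf{h}}_k^T\mathbf{M}_{k,t}\tilde{\mathbf{h}}_t^*)$ in (A.6) can be rewritten as
$$E\left(\tilde{\mathbf{h}}_k^T\mathbf{M}_{k,t}\tilde{\mathbf{h}}_t^*\right) = \sum_{l_1=0}^{L-1}\sum_{l_2=0}^{L-1} E\left(\tilde{h}[l_1,k]\,\tilde{h}^*[l_2,t]\right)\left[\mathbf{M}_{k,t}\right]_{(l_1,l_2)}, \tag{A.7}$$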
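where $[\mathbf{M}_{k,t}]_{(l_1,l_2)}$ denotes the $(l_1, l_2)$th element of $\mathbf{M}_{k,t}$. From the derivation in [17], we can conclude that the estimation errors for different paths are independent; that is,
$$E\left(\tilde{h}[l_1,k]\,\tilde{h}^*[l_2,t]\right)\ \begin{cases}\neq 0 & \text{for } l_1 = l_2,\\ = 0 & \text{for } l_1 \neq l_2.\end{cases} \tag{A.8}$$
Therefore, the equation in (A.7) can be simplified as
$$E\left(\tilde{\mathbf{h}}_k^T\mathbf{M}_{k,t}\tilde{\mathbf{h}}_t^*\right) = \sum_{l=0}^{L-1} E\left(\tilde{h}[l,k]\,\tilde{h}^*[l,t]\right)\left[\mathbf{M}_{k,t}\right]_{(l,l)} = E\left(\tilde{\mathbf{h}}_k^T\,\mathrm{diag}\{\mathbf{M}_{k,t}\}\,\tilde{\mathbf{h}}_t^*\right), \tag{A.9}$$
where $\mathrm{diag}\{\mathbf{M}_{k,t}\}$ denotes a diagonal matrix composed of the diagonal elements of $\mathbf{M}_{k,t}$. Recalling that $\mathbf{M}_{k,t} = \mathbf{Q}_k\mathbf{Q}_t^H$ and the definition of $\mathbf{Q}_k$ in (A.5), we obtain
$$\mathrm{diag}\{\mathbf{M}_{k,t}\} = \sum_{n=0}^{N-1}\mathrm{diag}\left\{\mathbf{q}_k(n)\mathbf{q}_t^H(n)\right\} = \sum_{n=0}^{N-1}\begin{pmatrix} e^{j\frac{2\pi(k-t)n}{N}} & & \\ & \ddots & \\ & & e^{j\frac{2\pi(k-t)n}{N}} \end{pmatrix}. \tag{A.10}$$
From (A.10), we can observe that $\mathrm{diag}\{\mathbf{M}_{k,t}\}$ is a zero matrix when $k \neq t$ and that $\mathrm{diag}\{\mathbf{M}_{k,t}\} = N\cdot\mathbf{I}$ when $k = t$. Therefore, the relation in (A.9) can be written as
$$E\left(\tilde{\mathbf{h}}_k^T\mathbf{M}_{k,t}\tilde{\mathbf{h}}_t^*\right) = \begin{cases} N\,E\left(\tilde{\mathbf{h}}_k^T\tilde{\mathbf{h}}_k^*\right) & \text{for } k = t,\\ 0 & \text{for } k \neq t.\end{cases} \tag{A.11}$$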
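With the relation in (A.11), the $(m, n)$th element of $E(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H)$ in (A.6) can be simplified as
$$E\left(\tilde{\mathbf{H}}_m^T\tilde{\mathbf{H}}_n^*\right) = \frac{1}{N}\sum_{k=0}^{N-1} e^{j\frac{2\pi(n-m)k}{N}}\,E\left(\tilde{\mathbf{h}}_k^T\tilde{\mathbf{h}}_k^*\right) = \frac{1}{N}\sum_{k=0}^{N-1} e^{j\frac{2\pi(n-m)k}{N}}\sum_{l=0}^{L-1}E\left(|\tilde{h}[l,k]|^2\right). \tag{A.12}$$
Following the derivation of $\overline{\mathrm{MSE}}$ in [17], it is easy to obtain that
$$\sum_{l=0}^{L-1}E\left(|\tilde{h}[l,k]|^2\right) = L\left(\frac{\sigma_w^2}{N\sigma_d^2} + \frac{2N^2 - 8kN + 8k^2}{N^3}\cdot\frac{\sigma_w^2}{\sigma_t^2}\right), \tag{A.13}$$
and we therefore have
$$E\left(\tilde{\mathbf{H}}_m^T\tilde{\mathbf{H}}_n^*\right) = \frac{L}{N}\sum_{k=0}^{N-1} e^{j\frac{2\pi(n-m)k}{N}}\left(\frac{\sigma_w^2}{N\sigma_d^2} + \frac{2N^2 - 8kN + 8k^2}{N^3}\cdot\frac{\sigma_w^2}{\sigma_t^2}\right). \tag{A.14}$$
Further derivation of (A.14) depends on the relation between $m$ and $n$.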
- A. m ≠ n
For this situation, (A.14) can be rewritten as
$$E\left(\tilde{\mathbf{H}}_m^T\tilde{\mathbf{H}}_n^*\right) = \frac{L}{N}\left[\sum_{k=0}^{N-1} e^{j\frac{2\pi(n-m)k}{N}}\left(\frac{\sigma_w^2}{N\sigma_d^2} + \frac{2\sigma_w^2}{N\sigma_t^2}\right)\right] + \frac{L}{N}\left[\sum_{k=0}^{N-1} e^{j\frac{2\pi(n-m)k}{N}}\left(-\frac{8k\sigma_w^2}{N^2\sigma_t^2}\right)\right] + \frac{L}{N}\left[\sum_{k=0}^{N-1} e^{j\frac{2\pi(n-m)k}{N}}\left(\frac{8k^2\sigma_w^2}{N^3\sigma_t^2}\right)\right]. \tag{A.15}$$
Since $m \neq n$, the first summation in (A.15) is equal to zero. Using the relation [24]
$$\sum_{k=0}^{N-1} kq^k = -\frac{(N-1)q^N}{1-q} + \frac{q\left(1-q^{N-1}\right)}{(1-q)^2}, \tag{A.16}$$
we find that the numerator of the second summation is a linear function of $N$. Considering that the denominator for the second summation is a quadratic function of $N$, the second term also approaches zero when $N$ is large enough. Similarly, bearing in mind the following relation [24],
$$\sum_{k=0}^{N-1} k^2q^k = \frac{-(N-1)^2q^{N+2} + \left(2N^2 - 2N - 1\right)q^{N+1} - N^2q^N + q^2 + q}{(1-q)^3} \tag{A.17}$$
and the fact that the denominator for the third summation term is a cubic function of $N$, the third summation term approaches zero as well for large enough $N$. Since all three terms vanish, we can conclude that $E(\tilde{\mathbf{H}}_m^T\tilde{\mathbf{H}}_n^*) = 0$ given $m \neq n$.
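The closed-form sums quoted from [24] can be verified against direct summation for $q = e^{j2\pi(n-m)/N}$. The sketch below only checks the two identities and the reduction of the first one when $q^N = 1$; the function names and the indices m = 5, n = 9 are arbitrary assumptions.

```python
import cmath

def sum_k_qk(q, N):
    """Closed form for sum_{k=0}^{N-1} k * q^k, as in (A.16)."""
    return -(N - 1) * q ** N / (1 - q) + q * (1 - q ** (N - 1)) / (1 - q) ** 2

def sum_k2_qk(q, N):
    """Closed form for sum_{k=0}^{N-1} k^2 * q^k, as in (A.17)."""
    num = (-(N - 1) ** 2 * q ** (N + 2)
           + (2 * N * N - 2 * N - 1) * q ** (N + 1)
           - N * N * q ** N + q * q + q)
    return num / (1 - q) ** 3

N, m, n = 64, 5, 9  # arbitrary example with m != n
q = cmath.exp(2j * cmath.pi * (n - m) / N)

brute1 = sum(k * q ** k for k in range(N))
brute2 = sum(k * k * q ** k for k in range(N))

print("error of (A.16):", abs(sum_k_qk(q, N) - brute1))
print("error of (A.17):", abs(sum_k2_qk(q, N) - brute2))
# Because q^N = 1 here, (A.16) reduces to -N / (1 - q), i.e. it is linear in N.
print("deviation from -N/(1-q):", abs(brute1 + N / (1 - q)))
```

All three printed values are at the level of floating-point round-off, confirming the identities as reconstructed above.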
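- B. m = n
Given $m = n$, (A.14) can be rewritten as
$$E\left(\tilde{\mathbf{H}}_m^T\tilde{\mathbf{H}}_m^*\right) = \frac{L}{N}\sum_{k=0}^{N-1}\left(\frac{\sigma_w^2}{N\sigma_d^2} + \frac{2N^2 - 8kN + 8k^2}{N^3}\cdot\frac{\sigma_w^2}{\sigma_t^2}\right), \tag{A.18}$$
which, after some algebra, can be simplified as
$$E\left(\tilde{\mathbf{H}}_m^T\tilde{\mathbf{H}}_m^*\right) = \frac{L}{N}\left(\frac{\sigma_w^2}{\sigma_d^2} + \frac{2\sigma_w^2}{3\sigma_t^2}\right) \tag{A.19}$$
for large enough $N$.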
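The algebra behind the $m = n$ case can be made explicit. The following worked step is a sketch that uses only the standard sums $\sum_{k=0}^{N-1}k = N(N-1)/2$ and $\sum_{k=0}^{N-1}k^2 = (N-1)N(2N-1)/6$:

$$\begin{aligned}
\sum_{k=0}^{N-1}\frac{2N^2 - 8kN + 8k^2}{N^3} &= \frac{1}{N^3}\left[2N^3 - 4N^2(N-1) + \frac{4}{3}N(N-1)(2N-1)\right] \\
&= \frac{2}{3} + \frac{4}{3N^2}.
\end{aligned}$$

The data-symbol part contributes $\sum_{k=0}^{N-1}\sigma_w^2/(N\sigma_d^2) = \sigma_w^2/\sigma_d^2$ exactly, and the factor multiplying $\sigma_w^2/\sigma_t^2$ tends to $2/3$ as $N$ grows, which gives the coefficient $2\sigma_w^2/(3\sigma_t^2)$ in the large-$N$ expression.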
In view of the two situations above, we can finally obtain (20).
Appendix B
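As shown in [17], it is easy to derive that the estimates of $v_l$ and $h_l$ are both unbiased for large enough $N$. Therefore, the estimates of $h[l,k]$ are unbiased as well; that is,
$$E\left(\tilde{h}[l,k]\right) = 0. \tag{B.1}$$
Bearing in mind (5), the following relation can be obtained:
$$E\left\{[\tilde{\mathbf{H}}]_{(m,n)}\right\} = \frac{1}{N}\sum_{l=0}^{L-1}\sum_{k=0}^{N-1}E\left(\tilde{h}[l,k]\right)e^{j\frac{2\pi(n-m)k}{N}}\,e^{-j\frac{2\pi nl}{N}} = 0, \tag{B.2}$$
which means the estimates of $\mathbf{H}$ are also unbiased; that is,
$$E(\tilde{\mathbf{H}}) = \mathbf{0}, \tag{B.3}$$
where $\mathbf{0}$ is an $N \times N$ zero matrix. In view of the conclusion above, we can obtain
$$E\left(\hat{\mathbf{H}}\hat{\mathbf{H}}^H\right) = E\left[\left(\mathbf{H} + \tilde{\mathbf{H}}\right)\left(\mathbf{H} + \tilde{\mathbf{H}}\right)^H\right] = E\left(\mathbf{H}\mathbf{H}^H\right) + E\left(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H\right). \tag{B.4}$$
In Appendix A, we have derived the expression of $E(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H)$. Recalling the MSE representation in [17] for large enough M, (20) can be rewritten as
$$E\left(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H\right) = \frac{1}{N}\,\overline{\mathrm{MSE}}\,\mathbf{I}, \tag{B.5}$$
where $\overline{\mathrm{MSE}}$ denotes the average MSE. Since the definition of $\overline{\mathrm{MSE}}$ is given as
$$\overline{\mathrm{MSE}} = \sum_{l=0}^{L-1}\sum_{k=0}^{N-1}E\left(|\tilde{h}[l,k]|^2\right) \tag{B.6}$$
for large $N$, (B.5) can be further rewritten as
$$E\left(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H\right) = \frac{1}{N}\sum_{l=0}^{L-1}\sum_{k=0}^{N-1}E\left(|\tilde{h}[l,k]|^2\right)\mathbf{I}. \tag{B.7}$$
Recalling the similarity between the definitions of $\mathbf{H}$ and $\tilde{\mathbf{H}}$, it is easy to derive that
$$E\left(\mathbf{H}\mathbf{H}^H\right) = \frac{1}{N}\sum_{l=0}^{L-1}\sum_{k=0}^{N-1}E\left(|h[l,k]|^2\right)\mathbf{I} = L\sigma_h^2\mathbf{I}, \tag{B.8}$$
where $\sigma_h^2$ is defined in (36).
Since $E(\mathbf{H}\mathbf{H}^H)$ and $E(\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H)$ are both diagonal matrices, we have
$$E\left(\hat{\mathbf{H}}\hat{\mathbf{H}}^H\right) = \left[L\sigma_h^2 + \frac{L}{N}\left(\frac{\sigma_w^2}{\sigma_d^2} + \frac{2\sigma_w^2}{3\sigma_t^2}\right)\right]\mathbf{I}. \tag{B.9}$$
From (B.9), it is easy to observe that all the eigenvalues of $E(\hat{\mathbf{H}}\hat{\mathbf{H}}^H)$ are equal; thus, we have
$$\bar{\lambda}_{\max} = L\sigma_h^2 + \frac{L}{N}\left(\frac{\sigma_w^2}{\sigma_d^2} + \frac{2\sigma_w^2}{3\sigma_t^2}\right). \tag{B.10}$$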
This work was supported in part by the National Natural Science Foundation of China (No. 61271416 and 61301093), the Aerospace support fund of China (No. 2013-HT-XGD), NPU Foundation for Fundamental Research (No. JCY20130132).
BIO
Corresponding Author  yaorg@nwpu.edu.cn
Rugui Yao received his BS, MS, and PhD degrees in communications from the School of Electronics and Information, Northwestern Polytechnical University (NPU), Xi’an, China, in 2002, 2005, and 2007, respectively. He worked as a post-doctoral fellow at NPU from 2007 to 2009. In 2010, he joined NPU as an associate professor. In 2013, he joined the ITP Lab of Georgia Tech, Atlanta, GA, USA, as a visiting scholar. He has worked in the area of cognitive radio networks, channel coding, OFDM transmission, and spread spectrum systems.
09111035@bjtu.edu.cn
Yinsheng Liu received his BS degree in communication and information systems from North China Electric Power University, Baoding, China and his MS degree in communication and information systems from Beijing Jiaotong University, Beijing, China, in 2007 and 2009, respectively. He is currently pursuing his PhD degree in traffic information engineering and control at Beijing Jiaotong University. His research interests include wireless communication systems and digital signal processing.
785462603@qq.com
Geng Li received his BS degree in communication engineering from the School of Electronics and Information, Northwestern Polytechnical University (NPU), Xi’an, China, in 2012 and is currently pursuing his MS degree in electronic and communication engineering at NPU. His research interests include wireless communications and anti-jamming techniques.
xuj@mail.nwpu.edu.cn
Juan Xu received her BS, MS, and PhD degrees, all in computer science, from the School of Computer Science, Northwestern Polytechnical University, Xi’an, China, in 2002, 2005, and 2011, respectively. In 2011, she joined the School of Electronic and Control Engineering, Xi’an, China. Her research interests include channel coding, OFDM transmission, and spread spectrum systems.
References
Bingham J.A.C. 1990 “Multicarrier Modulation for Data Transmission: An Idea Whose Time Has Come,” IEEE Commun. Mag. 28 (5) 5 - 14    DOI : 10.1109/35.54342
May T. , Rohling H. , Engels V. 1998 “Performance Analysis of Viterbi Decoding for 64-DAPSK and 64-QAM Modulated OFDM Signals,” IEEE Trans. Commun. 46 (2) 182 - 190    DOI : 10.1109/26.659477
Li Y. 2000 “Pilot-Symbol-Aided Channel Estimation for OFDM in Wireless Systems,” IEEE Trans. Veh. Technol. 49 (4) 1207 - 1215    DOI : 10.1109/25.875230
Garcia M.J.F.-G. , Zazo S. , Paez-Borrallo J.M. 2000 “Pilot Patterns for Channel Estimation in OFDM,” Electron. Lett. 36 (12) 1049 - 1050    DOI : 10.1049/el:20000714
Choi J.W. , Lee Y.-H. 2005 “Optimum Pilot Pattern for Channel Estimation in OFDM Systems,” IEEE Trans. Wireless Commun. 4 (5) 2083 - 2088    DOI : 10.1109/TWC.2005.853891
Lee K.F. , Williams D.B. 2002 “Pilot-Symbol-Assisted Channel Estimation for Space-Time Coded OFDM Systems,” EURASIP J. Adv. Signal Process. 2002 (1) 507 - 516    DOI : 10.1155/S111086570200080X
Coleri S. 2002 “Channel Estimation Techniques Based on Pilot Arrangement in OFDM Systems,” IEEE Trans. Broadcast. 48 (3) 223 - 229    DOI : 10.1109/TBC.2002.804034
Ohno S. , Giannakis G.B. 2004 “Capacity Maximizing MMSE-Optimal Pilots for Wireless OFDM over Frequency-Selective Block Rayleigh-Fading Channels,” IEEE Trans. Inf. Theory 50 (9) 2138 - 2145    DOI : 10.1109/TIT.2004.833365
Visintin M. 1996 “Karhunen-Loeve Expansion of a Fast Rayleigh Fading Process,” Electron. Lett. 32 (18) 1712 - 1713    DOI : 10.1049/el:19961128
Zemen T. , Mecklenbrauker C.F. 2005 “Time-Variant Channel Estimation Using Discrete Prolate Spheroidal Sequences,” IEEE Trans. Signal Process. 53 (9) 3597 - 3607    DOI : 10.1109/TSP.2005.853104
Tang Z. 2007 “Pilot-Assisted Time-Varying Channel Estimation for OFDM Systems,” IEEE Trans. Signal Process. 55 (5) 2226 - 2238    DOI : 10.1109/TSP.2007.893198
Teo K.D. , Ohno S. “Optimal MMSE Finite Parameter Model for Doubly-Selective Channels,” IEEE Global Telecommun. St. Louis, MO, USA Nov. 28–Dec. 2, 2005 3503 - 3507    DOI : 10.1109/GLOCOM.2005.1578424
Haykin S. 1996 “Adaptive Filter Theory” Prentice-Hall Englewood Cliffs, NJ, USA 610 - 620
Hlawatsch F. , Matz G. 2011 “Wireless Communication over Rapidly Time-Varying Channels” Academic Press of Elsevier Burlington, VT, USA 40 - 48
Du Z. 2011 “Maximum Likelihood Based Channel Estimation for Macrocellular OFDM Uplinks in Dispersive Time-Varying Channels,” IEEE Trans. Wireless Commun. 10 (1) 176 - 187    DOI : 10.1109/TWC.2010.110910.100135
Liu Y. 2012 “Channel Estimation for Macrocellular OFDM Uplinks in Time-Varying Channels,” IEEE Trans. Veh. Technol. 61 (4) 1709 - 1718    DOI : 10.1109/TVT.2012.2187939
Yao R. 2015 “On Channel Estimation for OFDM Uplinks in Time-Varying Channels,” IET Commun. to appear in
Huang H. 2014 “Spatial Channel Model for Multiple Input Multiple Output (MIMO) Simulations,” 3GPP TSG RAN Edinburgh, UK Tech. Rep. 3GPP TR 25.996(12.0.0)-12
Gorokhov A. , Linnartz J.-P. 2004 “Robust OFDM Receivers for Dispersive Time-Varying Channels: Equalization and Channel Acquisition,” IEEE Trans. Commun. 52 (4) 572 - 583    DOI : 10.1109/TCOMM.2004.826354
Scaglione A. , Barbarossa S. , Giannakis G.B. 1998 “Self-Recovering Equalization of Time-Selective Fading Channels Using Redundant Filterbank Precoders,” Proc. DSP Workshop Bryce Canyon, UT, USA
Goldsmith A. 2005 “Wireless Communications” Cambridge University Press Cambridge, UK    DOI : 10.1017/CBO9780511841224
Tse D. , Viswanath P. 2005 “Fundamentals of Wireless Communication” Cambridge University Press Cambridge, UK 332 - 382    DOI : 10.1017/CBO9780511807213
Yoo T. , Goldsmith A. 2006 “Capacity and Power Allocation for Fading MIMO Channels with Channel Estimation Error” IEEE Trans. Inf. Theory 52 (5) 2203 - 2214    DOI : 10.1109/TIT.2006.872984
Jeffrey A. , Zwillinger D. 2007 “Table of Integrals, Series, and Products” 7th Edition Elsevier Academic Press Amsterdam, Netherlands 1 - 23