Efficient Channel Estimation in Massive MIMO Partially Centralized Cloud-Radio Access Network Systems

This article investigates the channel estimation problem in massive MIMO partially centralized cloud-RAN (MPC-RAN). Channel estimation is realized through a compressed data method that minimizes the huge pilot overhead, and this is then combined with the parallel Givens data projection method (PGDPM) to form a semi-blind estimator. The improved minimum mean square error (MMSE), fast data projection method (FDPM), compressed data, and PGDPM techniques are compared and analysed in terms of the achievable normalized mean square error (NMSE) in MPC-RAN. The PGDPM-based estimator had the lowest NMSE. The FDPM- and PGDPM-based methods are comparable in performance, with the PGDPM-based estimator having a slight edge over the FDPM-based estimator. This supports the PGDPM-based estimator as a method for channel estimation, since it compresses the massive MIMO channel information, thereby mitigating the finite fronthaul capacity problem, and at the same time lends itself to efficient parallelization for optimal BBU resource utilization.

heads (RRH), fronthaul becomes the limiting factor due to its inherent finite capacity (Francis & Fettweis, 2018). One of the anticipated solutions to the finite fronthaul capacity is to split functions such that some are performed at the RRH and others at the baseband unit (BBU). In this architecture, the RRH is tasked with performing basic functions such as beamforming, and the BBU is left to perform digital functions like channel estimation. This makes fronthaul traffic largely dependent on user terminal (UT) data rates and not on the number of antennas (Francis & Fettweis, 2019; J. Park, Kim, Carvalho, & Manch, 2017). The result is a massive MIMO partially centralized cloud-radio access network (MPC-RAN) (S. Park, Lee, Chae, & Bahk, 2017).
When paired with distributed cooperation for the case where RRHs are interconnected, partial centralization significantly mitigates the capacity constraint and latency on the fronthaul of MPC-RANs. The common notion is therefore to make the topology adaptive, so as to strike a balance between the constraints of the fronthaul and the complexity of distributed cooperative processing (Peng, Wang, Lau, & Poor, 2015). The BBU's cooperative processing is intended to suppress inter-RRH interference through the use of the channel state information (CSI) from both the RRHs and the wireless fronthaul (S. H. Park, Simeone, Sahin, & Shamai, 2014).

Related work
An improved minimum mean square error (MMSE) channel estimator combined with a sub-space tracking algorithm, the fast data projection method (FDPM), is presented in (Mukubwa & Sokoya, 2020b) to construct a semi-blind channel estimator for a massive MIMO time division duplex (TDD) network with pilot contamination. The authors replace the matrix inversion in the MMSE channel estimator with matrix multiplication and addition, and later combine it with a sub-space tracking algorithm to create a semi-blind channel estimator.
Channel estimation using compressive sensing techniques in C-RAN is explored in (Xu, Rao, & Lau, 2015). A novel algorithm that minimizes the uplink training overhead through compressive sensing is formulated. To realize this, the authors transform the channel estimation problem into a compressive sensing problem. They then modify the Bayesian compressive sensing algorithm to facilitate C-RAN channel estimation, capitalizing on the sparsity of active users, the varied effects of path loss, and the combined sparsity structures present in an uplink C-RAN network with multiple antennas.
Compressive channel estimation with low-complexity methods is demonstrated in (He, Quek, Chen, Zhang, & Li, 2018). The work employs compressed sensing, leveraging user activity sparsity within the C-RAN to perform channel estimation with minimized pilot overhead. The accuracy of channel estimation is further improved through an iteratively re-weighted strategy, with guidelines provided to assist in the choice of tuning parameters. The process is optimized using three low-complexity techniques to offer differentiated services under distinct computing setups.
In (He, Quek, Chen, & Li, 2017) a penalty functional is formulated to solve the channel estimation problem in C-RAN. An efficient algorithm with guaranteed convergence is formulated to speed up processing. This algorithm relies on the alternating direction method of multipliers and a variable splitting technique.
In (Mukubwa & Sokoya, 2020a) the authors present the channel estimation problem in massive MIMO partially centralized cloud-RAN. Noting that user activity in massive MIMO partially centralized cloud-RAN is sparse, they solve the channel estimation problem with a compressed data method that minimizes the huge pilot overhead. The authors use compressed channel state information (CSI) to approximate the covariance matrix for the MPC-RAN system, following the method in (Chen, Lyu, & King, 2017). The approximation of the covariance matrix uses compressed data based on a weighted sampling structure. This strategy is data aware: the most significant entries are explored, allowing good approximation accuracy with fewer entries.
In this paper, the authors model a highly parallelizable sub-space channel estimation technique to be implemented at the BBU. The compressed data method in (Mukubwa & Sokoya, 2020a) is then combined with the sub-space model to create a semi-blind channel estimation technique that both compresses the uplink pilot data and is highly parallelizable, so as to exploit the multicore resources at the BBU. The method is then validated on simulated data in comparison with the conventional methods.
The main contributions of this work are:
• The improvement on the Givens rotation and the formulation of an algorithm that is highly parallelizable to exploit multicore scenarios.
• The algorithm created is combined with the data compression technique formulated in (Mukubwa & Sokoya, 2020a) to realize a semi-blind channel estimator.
• The authors avoid the pilot overhead by relying on initial channel state information to realize the channel estimation process.

Organization
The remainder of this paper is divided into seven sections. First, the authors introduce the system model adopted for this work. They then discuss the improved MMSE channel estimator, followed by a semi-blind channel estimator based on FDPM, after which they look at the compressed data channel estimator and then the semi-blind channel estimator based on GDPM.
The numerical findings and analysis are subsequently provided to demonstrate how the modelled channel estimators perform against one another. The paper finishes by presenting the conclusion.
Notation: lower-case and upper-case boldface letters denote vectors and matrices, respectively; $(\cdot)^{T}$, $(\cdot)^{H}$, $(\cdot)^{-1}$, and $\mathrm{tr}(\cdot)$ denote the transpose, conjugate transpose, matrix inversion, and trace, respectively; $\mathbb{C}$ denotes the set of complex numbers and $\mathbb{C}^{M \times K}$ the corresponding set of $M \times K$ matrices; $x_{ji}^{t}$ stands for the $(j, i)$th element of $\mathbf{X}_t$. $\|\mathbf{X}\|_2$ and $\|\mathbf{X}\|_F$ represent the spectral and Frobenius norms, respectively, and $\|\mathbf{x}\|_q = \left(\sum_{m=1}^{M} |x_m|^q\right)^{1/q}$, where $q \ge 1$, stands for the $\ell_q$ norm of $\mathbf{x} \in \mathbb{C}^{M}$. The authors also take $\mathcal{D}(\mathbf{x})$ to represent a square diagonal matrix whose main diagonal holds the elements of $\mathbf{x}$, while $\mathcal{D}(\mathbf{X})$ is a square diagonal matrix whose main diagonal holds only the diagonal elements of $\mathbf{X}$.

SYSTEM MODEL
The authors presume an MPC-RAN system with $L$ RRHs, each of which has $M$ transmitting antennas and serves $K$ single-antenna user terminals (UTs). They further propose that the time division duplex (TDD) protocols are coordinated through the RRHs so that pilot signals and data are relayed to all RRHs simultaneously.
The pilots initially transmitted by the UTs in the $\ell$th RRH are identical and are given by $\boldsymbol{\psi} = [\boldsymbol{\phi}_{j,1}, \boldsymbol{\phi}_{j,2}, \ldots, \boldsymbol{\phi}_{j,K}]$, where $\boldsymbol{\phi}_{j,k}$ corresponds to the pilot used by the $k$th UT in each RRH and $\|\boldsymbol{\phi}_{j,k}\|_2 = 1$.
The channel from the $k$th UT within the $j$th RRH is given as $\mathbf{h}_{j,k} \in \mathbb{C}^{M}$. The channel vectors are assumed to fade and are modelled as $\mathbf{h}_{j,k} \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_{j,k})$ (1), where $\mathbf{R}_{j,k}$ represents the covariance matrix corresponding to the $k$th UT from the $j$th RRH. The authors further assume Rayleigh fading with no correlation between UTs, with $\mathbf{R}_{j,k} = \beta_{j,k}\mathbf{I}_M$. It is suggested in (Viering, Hofstetter, & Utschick, 2002) that $\mathbf{R}_{j,k}$ varies slowly over time compared to $\mathbf{h}_{j,k}$.
For this work the authors assume that $\mathbf{R}_{j,k}$ is constant across the transmission bandwidth and changes gradually over time. The received training sequences $\mathbf{Y}_j$, with $M$ rows, are then given by (2), where $\mathbf{Z}_j$, also with $M$ rows, represents the AWGN noise matrix and $\boldsymbol{\psi}$ is the pilot matrix collecting the sequences transmitted by the $K$ UTs.
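The system model above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' setup: it uses a single RRH (so no pilot contamination), a pilot length `tau`, equal large-scale fading `beta`, DFT-based orthonormal pilots, and a simple least-squares de-spreading step, all of which are choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, tau = 64, 10, 16        # antennas, UTs, pilot length (tau is an assumption)
beta = np.ones(K)             # large-scale fading per UT (assumed equal)
snr_db = 10.0                 # assumed training SNR

# Orthonormal pilots: K rows of a unitary tau x tau DFT matrix, one row per UT.
dft = np.fft.fft(np.eye(tau)) / np.sqrt(tau)
Psi = dft[:K, :]              # K x tau, every row has unit norm

# Rayleigh channel: column k is drawn from CN(0, beta_k * I_M).
H = np.sqrt(beta / 2) * (rng.standard_normal((M, K))
                         + 1j * rng.standard_normal((M, K)))

# Additive white Gaussian noise with variance set by the SNR.
noise_var = 10 ** (-snr_db / 10)
Z = np.sqrt(noise_var / 2) * (rng.standard_normal((M, tau))
                              + 1j * rng.standard_normal((M, tau)))

# Received training block and a least-squares estimate obtained by de-spreading.
Y = H @ Psi + Z
H_ls = Y @ Psi.conj().T
```

Because the pilot rows are orthonormal, `H_ls` equals `H` plus projected noise, which is the kind of initial estimate that the later estimators refine.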

IMPROVED MMSE CHANNEL ESTIMATION
The MMSE approximation still requires matrix inversion, so the inversion is substituted with the rapid numerical algorithm (RNA) method. The RNA-based approximation completely side-steps the inversion of the matrix and instead makes use of multiplication and addition (Mukubwa & Sokoya, 2020b). The Schulz iterative method is invoked for inverting a matrix as per (Ben-Israel, 1965; Li, Huang, Zhang, Liu, & Gu, 2011). It is then combined with the approximation in (Isaacson & Keller, 1994) to realize the inversion process in MMSE with addition and multiplication of matrices as per (Mukubwa, Sokoya, & Ilcev, 2017), which allows the iterative process to be parallelized efficiently. The resulting channel estimate is given by (3).
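The idea of replacing an inversion with multiplications and additions can be sketched with the basic Schulz iteration $\mathbf{X}_{n+1} = \mathbf{X}_n(2\mathbf{I} - \mathbf{A}\mathbf{X}_n)$. This is only an illustration of the general technique; the RNA variant used in the cited work may differ. The function name `schulz_inverse`, the iteration count and the test matrix are assumptions made here.

```python
import numpy as np

def schulz_inverse(A, iters=30):
    """Approximate inv(A) using only matrix products and sums.

    Schulz (Newton) iteration: X <- X (2I - A X).
    """
    n = A.shape[0]
    I = np.eye(n, dtype=A.dtype)
    # Classical starting guess that guarantees convergence for nonsingular A.
    X = A.conj().T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    for _ in range(iters):
        X = X @ (2 * I - A @ X)
    return X

rng = np.random.default_rng(1)
M = 32
G = (rng.standard_normal((M, 2 * M))
     + 1j * rng.standard_normal((M, 2 * M))) / np.sqrt(2)
# Hermitian positive definite matrix, shaped like "covariance + noise power".
A = G @ G.conj().T / (2 * M) + 0.1 * np.eye(M)
A_inv = schulz_inverse(A)
```

Each iteration consists of two matrix products, which map well onto parallel hardware; the price is that the number of iterations grows with the conditioning of the matrix.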

FDPM-BASED SEMI-BLIND MODEL FOR CHANNEL ESTIMATION
It was pointed out in (Mukubwa & Sokoya, 2020b) that linear channel estimators like the MMSE and its variants rely on pilot sequences to estimate channels. Consequently, many UTs reuse pilot training sequences, which leads to pilot contamination and degrades the efficiency of the wireless network. This is compounded in the massive MIMO scenario, hence the need for approximation methods that give precise CSI estimates from fewer pilots than traditional channel estimation methods. Semi-blind channel approximation methods have been found to be effective in minimizing pilot contamination (Quoc Ngo & Larsson, 2012). The semi-blind estimators are based on EVD algorithms, with fewer pilots needed for estimating the channels. Asymptotic orthogonality of UTs can be used as an alternative to resolve the ambiguity matrix by invoking the law of large numbers. This is achieved using the SVD form. SVD-based approximation usually gives a better estimate than EVD-based approximation (Hu, Lv, & Lu, 2013), while both approaches exhibit $\mathcal{O}(M^3)$ computational complexity relative to the dimension of the received signal.
This makes such schemes untenable in large MIMO networks where there is a significant number of BS antennas. Subspace tracking algorithms have been suggested to reduce the complexity. The fast data projection method (FDPM) was proposed in (Doukopoulos, Moustakides, & Member, 2008); it simplifies the iteration over the correlation matrix with a view to determining the ambiguity matrix. It has a lower complexity of $\mathcal{O}(MK)$ with better tracking outcomes. To realize a semi-blind channel estimator, (Mukubwa & Sokoya, 2020b) combine the improved MMSE channel estimator and the FDPM sub-space tracking. The initial channel estimate from the improved MMSE is fed into the FDPM to reduce the pilots required for channel estimation. The short training sequence in (2) is then utilized to obtain the ambiguity matrix, which is computed as per the algorithm listed in Table 1.
The signal subspace $\mathbf{W}(n) \in \mathbb{C}^{M \times K}$ corresponding to the $n$th sample is tracked as in Table 1, where the forgetting factor controls the effect of the old data and $N_{\text{data}}$ denotes the duration of the signals transmitted without pilots. The approximate ambiguity matrix, $\mathbf{U}_s$, is obtained from the tracked $\mathbf{W}(N_{\text{data}})$, and the approximate channel matrix $\hat{\mathbf{H}}$ is then determined from it. This gives rise to the FDPM-based semi-blind estimator, which is referred to simply as the FDPM estimator in this work.

COMPRESSED DATA CHANNEL ESTIMATION
According to (Mukubwa & Sokoya, 2020a), the estimation of the covariance matrices relies on pilot samples arriving at the RRH. They investigate the approximation of the needed covariance information by the BBU and the impact of these estimates. The procedure is repeated here for convenience. The channel estimated by MMSE is computed as in (6). Computing the MMSE approximation of $\mathbf{h}_{j,k}$ at the $j$th RRH from (6) requires the corresponding covariance matrices. Bearing in mind that these are (quite large) $M \times M$ matrices, they assume regularization of the estimates as per (Ledoit & Wolf, 2004; Shariati, Bjornson, Bengtsson, & Debbah, 2014).
Since the use of MPC-RAN results in high-dimensional data, the authors pointed to the huge communication and storage resources needed to compute these covariance matrices. This underlined the need for enormous bandwidth and power resources to transmit the CSI data from the RRHs to the BBU. To mitigate this issue, a partial centralization of the C-RAN system with interconnected and cooperative massive MIMO RRHs (MPC-RAN) was proposed in accordance with (Peng et al., 2015). This rendered the fronthaul traffic largely dependent on UT data rates and not on the number of antennas. Compressed data was utilized to calculate the covariance matrix through the via-Q method in (Bjornson et al., 2016).
The data is projected back into the original space by $\mathbf{S}_{j,k}\mathbf{S}_{j,k}^{T}\mathbf{y}$, and the derived data is then used in the covariance matrix approximation. At least $M - Z$ elements are excluded from the $k$th vector by the weighted sampling matrix $\mathbf{S}_{j,k}$; the remaining ones are kept because they are likely to be the most informative. If the sampling probabilities are carefully constructed, the unbiased estimator $\hat{\boldsymbol{\varphi}}_{j,k}$ will perform accurately with respect to the matrix spectral norm (Achlioptas et al., 2013; Gittens, 2011; Pourkamali-Anaraki, 2016).
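The compress-then-project-back step can be sketched as follows. This is a simplified illustration rather than the estimator of the cited works: the sampling probabilities are taken proportional to the entry magnitudes, and the rescaling that makes the covariance estimator unbiased is deliberately omitted. The function names and the sizes `M` and `Z` are hypothetical.

```python
import numpy as np

def weighted_sample(y, Z, rng):
    """Keep Z of the M entries of y, drawn without replacement with
    probability proportional to |y_i|. Returns kept indices and values."""
    p = np.abs(y) / np.abs(y).sum()
    idx = np.sort(rng.choice(y.size, size=Z, replace=False, p=p))
    return idx, y[idx]

def project_back(idx, values, M):
    """Counterpart of S S^T y: a length-M vector that is zero everywhere
    except at the sampled positions."""
    y_back = np.zeros(M, dtype=values.dtype)
    y_back[idx] = values
    return y_back

rng = np.random.default_rng(2)
M, Z = 64, 16
y = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

idx, x = weighted_sample(y, Z, rng)   # x (length Z) plus idx cross the fronthaul
y_back = project_back(idx, x, M)      # reconstruction used at the BBU
```

Only `Z` values and their indices need to be sent per vector instead of `M` values, which is the source of the fronthaul saving discussed in the text.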
The weighted sampling invoked is strong enough to explore the most appropriate entries so as to reduce the estimation error. The compressed data $\mathbf{X}$, the indices used for sampling and the associated parameters, including $\alpha$, are then transmitted from the RRH to the BBU and used to construct the unbiased covariance matrix estimator from the compressed data. Because of imperfect knowledge of the correlation matrix, they conducted robust approximation by experimental optimization of the parameter $\alpha$. With advances in computing, it is possible to manipulate vectors of length $\mathcal{O}(M)$ in memory. Thus, compression of data by weighted sampling requires a single pass from the RRH to the BBU when moving data to memory. This makes the algorithm suited to streaming data and therefore suitable for use in MPC-RAN systems.
The estimator is unbiased, and the diagonal matrix associated with the sampling matrix $\mathbf{S}_{j,k}$ has at most $Z$ non-zero elements on its diagonal.
To estimate $\mathbf{R}_{j,k} \in \mathbb{C}^{M \times M}$ they followed the same approach used for $\boldsymbol{\varphi}_{j,k}$. The goal was to obtain observations of $\mathbf{h}_{j,k}$ with minimal interference from other UTs. It was pointed out in (Bjornson et al., 2016; Yin, Gesbert, Filippou, & Liu, 2013) that the UT can employ a set of unique orthogonal pilots to carry out a training phase for $\mathbf{R}_{j,k}$. The authors assumed that the $j$th RRH has $N_R$ observations of the noisy $\mathbf{h}_{j,k}$, which form the basis for constructing the approximate covariance matrix $\hat{\mathbf{R}}_{j,k}$. This simply means more data transmission over the fronthaul from the RRH to the BBU and more computation. The authors adopted the via-Q method presented in (Bjornson et al., 2016). This gives rise to the compressed data channel estimator, which is referred to simply as the compressed data estimator in this work.

GDPM-BASED SEMI-BLIND MODEL FOR CHANNEL ESTIMATION
It was stated earlier that, since the use of MPC-RAN results in high-dimensional data, covariance matrix computation requires huge communication and storage resources. It is also worth noting that computing the covariance matrices takes considerable computation time. However, we assume that the estimation of the channel is done at the BBU. At the BBU, the computing resources are enormous thanks to the availability of multicore processing, and with well thought-out parallelization the computation time can be minimized and the channel estimation process thereby hastened. This can be further enhanced with the right choice of parallelization architecture (Liu, Sohl, & Wang, 2010) and of software architectures and load balancing of protocol stacks (Showk & Bilgic, 2013).
To this end, we use the compressed data estimation technique to provide the initial estimate of the channel and then focus on subspace estimation methods to approximate the CSI. Widely used subspace methods include the DPM and the FDPM (Doukopoulos & Moustakides, 2005). The DPM uses the Gram-Schmidt method to perform orthonormalization, while the FDPM performs orthonormalization via the Householder process. As a result, the DPM has $\mathcal{O}(MK^2)$ computational complexity, while the FDPM has $\mathcal{O}(MK)$ computational complexity. Neither of the two, however, parallelizes effectively, and both are thus more suited to single-core systems. Therefore, in order to achieve the effective parallelization sought in C-RAN multicore BBUs, we invoke the Givens orthonormalization mechanism, which has $\mathcal{O}(MK^2)$ computational complexity, to form the Givens data projection method (GDPM).
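The data projection idea behind these subspace trackers can be sketched as a projection step followed by an orthonormalization step. The sketch below is illustrative only: it uses NumPy's built-in QR factorization as a stand-in for the Gram-Schmidt step of the DPM, a normalized step size `mu_bar`, and synthetic data; none of these choices are taken from the paper's tables.

```python
import numpy as np

def dpm_track(X, K, mu_bar=0.5, seed=0):
    """Track a K-dimensional signal subspace from the columns of X (M x N)."""
    M, N = X.shape
    rng = np.random.default_rng(seed)
    W, _ = np.linalg.qr(rng.standard_normal((M, K))
                        + 1j * rng.standard_normal((M, K)))
    for n in range(N):
        x = X[:, [n]]                                   # M x 1 snapshot
        r = W.conj().T @ x                              # projection onto current basis
        mu = mu_bar / np.real(x.conj().T @ x).item()    # normalized step size
        T = W + mu * x @ r.conj().T                     # data projection step
        W, _ = np.linalg.qr(T)                          # orthonormalization step
    return W

# Synthetic data: K sources in a random M-dimensional subspace plus weak noise.
rng = np.random.default_rng(3)
M, K, N = 32, 4, 400
U, _ = np.linalg.qr(rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K)))
S = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
noise = 0.03 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
X = U @ S + noise

W = dpm_track(X, K)
# Fraction of the true subspace captured by the tracked basis (1.0 is a perfect match).
alignment = np.linalg.norm(U.conj().T @ W) ** 2 / K
```

The orthonormalization line is the only part that changes between the DPM, FDPM and GDPM variants, which is why its cost and its parallel structure dominate the comparison in the text.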
It is important to remember that a lower complexity, that is, a lower flop count, does not inherently mean that a process is superior to one with higher complexity. This is very significant when computing is done on a multicore machine such as the MPC-RAN BBU, because a system that parallelizes efficiently becomes superior in such scenarios (Ford, 2015). This forms the basis for our choice of the Givens orthonormalization method for channel estimation within the MPC-RAN.
Givens rotations are suited to calculations where it is important to selectively zero particular elements (Golub & Van Loan, 2013). Each rotation affects only two rows of the given matrix, so the order of rotations affecting different rows can be interchanged, which enables the use of parallel sets of rotations (Golub & Van Loan, 2013). This is the reason we said that the Givens transformation lends itself to successful parallelisation. The Givens transformation also comes in handy when a matrix must be updated after a row is added or a column is removed. Row addition corresponds to the addition of an RRH antenna as the set of RRH antennas interacting with a moving UT evolves, and column deletion corresponds to a UT dropping out of the network for some reason.
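The claim that rotations acting on different rows can be reordered is easy to check numerically. The sketch below builds two complex Givens rotations on disjoint row pairs and applies them in both orders; the helper names `givens` and `rotation` and the 4 x 3 test matrix are illustrative assumptions.

```python
import numpy as np

def givens(a, b):
    """Return (c, s) such that [[c, s], [-conj(s), conj(c)]] @ [a, b] = [r, 0]."""
    r = np.hypot(abs(a), abs(b))
    if r == 0:
        return 1.0 + 0j, 0j
    return np.conj(a) / r, np.conj(b) / r

def rotation(M, i, k, c, s):
    """Embed the 2 x 2 rotation in an M x M identity, acting on rows i and k."""
    G = np.eye(M, dtype=complex)
    G[i, i], G[i, k] = c, s
    G[k, i], G[k, k] = -np.conj(s), np.conj(c)
    return G

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

G1 = rotation(4, 0, 1, *givens(A[0, 0], A[1, 0]))   # zeros A[1, 0] using rows 0 and 1
G2 = rotation(4, 2, 3, *givens(A[2, 0], A[3, 0]))   # zeros A[3, 0] using rows 2 and 3

B = G1 @ G2 @ A          # same result as G2 @ G1 @ A
```

Since `G1` and `G2` touch disjoint rows they commute, so the two eliminations could be issued to different cores at the same time.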
Again following (Hu et al., 2013), the covariance matrix of the received signal, $\boldsymbol{\varphi}_y$, is determined first. Through SVD, $\boldsymbol{\varphi}_y$ is then decomposed to yield the signal subspace $\mathbf{U}_s$ and the subspace of the noise. Following (Hu et al., 2013), to compute the channel matrix $\mathbf{H}$ using $\mathbf{U}_s$ we use the scalar multiplicative ambiguity matrix $\mathbf{B} \in \mathbb{C}^{K \times K}$. We exploit the short training sequence in (2) to obtain the ambiguity matrix, using $\hat{\mathbf{H}}_{\text{Compressed}}$, the first approximation of the channel obtained from the compressed channel calculation in (27). The subspace tracking algorithm called the DPM in (Yang & Kaveh, 1988) was adopted, but Gram-Schmidt orthonormalization was replaced with the Givens orthonormalization method. Although this does not reduce the algorithm's complexity, it allows the algorithm to be parallelized efficiently, which is an essential feature in the MPC-RAN network.
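One way to picture the role of the ambiguity matrix is as a least-squares fit of the signal subspace to an initial channel estimate. The sketch below assumes the relation $\mathbf{H} \approx \mathbf{U}_s\mathbf{B}$ and resolves $\mathbf{B}$ as $\mathbf{U}_s^{H}\hat{\mathbf{H}}_{\text{init}}$; this particular formula, the noise levels and the variable names are assumptions made for illustration and are not claimed to be the expression used in the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
M, K, N = 32, 4, 500

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(M, K)                              # true channel (simulated)
Y = H @ crandn(K, N) + 0.05 * crandn(M, N)    # received data snapshots
H_init = H + 0.3 * crandn(M, K)               # stand-in for the initial estimate

# Sample covariance and its SVD; the first K left singular vectors
# span the estimated signal subspace.
phi_y = Y @ Y.conj().T / N
U, _, _ = np.linalg.svd(phi_y)
U_s = U[:, :K]

# Least-squares ambiguity matrix: B minimizes ||U_s B - H_init||_F.
B = U_s.conj().T @ H_init
H_hat = U_s @ B

err_init = np.linalg.norm(H_init - H)
err_hat = np.linalg.norm(H_hat - H)
```

Projecting the initial estimate onto the signal subspace discards the part of its error that lies in the noise subspace, which is why `err_hat` comes out smaller than `err_init` in this setting.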
By using the basic structure of the Givens rotation matrix, we compute the multiplication of a matrix: we compute the values of the parameters $c$ and $s$ and apply the resulting rotation to a matrix $\mathbf{A}$. The authors leverage this relation to develop an algorithm which, with permitted abuse of language, is called the serial GDPM (SGDPM) and is illustrated in Table 2.
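A serial Givens-based orthonormalization of the kind that could replace Gram-Schmidt in the DPM can be sketched as below. It is a generic classical Givens QR, offered as an illustration and not as a transcription of Table 2; the function name `givens_orthonormalize` and the test sizes are assumptions.

```python
import numpy as np

def givens_orthonormalize(T):
    """Thin QR of T (M x K, M >= K) by classical Givens rotations.

    Returns Q (M x K, orthonormal columns) and R (K x K, upper triangular).
    """
    M, K = T.shape
    R = T.astype(complex)               # working copy of the input
    Qh = np.eye(M, dtype=complex)       # accumulates the rotations (equals Q^H)
    for j in range(K):                  # column by column
        for i in range(M - 1, j, -1):   # zero R[i, j] from the bottom up
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(abs(a), abs(b))
            if r == 0:
                continue                # nothing to eliminate
            c, s = np.conj(a) / r, np.conj(b) / r
            G = np.array([[c, s], [-np.conj(s), np.conj(c)]])
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
            Qh[[i - 1, i], :] = G @ Qh[[i - 1, i], :]
    return Qh.conj().T[:, :K], R[:K, :]

rng = np.random.default_rng(7)
T = rng.standard_normal((12, 4)) + 1j * rng.standard_normal((12, 4))
Q, R = givens_orthonormalize(T)
```

Here the rotations are applied strictly one after another; the parallel variants discussed next reorder exactly these rotations without changing the result's orthonormality.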
The parameters $c$ and $s$ are computed as in the algorithm in Table 3. The process adapted above is the classical Givens rotation. This approach can be further improved by the column-wise Givens rotation, where several elements of a column of the input matrix can be annihilated at once. This alteration has the advantage of fewer multiplications than the classical implementation (Merchant et al., 2014, 2018). It also has the ability to combine coarse- and fine-grained parallelism. We start by conditioning the input matrix $\mathbf{Y} \in \mathbb{C}^{M \times K}$, multiplying it with the initial matrix to give a matrix $\mathbf{A} \in \mathbb{C}^{M \times M}$. Assuming that $\mathbf{G} \in \mathbb{C}^{M \times M}$, the updated matrix is $\tilde{\mathbf{A}} = \mathbf{G}\mathbf{A}$. Thus, to remove the element in row $m$ and column 1, corresponding to $(m, 1)$, we apply one Givens transformation, and (25) can be rewritten accordingly; equation (31) is the generalized Givens rotation.
The column-wise Givens rotation operates on one column per iteration, while the generalized Givens rotation operates column-wise and row-wise simultaneously in a single iteration to triangularize the matrix. This is illustrated in Figure 1.
Considering the theoretical number of iterations required to carry out each of these Givens rotation variants leads to the algorithm in Table 4. This algorithm forms the basis of the parallel GDPM (PGDPM) sub-space tracking and consequently of the PGDPM-based semi-blind estimator.
It is important to note that $j$ represents the particular RRH we are currently operating in, $m$ indexes the columns of the matrix $\mathbf{A}_j$ and $i$ indexes its rows, so that $i = 1, \ldots, M$ runs from the row-one update in the $m$th column up to the row-$M$ update in the $m$th column. Updating from the first row to the last row can be done simultaneously, thereby allowing concurrent updating of rows in a column. Iterations of the outer loop can also be performed concurrently, meaning that row changes in separate columns can be parallelised. In addition, the computation of the PGDPM for different RRHs in a multi-RRH system can be parallelised according to the above algorithm. This renders the PGDPM an effective algorithm for channel estimation at the BBU. The approximate channel matrix $\hat{\mathbf{H}}$ is then determined from the tracked subspace. This gives rise to the PGDPM-based compressed data semi-blind estimator, which is referred to simply as the GDPM estimator in this work.
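The kind of concurrency described above can be made concrete with a well-known scheduling of Givens rotations (the Sameh-Kuck ordering), in which all rotations inside one stage act on disjoint row pairs. This is a swapped-in textbook schedule used purely for illustration; it is not claimed to be the PGDPM algorithm of Table 4, and the function names and matrix sizes are assumptions.

```python
import numpy as np

def sameh_kuck_stages(M, K):
    """Group the eliminations of an M x K matrix (M > K) into stages.

    Entry (i, j), 0-based, is zeroed against row i - 1 at stage (M - 1 - i) + 2 * j.
    """
    stages = {}
    for j in range(K):
        for i in range(j + 1, M):
            stages.setdefault((M - 1 - i) + 2 * j, []).append((i, j))
    return [stages[t] for t in sorted(stages)]

def triangularize(A):
    """Reduce A to upper triangular form stage by stage."""
    R = A.astype(complex)
    M, K = R.shape
    stages = sameh_kuck_stages(M, K)
    for stage in stages:
        # Rotations within one stage touch disjoint row pairs, so this inner
        # loop is the part that could be dispatched to separate cores.
        for i, j in stage:
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(abs(a), abs(b))
            if r == 0:
                continue
            c, s = np.conj(a) / r, np.conj(b) / r
            G = np.array([[c, s], [-np.conj(s), np.conj(c)]])
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
    return R, stages

rng = np.random.default_rng(8)
M, K = 8, 3
A = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
R, stages = triangularize(A)

n_rotations = sum(len(stage) for stage in stages)   # sequential step count
n_stages = len(stages)                              # parallel step count
```

For this 8 x 3 example the 18 rotations collapse into 9 stages, which illustrates why an orthonormalization with a higher flop count can still finish sooner on a multicore BBU.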

NUMERICAL RESULTS AND ANALYSIS
In this section, we look at the performance indicators NMSE, SNR, reuse factor $f$ and $M$ for all the channel estimation techniques, viz. improved MMSE, compressed data and PGDPM. Trade-offs among these parameters are evaluated for the channel estimation schemes discussed for the uplink MPC-RAN, with the NMSE computed for each of the estimators.
Figure 2 compares the attainable NMSE vs. the number of RRH antennas in multicell MPC-RAN for the GDPM-based semi-blind estimator. Several observations can be made based on this figure. The efficiency of the GDPM-based semi-blind estimator improves as the number of RRH antennas increases, as well as with an increase in the reuse factor. This is expressed by a decrease in the NMSE when the number of RRH antennas increases. It is also clear that the NMSE decreases with an increase in the reuse factor.
Figure 3 shows that the semi-blind estimation technique based on GDPM has the lowest NMSE at $f = 1$, followed by FDPM and then RNA. The FDPM closely follows the GDPM. From Figure 4, it is clear that at $f = 4$ the GDPM-based semi-blind estimation technique still outperforms the conventional FDPM and RNA estimators. However, the NMSE is lower overall at $f = 4$ than it is at $f = 1$.
Next, we provide an NMSE, SNR and $M$ comparison and analysis of the RNA, compressed data, FDPM and GDPM channel estimation techniques in MPC-RAN. This comparison is performed for an MPC-RAN network with $K = 10$ and $M$ ranging from 16 to 160 in increments of 16, while the SNR ranges from 0 dB to 20 dB in 2 dB steps.
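The evaluation grid and the error metric can be sketched as follows. The NMSE definition used here, $\|\hat{\mathbf{H}} - \mathbf{H}\|_F^2 / \|\mathbf{H}\|_F^2$, is the standard one and is assumed rather than taken from the paper's equation, and the "estimator" in the loop is only a placeholder (the true channel plus noise) so that the sweep is runnable.

```python
import numpy as np

def nmse(H_hat, H):
    """Normalized mean square error ||H_hat - H||_F^2 / ||H||_F^2."""
    return np.linalg.norm(H_hat - H) ** 2 / np.linalg.norm(H) ** 2

antennas = np.arange(16, 161, 16)   # M = 16, 32, ..., 160
snr_db = np.arange(0, 21, 2)        # 0 dB, 2 dB, ..., 20 dB
K = 10

rng = np.random.default_rng(6)
table = np.empty((antennas.size, snr_db.size))
for a, M in enumerate(antennas):
    H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    for b, snr in enumerate(snr_db):
        sigma = 10 ** (-snr / 20)   # noise standard deviation at this SNR
        E = sigma * (rng.standard_normal((M, K))
                     + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
        table[a, b] = nmse(H + E, H)   # placeholder estimator: truth plus noise

avg_over_snr = table.mean(axis=1)      # one averaged NMSE value per antenna count
```

Replacing the placeholder line with any of the estimators discussed would produce the per-SNR curves and the SNR-averaged curves over the antenna range.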
Figures 5 and 6 depict the NMSE performance as the SNR varies for the respective channel estimators. It can be observed that at lower SNR the NMSE is high, but as the SNR increases, which indicates improving channel conditions, the NMSE reduces for all the channel estimators. Again, it is observed that the semi-blind channel estimators have a better NMSE than the linear estimators, even though they need fewer pilots for estimation. This points to good estimation performance together with better resilience to pilot contamination. To obtain the average NMSE, we take the NMSE over the RRH antenna range at a particular value of $f$ for each SNR between 0 dB and 20 dB in 2 dB steps. Next, we average this NMSE over all the SNR values considered at a given $f$ over the range of RRH antennas. This yields the NMSE over a given RRH antenna range for a specified $f$, plotted as shown in Figures 7 and 8. Figure 7 summarizes the NMSE against the number of RRH antennas for RNA, compressed data, FDPM and GDPM for a reuse factor of 1. As the number of RRH antennas increases, the NMSE decreases, since the channel estimation improves due to the channel hardening phenomenon. Again, the GDPM and FDPM channel estimates have a lower NMSE than the RNA-MMSE and compressed data channel estimates, because the semi-blind channel estimation techniques are superior to the linear estimation techniques. However, as the number of RRH antennas increases, the NMSE of the RNA and compressed data channel estimates approaches that of the FDPM and GDPM channel estimates, since the approximation improves with the number of antennas due to channel hardening.
The reuse factor is set to 2 and 4 in Figure 8 and Figure 9, respectively. The NMSE is reduced compared to the case when the reuse factor is set to 1, and a reuse factor of 2 gives a higher NMSE than a reuse factor of 4. This can be attributed to the fact that, as the reuse factor increases, pilot contamination reduces, which improves the channel estimation process and leads to a reduction in NMSE for all of the RNA, compressed data, FDPM and GDPM channel estimation techniques.
Once again, it can be noted that the FDPM has a slightly higher NMSE than the GDPM, pointing to the fact that although GDPM has a higher complexity, its performance is superior to that of FDPM. We can therefore offset the complexity of GDPM with parallelization and hence exploit its superior channel estimation performance. Another important observation is that when the NMSE is averaged over 0 dB to 20 dB SNR, the resultant values of NMSE are lower than in the case when SNR is not factored in, for all the channel estimation techniques. This is expected, since when the network conditions improve, the estimation of the channel improves because pilot contamination consequently reduces.

CONCLUSION
The paper gives the performance analysis and comparison of the RNA, compressed data, FDPM and GDPM channel estimators for the MPC-RAN system. The performance of the channel estimation schemes is studied in terms of the number of RRH antennas and the NMSE. The NMSE was derived theoretically for each of the channel estimation schemes under similar assumptions for the MPC-RAN system. The NMSE for the GDPM estimator is lower than that of the FDPM, RNA and compressed data estimators. This motivates further study of parallelization architectures for the GDPM estimator to enhance its applicability in MPC-RAN networks. For NMSE averaged over 0 dB to 20 dB SNR, increasing the number of antennas improves the NMSE performance of the FDPM, RNA and compressed data estimators to near that of GDPM; this is attributed to better approximation as the number of antennas increases due to the channel hardening phenomenon, and to improved channel conditions at high SNR. Future work will look at parallelization structures on multicore platforms to best implement the GDPM estimator and make it more efficient in MPC-RAN applications. This will offer better estimation with reduced data size and fewer pilots, together with optimal channel estimation at the BBU due to highly efficient parallelization.

| No. | Conventional channel estimation methods | PGDPM-based estimator |
|---|---|---|
| 1 | Linear methods rely on pilots for channel estimation, increasing pilot contamination through pilot overheads. | Uses fewer pilots, mostly the initial estimate from the linear model, for channel estimation, reducing pilot contamination. |
| 2 | Do not work with compressed data; this overstretches the backhaul link in the MPC-RAN system, causing congestion. | Relies on compressed data techniques to reduce the data sent over the backhaul link, hence minimizing congestion. |
| 3 | The parallelization efficiency in terms of hardware implementation, as per the algorithms used, is limited. | Offers great parallelization in terms of hardware implementation by virtue of the highly parallelizable algorithm in Table 5, further reducing its complexity. |
| 4 | The conventional channel estimation methods have a high NMSE, thus giving reduced efficiency in the MPC-RAN system. | The PGDPM has a lower NMSE than the conventional methods, hence improving the efficiency of the MPC-RAN system. |