Robust Beamforming Design for Ultra-dense User-Centric C-RAN in the Face of Realistic Pilot Contamination and Limited Feedback

The ultra-dense cloud radio access network (UD-CRAN), in which remote radio heads (RRHs) are densely deployed in the network, is considered. To reduce the channel estimation overhead, we focus on the design of robust transmit beamforming for user-cen…

Authors: Cunhua Pan, Hong Ren, Maged Elkashlan

Cunhua Pan, Hong Ren, Maged Elkashlan, Arumugam Nallanathan, Fellow, IEEE, and Lajos Hanzo, Fellow, IEEE

C. Pan, M. Elkashlan and A. Nallanathan are with the Queen Mary University of London, London E1 4NS, U.K. (e-mail: {c.pan, maged.elkashlan}@qmul.ac.uk). H. Ren was with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China. She is now with the Queen Mary University of London, London E1 4NS, U.K. (e-mail: renhong@seu.edu.cn). L. Hanzo is with the School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, U.K. (e-mail: lh@ecs.soton.ac.uk).

Abstract

The ultra-dense cloud radio access network (UD-CRAN), in which remote radio heads (RRHs) are densely deployed in the network, is considered. To reduce the channel estimation overhead, we focus on the design of robust transmit beamforming for user-centric frequency division duplex (FDD) UD-CRANs, where only limited channel state information (CSI) is available. Specifically, we conceive a complete procedure for acquiring the CSI that includes two key steps: channel estimation and channel quantization. The phase ambiguity (PA) is also quantized for coherent cooperative transmission. Based on the imperfect CSI, we aim for optimizing the beamforming vectors in order to minimize the total transmit power subject to users' rate requirements and fronthaul capacity constraints. We derive the closed-form expression of the achievable data rate by exploiting the statistical properties of multiple uncertain terms. Then, we propose a low-complexity iterative algorithm for solving this problem based on the successive convex approximation technique. In each iteration, the Lagrange dual decomposition method is employed for obtaining the optimal beamforming vector. Furthermore, a pair of low-complexity user selection algorithms are provided to guarantee the feasibility of the problem. Simulation results confirm the accuracy of our robust algorithm in terms of meeting the rate requirements. Finally, our simulation results verify that using a single bit for quantizing the PA is capable of achieving good performance.

I. INTRODUCTION

Ultra dense networks (UDNs), where more and more small base stations (BSs) are deployed within a given area, have been widely regarded as one of the most promising techniques of achieving a high system throughput [1]. In UDNs, the average distance between small BSs and users can be dramatically reduced, which can translate into improved link reliability. However, since all small BSs reuse the same frequency, the users are also exposed to severe inter-cell interference, which is a major performance limiting factor. Hence, the interference should be judiciously managed in order to reap the potential benefits of UDNs.

Fig. 1. Illustration of a UD-CRAN with nine RRHs and six UEs, i.e., $I = 9$, $K = 6$. The RRHs are connected to the BBU pool through fronthaul links. To reduce the complexity, each UE is served by the RRHs within the dashed circle centered around the UE.

As a result, the cloud radio access network (CRAN) concept has been recently proposed as a promising network architecture [2]. In CRAN, all the signal processing tasks are performed at the BBU pool, and all the conventional small BSs are replaced by low-cost low-power RRHs, which are only responsible for simple transmission/reception functions.
The RRHs are connected to the BBU pool through the fronthaul links to support the centralized signal processing. Hence, the interference in the network can be effectively mitigated by employing the coordinated multipoint (CoMP) technique. Furthermore, due to their low-complexity functionalities, the mobile operator can densely deploy the RRHs at a low capital cost. Hence, the CRAN architecture is an ideal platform for supporting UDNs. This kind of network is generally termed a UD-CRAN [3], [4]. A simple example of a UD-CRAN is illustrated in Fig. 1, where the number of RRHs is larger than that of the UEs.

Most of the existing contributions tend to deal with the various technical issues of conventional CRAN with a limited number of RRHs based on the assumption of the availability of perfect CSI [5]–[12]. In particular, Luong et al. [11] considered the transmit power minimization problem for the downlink of C-RANs with limited fronthaul capacity, where a pair of novel iterative algorithms were proposed for solving this problem. In the first one, the classic successive convex approximation framework was adopted for approximating the continuous nonconvex constraints, and the problem was converted into a mixed-integer second order cone program (MI-SOCP). By relaxing the binary variables to continuous-valued variables, the second algorithm, which is based on the so-called inflation procedure, was proposed; it only has to solve a series of SOCP problems. Most recently, the same authors considered in [12] the tradeoff between the achievable sum-rate and total power consumption by using the radical multiobjective optimization concept, where the optimization problem was formulated as a mixed-integer nonconvex program.
The authors proposed a branch-and-reduce-and-bound (BRB) based algorithm for finding the globally optimal solution for benchmarking purposes, and also provided low-complexity iterative algorithms similar to the ones in [11].

However, the most challenging issue in UD-CRANs is that a large amount of CSI is required for facilitating CoMP transmission. The acquisition of the CSI requires a large amount of training resources that escalate rapidly with the network size. One of the most promising solutions is to consider the availability of only partial CSI. Specifically, each user only has to estimate the CSI of the links from the RRHs in its serving cluster (termed intra-cluster CSI), while only measuring the large-scale channel gains (such as path loss and shadowing) of the links from the RRHs beyond its serving cluster (termed inter-cluster CSI). For the example in Fig. 1, UE 1 only needs to estimate the CSI from RRHs 1, 2 and 3 to itself, while only the large-scale channel gains are required for the RRHs outside of its cluster. The methods developed in [5]–[10] under the assumption of perfect CSI cannot be applied to this scenario.

Recently, the transmission design relying on partial CSI has attracted extensive research interest [13]–[16]. In particular, a novel compressive CSI acquisition method was proposed in [13] that can adaptively determine the set of instantaneous CSIs that should be estimated. The weighted sum-rate maximization problem was considered in [14], where the Cauchy-Schwarz inequality was employed for deriving a lower bound on the data rate. The threshold-based channel matrix sparsification method was proposed in [15] for a UD-CRAN, where the authors demonstrated that only a negligible performance loss is caused by discarding the channel matrix entries below a certain threshold.
Finally, in our recent work [16], we proposed a unified framework to deal with the challenges arising in UD-CRAN, and Jensen's inequality was utilized to obtain a tighter lower bound on the achievable rate than that in [14].

However, in [13]–[16], perfect intra-cluster CSI was assumed to be available at the BBU pool, which is unrealistic for UD-CRANs, especially when the network operates in the frequency division duplex (FDD) mode [17], which is the focus of this paper. Tran et al. [18] considered the queue-aware robust beamforming design to minimize the average transmission power in the face of imperfect CSI for the whole C-RAN, while satisfying the outage probability constraint of each user. The classic Lyapunov optimization theory was employed for ensuring the system's stability. The Bernstein-type inequality [19] was utilized for transforming the outage probability constraints into a more tractable form that facilitates the application of the semi-definite relaxation (SDR) approach. However, the channel error model of [18] only captures the channel estimation error.

In FDD UD-CRAN, each user has to estimate the intra-cluster CSI based on the pilot sequences sent from the RRHs within the serving cluster. Then, the user selects a codeword from a pre-designed CSI codebook to quantize the estimated CSI and feeds back its index to the BBU pool through a dedicated feedback channel. This procedure imposes three kinds of channel imperfections: channel estimation error, CSI quantization error and feedback delay. Since UD-CRANs are usually deployed in a limited area such as shopping malls and stadiums where the users move slowly, the effect of channel feedback delay can be ignored [16]. However, the other two error sources are inevitable and remain a serious problem in UD-CRANs.
To estimate the intra-cluster CSI, the pilot sequences sent from the RRHs that belong to the same user's serving cluster should be mutually orthogonal so that the user can differentiate the channels associated with different RRHs. For the example in Fig. 1, since RRHs 1, 2 and 3 cooperatively serve UE 1, the pilot sequences sent from these RRHs should be mutually orthogonal. A direct method is to assign mutually orthogonal pilots to all the RRHs. However, the number of pilots then increases linearly with the number of RRHs, which is excessive in UD-CRANs. To save pilot resources, one should allow the RRHs serving no common user to reuse the same pilot. The authors of [20], [21] provided novel pilot reuse schemes based on graph theory for minimizing the total number of pilots required. In [22], Nguyen et al. proposed an iterative pilot allocation method for multicell massive MIMO networks, where the modified Hungarian method was adopted to solve the pilot allocation problem for each cell by fixing the pilot assignments of all the other cells. However, the beamforming direction was fixed and the computational complexity of the pilot assignment algorithm increases drastically with the number of cells.

It is commonly known that pilot reuse imposes non-negligible pilot contamination, which inevitably leads to a sizeable channel estimation error that cannot be eliminated. Hence, the channel estimation error should be taken into account when designing the transmission strategy. A robust beamforming design explicitly considering the channel estimation error was studied in our recent work [23] for time division duplex (TDD) UD-CRANs, where no channel quantization error is imposed as a benefit of the TDD channel's reciprocity.

Since coherent cooperative transmission among RRHs provides a higher spectral efficiency than non-coherent transmission, we consider the limited feedback scenario of the former transmission scheme.
To reduce the implementation complexity, the authors of [24], [25] advocated the per-RRH limited feedback strategy, where the estimated channels of the links from all the candidate RRHs to each user are quantized independently rather than jointly. However, this feedback strategy results in phase ambiguity (PA) [24]. To elaborate, the PA is the phase difference between the single-RRH channel direction information (CDI) and the single-RRH quantized CSI codeword. The PA has no impact on conventional single-cell channel quantization, but it affects the co-phasing of coherent joint transmission. Fortunately, it was shown in [24] that its adverse effect can be compensated by feeding back the PA information to the transmitter at the cost of a modest feedback overhead.

In this paper, we consider the robust downlink beamforming design of FDD UD-CRAN by taking into account all the channel uncertainties. Specifically, we aim for jointly optimizing the user-RRH associations and beamforming vectors for minimizing the total transmission power subject to users' rate requirements, fronthaul capacity constraints and per-RRH power constraints. This is a mixed-integer non-linear programming (MINLP) problem that is generally difficult to solve. For the imperfect CSI considered in this paper, in contrast to the constraints of (6) and (7) in [11], the SINR constraints cannot be transformed into an SOCP format. For the same reason, the BRB algorithm of [12], which aims for the globally optimal solution, cannot be used in the imperfect CSI case. Furthermore, for the low-complexity algorithms developed in [11], [12], one has to solve an MI-SOCP or SOCP problem in each iteration, which incurs a high computational complexity for UD-CRAN. Specifically, the contributions of this paper are summarized as follows:

1) We provide a complete and practical procedure for the BBU pool to acquire the CSI required for centralized signal processing, namely for both channel estimation and channel quantization.
To the best of our knowledge, this paper is the first attempt to unify these two steps into a joint framework. We derive the closed-form expression of the achievable data rate by exploiting the statistical characteristics of the channel estimation error, the per-RRH CDI, the PA quantization errors and the partial inter-cluster CSI.

2) To address the feasibility issue, we provide a pair of low-complexity user selection algorithms, namely the successive UE deletion method having a complexity order of $O(K)$ and a bisection based search method having a complexity order of $O(\log_2 K)$, where $K$ is the total number of users. Simulation results show that the former algorithm performs better than the latter, and only slightly worse than the exhaustive search based method, whose complexity increases exponentially with $K$. The performance loss is roughly 8% in the worst case.

3) Based on the feasible set of users given by the user selection algorithms, we propose a low-complexity iterative algorithm for solving the power minimization problem. Specifically, the non-smooth indicator function is approximated by a smooth concave real-valued fractional function, which is iteratively approximated by its first-order Taylor expansion. In contrast to [23], this paper additionally considers the impact of CSI quantization errors, hence the semi-definite relaxation approach developed in [23] cannot be guaranteed to generate a rank-one beamforming solution. Instead, we approximate the complex-valued useful signal part in the rate expression by its first-order Taylor expansion with the aid of the T-transform [26], which transforms complex-valued matrices and vectors into their real-valued equivalents. The transformed optimization problem becomes a convex one, and we derive the optimal beamforming vectors by employing the Lagrange dual decomposition method.
Then, the successive convex approximation (SCA) technique is used for iteratively updating the corresponding variables, which is guaranteed to converge. Note that [11], [12] provided the results of the first-order Taylor expansion for the complex-valued expressions without a strict proof. Furthermore, the special structure of the resultant sub-problem was not exploited there for developing a reduced-complexity algorithm that avoids the direct solution of the MI-SOCP or SOCP.

The rest of this paper is organized as follows. Section II presents the system model. Section III formulates a two-stage optimization problem. A low-complexity iterative algorithm is provided in Section IV to deal with the transmit power minimization problem once the users to be admitted have been selected. Two low-complexity user selection algorithms are presented in Section V. Extensive simulation results are given in Section VI. Finally, our conclusions are drawn in Section VII.

Notations: $\mathbb{E}_{x}\{y\}$ denotes the expectation of $y$ over the random variable $x$. $\mathcal{CN}(\mathbf{x}, \mathbf{\Sigma})$ denotes the complex Gaussian distribution with mean $\mathbf{x}$ and covariance $\mathbf{\Sigma}$. The set of complex numbers is denoted as $\mathbb{C}$. $\mathbf{I}$ and $\mathbf{0}$ are an identity matrix and a zero matrix, respectively. The transpose, conjugate transpose and pseudo-inverse of matrix $\mathbf{A}$ are denoted as $\mathbf{A}^T$, $\mathbf{A}^H$ and $\mathbf{A}^\dagger$, respectively. $\mathbf{B} = \mathrm{blkdiag}\{\mathbf{A}_i, i \in \mathcal{I}\}$ means that $\mathbf{B}$ is the block-diagonal matrix formed from the matrices $\mathbf{A}_i$. $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ represent the real and imaginary parts of a variable, respectively.
$f_\theta'(x)$ denotes the first-order derivative of $f_\theta(x)$. The other notations are summarized in Table I.

TABLE I
THE LIST OF NOTATIONS

  $I$ : The number of RRHs
  $K$ : The number of UEs
  $\mathcal{I}$ : The set of RRHs
  $\bar{\mathcal{U}}$ : The set of UEs
  $\mathcal{U}$ : The set of selected UEs
  $\mathcal{I}_k$ : The candidate set of RRHs serving UE $k$
  $\mathcal{U}_i$ : The candidate set of UEs served by RRH $i$
  $\mathbf{h}_{i,k}$ : The channel from RRH $i$ to UE $k$
  $\mathbf{w}_{i,k}$ : The BF vector from RRH $i$ to UE $k$
  $\alpha_{i,k}$ : Large-scale channel gain from RRH $i$ to UE $k$
  $\bar{\mathbf{h}}_{i,k}$ : Small-scale fading from RRH $i$ to UE $k$
  $\sigma_k^2$ : Noise power at UE $k$
  $M$ : The number of antennas at each RRH
  $\tau$ : The number of time slots for training
  $\mathcal{Q}$ : The set of pilot indices
  $\mathbf{Q}$ : The orthogonal pilot sequences
  $n_l$ : The reuse time of pilot $l$
  $n_{\max}$ : The maximum pilot reuse time
  $p_t$ : Pilot power
  $\hat{\mathbf{h}}_{i,k}$ : The MMSE estimation of channel $\mathbf{h}_{i,k}$
  $\tilde{\mathbf{h}}_{i,k}$ : Channel direction information of $\hat{\mathbf{h}}_{i,k}$
  $\mathbf{q}_{i,k}$ : The quantized version of $\tilde{\mathbf{h}}_{i,k}$
  $\mathcal{C}_{i,k}$ : Per-RRH codebook used by UE $k$
  $B^{\mathrm{CDI}}_{i,k}$ : The number of bits to quantize the CDI
  $\phi_{i,k}$ : The PA between the CDI and its quantized codeword
  $\tilde{\phi}_{i,k}$ : The PA quantization error
  $\hat{\phi}_{i,k}$ : The quantized version of the PA $\phi_{i,k}$
  $B^{\mathrm{PA}}_{i,k}$ : The number of bits to quantize the PA
  $a_{i,k}$ : The quantization error of the CDI $\tilde{\mathbf{h}}_{i,k}$

II. SYSTEM MODEL

A. Signal Transmission Model

Consider the downlink FDD UD-CRAN shown in Fig. 1, which has $I$ RRHs and $K$ UEs. Each RRH is equipped with $M$ transmit antennas and each UE has a single receive antenna. The sets of RRHs and UEs are denoted as $\mathcal{I} = \{1, \cdots, I\}$ and $\bar{\mathcal{U}} = \{1, \cdots, K\}$, respectively. Each RRH is connected to the BBU pool through wired/wireless fronthaul links. Let $\mathcal{U} \subseteq \bar{\mathcal{U}}$ represent the subset of UEs that can be admitted by the system.
To reduce the computational complexity associated with the UD-CRAN, the user-centric cluster technique is considered, where each UE is exclusively served by its nearby RRHs, since the signals arriving from distant RRHs are weak at the UE due to the severe path loss. For the example of Fig. 1, UE 1 is only potentially served by RRH 1, RRH 2 and RRH 3. The candidate set of RRHs that potentially serve UE $k$ is denoted as $\mathcal{I}_k \subseteq \mathcal{I}$. It should be emphasized that the set of RRHs that finally serve UE $k$ may not be the same as $\mathcal{I}_k$ and needs to be optimized in the following sections, while the RRHs outside its cluster, i.e., $\mathcal{I} \setminus \mathcal{I}_k$, will not serve UE $k$. Additionally, let us denote by $\mathcal{U}_i \subseteq \mathcal{U}$ the set of UEs that are potentially served by RRH $i$. Note that the clusters of the UEs may overlap with each other, which means that each RRH can simultaneously serve multiple UEs. These clusters are assumed to be predetermined based on the large-scale channel gains, which vary slowly.

Let us denote by $\mathbf{h}_{i,k} \in \mathbb{C}^{M \times 1}$ and $\mathbf{w}_{i,k} \in \mathbb{C}^{M \times 1}$ the channel vector and beamforming vector of the link spanning from RRH $i$ to UE $k$, respectively. Then, the signal received at UE $k$ is

$$ y_k = \underbrace{\sum_{i \in \mathcal{I}_k} \mathbf{h}_{i,k}^H \mathbf{w}_{i,k} s_k}_{\text{desired signal}} + \underbrace{\sum_{l \neq k, l \in \mathcal{U}} \sum_{i \in \mathcal{I}_l} \mathbf{h}_{i,k}^H \mathbf{w}_{i,l} s_l}_{\text{interference}} + z_k, \tag{1} $$

where $s_l$ denotes the transmission data for UE $l$ and $z_k$ is the zero-mean additive complex white Gaussian noise with variance $\sigma_k^2$. It is assumed that the data symbols destined for different UEs are independent of each other and have zero mean and unit variance, i.e., we have $\mathbb{E}\{|s_k|^2\} = 1$ and $\mathbb{E}\{s_{k_1} s_{k_2}\} = 0$ for $k_1 \neq k_2$, $\forall k_1, k_2 \in \mathcal{U}$.
The channel vector $\mathbf{h}_{i,k}$ can be decomposed as $\mathbf{h}_{i,k} = \sqrt{\alpha_{i,k}} \bar{\mathbf{h}}_{i,k}$, where $\alpha_{i,k}$ represents the large-scale channel gain of the link spanning from RRH $i$ to UE $k$, which accounts for both the shadowing and the path loss, while $\bar{\mathbf{h}}_{i,k}$ is the small-scale channel fading with the distribution of $\mathcal{CN}(\mathbf{0}, \mathbf{I})$.

B. Channel Estimation for Intra-cluster CSI

To design the beam-vectors for the UEs, the overall CSI should be available at the BBU pool for the facilitation of joint transmission. However, it is an unaffordable task to estimate the CSI from all RRHs to all UEs due to the limited availability of training resources. An appealing approach is that each UE only estimates the CSI within its cluster, named intra-cluster CSI. For the CSI beyond this cluster, it is assumed that only the large-scale channel gains are available, i.e., $\{\alpha_{i,k}, \forall i \in \mathcal{I} \setminus \mathcal{I}_k, k \in \mathcal{U}\}$. The out-of-cluster large-scale channel gains are used to control the multiuser interference.

In this paper, we assume that $\tau$ time slots are used for CSI training, thus the length of the pilot sequences is $\tau$, or equivalently the number of orthogonal pilot sequences is equal to $\tau$. Let us denote the set of pilot indices as $\mathcal{Q} = \{1, 2, \cdots, \tau\}$, and the corresponding orthogonal pilot sequences as $\mathbf{Q} = [\mathbf{q}_1, \cdots, \mathbf{q}_\tau] \in \mathbb{C}^{\tau \times \tau}$, which satisfies the orthogonality condition $\mathbf{Q}^H \mathbf{Q} = \mathbf{I}$. For the channel estimation in an FDD UD-CRAN system, the RRHs first send the training sequences to the UEs, then the UEs estimate their channels based on their received signals.
Specifically, the training signals received at UE $k$ can be written as

$$ \mathbf{y}_k = \sum_{i \in \mathcal{I}_k} \sqrt{p_t}\, \mathbf{h}_{i,k}^H \mathbf{X}_i^H + \sum_{i \in \mathcal{I} \setminus \mathcal{I}_k} \sqrt{p_t}\, \mathbf{h}_{i,k}^H \mathbf{X}_i^H + \mathbf{n}_k, \tag{2} $$

where $p_t$ is the pilot transmit power at each transmit antenna, $\mathbf{n}_k \in \mathbb{C}^{1 \times \tau}$ is the additive Gaussian noise vector during the training time slots, whose elements are independently generated and follow the distribution of $\mathcal{CN}(0, \sigma_k^2)$, and $\mathbf{X}_i \in \mathbb{C}^{\tau \times M}$ is the pilot training matrix sent from RRH $i$. The training matrix $\mathbf{X}_i$ can be written as $\mathbf{X}_i = [\mathbf{q}_{\pi_i^1}, \cdots, \mathbf{q}_{\pi_i^M}]$, where $\mathbf{q}_{\pi_i^m} \in \mathbb{C}^{\tau \times 1}$ denotes the pilot sequence used for estimating the channels spanning from the $m$th antenna of RRH $i$ to the UEs. To conserve the pilot resources, a pilot reuse scheme is considered, which should satisfy the following constraints:

1) The pilot sequences from different RRHs in the same cluster should be orthogonal, i.e., $\mathbf{X}_m^H \mathbf{X}_n = \mathbf{0}$ for $m, n \in \mathcal{I}_k$, $m \neq n$, $\forall k \in \mathcal{U}$;

2) The maximum reuse time of each pilot sequence should be restricted to a small value for reducing the channel estimation error. Let us denote the reuse time of pilot $l$ as $n_l$. Then this condition can be expressed as $n_l \leq n_{\max}$, $\forall l \in \mathcal{Q}$;

3) The pilot sequences used by all antennas at the same RRH should be mutually orthogonal, i.e., $\mathbf{X}_i^H \mathbf{X}_i = \mathbf{I}$.

The first constraint means that the RRHs serving the same UE should use mutually orthogonal pilot matrices. A natural pilot allocation approach satisfying the above three constraints is the orthogonal pilot allocation scheme, where all antennas and RRHs are allocated orthogonal pilots. Obviously, the number of pilots required is then $MI$, which occupies a large number of time slots in UD-CRANs having many RRHs. Hence, if we allow some RRHs to reuse the same set of pilots, the number of pilot sequences required will be reduced. In this paper, we aim for minimizing the number of pilots required, while guaranteeing the above three conditions.
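The pilot reuse conditions above can be read as a graph colouring problem: two RRHs that appear in the same candidate set $\mathcal{I}_k$ must not share pilots. The following is a minimal sketch of a greedy saturation-based (DSatur-style) colouring under that reading. It is not the exact procedure of [20]: the cluster layout, the function names and the tie-breaking rule are hypothetical, and the reuse limit $n_{\max}$ of the second condition is not enforced.

```python
def build_conflict_graph(clusters, num_rrhs):
    """Connect two RRHs whenever they appear in the same candidate set I_k."""
    adjacency = {i: set() for i in range(num_rrhs)}
    for rrhs in clusters.values():
        for a in rrhs:
            for b in rrhs:
                if a != b:
                    adjacency[a].add(b)
    return adjacency


def dsatur_colouring(adjacency):
    """Greedy DSatur-style colouring: colour the most saturated vertex first."""
    colour = {}
    uncoloured = set(adjacency)

    def priority(v):
        # Saturation = number of distinct colours already used by neighbours.
        saturation = len({colour[u] for u in adjacency[v] if u in colour})
        # Ties are broken by degree, then by the smaller index (assumption).
        return (saturation, len(adjacency[v]), -v)

    while uncoloured:
        v = max(uncoloured, key=priority)
        used = {colour[u] for u in adjacency[v] if u in colour}
        c = 0
        while c in used:  # smallest colour not used by any neighbour
            c += 1
        colour[v] = c
        uncoloured.remove(v)
    return colour


# Hypothetical toy layout: 9 RRHs (indexed 0..8), 5 UEs, M = 2 antennas per RRH.
clusters = {0: [0, 1, 2], 1: [2, 3, 4], 2: [4, 5], 3: [6, 7, 8], 4: [5, 6]}
M = 2
colour = dsatur_colouring(build_conflict_graph(clusters, num_rrhs=9))
num_colours = len(set(colour.values()))
tau = M * num_colours  # each colour corresponds to M mutually orthogonal pilots
print(colour, num_colours, tau)
```

RRHs sharing a colour reuse the same block of $M$ pilots, so the third condition is met by construction, while the orthogonal allocation would need $MI = 18$ pilots in this toy layout.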
This pilot allocation problem has been studied in [20], where the Dsatur algorithm from graph theory was proposed to solve it. The computational complexity of the Dsatur algorithm is given by $O(I^2)$ [20]. When some RRHs are allocated the same color, these RRHs can reuse the same pilots. Denote by $c^\star$ the number of different colors required by the Dsatur algorithm to color all the RRHs. Then the total number of pilots required is given by $\tau = M c^\star$, since the antennas in each RRH use different pilots.

Let us define $\mathcal{K}_{\mathbf{X}} = \{i : \mathbf{X}_i = \mathbf{X}\}$ as the set of RRHs that reuse the same pilots $\mathbf{X}$ obtained by using the Dsatur algorithm. Then, the MMSE estimation of channel $\mathbf{h}_{i,k}$ is given by [27]

$$ \hat{\mathbf{h}}_{i,k} = \frac{\alpha_{i,k}}{\sum_{m \in \mathcal{K}_{\mathbf{X}_i}} \alpha_{m,k} + \hat{\sigma}^2} \frac{1}{\sqrt{p_t}} \mathbf{X}_i^H \mathbf{y}_k^H, \tag{3} $$

where $\hat{\sigma}^2 = \sigma_k^2 / p_t$. It can be readily derived from (3) that the channel estimate $\hat{\mathbf{h}}_{i,k}$ obeys the distribution of $\mathcal{CN}(\mathbf{0}, \omega_{i,k} \mathbf{I})$ with $\omega_{i,k}$ given by

$$ \omega_{i,k} = \frac{\alpha_{i,k}^2}{\sum_{m \in \mathcal{K}_{\mathbf{X}_i}} \alpha_{m,k} + \hat{\sigma}^2}. \tag{4} $$

According to the property of MMSE estimation [27], the channel estimation error $\mathbf{e}_{i,k} = \mathbf{h}_{i,k} - \hat{\mathbf{h}}_{i,k}$ is independent of the channel estimate $\hat{\mathbf{h}}_{i,k}$ and follows the distribution of $\mathcal{CN}(\mathbf{0}, \delta_{i,k} \mathbf{I})$, with $\delta_{i,k}$ given by

$$ \delta_{i,k} = \frac{\alpha_{i,k} \left( \sum_{m \in \mathcal{K}_{\mathbf{X}_i} \setminus i} \alpha_{m,k} + \hat{\sigma}^2 \right)}{\sum_{m \in \mathcal{K}_{\mathbf{X}_i}} \alpha_{m,k} + \hat{\sigma}^2}. \tag{5} $$

Note that even when RRH $i$ does not reuse the pilots of any other RRH, there is still some channel estimation error for channel $\mathbf{h}_{i,k}$, with $\delta_{i,k} = \alpha_{i,k} \hat{\sigma}^2 / (\alpha_{i,k} + \hat{\sigma}^2)$.

C. Limited Feedback Model

In this paper, we consider the limited per-RRH codebook feedback strategy [24], where each UE uses different codebooks to independently quantize its per-RRH CDI, i.e., $\tilde{\mathbf{h}}_{i,k} = \hat{\mathbf{h}}_{i,k} / \|\hat{\mathbf{h}}_{i,k}\|$. Then UE $k$ feeds back the indices of the codewords to its corresponding serving RRHs. The BBU pool will collect all the indices forwarded by the different RRHs and will design the beamforming vectors based on these indices.
Specifically, the quantized version of the CDI $\tilde{\mathbf{h}}_{i,k}$ is given by

$$ \mathbf{q}_{i,k} = \arg\max_{\mathbf{c}_{i,k,n} \in \mathcal{C}_{i,k}} \left| \tilde{\mathbf{h}}_{i,k}^H \mathbf{c}_{i,k,n} \right|, \tag{6} $$

where $\mathcal{C}_{i,k}$ is the per-RRH codebook used by UE $k$ to quantize the CSI of the link spanning from RRH $i$, which consists of unit-norm codewords $\mathbf{c}_{i,k,n} \in \mathbb{C}^{M \times 1}$, $n = 1, \cdots, 2^{B^{\mathrm{CDI}}_{i,k}}$, with $B^{\mathrm{CDI}}_{i,k}$ denoting the number of bits used for quantizing the CDI $\tilde{\mathbf{h}}_{i,k}$.

Coherent joint transmission is assumed in this paper. Hence, another important parameter, namely the phase ambiguity (PA), is also required at the BBU pool [24], [25]. The PA is defined as the angle between the per-RRH CDI and its quantized codeword, i.e., $e^{j\phi_{i,k}} = \tilde{\mathbf{h}}_{i,k}^H \mathbf{q}_{i,k} / |\tilde{\mathbf{h}}_{i,k}^H \mathbf{q}_{i,k}|$ with $j = \sqrt{-1}$. The PA knowledge is not required for single-point limited feedback MIMO systems, but it affects the co-phasing of the coherent joint transmission in UD-CRAN, as detailed in [24], [25]. The PA can be fed back with the aid of a few bits by using scalar quantization. Since the codeword is chosen by maximizing the magnitude of $\tilde{\mathbf{h}}_{i,k}^H \mathbf{c}_{i,k,n}$ and the CDI is isotropically distributed, the PA $\phi_{i,k}$ will be uniformly distributed in $[0, 2\pi]$. Hence, it is optimal to quantize the PA by employing a uniform scalar quantizer. Let us denote by $\tilde{\phi}_{i,k}$ and $\hat{\phi}_{i,k}$ the PA quantization error and the quantized version of the PA $\phi_{i,k}$, respectively. Then, the PA $\phi_{i,k}$ can be represented as $\phi_{i,k} = \hat{\phi}_{i,k} + \tilde{\phi}_{i,k}$. If we use $B^{\mathrm{PA}}_{i,k}$ bits to quantize the PA $\phi_{i,k}$, the PA quantization error $\tilde{\phi}_{i,k}$ is uniformly distributed within $\left[ -\pi / 2^{B^{\mathrm{PA}}_{i,k}},\ \pi / 2^{B^{\mathrm{PA}}_{i,k}} \right]$.

Let us define by $a_{i,k} \triangleq 1 - |\tilde{\mathbf{h}}_{i,k}^H \mathbf{q}_{i,k}|^2$ the quantization error of the CDI $\tilde{\mathbf{h}}_{i,k}$. For simplicity, random vector quantization (RVQ) is considered for quantizing the per-RRH CDIs in this paper.
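As an illustration of the per-RRH feedback steps described above, the sketch below draws a random (RVQ-style) codebook, selects the codeword as in (6), computes the CDI quantization error $a_{i,k}$ and the PA from the definitions given in the text, and quantizes the PA with a uniform scalar quantizer. It is only a sketch under stated assumptions: the helper names, the random seed and the values chosen for $M$, $B^{\mathrm{CDI}}$ and $B^{\mathrm{PA}}$ are hypothetical.

```python
import cmath
import math
import random


def inner(a, b):
    """Return a^H b for two complex vectors stored as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def random_unit_vector(m, rng):
    """Draw an isotropically distributed unit-norm vector in C^m."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def rvq_quantize(h_dir, codebook):
    """Pick the codeword maximising |h^H c|, as in (6)."""
    n = max(range(len(codebook)), key=lambda j: abs(inner(h_dir, codebook[j])))
    corr = inner(h_dir, codebook[n])
    a = 1.0 - abs(corr) ** 2                 # CDI quantization error a_{i,k}
    phi = cmath.phase(corr) % (2 * math.pi)  # phase ambiguity mapped to [0, 2*pi]
    return n, a, phi


def quantize_pa(phi, bits):
    """Uniform scalar quantizer with 2**bits levels over one period."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    index = round(phi / step) % levels       # nearest level, wrapped around
    phi_hat = index * step
    # Wrap the quantization error into [-pi, pi).
    err = (phi - phi_hat + math.pi) % (2 * math.pi) - math.pi
    return index, phi_hat, err


rng = random.Random(0)                       # fixed seed, illustrative only
M, B_CDI, B_PA = 4, 6, 1                     # hypothetical parameter values
h_dir = random_unit_vector(M, rng)           # stands in for the CDI
codebook = [random_unit_vector(M, rng) for _ in range(2 ** B_CDI)]
n, a, phi = rvq_quantize(h_dir, codebook)
index, phi_hat, err = quantize_pa(phi, B_PA)
print(n, a, phi, index, phi_hat, err)
```

With `B_PA` bits the wrapped error never exceeds $\pi / 2^{B^{\mathrm{PA}}}$ in magnitude, which matches the support of $\tilde{\phi}_{i,k}$ stated above; only the two indices `n` and `index` would need to be fed back.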
Then, according to [28], the per-RRH CDI $\tilde{\mathbf{h}}_{i,k}$ can be rewritten as

$$ \tilde{\mathbf{h}}_{i,k} = \sqrt{1 - a_{i,k}}\, e^{j\phi_{i,k}} \mathbf{q}_{i,k} + \sqrt{a_{i,k}}\, \mathbf{u}_{i,k}, \tag{7} $$

where $\mathbf{u}_{i,k}$ is the channel quantization error, which is a unit-norm vector isotropically distributed in the nullspace of $\mathbf{q}_{i,k}$.

In this paper, we assume that there are dedicated error-free feedback channels for feeding back all quantized versions of the CDIs and PAs to the BBU pool. Then, the BBU pool determines the beamforming vectors based on the feedback information.

III. PROBLEM FORMULATION

In this section, we first provide the mathematical model for the constraints of the UD-CRAN, which include each UE's data rate requirement, the per-RRH power constraint and the limited fronthaul capacity constraint. Then, based on these constraints, we formulate the UE selection problem and the transmit power minimization problem in a two-stage form.

Let us denote the beamforming vectors from all RRHs in $\mathcal{I}_k$ by $\mathbf{w}_k = [\mathbf{w}_{i,k}^H, \forall i \in \mathcal{I}_k]^H \in \mathbb{C}^{|\mathcal{I}_k| M \times 1}$, and the aggregated channel vector from the RRHs in $\mathcal{I}_l$ to UE $k$ by $\mathbf{g}_{l,k} = [\mathbf{h}_{i,k}^H, \forall i \in \mathcal{I}_l]^H \in \mathbb{C}^{|\mathcal{I}_l| M \times 1}$. In addition, define $\tilde{\mathbf{g}}_{k,k} = [\mathbf{e}_{i,k}^H, \forall i \in \mathcal{I}_k]^H \in \mathbb{C}^{|\mathcal{I}_k| M \times 1}$ and $\hat{\mathbf{g}}_{k,k} = [\hat{\mathbf{h}}_{i,k}^H, \forall i \in \mathcal{I}_k]^H \in \mathbb{C}^{|\mathcal{I}_k| M \times 1}$ as the overall CSI error and the estimated CSI of the links spanning from the RRHs in $\mathcal{I}_k$ to UE $k$, respectively. Then, the channel estimation error can be rewritten as $\tilde{\mathbf{g}}_{k,k} = \mathbf{g}_{k,k} - \hat{\mathbf{g}}_{k,k}$, while the received signal model in (1) can be reformulated as

$$ y_k = \underbrace{\hat{\mathbf{g}}_{k,k}^H \mathbf{w}_k s_k}_{\text{desired signal}} + \underbrace{\tilde{\mathbf{g}}_{k,k}^H \mathbf{w}_k s_k}_{\text{residual interference}} + \underbrace{\sum_{l \neq k, l \in \mathcal{U}} \mathbf{g}_{l,k}^H \mathbf{w}_l s_l}_{\text{multi-user interference}} + z_k, \quad \forall k \in \mathcal{U}. \tag{8} $$

As in most existing papers [29], [30], we consider the achievable data rate, where the residual-interference term in (8) due to the channel estimation error is treated as uncorrelated Gaussian noise.
Additionally, for the sake of reducing the decoding complexity, the multi-user interference term is also regarded as uncorrelated Gaussian noise. By considering the time slots allocated for channel training, the net achievable data rate of UE $k$ can be expressed as [30]

$$ r_k = \frac{T - \tau}{T} \log_2 \left( 1 + \frac{\mathbb{E}\left\{ |\hat{\mathbf{g}}_{k,k}^H \mathbf{w}_k|^2 \right\}}{\mathbb{E}\left\{ |\tilde{\mathbf{g}}_{k,k}^H \mathbf{w}_k|^2 \right\} + \sum_{l \neq k, l \in \mathcal{U}} \mathbb{E}\left\{ |\mathbf{g}_{l,k}^H \mathbf{w}_l|^2 \right\} + \sigma_k^2} \right), \quad \forall k \in \mathcal{U}, \tag{9} $$

where $\tau$ is the total number of time slots required by the Dsatur algorithm, $T$ denotes the total number of time slots in each time frame, and the expectation is taken over multiple random processes, namely the fast fading of the unknown CSI in $\mathcal{I} \setminus \mathcal{I}_k$, the channel estimation errors $\{\mathbf{e}_{i,k}, i \in \mathcal{I}_k\}$, the CDI quantization errors $\{\mathbf{u}_{i,k}, \forall i \in \mathcal{I}_k\}$ and the PA quantization errors $\{\tilde{\phi}_{i,k}, \forall i \in \mathcal{I}_k\}$.

Each UE's data rate should be no lower than its minimum rate requirement:

$$ \text{C1}: \ r_k \geq R_{k,\min}, \quad \forall k \in \mathcal{U}, \tag{10} $$

where $R_{k,\min}$ is the rate target of UE $k$. The second constraint is the per-RRH power constraint, which can be expressed as

$$ \text{C2}: \ \sum_{k \in \mathcal{U}_i} \|\mathbf{w}_{i,k}\|^2 \leq P_{i,\max}, \quad i \in \mathcal{I}, \tag{11} $$

where $P_{i,\max}$ is the power limit of RRH $i$. Finally, each fronthaul link has a capacity constraint, since we consider a limited bandwidth. Specifically, this kind of constraint can be expressed as

$$ \text{C3}: \ \sum_{k \in \mathcal{U}_i} \varepsilon\left( \|\mathbf{w}_{i,k}\|^2 \right) r_k \leq C_{i,\max}, \quad \forall i \in \mathcal{I}, \tag{12} $$

where $C_{i,\max}$ is the capacity limit of the fronthaul link spanning from the BBU pool to RRH $i$, and $\varepsilon(\cdot)$ is an indicator function, defined as

$$ \varepsilon(x) = \begin{cases} 1, & \text{if } x \neq 0, \\ 0, & \text{otherwise}. \end{cases} \tag{13} $$

Due to the constraints of the system (C2 and C3), some UEs' rate requirements (C1) may not be satisfied. Hence, some UEs should be removed in order to guarantee the QoS requirements of the remaining UEs. Similar to [10], [16], we formulate a two-stage optimization.
Specifically, in Stage I, we aim for maximizing the number of UEs admitted to the dense network, which is formulated as

$$ \mathcal{P}_1: \ \max_{\mathbf{w},\, \mathcal{U} \subseteq \bar{\mathcal{U}}} |\mathcal{U}| \quad \text{s.t.} \ \text{C1}, \text{C2}, \text{C3}, \tag{14} $$

where $\mathbf{w}$ denotes the set of all beamforming vectors and $|\mathcal{U}|$ is the cardinality of the set $\mathcal{U}$. In Stage II, our goal is to optimize the beamforming vectors for minimizing the total transmit power with the UEs selected in Stage I. Let us denote by $\mathcal{U}^\star$ the specific solution from Stage I, where the corresponding $\mathcal{U}_i$ becomes $\mathcal{U}_i^\star$. Then the optimization problem in Stage II is

$$ \mathcal{P}_2: \ \min_{\mathbf{w}} \sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{U}_i^\star} \|\mathbf{w}_{i,k}\|_2^2 \quad \text{s.t.} \ \text{C1}, \text{C2}, \text{C3}. \tag{15} $$

In constraints C1-C3, $\mathcal{U}$ and $\mathcal{U}_i$ are replaced by $\mathcal{U}^\star$ and $\mathcal{U}_i^\star$, respectively.

Problems $\mathcal{P}_1$ and $\mathcal{P}_2$ in (14) and (15) are difficult to solve, for the following reasons. Firstly, the exact data rate $r_k$ is difficult to derive, since the expectation is taken over multiple uncertain terms. Secondly, both the objective function and the fronthaul capacity constraint C3 of Problem $\mathcal{P}_1$ contain non-smooth and non-differentiable indicator functions, which makes it a mixed-integer non-linear programming (MINLP) problem. The exhaustive search method can be adopted to solve this kind of optimization problem. However, it has an exponential complexity order, which becomes excessive for a UD-CRAN with a large number of UEs. In Section IV, we first deal with the power minimization Problem $\mathcal{P}_2$ by assuming that the set of admitted UEs has already been determined by solving Problem $\mathcal{P}_1$. Then, we conceive low-complexity methods to deal with Problem $\mathcal{P}_1$ in Section V.

IV. LOW-COMPLEXITY ALGORITHM TO DEAL WITH PROBLEM $\mathcal{P}_2$

In this section, we provide a low-complexity algorithm for solving Problem $\mathcal{P}_2$, when the UEs to be admitted have already been selected by using the UE selection algorithms of Section V, and we denote the subset of UEs that have been selected as $\mathcal{U}$.
In the following, we first simplify the rate expression. The multiple random processes in the rate expression make the accurate closed-form expression of the achievable data rate of UE $k$ in (9) difficult to derive. In Appendix A, we derive the achievable data rate as

$$r_k=\frac{T-\tau}{T}\log_2\left(1+\frac{\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k}{\mathbf{w}_k^H\mathbf{E}_{k,k}\mathbf{w}_k+\sum_{l\neq k,\,l\in\mathcal{U}}\mathbf{w}_l^H\mathbf{A}_{l,k}\mathbf{w}_l+\sigma_k^2}\right) \tag{16}$$

$$=\frac{T-\tau}{T}\log_2\left(1+\mathrm{SINR}_k\right), \tag{17}$$

where we have $\mathbf{E}_{k,k}=\mathbb{E}\{\tilde{\mathbf{g}}_{k,k}\tilde{\mathbf{g}}_{k,k}^H\}\in\mathbb{C}^{M|\mathcal{I}_k|\times M|\mathcal{I}_k|}$, $\mathbf{A}_{k,k}=\mathbb{E}\{\hat{\mathbf{g}}_{k,k}\hat{\mathbf{g}}_{k,k}^H\}\in\mathbb{C}^{M|\mathcal{I}_k|\times M|\mathcal{I}_k|}$ and $\mathbf{A}_{l,k}=\mathbb{E}\{\mathbf{g}_{l,k}\mathbf{g}_{l,k}^H\}\in\mathbb{C}^{M|\mathcal{I}_l|\times M|\mathcal{I}_l|}$. The matrix $\mathbf{E}_{k,k}$ can be readily computed as

$$\mathbf{E}_{k,k}=\mathrm{blkdiag}\left\{\delta_{i,k}\mathbf{I}_M,\ i\in\mathcal{I}_k\right\}, \tag{18}$$

while $\mathbf{A}_{k,k}$ and $\mathbf{A}_{l,k}$ are given in (A.15) and (A.16) of Appendix A, respectively. Note that the matrices $\mathbf{E}_{k,k}$, $\mathbf{A}_{k,k}$ and $\mathbf{A}_{l,k}$ are positive semi-definite, since they represent the expectations over positive semi-definite matrices [31]. The achievable signal to interference plus noise ratio (SINR) of UE $k$ is given by

$$\mathrm{SINR}_k=\frac{\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k}{\mathbf{w}_k^H\mathbf{E}_{k,k}\mathbf{w}_k+\sum_{l\neq k,\,l\in\mathcal{U}}\mathbf{w}_l^H\mathbf{A}_{l,k}\mathbf{w}_l+\sigma_k^2}. \tag{19}$$

By exploiting the fact that the rate constraints hold with equality at the optimal point [23], Problem $\mathcal{P}_2$ can be transformed as

$$\mathcal{P}_3: \min_{\mathbf{w}}\ \sum_{i\in\mathcal{I}}\sum_{k\in\mathcal{U}_i}\left\|\mathbf{w}_{i,k}\right\|_2^2 \tag{20a}$$

$$\text{s.t. C2},\quad \text{C4}: \mathrm{SINR}_k\geq\eta_{k,\min},\ \forall k\in\mathcal{U}, \tag{20b}$$

$$\text{C5}: \sum_{k\in\mathcal{U}_i}\varepsilon\left(\left\|\mathbf{w}_{i,k}\right\|^2\right)R_{k,\min}\leq C_{i,\max},\ \forall i\in\mathcal{I}, \tag{20c}$$

where $\eta_{k,\min}=2^{\frac{T}{T-\tau}R_{k,\min}}-1$.

A. Smooth Approximation of the Indicator Function

We first deal with the non-smooth nature of the indicator function in C5. Similar to [16], the indicator function is approximated by the smooth function $f_\theta(x)=\frac{x}{x+\theta}$, where $\theta$ is a small constant. By replacing the indicator function with $f_\theta(x)$, Problem $\mathcal{P}_3$ can be approximated as

$$\mathcal{P}_4: \min_{\mathbf{w}}\ \sum_{i\in\mathcal{I}}\sum_{k\in\mathcal{U}_i}\left\|\mathbf{w}_{i,k}\right\|_2^2 \tag{21a}$$

$$\text{s.t. C2, C4},\quad \text{C6}: \sum_{k\in\mathcal{U}_i}f_\theta\left(\left\|\mathbf{w}_{i,k}\right\|^2\right)R_{k,\min}\leq C_{i,\max},\ \forall i\in\mathcal{I}. \tag{21b}$$

The successive convex approximation (SCA) method [32] is used to deal with the non-convex constraint C6. Specifically, by exploiting the concavity of $f_\theta(x)$, we have

$$f_\theta\left(\left\|\mathbf{w}_{i,k}\right\|^2\right)\leq f_\theta\left(\left\|\mathbf{w}_{i,k}(t)\right\|^2\right)+\beta_{i,k}(t)\left(\left\|\mathbf{w}_{i,k}\right\|^2-\left\|\mathbf{w}_{i,k}(t)\right\|^2\right), \tag{22}$$

where $\mathbf{w}_{i,k}(t)$ is the beamforming vector at the $t$th iteration and $\beta_{i,k}(t)=f_\theta'\left(\left\|\mathbf{w}_{i,k}(t)\right\|^2\right)$. By replacing $f_\theta\left(\left\|\mathbf{w}_{i,k}\right\|^2\right)$ in Problem $\mathcal{P}_4$ with the right hand side of (22), we arrive at

$$\mathcal{P}_5: \min_{\mathbf{w}}\ \sum_{i\in\mathcal{I}}\sum_{k\in\mathcal{U}_i}\left\|\mathbf{w}_{i,k}\right\|_2^2 \tag{23a}$$

$$\text{s.t. C2, C4},\quad \text{C7}: \sum_{k\in\mathcal{U}_i}\tau_{i,k}(t)\left\|\mathbf{w}_{i,k}\right\|^2\leq\tilde{C}_i(t),\ \forall i\in\mathcal{I}, \tag{23b}$$

where $\tau_{i,k}(t)=\beta_{i,k}(t)R_{k,\min}$ and $\tilde{C}_i(t)=C_{i,\max}-\sum_{k\in\mathcal{U}_i}\left(f_\theta\left(\left\|\mathbf{w}_{i,k}(t)\right\|^2\right)-\beta_{i,k}(t)\left\|\mathbf{w}_{i,k}(t)\right\|^2\right)R_{k,\min}$.

However, Problem $\mathcal{P}_5$ is still difficult to solve due to Constraint C4, although it has been simplified from Constraint C1. The reasons are given as follows. Due to the channel estimation error, each user suffers from residual interference, as seen from the right hand side of C4, i.e. $\mathbf{w}_k^H\mathbf{E}_{k,k}\mathbf{w}_k$. Although the classic weighted minimum mean square error (WMMSE) method has been successfully applied in UD-CRANs under the idealized simplifying assumption of having perfect intra-cluster CSI [7], [10], [16], it cannot be adopted in this realistic optimization problem due to the residual interference. Furthermore, since the rank of matrix $\mathbf{A}_{k,k}$ is in general higher than one, the semi-definite programming (SDP) relaxation method used in [23] cannot be adopted here, because the resultant solution is no longer guaranteed to be of rank one. In the following, we propose a novel method to deal with Constraint C4.

B. Method to Deal with Constraint C4

In the following, we propose a novel method based on the first-order Taylor approximation to deal with Constraint C4 and then propose the Lagrange dual decomposition algorithm for solving this problem.
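The smooth approximation and its SCA linearization in Subsection IV-A can be illustrated numerically. The sketch below, written under our own assumptions about names and data layout (none of which come from the paper), evaluates $f_\theta$, its derivative, the tangent upper bound in (22), and the per-RRH coefficients $\tau_{i,k}(t)$ and $\tilde{C}_i(t)$ of Constraint C7.

```python
def f_theta(x, theta):
    """Smooth surrogate of the indicator function: x / (x + theta)."""
    return x / (x + theta)

def f_theta_grad(x, theta):
    """Derivative of f_theta: theta / (x + theta)^2."""
    return theta / (x + theta) ** 2

def tangent_upper_bound(x, x_t, theta):
    """Right hand side of (22), linearized around x_t = ||w_{i,k}(t)||^2."""
    return f_theta(x_t, theta) + f_theta_grad(x_t, theta) * (x - x_t)

def sca_coefficients(p_prev, r_min, c_max, theta):
    """Coefficients of C7 for one RRH i.

    p_prev[k] is ||w_{i,k}(t)||^2 for the UEs served by this RRH and
    r_min[k] is R_{k,min}; both list layouts are assumptions of this sketch.
    """
    beta = [f_theta_grad(p, theta) for p in p_prev]
    tau = [b * r for b, r in zip(beta, r_min)]
    c_tilde = c_max - sum(
        (f_theta(p, theta) - b * p) * r
        for p, b, r in zip(p_prev, beta, r_min)
    )
    return tau, c_tilde

# Hypothetical values: one active beam and one switched-off beam.
theta = 1e-5
tau, c_tilde = sca_coefficients([0.02, 0.0], [3.0, 3.0], 6.0, theta)
print(tau, c_tilde)
```

Because $f_\theta$ is concave for $x\geq 0$, the tangent never falls below the function, so any point satisfying C7 also satisfies C6. A beam that is already switched off receives a very large weight $\tau_{i,k}(t)$, which discourages the next iterate from reactivating it.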
Constraint C4 is non-convex, because $\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k$ is a convex function of $\mathbf{w}_k$ $^1$. Similar to the successive convex approximation method dealing with the concave fractional function, we approximate it by its first-order Taylor expansion and make Constraint C4 convex. Since $\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k$ is convex, we have

$$\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k\geq\mathbf{w}_k^H(t)\mathbf{A}_{k,k}\mathbf{w}_k(t)+2\mathrm{Re}\left\{\mathbf{w}_k^H(t)\mathbf{A}_{k,k}\left(\mathbf{w}_k-\mathbf{w}_k(t)\right)\right\}, \tag{24}$$

where $\mathbf{w}_k(t)$ is the beamforming vector at the $t$th iteration. The above derivation is not direct, since $\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k$ is a function of the complex-valued vector $\mathbf{w}_k$. The Taylor expansion developed for functions over real-valued variables cannot be directly extended to the complex case. In Appendix B, we derive the above result relying on the so-called T-transform [26] that transforms complex-valued matrices and vectors into their real-valued equivalents. By replacing $\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k$ in C4 with the right hand side of (24), Problem $\mathcal{P}_5$ is transformed to the following optimization problem

$$\mathcal{P}_6: \min_{\mathbf{w}}\ \sum_{k\in\mathcal{U}}\left\|\mathbf{w}_k\right\|_2^2 \tag{25a}$$

$$\text{s.t. C2, C7}, \tag{25b}$$

$$\text{C8}: 2\mathrm{Re}\left\{\mathbf{w}_k^H(t)\mathbf{A}_{k,k}\mathbf{w}_k\right\}-\zeta_k(t)\geq\eta_{k,\min}\left(\mathbf{w}_k^H\mathbf{E}_{k,k}\mathbf{w}_k+\sum_{l\neq k,\,l\in\mathcal{U}}\mathbf{w}_l^H\mathbf{A}_{l,k}\mathbf{w}_l+\sigma_k^2\right),\ \forall k\in\mathcal{U}, \tag{25c}$$

where $\zeta_k(t)=\mathbf{w}_k^H(t)\mathbf{A}_{k,k}\mathbf{w}_k(t)$. Now, Problem $\mathcal{P}_6$ is a convex optimization problem. Additionally, in Appendix C, we prove that Slater's condition [31] is satisfied for Problem $\mathcal{P}_6$. Hence, the duality gap between Problem $\mathcal{P}_6$ and its dual problem is zero. As a result, the original Problem $\mathcal{P}_6$ can be solved by solving its dual problem instead. In the following, we derive the structure of the optimal beamforming vector by applying the Lagrange dual decomposition method.

$^1$ Note that $\mathbf{A}_{k,k}$ is a positive semi-definite matrix.

Let us represent $\mathcal{I}_k$ as $\mathcal{I}_k=\{s_1^k,\cdots,s_{|\mathcal{I}_k|}^k\}$.
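As a sanity check of the lower bound (24), the following sketch builds a random Hermitian positive semi-definite matrix and verifies that the quadratic form never falls below its first-order approximation. The gap equals $(\mathbf{w}_k-\mathbf{w}_k(t))^H\mathbf{A}_{k,k}(\mathbf{w}_k-\mathbf{w}_k(t))\geq 0$, which is the algebraic reason the bound holds. The helper names and the random test matrix are our own assumptions; the matrix is not the $\mathbf{A}_{k,k}$ of Appendix A.

```python
import random

def random_hermitian_psd(n, rng):
    """Return A = G G^H for a random complex G, which is Hermitian PSD."""
    g = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
         for _ in range(n)]
    return [[sum(g[r][m] * g[c][m].conjugate() for m in range(n))
             for c in range(n)] for r in range(n)]

def inner(x, a, y):
    """Compute x^H A y for lists of complex numbers."""
    n = len(x)
    return sum(x[r].conjugate() * a[r][c] * y[c]
               for r in range(n) for c in range(n))

def taylor_lower_bound(a, w, w_t):
    """Right hand side of (24), expanded around w_t."""
    d = [wi - wti for wi, wti in zip(w, w_t)]
    return inner(w_t, a, w_t).real + 2.0 * inner(w_t, a, d).real

rng = random.Random(0)
n = 4
A = random_hermitian_psd(n, rng)
w_t = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
w = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
gap = inner(w, A, w).real - taylor_lower_bound(A, w, w_t)
print(gap >= 0.0)
```

The bound is tight at $\mathbf{w}_k=\mathbf{w}_k(t)$, which is why the previous iterate remains feasible for the convexified constraint C8.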
We first introduce the following block-diagonal matrices

$$\mathbf{B}_{i,k}=\mathrm{diag}\Big\{\overbrace{\mathbf{0}_{1\times M}}^{s_1^k},\cdots,\overbrace{\mathbf{1}_{1\times M}}^{s_m^k},\overbrace{\mathbf{0}_{1\times M}}^{s_{m+1}^k},\cdots,\overbrace{\mathbf{0}_{1\times M}}^{s_{|\mathcal{I}_k|}^k}\Big\},\ \text{if } s_m^k=i,\ \forall i\in\mathcal{I},\ k\in\mathcal{U}. \tag{26}$$

Then, Constraints C2 and C7 can be rewritten as

$$\text{C9}: \sum_{k\in\mathcal{U}_i}\mathbf{w}_k^H\mathbf{B}_{i,k}\mathbf{w}_k\leq P_{i,\max},\ \forall i\in\mathcal{I}, \tag{27}$$

$$\text{C10}: \sum_{k\in\mathcal{U}_i}\tau_{i,k}(t)\mathbf{w}_k^H\mathbf{B}_{i,k}\mathbf{w}_k\leq\tilde{C}_i(t),\ \forall i\in\mathcal{I}. \tag{28}$$

After some further manipulations, the Lagrangian function of Problem $\mathcal{P}_6$ can be written as

$$\mathcal{L}(\mathbf{w},\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\nu})=\sum_{k\in\mathcal{U}}\mathbf{w}_k^H\mathbf{J}_k(t)\mathbf{w}_k-\sum_{k\in\mathcal{U}}\nu_k\left(\mathbf{w}_k^H(t)\mathbf{A}_{k,k}\mathbf{w}_k+\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k(t)\right)-\sum_{i\in\mathcal{I}}\lambda_iP_{i,\max}-\sum_{i\in\mathcal{I}}\mu_i\tilde{C}_i(t)+\sum_{k\in\mathcal{U}}\nu_k\left(\eta_{k,\min}\sigma_k^2+\zeta_k(t)\right),$$

where $\boldsymbol{\lambda}$, $\boldsymbol{\mu}$ and $\boldsymbol{\nu}$ are the collections of non-negative Lagrangian multipliers associated with Constraints C9, C10 and C8, respectively, and the matrix $\mathbf{J}_k(t)$ is given by

$$\mathbf{J}_k(t)=\mathbf{I}+\sum_{i\in\mathcal{I}_k}\left(\lambda_i+\mu_i\tau_{i,k}(t)\right)\mathbf{B}_{i,k}+\nu_k\eta_{k,\min}\mathbf{E}_{k,k}+\sum_{l\neq k,\,l\in\mathcal{U}}\nu_l\eta_{l,\min}\mathbf{A}_{k,l}. \tag{29}$$

Then, the dual function is given by

$$g(\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\nu}) \tag{30}$$

$$=\min_{\mathbf{w}}\ \mathcal{L}(\mathbf{w},\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\nu}) \tag{31}$$

$$=\min_{\mathbf{w}}\ \sum_{k\in\mathcal{U}}\mathbf{w}_k^H\mathbf{J}_k(t)\mathbf{w}_k-\sum_{k\in\mathcal{U}}\nu_k\left(\mathbf{w}_k^H(t)\mathbf{A}_{k,k}\mathbf{w}_k+\mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k(t)\right)-\sum_{i\in\mathcal{I}}\lambda_iP_{i,\max}-\sum_{i\in\mathcal{I}}\mu_i\tilde{C}_i(t)+\sum_{k\in\mathcal{U}}\nu_k\left(\eta_{k,\min}\sigma_k^2+\zeta_k(t)\right). \tag{32}$$

Note that $\mathbf{J}_k(t)$ is a positive definite matrix. Hence, Problem (32) is a strictly convex problem and its unique solution can be obtained from its first-order optimality condition as

$$\mathbf{w}_k=\nu_k\mathbf{J}_k^{-1}(t)\mathbf{A}_{k,k}\mathbf{w}_k(t). \tag{33}$$

By substituting the optimal solution of $\mathbf{w}_k$ in (33) into (32), the dual function becomes

$$g(\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\nu})=-\sum_{k\in\mathcal{U}}\nu_k^2\mathbf{w}_k^H(t)\mathbf{A}_{k,k}\mathbf{J}_k^{-1}(t)\mathbf{A}_{k,k}\mathbf{w}_k(t)-\sum_{i\in\mathcal{I}}\lambda_iP_{i,\max}-\sum_{i\in\mathcal{I}}\mu_i\tilde{C}_i(t)+\sum_{k\in\mathcal{U}}\nu_k\left(\eta_{k,\min}\sigma_k^2+\zeta_k(t)\right). \tag{34}$$

Then, the dual of Problem $\mathcal{P}_6$ is given by

$$\max_{\{\lambda_i\geq 0,\ \mu_i\geq 0,\ \nu_k\geq 0,\ \forall k,i\}}\ g(\boldsymbol{\lambda},\boldsymbol{\mu},\boldsymbol{\nu}). \tag{35}$$

Classic methods such as the subgradient or ellipsoid methods [31] can be employed to solve the dual problem (35) and to update the Lagrangian multipliers.

C. Low-complexity Algorithm

Combining Subsection IV-A and Subsection IV-B, we conceive an iterative algorithm to solve Problem $\mathcal{P}_3$ based on the first-order Taylor approximation (FOTA) method in Algorithm 1. It is readily seen that the optimal solution obtained at the $t$th iteration is also feasible for Problem $\mathcal{P}_3$ at the $(t+1)$th iteration, since the indicator function is smaller than one and it is approximated as the right hand side of (22). This implies that Algorithm 1 generates a non-increasing sequence of objective function values and finally converges to the Karush-Kuhn-Tucker solution of Problem $\mathcal{P}_4$, as proved in [33]. Note that the optimal beamforming solution obtained by Algorithm 1 is guaranteed to be rank one. In Algorithm 1, it is necessary to find the initial feasible set of beamforming vectors $\mathbf{w}(0)$. In Section V, we provide the UE selection algorithm to find the maximum number of admitted UEs. The corresponding beamforming vectors obtained can be set as the initial point of Algorithm 1. The reason is that the constraints of Problem $\mathcal{P}_3$ and Problem $\mathcal{P}_7$ are the same.

Algorithm 1 FOTA-based Algorithm to Solve Problem $\mathcal{P}_3$
1: Initialize the iteration number $t=1$, error tolerance $\delta$, small constant $\theta$ and feasible $\mathbf{w}(0)$; calculate $\tau_{i,k}(0)$, $\tilde{C}_i(0)$ and $\zeta_k(0)$; calculate the objective value of Problem $\mathcal{P}_6$, denoted as $\mathrm{Obj}(0)$.
2: Solve Problem $\mathcal{P}_6$ by using the Lagrange dual decomposition method to obtain $\{\mathbf{w}_k(t),\forall k\}$ with $\tau_{i,k}(t-1)$, $\tilde{C}_i(t-1)$ and $\zeta_k(t-1)$;
3: With $\{\mathbf{w}_k(t),\forall k\}$, update $\tau_{i,k}(t)$, $\tilde{C}_i(t)$ and $\zeta_k(t)$;
4: If $|\mathrm{Obj}(t-1)-\mathrm{Obj}(t)|/\mathrm{Obj}(t)<\delta$, terminate. Otherwise, set $t\leftarrow t+1$ and go to step 2.

D. Complexity Analysis

In this subsection, we analyze the computational complexity of Algorithm 1.
For notational simplicity, we assume that the candidate set size for each UE is equal to $L$, i.e. $|\mathcal{I}_k|=L,\ \forall k\in\mathcal{U}$. Note that in general $L$ is much smaller than the total number of RRHs $I$.

For Algorithm 1, the main complexity lies in solving Problem $\mathcal{P}_6$ by using the Lagrange dual decomposition method. In each iteration of the Lagrange dual decomposition method, the complexity is dominated by calculating $\mathbf{w}_k$ in (33). Note that the complexity of calculating $\mathbf{w}_k$ mainly lies in the calculation of $\mathbf{J}_k^{-1}(t)$. According to [31], for a complex matrix $\mathbf{A}\in\mathbb{C}^{N\times N}$, the complexity of calculating $\mathbf{A}^{-1}$ is on the order of $O(N^3)$. Hence, the complexity of calculating $\mathbf{w}_k$ is on the order of $O(M^3L^3)$. Since there are a total of $K$ UEs, the total complexity of the Lagrange dual decomposition method in each iteration is on the order of $O(KM^3L^3)$. Since there are a total of $(2I+K)$ dual variables, the total number of iterations required by the ellipsoid method is upper-bounded by $O[(2I+K)^2]$ [34]. Hence, the total complexity of the Lagrange dual decomposition method is given by $O[(2I+K)^2KM^3L^3]$. Let us denote by $t_{\mathrm{avg}}$ the average number of iterations required for Algorithm 1 to converge. Then the total complexity of Algorithm 1 imposed by solving Problem $\mathcal{P}_1$ is expressed as $T_{\mathcal{P}_1}=O[t_{\mathrm{avg}}(2I+K)^2KM^3L^3]$. Simulation results show that Algorithm 1 converges fast: typically 10 iterations are sufficient for the algorithm to converge.

V. LOW-COMPLEXITY UE SELECTION ALGORITHMS

In this section we solve the UE selection Problem $\mathcal{P}_1$. By substituting $r_k=R_{k,\min},\ \forall k$ into the fronthaul capacity constraint C3, we obtain an alternative optimization problem to Problem $\mathcal{P}_1$, which is expressed as follows:

$$\mathcal{P}_7: \max_{\mathbf{w},\,\mathcal{U}\subseteq\mathcal{U}}\ |\mathcal{U}|\quad \text{s.t. C2, C4, C5}, \tag{36}$$

where C4 and C5 are given in Problem $\mathcal{P}_3$.
Although Problem $\mathcal{P}_7$ is not the same as the original UE selection Problem $\mathcal{P}_1$, Problem $\mathcal{P}_7$ is equivalent to Problem $\mathcal{P}_1$ in the sense that both problems yield the same optimal set of selected UEs, the proof of which can be found in Appendix D. It should be noted that the optimal beamforming vectors obtained from solving Problem $\mathcal{P}_7$ may not be feasible for Problem $\mathcal{P}_1$. However, the aim of solving Problem $\mathcal{P}_7$ is twofold. Firstly, one can find the optimal set of selected UEs. Secondly, one can provide the initial feasible point for solving Problem $\mathcal{P}_3$ in Stage II, since both problems have the same set of constraints.

Inspired by the UE selection method of [35], we construct an alternative to Problem $\mathcal{P}_7$ by introducing a set of auxiliary variables $\{\varphi_k\}_{k\in\bar{\mathcal{U}}}$:

$$\mathcal{P}_8: \min_{\{\varphi_k\geq 0\}_{k\in\mathcal{U}},\,\mathbf{w}}\ \sum_{k\in\mathcal{U}}\varphi_k \tag{37a}$$

$$\text{s.t. C2, C5}, \tag{37b}$$

$$\text{C11}: \mathbf{w}_k^H\mathbf{A}_{k,k}\mathbf{w}_k+\varphi_k\geq\eta_{k,\min}\left(\mathbf{w}_k^H\mathbf{E}_{k,k}\mathbf{w}_k+\sum_{l\neq k,\,l\in\mathcal{U}}\mathbf{w}_l^H\mathbf{A}_{l,k}\mathbf{w}_l+\sigma_k^2\right),\ \forall k\in\mathcal{U}. \tag{37c}$$

Let us denote the solution of $\{\varphi_k\}_{k\in\mathcal{U}}$ by $\{\varphi_k^\star\}_{k\in\mathcal{U}}$. It is readily seen that Problem $\mathcal{P}_8$ is always feasible. If the optimal values $\{\varphi_k^\star\}_{k\in\mathcal{U}}$ are all equal to zero, then all UEs can be admitted to the network. Otherwise, some UEs should be removed from the system and rescheduled for the next opportunity. Intuitively, the UE having the largest value of $\varphi_k^\star$ has a higher probability of being removed, since it has the largest discrepancy from its rate target. Problem $\mathcal{P}_8$ can be solved similarly to Problem $\mathcal{P}_3$ of the above section, hence the details are omitted.

There are two low-complexity UE deletion methods. One is the successive UE deletion method that is provided in [10], [35]. The main idea is to remove the UE having the largest $\varphi_k^\star$ each time, until all the remaining optimal values of $\varphi_k^\star$ become equal to zero. The complexity of this algorithm increases linearly with the number of UEs, hence it is on the order of $O(K)$.
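The control flow of the successive UE deletion idea can be sketched as follows. This is only an outline under stated assumptions: `solve_p8` is a hypothetical callable standing in for a solver of Problem $\mathcal{P}_8$ that returns the slack value $\varphi_k^\star$ for each active UE, the tolerance `tol` used to decide that a slack is zero is our own choice, and `toy_oracle` is an artificial stand-in that has nothing to do with the actual beamforming problem.

```python
def successive_deletion(users, solve_p8, tol=1e-6):
    """Remove the UE with the largest slack until all slacks vanish.

    users:    iterable of UE identifiers.
    solve_p8: hypothetical callable mapping a list of active UEs to a
              dict {ue: slack}; it stands in for solving Problem P8.
    Returns the admitted UEs and the number of times solve_p8 was called.
    """
    active = list(users)
    n_solves = 0
    while active:
        phi = solve_p8(active)
        n_solves += 1
        worst = max(active, key=lambda k: phi[k])
        if phi[worst] <= tol:
            break                 # all remaining slacks are (numerically) zero
        active.remove(worst)      # drop the UE farthest from its target
    return active, n_solves

def toy_oracle(demand, budget):
    """Artificial slack model: excess demand is shared in proportion to demand."""
    def solve(active):
        total = sum(demand[k] for k in active)
        excess = max(0.0, total - budget)
        return {k: demand[k] * excess / total for k in active}
    return solve

admitted, n_solves = successive_deletion(
    [1, 2, 3, 4], toy_oracle({1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}, budget=6.0))
print(admitted, n_solves)
```

Each pass removes at most one UE, so the oracle is called at most $K$ times, which matches the linear complexity order stated above.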
This algorithm is suitable for medium-sized networks. The other technique is the bisection based search method proposed in [16]. The main idea is to sort $\{\varphi_k^\star\}_{k\in\mathcal{U}}$ in descending order $\varphi_{\pi_1}^\star\geq\cdots\geq\varphi_{\pi_K}^\star$. Then, one should find a minimum $L_0$ for ensuring that all the UEs in $\mathcal{U}=\{\pi_{L_0+1},\cdots,\pi_K\}$ can be supported, with $L_0=1,\cdots,K-1$. The bisection search method is used to iteratively find the optimal $L_0$ by updating its upper bound and lower bound. The complexity of the bisection based method is on the order of $O(\lceil\log_2(1+K)\rceil)$, which is suitable for very dense networks supporting a large number of UEs. The details of these two algorithms are not shown here for simplicity.

It should be emphasized that when using the iterative algorithm in the above section to solve Problem $\mathcal{P}_8$, the iterative procedure will terminate once the intermediate solutions of $\{\varphi_k\}_{k\in\mathcal{U}}$ are all equal to zero. Hence, the data rates of some UEs with the obtained beamforming solution are strictly larger than their minimum rate requirements.

VI. SIMULATION RESULTS

In this section, we provide simulation results to evaluate the performance of the proposed robust algorithms. Two types of UD-CRAN networks are considered: a small UD-CRAN deployed in a square area of [400 m × 400 m] and a larger one of [700 m × 700 m]. Both the UEs and RRHs are uniformly distributed in these areas. For the small one, the numbers of RRHs and UEs are set to $I=14$ and $K=8$, with densities of 87.5 RRHs/km$^2$ and 50 UEs/km$^2$, respectively. For the large one, the numbers are set to $I=42$ and $K=24$, with densities of 85.7 RRHs/km$^2$ and 49 UEs/km$^2$, respectively. These two scenarios comply with the ultra-dense networks in the fifth-generation (5G) wireless system [36], where the density of BSs will be up to 40-50 BSs/km$^2$.
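The quoted densities follow directly from the node counts and the deployment areas. The short check below reproduces them; the helper name `density_per_km2` is our own and is used for illustration only.

```python
def density_per_km2(count, side_m):
    """Nodes per square kilometre in a square area with side length in metres."""
    area_km2 = (side_m / 1000.0) ** 2
    return count / area_km2

# Small UD-CRAN: 400 m x 400 m with I = 14 RRHs and K = 8 UEs.
small = (density_per_km2(14, 400.0), density_per_km2(8, 400.0))
# Larger UD-CRAN: 700 m x 700 m with I = 42 RRHs and K = 24 UEs.
large = (density_per_km2(42, 700.0), density_per_km2(24, 700.0))
print([round(v, 1) for v in small + large])
```

Both scenarios therefore have nearly the same node densities and differ mainly in the absolute network size.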
The channels are generated according to the LTE specifications [37], which are composed of three elements: 1) the large-scale path loss given by $PL=148.1+37.6\log_{10}d$ (dB), where $d$ is the distance between an RRH and a UE in km; 2) the log-normal shadowing fading having a zero mean and 8 dB standard deviation; 3) small-scale Rayleigh fading with zero mean and unit variance. For ease of exposition, all UEs are assumed to have the same rate constraints of $R_{\min}=R_{k,\min},\ \forall k$, and all RRHs have the same power constraints of $P_{\max}=P_{i,\max},\ \forall i$. Furthermore, the fronthaul capacity constraints are assumed to be the same for all RRHs, i.e., $C_{\max}=C_{i,\max},\ \forall i$, and we consider the normalized fronthaul capacity constraints (with respect to each UE's rate target), i.e., $\tilde{C}_{\max}=C_{\max}/R_{\min}$. Note that $\tilde{C}_{\max}$ can be interpreted as the maximum number of UEs that can be supported by each fronthaul link. For simplicity, each UE is assumed to choose its nearest $L$ RRHs as its serving candidate set, i.e., $|\mathcal{I}_k|=L,\ \forall k$. The maximum pilot reuse times for the small and large UD-CRANs are $n_{\max}=2$ and $n_{\max}=3$, respectively. The total number of time slots in each time frame is $T=200$, and the numbers of CDI and PA quantization bits for each RRH are set as $B_{\mathrm{CDI}}=4$ and $B_{\mathrm{PA}}=2$, respectively. Unless otherwise stated, the simulation parameters are given in Table II and the following results are obtained by averaging over 100 channel generations.

A. Smaller UD-CRAN

In this section, we evaluate the performance of our algorithm in the small C-RAN network, where the simulation results in Fig. 2-Fig. 9 are based on this scenario. We first study the impact of the initial points on the convergence behaviour of the FOTA-based Algorithm to solve Problem $\mathcal{P}_8$. Fig.
2 shows the objective value of Problem $\mathcal{P}_8$ versus the number of iterations for one randomly generated set of channel realizations for two cases of $R_{\min}=1$ bit/s/Hz and $R_{\min}=2$ bit/s/Hz. Since Problem $\mathcal{P}_8$ is a non-convex problem, different initial points may lead to different solutions. To investigate this effect, we consider two initialization schemes: 1) Rand-initial: In this scheme, both the power allocation and the beamforming direction on each beam are randomly generated; 2) CM-initial: For this scheme, the total power on each RRH is equally split among its served UEs and the beamforming direction is set to be the same as its channel direction. It can be observed from Fig. 2 that the algorithm with different initial points has different convergence speeds, but converges to the same objective value. It is difficult to tell from Fig. 2 which initialization scheme converges faster. For both schemes, five iterations are sufficient for the algorithm to converge. When $R_{\min}=1$ bit/s/Hz, the algorithm converges to zero, which means that all the UEs can be admitted. However, when $R_{\min}=2$ bit/s/Hz, the algorithm converges to a positive value, which implies that some UEs should be deleted.

TABLE II
MAIN SIMULATION PARAMETERS

| Parameters | Value | Parameters | Value |
| Number of antennas $M$ | 2 | System bandwidth $B$ | 20 MHz |
| Noise power density | -174 dBm/Hz [37] | Error tolerance $\delta$ | $10^{-5}$ |
| Small constant $\theta$ | $10^{-5}$ | Pilot power $p_t$ | 200 mW |
| Maximum transmit power $P_{\max}$ | 100 mW | Rate target $R_{\min}$ | 3 bit/s/Hz |
| Candidate size $L$ | 3 | Normalized fronthaul limits $\tilde{C}_{\max}$ | 3 |

Next, we study the convergence behaviour of the proposed two UE selection algorithms. Specifically, Fig.
3 illustrates the number of UEs to be checked versus the number of times Problem $\mathcal{P}_8$ is solved for a randomly generated network, where the successive UE deletion method and the bisection search method are labeled as 'Suc' and 'Bis', respectively. It can be found from Fig. 3 that the number of UEs to be checked for the 'Suc' algorithm always decreases with the number of times Problem $\mathcal{P}_8$ is solved, while that of the 'Bis' algorithm fluctuates during the procedure. These observations are consistent with the features of these two algorithms. Interestingly, for both rate targets, the number of times Problem $\mathcal{P}_8$ is solved by the 'Bis' algorithm is fixed at five. However, the number of times Problem $\mathcal{P}_8$ is solved by the 'Suc' algorithm depends on the rate targets. For the example in Fig. 3, the 'Suc' algorithm only needs three times when $R_{\min}=2$ bit/s/Hz, but six times when $R_{\min}=6$ bit/s/Hz.

[Fig. 2. Convergence behaviour of the FOTA-based Algorithm to solve Problem $\mathcal{P}_8$ under different initial points. Axes: objective value of Problem $\mathcal{P}_8$ versus number of iterations.]

[Fig. 3. Convergence behaviour of the proposed two UE selection algorithms. Axes: number of UEs to be checked versus number of times to solve Problem $\mathcal{P}_8$.]

In Fig. 4, we plot the number of UEs admitted by the various algorithms versus the rate targets for the smaller UD-CRAN.
The exhaustive UE search algorithm (labeled as 'Exhaustive search') is used as a performance benchmark, which checks all subsets of UEs and chooses the one having the largest number of admitted UEs. Note that the computational complexity of the exhaustive search is on the order of $O(2^K)$. As expected, the number of UEs admitted by all the algorithms is reduced upon increasing the UEs' data rate targets. The exhaustive search method performs better than the other two algorithms, which comes at the expense of a high computational complexity. However, its performance gain is negligible in the low rate regime. In the high data rate target regime, the performance gain of the exhaustive search method over the successive UE deletion still remains limited to 0.5 UEs. Hence, for moderate-sized UD-CRANs, the successive UE deletion is a good option. The bisection based search method has a modest performance loss compared to the other two algorithms. Hence, the bisection based search method is more suitable for larger UD-CRANs, as a benefit of its lowest complexity.

Fig. 5 compares the execution time of the various UE selection algorithms on an E5-1650 CPU operating at 3.5 GHz. This figure shows that for small $R_{\min}$, all the algorithms have almost the same execution time. This can be explained as follows. In the small $R_{\min}$ regime, almost all the UEs can be admitted, hence the algorithms only need to solve Problem $\mathcal{P}_8$ once. However, for large $R_{\min}$, the exhaustive search algorithm needs a significantly higher execution time than the proposed two UE selection algorithms, and the gap increases with $R_{\min}$. The execution time required by the successive UE deletion increases with $R_{\min}$, and for large $R_{\min}$ it is slightly higher than that of the bisection search algorithm, which may even decrease with $R_{\min}$.

Fig.
6 shows the convergence behaviour of the FOTA-based Algorithm under different rate targets, where the bisection search algorithm is employed for selecting the admitted UEs. The average numbers of admitted UEs for different rate targets are shown in this figure. It is seen from this figure that our proposed algorithm converges rapidly and generally three iterations are sufficient for the algorithm to converge under all considered rate targets, which is appealing for practical applications. Since the number of UEs admitted for the larger $R_{\min}$ is smaller, the larger $R_{\min}$ may not yield higher transmit power.

[Fig. 4. Number of UEs admitted by various algorithms versus different rate targets. Axes: average number of admitted UEs versus $R_{\min}$ (bit/s/Hz).]

[Fig. 5. Execution time for various UE selection algorithms. Axes: execution time (s) versus $R_{\min}$ (bit/s/Hz).]

In Fig. 7, we compare the performance of the FOTA-based Algorithm with that of the exhaustive search method. For the latter algorithm, if $|\mathcal{U}_i|\geq\tilde{C}_{\max}$, the algorithm checks all possible subsets of $\mathcal{U}_i$ with size $\tilde{C}_{\max}$, and chooses the one with the minimum transmit power. It is observed from Fig. 7 that our proposed algorithm achieves almost the same performance as that of the exhaustive search method, which confirms the effectiveness of our proposed algorithm. The corresponding execution time for these two algorithms is shown in Fig. 8. We can observe from Fig. 8 that the exhaustive search method requires a much longer execution time than the proposed FOTA-based Algorithm for small $R_{\min}$, and almost the same for large $R_{\min}$. The reason can be explained as follows.
For the case of small $R_{\min}$, more UEs can be admitted in the network, so that more RRHs will satisfy the condition $|\mathcal{U}_i|\geq\tilde{C}_{\max}$. Then the number of subsets to be checked is large, which leads to a high computational complexity. However, for the case of large $R_{\min}$, only a small number of UEs can be admitted, as seen in Fig. 4. Then, almost all the RRHs satisfy the fronthaul capacity constraint, and it is not necessary for the exhaustive search method to enumerate the UE-RRH associations, leading to almost the same complexity as that of our algorithm. Note that since the time required by the proposed algorithm is within one second and the algorithm converges within five iterations as seen in Fig. 6, the execution time for each iteration of the FOTA-based Algorithm is within 0.2 second.

[Fig. 6. Convergence behaviour of the FOTA-based Algorithm to solve Problem $\mathcal{P}_3$ under different rate targets. Axes: total transmit power (W) versus number of iterations.]

[Fig. 7. Total transmit power versus the rate targets for the proposed algorithm and the exhaustive search method. Axes: total transmit power (W) versus $R_{\min}$ (bit/s/Hz).]

[Fig. 8. Execution time for the proposed FOTA-based Algorithm and the exhaustive search method. Axes: execution time (s) versus $R_{\min}$ (bit/s/Hz).]

[Fig. 9. The achievable data rate for various algorithms, where the rate target is set as $R_{\min}=2$ bit/s/Hz. Axes: actual achievable data rate (bit/s/Hz) versus UE index.]

Now, we study the robustness of the proposed algorithm against the following four algorithms:

1) Only Robust to Channel Quantization (labeled as 'Robust CH Quan.'): This method only takes into account the effect of channel quantization when designing the beamforming vectors, regardless of the channel estimation errors.

2) Only Robust to Channel Estimation Error (labeled as 'Robust CH Esti.'): As the terminology suggests, this method only considers the effects of channel estimation errors, and naively treats the feedback CDI and PA as perfect. Then, the SDP method proposed in [23] can be adopted to solve the resultant optimization problem.

3) Only Feeding back the CDI Information (labeled as 'CDI FB Only'): In this method, each UE only feeds back the CDI index to the BBU pool, without considering the PA information. The $\mathbf{A}$ matrix derived in Appendix A can be recalculated without considering the PA quantization information and the statistics of the quantization error.

4) Non-robust Beamforming Design (labeled as 'Non-robust'): Neither channel quantization errors nor channel estimation errors are considered by this algorithm and the feedback CDI and PA are regarded as perfect.

TABLE III
POWER CONSUMPTION AND NUMBERS OF ADMITTED UES FOR VARIOUS METHODS

| | Proposed Robust | Robust CH Quan. | Robust CH Esti. | CDI FB Only | Non-robust |
| Power consumption (mW) | 167 | 130 | 158 | 149 | 114 |
Table III reports the total power consumption required by the various methods for one random channel generation, where all eight UEs are admitted. It can be seen that our proposed algorithm has the highest power consumption, since it requires more power to compensate for both the channel estimation errors and the channel quantization errors. Note that the non-robust algorithm requires the least power, since these errors are not considered. However, it is important to observe each UE's actual achievable data rate attained by these algorithms. Fig. 9 shows each UE's actual achievable data rate for all the methods. It is seen that all UEs' rate requirements are satisfied by our proposed robust algorithm, which confirms the effectiveness of our proposed algorithm. For the 'Robust CH Quan.' method, the UEs' rate requirements are not fulfilled, since the channel estimation errors are not considered. Hence, the channel estimation error cannot be ignored when designing the beamforming vector due to the non-negligible pilot contamination. For the 'Robust CH Esti.' method, the statistical information of the channel quantization error is not considered and some UEs' actual achievable data rates are lower than the rate target, such as those of UE 2 and UE 8. It is also observed that some UEs have much higher rates, indicating that the power and spatial resources are not properly allocated by the 'Robust CH Esti.' method. For the 'CDI FB Only' method, the actual achievable data rates of all UEs are lower than the rate targets, and UE 2's data rate is even lower than 1 bit/s/Hz. This confirms the importance of feeding back the PA information for coherent transmission. Finally, some UEs' actual achievable data rates are below the rate target for the 'Non-robust' method, as it naively treats the feedback CSI as perfect. However, it is observed in Fig.
9 that, even with non-perfect PA feedback information, the performance of the 'Non-robust' method is better than that of the 'CDI FB Only' method in terms of the number of UEs that satisfy the rate target. In summary, only our proposed algorithm is capable of maintaining the guaranteed rates for each UE, since it jointly considers the effects of channel estimation errors and channel quantization errors, which are (partially) ignored by the other algorithms.

[Fig. 10. Number of UEs admitted by various algorithms versus CDI quantization bits $B_{\mathrm{CDI}}$ for a large UD-CRAN. Axes: number of admitted UEs versus number of CDI bits.]

[Fig. 11. Number of UEs admitted by various algorithms versus PA quantization bits $B_{\mathrm{PA}}$ for a large UD-CRAN. Axes: number of admitted UEs versus number of PA bits.]

B. Larger UD-CRANs

The following simulation results are based on the larger UD-CRAN. We investigate the effects of different system parameters on the performance of the proposed algorithm. Since UD-CRANs will be deployed in hot spots, where the number of UEs is high and the communication resources are limited, maximizing the number of admitted users for each time frame should be a high priority. Additionally, according to the results of Fig. 6, the power consumption may not provide sufficient insights, since its value mainly depends on the number of UEs selected from Stage I. Hence, in the following, we only consider the performance in terms of the number of UEs that can be supported. For comparison, the performance of the algorithm having perfect intra-cluster CSI [16] is also simulated as a performance benchmark.
The bisection based search method and the successive UE deletion method of the robust algorithm are denoted as ‘Robust-Bis’ and ‘Robust-Suc’, respectively, while ‘Perfect-intraCSI-Bis’ and ‘Perfect-intraCSI-Suc’ represent the two methods for the case of perfect intra-cluster CSI.

1) Impact of the number of CDI quantization bits: Fig. 10 illustrates the impact of the number of CDI quantization bits $B^{\mathrm{CDI}}$ on the system performance. Similar observations can be made in Fig. 10 as in Fig. ??. Note that when $B^{\mathrm{CDI}}$ increases from 2 to 6, three more UEs can be admitted by the proposed robust algorithms, and the number of admitted UEs no longer increases for $B^{\mathrm{CDI}} \geq 6$. This is due to the fact that each RRH is equipped with two antennas, so that a small number of CDI quantization bits is sufficient to achieve good performance. A fixed performance gap is observed between the robust algorithms and those relying on perfect intra-cluster CSI when $B^{\mathrm{CDI}} \geq 6$, due to the additional channel estimation error.

2) Impact of the number of PA quantization bits: Now, we study the impact of the important system parameter $B^{\mathrm{PA}}$ in Fig. 11. It is seen from this figure that there is a slight increase in the number of UEs admitted by the robust algorithms when $B^{\mathrm{PA}}$ increases from 1 to 3, and that it saturates when $B^{\mathrm{PA}} \geq 3$. This is a very encouraging result, implying that only a small number of bits is necessary for the PA quantization, which reduces the feedback overhead while guaranteeing good performance. In particular, even one bit used for PA quantization can achieve 90% of the performance attained with perfect PA information.

VII. CONCLUSIONS

This paper provided a complete framework for dealing with the unavailability of full CSI in user-centric UD-CRANs, where only partial inter-cluster CSI and a quantized version of the intra-cluster CSI are available at the BBU pool. We derived the achievable data rate expression by exploiting the statistical characteristics of various channel uncertainties.
Based on this, we developed a low-complexity robust beamforming algorithm for minimizing the total transmit power, while guaranteeing each user's rate requirement and the fronthaul capacity constraints. In addition, to ensure the feasibility of the problem, a pair of low-complexity user selection algorithms are provided as well. Simulation results show that our proposed robust algorithm significantly outperforms the existing state-of-the-art algorithms in terms of providing the required guaranteed quality-of-service (QoS) for the users. Furthermore, extensive simulation results are provided to study the impact of different system parameters on the performance. One important new observation was made: one bit for quantizing each RRH's PA is enough to obtain a large proportion of the performance obtained with perfect PA information.

APPENDIX A
DERIVATIONS OF $\mathbf{A}_{k,k}$ AND $\mathbf{A}_{l,k}$

A. Derivation of $\mathbf{A}_{k,k}$

We first derive the expression of $\mathbf{A}_{k,k}$, which is equal to $\mathbf{A}_{k,k} = \mathbb{E}\{\hat{\mathbf{g}}_{k,k}\hat{\mathbf{g}}_{k,k}^{H}\}$. Denote the indices of $\mathcal{I}_k$ as $\mathcal{I}_k = \{s_1^k, \cdots, s_{|\mathcal{I}_k|}^k\}$. To calculate $\mathbf{A}_{k,k}$, we first have to calculate $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\}$ and $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_l^k,k}^{H}\}$, where $i \neq l$. The derivation of $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\}$ is based on the following lemma.

Lemma 1: The random vectors and variables have the following properties:
$$\mathbb{E}\{\mathbf{q}_{s_i^k,k}\mathbf{u}_{s_i^k,k}^{H}\} = \mathbf{0}, \tag{A.1}$$
$$\mathbb{E}\{a_{s_i^k,k}\} = 2^{B^{\mathrm{CDI}}_{s_i^k,k}}\,\beta\!\left(2^{B^{\mathrm{CDI}}_{s_i^k,k}}, \frac{M}{M-1}\right) \triangleq \rho_{s_i^k,k}, \tag{A.2}$$
$$\mathbb{E}\{\mathbf{u}_{s_i^k,k}\mathbf{u}_{s_i^k,k}^{H}\} = \frac{1}{M-1}\left(\mathbf{I}_M - \mathbf{q}_{s_i^k,k}\mathbf{q}_{s_i^k,k}^{H}\right) \triangleq \mathbf{O}_{s_i^k,k}, \tag{A.3}$$
where $\beta(a,b)$ is the beta function.

Proof: (A.1) follows since $\mathbf{q}_{s_i^k,k}$ is orthogonal to $\mathbf{u}_{s_i^k,k}$, while (A.2) and (A.3) are from [38] and [39], respectively. $\blacksquare$

Based on Lemma 1, we embark on deriving $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\}$.
According to Section II-B and Section II-C, the quantized channel $\hat{\mathbf{h}}_{s_i^k,k}$ can be rewritten as
$$\hat{\mathbf{h}}_{s_i^k,k} = \left\|\hat{\mathbf{h}}_{s_i^k,k}\right\|\left(\sqrt{1-a_{s_i^k,k}}\,e^{j\phi_{s_i^k,k}}\mathbf{q}_{s_i^k,k} + \sqrt{a_{s_i^k,k}}\,\mathbf{u}_{s_i^k,k}\right). \tag{A.4}$$
Then $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\}$ can be derived as
$$\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\} = \mathbb{E}\left\{\left\|\hat{\mathbf{h}}_{s_i^k,k}\right\|^2\right\}\left(\mathbb{E}\{1-a_{s_i^k,k}\}\,\mathbf{q}_{s_i^k,k}\mathbf{q}_{s_i^k,k}^{H} + \mathbb{E}\{a_{s_i^k,k}\}\,\mathbb{E}\{\mathbf{u}_{s_i^k,k}\mathbf{u}_{s_i^k,k}^{H}\}\right) \tag{A.5}$$
$$= \omega_{s_i^k,k}M\left[\left(1-\rho_{s_i^k,k}\right)\mathbf{q}_{s_i^k,k}\mathbf{q}_{s_i^k,k}^{H} + \rho_{s_i^k,k}\mathbf{O}_{s_i^k,k}\right], \tag{A.6}$$
where (A.1) is used in (A.5), while (A.2) and (A.3) are used in (A.6). Note that the PA quantization does not affect the value of $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\}$.

To calculate $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_l^k,k}^{H}\}$ with $i \neq l$, the following lemma should be employed.

Lemma 2: The random vectors and variables in (A.4) have the following properties:
$$\mathbb{E}\{e^{j\phi_{s_i^k,k}}\} = e^{j\hat{\phi}_{s_i^k,k}}\,\frac{2^{B^{\mathrm{PA}}_{s_i^k,k}}}{\pi}\sin\!\left(\frac{\pi}{2^{B^{\mathrm{PA}}_{s_i^k,k}}}\right) \triangleq e^{j\hat{\phi}_{s_i^k,k}}\,\xi_{s_i^k,k}, \tag{A.7}$$
$$\mathbb{E}\left\{\left\|\hat{\mathbf{h}}_{s_i^k,k}\right\|\right\} = \sqrt{\omega_{s_i^k,k}}\,\frac{\Gamma\!\left(M+\frac{1}{2}\right)}{\Gamma(M)} \triangleq \varsigma_{s_i^k,k}, \tag{A.8}$$
$$\mathbb{E}\left\{\sqrt{1-a_{s_i^k,k}}\right\} = \sum_{m=1}^{2^{B^{\mathrm{CDI}}_{s_i^k,k}}} C^{m}_{2^{B^{\mathrm{CDI}}_{s_i^k,k}}}(-1)^{m+1}\,m(M-1)\,\beta\!\left(m(M-1), \frac{3}{2}\right) \triangleq \Omega_{s_i^k,k}, \tag{A.9}$$
where $C_n^r = \frac{n!}{r!(n-r)!}$.

Proof: We first prove equality (A.7). Specifically, writing $B = B^{\mathrm{PA}}_{s_i^k,k}$ for brevity, we have
$$\mathbb{E}\{e^{j\phi_{s_i^k,k}}\} = e^{j\hat{\phi}_{s_i^k,k}}\,\mathbb{E}\{e^{j\tilde{\phi}_{s_i^k,k}}\} = e^{j\hat{\phi}_{s_i^k,k}}\int_{-\pi/2^{B}}^{\pi/2^{B}} e^{jx}\,\frac{2^{B}}{2\pi}\,dx = e^{j\hat{\phi}_{s_i^k,k}}\int_{-\pi/2^{B}}^{\pi/2^{B}} (\cos x + j\sin x)\,\frac{2^{B}}{2\pi}\,dx = e^{j\hat{\phi}_{s_i^k,k}}\,\frac{2^{B}}{\pi}\sin\!\left(\frac{\pi}{2^{B}}\right).$$
For (A.8), let us define $\breve{\mathbf{h}}_{s_i^k,k} \triangleq \hat{\mathbf{h}}_{s_i^k,k}/\sqrt{\omega_{s_i^k,k}}$. Then, $\|\breve{\mathbf{h}}_{s_i^k,k}\|$ obeys a scaled (by a factor of $1/\sqrt{2}$) chi distribution with $2M$ degrees of freedom. Hence, $\mathbb{E}\{\|\breve{\mathbf{h}}_{s_i^k,k}\|\} = \frac{\Gamma(M+\frac{1}{2})}{\Gamma(M)}$ [29] and (A.8) is proved. Finally, we embark on proving (A.9).
Define $\upsilon \triangleq 1 - a_{s_i^k,k}$. According to Lemma 1 in [38], the probability density function (PDF) of $\upsilon$ with $M$ antennas using a $2^{B}$ RVQ codebook is given by
$$f_{\upsilon}(\upsilon) = \sum_{i=1}^{2^{B}} C^{i}_{2^{B}}(-1)^{i+1}\,i(M-1)(1-\upsilon)^{i(M-1)-1}. \tag{A.10}$$
Define $x \triangleq \sqrt{\upsilon} \in [0,1]$; then the Jacobian of the transformation is given by $J = \frac{dx}{d\upsilon} = \frac{1}{2}\upsilon^{-\frac{1}{2}}$. Hence, the PDF of $x$ is given by
$$f_{x}(x) = 2\sum_{i=1}^{2^{B}} C^{i}_{2^{B}}(-1)^{i+1}\,i(M-1)\left(1-x^{2}\right)^{i(M-1)-1}x. \tag{A.11}$$
Then the expectation of $x$ is given by
$$\mathbb{E}\{x\} = \int_{0}^{1} x f_{x}(x)\,dx = 2\sum_{m=1}^{2^{B}} C^{m}_{2^{B}}(-1)^{m+1}\,m(M-1)\int_{0}^{1}\left(1-x^{2}\right)^{m(M-1)-1}x^{2}\,dx = \sum_{m=1}^{2^{B}} C^{m}_{2^{B}}(-1)^{m+1}\,m(M-1)\,\beta\!\left(m(M-1), \frac{3}{2}\right),$$
where the last equality is due to the fact that the beta function can be represented as [28], [40]
$$\beta\!\left(c, \frac{a}{b}\right) = b\int_{0}^{1} x^{a-1}\left(1-x^{b}\right)^{c-1}dx \tag{A.12}$$
with $a = 3$, $b = 2$ and $c = m(M-1)$. Hence, (A.9) follows with $B = B^{\mathrm{CDI}}_{s_i^k,k}$. $\blacksquare$

Based on Lemma 2, $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_l^k,k}^{H}\}$, $i \neq l$, can be calculated as
$$\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_l^k,k}^{H}\} = \mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\}\,\mathbb{E}\{\hat{\mathbf{h}}_{s_l^k,k}^{H}\} \tag{A.13}$$
$$= \varsigma_{s_i^k,k}\,\varsigma_{s_l^k,k}\,\Omega_{s_i^k,k}\,\Omega_{s_l^k,k}\,\xi_{s_i^k,k}\,\xi_{s_l^k,k}\,e^{j\left(\hat{\phi}_{s_i^k,k}-\hat{\phi}_{s_l^k,k}\right)}\mathbf{q}_{s_i^k,k}\mathbf{q}_{s_l^k,k}^{H}, \tag{A.14}$$
where (A.13) follows since $\mathbf{h}_{s_i^k,k}$ is independent of $\mathbf{h}_{s_l^k,k}$, while (A.14) follows since $\mathbb{E}\{\mathbf{u}_{s_i^k,k}\} = \mathbf{0}_M$ and by using Lemma 2.

Based on the above results, $\mathbf{A}_{k,k}$ is given by
$$\mathbf{A}_{k,k} = \begin{bmatrix} (\mathbf{A}_{k,k})_{1,1} & \cdots & (\mathbf{A}_{k,k})_{1,|\mathcal{I}_k|} \\ \vdots & \ddots & \vdots \\ (\mathbf{A}_{k,k})_{|\mathcal{I}_k|,1} & \cdots & (\mathbf{A}_{k,k})_{|\mathcal{I}_k|,|\mathcal{I}_k|} \end{bmatrix}, \tag{A.15}$$
where $(\mathbf{A}_{k,k})_{i,l} \in \mathbb{C}^{M\times M}$, $i, l \in \{1, \cdots, |\mathcal{I}_k|\}$, is the block matrix of $\mathbf{A}_{k,k}$ at the $i$th row and $l$th column. If $i = l$, then $(\mathbf{A}_{k,k})_{i,l} = \mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\}$. Otherwise, $(\mathbf{A}_{k,k})_{i,l} = \mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_l^k,k}^{H}\}$.

B. Derivation of $\mathbf{A}_{l,k}$

Let us denote the indices of $\mathcal{I}_l$ by $\mathcal{I}_l = \{s_1^l, \cdots, s_{|\mathcal{I}_l|}^l\}$.
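Then $\mathbf{A}_{l,k}$ can be represented as
$$\mathbf{A}_{l,k} = \begin{bmatrix} (\mathbf{A}_{l,k})_{1,1} & \cdots & (\mathbf{A}_{l,k})_{1,|\mathcal{I}_l|} \\ \vdots & \ddots & \vdots \\ (\mathbf{A}_{l,k})_{|\mathcal{I}_l|,1} & \cdots & (\mathbf{A}_{l,k})_{|\mathcal{I}_l|,|\mathcal{I}_l|} \end{bmatrix}, \tag{A.16}$$
where $(\mathbf{A}_{l,k})_{i,m} = \mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\}$. To derive $\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\}$, we should discuss four cases:
1) $s_i^l, s_m^l \in \mathcal{I}_k$, $i \neq m$;
2) $s_i^l, s_m^l \in \mathcal{I}_k$, $i = m$;
3) $s_i^l, s_m^l \notin \mathcal{I}_k$, $i = m$;
4) $s_i^l \notin \mathcal{I}_k$ or $s_m^l \notin \mathcal{I}_k$, $i \neq m$.

For Case 1), both RRH $s_i^l$ and RRH $s_m^l$ belong to UE $k$'s cluster $\mathcal{I}_k$, but they are not the same RRH. Then, $\mathbf{h}_{s_i^l,k}$ can be rewritten as
$$\mathbf{h}_{s_i^l,k} = \left\|\hat{\mathbf{h}}_{s_i^l,k}\right\|\left(\sqrt{1-a_{s_i^l,k}}\,e^{j\phi_{s_i^l,k}}\mathbf{q}_{s_i^l,k} + \sqrt{a_{s_i^l,k}}\,\mathbf{u}_{s_i^l,k}\right) + \mathbf{e}_{s_i^l,k}. \tag{A.17}$$
By exploiting the facts that $\mathbb{E}\{\mathbf{u}_{s_i^l,k}\} = \mathbf{0}_M$ and $\mathbb{E}\{\mathbf{e}_{s_i^l,k}\} = \mathbf{0}_M$, together with Lemma 2, $\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\}$ can be derived similarly to $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_l^k,k}^{H}\}$ in (A.14), which is given by
$$\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\} = \varsigma_{s_i^l,k}\,\varsigma_{s_m^l,k}\,\Omega_{s_i^l,k}\,\Omega_{s_m^l,k}\,\xi_{s_i^l,k}\,\xi_{s_m^l,k}\,e^{j\left(\hat{\phi}_{s_i^l,k}-\hat{\phi}_{s_m^l,k}\right)}\mathbf{q}_{s_i^l,k}\mathbf{q}_{s_m^l,k}^{H}. \tag{A.18}$$

For Case 2), RRH $s_i^l$ and RRH $s_m^l$ represent the same RRH belonging to $\mathcal{I}_k$. By using (A.17) and the facts that $\mathbb{E}\{\mathbf{q}_{s_i^l,k}\mathbf{e}_{s_i^l,k}^{H}\} = \mathbf{0}$ and $\mathbb{E}\{\mathbf{u}_{s_i^l,k}\mathbf{e}_{s_i^l,k}^{H}\} = \mathbf{0}$, together with Lemma 1, $\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\}$ can be derived similarly to $\mathbb{E}\{\hat{\mathbf{h}}_{s_i^k,k}\hat{\mathbf{h}}_{s_i^k,k}^{H}\}$ in (A.6), which is given by
$$\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\} = \omega_{s_i^l,k}M\left[\left(1-\rho_{s_i^l,k}\right)\mathbf{q}_{s_i^l,k}\mathbf{q}_{s_i^l,k}^{H} + \rho_{s_i^l,k}\mathbf{O}_{s_i^l,k}\right] + \delta_{s_i^l,k}\mathbf{I}_M, \tag{A.19}$$
where $\delta_{s_i^l,k}$ is the channel estimation error given in (5).

For Case 3), both indices represent the same RRH, which is not in UE $k$'s cluster $\mathcal{I}_k$. Then, $\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\}$ is given by $\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\} = \alpha_{s_i^l,k}\mathbf{I}_M$, since we have assumed that only large-scale fading gains are available for the out-cluster RRHs in the BBU pool.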
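For Case 4), it can be readily shown that $\mathbb{E}\{\mathbf{h}_{s_i^l,k}\mathbf{h}_{s_m^l,k}^{H}\} = \mathbf{0}_M$, since at least one RRH does not belong to $\mathcal{I}_k$ and they are not the same RRH.

APPENDIX B
DERIVATION OF INEQUALITY (24)

Before deriving (24), we first introduce the concept of the T-transform. For a complex vector $\mathbf{x} \in \mathbb{C}^{n}$ and a complex matrix $\mathbf{X} \in \mathbb{C}^{n\times m}$, a one-to-one mapping $\mathbb{C}^{n} \to \mathbb{R}^{2n}$ and $\mathbb{C}^{n\times m} \to \mathbb{R}^{2n\times 2m}$ is defined as
$$T(\mathbf{x}) = \begin{bmatrix} \mathrm{Re}(\mathbf{x}) \\ \mathrm{Im}(\mathbf{x}) \end{bmatrix}, \quad T(\mathbf{X}) = \begin{bmatrix} \mathrm{Re}(\mathbf{X}) & -\mathrm{Im}(\mathbf{X}) \\ \mathrm{Im}(\mathbf{X}) & \mathrm{Re}(\mathbf{X}) \end{bmatrix}. \tag{B.1}$$
The T-transform establishes the relationship between complex-valued vectors or matrices and their real-valued counterparts, which facilitates the derivation of the first-order Taylor expansion for a complex-valued function.

Let us define $\hat{\mathbf{x}} \triangleq T(\mathbf{x})$ and $\hat{\mathbf{X}} \triangleq T(\mathbf{X})$ as the T-transform results of the complex-valued vector $\mathbf{x}$ and matrix $\mathbf{X}$, respectively. Then we have the following two properties for the T-transform [26]:
$$\mathbf{y} = \mathbf{A}\mathbf{x} \Leftrightarrow \hat{\mathbf{y}} = \hat{\mathbf{A}}\hat{\mathbf{x}}, \tag{B.2}$$
$$\mathrm{Re}\left(\mathbf{x}^{H}\mathbf{y}\right) = \hat{\mathbf{x}}^{T}\hat{\mathbf{y}}. \tag{B.3}$$
Based on the above results, $\mathbf{w}_k^{H}\mathbf{A}_{k,k}\mathbf{w}_k$ can be equivalently written in terms of the real-valued vector $\hat{\mathbf{w}}_k$ and the real-valued matrix $\hat{\mathbf{A}}_{k,k}$ as
$$\mathbf{w}_k^{H}\mathbf{A}_{k,k}\mathbf{w}_k \overset{(a)}{=} \mathrm{Re}\left(\mathbf{w}_k^{H}\mathbf{A}_{k,k}\mathbf{w}_k\right) \overset{(b)}{=} \hat{\mathbf{w}}_k^{T}\,T(\mathbf{A}_{k,k}\mathbf{w}_k) \overset{(c)}{=} \hat{\mathbf{w}}_k^{T}\hat{\mathbf{A}}_{k,k}\hat{\mathbf{w}}_k \triangleq g(\hat{\mathbf{w}}_k), \tag{B.4}$$
where (a) follows since $\mathbf{w}_k^{H}\mathbf{A}_{k,k}\mathbf{w}_k$ is real-valued, (b) follows by using (B.3) and (c) follows by using (B.2). Since $g(\hat{\mathbf{w}}_k)$ is a convex function of $\hat{\mathbf{w}}_k$, it is lower-bounded by its first-order Taylor expansion around $\hat{\mathbf{w}}_k(t)$, so that we have
$$\mathbf{w}_k^{H}\mathbf{A}_{k,k}\mathbf{w}_k = g(\hat{\mathbf{w}}_k) \tag{B.5}$$
$$\geq g[\hat{\mathbf{w}}_k(t)] + \nabla g[\hat{\mathbf{w}}_k(t)]^{T}[\hat{\mathbf{w}}_k - \hat{\mathbf{w}}_k(t)] \tag{B.6}$$
$$= g[\hat{\mathbf{w}}_k(t)] + 2\hat{\mathbf{w}}_k^{T}(t)\hat{\mathbf{A}}_{k,k}[\hat{\mathbf{w}}_k - \hat{\mathbf{w}}_k(t)] \tag{B.7}$$
$$= \mathbf{w}_k^{H}(t)\mathbf{A}_{k,k}\mathbf{w}_k(t) + 2\mathrm{Re}\left\{\mathbf{w}_k^{H}(t)\mathbf{A}_{k,k}\left(\mathbf{w}_k - \mathbf{w}_k(t)\right)\right\}, \tag{B.8}$$
where (B.7) follows since $\nabla g(\hat{\mathbf{w}}_k(t)) = 2\hat{\mathbf{A}}_{k,k}\hat{\mathbf{w}}_k(t)$, while (B.2) and (B.3) are used in (B.8) similarly to (B.4). Hence, the proof is complete.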
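As a numerical sanity check of the T-transform properties (B.2)-(B.3) and of the lower bound (B.8), the following NumPy sketch may be helpful. It is only an illustration under stated assumptions: the matrix `A` is a randomly generated Hermitian positive semidefinite stand-in for $\mathbf{A}_{k,k}$, the vectors `w` and `w_t` are random stand-ins for $\mathbf{w}_k$ and $\mathbf{w}_k(t)$, and the helper names `t_vec`, `t_mat`, `quad` and `sca_lower_bound` are hypothetical, not taken from the paper.

```python
import numpy as np


def t_vec(x):
    """T-transform of a complex vector, as in (B.1): stack Re and Im parts."""
    return np.concatenate([x.real, x.imag])


def t_mat(X):
    """T-transform of a complex matrix, as in (B.1)."""
    return np.block([[X.real, -X.imag], [X.imag, X.real]])


def quad(A, w):
    """Real part of w^H A w (the form is real-valued when A is Hermitian)."""
    return float(np.real(np.vdot(w, A @ w)))


def sca_lower_bound(A, w, w_t):
    """Right-hand side of (B.8): linearisation of w^H A w around w_t."""
    return quad(A, w_t) + 2.0 * float(np.real(np.vdot(w_t, A @ (w - w_t))))


# Assumed toy data: a random Hermitian PSD matrix and two random vectors.
rng = np.random.default_rng(0)
n = 4
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = G @ G.conj().T  # Hermitian positive semidefinite by construction
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w_t = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# (B.2): the T-transform of A w equals T(A) T(w).
print(np.allclose(t_vec(A @ w), t_mat(A) @ t_vec(w)))

# (B.4): w^H A w equals the real quadratic form in the transformed variables.
print(np.isclose(quad(A, w), t_vec(w) @ t_mat(A) @ t_vec(w)))

# (B.8): the quadratic form is never below its linearisation around w_t.
gap = quad(A, w) - sca_lower_bound(A, w, w_t)
print(gap >= 0.0)
```

For a Hermitian positive semidefinite matrix, the gap between the two sides of (B.8) equals $(\mathbf{w}_k - \mathbf{w}_k(t))^{H}\mathbf{A}_{k,k}(\mathbf{w}_k - \mathbf{w}_k(t))$, which is non-negative and vanishes at $\mathbf{w}_k = \mathbf{w}_k(t)$; this is why the bound is tight at the expansion point.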
APPENDIX C
PROOF OF SLATER'S CONDITION OF PROBLEM $\mathcal{P}_6$

Without loss of generality, we consider Problem $\mathcal{P}_6$ in the first iteration of Algorithm 1, i.e., $t = 1$. As explained in Subsection IV-C, the beamforming obtained from solving Problem $\mathcal{P}_7$ in Section V (denoted as $\mathbf{w}^{\star}$) is set as the initial beamforming in Algorithm 1, i.e., $\mathbf{w}(0) = \mathbf{w}^{\star}$. Hence, $\mathbf{w}^{\star}$ is a feasible solution to Problem $\mathcal{P}_6$. The idea of the proof is to construct a new set of beam-vectors from $\mathbf{w}^{\star}$ such that Constraints C2, C7 and C8 in Problem $\mathcal{P}_6$ hold with strict inequalities [31].

As stated at the end of Section V, the iterative algorithm used for solving Problem $\mathcal{P}_7$ will terminate once the intermediate solutions of $\{\varphi_k\}_{k\in\mathcal{U}}$ are all equal to zero. Hence, the data rates achieved by some UEs with the obtained solution $\mathbf{w}^{\star}$ will be strictly larger than their minimum rate requirements. We assume that UE $k$ is one of those UEs, which satisfies
$$2\mathrm{Re}\left\{\mathbf{w}_k^{H}(0)\mathbf{A}_{k,k}\mathbf{w}_k^{\star}\right\} - \zeta_k(0) > \eta_{k,\min}\left(\mathbf{w}_k^{\star H}\mathbf{E}_{k,k}\mathbf{w}_k^{\star} + \sum_{l\neq k,\,l\in\mathcal{U}}\mathbf{w}_l^{\star H}\mathbf{A}_{l,k}\mathbf{w}_l^{\star} + \sigma_k^{2}\right). \tag{C.1}$$
We then scale UE $k$'s beam-vector by a constant $0 < \sqrt{\chi_k} < 1$ and denote the new beam-vector as $\mathbf{w}_k^{\#} = \sqrt{\chi_k}\,\mathbf{w}_k^{\star}$. One should find a $\chi_k$ that satisfies the following inequality:
$$2\mathrm{Re}\left\{\mathbf{w}_k^{H}(0)\mathbf{A}_{k,k}\mathbf{w}_k^{\#}\right\} - \zeta_k(0) > \eta_{k,\min}\left(\mathbf{w}_k^{\# H}\mathbf{E}_{k,k}\mathbf{w}_k^{\#} + \sum_{l\neq k,\,l\in\mathcal{U}}\mathbf{w}_l^{\star H}\mathbf{A}_{l,k}\mathbf{w}_l^{\star} + \sigma_k^{2}\right). \tag{C.2}$$
By substituting the expressions of $\zeta_k(0)$ and $\mathbf{w}_k^{\#}$ into (C.2), we have
$$\mathbf{w}_k^{\star H}\mathbf{A}_{k,k}\mathbf{w}_k^{\star} > \eta_{k,\min}\left[\frac{\chi_k}{2\sqrt{\chi_k}-1}\,\mathbf{w}_k^{\star H}\mathbf{E}_{k,k}\mathbf{w}_k^{\star} + \frac{1}{2\sqrt{\chi_k}-1}\left(\sum_{l\neq k,\,l\in\mathcal{U}}\mathbf{w}_l^{\star H}\mathbf{A}_{l,k}\mathbf{w}_l^{\star} + \sigma_k^{2}\right)\right]. \tag{C.3}$$
When $\frac{1}{4} < \chi_k < 1$, we have $2\sqrt{\chi_k} - 1 > 0$, so that both $\frac{\chi_k}{2\sqrt{\chi_k}-1}$ and $\frac{1}{2\sqrt{\chi_k}-1}$ are positive and tend to one as $\chi_k \to 1$. Since (C.1) holds with strict inequality, one can always find a $\chi_k$ that is very close to one such that (C.3) is satisfied. By keeping the beam-vectors of all other UEs fixed, we immediately have
$$2\mathrm{Re}\left\{\mathbf{w}_l^{H}(0)\mathbf{A}_{l,l}\mathbf{w}_l^{\star}\right\} - \zeta_l(0) > \eta_{l,\min}\left(\mathbf{w}_l^{\star H}\mathbf{E}_{l,l}\mathbf{w}_l^{\star} + \sum_{j\neq l,k,\,j\in\mathcal{U}}\mathbf{w}_j^{\star H}\mathbf{A}_{j,l}\mathbf{w}_j^{\star} + \chi_k\mathbf{w}_k^{\star H}\mathbf{A}_{k,l}\mathbf{w}_k^{\star} + \sigma_l^{2}\right), \quad \forall l \neq k,\ l\in\mathcal{U}. \tag{C.4}$$
Hence, Constraint C8 in Problem $\mathcal{P}_6$ with the new set of beam-vectors $\{\mathbf{w}_k^{\#}, \mathbf{w}_l^{\star}, \forall l \neq k\}$ holds with strict inequality for all UEs.

The remaining task is to prove that Constraints C2 and C6 hold with strict inequality. Unfortunately, with the new beam-vectors $\{\mathbf{w}_k^{\#}, \mathbf{w}_l^{\star}, \forall l \neq k\}$, we can only guarantee the following strict inequalities corresponding to the RRHs in $\mathcal{I}_k$:
$$\sum_{l\neq k,\,l\in\mathcal{U}_i}\left\|\mathbf{w}_{i,l}^{\star}\right\|^{2} + \chi_k\left\|\mathbf{w}_{i,k}^{\star}\right\|^{2} < P_{i,\max}, \quad i\in\mathcal{I}_k, \tag{C.5}$$
$$\sum_{l\neq k,\,l\in\mathcal{U}_i}\tau_{i,l}(0)\left\|\mathbf{w}_{i,l}^{\star}\right\|^{2} + \tau_{i,k}(0)\,\chi_k\left\|\mathbf{w}_{i,k}^{\star}\right\|^{2} < \tilde{C}_i(0), \quad i\in\mathcal{I}_k. \tag{C.6}$$
To deal with this issue, we randomly select one RRH from $\mathcal{I}\setminus\mathcal{I}_k$, say RRH $i$. Then, we randomly select one UE served by RRH $i$, say UE $l$. We perform the same scaling operation for UE $l$ as for UE $k$, i.e., $\mathbf{w}_l^{\#} = \sqrt{\chi_l}\,\mathbf{w}_l^{\star}$. One can find a $\chi_l$ ($\frac{1}{4} < \chi_l < 1$) such that
$$2\mathrm{Re}\left\{\mathbf{w}_l^{H}(0)\mathbf{A}_{l,l}\mathbf{w}_l^{\#}\right\} - \zeta_l(0) > \eta_{l,\min}\left(\mathbf{w}_l^{\# H}\mathbf{E}_{l,l}\mathbf{w}_l^{\#} + \sum_{j\neq l,k,\,j\in\mathcal{U}}\mathbf{w}_j^{\star H}\mathbf{A}_{j,l}\mathbf{w}_j^{\star} + \chi_k\mathbf{w}_k^{\star H}\mathbf{A}_{k,l}\mathbf{w}_k^{\star} + \sigma_l^{2}\right). \tag{C.7}$$
Obviously, with the new set of beam-vectors $\{\mathbf{w}_k^{\#}, \mathbf{w}_l^{\#}, \mathbf{w}_j^{\star}, \forall j \neq k, j \neq l\}$, Constraint C8 corresponding to the other UEs holds with strict inequality. Then, Constraints C2 and C6 corresponding to the RRHs in $\mathcal{I}_l$ hold with strict inequality. This step is repeated until Constraints C2 and C6 of all the RRHs in $\mathcal{I}$ hold with strict inequality. Then, the final constructed set of beam-vectors remains in the interior of the feasible region of Problem $\mathcal{P}_6$. Hence, according to Page 226 in [31], Slater's condition of Problem $\mathcal{P}_6$ is satisfied. For Problem $\mathcal{P}_6$ in the subsequent iterations of Algorithm 1, a similar proof applies.

APPENDIX D
THE EQUIVALENCE BETWEEN PROBLEM $\mathcal{P}_1$ AND PROBLEM $\mathcal{P}_7$

Denote the optimal solutions of Problem $\mathcal{P}_1$ and Problem $\mathcal{P}_7$ as $\{\mathcal{U}^{\star}, \mathbf{w}^{\star}\}$ and $\{\mathcal{U}^{\#}, \mathbf{w}^{\#}\}$, respectively.
We first prove that the optimal solution of Problem $\mathcal{P}_1$ is feasible for Problem $\mathcal{P}_7$. It is obvious that $\{\mathcal{U}^{\star}, \mathbf{w}^{\star}\}$ satisfies Constraints C2 and C4 of Problem $\mathcal{P}_7$, since Constraint C4 is the equivalent transformation of Constraint C1 in Problem $\mathcal{P}_1$. Now we show that $\{\mathcal{U}^{\star}, \mathbf{w}^{\star}\}$ also satisfies Constraint C5 of Problem $\mathcal{P}_7$. Specifically, we have the following chain of inequalities:
$$\sum_{k\in\mathcal{U}_i^{\star}}\varepsilon\!\left(\left\|\mathbf{w}_{i,k}^{\star}\right\|^{2}\right)R_{k,\min} \leq \sum_{k\in\mathcal{U}_i^{\star}}\varepsilon\!\left(\left\|\mathbf{w}_{i,k}^{\star}\right\|^{2}\right)r_k^{\star} \leq C_{i,\max}, \quad \forall i\in\mathcal{I}, \tag{D.1}$$
where $r_k^{\star}$ is obtained by substituting $\mathbf{w}^{\star}$ into (9). Then, $\{\mathcal{U}^{\star}, \mathbf{w}^{\star}\}$ satisfies Constraint C5 of Problem $\mathcal{P}_7$. Hence, $\{\mathcal{U}^{\star}, \mathbf{w}^{\star}\}$ is feasible for Problem $\mathcal{P}_7$.

For Problem $\mathcal{P}_7$, if $r_k^{\#} = R_{k,\min}, \forall k$, where $r_k^{\#}$ is obtained by substituting $\mathbf{w}^{\#}$ into (9), then we have
$$\sum_{k\in\mathcal{U}_i^{\#}}\varepsilon\!\left(\left\|\mathbf{w}_{i,k}^{\#}\right\|^{2}\right)r_k^{\#} = \sum_{k\in\mathcal{U}_i^{\#}}\varepsilon\!\left(\left\|\mathbf{w}_{i,k}^{\#}\right\|^{2}\right)R_{k,\min} \leq C_{i,\max}, \quad \forall i\in\mathcal{I}, \tag{D.2}$$
which satisfies Constraint C3 of Problem $\mathcal{P}_1$. It is readily verified that $\{\mathcal{U}^{\#}, \mathbf{w}^{\#}\}$ satisfies Constraints C1 and C2 of Problem $\mathcal{P}_1$. Hence, $\{\mathcal{U}^{\#}, \mathbf{w}^{\#}\}$ is also feasible for Problem $\mathcal{P}_1$. On the other hand, if there exists at least one UE whose data rate is strictly larger than its rate requirement, i.e., $r_k^{\#} > R_{k,\min}$, then we can adopt the iterative scaling algorithm given in Appendix A of [23] to construct another set of beamforming vectors $\mathbf{w}^{\#\#}$ such that $r_k^{\#\#} = R_{k,\min}, \forall k$, where $r_k^{\#\#}$ is obtained by substituting $\mathbf{w}^{\#\#}$ into (9). As a result, $\{\mathcal{U}^{\#}, \mathbf{w}^{\#\#}\}$ is feasible for Problem $\mathcal{P}_1$.

Based on the above discussions, we arrive at the conclusion that Problem $\mathcal{P}_1$ and Problem $\mathcal{P}_7$ can achieve the same optimal set of selected UEs.

REFERENCES

[1] J. Andrews, S. Buzzi, W. Choi, S. Hanly, A. Lozano, A. Soong, and J. Zhang, “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[2] M. Peng, Y. Sun, X. Li, Z. Mao, and C.
Wang, “Recent advances in cloud radio access networks: System architectures, key techniques, and open issues,” IEEE Commun. Surveys Tut., vol. 18, no. 3, pp. 2282–2308, third quarter 2016.
[3] Y. Shi, J. Zhang, K. B. Letaief, B. Bai, and W. Chen, “Large-scale convex optimization for ultra-dense cloud-RAN,” IEEE Wireless Commun. Mag., vol. 22, no. 3, pp. 84–91, Jun. 2015.
[4] R. G. Stephen and R. Zhang, “Joint millimeter-wave fronthaul and OFDMA resource allocation in ultra-dense CRAN,” IEEE Trans. Commun., vol. 65, no. 3, pp. 1411–1423, Mar. 2017.
[5] Y. Shi, J. Zhang, and K. Letaief, “Group sparse beamforming for green Cloud-RAN,” IEEE Trans. Wireless Commun., vol. 13, no. 5, pp. 2809–2823, May 2014.
[6] B. Dai and W. Yu, “Energy efficiency of downlink transmission strategies for cloud radio access networks,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 1037–1050, Apr. 2016.
[7] ——, “Sparse beamforming and user-centric clustering for downlink cloud radio access network,” IEEE Access, vol. 2, pp. 1326–1339, Oct. 2014.
[8] V. N. Ha, L. B. Le, and N. D. Dao, “Coordinated multipoint transmission design for Cloud-RANs with limited fronthaul capacity constraints,” IEEE Trans. Veh. Technol., vol. 65, no. 9, pp. 7432–7447, Sep. 2016.
[9] A. Abdelnasser and E. Hossain, “Resource allocation for an OFDMA Cloud-RAN of small cells underlaying a macrocell,” IEEE Trans. Mobile Comput., vol. 15, no. 11, pp. 2837–2850, Nov. 2016.
[10] C. Pan, H. Zhu, N. J. Gomes, and J. Wang, “Joint precoding and RRH selection for user-centric green MIMO C-RAN,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2891–2906, May 2017.
[11] P. Luong, L. N. Tran, C. Despins, and F. Gagnon, “Joint beamforming and remote radio head selection in limited fronthaul C-RAN,” in 2016 IEEE 84th Vehicular Technology Conference (VTC-Fall), Sept 2016, pp. 1–6.
[12] P. Luong, F. Gagnon, C. Despins, and L. N.
Tran, “Optimal joint remote radio head selection and beamforming design for limited fronthaul C-RAN,” IEEE Trans. Signal Process., vol. 65, no. 21, pp. 5605–5620, Nov. 2017.
[13] Y. Shi, J. Zhang, and K. B. Letaief, “CSI overhead reduction with stochastic beamforming for cloud radio access networks,” in 2014 IEEE International Conference on Communications (ICC), 2014, pp. 5154–5159.
[14] T. R. Lakshmana, A. Tolli, R. Devassy, and T. Svensson, “Precoder design with incomplete feedback for joint transmission,” IEEE Trans. Wireless Commun., vol. 15, no. 3, pp. 1923–1936, Mar. 2016.
[15] C. Fan, Y. J. Zhang, and X. Yuan, “Dynamic nested clustering for parallel PHY-layer processing in Cloud-RANs,” IEEE Trans. Wireless Commun., vol. 15, no. 3, pp. 1881–1894, Mar. 2016.
[16] C. Pan, H. Zhu, N. J. Gomes, and J. Wang, “Joint user selection and energy minimization for ultra-dense multi-channel C-RAN with incomplete CSI,” IEEE J. Sel. Areas Commun., vol. 35, no. 8, pp. 1809–1824, Aug. 2017.
[17] Z. Bai, “Evolved universal terrestrial radio access (E-UTRA); physical layer procedures,” 3GPP, Sophia Antipolis, Technical Specification 36.213 v. 11.4.0, 2013.
[18] T. X. Tran, A. Hajisami, and D. Pompili, “QuaRo: A queue-aware robust coordinated transmission strategy for downlink C-RANs,” in 2016 13th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), June 2016, pp. 1–9.
[19] V. K. N. Lau, F. Zhang, and Y. Cui, “Low complexity delay-constrained beamforming for multi-user MIMO systems with imperfect CSIT,” IEEE Trans. Signal Process., vol. 61, no. 16, pp. 4090–4099, Aug. 2013.
[20] Z. Chen, X. Hou, and C. Yang, “Training resource allocation for user-centric base station cooperation networks,” IEEE Trans. Veh. Technol., vol. 65, no. 4, pp. 2729–2735, Apr. 2016.
[21] J. Zhang, X. Yuan, and Y. J.
Zhang, “Locally orthogonal training design for Cloud-RANs based on graph coloring,” IEEE Trans. Wireless Commun., vol. 16, no. 10, pp. 6426–6437, Oct. 2017.
[22] T. M. Nguyen and L. B. Le, “Joint pilot assignment and resource allocation in multicell massive MIMO network: Throughput and energy efficiency maximization,” in 2015 IEEE Wireless Communications and Networking Conference (WCNC), March 2015, pp. 393–398.
[23] C. Pan, H. Mehrpouyan, Y. Liu, M. Elkashlan, and N. Arumugam, “Joint pilot allocation and robust transmission design for ultra-dense user-centric TDD C-RAN with imperfect CSI,” IEEE Trans. Wireless Commun., vol. 17, no. 3, pp. 2038–2053, Mar. 2018.
[24] D. Su, X. Hou, and C. Yang, “Quantization based on per-cell codebook in cooperative multi-cell systems,” in 2011 IEEE Wireless Communications and Networking Conference, March 2011, pp. 1753–1758.
[25] F. Yuan and C. Yang, “Bit allocation between per-cell codebook and phase ambiguity quantization for limited feedback coordinated multi-point transmission systems,” IEEE Trans. Commun., vol. 60, no. 9, pp. 2546–2559, Sep. 2012.
[26] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transactions on Telecommunications, vol. 10, no. 6, pp. 585–595, 1999.
[27] T. Kailath, A. H. Sayed, and B. Hassibi, Linear estimation. Prentice Hall Upper Saddle River, NJ, 2000, vol. 1.
[28] N. Jindal, “MIMO broadcast channels with finite-rate feedback,” IEEE Trans. Inf. Theory, vol. 52, no. 11, pp. 5045–5060, Nov. 2006.
[29] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath, “Pilot contamination and precoding in multi-cell TDD systems,” IEEE Trans. Wireless Commun., vol. 10, no. 8, pp. 2640–2651, Aug. 2011.
[30] T. V. Chien, E. Bjornson, and E. G. Larsson, “Joint power allocation and user association optimization for massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 15, no. 9, pp. 6384–6399, Sep. 2016.
[31] S. Boyd and L.
Vandenberghe, Convex optimization. Cambridge University Press, 2004.
[32] Q. T. Dinh and M. Diehl, “Local convergence of sequential convex programming for nonconvex optimization,” in Recent Advances in Optimization and its Applications in Engineering. Springer, 2010, pp. 93–102.
[33] C. Pan, W. Xu, W. Zhang, J. Wang, H. Ren, and M. Chen, “Weighted sum energy efficiency maximization in ad hoc networks,” IEEE Wireless Commun. Lett., vol. 4, no. 3, pp. 233–236, Jun. 2015.
[34] A. Ben-Tal and A. Nemirovski, Lectures on modern convex optimization: analysis, algorithms, and engineering applications. SIAM, 2001.
[35] E. Matskani, N. D. Sidiropoulos, Z.-Q. Luo, and L. Tassiulas, “Convex approximation techniques for joint multiuser downlink beamforming and admission control,” IEEE Trans. Wireless Commun., vol. 7, no. 7, pp. 2682–2693, Jul. 2008.
[36] X. Ge, S. Tu, G. Mao, C. X. Wang, and T. Han, “5G ultra-dense cellular networks,” IEEE Wireless Commun., vol. 23, no. 1, pp. 72–79, Feb. 2016.
[37] E. U. T. R. Access, “Further advancements for E-UTRA physical layer aspects,” 3GPP TR 36.814, Tech. Rep., 2010.
[38] C. K. Au-Yeung and D. J. Love, “On the performance of random vector quantization limited feedback beamforming in a MISO system,” IEEE Trans. Wireless Commun., vol. 6, no. 2, pp. 458–462, Feb. 2007.
[39] C. Zhang, W. Xu, and M. Chen, “Robust MMSE beamforming for multiuser MISO systems with limited feedback,” IEEE Signal Process. Lett., vol. 16, no. 7, pp. 588–591, Jul. 2009.
[40] A. K. Gupta and S. Nadarajah, Handbook of beta distribution and its applications. CRC Press, 2004.
