Implicit Neural Representation for Multiuser Continuous Aperture Array Beamforming


Authors: Shiyong Chen, Jia Guo, Shengqian Han

Abstract—Implicit neural representations (INRs) can parameterize continuous beamforming functions in continuous aperture arrays (CAPAs) and thus enable efficient online inference. Existing INR-based beamforming methods for CAPAs, however, typically suffer from high training complexity and limited generalizability. To address these issues, we first derive a closed-form expression for the achievable sum rate in multiuser multi-CAPA systems where both the base station (BS) and the users are equipped with CAPAs. For sum-rate maximization, we then develop a functional weighted minimum mean-squared error (WMMSE) algorithm by using orthonormal basis expansion to convert the functional optimization into an equivalent parameter optimization problem. Based on this functional WMMSE algorithm, we further propose BeamINR, an INR-based beamforming method implemented with a graph neural network (GNN) to exploit the permutation-equivariant structure of the optimal beamforming policy; its update equation is designed from the structure of the functional WMMSE iterations. Simulation results show that the functional WMMSE algorithm achieves the highest sum rate at the cost of high online complexity. Compared with baseline INRs, BeamINR substantially reduces inference latency, lowers training complexity, and generalizes better across the number of users and carrier frequency.

Index Terms—Continuous aperture array (CAPA), beamforming, WMMSE, implicit neural representation.

I. INTRODUCTION

Massive multi-input-multi-output (MIMO) is a key technology for improving spectral and energy efficiency [1]. However, traditional MIMO systems based on spatially discrete arrays (SPDAs) face two major scalability limitations.
First, each antenna requires dedicated hardware, such as a radio-frequency port in fully digital arrays, which leads to high fabrication cost and energy consumption. Second, maintaining half-wavelength spacing between antennas results in large array sizes and creates implementation challenges [2]. To address these limitations, alternative array architectures such as holographic MIMO (HMIMO) [3]–[5] and reconfigurable intelligent surfaces [6], [7] use low-cost radiating elements on compact metasurfaces and allow denser element deployment.

A continuous aperture array (CAPA) can be viewed as the limiting case of an HMIMO system with infinitely many radiating elements and coupled circuits. Under this model, beamforming design shifts from finite-dimensional beamforming vectors to continuous current distributions over the aperture. The resulting optimization problems are therefore nonconvex functional problems that are difficult to solve with conventional methods [8].

Shiyong Chen is with the School of Electronics and Information Engineering, Beihang University, Beijing 100191, China (email: shiyongchen@buaa.edu.cn). Jia Guo is with the School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, U.K. (email: jia.guo@qmul.ac.uk). Shengqian Han is with the School of Electronics and Information Engineering, Beihang University, Beijing 100191, China (email: sqhan@buaa.edu.cn).

Developing efficient beamforming algorithms is therefore essential to fully exploit the potential of CAPAs. Existing numerical methods for CAPA beamforming can be broadly divided into approximation-based methods and direct functional methods. Approximation-based methods represent channel and beamforming functions with finite orthogonal Fourier bases and then solve an equivalent finite-dimensional problem [9]–[11]. Although convenient, this strategy introduces approximation loss.
Moreover, the number of required Fourier bases grows with the carrier frequency and aperture size, which increases the problem dimension and online complexity in large-scale CAPA systems [12]–[14]. Direct functional methods based on the calculus of variations avoid approximation loss [15], [16], but they still require iterative functional updates and repeated integral evaluations, which limits their real-time applicability.

A. Related Works

1) Learning Beamforming Policy for CAPA: Recent studies have explored deep neural networks (DNNs) to reduce the computational cost of CAPA beamforming. The main challenge is that CAPA beamforming is inherently functional: both the channel response and the beamforming variables are continuous functions, which leads to an infinite-dimensional formulation, whereas standard DNN mappings are finite-dimensional. In the multiuser single-CAPA downlink, where a CAPA-equipped BS serves multiple single-antenna users, the optimal beamforming function admits a weighted-sum representation of channel-response functions, which enables indirect learning through a finite-dimensional weight vector [17]. However, this structure does not extend to multi-CAPA systems with CAPA-based receivers, in which the weights themselves become functions. For the single-user multi-CAPA system, where both the BS and the user are equipped with CAPAs, an implicit neural representation (INR) was employed to directly parameterize the beamforming function [18]. However, that INR is implemented with a fully connected neural network (FNN), which leads to low learning efficiency, high training complexity, and limited generalizability with respect to the number of users.

2) Methods to Improve Learning Efficiency: Improving DNN learning efficiency requires both reducing training complexity and enhancing generalization.
One effective strategy is to exploit mathematical properties of the target policy, such as permutation equivariance (PE), to constrain the hypothesis space. This idea has been used to design graph neural networks (GNNs) for power allocation [19]–[21], user scheduling [22], [23], and beamforming [24]. Prior work shows that exploiting PE can substantially improve learning efficiency. For example, the GNNs in [20]–[23] reduce training complexity and, in some cases, improve generalization by incorporating PE into the policy design. Similarly, [19] shows that GNNs can be more learning-efficient than conventional DNNs: although they may require longer training in some cases, they generalize better across problem scales and therefore reduce the need for retraining. However, the benefits of PE are not universal. In [22], [23], although PE-aware GNNs reduce training complexity, they still exhibit poor generalization performance.

Another effective way to improve learning efficiency is to incorporate mathematical models into model-driven DNNs [5], [25]–[27], which simplifies the mapping that the network needs to learn. Existing methods mainly fall into three categories: deep unfolding [28]–[30], designing the DNN output based on the structure of the optimal solution [25], [31], [32], and using mathematical models to guide the design of each DNN layer [5], [26], [27], [33]. Deep unfolding learns specific operations or parameters within iterative algorithms, or introduces additional learnable parameters into each iteration. For example, [28], [29] unfold the WMMSE algorithm for beamforming optimization, thereby reducing the complexity of matrix inversion, while [30] unfolds projected gradient descent by learning step sizes for hybrid beamforming. Although deep unfolding networks can converge faster than conventional iterative algorithms, they inherit the numerical computations of the original algorithms, resulting in high training and inference complexity.
Another line of work exploits the structure of the optimal policy to simplify learning. In [25], [31], [32], beamforming learning is reduced to power allocation learning. This improves learning performance and reduces training complexity, but remains problem-specific because it relies on the structure of the optimal solution. Mathematical models can also directly guide DNN design. For instance, [26] develops a GNN based on the iterative Taylor expansion of the matrix pseudo-inverse, improving both learning and generalization performance. Similarly, [5], [27], [33] design GNN update equations according to the iterative equations of gradient-based beamforming optimization algorithms. Although these model-driven GNNs show strong generalization in SPDA systems, they are tailored to specific problem settings and are not directly applicable to CAPA systems.

B. Motivation and Contributions

Existing INR-based methods for CAPA beamforming typically suffer from high training complexity and limited generalization. To address these limitations, we aim to design an INR framework for CAPA beamforming with higher learning efficiency. Specifically, we first develop a functional WMMSE algorithm for sum-rate maximization, and then design BeamINR based on this functional WMMSE algorithm. BeamINR is implemented as a GNN to exploit the PE property of the optimal beamforming policy, while its update rule is designed by leveraging the iterative structure of the functional WMMSE algorithm.

Fig. 1. Illustration of the downlink CAPA system.

The main contributions are summarized as follows.

• We consider a multiuser multi-CAPA system and, to the best of our knowledge, derive the first explicit closed-form expression for its achievable sum rate.

• Based on this formulation, we develop a functional WMMSE algorithm for multiuser multi-CAPA beamforming.
Specifically, we first transform the sum-rate maximization problem into an equivalent sum-MSE minimization problem, and then use orthonormal basis expansion to convert it into a parameter optimization problem. By deriving the first-order optimality conditions of this parameter optimization problem and mapping them back to the functional domain, we obtain the update equations of the proposed functional WMMSE algorithm.

• Building on the functional WMMSE algorithm, we design a new INR, termed BeamINR, to parameterize the continuous beamforming function. Specifically, BeamINR is implemented as a GNN to exploit PE, and its update rule is designed using the iterative structure of the functional WMMSE algorithm to aggregate and combine the channel response functions.

• Simulation results validate the proposed methods. The functional WMMSE algorithm achieves the highest sum rate but incurs high online inference complexity. In contrast, BeamINR achieves substantially lower inference latency and generalizes better than baseline INRs across different numbers of users.

Notations: $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^*$, and $\|\cdot\|$ denote the transpose, Hermitian transpose, conjugate, and Frobenius norm of a matrix, respectively. $|\cdot|$ denotes the magnitude of a complex value, and $I_z$ is the identity matrix of size $z \times z$.

II. SYSTEM MODEL

Consider a downlink multiuser multi-CAPA system in which a BS equipped with a CAPA serves $K$ users, each also equipped with a CAPA, as illustrated in Fig. 1. In a Cartesian coordinate system, the BS CAPA is modeled as a rectangular aperture lying on the $xy$-plane and centered at the origin, with side lengths $L_B^x$ and $L_B^y$ along the $x$- and $y$-axes, respectively. A point on the BS aperture is denoted by $s = [s_x, s_y, 0]^T$.

¹Portions of this work, specifically the derivation of the functional WMMSE algorithm, were reported in our conference paper [34].
The present journal manuscript substantially extends [34] in three aspects: it derives a closed-form expression for the achievable sum rate in multiuser multi-CAPA systems; it designs BeamINR to learn the continuous beamforming function by exploiting the PE property of the optimal beamforming policy and the iterative structure of the functional WMMSE algorithm; and it provides additional simulation results for both the numerical optimization and the INR-based approaches.

The set of all points on the BS aperture is denoted by $\mathcal{S}_B$. The $k$-th user CAPA is centered at location $r_{k,o}$ and has side lengths $L_U^x$ and $L_U^y$. Its orientation is specified by the rotation angles $\omega_x^k$, $\omega_y^k$, and $\omega_z^k$ about the $x$-, $y$-, and $z$-axes, respectively, with corresponding rotation matrices $R_x(\omega_x^k)$, $R_y(\omega_y^k)$, and $R_z(\omega_z^k)$. To describe points on the $k$-th user CAPA, we introduce a local coordinate system whose origin is $r_{k,o}$ and whose $xy$-plane coincides with the user CAPA. In this local coordinate system, a point on the $k$-th user CAPA is denoted by $\bar{r}^k = [\bar{r}_x^k, \bar{r}_y^k, 0]^T$, and the set of all such points is denoted by $\bar{\mathcal{S}}_U^k$. Each point can be mapped to the global coordinate system as

$r = R_x(\omega_x^k) R_y(\omega_y^k) R_z(\omega_z^k) \bar{r}^k + r_{k,o}, \quad \bar{r}^k \in \bar{\mathcal{S}}_U^k$,  (1)

where $r$ denotes the corresponding point in the global coordinate system and $\mathcal{S}_U^k$ denotes the set of all such points.

The BS transmits $d$ data streams to each user. Let $v_k(s) = [v_{k1}(s), \ldots, v_{kd}(s)] \in \mathbb{C}^{1 \times d}$ denote the beamforming function on the BS CAPA for user $k$, and let $x_k = [x_{k1}, \ldots, x_{kd}] \in \mathbb{C}^{1 \times d}$ denote the corresponding data symbols, where $v_{ki}(s)$ conveys symbol $x_{ki}$ with $\mathbb{E}\{|x_{ki}|^2\} = 1$.
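As a concrete illustration of the local-to-global mapping in (1), the following minimal sketch assumes the standard right-handed rotation matrices about each axis (the paper does not write these out explicitly, so their exact convention is an assumption):

```python
import numpy as np

# Standard right-handed rotation matrices about the x-, y-, and z-axes
# (assumed convention; the paper does not spell these out).
def Rx(w):
    c, s = np.cos(w), np.sin(w)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(w):
    c, s = np.cos(w), np.sin(w)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(w):
    c, s = np.cos(w), np.sin(w)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def local_to_global(r_bar, wx, wy, wz, r_o):
    """Map a local point on a user CAPA to global coordinates, as in (1)."""
    return Rx(wx) @ Ry(wy) @ Rz(wz) @ r_bar + r_o

r_bar = np.array([0.05, -0.02, 0.0])  # local point on the user aperture (illustrative)
r_o = np.array([1.0, 2.0, 3.0])       # aperture center r_{k,o} (illustrative)
r = local_to_global(r_bar, 0.1, 0.2, 0.3, r_o)
```

Since the rotations are orthogonal, the mapped point keeps its distance to the aperture center, which gives a quick sanity check on any implementation of (1).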
The received signal at point $r$ on the $k$-th user CAPA is given by

$y_k(r) = \sum_{i=1}^{K} \int_{\mathcal{S}_B} h_k(r, s) v_i(s) x_i^T \, \mathrm{d}s + n_k(r), \quad r \in \mathcal{S}_U^k$,  (2)

where $h_k(r, s)$ denotes the continuous channel kernel² from point $s$ on the BS CAPA to point $r$ on the $k$-th user CAPA, and $n_k(r) \sim \mathcal{CN}(0, \sigma_n^2)$ is additive white Gaussian noise with variance $\sigma_n^2$.

²In functional analysis, a two-variable function used in an integral operator, such as $h_k(r, s)$ in (2), is referred to as the kernel of that operator.

As widely assumed in the literature [16], [17], we consider uni-polarized CAPAs under line-of-sight conditions, where both the BS and the user CAPAs are polarized along the $y$-axis. The channel response function $h_k(r, s)$ is then expressed as

$h_k(r, s) = \Lambda_R^{k,T} \dfrac{-j \eta \, e^{-j \frac{2\pi}{\lambda} \|r - s\|}}{2 \lambda \|r - s\|} \left( I_3 - \dfrac{(r - s)(r - s)^T}{\|r - s\|^2} \right) \Lambda_T$,  (3)

where $r \in \mathcal{S}_U^k$, $s \in \mathcal{S}_B$, $\Lambda_T = [0, 1, 0]^T$ and $\Lambda_R^k = R_x(\omega_x^k) R_y(\omega_y^k) R_z(\omega_z^k) \Lambda_T$ are unit polarization vectors, $\eta$ is the intrinsic impedance, and $\lambda$ is the signal wavelength. To focus on beamforming optimization, we assume perfect channel state information at the BS. To obtain the continuous channel, prior work has explored parametric estimation methods [35], [36], which estimate a finite set of channel parameters and then reconstruct the channel response functions.

III. PROBLEM FORMULATION

We aim to optimize the beamforming functions to maximize the system sum rate. Prior work has derived closed-form achievable-rate expressions for the multiuser single-CAPA system [11], [13] and the single-user multi-CAPA system [16], where only inter-user interference or only intra-user interference is present. In contrast, the considered multiuser multi-CAPA system involves both intra-user and inter-user interference. We therefore first derive a closed-form expression for its achievable sum rate and then formulate the corresponding
optimization problem. To this end, we begin by defining the inverse of a continuous kernel.

Definition 1 (Inverse of a Continuous Kernel). For a continuous kernel $G(r, s)$, a kernel $G^{-1}(z, r)$ is defined as the inverse of $G(r, s)$ if it satisfies [13]

$\int_{\mathcal{S}} G^{-1}(z, r) G(r, s) \, \mathrm{d}r = \delta(z - s)$,  (4)

for all $r, s, z \in \mathcal{S}$, where $\delta(\cdot)$ is the Dirac delta function.

Proposition 1. The achievable sum rate of a multiuser multi-CAPA system is given by

$R = \sum_{k=1}^{K} \log\det\left( I_d + Q_k \right)$,  (5)

where

$Q_k = \iint_{\mathcal{S}_U} a_{kk}^H(r_1) J_{\bar{k}}^{-1}(r_1, r_2) a_{kk}(r_2) \, \mathrm{d}r_2 \, \mathrm{d}r_1$,  (6a)

$J_{\bar{k}}(r_1, r_2) = \sum_{j=1, j \neq k}^{K} a_{kj}(r_1) a_{kj}^H(r_2) + \sigma_n^2 \delta(r_1 - r_2)$,  (6b)

$a_{kj}(r) = \int_{\mathcal{S}_B} h_k(r, s) v_j(s) \, \mathrm{d}s$.  (6c)

Here, $\mathcal{S}_U = \bigcup_{k=1}^{K} \mathcal{S}_U^k$ and $h_k(r, s)$ is zero for $r \notin \mathcal{S}_U^k$.

Proof: See Appendix A.

With Proposition 1, the sum-rate maximization problem is formulated as

$\max_{v_k(s)} \ \sum_{k=1}^{K} \log\det\left( I_d + Q_k \right)$,  (7a)

$\text{s.t.} \ \sum_{k=1}^{K} \int_{\mathcal{S}_B} \|v_k(s)\|^2 \, \mathrm{d}s \leq C_{\max}$,  (7b)

$(6a), (6b), (6c)$,

where $C_{\max}$ denotes the maximum current budget at the BS.

IV. FUNCTIONAL WMMSE ALGORITHM

In this section, we develop a functional WMMSE algorithm for beamforming design. Instead of solving (7) directly, we first reformulate it as an equivalent sum-MSE minimization problem and then represent the continuous functions in the equivalent problem with complete orthonormal basis expansions. These reformulations convert the original functional sum-rate maximization problem into an equivalent optimization over coefficient matrices, which facilitates the derivation of the functional WMMSE algorithm.

A. Equivalent MSE Minimization Problem

We first reformulate problem (7) as an equivalent sum-MSE minimization problem.
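Before working with the reformulation, it helps to fix ideas on how the closed-form sum rate (5)–(6) behaves numerically. Under a spatial discretization, integrals become weighted sums, the Dirac delta in (6b) becomes $I/\mathrm{d}A$, and the kernel inverse of Definition 1 becomes a matrix inverse. The sketch below uses random stand-ins for $h_k$ and $v_k$; all sizes and constants are illustrative assumptions, not the paper's simulation setup:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 2, 2              # users, data streams per user (illustrative)
N_B, N_U = 64, 16        # sample points on the BS / each user aperture
dA_B, dA_U = 1e-4, 1e-4  # area elements of the assumed discretization
sigma2 = 1e-3            # noise power

# Random stand-ins for the sampled channel kernels h_k(r, s) and beamformers v_k(s).
h = [rng.standard_normal((N_U, N_B)) + 1j * rng.standard_normal((N_U, N_B)) for _ in range(K)]
v = [rng.standard_normal((N_B, d)) + 1j * rng.standard_normal((N_B, d)) for _ in range(K)]

def sum_rate(h, v):
    """Discretized evaluation of (5)-(6): integrals -> weighted sums,
    delta(r1 - r2) -> I/dA_U, kernel inverse -> matrix inverse."""
    rate = 0.0
    for k in range(K):
        # a_kj(r) = integral of h_k(r,s) v_j(s) ds, eq. (6c)
        a = [dA_B * h[k] @ v[j] for j in range(K)]
        # J_kbar(r1, r2), eq. (6b), restricted to user k's aperture points
        J = sum(a[j] @ a[j].conj().T for j in range(K) if j != k) \
            + (sigma2 / dA_U) * np.eye(N_U)
        # Q_k, eq. (6a); the dA_U^2 quadrature factors cancel the 1/dA_U^2
        # in the discretized kernel inverse
        Q = a[k].conj().T @ np.linalg.inv(J) @ a[k]
        rate += np.linalg.slogdet(np.eye(d) + Q)[1]
    return rate

R = sum_rate(h, v)
```

One useful sanity check: scaling all beamformers down (less transmit current) must reduce the rate, since the effective noise term in $J_{\bar{k}}$ then dominates.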
Specifically, by applying the combining function $u_k(r) \in \mathbb{C}^{1 \times d}$ to the received signal, the $k$-th user estimates its transmitted signal as

$\hat{x}_k = \int_{\mathcal{S}_U^k} u_k^H(r) y_k(r) \, \mathrm{d}r$.  (8)

With (8), the MSE matrix $E_k$ can be derived as

$E_k = \mathbb{E}_{x,n}\left\{ (\hat{x}_k - x_k)(\hat{x}_k - x_k)^H \right\} \triangleq I_d - B_{kk} - B_{kk}^H + \sum_{j=1}^{K} B_{kj} B_{kj}^H + \sigma_n^2 \int_{\mathcal{S}_U^k} u_k^H(r) u_k(r) \, \mathrm{d}r$,  (9)

where

$B_{kj} = \int_{\mathcal{S}_U^k} u_k^H(r) a_{kj}(r) \, \mathrm{d}r$.  (10)

Proposition 2. Problem (7) is equivalent to the following problem, in the sense that the globally optimal solution $v_k(s)$ is identical for both problems:

$\min_{\{W_k, u_k(r), v_k(s)\}} \ \sum_{k=1}^{K} \mathrm{Tr}(W_k E_k) - \log\det(W_k)$  (11a)

$\text{s.t.} \ (9), (10), (6c), (7b)$,

where $W_k \succeq 0$ is the weight matrix of user $k$.

Proof: See Appendix B.

To establish the equivalence between problem (7) and the MSE-based formulation in Proposition 2, we need to handle inverse kernels consisting of a continuous kernel plus an outer-product term. The following lemma provides a functional Woodbury identity for this purpose and is used in the proof of Proposition 2.

Lemma 1 (Functional Woodbury Identity). Let $J(r_1, r_2)$ be a continuous invertible kernel, $a(r_1) \in \mathbb{C}^{1 \times d}$ and $b(r_1) \in \mathbb{C}^{d \times 1}$ for $r_1, r_2 \in \mathcal{S}$. If the term $J(r_1, r_2) + a(r_1) b(r_2)$ is invertible, then

$\left( J(r_1, r_2) + a(r_1) b(r_2) \right)^{-1} = J^{-1}(r_1, r_2) - \psi(r_1) (I_d + G)^{-1} \phi(r_2)$,  (12)

where $G = \iint_{\mathcal{S}} b(r_1) J^{-1}(r_1, r_2) a(r_2) \, \mathrm{d}r_1 \, \mathrm{d}r_2$ and

$\psi(r_1) = \int_{\mathcal{S}} J^{-1}(r_1, r_2) a(r_2) \, \mathrm{d}r_2, \quad \phi(r_2) = \int_{\mathcal{S}} b(r_1) J^{-1}(r_1, r_2) \, \mathrm{d}r_1$.  (13)

Proof: See Appendix C.

For the beamforming functions $v_k(s), \forall k$, let $\beta(s) = [\beta_1(s), \ldots, \beta_{N_s}(s)] \in \mathbb{C}^{1 \times N_s}$ denote orthonormal basis functions with $N_s \to \infty$, which satisfy the orthonormality condition

$\int_{\mathcal{S}_B} \beta^H(s) \beta(s) \, \mathrm{d}s = I_{N_s}$.  (14)
Then, each beamforming function $v_k(s)$ can be expressed as a linear combination of these basis functions [37], i.e.,

$v_k(s) = \beta(s) V_k$,  (15)

where $V_k \in \mathbb{C}^{N_s \times d}$ denotes the coefficient matrix, obtained through the projection

$V_k = \int_{\mathcal{S}_B} \beta^H(s) v_k(s) \, \mathrm{d}s$.  (16)

Similarly, for $u_k(r)$, consider another set of orthonormal basis functions denoted by $\alpha_k(r) = [\alpha_{k1}(r), \ldots, \alpha_{kN_r}(r)] \in \mathbb{C}^{1 \times N_r}$ with $N_r \to \infty$, which satisfy

$\int_{\mathcal{S}_U^k} \alpha_k^H(r) \alpha_k(r) \, \mathrm{d}r = I_{N_r}$.  (17)

Accordingly, $u_k(r)$ can be represented as

$u_k(r) = \alpha_k(r) U_k$,  (18)

where $U_k \in \mathbb{C}^{N_r \times d}$ is the corresponding coefficient matrix, obtained as

$U_k = \int_{\mathcal{S}_U^k} \alpha_k^H(r) u_k(r) \, \mathrm{d}r$.  (19)

Since $\beta(s)$ and $\alpha_k(r)$ form complete orthonormal sets over $\mathcal{S}_B$ and $\mathcal{S}_U^k$, respectively, the continuous channel kernel $h_k(r, s)$ can be represented as [16]

$h_k(r, s) = \alpha_k(r) H_k \beta^H(s)$,  (20)

where $H_k \in \mathbb{C}^{N_r \times N_s}$ is the channel coefficient matrix, obtained as

$H_k = \int_{\mathcal{S}_U^k} \int_{\mathcal{S}_B} \alpha_k^H(r) h_k(r, s) \beta(s) \, \mathrm{d}s \, \mathrm{d}r$.  (21)

Substituting (15), (18), and (20) into problem (11), we obtain

$\min_{W_k, U_k, V_k} \ \sum_{k=1}^{K} \left( \mathrm{Tr}(W_k E_k) - \log\det(W_k) \right)$  (22a)

$\text{s.t.} \ E_k = I_d - B_{kk} - B_{kk}^H + \sum_{j=1}^{K} B_{kj} B_{kj}^H + \sigma_n^2 U_k^H U_k$,  (22b)

$B_{kj} = U_k^H H_k V_j$,  (22c)

$\sum_{k=1}^{K} \mathrm{Tr}\left( V_k^H V_k \right) \leq C_{\max}$.  (22d)

Problem (22) optimizes the coefficient matrices $V_k$ and $U_k$ together with the weight matrices $W_k$. Although the problem can be solved by the conventional WMMSE algorithm, the computational complexity becomes prohibitive as $N_s \to \infty$ and $N_r \to \infty$. Therefore, instead of directly optimizing $V_k$ and $U_k$, the next subsection derives their optimality conditions and uses them to construct the corresponding optimal functions $v_k(s)$ and $u_k(r)$.

B. Derivation of the Functional WMMSE Algorithm

This subsection derives the update equations for $u_k(r)$, $W_k$, and $v_k(s)$.
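The basis-expansion machinery of (14)–(21) used in these derivations has a simple finite-dimensional analog: with columns that are discretely orthonormal (so that $B^H B = I$, mirroring (14)), projecting a function that lies in the span of the basis recovers its coefficient matrix exactly. The sketch below uses random orthonormalized columns purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N, Ns, d = 200, 32, 2  # grid points, truncated basis size, streams (illustrative)

# Orthonormal "basis functions" sampled on a grid: the columns of B satisfy
# B^H B = I, the discrete analog of the orthonormality condition (14).
B = np.linalg.qr(rng.standard_normal((N, Ns)) + 1j * rng.standard_normal((N, Ns)))[0]

# A beamforming function in the span of the basis: v(s) = beta(s) V, eq. (15)
V = rng.standard_normal((Ns, d)) + 1j * rng.standard_normal((Ns, d))
v = B @ V

# The projection of (16), V = integral of beta^H(s) v(s) ds, recovers V exactly
V_hat = B.conj().T @ v
```

With $N_s \to \infty$ the expansion is lossless for any square-integrable function; with a finite truncation, the projection returns the best coefficients in the least-squares sense.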
1) Update of $u_k(r)$: From problem (22), the first-order optimality condition for $U_k$ is

$\left( \sum_{j=1}^{K} H_k V_j V_j^H H_k^H + \sigma_n^2 I_{N_r} \right) U_k - H_k V_k = 0$.  (23)

Multiplying both sides of (23) by $\alpha_k(r_1)$ yields

$\alpha_k(r_1) H_k V_k = \alpha_k(r_1) \left( \sum_{j=1}^{K} H_k V_j V_j^H H_k^H + \sigma_n^2 I_{N_r} \right) U_k$.  (24)

Using the orthonormality condition in (14), the term $\alpha_k(r_1) H_k V_j$ can be rewritten as

$\alpha_k(r_1) H_k V_j = \int_{\mathcal{S}_B} \underbrace{\alpha_k(r_1) H_k \beta^H(s)}_{h_k(r_1, s)} \underbrace{\beta(s) V_j}_{v_j(s)} \, \mathrm{d}s = \int_{\mathcal{S}_B} h_k(r_1, s) v_j(s) \, \mathrm{d}s \triangleq a_{kj}(r_1)$,  (25)

where the second equality follows from (20) and (15). Substituting (25) into (24) gives

$a_{kk}(r_1) = \left( \sum_{j=1}^{K} a_{kj}(r_1) V_j^H H_k^H + \sigma_n^2 \alpha_k(r_1) \right) I_{N_r} U_k$.  (26)

With (17), the right-hand side of (26) can be further expressed as

$\int_{\mathcal{S}_U^k} \Big( \sum_{j=1}^{K} a_{kj}(r_1) \underbrace{V_j^H H_k^H \alpha_k^H(r)}_{a_{kj}^H(r)} + \sigma_n^2 \alpha_k(r_1) \alpha_k^H(r) \Big) \alpha_k(r) U_k \, \mathrm{d}r = \int_{\mathcal{S}_U^k} \Big( \sum_{j=1}^{K} a_{kj}(r_1) a_{kj}^H(r) + \sigma_n^2 \delta(r_1 - r) \Big) u_k(r) \, \mathrm{d}r$,  (27)

where $\alpha_k(r_1) \alpha_k^H(r) = \delta(r_1 - r)$ holds due to the orthonormality of $\alpha_k(r)$, and $\alpha_k(r) U_k$ equals $u_k(r)$ by (18). Using (27) to replace the right-hand side of (26), the functional optimality condition for $u_k(r)$ is

$a_{kk}(r_1) = \int_{\mathcal{S}_U^k} \Big( \sum_{j=1}^{K} a_{kj}(r_1) a_{kj}^H(r) + \sigma_n^2 \delta(r_1 - r) \Big) u_k(r) \, \mathrm{d}r$.  (28)

Defining $J_k(r_1, r) \triangleq \sum_{j=1}^{K} a_{kj}(r_1) a_{kj}^H(r) + \sigma_n^2 \delta(r_1 - r)$, (28) can be rewritten as $a_{kk}(r_1) = \int_{\mathcal{S}_U^k} J_k(r_1, r) u_k(r) \, \mathrm{d}r$, from which $u_k(r)$ is obtained as

$u_k(r) = \int_{\mathcal{S}_U^k} J_k^{-1}(r, r_1) a_{kk}(r_1) \, \mathrm{d}r_1$,  (29)

where $J_k^{-1}(r, r_1)$ denotes the inverse of $J_k(r_1, r_2)$, as defined in Definition 1.

2) Update of $W_k$: From problem (22), the optimal $W_k$ is

$W_k = \left( I_d - U_k^H H_k V_k \right)^{-1}$.  (30)
With (17), (30) can be rewritten as

$W_k = \left( I_d - \int_{\mathcal{S}_U^k} \underbrace{U_k^H \alpha_k^H(r)}_{u_k^H(r)} \underbrace{\alpha_k(r) H_k V_k}_{a_{kk}(r)} \, \mathrm{d}r \right)^{-1} = \left( I_d - \int_{\mathcal{S}_U^k} u_k^H(r) a_{kk}(r) \, \mathrm{d}r \right)^{-1}$,  (31)

where the second equality follows from (18) and (25).

3) Update of $v_k(s)$: Similar to the derivation of $u_k(r)$, the first-order optimality condition for $V_k$ can be derived as

$\left( \sum_{j=1}^{K} H_j^H U_j W_j U_j^H H_j + \mu I \right) V_k - H_k^H U_k W_k = 0$,  (32)

where $\mu$ is the Lagrange multiplier introduced to satisfy the constraint in (7b), which can be obtained via bisection. Multiplying both sides of (32) by $\beta(s_1)$ yields

$\left( \sum_{j=1}^{K} \beta(s_1) H_j^H U_j W_j U_j^H H_j + \mu \beta(s_1) I \right) V_k = \beta(s_1) H_k^H U_k W_k$.  (33)

Using the orthonormality condition in (17), the term $\beta(s_1) H_k^H U_k$ can be rewritten as

$\beta(s_1) H_k^H U_k = \int_{\mathcal{S}_U^k} \underbrace{\beta(s_1) H_k^H \alpha_k^H(r)}_{h_k^H(r, s_1)} \underbrace{\alpha_k(r) U_k}_{u_k(r)} \, \mathrm{d}r = \int_{\mathcal{S}_U^k} h_k^H(r, s_1) u_k(r) \, \mathrm{d}r \triangleq c_k(s_1)$,  (34)

where the second equality follows from (20) and (18). Substituting (34) into (33) gives

$c_k(s_1) W_k = \left( \sum_{j=1}^{K} c_j(s_1) W_j U_j^H H_j + \mu \beta(s_1) \right) V_k$.  (35)

Applying (14), the right-hand side of (35) can be further written as

$\left( \sum_{j=1}^{K} c_j(s_1) W_j U_j^H H_j + \mu \beta(s_1) \right) I_{N_s} V_k = \int_{\mathcal{S}_B} \Big( \sum_{j=1}^{K} c_j(s_1) W_j \underbrace{U_j^H H_j \beta^H(s)}_{c_j^H(s)} + \mu \beta(s_1) \beta^H(s) \Big) \beta(s) V_k \, \mathrm{d}s$.  (36)

Using (36) to replace the right-hand side of (35), the functional optimality condition for $v_k(s)$ is

$c_k(s_1) W_k = \int_{\mathcal{S}_B} \Big( \sum_{j=1}^{K} c_j(s_1) W_j c_j^H(s) + \mu \delta(s_1 - s) \Big) v_k(s) \, \mathrm{d}s$,  (37)

where $\beta(s_1) \beta^H(s) = \delta(s_1 - s)$ holds due to the orthonormality of $\beta(s)$, and $\beta(s) V_k$ equals $v_k(s)$ based on (15). Defining $T_k(s_1, s) \triangleq \sum_{j=1}^{K} c_j(s_1) W_j c_j^H(s) + \mu \delta(s_1 - s)$, the functional update equation of $v_k(s)$ is derived as

$v_k(s) = \int_{\mathcal{S}_B} T_k^{-1}(s_1, s) c_k(s_1) W_k \, \mathrm{d}s_1$,  (38)
where $T_k^{-1}(s_1, s)$ denotes the inverse of $T_k(s_1, s)$. The proposed functional WMMSE algorithm is summarized in Table I.

TABLE I
PSEUDOCODE OF THE PROPOSED FUNCTIONAL WMMSE ALGORITHM
1: Initialize $v_k(s)$ such that $\sum_{k=1}^{K} \int_{\mathcal{S}_B} \|v_k(s)\|^2 \, \mathrm{d}s \leq C_{\max}$, and set $W_k = I_d, \forall k$
2: repeat
3:   Update $u_k(r)$ with (29), $\forall k$
4:   Update $W_k$ with (31), $\forall k$
5:   Update $v_k(s)$ with (38), $\forall k$
6: until the change in $\sum_{k=1}^{K} \log\det(W_k)$ is less than a tolerance $\varepsilon$

Remark 1. Unlike Fourier-based discretization methods, which iteratively optimize finite-dimensional coefficient matrices and then reconstruct the continuous beamforming functions from truncated Fourier expansions [10], [11], the proposed approach yields closed-form functional update equations and updates the beamforming functions directly in the continuous domain. It therefore avoids the approximation errors introduced by truncating the Fourier expansion.

V. INR OF BEAMFORMING FUNCTION

In this section, we employ an INR to learn the beamforming policy. We first analyze the PE property of the policy and implement the INR as a GNN to exploit it. We then relate the functional WMMSE algorithm to the conventional GNN update rule and, based on this relation, derive a new GNN update equation.

Fig. 2. Illustration of the undirected graph with $K = 4$.

A. Permutation-Equivariant INR via GNNs

Substituting (1) into (6c), then into (6a), and integrating over $\mathcal{S}_U$ shows that $Q_k$ depends on the user geometry matrix $P_o = [p_1^T, \cdots, p_K^T]^T \in \mathbb{R}^{K \times 6}$, where $p_k = [r_{k,o}^T, \omega_x^k, \omega_y^k, \omega_z^k] \in \mathbb{R}^{1 \times 6}$ collects the location and orientation of user $k$. Consequently, the beamformer $V(s) = [v_1^T(s), \ldots, v_K^T(s)]^T \in \mathbb{C}^{K \times d}$ is determined by $P_o$ and $s$.
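For intuition about the algorithm of Table I before turning to the learned policy, its three updates can be sketched on a discretized grid, where each kernel becomes a matrix, each integral a weighted sum, and the kernel inverses in (29) and (38) become linear solves. The channels below are random stand-ins, and $\mu$ is held fixed rather than found by bisection, so the power constraint (7b) is not enforced here; this is a simplified illustration under assumed sizes, not the full algorithm:

```python
import numpy as np

rng = np.random.default_rng(4)
K, d = 2, 2              # users, streams (illustrative)
N_B, N_U = 32, 8         # sample points on the BS / each user aperture
dA_B, dA_U = 1e-3, 1e-3  # area elements of the assumed discretization
sigma2, mu = 1e-2, 1e-2  # noise power; mu fixed instead of bisection (simplification)

H = [rng.standard_normal((N_U, N_B)) + 1j * rng.standard_normal((N_U, N_B)) for _ in range(K)]
V = [rng.standard_normal((N_B, d)) + 1j * rng.standard_normal((N_B, d)) for _ in range(K)]
W = [np.eye(d, dtype=complex) for _ in range(K)]

for it in range(20):
    U = []
    for k in range(K):
        a = [dA_B * H[k] @ V[j] for j in range(K)]          # a_kj(r), eq. (6c)
        # J_k as an N_U x N_U matrix; delta(r1 - r) -> I/dA_U
        J = sum(aj @ aj.conj().T for aj in a) + (sigma2 / dA_U) * np.eye(N_U)
        U.append(np.linalg.solve(J, a[k]) / dA_U)           # u_k update, eq. (29)
        W[k] = np.linalg.inv(np.eye(d) - dA_U * U[k].conj().T @ a[k])  # eq. (31)
    C = [dA_U * H[k].conj().T @ U[k] for k in range(K)]     # c_k(s), eq. (34)
    T = sum(C[j] @ W[j] @ C[j].conj().T for j in range(K)) + (mu / dA_B) * np.eye(N_B)
    V = [np.linalg.solve(T, C[k] @ W[k]) / dA_B for k in range(K)]     # eq. (38)
```

The $u_k$ and $v_k$ updates solve the discretized forms of (28) and (37) directly, which is why no explicit kernel inverse appears; each block update is optimal given the others, as in the conventional WMMSE algorithm.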
We therefore define the optimal beamforming policy as the mapping from $(s, P_o)$ to the optimal beamformer, which is denoted as

$V^\star(s) = F_b(s, P_o), \quad s \in \mathcal{S}_B$,  (39)

where $V^\star(s)$ denotes the optimal beamformer for the input $(s, P_o)$. The policy in (39) exhibits a one-dimensional PE (1D-PE) property w.r.t. the user dimension, as stated in the following proposition.

Proposition 3. When the policy input is permuted as $\Pi^T P_o$, then the beamforming function $\Pi^T V^\star(s)$ is optimal, i.e.,

$\Pi^T V^\star(s) = F_b\left( s, \Pi^T P_o \right)$,  (40)

where $\Pi \in \mathbb{R}^{K \times K}$ is a permutation matrix on the user indices.

Proof: See Appendix D.

To parameterize the policy in (39), we employ an INR given by

$V(s) = P_\theta(s, P_o), \quad s \in \mathcal{S}_B$,  (41)

where $P_\theta(\cdot)$ denotes the INR parameterized by $\theta$. To exploit the 1D-PE property in (40), we implement $P_\theta(\cdot)$ as a GNN defined on a vertex graph. Specifically, following [38], the graph contains $K$ user vertices with pairwise edges, as illustrated in Fig. 2. For vertex $k$, the feature consists of the geometry vector $p_k$ and the spatial coordinate $s$, while the action is $v_k(s)$. No features or actions are assigned to the edges.

A GNN updates vertex representations by aggregating information from neighboring vertices and combining it with the representation of the target vertex. Specifically, for user vertex $k$, the hidden representation in the $(l+1)$-th layer, $d_k^{(l+1)}(s) = [d_{k,1}^{(l+1)}(s), \ldots, d_{k,C_{l+1}}^{(l+1)}(s)]^T$, where $C_{l+1}$ is the representation dimension, is updated as

$d_k^{(l+1)}(s) = \sigma\Big( \underbrace{S^{(l)} d_k^{(l)}(s) + \overbrace{W^{(l)} \textstyle\sum_{i=1, i \neq k}^{K} d_i^{(l)}(s)}^{\text{aggregation}}}_{\text{combination}} \Big)$,  (42)
where the neighboring representations $d_i^{(l)}(s), \forall i \neq k$, are aggregated through the trainable matrix $W^{(l)} \in \mathbb{R}^{C_{l+1} \times C_l}$, and the resulting aggregated information is combined with the representation of vertex $k$, namely $d_k^{(l)}(s)$, through $S^{(l)} \in \mathbb{R}^{C_{l+1} \times C_l}$. In (42), $\sigma(\cdot)$ denotes the activation function, and the initial and final hidden representations, $d_k^{(0)}(s)$ and $d_k^{(L)}(s)$, correspond to the vertex features and actions, respectively, where $L$ is the total number of GNN layers.

From (42), the aggregation and combination processes have the following characteristics:

1) In the aggregation term, the neighboring hidden representations $d_i^{(l)}(s), \forall i \neq k$, are aggregated using a shared trainable matrix $W^{(l)}$, without additional weights to differentiate their contributions.

2) In the combination term, $d_k^{(l)}(s)$ is combined with the aggregated information through $S^{(l)}$.

3) The update of $d_k^{(l+1)}(s)$ uses only the information at point $s$ from layer $l$.

As reported in [39], for related beamforming tasks in SPDA systems, exploiting the PE property alone does not necessarily ensure generalization to unseen problem scales, and incorporating mathematical structure into update-equation design can improve learning efficiency by enhancing generalizability. Motivated by these observations, we next design a new aggregation and combination mechanism by relating the recursion of the proposed functional WMMSE algorithm to the update equation of a conventional GNN.

B. Update Equation Design of GNN

To relate the proposed functional WMMSE algorithm to the conventional GNN update equation, it is useful to express the beamforming iteration in Table I explicitly as a recursion with respect to the previous beamforming iterate, as stated in the following proposition.

Proposition 4.
Given the beamforming functions $\{v_j^{(l)}(s)\}_{j=1}^{K}$ at iteration $l$, the beamforming iteration equation in the proposed functional WMMSE algorithm can be equivalently written as

$v_k^{(l+1)}(s) = \int_{\mathcal{S}_U^k} h_k^H(r, s) a_{kk}^{(l)}(r) \, \mathrm{d}r \, \Theta_k + \sum_{i=1}^{K} \sum_{\substack{j=1 \\ (i,j) \neq (k,k)}}^{K} \int_{\mathcal{S}_U^i} h_i^H(r, s) a_{ij}^{(l)}(r) \, \mathrm{d}r \, \Sigma_{ij}^k$,  (43)

where $a_{ij}^{(l)}(r) = \int_{\mathcal{S}_B} h_i(r, s_1) v_j^{(l)}(s_1) \, \mathrm{d}s_1$ and $\Theta_k, \Sigma_{ij}^k \in \mathbb{R}^{d \times d}$.

Proof: See Appendix E.

Comparing (43) with (42) shows that the functional WMMSE recursion and the conventional GNN update share a similar iterative paradigm. In both cases, either the beamformer $v_k^{(l+1)}(s)$ in (43) or the hidden representation $d_k^{(l+1)}(s)$ in (42) is obtained through an aggregation step followed by a combination step. Specifically, (42) aggregates neighboring hidden representations $d_i^{(l)}(s)$ and combines the result with the hidden representation of user $k$, whereas (43) aggregates the channel kernels $h_i^H(r, s)$ weighted by $a_{ij}^{(l)}(r)$ and combines them with the self term $h_k^H(r, s)$ weighted by $a_{kk}^{(l)}(r)$.

This structural correspondence motivates a new GNN update equation. Relative to (42), (43) exhibits four key differences in aggregation and combination:

1) The aggregated terms are the channel kernels $h_i^H(r, s)$, rather than neighboring hidden representations $d_i^{(l)}(s)$.

2) Each kernel contribution is weighted by a specific coefficient $a_{ij}^{(l)}(r)$, instead of sharing a common aggregation weight.

3) The combination term is the kernel $h_k^H(r, s)$ with coefficient $a_{kk}^{(l)}(r)$, rather than the previous-layer hidden representation $d_k^{(l)}(s)$.

4) The update at $s$ in (43) depends on information over the entire aperture, as the combination and aggregation involve integrals over $\mathcal{S}_B$.

These observations motivate a new aggregation and combination design.
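For reference, the conventional update (42) and its equivariance can be checked numerically. The layer below evaluates (42) at a single point $s$, using a ReLU activation and random weights purely as illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
K, C_in, C_out = 4, 8, 8  # users, layer widths (illustrative)

S = rng.standard_normal((C_out, C_in))  # combination matrix S^(l)
W = rng.standard_normal((C_out, C_in))  # aggregation matrix W^(l)

def gnn_layer(D):
    """One update of (42) at a fixed point s; row k of D holds d_k^(l)(s)."""
    agg = D.sum(axis=0, keepdims=True) - D       # sum over i != k, for every k at once
    return np.maximum(D @ S.T + agg @ W.T, 0.0)  # sigma = ReLU (assumed)

D = rng.standard_normal((K, C_in))
perm = rng.permutation(K)
# 1D permutation equivariance at the layer level: permuting the input rows
# (users) permutes the output rows identically.
out_a = gnn_layer(D[perm])
out_b = gnn_layer(D)[perm]
```

Because $S^{(l)}$ and $W^{(l)}$ are shared across all vertices, stacking such layers yields a network that satisfies (40) by construction, which is the property BeamINR inherits.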
Interpreting $v_k^{(l)}(s)$ in (43) as the $l$-th layer hidden representation $\bar{d}_k^{(l)}(s) \in \mathbb{C}^{C_l\times 1}$, and viewing the summation terms as aggregation and the additive structure as combination, we design the following GNN update equation
$$d_k^{(l+1)}(s) = \int_{S_U^k} h_k^H(r,s)\, a_{kk}^{(l)}(r)\, dr\, S^{(l)} + \sum_{i=1}^{K} \sum_{\substack{j=1 \\ (i,j)\neq(k,k)}}^{K} \int_{S_U^i} h_i^H(r,s)\, a_{ij}^{(l)}(r)\, dr\, W^{(l)}, \quad (44)$$
where $a_{ij}^{(l)}(r) = \int_{S_B} h_i(r,s)\, d_j^{(l)}(s)\, ds$, and the matrices $\Theta_k$ and $\Sigma_{ij}^k$ in (43) are absorbed into the trainable parameter matrices $S^{(l)}$ and $W^{(l)}$.

The update equation can be further simplified by exploiting the graph topology. If vertex $k$ aggregates information only from its neighbors, i.e., the vertices connected to $k$, then (44) reduces to
$$d_k^{(l+1)}(s) = \int_{S_U^k} h_k^H(r,s)\, a_{kk}^{(l)}(r)\, dr\, S^{(l)} + \sum_{i=1, i\neq k}^{K} \int_{S_U^k} h_k^H(r,s)\, a_{ki}^{(l)}(r)\, dr\, W^{(l)}. \quad (45)$$
The INR with the update equation in (45) is referred to as BeamINR.

C. Training of BeamINR

BeamINR is trained in an unsupervised manner using the negative objective in (7a) as the loss. Since this objective involves continuous integrals over the BS and user CAPAs, direct evaluation is intractable. We therefore approximate all integrals using Gauss–Legendre (GL) quadrature. Specifically, for a continuous function $f(\cdot)$ defined on $S_B$, the integral is approximated as [16]
$$\int_{S_B} f(s)\, ds \approx \sum_{m=1}^{M_{B,G}} \sum_{n=1}^{M_{B,G}} \xi_m^B \xi_n^B \frac{A_B}{4}\, f(s_{m,n}), \quad s_{m,n} \in S_B, \quad (46)$$
where $s_{m,n} = \big[\phi_m^B \frac{L_B^x}{2},\, \phi_n^B \frac{L_B^y}{2},\, 0\big]^T$ denotes a sampling point, $A_B = L_B^x L_B^y$ is the area of $S_B$, $M_{B,G}$ is the quadrature order, and $\{\phi_m^B\}_{m=1}^{M_{B,G}}$ and $\{\xi_m^B\}_{m=1}^{M_{B,G}}$ denote the roots and weights of the Legendre polynomial, respectively. With (46), we construct a GL-based training dataset for BeamINR.
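The tensor-product GL rule in (46) can be sketched as follows; the aperture is modeled as a centered rectangle in the $z=0$ plane, and `leggauss` supplies the Legendre roots and weights. The function names are illustrative, not from the paper.

```python
import numpy as np

def gl_quadrature_2d(f, Lx, Ly, M):
    """Approximate the surface integral of f over the centered Lx-by-Ly
    aperture (z = 0 plane) with an M x M tensor Gauss-Legendre rule,
    mirroring (46): phi are the Legendre roots on [-1, 1], xi the associated
    weights, and the factor A/4 rescales [-1, 1]^2 to the physical aperture."""
    phi, xi = np.polynomial.legendre.leggauss(M)  # roots and weights on [-1, 1]
    A = Lx * Ly
    total = 0.0
    for m in range(M):
        for n in range(M):
            s_mn = np.array([phi[m] * Lx / 2, phi[n] * Ly / 2, 0.0])
            total += xi[m] * xi[n] * (A / 4) * f(s_mn)
    return total
```

An order-$M$ rule integrates polynomials up to degree $2M-1$ exactly along each axis, which is why a moderate $M_{B,G}$ already yields accurate approximations of the smooth channel integrands.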
Each sample consists of a fixed set of quadrature points on the BS CAPA and an independently generated user geometry matrix. Specifically, the same $M_{B,G}^2$ points $\{s_{m,n}\}_{m,n=1}^{M_{B,G}}$ are shared across all samples, whereas the user geometry varies across samples. The resulting training dataset is defined as
$$\mathcal{D}_{B,G} = \Big\{ \{s_{m,n}\}_{m,n=1}^{M_{B,G}},\, P_o^{(t)} \Big\}_{t=1}^{|\mathcal{D}_{B,G}|}, \quad (47)$$
where $P_o^{(t)}$ denotes the user geometry matrix of the $t$-th sample and $|\mathcal{D}_{B,G}|$ is the dataset size.

The loss over $\mathcal{D}_{B,G}$ is computed using GL quadrature. For the $t$-th sample, the beamforming function is obtained as
$$V^{(t)}(s_{m,n}) = P_{\theta_b}(s_{m,n}, P_o^{(t)}), \quad s_{m,n} \in S_B. \quad (48)$$
Using this result, the integral in (6c) is approximated by
$$a_{kj}^{(t)}(r^{(t)}) \approx \sum_{m=1}^{M_{B,G}} \sum_{n=1}^{M_{B,G}} \xi_m^B \xi_n^B \frac{A_B}{4}\, h(r^{(t)}, s_{m,n})\, v_j^{(t)}(s_{m,n}), \quad (49)$$
where $h(r^{(t)}, s_{m,n})$ depends on $P_o^{(t)}$ via (1). Similarly, the integral in (6a) is approximated as
$$\bar{Q}_k^{(t)}(r) \approx \sum_{m=1}^{M_{U,G}} \sum_{n=1}^{M_{U,G}} \xi_m^U \xi_n^U \frac{A_U}{4}\, a_{kk}^H(r)\, J_{\bar{k}}^{-1}(r, r_{k,m,n}^{(t)})\, a_{kk}(r_{k,m,n}^{(t)}), \quad (50a)$$
$$Q_k^{(t)} \approx \sum_{m=1}^{M_{U,G}} \sum_{n=1}^{M_{U,G}} \xi_m^U \xi_n^U \frac{A_U}{4}\, \bar{Q}_k^{(t)}(r_{k,m,n}^{(t)}), \quad (50b)$$
where $A_U = L_U^x L_U^y$ is the area of $S_U^k$, $M_{U,G}$ is the quadrature order on $S_U^k$, $\{\xi_m^U\}_{m=1}^{M_{U,G}}$ are the corresponding GL weights, and $r_{k,m,n}^{(t)}$ are the user-side quadrature points in the $t$-th sample. To obtain $r_{k,m,n}^{(t)}$, we first generate local quadrature points $\{\bar{r}_{m,n}\}_{m,n=1}^{M_{U,G}}$ on $\bar{S}_U^k$ using the GL roots $\{\phi_m^U\}_{m=1}^{M_{U,G}}$, and then map them to the global coordinate system via $r_{k,m,n}^{(t)} = \bar{r}_{m,n} + r_{k,o}^{(t)}$, where $r_{k,o}^{(t)}$ is the location of user $k$ in the $t$-th sample. Substituting $Q_k^{(t)}$ into (7a) yields the achievable sum rate of the $t$-th sample. The loss is defined as the negative sum rate and denoted by $\mathcal{L}_{B,G}^{(t)}$.
The average loss over dataset $\mathcal{D}_{B,G}$ is then computed as $\mathcal{L}_{B,G} = \frac{1}{|\mathcal{D}_{B,G}|} \sum_{t=1}^{|\mathcal{D}_{B,G}|} \mathcal{L}_{B,G}^{(t)}$.

The same GL approximation is also applied to the integrals in each GNN update layer. Specifically, for the $t$-th sample, the hidden representation of the $l$-th layer $d_k^{(l+1)}(s)$ is updated by
$$d_k^{(l+1)}(s) \approx \sum_{m=1}^{M_{U,G}} \sum_{n=1}^{M_{U,G}} \xi_m^U \xi_n^U \frac{A_U}{4}\, h_k^H(r_{k,m,n}^{(t)}, s)\, e_{kk}^{(l)}(r_{k,m,n}^{(t)})\, S^{(l)} + \sum_{i=1, i\neq k}^{K} \sum_{m=1}^{M_{U,G}} \sum_{n=1}^{M_{U,G}} \xi_m^U \xi_n^U \frac{A_U}{4}\, h_k^H(r_{k,m,n}^{(t)}, s)\, e_{ki}^{(l)}(r_{k,m,n}^{(t)})\, W^{(l)}, \quad (51)$$
where $e_{kj}^{(l)}(r^{(t)}) \approx \sum_{m=1}^{M_{B,G}} \sum_{n=1}^{M_{B,G}} \xi_m^B \xi_n^B \frac{A_B}{4}\, h_k(r^{(t)}, s_{m,n})\, d_j^{(l)}(s_{m,n})$.

Although GL quadrature yields accurate integral approximations, using only the fixed GL sampling nodes in $\mathcal{D}_{B,G}$ may cause BeamINR to overfit to these specific coordinates and thereby limit generalization over the BS CAPA [40]. To alleviate this issue, we adopt a hybrid training strategy that combines deterministic GL quadrature with sample-wise randomized sampling [18]. Specifically, in addition to the GL-based dataset $\mathcal{D}_{B,G}$, we construct an auxiliary dataset using points generated by an Owen-scrambled Sobol sequence [41], and train BeamINR with the combined loss over the two datasets. For the auxiliary dataset, we generate $M_{B,S}$ randomized sampling points for each sample using the Owen-scrambled Sobol sequence. The resulting sampling points $\{s_i^{(t)}\}_{i=1}^{M_{B,S}}$, together with the corresponding user geometry matrix $P_o^{(t)}$, form the auxiliary dataset
$$\mathcal{D}_{B,S} = \Big\{ \{s_i^{(t)}\}_{i=1}^{M_{B,S}},\, P_o^{(t)} \Big\}_{t=1}^{|\mathcal{D}_{B,S}|}, \quad (52)$$
where $|\mathcal{D}_{B,S}|$ denotes the dataset size.
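The randomized sampling step might be sketched as follows, using SciPy's scrambled Sobol generator as a stand-in for the Owen-scrambled sequence of [41]; function names, seeding, and the rescaling convention are assumptions for illustration.

```python
import numpy as np
from scipy.stats import qmc

def sobol_aperture_points(M_S, Lx, Ly, seed=0):
    """Draw M_S randomized low-discrepancy points on the centered Lx-by-Ly BS
    aperture (z = 0 plane) with a scrambled Sobol sequence, as in the
    auxiliary dataset construction around (52)."""
    sampler = qmc.Sobol(d=2, scramble=True, seed=seed)
    u = sampler.random(M_S)                      # points in [0, 1)^2
    xy = (u - 0.5) * np.array([Lx, Ly])          # rescale to the aperture
    return np.hstack([xy, np.zeros((M_S, 1))])   # append z = 0

def randomized_integral(f_vals, Lx, Ly):
    """Randomized estimate of (53): sample mean of f times the aperture area."""
    return np.mean(f_vals) * Lx * Ly
```

Using a power of two for `M_S` keeps the Sobol sequence balanced; the per-sample scrambling is what prevents the network from memorizing one fixed coordinate grid.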
For the $t$-th sample in $\mathcal{D}_{B,S}$, the loss $\mathcal{L}_{B,S}^{(t)}$ is computed in the same way as $\mathcal{L}_{B,G}^{(t)}$, except that the integral over $S_B$ is approximated using the randomized sampling points $\{s_i^{(t)}\}_{i=1}^{M_{B,S}}$, i.e.,
$$\int_{S_B} f(s)\, ds \approx \frac{A_B}{M_{B,S}} \sum_{i=1}^{M_{B,S}} f(s_i^{(t)}). \quad (53)$$
The average loss over $\mathcal{D}_{B,S}$ is given by $\mathcal{L}_{B,S} = \frac{1}{|\mathcal{D}_{B,S}|} \sum_{t=1}^{|\mathcal{D}_{B,S}|} \mathcal{L}_{B,S}^{(t)}$. Finally, the overall training loss of BeamINR is defined as
$$\mathcal{L}_B = (1-\alpha)\, \mathcal{L}_{B,G} + \alpha\, \mathcal{L}_{B,S}, \quad (54)$$
where $\alpha \in [0,1]$ is a factor controlling the importance of $\mathcal{L}_{B,S}$.

VI. SIMULATION RESULTS

In this section, we evaluate the proposed functional WMMSE algorithm and BeamINR and compare them with relevant baselines.

A. Simulation Setup

Unless otherwise specified, the following simulation setup is used. The number of users is $K = 3$. The BS and user CAPAs have side lengths $L_B^x = L_B^y = 2$ m and $L_U^x = L_U^y = 0.5$ m, respectively. The position of user $k$ is given by $r_o^k = [r_x^k, r_y^k, r_z^k]^T$, where $r_x^k, r_y^k \in [-5, 5]$ m and $r_z^k \in [20, 30]$ m are uniformly distributed. The rotation angles are independently and uniformly distributed as $\omega_x^k, \omega_y^k, \omega_z^k \in [-\frac{\pi}{2}, \frac{\pi}{2}]$. The wavelength is set to $\lambda = 0.125$ m (corresponding to a carrier frequency of 2.4 GHz), and the intrinsic impedance is $\eta = 120\pi\ \Omega$. To maximize the multiplexing gain, the number of data streams is set to $d = \min\{d_B, d_U\}$, where $d_B = \big(2\lfloor L_B^x/\lambda \rfloor + 1\big)\big(2\lfloor L_B^y/\lambda \rfloor + 1\big)$ and $d_U = \big(2\lfloor L_U^x/\lambda \rfloor + 1\big)\big(2\lfloor L_U^y/\lambda \rfloor + 1\big)$, as derived in [16]. The maximum current budget is set to $C_{\max} = 1000$ mA$^2$, and the noise variance is $\sigma_n^2 = 5.6 \times 10^{-3}$ V$^2$. The hyperparameters of BeamINR are set as follows. We use six hidden layers with dimensions $[64, 128, 512, 512, 128, 64]$, and apply the tanh activation function to all hidden layers.
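The stream-number rule $d = \min\{d_B, d_U\}$ can be checked numerically; this is a direct transcription of the formula above (from [16]), with illustrative argument names.

```python
import math

def num_streams(LxB, LyB, LxU, LyU, lam):
    """Spatial-DoF rule from the setup: each aperture supports
    (2*floor(Lx/lam)+1) * (2*floor(Ly/lam)+1) spatial modes, and the number
    of data streams is d = min{d_B, d_U}."""
    dB = (2 * math.floor(LxB / lam) + 1) * (2 * math.floor(LyB / lam) + 1)
    dU = (2 * math.floor(LxU / lam) + 1) * (2 * math.floor(LyU / lam) + 1)
    return min(dB, dU), dB, dU
```

With the default setup ($L_B = 2$ m, $L_U = 0.5$ m, $\lambda = 0.125$ m), this gives $d_B = 33^2 = 1089$ and $d_U = 9^2 = 81$, so the smaller user aperture limits the system to $d = 81$ streams.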
BeamINR is trained in an unsupervised manner with the loss defined in (54). Training is performed using the Adam optimizer with an initial learning rate of $10^{-3}$ and a batch size of 32. We generate 500,000 samples in each of $\mathcal{D}_{B,G}$ and $\mathcal{D}_{B,S}$ for training. Each sample in $\mathcal{D}_{B,G}$ contains $M_{B,G}^2 = 100$ fixed GL sampling points on $S_B$, while each sample in $\mathcal{D}_{B,S}$ contains $M_{B,S} = 100$ randomized Sobol sampling points. To ensure a consistent sampling density across both CAPAs, the numbers of sampling points are scaled according to the CAPA areas. Accordingly, the number of GL sampling points on $S_U^k$ is set to $M_{U,G} = \big\lceil \frac{L_U^x}{L_B^x} M_{B,G} \big\rceil$. The factor in (54) is set to $\alpha = 0.1$. For testing, we generate an additional 10,000 samples in $\mathcal{D}_{B,G}$. To obtain accurate integral approximations during testing, a higher-resolution GL sampling grid with $M_{B,G}^2 = 400$ is used.

B. Learning Performance

We compare the proposed functional WMMSE algorithm and BeamINR with the following baselines:
• ConINR: An INR baseline using the conventional GNN update equation in (42).
• VarINR: A variant of BeamINR with the update equation in (44), without exploiting the underlying graph topology.
• Fourier: The Fourier-basis method in [9], which truncates beamforming functions into finite Fourier coefficients for optimization.
• SPDA: The discretization-based method in [11], which approximates CAPAs by finite array elements.

Fig. 3 shows that the achievable sum rate increases with the transmit current budget. As shown in the figure, both functional WMMSE and the learning-based methods consistently outperform the Fourier and SPDA methods because they avoid discretization loss. BeamINR further outperforms ConINR and VarINR, mainly due to the proposed update equation in (45), which incorporates both the iterative structure of the functional WMMSE algorithm and the topology of the CAPA graph.
Among all methods, functional WMMSE achieves the highest sum rate. Similar trends are observed in Fig. 4, where functional WMMSE and BeamINR maintain gains over all baselines across all user counts, indicating more effective multiuser interference mitigation. Moreover, the advantage of functional WMMSE becomes more pronounced as the number of users increases.

Fig. 3. Sum rate versus current budget.
Fig. 4. Sum rate versus number of users.
Fig. 5. Sum rate versus CAPA size.

In Fig. 5, we consider square user CAPAs with $L_U^x = L_U^y$. For all schemes, enlarging the user aperture increases the achievable sum rate by providing additional spatial degrees of freedom (DoFs). The proposed functional WMMSE algorithm and BeamINR consistently outperform the baselines, suggesting that they exploit the additional DoFs offered by larger apertures more effectively. Fig. 6 further shows that the achievable rate increases with the carrier frequency, which is consistent with the fact that higher frequencies support a larger number of spatial DoFs [42].

C. Generalizability

In this subsection, we evaluate the generalizability of the learning-based methods with respect to the number of users, CAPA size, and carrier frequency.

Fig. 6. Sum rate versus frequency.

Table II reports the generalization performance of the learning-based methods with respect to the number of users. All methods are trained with $K = 5$ and evaluated for $K$ ranging from 2 to 8. The performance metric is the ratio of the sum rate achieved by each learning-based method to that achieved by the proposed functional WMMSE method.
As shown in Table II, BeamINR achieves higher sum-rate ratios than ConINR for most user counts, indicating improved generalizability across different user counts due to the incorporation of the underlying mathematical structure.

TABLE II
USER GENERALIZABILITY
K            2      3      4      5      6      7      8
BeamINR (%)  52.76  76.48  90.71  94.30  92.97  90.31  86.51
VarINR (%)   37.75  42.91  69.33  93.71  76.16  59.83  58.23
ConINR (%)   45.98  81.77  85.73  89.95  85.99  81.23  78.37

For generalization with respect to the BS CAPA size, all learning-based models are trained with $A_B = 4$ m$^2$ and tested over $A_B \in [2, 6]$ m$^2$. For generalization with respect to the user CAPA size, the models are trained with $A_U = 0.5$ m$^2$ and tested over $A_U \in [0.3, 0.7]$ m$^2$. The performance metric is the ratio of the sum rate achieved by each learning-based method to that achieved by the functional WMMSE algorithm. As shown in Table III, all INRs maintain more than 95% of the functional WMMSE performance across $A_B$, indicating strong generalizability with respect to the BS CAPA area. They also generalize well to larger user CAPA sizes.

TABLE III
CAPA SIZE GENERALIZABILITY
A_B (m^2)    2      3      4      5      6
BeamINR (%)  96.62  96.92  98.76  97.35  96.56
VarINR (%)   95.34  95.93  97.01  96.84  95.54
ConINR (%)   95.21  95.67  96.87  96.32  95.01

A_U (m^2)    0.3    0.4    0.5    0.6    0.7
BeamINR (%)  93.51  94.87  97.39  96.32  95.02
VarINR (%)   91.07  93.36  96.52  95.07  93.60
ConINR (%)   86.35  91.07  95.83  94.32  92.93

Table IV reports the generalization of the learning-based methods with respect to carrier frequency. All INRs are trained at a fixed carrier frequency $f = 2.4$ GHz and tested over $f \in [1.8, 3]$ GHz. As shown in Table IV, ConINR maintains more than 90% of the functional WMMSE performance only for $f \in [2.2, 2.6]$ GHz, indicating good generalizability only within a relatively narrow band. In contrast, BeamINR and VarINR maintain high performance across the entire tested range.
This behavior is expected because the channel kernel depends on the wavelength $\lambda$ and hence on the carrier frequency $f$, as implied by (3). By using the designed update equations in (44) and (45), BeamINR and VarINR primarily learn the aggregation and combination coefficients and then apply them to the frequency-dependent channel terms to construct the beamforming function, thereby improving frequency generalizability.

TABLE IV
FREQUENCY GENERALIZABILITY
f (GHz)      1.8    2      2.2    2.4    2.6    2.8    3
BeamINR (%)  90.84  92.20  95.82  98.26  95.64  92.31  90.52
VarINR (%)   88.92  90.26  93.06  97.23  94.59  90.01  89.35
ConINR (%)   83.16  88.56  92.73  96.19  93.02  86.96  80.23

D. Inference and Training Complexity

Table V compares the inference time and training complexity of the considered methods. Training complexity is evaluated in terms of sample, time, and space requirements, all measured under the condition that the method achieves 95% of the sum rate attained by the functional WMMSE algorithm. As shown in Table V, the learning-based methods achieve substantially lower inference latency than the numerical baselines. Among the learning-based approaches, BeamINR attains the lowest training complexity. Its inference latency is slightly higher than that of ConINR because the designed update equation introduces additional computation.

TABLE V
INFERENCE TIME AND TRAINING COMPLEXITY
Name      Inference time   Sample   Time     Space
BeamINR   0.052 s          10 K     1.62 h   3.68 M
VarINR    0.082 s          25 K     51.2 h   4.72 M
ConINR    0.033 s          50 K     3.68 h   8.63 M
WMMSE     1.378 s          —        —        —
Fourier   10.88 s          —        —        —
SPDA      9.590 s          —        —        —
Note: "K" and "M" represent thousand and million, respectively.

VII. CONCLUSIONS

This paper develops an INR-based approach, termed BeamINR, for beamforming learning in multiuser multi-CAPA systems.
We first derive a closed-form expression for the achievable sum rate and then propose a functional WMMSE algorithm for beamforming optimization. Building on this algorithm, we design BeamINR as a GNN that exploits the PE property of the optimal beamforming policy, with an update equation designed by exploiting the iterative structure of the functional WMMSE algorithm. Simulation results show that, although the functional WMMSE algorithm achieves the highest sum rate, it incurs high online complexity. In contrast, BeamINR achieves lower inference latency, lower training complexity, and better generalization w.r.t. the number of users and carrier frequency than the baseline INRs.

APPENDIX A
PROOF OF PROPOSITION 1

From (2), the achievable rate of user $k$ is the mutual information between the data vector and the received signal, i.e.,
$$R_k = H(y_k(\cdot)) - H(y_k(\cdot) \mid x_k), \quad (A.1)$$
where $y_k(\cdot)$ denotes the received-signal function on $S_U^k$, and $H(\cdot)$ and $H(\cdot \mid \cdot)$ denote the differential entropy and conditional differential entropy, respectively.

In (A.1), both $y_k(r)$ and $n_k(r)$ are Gaussian processes. To evaluate their differential entropies, we employ the Karhunen–Loève expansion (KLE), which represents a Gaussian process in an orthonormal basis and yields a sequence of statistically independent Gaussian random variables. Thus, the entropy of the process can be characterized through the entropy of these Gaussian coefficients as the number of basis functions tends to infinity. Let $\alpha_k(r) = [\alpha_{k1}(r), \ldots, \alpha_{kN_r}(r)] \in \mathbb{C}^{1\times N_r}$ denote an orthonormal basis on $S_U^k$ with $N_r \to \infty$, satisfying
$$\int_{S_U^k} \alpha_k^H(r)\, \alpha_k(r)\, dr = I_{N_r}. \quad (A.2)$$
Under this basis, the noise process $n_k(r)$ admits the expansion
$$n_k(r) = \alpha_k(r)\, n_k, \quad (A.3)$$
where $n_k \in \mathbb{C}^{N_r\times 1}$ satisfies $n_k \sim \mathcal{CN}(0, \sigma_n^2 I_{N_r})$.
Since $\alpha_k(r)$ is complete on $S_U^k$, each $a_{ki}(r)$ admits the expansion
$$a_{ki}(r) = \alpha_k(r)\, A_{ki}, \quad (A.4)$$
where $A_{ki} \in \mathbb{C}^{N_r\times d}$ is obtained by projection,
$$A_{ki} = \int_{S_U^k} \alpha_k^H(r)\, a_{ki}(r)\, dr. \quad (A.5)$$
Substituting (A.3) and (A.4) into (2) yields
$$y_k(r) = \sum_{i=1}^{K} \alpha_k(r)\, A_{ki}\, x_i + \alpha_k(r)\, n_k = \alpha_k(r)\Big( \sum_{i=1}^{K} A_{ki}\, x_i + n_k \Big) \triangleq \alpha_k(r)\, y_k. \quad (A.6)$$
For a linear transformation $\tilde{y} = A y$, the differential entropy and conditional differential entropy satisfy $h(\tilde{y}) = h(y) + \log|\det(A)|$ and $h(\tilde{y} \mid x) = h(y \mid x) + \log|\det(A)|$, respectively [43]. With (A.6), applying these identities to (A.1) yields
$$R_k = H(y_k) + \log|\det(\alpha_k(\cdot))| - \big( H(y_k \mid x_k) + \log|\det(\alpha_k(\cdot))| \big). \quad (A.7)$$
Since $y_k$ is a Gaussian random vector, its entropy and conditional entropy are given by
$$H(y_k) = \lim_{N_r\to\infty} \log\det\Big( (\pi e)^{N_r} \big( \textstyle\sum_{i=1}^{K} A_{ki} A_{ki}^H + \sigma_n^2 I_{N_r} \big) \Big), \quad (A.8a)$$
$$H(y_k \mid x_k) = \lim_{N_r\to\infty} \log\det\Big( (\pi e)^{N_r} \big( \textstyle\sum_{i=1, i\neq k}^{K} A_{ki} A_{ki}^H + \sigma_n^2 I_{N_r} \big) \Big). \quad (A.8b)$$
Substituting (A.8) into (A.7) yields
$$R_k = H(y_k) - H(y_k \mid x_k) = \lim_{N_r\to\infty} \log\det\big( I_d + \tilde{Q}_k \big), \quad (A.9)$$
where $\tilde{Q}_k \triangleq A_{kk}^H \big( \sum_{i=1, i\neq k}^{K} A_{ki} A_{ki}^H + \sigma_n^2 I_{N_r} \big)^{-1} A_{kk}$.

Next, we convert (A.9) from the coefficient domain to the functional domain using (A.2)–(A.4). Define $\tilde{A}_k = [A_{ki}]_{i\neq k} \in \mathbb{C}^{N_r\times (K-1)d}$. Applying the Woodbury matrix identity gives
$$\tilde{Q}_k = A_{kk}^H \big( \tilde{A}_k \tilde{A}_k^H + \sigma_n^2 I_{N_r} \big)^{-1} A_{kk} = \frac{1}{\sigma_n^2} A_{kk}^H A_{kk} - \frac{1}{\sigma_n^4} A_{kk}^H \tilde{A}_k \Big( I_{(K-1)d} + \frac{1}{\sigma_n^2} \tilde{A}_k^H \tilde{A}_k \Big)^{-1} \tilde{A}_k^H A_{kk}. \quad (A.10)$$
Using (A.2)–(A.4), each matrix product in (A.10) can be converted into a functional integral. For example,
$$A_{kk}^H A_{kk} = \int_{S_U^k} \underbrace{A_{kk}^H \alpha_k^H(r)}_{a_{kk}^H(r)}\, \underbrace{\alpha_k(r) A_{kk}}_{a_{kk}(r)}\, dr = \int_{S_U^k} a_{kk}^H(r)\, a_{kk}(r)\, dr.
(A.11)
Applying the same conversion to the remaining terms in (A.10) yields
$$\tilde{Q}_k = \iint_{S_U^k} a_{kk}^H(r_1)\, E_k(r_1, r_2)\, a_{kk}(r_2)\, dr_1 dr_2, \quad (A.12)$$
where $E_k(r_1, r_2) = Y(r_1, r_2) - \psi(r_1) \big( I_{(K-1)d} + G \big)^{-1} \phi(r_2)$, $G = \iint_{S_U^k} \tilde{a}_k^H(r_1)\, Y(r_1, r_2)\, \tilde{a}_k(r_2)\, dr_1 dr_2$, and the auxiliary functions are defined as $Y(r_1, r_2) = \frac{1}{\sigma_n^2}\delta(r_1 - r_2)$, $\tilde{a}_k(r) = [a_{ki}(r)]_{i\neq k} \in \mathbb{C}^{1\times (K-1)d}$, and
$$\psi(r_1) = \int_{S_U^k} Y(r_1, r_2)\, \tilde{a}_k(r_2)\, dr_2, \quad (A.13a)$$
$$\phi(r_2) = \int_{S_U^k} \tilde{a}_k^H(r_1)\, Y(r_1, r_2)\, dr_1. \quad (A.13b)$$
Using Lemma 1, $E_k(r_1, r_2)$ simplifies to
$$E_k(r_1, r_2) = Y^{-1}(r_1, r_2) + \tilde{a}_k(r_1)\, \tilde{a}_k^H(r_2) = \sum_{i=1, i\neq k}^{K} a_{ki}(r_1)\, a_{ki}^H(r_2) + \sigma_n^2 \delta(r_1 - r_2). \quad (A.14)$$
Substituting (A.14) into (A.12) shows that $\tilde{Q}_k = Q_k$, where $Q_k$ is defined in (6a). Substituting this result into (A.9) and summing over $k$ completes the proof.

APPENDIX B
PROOF OF PROPOSITION 2

From the first-order optimality condition of problem (11) w.r.t. $u_k(r)$, the optimal combining function is
$$u_k^{\mathrm{opt}}(r) = \int_{S_U^k} J_k^{-1}(r, r_1)\, a_{kk}(r_1)\, dr_1, \quad (B.1)$$
where $J_k(r, r_1) \triangleq \sum_{j=1}^{K} a_{kj}(r)\, a_{kj}^H(r_1) + \sigma_n^2 \delta(r - r_1)$ and $J_k^{-1}(r, r_1)$ denotes its inverse as defined in Definition 1. Likewise, the first-order optimality condition with respect to $W_k$ gives
$$W_k^{\mathrm{opt}} = E_k^{-1}. \quad (B.2)$$
Substituting $u_k^{\mathrm{opt}}(r)$ and $W_k^{\mathrm{opt}}$ into (11) yields the optimization problem w.r.t. $v_k(s)$,
$$\max_{v_k(s)}\ \sum_{k=1}^{K} \log\det\big( (E_k^{\mathrm{opt}})^{-1} \big) \quad (B.3a)$$
$$\text{s.t.}\ E_k^{\mathrm{opt}} = I_d - \iint_{S_U^k} a_{kk}^H(r_1)\, J_k^{-1}(r_1, r_2)\, a_{kk}(r_2)\, dr_1 dr_2, \quad (B.3b)$$
$$a_{kj}(r) = \int_{S_B} h_k(r, s)\, v_j(s)\, ds, \quad (B.3c)$$
$$\sum_{k=1}^{K} \int_{S_B} \|v_k(s)\|^2\, ds \leq C_{\max}.
(B.3d)
Applying Lemma 1 to the inverse kernel in (B.3b) gives
$$J_k^{-1}(r_1, r_2) = \big( J_{\bar{k}}(r_1, r_2) + a_{kk}(r_1)\, a_{kk}^H(r_2) \big)^{-1} = J_{\bar{k}}^{-1}(r_1, r_2) - \psi_k(r_1) \big( I_d + G_k \big)^{-1} \phi_k(r_2), \quad (B.4)$$
where $G_k = \iint_{S_U^k} a_{kk}^H(r_1)\, J_{\bar{k}}^{-1}(r_1, r_2)\, a_{kk}(r_2)\, dr_1 dr_2$, and $J_{\bar{k}}(r_1, r_2)$, $\psi_k(r_1)$, and $\phi_k(r_2)$ are defined as
$$J_{\bar{k}}(r_1, r_2) = \sum_{j=1, j\neq k}^{K} a_{kj}(r_1)\, a_{kj}^H(r_2) + \sigma_n^2 \delta(r_1 - r_2), \quad (B.5a)$$
$$\psi_k(r_1) = \int_{S_U^k} J_{\bar{k}}^{-1}(r_1, r_2)\, a_{kk}(r_2)\, dr_2, \quad (B.5b)$$
$$\phi_k(r_2) = \int_{S_U^k} a_{kk}^H(r_1)\, J_{\bar{k}}^{-1}(r_1, r_2)\, dr_1. \quad (B.5c)$$
Substituting (B.4) and (B.5) into (B.3b) yields
$$(E_k^{\mathrm{opt}})^{-1} = \big( I_d - G_k (I_d + G_k)^{-1} \big)^{-1} = I_d + G_k = I_d + \iint_{S_U^k} a_{kk}^H(r_1)\, J_{\bar{k}}^{-1}(r_1, r_2)\, a_{kk}(r_2)\, dr_1 dr_2. \quad (B.6)$$
Combining (B.6) with (B.3) completes the proof.

APPENDIX C
PROOF OF LEMMA 1

By Definition 1, it suffices to verify the identity
$$\int_{S} \big( J^{-1}(r_1, r) - \psi(r_1) (I_d + G)^{-1} \phi(r) \big) \big( J(r, r_2) + a(r)\, b(r_2) \big)\, dr = \delta(r_1 - r_2) - \psi(r_1) (I_d + G)^{-1} \underbrace{\int_{S} \phi(r)\, a(r)\, dr}_{G}\, b(r_2) - \psi(r_1) (I_d + G)^{-1} \underbrace{\int_{S} \phi(r)\, J(r, r_2)\, dr}_{b(r_2)} + \underbrace{\int_{S} J^{-1}(r_1, r)\, a(r)\, dr}_{\psi(r_1)}\, b(r_2). \quad (C.1)$$
Using the definitions in (13), the right-hand side of (C.1) simplifies to
$$\delta(r_1 - r_2) - \psi(r_1) (I_d + G)^{-1} G\, b(r_2) - \psi(r_1) (I_d + G)^{-1} b(r_2) + \psi(r_1)\, b(r_2) = \delta(r_1 - r_2) - \psi(r_1) \big( (I_d + G)^{-1} G + (I_d + G)^{-1} - I_d \big)\, b(r_2) = \delta(r_1 - r_2), \quad (C.2)$$
where $(I_d + G)^{-1} G + (I_d + G)^{-1} = I_d$. Hence, by Definition 1, the proof is complete.
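The two finite-dimensional identities underpinning these appendices, the Woodbury step of (A.10) and the combination identity used in (C.2), can be verified numerically; all dimensions and random matrices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
Nr, d, Kd, sig2 = 12, 3, 6, 0.5
A = rng.standard_normal((Nr, d)) + 1j * rng.standard_normal((Nr, d))     # plays A_kk
At = rng.standard_normal((Nr, Kd)) + 1j * rng.standard_normal((Nr, Kd))  # plays A~_k

# Woodbury step of (A.10): A^H (A~ A~^H + sig2 I)^{-1} A equals the expanded form.
direct = A.conj().T @ np.linalg.inv(At @ At.conj().T + sig2 * np.eye(Nr)) @ A
inner = np.linalg.inv(np.eye(Kd) + At.conj().T @ At / sig2)
expanded = (A.conj().T @ A) / sig2 \
    - (A.conj().T @ At @ inner @ At.conj().T @ A) / sig2**2

# Combination identity of (C.2): (I + G)^{-1} G + (I + G)^{-1} = I.
G = rng.standard_normal((d, d))
inv = np.linalg.inv(np.eye(d) + G)
```

Both checks pass to machine precision, confirming that the algebraic steps are sound before they are lifted to the functional (kernel) setting via Definition 1.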
APPENDIX D
PROOF OF PROPOSITION 3

Consider the Lagrangian of problem (7), given by
$$\mathcal{L}(H(r,s), V(s), \lambda) = R_{\mathrm{sum}} - \lambda \Big( \sum_{k=1}^{K} \int_{S_B} \|v_k(s)\|^2\, ds - C_{\max} \Big), \quad (D.1)$$
where $R_{\mathrm{sum}}$ denotes the objective in (7a), $\lambda$ is the dual variable associated with the power constraint, and $H(r,s) = [h_1(r,s), \cdots, h_K(r,s)]$ collects the channel kernels determined by $P_o$. The corresponding stationarity condition is
$$g_{V(s)}(H(r,s), V(s), \lambda) = \nabla_{V^*(s)} R_{\mathrm{sum}}\big( H(r,s), V(s) \big) - \lambda \sum_{k=1}^{K} v_k(s). \quad (D.2)$$
Let $V^\star(s)$ be an optimal solution of problem (7) for the input $H(r,s)$, and let $\lambda^\star \geq 0$ denote the corresponding optimal dual variable. Then $(H(r,s), V^\star(s))$ satisfies the Karush–Kuhn–Tucker (KKT) conditions
$$g_{V(s)}(H(r,s), V^\star(s), \lambda^\star) = 0, \quad (D.3a)$$
$$\lambda^\star \Big( \sum_{k=1}^{K} \int_{S_B} \|v_k^\star(s)\|^2\, ds - C_{\max} \Big) = 0. \quad (D.3b)$$
Now permute $V^\star(s)$ and $H(r,s)$ as $\Pi^T V^\star(s)$ and $\Pi^T H(r,s)$, respectively. The KKT conditions in (D.3) remain satisfied under this permutation. Therefore, $\Pi^T V^\star(s)$ is an optimal solution for the permuted channel collection $\Pi^T H(r,s)$. Since $\Pi^T H(r,s)$ corresponds to $\Pi^T P_o$, it follows that $\Pi^T V^\star(s)$ is optimal for the input $\Pi^T P_o$ in (40). This completes the proof.

APPENDIX E
PROOF OF PROPOSITION 4

To prove Proposition 4, we first apply Lemma 1 to the inverse kernels in (38) and (29), and then substitute the resulting expression of (29) into (38). Define $m(s) \triangleq [c_1(s), \cdots, c_K(s)] \in \mathbb{C}^{1\times Kd}$ and $n(s) \triangleq [c_1(s) W_1^H, \cdots, c_K(s) W_K^H]^H \in \mathbb{C}^{Kd\times 1}$, so that
$$T_k(s_1, s) = \mu\delta(s_1 - s) + m(s_1)\, n(s).
(E.1)
Substituting (E.1) into (38) and applying Lemma 1 gives
$$v_k(s) = \int_{S_B} \Big( \frac{1}{\mu}\delta(s_1 - s) - \frac{1}{\mu} m(s) (I_{Kd} + G)^{-1} \frac{1}{\mu} n(s_1) \Big) c_k(s_1)\, W_k\, ds_1 = \frac{1}{\mu} c_k(s) W_k - \frac{1}{\mu^2} m(s) (I_{Kd} + G)^{-1} \int_{S_B} n(s_1)\, c_k(s_1)\, W_k\, ds_1 \triangleq \frac{1}{\mu} c_k(s) W_k - m(s)\, D_k, \quad (E.2)$$
where $D_k \triangleq \frac{1}{\mu^2} (I_{Kd} + G)^{-1} \int_{S_B} n(s)\, c_k(s)\, W_k\, ds$ and $G = \frac{1}{\mu} \int_{S_B} n(s)\, m(s)\, ds$. Since $D_k \in \mathbb{C}^{Kd\times d}$, it can be partitioned into $K$ stacked block matrices as $D_k = [\bar{D}_{k1}^T, \ldots, \bar{D}_{kK}^T]^T$ with $\bar{D}_{ki} \in \mathbb{C}^{d\times d}$, $\forall i$. Using the definition of $m(s)$ in (E.1), (E.2) can be rewritten as
$$v_k(s) = \frac{1}{\mu} c_k(s) W_k - \sum_{i=1}^{K} c_i(s)\, \bar{D}_{ki} = \sum_{i=1}^{K} c_i(s)\, F_{ki}, \quad (E.3)$$
where $F_{kk} = \frac{1}{\mu} W_k - \bar{D}_{kk}$ and $F_{ki} = -\bar{D}_{ki}$ for $i \neq k$. Substituting the definition of $c_i(s)$ in (34) into (E.3) yields
$$v_k(s) = \sum_{i=1}^{K} \Big( \int_{S_U^i} h_i^H(r, s)\, u_i(r)\, dr \Big) F_i. \quad (E.4)$$
To handle the inverse kernel $J_k(r_1, r_2)$ in (29), define $o_k(r) \triangleq [a_{k1}(r), \cdots, a_{kK}(r)] \in \mathbb{C}^{1\times Kd}$, so that
$$J_k(r_1, r_2) = \sigma_n^2 \delta(r_1 - r_2) + o_k(r_1)\, o_k^H(r_2). \quad (E.5)$$
Substituting (E.5) into (29) and applying Lemma 1 gives
$$u_k(r) = \int_{S_U^k} \Big( \frac{1}{\sigma_n^2}\delta(r_1 - r) - \frac{1}{\sigma_n^4} o_k(r) \big( I_{Kd} + G_k \big)^{-1} o_k^H(r_1) \Big) a_{kk}(r_1)\, dr_1 = \frac{1}{\sigma_n^2} a_{kk}(r) - \frac{1}{\sigma_n^4} o_k(r) (I_{Kd} + G_k)^{-1} \int_{S_U^k} o_k^H(r_1)\, a_{kk}(r_1)\, dr_1 \triangleq \frac{1}{\sigma_n^2} a_{kk}(r) - \frac{1}{\sigma_n^4} o_k(r)\, Z_k, \quad (E.6)$$
where $Z_k \triangleq (I_{Kd} + G_k)^{-1} \int_{S_U^k} o_k^H(r_1)\, a_{kk}(r_1)\, dr_1$ and $G_k = \frac{1}{\sigma_n^2} \int_{S_U^k} o_k^H(r)\, o_k(r)\, dr$. Since $Z_k \in \mathbb{C}^{Kd\times d}$, it can be partitioned into $K$ stacked block matrices as $Z_k = [\bar{Z}_{k1}^T, \ldots, \bar{Z}_{kK}^T]^T$ with $\bar{Z}_{ki} \in \mathbb{C}^{d\times d}$, $\forall i$. Using the definition of $o_k(r)$ in (E.5), (E.6) can be rewritten as
$$u_k(r) = \frac{1}{\sigma_n^2} a_{kk}(r) - \frac{1}{\sigma_n^4} \sum_{i=1}^{K} a_{ki}(r)\, \bar{Z}_{ki} = \sum_{i=1}^{K} a_{ki}(r)\, L_{ki}, \quad (E.7)$$
where $L_{kk} = \frac{1}{\sigma_n^2} I_d - \frac{1}{\sigma_n^4} \bar{Z}_{kk}$ and $L_{ki} = -\frac{1}{\sigma_n^4} \bar{Z}_{ki}$ for $i \neq k$.
Using (E.7) and (E.4), the updates of $u_k^{(l+1)}(r)$ and $v_k^{(l+1)}(s)$ at iteration $(l+1)$ in Table I can be written as
$$u_k^{(l+1)}(r) = \sum_{i=1}^{K} a_{ki}^{(l)}(r)\, L_{ki}^{(l+1)}, \quad (E.8a)$$
$$v_k^{(l+1)}(s) = \sum_{i=1}^{K} \Big( \int_{S_U^i} h_i^H(r, s)\, u_i^{(l+1)}(r)\, dr \Big) F_i^{(l+1)}, \quad (E.8b)$$
where $a_{ij}^{(l)}(r) = \int_{S_B} h_i(r, s_1)\, v_j^{(l)}(s_1)\, ds_1$, and $L_{ki}^{(l+1)}$ and $F_i^{(l+1)}$ are the corresponding coefficient matrices at iteration $(l+1)$. Substituting (E.8a) into (E.8b) gives
$$v_k^{(l+1)}(s) = \sum_{i=1}^{K} \int_{S_U^i} h_i^H(r, s) \Big( \sum_{j=1}^{K} a_{ij}^{(l)}(r)\, L_{ij}^{(l+1)} \Big) dr\, F_i^{(l+1)} = \int_{S_U^k} h_k^H(r, s)\, a_{kk}^{(l)}(r)\, dr\, \Theta_k + \sum_{i=1}^{K} \sum_{\substack{j=1 \\ (i,j)\neq(k,k)}}^{K} \int_{S_U^i} h_i^H(r, s)\, a_{ij}^{(l)}(r)\, dr\, \Sigma_{ij}^k, \quad (E.9)$$
where $\Theta_k = L_{kk}^{(l+1)} F_k^{(l+1)}$ and $\Sigma_{ij}^k = L_{ij}^{(l+1)} F_i^{(l+1)}$. This is exactly (43), which completes the proof.

REFERENCES

[1] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, "Massive MIMO for next generation wireless systems," IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, 2014.
[2] E. Björnson, E. G. Larsson, and T. L. Marzetta, "Massive MIMO: ten myths and one critical question," IEEE Commun. Mag., vol. 54, no. 2, pp. 114–123, 2016.
[3] C. Huang, S. Hu, G. C. Alexandropoulos, A. Zappone, C. Yuen, R. Zhang, M. D. Renzo, and M. Debbah, "Holographic MIMO surfaces for 6G wireless networks: Opportunities, challenges, and trends," IEEE Wirel. Commun., vol. 27, no. 5, pp. 118–125, 2020.
[4] S. Bahanshal, Q.-U.-A. Nadeem, and M. Jahangir Hossain, "Holographic MIMO: How many antennas do we need for energy efficient communication?" IEEE Trans. Wireless Commun., vol. 24, no. 1, pp. 118–133, 2025.
[5] S. Chen and S. Han, "Learning-based multiuser beamforming for holographic MIMO systems," arXiv preprint, 2026.
[6] Y. Liu, X. Liu, X. Mu, T. Hou, J. Xu, M. Di Renzo, and N. Al-Dhahir, "Reconfigurable intelligent surfaces: Principles and opportunities," IEEE Commun.
Surv. Tutor., vol. 23, no. 3, pp. 1546–1577, 2021.
[7] E. Basar, M. Di Renzo, J. De Rosny, M. Debbah, M.-S. Alouini, and R. Zhang, "Wireless communications through reconfigurable intelligent surfaces," IEEE Access, vol. 7, pp. 116 753–116 773, 2019.
[8] B. Zhao, C. Ouyang, X. Zhang, and Y. Liu, "Continuous-aperture array (CAPA)-based wireless communications: Capacity characterization," IEEE Trans. Wireless Commun., 2025, early access.
[9] L. Sanguinetti, A. A. D'Amico, and M. Debbah, "Wavenumber-division multiplexing in line-of-sight holographic MIMO communications," IEEE Trans. Wireless Commun., vol. 22, no. 4, pp. 2186–2201, Apr. 2023.
[10] M. Qian, L. You, X.-G. Xia, and X. Gao, "On the spectral efficiency of multi-user holographic MIMO uplink transmission," IEEE Trans. Wireless Commun., vol. 23, no. 10, pp. 15 421–15 434, Oct. 2024.
[11] Z. Zhang and L. Dai, "Pattern-division multiplexing for multi-user continuous-aperture MIMO," IEEE J. Sel. Areas Commun., vol. 41, no. 8, pp. 2350–2366, Aug. 2023.
[12] Z. Wang, C. Ouyang, and Y. Liu, "Beamforming optimization for continuous aperture array (CAPA)-based communications," IEEE Trans. Wireless Commun., 2025, early access.
[13] ——, "Optimal beamforming for multi-user continuous aperture array (CAPA) systems," IEEE Trans. Commun., 2025, early access.
[14] Y. Liu, C. Ouyang, Z. Wang, J. Xu, X. Mu, and Z. Ding, "CAPA: Continuous-aperture arrays for revolutionizing 6G wireless communications," 2024.
[15] M. Qian, X. Mu, L. You, and M. Matthaiou, "Continuous aperture array (CAPA)-based multi-group multicast communications," arXiv:2505.01190, 2025.
[16] Z. Wang, C. Ouyang, and Y. Liu, "Beamforming design for continuous aperture array (CAPA)-based MIMO systems," 2025.
[17] J. Guo, Y. Liu, and A. Nallanathan, "Multi-user continuous-aperture array communications: How to learn current distribution?" in Proc. IEEE 43rd Glob. Commun. Conf., 2024, pp. 1–6.
[18] S. Chen, J. Guo, and S. Han, "Implicit neural representation of beamforming for continuous aperture array systems," IEEE Trans. Veh. Technol., 2026, early access.
[19] M. Eisen and A. Ribeiro, "Optimal wireless resource allocation with random edge graph neural networks," IEEE Trans. Signal Process., vol. 68, pp. 2977–2991, 2020.
[20] J. Guo and C. Yang, "Learning power allocation for multi-cell-multi-user systems with heterogeneous graph neural networks," IEEE Trans. Wireless Commun., vol. 21, no. 2, pp. 884–897, Feb. 2021.
[21] Y. Shen, Y. Shi, J. Zhang, and K. B. Letaief, "Graph neural networks for scalable radio resource management: Architecture design and theoretical analysis," IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 101–115, 2021.
[22] M. Lee, G. Yu, and G. Y. Li, "Graph embedding-based wireless link scheduling with few training samples," IEEE Trans. Wireless Commun., vol. 20, no. 4, pp. 2282–2294, 2021.
[23] Z. Zhang, T. Jiang, and W. Yu, "Learning based user scheduling in reconfigurable intelligent surface assisted multiuser downlink," IEEE J. Sel. Topics Signal Process., vol. 16, no. 5, pp. 1026–1039, 2022.
[24] B. Zhao, J. Guo, and C. Yang, "Understanding the performance of learning precoding policies with graph and convolutional neural networks," IEEE Trans. Commun., vol. 72, no. 9, pp. 5657–5673, 2024.
[25] S. He, J. Yuan, Z. An, W. Huang, Y. Huang, and Y. Zhang, "Joint user scheduling and beamforming design for multiuser MISO downlink systems," IEEE Trans. Wireless Commun., vol. 22, no. 5, pp. 2975–2988, 2023.
[26] J. Guo and C. Yang, "A model-based GNN for learning precoding," IEEE Trans. Wireless Commun., vol. 23, no. 7, pp. 6983–6999, 2024.
[27] L. Zhang, S. Han, and C. Yang, "Gradient-driven graph neural networks for learning digital and hybrid precoder," IEEE Trans. Commun., vol. 74, pp. 706–722, 2026.
[28] Q. Hu, Y. Cai, Q. Shi, K. Xu, G. Yu, and Z.
Ding, “Iterative algorithm induced deep-unfolding neural networks: Precoding design for multiuser MIMO systems, ” IEEE Tr ans. Wir eless Commun. , vol. 20, no. 2, pp. 1394–1410, 2021. [29] A. Chowdhury , G. V erma, A. Swami, and S. Segarra, “Deep graph unfolding for beamforming in MU-MIMO interference networks, ” IEEE T rans. Wir eless Commun. , vol. 23, no. 5, pp. 4889–4903, 2024. [30] O. Lavi and N. Shlezinger , “Learn to rapidly and robustly optimize hybrid precoding, ” IEEE T rans. Commun. , vol. 71, no. 10, pp. 5814– 5830, 2023. [31] Y . Y uan, G. Zheng, K.-K. W ong, B. Ottersten, and Z.-Q. Luo, “Transfer learning and meta learning-based fast downlink beamforming adapta- tion, ” IEEE T rans. W ireless Commun.s , vol. 20, no. 3, pp. 1742–1755, 2021. [32] J. Kim, H. Lee, S.-E. Hong, and S.-H. P ark, “ A bipartite graph neural network approach for scalable beamforming optimization, ” IEEE T rans. W ireless Commun. , vol. 22, no. 1, pp. 333–347, 2023. [33] S. Chen, S. Han, and Y . Li, “Gradient-based information aggregation of GNN for precoder learning, ” in Pr oc. IEEE 97th V eh. T echnol. Conf. , Dec. 2023, pp. 1–6. [34] S. Chen, S. Han, and J. Guo, “Functional WMMSE algorithm for multiuser continuous aperture array systems, ” in Proc. IEEE 24th Int. Conf. Commun. , 2026, pp. 1–6. [35] M. Ghermezcheshmeh and N. Zlatanov , “Parametric channel estimation for LoS dominated holographic massiv e MIMO systems, ” IEEE Access , vol. 11, pp. 44 711–44 724, May 2023. [36] A. Pizzo, L. Sanguinetti, and T . L. Marzetta, “Fourier plane wave series expansion for holographic MIMO communications, ” IEEE T rans. W ireless Commun. , vol. 21, no. 9, pp. 6890–6905, Sept. 2022. [37] H. F . Davis, F ourier Series and Orthogonal Functions . Ne w Y ork: Dover Publications, 1953. [38] Y . Peng, J. Guo, and C. Y ang, “Learning resource allocation policy: V ertex-GNN or edge-GNN?” IEEE T rans. Mach. Learn. Commun. Netw . , vol. 2, pp. 190–209, 2024. [39] J. Guo and C. 
Y ang, “Recursive gnns for learning precoding policies with size-generalizability , ” IEEE Tr ans. Mac h. Learn. Commun. Netw . , vol. 2, pp. 1558–1579, Oct. 2024. [40] P . J. Davis and P . Rabinowitz, Methods of Numerical Integration . Courier Corporation, 2007. [41] B. Burley , “Practical hash-based owen scrambling, ” J. Comput. Gr aph. T ech. , vol. 9, no. 4, pp. 1–20, 2020. [42] E. Bj ¨ ornson, F . Kara, N. K olomvakis, A. Kosasih, P . Ramezani, and M. B. Salman, “Enabling 6G performance in the upper mid-band by transitioning from massive to gigantic MIMO, ” IEEE Open J. Commun. Soc. , 2025. [43] T . M. Co ver and J. A. Thomas, Elements of Information Theory , 2nd ed. Hoboken, NJ, USA: John Wile y & Sons, 2006.
