Wireless Digital Twin Calibration: Refining DFT-Domain Channel Information

Hao Luo*, Saeed R. Khosravirad†, and Ahmed Alkhateeb*
*School of Electrical, Computer, and Energy Engineering, Arizona State University, {h.luo, alkhateeb}@asu.edu
†Nokia Bell Laboratories, {saeed.khosravirad}@nokia-bell-labs.com

Abstract: Wireless digital twins can be leveraged to provide site-specific synthetic channel information through precise physical modeling and signal propagation simulations. This can help reduce the overhead of channel state information (CSI) acquisition, particularly needed for large-scale MIMO systems. For high-quality digital twin channels, the classical approach is to increase the digital twin fidelity via more accurate modeling of the environment, propagation, and hardware. This, however, comes with high computational cost, making it unsuitable for real-time applications. In this paper, we propose a new framework that, instead of calibrating the digital twin model itself, calibrates the DFT-domain channel information to reduce the gap between the low-fidelity digital twin and its high-fidelity counterpart or the real world. This allows systems to leverage a low-complexity digital twin for generating real-time channel information without compromising quality. To evaluate the effectiveness of the proposed approach, we adopt codebook-based CSI feedback as a case study, where refined synthetic channel information is used to identify the most relevant DFT codewords for each user. Simulation results demonstrate the effectiveness of the proposed digital twin calibration approach in achieving high CSI acquisition accuracy while reducing the computational overhead of the digital twin. This paves the way for realizing digital twin assisted wireless systems.

I. INTRODUCTION

Current and future wireless communication networks rely heavily on scaling MIMO systems at the infrastructure and user devices.
Employing larger antenna arrays can yield spatial multiplexing and array gains. Fully realizing these gains, however, requires acquiring channel state information (CSI) for downlink precoding design. In frequency division duplex systems, downlink CSI acquisition typically consists of three steps [1]: (i) The base station (BS) sends pilot signals, and users estimate their channels based on the received pilot signals. (ii) Users compress and quantize their estimated channels using a predefined codebook and report the corresponding codeword indices and coefficients back to the BS. (iii) The BS reconstructs the CSI from the reported feedback and designs the downlink precoding vectors accordingly. This process incurs significant overhead, especially in large-scale MIMO systems, due to the need for extensive pilot transmission and feedback. Therefore, reducing the overhead of downlink CSI acquisition is crucial for enabling efficient communication in large-scale MIMO systems.

Wireless digital twins [2] have emerged with the potential to benefit large-scale antenna systems. Through multi-modal sensing and measurement techniques, digital twins can replicate real-world communication environments using 3D geometric models, electromagnetic (EM) material properties, and hardware representations. Moreover, signal propagation simulations, such as ray tracing, can be used to generate synthetic channel information for a given transmitter-receiver pair. This channel information can then serve as prior knowledge for CSI acquisition and reduce the overhead. Nonetheless, digital twins present their own set of challenges: (i) The digital twin may not be perfectly modeled, which leads to inaccurate channel information. (ii) Ray tracing simulations can be time-consuming, making them unsuitable for real-time applications. (iii) The communication environment is often dynamic, so digital twins need to adapt to changing conditions.
Therefore, further research efforts are needed to fully realize the potential of digital twins in wireless communication systems.

Prior work has explored how digital twins can assist real-time physical layer operations of wireless communication systems. For example, in [3], a wireless digital twin is used to generate synthetic channel information for assisting downlink CSI acquisition. This type of application, however, requires not only accurate modeling but also fast simulations to support real-time operation. A potential solution is to use low-complexity ray tracing algorithms to accelerate the simulation process, though this often compromises the accuracy of the resulting channel information. Motivated by the inherent trade-off between accuracy and complexity, in this paper, we study a lightweight approach to calibrate a low-complexity digital twin to refine its synthetic channel information. Notably, wireless digital twin calibration is a novel research area, and the calibration may be applied at different points in the processing chain. For instance, the twin itself could be calibrated if it is modular and parameterized (e.g., EM properties could be adjusted based on real-world measurements [4]). Alternatively, one can calibrate its output, such as the synthetic channel information, which is the specific focus of this paper.

In this work, we propose a novel calibration approach that refines the synthetic channel information generated by a low-complexity digital twin. The calibration is implemented using a lightweight deep learning model, which can be efficiently trained via offline supervised learning. The contributions of this paper are summarized as follows:
• We propose a digital twin calibration approach that leverages a lightweight deep learning model to refine the synthetic channel information, which is represented by the discrete Fourier transform (DFT) weights of the user channel.
  The calibration model can be trained with offline supervised learning, using either high-fidelity digital twin data or historical CSI feedback from existing systems.
• We demonstrate the effectiveness of the proposed calibration approach with a case study of codebook-based CSI feedback. The refined synthetic channel information is used to identify the most relevant codewords, i.e., DFT beams, for each user. This enables more efficient downlink CSI feedback and reduces pilot signal overhead.

[Fig. 1 panels: classical wireless system (downlink channel estimation and CSI feedback); digital twin aided wireless system (low-fidelity DFT weights, online refinement with deep learning, refined DFT weights, CSI estimation with selected codewords, CSI feedback); offline training dataset (high-fidelity digital-twin channel or historical real-world data); high-fidelity digital twin (high-complexity ray tracing, high-fidelity EM 3D model, hardware model); low-fidelity digital twin (low-complexity ray tracing, low-fidelity EM 3D model, hardware model).]
Fig. 1. This figure illustrates the proposed digital twin calibration framework and its use case of codebook-based CSI feedback. The BS generates synthetic channel information for a user using a low-fidelity digital twin. The synthetic channel information is then refined using a lightweight deep learning model. The refined channel information is used to select the most relevant DFT codewords for that user. The user receives pilot signals corresponding to these selected codewords and estimates their coefficients, which are reported back to the BS for downlink precoder design.
Simulation results highlight the effectiveness of our proposed approach in improving CSI acquisition accuracy while simultaneously reducing the computational overhead of the digital twin. Furthermore, the framework enables a low-complexity digital twin to generate synthetic channel information which, through calibration, can achieve performance levels approaching those of high-fidelity digital twins. This capability brings us closer to the vision of real-time digital twins [2]. Finally, it is worth noting that the proposed methodology is not limited to codebook-based CSI feedback but can also be applied to beam management for mmWave frequency bands [5].

II. SYSTEM MODEL

As shown in Fig. 1, we consider a MIMO system with $K$ single-antenna users served by a BS equipped with $N$ antennas. The BS is assisted by a digital twin that provides prior knowledge of the user channel, which is used to reduce the overhead of downlink CSI acquisition in the real world. Next, we present the system model, including the signal model, channel model, and digital twin model.

A. Signal and Channel Models

The BS precodes the transmitted symbol for the $k$th user, denoted as $s_k \in \mathbb{C}$, using a precoding vector $\mathbf{f}_k \in \mathbb{C}^{N \times 1}$. The received signal at the $k$th user is given by

$$y_k = \mathbf{h}_k^H \mathbf{f}_k s_k + \sum_{l \neq k} \mathbf{h}_k^H \mathbf{f}_l s_l + n_k, \tag{1}$$

where $\mathbf{h}_k \in \mathbb{C}^{N \times 1}$ is the channel between the BS and the $k$th user, and $n_k \sim \mathcal{N}(0, \sigma^2)$ is the receive noise at the $k$th user. A geometric channel model is considered, where the channel between the BS and the $k$th user is represented as a sum of $L_k$ multi-path components, given by

$$\mathbf{h}_k = \sum_{l=1}^{L_k} \alpha_{k,l}\, \mathbf{a}(\phi_{k,l}, \theta_{k,l}), \tag{2}$$

where $\alpha_{k,l}$ is the complex gain, and $\mathbf{a}(\phi_{k,l}, \theta_{k,l})$ is the array response vector at the BS, which depends on the azimuth and elevation angles of departure (AoD) $\phi_{k,l}$ and $\theta_{k,l}$, respectively.

B. Digital Twin Model

Wireless digital twins are virtual representations of the physical communication environment, which can be used to generate synthetic channel information [2]. In the real world, wireless channels are determined by three factors: (i) The communication environment $\mathcal{E}$ comprises the positions, orientations, dynamics, and shapes of the BS, users, and surrounding objects, e.g., reflectors and scatterers. (ii) The signal propagation law $g(\cdot)$ determines how signals propagate through the environment. (iii) The hardware characteristics $\mathcal{S}$ specify the physical properties and impairments of communication hardware. Given these components, the channel set for the $K$ users $\mathcal{H} = \{\mathbf{h}_1, \ldots, \mathbf{h}_K\}$ can be expressed as

$$\mathcal{H} = g(\mathcal{E}, \mathcal{S}). \tag{3}$$

Digital twins approximate the communication environment using a 3D EM model and simulate signal propagation via ray tracing. Hardware characteristics can be modeled through a combination of measurement and analytical techniques. Thus, the synthetic channel set $\widetilde{\mathcal{H}} = \{\tilde{\mathbf{h}}_1, \ldots, \tilde{\mathbf{h}}_K\}$ generated by the site-specific digital twin can be written as

$$\widetilde{\mathcal{H}} = \tilde{g}(\widetilde{\mathcal{E}}, \widetilde{\mathcal{S}}), \tag{4}$$

where $\tilde{g}(\cdot)$, $\widetilde{\mathcal{E}}$, and $\widetilde{\mathcal{S}}$ denote the ray tracing algorithm, the 3D EM model, and the hardware model, respectively.

III. PROBLEM FORMULATION

In this work, we aim to refine the synthetic channel information generated by a low-complexity digital twin using a lightweight deep learning model. We specifically focus on refining the DFT-domain channel information, which is represented by the DFT weights of the user channel. The DFT weights for the $k$th user are computed as $\mathbf{z}_k = |\mathbf{D}^H \mathbf{h}_k| \in \mathbb{R}^{N \times 1}$, where $\mathbf{D} = [\mathbf{d}_1, \ldots, \mathbf{d}_N] \in \mathbb{C}^{N \times N}$ is the DFT matrix comprising $N$ orthogonal DFT vectors. Applying the DFT matrix to the spatial-domain channel can be viewed as projecting the channel onto a set of orthogonal beams, i.e., DFT beams.
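As a minimal sketch of the DFT-weight computation $\mathbf{z}_k = |\mathbf{D}^H \mathbf{h}_k|$, the following Python snippet projects a toy channel onto the DFT beams. The unitary scaling of $\mathbf{D}$, the half-wavelength ULA response `ula_steering`, and the single-path channel are illustrative assumptions; the paper does not specify these details.

```python
import cmath
import math


def dft_matrix(n):
    """Unitary N x N DFT matrix D; column j is the j-th DFT beam d_j."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * cmath.exp(-2j * math.pi * i * j / n) for j in range(n)]
            for i in range(n)]


def dft_weights(h):
    """z = |D^H h|: magnitude of the projection of h onto every DFT beam."""
    n = len(h)
    d = dft_matrix(n)
    # Entry j of D^H h is the inner product d_j^H h.
    return [abs(sum(d[i][j].conjugate() * h[i] for i in range(n)))
            for j in range(n)]


def ula_steering(n, sin_angle, spacing=0.5):
    """Hypothetical ULA response for one path (spacing in wavelengths)."""
    return [cmath.exp(-2j * math.pi * spacing * i * sin_angle) for i in range(n)]


N = 32
# Toy single-path channel aligned with DFT beam 4:
# spacing * sin_angle = 4 / N, so sin_angle = 0.25.
h = ula_steering(N, sin_angle=0.25)
z = dft_weights(h)
strongest = max(range(N), key=lambda j: z[j])
```

In this toy case all of the channel energy falls on a single beam, so `z` has one dominant entry. A multi-path channel as in (2) would spread its energy over several entries, and these are the values the calibration model is meant to refine.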
Thus, the DFT weights can also be interpreted as the correlation between the user channel and each DFT beam that covers a specific angular direction.

Aside from the DFT weights, the user's position can also provide useful information for refining the channel information. This is because the user's position helps to resolve ambiguities in the DFT weights, especially when the user is in a non-line-of-sight (NLoS) region. Similar DFT weights may require different refinements according to the geometry around the users. Therefore, we consider both the user position and the DFT weights generated by the digital twin as inputs to our calibration model.

The goal of the calibration is to learn a mapping function to refine the DFT weights generated by the low-complexity digital twin. Let $\mathbf{p}_k \in \mathbb{R}^{3 \times 1}$ denote the 3D position of the $k$th user, and $\tilde{\mathbf{z}}_k = |\mathbf{D}^H \tilde{\mathbf{h}}_k| \in \mathbb{R}^{N \times 1}$ represent the DFT weights obtained from the low-complexity digital twin. The mapping function can be denoted as $f(\mathbf{p}_k, \tilde{\mathbf{z}}_k; \Theta)$, where $\Theta$ is the set of learnable parameters. The optimization problem can be formulated as

$$\min_{\Theta} \sum_{k=1}^{K} \left\| \mathbf{z}_k^\star - \hat{\mathbf{z}}_k \right\|^2 \quad \text{s.t.} \quad \hat{\mathbf{z}}_k = f(\mathbf{p}_k, \tilde{\mathbf{z}}_k; \Theta), \ \forall k, \tag{5}$$

where $\mathbf{z}_k^\star$ is the ground truth DFT weights, and $\hat{\mathbf{z}}_k$ is the refined DFT weights predicted by the mapping function. The objective function in (5) minimizes the difference between the refined DFT weights and the ground truth DFT weights for all users. In the next section, we present the proposed training approaches for obtaining the ground truth DFT weights and the model design for the mapping function.

IV. PROPOSED SOLUTION

In this section, we present the proposed digital twin calibration approach to refine the DFT-domain channel information generated by a low-complexity digital twin. We also discuss its application in codebook-based CSI feedback.

A. Key Idea

A wireless digital twin with higher fidelity can provide more accurate synthetic channel information; however, it often comes with higher computational costs, which makes it unsuitable for real-time applications. In contrast, a low-complexity digital twin can generate synthetic channel information more quickly but may compromise accuracy. To address this trade-off, we propose a calibration approach that refines the DFT-domain channel information generated by a low-complexity digital twin. The calibration is performed using a lightweight deep learning model, which offers the following two benefits:
• Fidelity enhancement: A high-fidelity digital twin can provide more accurate synthetic channel information, but its computational demands may be prohibitive for real-time applications. The proposed calibration approach can learn from the high-fidelity digital twin in an offline phase, offering a faster alternative to directly using a high-fidelity digital twin in the online phase.
• Real-world adaptation: DFT-based codebooks are widely used in current standardized wireless systems [1], and it is feasible to collect historical CSI feedback data from these systems. With this data, we can train the calibration model to adapt the digital twin to the real-world communication environment, bridging the gap between synthetic and actual channel information.
In the following subsections, we present the details of the proposed digital twin calibration approach.

B. Training Approaches of the Calibration Model

In this subsection, we introduce two training approaches for the calibration model. This calibration is achieved using deep learning with supervised training. Labeled data for training can be generated in two ways, depending on the target context:
• High-fidelity digital twin: When a high-fidelity digital twin is the refining target, we can directly obtain the true DFT weights from its synthetic channel information.
  In an offline phase, we generate a paired dataset for training that includes user positions and the corresponding DFT weights from both the low-complexity and high-fidelity digital twins. This dataset is then used to train our model to refine the output of the low-complexity twin.
• Real world: When the goal is to adapt the digital twin to the real world, historical CSI feedback from current standardized wireless systems can serve as labels. For example, the 3GPP Type-II codebook allows users to report the best 2 to 4 DFT beam indices along with their corresponding coefficients [1]. By correlating user positions with historical CSI feedback, we can construct the labeled datasets necessary for training the calibration model.

In this work, we adopt the U-Net architecture [6] to learn the calibration. The user's position is first passed through an embedding layer to obtain a fixed-size vector representation, which is then concatenated with the DFT weights. The calibration process can be expressed as

$$\mathbf{e}_k = f_{\text{embed}}(\mathbf{p}_k), \tag{6}$$

$$\hat{\mathbf{z}}_k = f_{\text{UNet}}(\mathbf{e}_k, \tilde{\mathbf{z}}_k), \tag{7}$$

where $f_{\text{embed}}(\cdot)$ is the embedding function that maps the user position to a fixed-size vector, and $f_{\text{UNet}}(\cdot)$ is the U-Net model that learns the mapping from the user position and the DFT weights to the refined DFT weights. The loss function for training the calibration model can be defined as the mean squared error (MSE), given by

$$\mathcal{L}(\mathbf{z}_k^\star, \hat{\mathbf{z}}_k) = \left\| \mathbf{z}_k^\star - \hat{\mathbf{z}}_k \right\|_2^2. \tag{8}$$

In the first training approach, $\mathbf{z}_k^\star$ is obtained from the high-fidelity digital twin, allowing us to use the full DFT weights as ground truth. For the second training approach, $\mathbf{z}_k^\star$ is derived from historical CSI feedback, which provides the top $P$ DFT beam indices and their corresponding coefficients. Consequently, $\mathbf{z}_k^\star$ is subject to the constraint that only $P$ codewords are selected, i.e., $\|\mathbf{z}_k^\star\|_0 = P$.

TABLE I. Proposed U-Net Model Architecture

Stage    Layer            Module            K.  S.  Input Shape          Output Shape
Encoder  Input & Concat.  MLP               -   -   (B, N) & (B, 3)      (B, 2, N)
Encoder  Layer 1          Conv. Block       3   1   (B, 2, N)            (B, 16, N)
Encoder  Downsample       MaxPool           3   2   (B, 16, N)           (B, 16, N/2)
Encoder  Layer 2          2 × Conv. Blocks  3   1   (B, 16, N/2)         (B, 32, N/2)
Encoder  Downsample       MaxPool           3   2   (B, 32, N/2)         (B, 32, N/4)
Encoder  Layer 3          2 × Conv. Blocks  3   1   (B, 32, N/4)         (B, 64, N/4)
Encoder  Downsample       MaxPool           3   2   (B, 64, N/4)         (B, 64, N/8)
Encoder  Layer 4          2 × Conv. Blocks  3   1   (B, 64, N/8)         (B, 128, N/8)
Decoder  Upconv           ConvTranspose     3   2   (B, 128, N/8)        (B, 64, N/4)
Decoder  Layer 1          2 × Conv. Blocks  3   1   (B, 64 + 64, N/4)    (B, 64, N/4)
Decoder  Upconv           ConvTranspose     3   2   (B, 64, N/4)         (B, 32, N/2)
Decoder  Layer 2          2 × Conv. Blocks  3   1   (B, 32 + 32, N/2)    (B, 32, N/2)
Decoder  Upconv           ConvTranspose     3   2   (B, 32, N/2)         (B, 16, N)
Decoder  Layer 3          2 × Conv. Blocks  3   1   (B, 16 + 16, N)      (B, 16, N)
Output   Output Layer     1 × Conv. Block   3   1   (B, 16, N)           (B, 1, N)
Output   Final Output     Softmax           -   -   (B, 1, N)            (B, 1, N)

Note: K. and S. denote the kernel size and stride, respectively. B denotes the batch size. MLP stands for multi-layer perceptron.

C. Deep Learning Model Architecture

We adopt a 1-dimensional U-Net architecture for the calibration model, as summarized in Table I. The model takes two inputs: the DFT weights and an $N$-dimensional embedded representation of the user's position. The position is processed by a multi-layer perceptron to provide contextual information. The U-Net follows an encoder-decoder structure with skip connections. The encoder down-samples the feature maps through three layers, each consisting of two sequential sets of convolutional, batch normalization, and ReLU layers, followed by a max-pooling operation. The decoder mirrors this, up-sampling the features with transpose convolutions. Skip connections concatenate feature maps from corresponding encoder layers to their upsampled counterparts in the decoder.
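The shapes listed in Table I can be sanity-checked with a short length trace. The sketch below uses the standard output-length formulas for 1-D convolution, pooling, and transposed convolution. The padding of 1 for every layer and the output padding of 1 for the transposed convolutions are assumptions chosen so that the lengths match Table I; the paper states only kernel sizes and strides.

```python
def conv_len(length, kernel, stride, padding):
    """Output length of a 1-D convolution or pooling layer."""
    return (length + 2 * padding - kernel) // stride + 1


def conv_transpose_len(length, kernel, stride, padding, output_padding):
    """Output length of a 1-D transposed convolution."""
    return (length - 1) * stride - 2 * padding + kernel + output_padding


def trace_unet(n, channels=(16, 32, 64, 128)):
    """Return {stage: (channels, length)} for the layout of Table I."""
    shapes = {"input": (2, n)}
    length = n
    # Encoder: convolution blocks keep the length, max-pooling halves it.
    for depth, ch in enumerate(channels, start=1):
        length = conv_len(length, kernel=3, stride=1, padding=1)
        shapes[f"enc{depth}"] = (ch, length)
        if depth < len(channels):
            length = conv_len(length, kernel=3, stride=2, padding=1)
            shapes[f"pool{depth}"] = (ch, length)
    # Decoder: transposed convolutions double the length again.
    for depth, ch in enumerate(reversed(channels[:-1]), start=1):
        length = conv_transpose_len(length, 3, 2, padding=1, output_padding=1)
        shapes[f"up{depth}"] = (ch, length)
        length = conv_len(length, kernel=3, stride=1, padding=1)
        shapes[f"dec{depth}"] = (ch, length)
    shapes["output"] = (1, conv_len(length, kernel=3, stride=1, padding=1))
    return shapes


shapes = trace_unet(32)  # N = 32 antennas, as in the simulation setup
```

With these assumed paddings, the bottleneck has length N/8 and the output returns to length N, which is what allows the skip connections to concatenate encoder and decoder feature maps of equal length.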
The final output layer is a convolutional operation followed by a softmax activation, which produces the refined DFT coefficients.

D. Case Study: Codebook-Based CSI Feedback

Now that we have established the digital twin calibration approach, an important question remains: How can we utilize the refined DFT-domain channel information to assist wireless communication tasks? To answer this question, we consider codebook-based CSI feedback as a case study, which is illustrated in Fig. 1. The existing codebook-based CSI feedback methods, e.g., Type-I/II codebooks in 3GPP [1], require full channel estimation at the user, which incurs high pilot signal overhead, particularly for large-scale MIMO systems. However, with the assistance of digital twins, we can leverage the synthetic channel information to reduce this overhead. Specifically, we use the refined DFT-domain channel information to identify the most relevant DFT beams for each user before the CSI feedback process.

The digital twin aided CSI feedback process consists of the following steps. First, the BS generates synthetic channel information $\tilde{\mathbf{z}}_k$ using the digital twin. Then, the synthetic channel information is refined using the trained calibration model, resulting in $\hat{\mathbf{z}}_k$. The refined channel information serves as prior knowledge for the BS to identify the most relevant codewords for each user. The BS selects the $P$ codewords that have the highest values in $\hat{\mathbf{z}}_k$, which can be expressed as

$$\{\hat{i}_{k,1}, \ldots, \hat{i}_{k,P}\} = \operatorname*{arg\,sort}_{i \in \{1, \ldots, N\}} \, [\hat{\mathbf{z}}_k]_i. \tag{9}$$

The BS then transmits pilot signals precoded by these selected DFT codewords, allowing users to estimate their coefficients. The received signal at user $k$ can be expressed as

$$\hat{\mathbf{y}}_k = \mathbf{h}_k^H \hat{\mathbf{Q}}_k \mathbf{S}_k + \mathbf{n}_k, \tag{10}$$

where $\hat{\mathbf{Q}}_k \in \mathbb{C}^{N \times P}$ is the sub-matrix of the DFT codebook corresponding to the selected codewords, and $\mathbf{S}_k \in \mathbb{C}^{P \times P}$ is the matrix that contains the $P$ transmitted pilot symbols on its diagonal.
Here, $\mathbf{n}_k \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I})$ is the noise vector at the $k$th user. This measurement process can be viewed as projecting the channel onto the selected DFT codewords. Based on the received signal, each user estimates the effective channel coefficients and reports the normalized coefficients, denoted as $\hat{\mathbf{x}}_k \in \mathbb{C}^{P \times 1}$, back to the BS. Finally, the BS reconstructs the CSI from the reported coefficients as a linear combination of the selected codewords, i.e., $\hat{\mathbf{w}}_k = \hat{\mathbf{Q}}_k \hat{\mathbf{x}}_k$, and designs the precoding vector $\hat{\mathbf{f}}_k$ accordingly. This step is similar to existing codebook-based CSI feedback methods. Overall, this digital twin aided approach leverages refined synthetic channel information to reduce downlink pilot transmission overhead while maintaining the same feedback overhead as existing methods. Notably, this approach is not limited to codebook-based CSI feedback but can also be applied to beam selection for mmWave frequency bands [5].

V. SIMULATION RESULTS

A. Simulation Setup

Scenario setup: In the simulation, we consider two ray-tracing scenarios. First, the target scenario represents either a high-fidelity digital twin or a real-world communication environment, with its geometry reflecting the Arizona State University (ASU) campus, as illustrated in Fig. 2. For this scenario, we used Wireless InSite [7], a high-complexity ray tracer, to simulate signal propagation. We set the maximum number of reflections and diffractions to 6 and 1, respectively, and enabled diffuse scattering.

[Fig. 2 labels: BS; BS orientation; User grid.]
Fig. 2. This figure shows the bird's-eye view of the Arizona State University (ASU) campus, which serves as the study area for the simulation. The BS is located at the rooftop of a building at the top right corner, and the user grid is highlighted by the red box.

Second, the baseline scenario serves as a low-fidelity digital twin, utilizing Sionna RT [8] for ray tracing.
In this scenario, we set the maximum number of interactions to 6, with reflections, diffractions, and scattering all enabled. It is important to note that Sionna v0.19.2 assumes diffraction is the only interaction if it exists in a path. While this significantly reduces the computational complexity of the ray-tracing simulation, it can result in less accurate channel information. For both scenarios, we consider a BS equipped with a uniform linear array (ULA) of $N = 32$ antennas, positioned at a height of 22 m. The operating frequency is set to 3.5 GHz. A user grid is deployed near the BS, measuring 410 m × 320 m, with single-antenna users uniformly distributed at a spacing of 2.5 m. Finally, a DFT codebook comprising 32 codewords is adopted, and the number of selected codewords for each user's CSI feedback is set to $P = 4$.

Dataset generation: We generate our dataset by performing ray-tracing simulations in both scenarios. This process yields essential path parameters, including the complex gain $\alpha_{k,l}$, as well as the azimuth and elevation AoDs, $\phi_{k,l}$ and $\theta_{k,l}$. Subsequently, we employ the DeepMIMO channel generator [9] to produce the user channels. The channels of both scenarios are then utilized to construct the paired data necessary for training our calibration model. The training dataset comprises 13191 samples. To evaluate the spatial generalization ability of the calibration model, we also generate an additional 7895 off-the-grid samples. These samples are distinct from the training dataset, created by randomly selecting user positions and simulating their channel coefficients using the ray tracers.

B. Performance Evaluation

In this subsection, we evaluate the performance of the proposed digital twin calibration approach. The main performance metric is the cosine similarity between the ground-truth channel and the estimated CSI, defined as

$$\rho(\mathbf{h}, \hat{\mathbf{w}}) = \frac{|\mathbf{h}^H \hat{\mathbf{w}}|}{\|\mathbf{h}\|_2 \, \|\hat{\mathbf{w}}\|_2}. \tag{11}$$

Cosine similarity measures the directional alignment between the actual channel and the estimated CSI; higher values indicate more accurate CSI and greater potential for precoding performance. Next, we address the following key questions:

What is the impact of a low-complexity digital twin on the performance? We first address this question by comparing the top-4 DFT beam indices selected in the target and baseline scenarios. In Fig. 3, we present heatmaps showing the absolute differences between these top-4 DFT beam indices. The main observation is that larger differences occur when the user is in the NLoS region, which is consistent with the fact that the baseline scenario uses a ray tracer with a simplified diffraction model. Another perspective is to evaluate the performance when the channel from the low-complexity digital twin is used directly as the estimated CSI, i.e., $\hat{\mathbf{w}}_k = \tilde{\mathbf{h}}_k$. This serves as a benchmark to assess the impact of relying solely on a low-complexity digital twin. In Fig. 4, we include this benchmark and show its performance in the cumulative distribution function (CDF) of cosine similarity. The results indicate that directly using the channel from the low-complexity digital twin leads to poor performance, highlighting the necessity of a calibration approach to refine this channel information.

How effective is the proposed calibration approach? To evaluate the effectiveness of our proposed calibration approach, we compute the cosine similarity for the estimated CSI obtained using the top-4 DFT beams selected by our method. For comparison, we compute the cosine similarity for the target and baseline scenarios, where the estimated CSI is derived from the top-4 DFT beams selected by their respective scenarios. These serve as upper and lower bounds for performance. We also compare our proposed method against a grid search benchmark.
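To make the evaluation chain concrete, the following sketch strings together the beam selection in (9), a noise-free stand-in for the measurement in (10), the reconstruction $\hat{\mathbf{w}}_k = \hat{\mathbf{Q}}_k \hat{\mathbf{x}}_k$, and the metric in (11). The toy channel, the unitary DFT scaling, the identity pilots, and the use of the true DFT weights in place of refined ones are all assumptions made for illustration. Coefficient normalization is omitted because (11) is invariant to scaling.

```python
import cmath
import math


def dft_beam(n, j):
    """j-th column of the unitary N x N DFT matrix."""
    return [cmath.exp(-2j * math.pi * i * j / n) / math.sqrt(n) for i in range(n)]


def inner(a, b):
    """Inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def select_beams(z_hat, p):
    """Eq. (9): indices of the P largest DFT weights."""
    return sorted(range(len(z_hat)), key=lambda i: z_hat[i], reverse=True)[:p]


def reconstruct(h, beam_ids):
    """Project h onto the selected beams (x = Q^H h), then rebuild w = Q x."""
    n = len(h)
    beams = [dft_beam(n, j) for j in beam_ids]
    coeffs = [inner(q, h) for q in beams]
    return [sum(c * q[i] for c, q in zip(coeffs, beams)) for i in range(n)]


def cosine_similarity(h, w):
    """Eq. (11): |h^H w| / (||h||_2 ||w||_2)."""
    return abs(inner(h, w)) / (norm(h) * norm(w))


N, P = 8, 2
# Toy channel: strong components on beams 1 and 5, a weak one on beam 3.
gains = {1: 2.0, 5: 1.0, 3: 0.1}
h = [sum(g * dft_beam(N, j)[i] for j, g in gains.items()) for i in range(N)]
# Stand-in for the refined weights: the true DFT weights of h.
z_hat = [abs(inner(dft_beam(N, j), h)) for j in range(N)]
beams = select_beams(z_hat, P)
rho = cosine_similarity(h, reconstruct(h, beams))
rho_single = cosine_similarity(h, reconstruct(h, select_beams(z_hat, 1)))
```

Here `rho` stays slightly below 1 because the weak component on beam 3 is not reported, and `rho_single` is lower still because only one beam is used. The sketch shows why the quality of the selected beam set, rather than the coefficient feedback itself, bounds the achievable cosine similarity.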
Given the user position for evaluation, the grid search method identifies the nearest neighbor in the training user grid of the target scenario and uses the corresponding beams from that position. In Fig. 4, the results indicate that the proposed calibration approach achieves significantly higher cosine similarity compared to the baseline scenario. Also, the proposed method consistently outperforms the grid search benchmark. This shows that the calibration model can effectively leverage both user position and low-fidelity channel information to generalize to unseen user positions. Furthermore, the performance of our proposed approach is close to that of the target scenario, demonstrating its effectiveness in refining low-fidelity channel information.

How much computational overhead can be reduced? To assess the computational efficiency, we analyze the computation time required for the steps involved in the process, measured on an Nvidia RTX A5000 GPU. This includes the time taken for ray tracing, as well as the time required for the refinement. The findings indicate that the computation time for ray tracing using Sionna RT (0.0592 seconds per sample) is much lower than that of Wireless InSite (1.2019 seconds per sample). Also, the number of parameters in our U-Net model is only 181K, leading to a lightweight model where the inference time is minimal (0.0018 seconds per sample). Overall, the total computation time for the combination of Sionna RT and DL-based refinement (0.0592 + 0.0018 = 0.0610 seconds per sample) is substantially lower than that of Wireless InSite alone, roughly a 20-fold reduction. This shows our proposed calibration approach can effectively leverage a low-complexity digital twin to achieve performance comparable to a high-fidelity digital twin, while significantly reducing computational overhead.

Fig. 3. This figure presents the heatmaps illustrating the top-4 DFT beam indices selected by the Sionna RT (baseline) and Wireless InSite (target) ray tracers. The key observation is that larger differences occur when the user is situated in the NLoS region (the upper and lower-left areas of the user grid), which is aligned with the fact that the baseline scenario employs a ray tracer with a simplified diffraction model.

[Fig. 4 plot: x-axis, cosine similarity (0 to 1); y-axis, CDF (0 to 1). Curves: DFT beams based on Wireless InSite channel; proposed method; grid search; DFT beams based on Sionna channel; Sionna channel. Annotations: "The performance gain of DL-based refinement"; "Sionna makes a simpler assumption for diffraction".]
Fig. 4. This figure presents the CDF of the cosine similarity between the ground-truth channel and the estimated CSI using the top-4 DFT beams selected by the proposed calibration approach and the benchmark scenarios. By integrating a lightweight calibration model (inference: 0.0018 s/sample), our approach nears the performance upper bound while maintaining a total cost of 0.0610 s/sample, which is significantly lower than the 1.2019 s/sample required by a high-fidelity digital twin.

VI. CONCLUSION

In this paper, we study the calibration problem of digital twins, which aims to refine the synthetic channel information generated by a low-complexity digital twin using a lightweight deep learning model. We present two training approaches, which can learn from either a high-fidelity digital twin or historical CSI feedback from standardized wireless systems. We demonstrate the effectiveness of the proposed approach with a case study of codebook-based CSI feedback, where the refined synthetic channel information is used to identify the most relevant codewords for each user. Simulation results highlight the effectiveness of our proposed approach in improving CSI acquisition accuracy. For future work, an interesting direction is to extend this framework to support adapting the digital twin to dynamic environments based on limited feedback, such as received power [3].
This would enable the digital twin to continuously learn and adapt to changing conditions, further enhancing its utility in real-world applications.

REFERENCES

[1] 3GPP, "Physical Layer Procedures for Data (Release 15)," 3rd Generation Partnership Project (3GPP), Technical Specification (TS) 38.214, Apr. 2022, version 15.16.0.
[2] A. Alkhateeb, S. Jiang, and G. Charan, "Real-Time Digital Twins: Vision and Research Directions for 6G and Beyond," IEEE Commun. Mag., vol. 61, no. 11, pp. 128–134, 2023.
[3] S. Alikhani and A. Alkhateeb, "Digital Twin Aided Channel Estimation: Zone-Specific Subspace Prediction and Calibration," in 2025 IEEE ICMLCN, 2025, pp. 1–6.
[4] S. Jiang et al., "Learnable Wireless Digital Twins: Reconstructing Electromagnetic Field With Neural Representations," IEEE Open Journal of the Communications Society, vol. 6, pp. 1568–1590, 2025.
[5] M. Giordani et al., "A Tutorial on Beam Management for 3GPP NR at mmWave Frequencies," IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 173–196, 2018.
[6] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," in Proc. of International Conference on Medical Image Computing and Computer Assisted Intervention, 2015, pp. 234–241.
[7] Remcom, "Wireless InSite," http://www.remcom.com/wireless-insite.
[8] J. Hoydis et al., "Sionna RT: Differentiable Ray Tracing for Radio Propagation Modeling," in Proc. of IEEE Globecom Workshops, 2023, pp. 317–321.
[9] A. Alkhateeb, "DeepMIMO: A Generic Deep Learning Dataset for Millimeter Wave and Massive MIMO Applications," in Proc. of Inf. Theory and Appl. Workshop, 2019, pp. 1–8.
