Solar-GECO: Perovskite Solar Cell Property Prediction with Geometric-Aware Co-Attention
📝 Abstract
Perovskite solar cells are promising candidates for next-generation photovoltaics. However, their performance as multi-scale devices is determined by complex interactions between their constituent layers. This creates a vast combinatorial space of possible materials and device architectures, making conventional experiment-based screening slow and expensive. Existing machine learning models address this problem only partially: they focus on individual material properties or neglect the important geometric information of the perovskite crystal. To close this gap, we propose to predict perovskite solar cell power conversion efficiency with a geometric-aware co-attention (Solar-GECO) model. Solar-GECO combines a geometric graph neural network (GNN), which directly encodes the atomic structure of the perovskite absorber, with language model embeddings that process the textual strings representing the chemical compounds of the transport layers and other device components. Solar-GECO also integrates a co-attention module to capture intra-layer dependencies and inter-layer interactions, while a probabilistic regression head predicts both power conversion efficiency (PCE) and its associated uncertainty. Solar-GECO achieves state-of-the-art performance, outperforming several baselines and reducing the mean absolute error (MAE) for PCE prediction from 3.066 (semantic GNN, the previous state-of-the-art model) to 2.936. Solar-GECO demonstrates that integrating geometric and textual information provides a more powerful and accurate framework for PCE prediction.
📄 Content
Machine learning has transformed materials science by enabling fast prediction of isolated material properties, such as bandgap, formation energy, or carrier mobility [1,2,3]. While these efforts have accelerated the discovery of promising compounds, they often focus on single-scale problems where the target property is intrinsic to the material itself. However, in real-world applications, optimal device performance arises from the coupled behavior of multiple components across diverse scales [4]. For complex optoelectronic devices such as perovskite solar cells, efficiency depends not only on the properties of the perovskite absorber, but also on the underlying interactions between transport layers, electrodes, and their interfaces [5]. This multiscale interdependence poses challenges that go beyond conventional single-material property prediction [6].
Perovskite solar cells have achieved rapid progress in laboratory efficiency [7], but their commercialization encounters a fundamental bottleneck: the extensive combinatorial space of potential materials, device architectures, and processing conditions. Each device layer, such as the hole transport layer (HTL), the electron transport layer (ETL), or the encapsulation, can be produced using several candidate materials, each with its own variants in stoichiometry, morphology, and processing [8,9]. Therefore, the total number of possible configurations grows exponentially with the number of layers. Physics-guided and intuition-driven design cannot explore this design space at the pace required by current innovation cycles [10,11]. The result is a gap between the growing diversity of candidate materials and the rate at which optimal full-device architectures can be identified.
39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop: AI4Mat. arXiv: 2511.19263v1 [cs.LG] 24 Nov 2025
Conventional development pipelines often adopt a sequential strategy: first identifying high-performing materials in isolation, then attempting to integrate them into full devices [12]. However, this approach can be misleading because many promising combinations only exhibit their full potential when considered holistically, as interactions between layers can enhance or severely degrade performance [13]. This phenomenon is not unique to perovskite solar cells; similar issues arise in other multicomponent systems such as batteries, catalysts, and thermoelectrics [14]. Bridging the gap between layer-level and full-device performance prediction requires models that can represent both intra-layer properties and inter-layer relationships within the same device.
[Figure caption fragment: The fused representation is used to predict the power conversion efficiency (PCE) of the device and its associated uncertainty (right).]
In this work, we propose to predict perovskite solar cell power conversion efficiency with a geometric-aware co-attention (Solar-GECO) model. Solar-GECO is a hybrid algorithm in which we explicitly integrate crystal-level information from the perovskite layer with device-level architectural context. Unlike prior work on semantic device graphs [6], which relies solely on text embeddings from a large language model (LLM), our method processes the crystal structure of the perovskite absorber directly with a geometric graph neural network (GNN). This allows the extraction of physically grounded features from the crystal structure, as shown in Figure 1. Other layers in the device are encoded using LLM-derived molecular embeddings.
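To make the idea of a geometric input concrete, the sketch below builds a distance-based neighbor graph from the atoms of a crystal unit cell, which is the kind of structure a geometric GNN typically consumes. It is an illustration only, not the paper's featurization: the function name `build_crystal_graph`, the cubic ABX3 cell, the 6.3 Å lattice constant, and the 3.5 Å cutoff are all assumptions chosen for the example.

```python
import itertools
import math


def build_crystal_graph(frac_coords, lattice_a, cutoff):
    """Return directed edges (i, j, distance) for atom pairs within `cutoff`.

    Assumption: a cubic cell with edge `lattice_a` (angstrom). The 27 nearest
    periodic images are checked, which is enough when cutoff <= lattice_a.
    """
    edges = []
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for i, ri in enumerate(frac_coords):
        for j, rj in enumerate(frac_coords):
            for off in offsets:
                if i == j and off == (0, 0, 0):
                    continue  # an atom is not its own neighbor
                # Cartesian displacement from atom i to this image of atom j.
                delta = [(rj[k] + off[k] - ri[k]) * lattice_a for k in range(3)]
                dist = math.sqrt(sum(c * c for c in delta))
                if dist <= cutoff:
                    edges.append((i, j, dist))
    return edges


# Hypothetical cubic ABX3 perovskite cell (fractional coordinates).
species = ["Cs", "Pb", "I", "I", "I"]
frac_coords = [
    (0.0, 0.0, 0.0),  # A site
    (0.5, 0.5, 0.5),  # B site
    (0.5, 0.5, 0.0),  # X sites
    (0.5, 0.0, 0.5),
    (0.0, 0.5, 0.5),
]
edges = build_crystal_graph(frac_coords, lattice_a=6.3, cutoff=3.5)
pb_degree = sum(1 for i, _, _ in edges if species[i] == "Pb")
print(f"{len(edges)} directed edges; Pb has {pb_degree} neighbors")
```

With this cutoff only the B-X bonds survive, so the B-site atom recovers its octahedral coordination through the periodic images. A text-only encoding of the formula would not expose this bond geometry, which is the motivation the paragraph above gives for a geometric encoder.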
In Solar-GECO, we also introduce self-attention and cross-attention mechanisms to jointly model intra-layer dependencies and inter-layer interactions, capturing how the interaction between atoms in the perovskite layer and the context layers of the device propagates to device performance. Significant variability in the PCE may derive from fabrication process sensitivities, including parameters like humidity and annealing temperature, and from unmodeled factors such as human error, material insolubility, and poor wettability. To account for some of this inherent uncertainty, our model is trained with a Gaussian negative log-likelihood (NLL) loss function, so that it predicts the PCE together with its uncertainty. We evaluate Solar-GECO on a curated subset of the Perovskite Database [15] and the Materials Project [16], achieving state-of-the-art performance. Our main contributions are as follows:
• We propose a novel model, Solar-GECO, that combines crystal-level geometric GNN encoding of the perovskite absorber with LLM-based molecular embeddings for the context device layers.
• We introduce a co-attention module that combines self-attention within layers and cross-attention across layers, enabling mutual refinement of graph and text representations.
• Solar-GECO models uncertainty in PCE predictions by training with a Gaussian NLL loss.
• Our model achieves state-of-the-art performance for PCE prediction in perovskite solar devices.
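The cross-attention and Gaussian NLL ideas listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: there are no learned projection matrices, the feature vectors are made up, and the names `cross_attention` and `gaussian_nll` are hypothetical. The loss is the standard Gaussian negative log-likelihood, $\tfrac{1}{2}\left[\log(2\pi) + \log\sigma^2 + (y-\mu)^2/\sigma^2\right]$, parameterized here by the log-variance.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return out


def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under N(mean, exp(log_var))."""
    return 0.5 * (math.log(2 * math.pi) + log_var
                  + (y - mean) ** 2 / math.exp(log_var))


# Toy features (assumed): two atom embeddings and three device-layer embeddings.
atom_feats = [[1.0, 0.0], [0.0, 1.0]]
layer_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Co-attention in both directions: atoms read layers, layers read atoms.
atoms_ctx = cross_attention(atom_feats, layer_feats, layer_feats)
layers_ctx = cross_attention(layer_feats, atom_feats, atom_feats)

# For a large error, a larger predicted variance gives a lower loss.
confident = gaussian_nll(y=20.0, mean=15.0, log_var=0.0)
uncertain = gaussian_nll(y=20.0, mean=15.0, log_var=math.log(25.0))
print(confident > uncertain)
```

The last comparison shows why this loss yields an uncertainty estimate: when the prediction misses, the model is penalized less if it also reported a large variance, while the log-variance term discourages it from reporting large variance everywhere.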
Property prediction in materials. Several machine learning approaches have been applied to predict material properties. E