Ab initio auxiliary-field quantum Monte Carlo (AFQMC) is a systematically improvable many-body method, but its application to extended solids has been severely limited by unfavorable computational scaling and memory requirements that obstruct direct access to the thermodynamic and complete-basis-set limits. By combining tensor hypercontraction via interpolative separable density fitting with $\mathbf{k}$-point symmetry, we reduce the computational and memory scaling of ab initio AFQMC for solids with an arbitrary basis to $O(N^3)$ and $O(N^2)$, respectively, comparable to diffusion Monte Carlo. This enables direct and simultaneous thermodynamic-limit and complete-basis-set AFQMC calculations across insulating, metallic, and strongly correlated solids, without embedding, local approximations, empirical finite-size corrections, or composite schemes. Our results establish AFQMC as a general-purpose, systematically improvable alternative to diffusion Monte Carlo and coupled-cluster methods for predictive ab initio simulations of solids, enabling accurate energies and magnetic observables within a unified framework.
Accurate simulation of solid-state systems is crucial across many areas, from fundamental science to technologies spanning disciplines such as condensed matter physics, materials science, and chemistry. The de facto workhorse in electronic structure calculations of materials is Kohn-Sham density functional theory (DFT) [1,2]. Although approximate, it balances accuracy and cost ($O(N^3)$, with $N$ being the system size) well enough to have been applied to a broad range of solid-state problems [3][4][5][6][7]. Two challenges in DFT have yet to be overcome: strong correlation [8][9][10] and self-interaction error [11]. Solving these challenges within the DFT framework is an interesting direction, but an alternative active research area is to use many-body methods that go beyond DFT. Because the cost of these many-body methods scales more steeply with system size than that of DFT, one often struggles to reach the thermodynamic limit (TDL) and the complete basis set (CBS) limit with them. We shall briefly review ongoing efforts in this area and highlight the challenges.
One of the most widely used many-body approaches for solids is fixed-node diffusion Monte Carlo (DMC) [12][13][14], which approximately performs imaginary-time evolution. It has an attractive $O(N^3)$ cost per statistical sample [15] and $O(N^2)$ storage cost [16]. Moreover, it works directly in the CBS limit. Two sources of bias in DMC are difficult to control and quantify: pseudopotential and fixed-node errors. The pseudopotential error in DMC has often been a significant source of error [17][18][19], and the development of more accurate pseudopotentials is still an active research area in DMC [20][21][22][23]. The fixed-node error is the bias introduced to control the sign problem and maintain statistical efficiency. While some prior work exists [24,25], quantifying the fixed-node error has been challenging, partly because DMC works in the CBS limit, where obtaining exact theoretical reference results is difficult.

* joonholee@g.harvard.edu
Another popular class of methods applied in a solid-state context is diagrammatic methods, such as the random phase approximation (RPA) [26][27][28][29][30] and coupled-cluster (CC) methods [31][32][33]. In particular, CC with singles, doubles, and perturbative triples (CCSD(T)) is the gold-standard method for gapped systems with mainly dynamic correlation [34][35][36], at a cost of $O(N^7)$ with $O(N^4)$ storage [37,38]. Unlike DMC, these methods can perform all-electron calculations or projector augmented wave calculations, so pseudopotential errors have not been a significant concern. However, these methods work in a finite basis, so one must extrapolate correlation energies to both the CBS limit and the TDL. While the accuracy is reliable, performing CCSD(T) calculations towards these limits often requires local correlation approximations, even for simple solids [39], and these approximations introduce biases of their own. It is also possible to utilize more advanced size corrections [32] and composite corrections via low-level methods [39][40][41] to approximate the TDL. The underlying bias of these corrections is often difficult to gauge.
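To make the formal scalings quoted so far concrete, the following minimal sketch computes how much the cost of each method grows when the system size $N$ is multiplied by a given factor. It is an illustration only: the exponents are those stated in the text, all prefactors are assumed equal (which is not true in practice), and the names `COMPUTE_EXPONENT`, `STORAGE_EXPONENT`, and `growth_factor` are hypothetical helpers introduced here.

```python
# Illustrative only: relative cost growth implied by the formal scalings
# quoted in the text. Prefactors are ignored (an assumption), so these
# numbers say nothing about absolute run times or memory footprints.

# Exponents p in O(N^p), as stated in the text.
COMPUTE_EXPONENT = {"DFT": 3, "DMC": 3, "CCSD(T)": 7}
STORAGE_EXPONENT = {"DMC": 2, "CCSD(T)": 4}


def growth_factor(exponent: int, size_ratio: float) -> float:
    """Factor by which an O(N^exponent) cost grows when N -> size_ratio * N."""
    return size_ratio ** exponent


if __name__ == "__main__":
    ratio = 2.0  # hypothetical choice: double the system size
    for method, p in COMPUTE_EXPONENT.items():
        print(f"{method:8s} compute grows by x{growth_factor(p, ratio):g}")
    for method, p in STORAGE_EXPONENT.items():
        print(f"{method:8s} storage grows by x{growth_factor(p, ratio):g}")
```

Under these assumptions, doubling $N$ multiplies an $O(N^3)$ cost by 8 and an $O(N^7)$ cost by 128, which is why extrapolating CCSD(T) towards the TDL becomes expensive much sooner than for DMC.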
We would also like to mention embedding approaches in which one defines a local impurity problem that is solved accurately, while the rest of the problem is handled by a mean-field method. Dynamical mean-field theory [42][43][44] and density matrix embedding theory [45,46] belong to this category. These approaches help to reach the CBS limit and the TDL of a given impurity method. Still, their accuracy is ultimately limited by the underlying impurity solver and by the locality error (similar in spirit to local correlation methods). Completely removing the locality error is possible by increasing the impurity size, but the (steep) computational scaling of the impurity solver quickly limits this strategy. Because these approaches inevitably introduce locality errors, they motivate alternative methods that can reach the TDL without embedding.

arXiv:2602.16679v1 [cond-mat.str-el] 18 Feb 2026
Driven partly by the success of constrained-path auxiliary-field quantum Monte Carlo [47][48][49] for the Hubbard model and other related lattice models [50][51][52], ab initio phaseless AFQMC [53,54] for studying molecular systems has gained significant attention in recent years [55][56][57][58][59][60][61][62]. Ab initio solid-state systems have been explored far less with AFQMC because of the substantially greater computational cost of faithfully performing the TDL and CBS extrapolations without relying on composite schemes [63], downfolding [64], or DFT-based size corrections [65][66][67][68]. Even when relying on these, prior solid-state AFQMC applications have used relatively coarse $\mathbf{k}$-point sampling [69][70][71]. This restriction is partly due to the $O(N^4)$ computational cost and the $O(N^3)$ storage requirements of AFQMC when used with an arbitrary basis, which ultimately constrains the calculations