# Fileset

[2504.20481v2.pdf](https://mdr.nims.go.jp/filesets/6ffe5718-f584-4211-8462-cce18f00a380/download)

## Creator

Giacomo Tenti, [Kousuke Nakano](https://orcid.org/0000-0001-7756-4355), Michele Casula

## Rights


## Other metadata

[Self-Consistency Error Correction for Accurate Machine Learning Potentials from Variational Monte Carlo](https://mdr.nims.go.jp/datasets/b0a6afa5-0e89-450a-8778-95f82a194887)

## Fulltext

Self-consistency error correction for accurate machine learning potentials from variational Monte CarloSelf-consistency error correction for accuratemachine learning potentials from variationalMonte CarloGiacomo Tenti,∗,† Kousuke Nakano,‡,¶ and Michele Casula§†International School for Advanced Studies (SISSA), Via Bonomea 265, 34136 Trieste, Italy‡Center for Basic Research on Materials, National Institute for Materials Science (NIMS), 1-2-1Sengen, Tsukuba, Ibaraki 305-0047, Japan¶Center for Emergent Matter Science (CEMS), RIKEN, 2-1 Hirosawa, Wako, Saitama, 351-0198,Japan.§Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie (IMPMC), SorbonneUniversité, CNRS UMR 7590, MNHN, 4 Place Jussieu, 75252 Paris, France* E-mail: gtenti@sissa.itAbstractVariational Monte Carlo (VMC) can be used to train accurate machine learning interatomicpotentials (MLIPs), enabling molecular dynamics (MD) simulations of complex materials ontime scales and for system sizes previously unattainable. VMC training sets are often basedon partially optimized wave functions (WFs) to circumvent expensive energy optimizationsof the whole set of WF parameters. However, frozen variational parameters lead to VMCforces and pressures not consistent with the underlying potential energy surface, a bias calledthe self-consistency error (SCE). Here, we demonstrate how the SCE can spoil the accuracyof MLIPs trained on these data, taking high-pressure hydrogen as test case. We then apply arecently introduced SCE correction [Phys. Rev. B 109, 205151 (2024)] to generate unbiased1arXiv:2504.20481v2  [cond-mat.str-el]  9 Nov 2025gtenti@sissa.ithttps://arxiv.org/abs/2504.20481v2VMC training sets based on a Jastrow-correlated single determinant WF with frozen Kohn-Sham orbitals. The MLIPs generated within this framework are significantly improved and canapproach in quality those trained on datasets built with fully optimized WFs. Our conclusionsare further supported by MD simulations, which show how MLIPs trained on SCE-correcteddatasets systematically yield more reliable physical observables. Our framework opens thepossibility of constructing extended high-quality training sets with VMC.1 IntroductionSince their introduction,1,2 machine learning interatomic potentials (MLIPs) have revolutionizedthe field of molecular dynamics (MD).3–12 These models can learn the potential energy surface(PES) of a given system starting from a representative set of configurations, called the training set.Each element of this set is labeled by its physical properties (e.g., energy, ionic forces, pressure,etc.), usually computed using ab initio electronic structure methods. After it is trained, an MLIPis capable of reproducing the observables of the system on new configurations with the accuracyof the original method but with a computational cost of several orders of magnitude smaller.2This has opened the possibility of running MD simulations at ab initio accuracy with sizes andlengths that were previously out of reach.13 By far, the most common method used for generatingMLIPs datasets is density functional theory (DFT),14 which can easily produce ∼ 104-105 train-ing points with modest computational resources, usually sufficient to train data-heavy machinelearning frameworks.7,15 For several systems, however, DFT fails to capture the strong effect ofelectronic correlations or may provide results that strongly depend on the choice of the exchange-correlation functional. Quantum chemistry methods, such as the density matrix renormalizationgroup (DMRG),16 and the complete active space self-consistent field (CASSCF),17,18 provide amuch better description of correlation, but their applicability to periodic extended systems is lim-ited by their large computational cost. Because of this, even if significant progress has been maderecently, also thanks to the implementation of new machine learning schemes,19 the range of appli-cability of these approaches remains limited to a few tens of atoms and a few ps-long trajectories.2Quantum Monte Carlo (QMC) methods are a series of stochastic algorithms that provide state-of-the-art results for a large class of systems, from molecules to solids,20,21 while being less compu-tationally demanding. Despite this, QMC approaches still have a cost about 10-100 times largerthan DFT. In the context of MLIPs, this implies that the number of training configurations thatcan be generated within a given amount of resources is generally smaller than the one attainablewith mean-field methods. Recently proposed solutions to this problem make use of data-efficientschemes in combination with hierarchical machine learning (or ∆-learning22) to construct QMC-based MLIPs.23,24 However, within the most popular QMC approach, i.e., variational Monte Carlo(VMC), another issue prevents a straightforward application of MLIPs, presented in what follows.The efficiency of VMC is largely due to the complexity of the wave function (WF) employedin the calculations to describe the electronic part. A possible choice for the WF Ψ isΨ = exp(J)×ΦAS,where exp(J) is a bosonic term (i.e., symmetric under particle exchange) called the Jastrow factor,and ΦAS is an antisymmetric (AS) part. Both terms contain variational parameters, whose numbercan lead to flexible WFs. However, there is a practical difficulty in optimizing a large number ofvariables in a stochastic setting, particularly those belonging to the AS part. Thus, only the Jastrowparameters are usually optimized, and ΦAS ≡ΦSD is often taken as a single Slater determinant (SD)obtained from a DFT calculation.21 This gives rise to the frozen-orbital Jastrow Slater determinant(JSD) wave function, an appealing candidate for VMC-based machine learning applications, sinceit provides an excellent tradeoff between accuracy and computational cost. Indeed, the Jastrowoptimization is much less demanding than the full optimization also involving the determinant.Nevertheless, the use of the frozen Slater determinant introduces a bias in the VMC forces andpressures computed with this WF. This bias has been dubbed as the self-consistency error ,25,26and we are going to show that it is particularly relevant in the context of VMC-based MLIPs.Even though machine learning potentials are, by definition, consistent, a biased training set can3nonetheless spoil the final accuracy of the model. A solution would be to use only the energies forthe training; this, however, usually increases by orders of magnitude the number of configurationsnecessary to reach a given accuracy27,28.In Ref. 26, a suitable correction was developed to remove the self-consistency error, addinga relatively small computational overhead to the calculation. The aim of this work is to showthe actual impact of the self-consistency error on an MLIP in a working case scenario, and todemonstrate how its correction can be applied to generate unbiased VMC datasets with a JSDWF for machine-learning applications. As a test case, we consider high-pressure hydrogen, awidely studied system with applications ranging from planet modeling29 to inertial confinementfusion,30–32 and often used to benchmark numerical methods.33 It is also a system where DFTusually fails, when compared with the experimental data,34 making it an ideal testbed for ourpurposes. In particular, we focus on density and temperatures relevant for the Hugoniot curve, i.e.,the set of possible states reachable with a shock wave.35 The deuterium Hugoniot was recentlycomputed in Ref. 24 using MLIPs trained on accurate QMC data. There, the self-consistency errorwas avoided by relaxing the full set of variational parameters in the WF, i.e., including both theJastrow factor and the antisymmetric part. The resulting optimization improves the consistency ofthe forces and pressures and also changes the underlying PES, at a computational cost about 10times larger than the standard frozen-orbital JSD optimization.Here, we explore a different path, by showing that a very similar accuracy can be obtained bytraining models on a frozen-orbital JSD dataset supplemented by the self-consistency error correc-tion. In particular, we compute VMC forces and pressures on the same configurations using a JSDWF, with and without corrections, and use them to train the different models. The performances ofthese MLIPs are compared with the one of an MLIP trained on the fully optimized dataset of Ref.24, which we take as the reference. In this way, we show the relevance of the self-consistency errorin the MLIP generation and the need for its correction, by studying the impact of both the correc-tion and the full optimization of the WF on thermodynamic equilibrium quantities such as pressureand radial distribution functions. This work demonstrates the importance of obtaining unbiased4QMC datasets for MLIPs at the VMC level using cheap-to-optimize JSD-type WFs, including alsoionic forces and stress components.2 MethodsWe briefly present the main methods used in this work. In Sec. 2.1 we introduce general aspects ofMLIPs, focusing on one specific framework, i.e., the kernel ridge regression, and on the loss func-tion used for model training. In Sec.2.2, we review the forces and pressure correction developedin Ref. 26 to solve the self-consistency error in VMC and describe its implementation in our WFrepresentation.2.1 Generalities of MLIPsThe basic goal of any MLIPs is to predict the energy of a given N-atom configuration describedby its ionic positions, i.e., E = E (R1, · · · ,RN). We now suppose that E can be expressed as a sumof atomic contributions,2 each depending on the relative coordinates of all the other atoms withrespect to the i-th one:E =N∑i=1e(Ri) where Ri = (Ri1, . . . ,RiN) with Ri j ≡ R j −Ri. (1)Here, we refer to Ri as the local environment around the i-th atom and to e(Ri) as the local atomicenergy, and suppose the minimum image convention holds when periodic boundary conditions areapplied. Often, the function e(Ri) in Eq.(1) is considered local, i.e., only dependent on the atomscloser than a certain cutoff radius rc. Different types of MLIPs may be distinguished by theirinternal representation of the local atomic energy. Many techniques have been successfully ap-plied over the years, including deep neural networks,2,4,11 Gaussian approximation potentials,36,37kernel ridge regression,23,38 graph neural networks,39,40 and more. Independently of the specific5architecture, e(Ri) usually depends on a set of model parameters p which have to be optimized:e(Ri)≡ e(Ri;p) . (2)Most MLIPs implement a functional form that already satisfies some of the symmetries requiredby physical constraints, such as permutational and rotational invariance. To determine the modelparameters p, we consider a set of Ntrain configurations, i.e., the training set, for which we computethe desired physical quantities using a target numerical method. In particular, we indicate withRµi ≡ (Rµ1 −Rµi , . . . ,RµNµ−Rµi ) the i-th atomic environment belonging to the µ-th configurationof the set, with µ = 1, . . . ,Ntrain, having Nµ atoms and volume Vµ (for simplicity we will considerthe supercell to be cubic). Moreover let Eµref, Fµj,ref and Pµref be the values of the total energy, theforce acting on the j-th atom, and the (isotropic) virial pressure of the configuration. The modelparameters p are then optimized by minimizing a loss function such asL (p) =WE1NtrainNtrain∑µ=1[1NµNµ∑i=1e(Rµi ;p)− 1NµEµref]2+WF1NtrainNtrain∑µ=113NµNµ∑j=1∑α=x,y,z[− ∂∂Rµj,αNµ∑i=1e(Rµi ;p)−Fµj,α,ref]2+WP1NtrainNtrain∑µ=1[− ∂∂VµNµ∑i=1e(Rµi ;p)−Pµref]2. (3)The three terms in Eq.(3) are the mean squared error (MSE) of the energy per atom, ionicforces, and virial isotropic pressure, respectively, and WE , WF , and WP are tunable weights mul-tiplying the different MSEs. Other functional forms of the loss function are possible, includingadditional physical quantities (e.g., charges, all 6 independent components of the stress, etc). Inneural networks, L (p) can be minimized using stochastic optimization methods that computethe gradients using only a small batch of training configurations. In this paper, we primarily useMLIPs based on the kernel ridge regression (KRR) method. Within KRR, the local atomic energy6is expressed ase(R;p) =Nenv∑ν=1pνK (R,Rν) (4)where Rν , ν = 1, . . . ,Nenv belong to a subset of all the local environments in the training set, andK(R,Rν) is the normalized kernel between R and Rν . Following Refs 23 and 24, we adopt akernel based on a variant of the Smooth Overlap of Atomic Positions (SOAP) descriptor41,42(seeApp. A.1), and we select the Nenv local environments using the farthest point sampling method,23according to the distance introduced by the kernel K. Notice how plugging Eq.(4) into the lossfunction (3) and taking the derivatives with respect to the variational coefficients turns the mini-mization problem into a linear system, which is then solved by the conjugate gradient method. Weconclude this Section by stressing the fact that, by definition, any MLIPs will give energy deriva-tives (e.g., forces and virial stress components) that are consistent with their underlying potentialenergy surface. Any effect of potential inconsistency in the dataset, such as the ones studied inthis work, will manifest in the final performance of the models and its dependence on the relativetraining weights WE , WF , and WP. This is demonstrated in Sec.3.2, where we compare the qualityof MLIPs trained on different datasets as a function of WE , WF , and WP.2.2 Self-consistency error in VMC and its correctionWithin VMC, the force acting on the i-th atom is computed usingFVMCi =−〈∂∂RiEL〉−2〈(EL −E)∂∂RilogΨ〉, (5)where ⟨· · · ⟩ indicates the quantum expectation value on the wave function Ψ evaluated stochasti-cally, EL ≡ ĤΨ/Ψ the local energy with Ĥ being the Hamiltonian of the system and E = ⟨EL⟩.To obtain an efficient and statistically meaningful value of the force in Eq. (5) with finite variance,techniques such as the zero-variance zero-bias principle,43 the space warp transformation44 andreweighting43,45–49 are applied. FVMCi is unbiased only when all the variational parameters of theWF are at their variational minimum. As mentioned earlier, this assumption is not true for partially7optimized WFs, such as the frozen-orbital JSD, one of the most popular choices. In this case, anextra termFcorri =−NSD∑k=1∂E∂αSDkdαSDkdRi(6)has to be considered, containing the derivatives of the NSD variational coefficients αSDk included inthe Slater determinant.Discarding the term in Eq.(6) causes an inconsistency between the forces and the underlyingPES computed with a JSD WF, meaning that FVMCi ̸= − dEdRi. This is called the self-consistencyerror (SCE).25,26 An analogous discussion applies to other energy derivatives, such as the virialpressure, that contain a similar bias and for which we can derive similar corrections.In our applications, following the implementation of the TURBORVB package50 (here usedfor all the QMC calculations), we consider a representation of the SD part in terms of the antisym-metrized geminal power (AGP)ΦAGP = A (g(x1,x2) · · ·g(xN−1,xN)) , (7)where xi = (ri,σi) indicates the collective spatial and spin coordinates of the i-th electron, withi = 1, . . . ,N, A is the antisymmetrizer operator and g is a pairing function. In Eq. (7) we alsosuppose N to be even for simplicity. In the spin unpolarized case, the pairing function can beexpressed as the product of a spatial part f and a spin singlet, namelyg(xi,x j) = f (ri,r j)| ↑↓⟩− | ↓↑⟩√2. (8)In TURBORVB the spatial part of the pairing function is written in terms of a localized basis of Latomic orbitals (AOs) φkf (ri,r j) =L∑k=1L∑k′=1λk,k′φθk (ri)φθ ′k′ (r j). (9)In our case, the AOs are taken as Gaussian type orbitals (GTOs). For periodic systems, such as theones considered in this work, the localized basis is appropriately generalized to satisfy periodic or8twisted boundary conditions,51 depending on the phase θ (θ ′) for spin-up (spin-down) electrons.Here, θ =−θ ′, such that the global phase factor is canceled, making the function f invariant undera global translation. Fixing this gauge leads to a numerically stable evaluation of the dλk,k′/dRderivatives implied by Eq. 6.An alternative expression for the function f can also be obtained using molecular orbitals(MOs) Φn(r) = ∑k cn,kφk(r). In particular, when the number M of MOs is equal to N/2, theresulting AGP is exactly equivalent to a single SD, i.e., ΦAGP ≡ ΦSD. Within TURBORVB, theMOs (and the corresponding λk,k′ matrix) are initialized by a local density approximation52 (LDA)DFT calculation. This is performed in a basis consisting of the aforementioned AOs with anadditional one-body Jastrow factor to automatically take into account Kato cusp conditions.50,53Because of this, the electronic integrals entering into the Kohn-Sham equations are evaluated ona real space grid with a suitably chosen lattice space. Finally, we also mention that the basis isregularized by removing any eigenvector of the overlap matrix Sk,k′ = ⟨φk|φk′⟩ with a correspondingeigenvalue (si) satisfying the condition si/smax < εcut,54 where smax is the maximum eigenvalue andεcut is set to 10−7 in this study. This regularization is very general and is also able to automaticallycure the problem of basis-set redundancy coming from delocalized periodic Gaussians.Finally, the expression in Eq. (6) is stochastically evaluated in VMC according toFcorri =−2Re{〈ELL∑k=1L∑k′=1[(Ok,k′ −Ok,k′) dλk,k′dRi]〉}, (10)where Ok,k′ =∂ logΨ∂λk,k′and Ok,k′ = ⟨Ok,k′⟩. In the previous expression, the logarithmic derivativesin Ok,k′ are efficiently computed using the adjoint algorithmic differentiation,47 while the deriva-tives of the parameters,dλk,k′dRi, are evaluated within DFT-LDA using the finite differences method(FDM), as described in Ref. 26. For the virial isotropic pressure, an analogous derivation yieldsthe following expression for the correction term:Pcorr =−2Re{〈ELL∑k=1L∑k′=1[(Ok,k′ −Ok,k′) dλk,k′dV]〉}. (11)9The above Equation can be easily extended to the non-isotropic case.3 ResultsHenceforth, we analyze how the SCE affects the training of MLIPs and discuss how the correctiondescribed in Sec. 2.2 can effectively be implemented to improve the accuracy of the resultingmodels. With this aim, we consider a dataset made of pristine hydrogen configurations for whichwe computed VMC energy, forces, and pressure using different types of WF and with/without biascorrection. The dataset is the same as the one used in Ref. 24 and comprises 558 configurationsof 128 atoms each. These configurations were extracted from DFT-driven MD simulations attemperatures between 4000 K and 20000 K and Wigner-Seitz radii rs between 1.80 and 2.12. Thisrange of thermodynamic conditions corresponds to the expected location of the Hugoniot curve ofdeuterium, defined as the set of states that can be reached by producing a shock wave in a sample.For these configurations, we fully optimized the wave function, going beyond the frozen-orbitalJSD ansatz. As previously mentioned, we take this dataset as the reference one. In Sec. 3.1, westudy the SCE for the same set of configurations as the reference dataset, and show the effectof both forces and pressure corrections. In Sec. 3.2 we directly compare the accuracy of MLIPstrained on a biased dataset (i.e. affected by the SCE) and a corrected one, respectively. We alsocompare both models with the reference dataset used in Ref. 24. Finally, in Sec. 3.3, we analyzethe results of molecular dynamics simulations driven by the different MLIPs.3.1 Correcting the SCE bias in the Hugoniot datasetTo assess the magnitude of the SCE in the Hugoniot dataset, we use a JSD WF with a basis set of[4s2p1d] and [2s2p1d] GTOs for the antisymmetric part and Jastrow factor, respectively. The sameprimitive basis set was used in Ref. 24. We initialize the SD by running an LDA-DFT calculationwith twisted boundary conditions at the Baldereschi point, i.e, at k =(14 ,14 ,14)in crystal units. TheJastrow part is optimized using the stochastic reconfiguration method.5510We assess the consistency of the JSD forces and pressures by calculating the forces and pres-sure on a few selected configurations with two separate approaches: (i) evaluating Eq.(5) (alongwith the analogous expression for pressure) and (ii) using the FDM by fitting the PES of the sys-tem and evaluating the derivative numerically. As previously discussed, the SCE manifests itselfas a difference between these two values. The results are shown in Fig. 1 for one of the forcecomponents and the virial pressure.We observe a significant bias for both quantities, with a discrepancy as large as ∼ 10% oftheir values. This clearly highlights the importance of curing the SCE for the biased dataset. Forthis reason, we applied the corrections of Eqs. (10) and (11) to forces and pressures, respectively.For forces, we computed the numerical derivatives of the parametersdλk,k′dR with the FDM fromDFT-LDA, the theory used to obtain λk,k′ , with a displacement ∆R =±0.003 Å. The derivatives ofλk,k′ with respect to the volume, entering in the pressure correction, were obtained with the FDMusing relative volume variations of ±0.03%. In Fig. 1 we observe that the corrected forces andpressure are perfectly compatible with their values directly computed through the fit of the VMCPES. Fig. 2 shows the values of both the biased and corrected force components (pressures) versusthe forces (pressures) estimated from the PES, for several configurations. Notice how the pressurecorrection acts almost like a rigid shift of ∼ 1 GPa. The root mean squared error (RMSE) of biasedand corrected quantities is also shown, further demonstrating the effectiveness of the correction inremoving the SCE.Sometimes, the added correction significantly increases the error bar σF of the corrected forces.We then disregard the corrections that lead to forces with 3σF > 0.015 Ha/Bohr. For pressures, thethreshold value of 1.5×10−5 is applied to 3σp. See App. A.2 for further details. In principle, theFDM used to estimate the parameter derivativesdλk,k′dR can be replaced with more robust and fasterapproaches (e.g., using linear response theory56). We leave this possibility for future work.11(a) (b)Figure 1: (a) Comparison among the biased pressure evaluated with an expression equivalent toEq. (5) (red diamonds), the corrected pressure obtained by applying Eq. (11) (violet square), andthe pressure obtained by fitting the PES (dashed green line and green diamond) for a selectedconfiguration in the Hugoniot dataset.24 The PES of the system is also shown (green dots anddash-dotted line). (b) Comparison among the biased force evaluated with Eq. (5) (red diamonds),the corrected force obtained by applying Eq. (10) (violet square), and the force calculated by fittingthe PES (dashed green line and green diamond) for the z-component of the second hydrogen atombelonging to the same configuration as in (a). The PES of the system along the atomic displacementis also shown (green dots and dash-dotted line).12(a) (b)Figure 2: (a) Value of the biased (red markers) and corrected (violet markers) pressure for 6 differ-ent 128-atom configurations as a function of the numerical pressure estimated from the derivativeof the PES with respect to the volume. (b) Values of the biased (red markers) and corrected (violetmarkers) force components as a function of the numerical force estimated by the PES for one ofthe configurations belonging to the Hugoniot dataset.24 The dashed line is a reference for perfectconsistency.3.2 Comparison between VMC datasetsTo illustrate how using an SCE-corrected JSD dataset affects the MLIPs generation, we comparethe performances of MLIPs trained on three different datasets: (i) the aforementioned "biaseddataset" obtained with a JSD WF and with forces and pressure computed with the standard expres-sions (Eq. 5), (ii) a "corrected dataset" where the corrections of Eqs. (10) and (11) are applied asdescribed in Sec. 2.2 and (iii) a "reference dataset", i.e., the one introduced in Ref. 24. In particu-lar, the latter was obtained from an improved Jastrow-correlated AGP WF, where the Jastrow andantisymmetric part were both optimized, thanks to a combination of geminal embedded orbitals57to reduce the number of variational parameters and a restricted optimization based on the local-ity of the AGP λk,k′ matrix. The λk,k′ coefficients were optimized for those basis set elements kand k′ centered on atoms closer than a cutoff distance of 4 Bohr radii.24 No significant loweringof the variational energy was found beyond this threshold, suggesting that the dependence of theenergy on the remaining parameters can be disregarded. Thanks to this approach, the SCE on this13dataset was effectively removed and we thus decided to take the models trained on it as the refer-ence. Notice how the explicit optimization of the whole WF, as the one performed in Ref.,24 is aprocedure that is about an order of magnitude more expensive than the partial optimization of thefrozen-orbital JSD WF, and requires advanced minimization techniques.58,59For all the MLIPs, a KRR model is trained on the difference between VMC quantities andDFT ones, the latter using the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional,60following the ∆-learning approach.22–24 In this way, a much higher accuracy is achieved withrelatively small datasets such as the ones used here. The values of the hyperparameters chosen forthese models are reported in App. A.1. Among the 558 configurations of the datasets, we select58 configurations for testing. We investigate the effect of varying the training weights in the lossfunction (Eq. (3)) on the RMSE computed for energy, forces, and pressure of the test set for eachdataset.Since our goal here is to compare MLIPs trained on different datasets (i.e., different trainingvalues for energy E, forces F and pressure P), the RMSE may not be the best metric to assess therelative accuracy of one model with respect to the others. For this reason, we define the relativeimprovement of the model for each system property (X = E,F,P) as:∆X =RMSE(Xtest −Xdummy)−RMSE(Xtest −Xpred)RMSE(Xtest −Xdummy), (12)where Xpred are the model predictions, Xtest the test set values, Edummy =1Ntest∑i∈test Ei, Fdummy = 0,and Pdummy = 0. In other words, ∆X measures the relative improvement of the model predictionon the quantity X with respect to a "dummy" model, corresponding to a perfectly flat PES equal tothe average energy on the test set for all configurations. Notice how ∆X = 1 when RMSE(Xtest −Xpred) = 0, indicating that the model has learned exactly the quantity X , while a value ∆X ≤ 0means that the model has an error equal to, or larger than, a flat model. In the present work, thetest quantities refer to the difference between the VMC and the DFT-PBE baseline, while Edummyis the average difference on the test set.Figs. 3 and 4 show the results obtained for ∆E , ∆F , and ∆P using MLIPs trained with different14values of the training weights WE , WF , and WP. We first analyze what happens by progressivelyincreasing the weight on the forces for the different models without explicitly training on pressures(i.e., WP = 0). The force bias in the biased dataset is evident from the lower ∆F of the correspondingmodel compared to that of both the corrected and reference datasets (see Fig. 3). Notice how, forsmall values of WF/WE , ∆F is negative in the biased case, which in the ∆-learning frameworkmeans that the correction on forces is detrimental. As the weight on forces is increased, all modelsreach a plateau in ∆F , while simultaneously losing some accuracy on the other quantities. Forthe energy, this loss of accuracy is significantly larger in the models trained on the biased dataset,further demonstrating the effect of the SCE.We observe very similar performances between the corrected and reference models. For pres-sure, the models including the SCE correction show the best performances for almost all the valuesof WF . Since the loss does not include the pressure in this case, this is another indication of con-sistency. In Fig. 4 we plot the results obtained by varying the weight on pressure, for two valuesof WF/WE , namely WF/WE = 3/128 and WF/WE = 15/128. Also in this case, a better accuracy isgenerally achieved for the models trained on the corrected and reference datasets.Finally, we mention how these conclusions exclusively depend on the quality of the datasetsthemself, i.e., their internal consistency, and not on the MLIPs specific architecture. To demon-strate this feature, in App. A.3, we discuss the relative improvement ∆X for models obtained withthe MACE40 framework, which show the same behavior as the one found for the KRR MLIPs.3.3 MD results comparisonTo conclude our analysis, we run a set of MD simulations using MLIPs trained on the three differ-ent datasets. For each model, we select the relative training weights WF/WE and WP/WE to havea good tradeoff between the RMSE values on energy, forces, and pressure. In particular, for theMLIPs trained on the reference and corrected weight, we choose the last value of WF/WE beforethe significant decrease in ∆E (i.e., the fourth smallest value in Fig. 3), in order to increase as muchas possible the accuracy on forces without spoiling too much the energy. A similar reasoning is1510 2 100WF/WE0.40.50.60.70.80.9EEnergy10 2 100WF/WE0.20.00.20.4FForces10 2 100WF/WE0.40.20.00.20.40.6PPressureBiased (WP = 0) Corrected (WP = 0) Reference (WP = 0)Figure 3: Relative RMSE variation ∆X (Eq. 12) for energy, forces, and pressure, as a function ofthe force/energy weight ratio in the loss function of Eq. 3 (expressed in atomic units) for MLIPstrained on the biased, corrected and reference datasets, respectively. The weight on the pressure isset to zero for all models.Figure 4: ∆X (Eq. 12) for energy, forces, and pressure, as a function of the pressure/energy weightratio in the loss function of Eq. 3 used for training (expressed in atomic units), for the differentmodels. Solid lines correspond to WE = 1 and WF = 3/128, while dashed lines correspond toWE = 1 and WF = 15/128. The arrows highlight the "best" MLIPs for each of the three datasets,which are used for the MD simulations described in Sec. 3.3.16used to determine the optimal value of WP/WE (see Fig. 4). For the MLIP trained on the biaseddataset, we choose a larger value of WF/WE , which is necessary to reach a reasonable accuracy onforces. This comes at the price of sacrificing the accuracy on the energy, a direct consequence ofthe SCE. The weights used for each MLIP are summarized in Tab. 1. While the values of the bestrelative training weights WF/WE and WP/WE are model dependent, we highlight that this procedureis generally applicable to any other system.Table 1: Training weights for the MLIPs used in the MD simulations, for each dataset. For boththe forces and pressure weights, we report the relative value with respect to WE . All values are inatomic units.Dataset WF/WE WP/WEBiased 15/128 106/1282Corrected 3/128 106/1282Reference 3/128 106/1282We run MD simulations at three different densities and temperature conditions close to theposition of the deuterium Hugoniot curve given by the VMC-MLIP of Ref. 24. In particular,we considered thermodynamic conditions that span both the molecular liquid and the atomic one.Because of the ∆-learning approach adopted, the resulting efficiency of these calculations is thesame as a DFT Born-Oppeheimer MD simulation. Indeed, the energy, ionic forces, and pressure ofthe system are calculated at each step with DFT, and the corresponding corrections are then addedusing the MLIP. Notice how the DFT baseline can be, in principle, replaced by a faster potential(e.g., using an MLIP trained on DFT data61), improving the overall efficiency. Further detailsregarding the computational setup of the MD simulations are given in App. A.4.A comparison of the equilibrium pressure during the dynamics for three different temperaturesis reported in Tab. 2. Notice that, in principle, the reference dataset and the corrected one can givedifferent results. Indeed, the optimization of the antisymmetric part of the WF not only improvesthe consistency of forces and pressure, but can also modify the PES of the system. Nevertheless,if compared with the biased model, the MLIP trained on the corrected data gives results that areconsistently much closer to the reference ones. This suggests that the main discrepancy between17the biased and reference models comes from the SCE, and that the SCE correction improves con-siderably the physical description of the system. We observe a maximum discrepancy of ∼ 2 GPaat rs = 2.02 and T = 6000 K between the biased and corrected results, while the corrected andreference ones are in statistical accordance within a joint error bar of 0.2 GPa at the same ther-modynamic conditions. Remarkably, this state is very close to the Hugoniot compressibility peak,i.e., the maximum in the density reached by the shocked state along the Hugoniot curve. This ther-modynamic point is very sensitive to changes in the underlying equation of state, and a pressureshift as the one observed here between the corrected and biased models has a significant impact inthe final position of the Hugoniot.62 Similar conclusions can be reached by looking at the radialdistribution function g(r) (Fig. 5). Analogously to the pressure analysis, the g(r) of the correctedmodel is compatible with the one given by the reference one. Also for this quantity the maximumdiscrepancy between the corrected and biased models is observed close to the compressibility peakat T = 6000 K, which corresponds to the onset of the the molecular-to-atomic transition along theHugoniot curve. These results further highlight how SCE forces and pressure corrections signifi-cantly improve the physical outcome of the resulting MLIPs.Table 2: Average pressure from MD simulations obtained with different MLIPs trained on thebiased, corrected, and reference datasets, respectively. All models employ the ∆-learning techniquewith a DFT-PBE baseline.Pbias (GPa) Pcorr (GPa) Pref (GPa)rs = 2.16, T = 4000 K 17.1(1) 17.4(1) 17.4(1)rs = 2.02, T = 6000 K 26.8(1) 28.5(1) 28.7(1)rs = 1.92, T = 8000 K 38.3(1) 39.8(1) 39.9(1)4 ConclusionsMachine learning models have become a key tool to reach system sizes and simulation times notachievable with first principle methods, while showing a similar accuracy. In particular, MLIPstrained on data generated by advanced quantum-mechanical algorithms, such as quantum Monte181 2 3r (Å) -0.200.2 g(r)0123g(r)rs = 2.16, T = 4000 KBiasedCorrectedReference1 2 3r (Å) -0.200.2 g(r)0123g(r)rs = 2.02, T = 6000 K1 2 3r (Å) -0.200.2 g(r)0123g(r)rs = 1.92, T = 8000 KFigure 5: Radial distribution function g(r) for densities and temperatures close to the Hugoniotcurve, obtained with MLIPs trained on different datasets. The difference ∆g(r) with respect to themodel trained on the reference dataset is also shown for both the "biased" and "corrected" models.The shaded area indicates the statistical uncertainty in ∆g(r).19Carlo (QMC), are essential to improve the description and our physical understanding of phenom-ena for which standard mean-field methods fail. In this work, we provided evidence on how VMC,in combination with a cheap-to-optimize JSD WF and a force/pressure correction, can be used toconstruct reliable QMC-based training sets. Our results can be summarized in the following threepoints.First, we clearly demonstrated how the SCE affecting derivative quantities (i.e., forces andpressures) computed with a partially optimized WF, such as the frozen-orbital JSD, can heavilyspoil the accuracy of MLIPs trained on those data. This is manifested by a large dependence of themodels’ accuracy on the relative weights in the loss function.Second, we showed how an easy-to-evaluate bias correction, previously developed in Ref. 26,can be applied to remove the SCE from the training set, and improve the accuracy of the resultingMLIPs.Finally, we directly compared equilibrium thermodynamic quantities yielded by MD simula-tions driven by different MLIPs for the deuterium Hugoniot curve, taken as test system. We ob-served how the main discrepancy between MLIPs obtained from frozen-orbital and fully optimizedWF datasets does not necessarily come from a different PES description of the dataset but from theintrinsic SCE affecting forces and pressures computed with partially optimized WFs. Indeed, inour working case provided by high-pressure hydrogen, the MLIPs trained on the corrected datasetgive results in statistical agreement with those yielded by a reference model trained on VMC datawith a fully optimized WF.Our results reveal the importance of appropriately curing forces and pressures when they arecomputed with partially optimized WF in QMC and subsequently used to train MLIPs. The MLIPsquality degradation due to the SCE plaguing these VMC datasets is an effect that has been over-looked in previous applications. This could be even more relevant for heavier elements, since theSCE usually grows with the number of valence electrons.25 Moreover, the SCE correction schemeis general and applicable to systems more complex than hydrogen, because it is based on cheaperDFT calculations. We have also shown that some numerical instabilities coming from the finite20difference approximation used to evaluate the SCE correction can be cured by an appropriate cut-off in the corresponding error bars. The development of a more efficient scheme for evaluating theSCE correction based on linear response theory is currently in progress. The availability of an SCEcorrection framework used in combination with other approaches, such as ∆-learning, to furtherincrease the precision of the final models, opens the way to the efficient generation of accurateMLIP based on QMC electronic structure calculations.AcknowledgmentsK.N. is grateful for computational resources from the Numerical Materials Simulator at NationalInstitute for Materials Science (NIMS), and from Research Institute for Information Technology atKyushu University under the category of General Projects on the supercomputer GENKAI. M.C.acknowledges GENCI for providing computational resources on the CEA-TGCC Irene supercom-puter cluster under project number A0170906493 and the TGCC special session. M. C. is gratefulto EuroHPC for the computational grant EHPC-EXT-2024E01-064 allocated on Leonardo boosterpartition.K.N. acknowledges financial support from MEXT Leading Initiative for Excellent Young Re-searchers (Grant No. JPMXS0320220025), from Murata Science and Education Foundation (GrantNo. M24AN006), and from Japan Science and Technology Agency (JST), PRESTO (Grant No. JP-MJPR24J9). M. C. thanks the European High Performance Computing Joint Undertaking (JU) forthe partial support through the "EU-Japan Alliance in HPC" HANAMI project (Hpc AlliaNce forApplications and supercoMputing Innovation: the Europe - Japan collaboration).21Associated ContentData Availability StatementThe ab initio QMC package used in this work, TurboRVB, is available from its GitHub repositoryhttps://github.com/sissaschool/turborvb.All data necessary for reproducing the results reported in this work is available in the followingGitHub repository https://github.com/giacomotenti/unbiased_qmc_MLA AppendixA.1 Kernel regression models parametersIn this Appendix we provide further details on the hyperparameters and specific kernels used forthe models described in Sec. 3.2. The KRR models use a kernel as the one described in Ref. 23.Each environment is described by a density function written as a sum of Gaussian functions (withspread ε) multiplied by a localized function fc depending on a cutoff radius rc, i.e.,ρ (r,Ri) ∝ ∑|Ri j|≤rcfc(∣∣Ri j∣∣)exp(−∣∣r−Ri j∣∣2ε). (13)A similarity kernel between Ri and R j is then defined by averaging the overlap between thetwo densities over a discrete set of Nsym symmetry operations Ûk:K(Ri,R j)=1NsymNsym∑k=1[∫d3rρ(r;Ri)ρ(r;ÛkR j)]n. (14)Finally, the normalized kernel is written asK(Ri,R j)=[K(Ri,R j)K (Ri,Ri)K(R j,R j)]η. (15)22https://github.com/sissaschool/turborvbhttps://github.com/giacomotenti/unbiased_qmc_MLIn the application presented here, we chose rc = 4 Bohr and ε = 1.5 Bohr2 in Eq. (13). InEq. (14) we averaged over the cubic symmetry group (here appropriate since we are using cubiccells) and took n = 2. In Eq. (15) a value of η = 2 was employed.For the local energy (Eq. (4)) we also added an additional pair-wise term as done in Ref.,24using cubic spline functions63 defined on 10 equally spaced grid points between 0.3 Bohr and 5.5Bohr.Finally, the furthest point sampling method was used to select Nenv = 6000 local environmentsamong the training configurations.A.2 Cutoff in the correction errorFigure 6 shows the error bars of the corrected forces in the dataset. We observed that, for cer-tain force components and pressures (∼ 10% of all components), the correction is occasionallyaccompanied by large error bars. We identified the cause of this phenomenon in our current work-flow, which estimates the parameter derivativesdλk,k′dRianddλk,k′dVin Eq.10 and Eq.11 using theFDM by performing multiple DFT calculations at displaced coordinates.26 On the one hand, whenemploying localized basis sets in DFT, a cutoff is usually applied to reduce basis redundancy byeliminating elements corresponding to overlap matrix eigenvalues below a certain threshold, aswritten in Sec. 2.2. Without this cutoff, the forces and pressures computed according to Eq. 5exhibit large error bars, as reported in Ref. 64. On the other hand, applying this cutoff causesthe basis set to depend on atomic displacements in the FDM approach, as the quality of the ba-sis set changes when ions move. This displacement-induced dependence occasionally deterioratesthe error cancellation in Eqs. 10 and 11, resulting in larger error bars. Another source of largeerror bars is the finite smearing parameter used for molecular orbital (MO) occupations. Sincehydrogen exhibits metallic behavior depending on its density, a smearing technique is necessaryto achieve stable and smooth convergence for all configurations. However, employing smearingmethods can introduce discontinuities when evaluating parameter derivativesdλk,k′dRianddλk,k′dVby23FDM. Additionally, the discrete real-space grid size used in DFT calculations contributes to errordeterioration. Within the FDM approach, we need to perform 3N DFT calculations to computethe correction terms. To keep their computational cost low for large N, we chose a real-space gridsize of 0.153 Bohr3. We underline that the finite-difference step in ionic positions and cell volumefor FDM, the smearing parameter, and the real-space grid have been chosen to provide convergedforces and pressures, such that any possible bias is comparable with their total stochastic errorbars. Yet, the occurrence of large error bars in the SCE correction could not be avoided for someparticular ionic configurations by playing solely with these parameters. These factors indicate thatthe FDM method proposed in Ref. 26 can be improved for metallic systems in the future, althoughit is already robust for insulating systems with band gaps, as we have verified for c-BN, SiC, Si,and diamond.To manage these erratic components, in this study, we retained the biased force values inthe corrected data set whenever three standard deviations exceeded a given threshold, set hereto 0.015 Ha/Bohr. As shown in Fig. 6, this threshold corresponds to a minimum in the standarddeviation distribution for the corrected forces, indicating the onset of departure from normality. Asimilar strategy was employed for pressure, using a threshold of 1.5× 10−5 a.u. We confirmedthat forces and pressures with error bars below the threshold are unbiased, consistent with theircorresponding potential energy surfaces (PESs), as illustrated in Figs.1 and 2. This choice wasfurther validated by analyzing the behavior of ∆E , ∆F , and ∆P (see Eq.(12)) as a function of therelative weight WF/WE in the loss function, using models trained on datasets corrected with differ-ent thresholds for σF . The results, depicted in Fig. 7, confirm that the threshold of 0.015 Ha/Bohryields the best overall performance. Indeed, although ∆E and ∆P both steadily increase with in-creasing threshold values (for fixed WF/WE), the accuracy of forces deteriorates significantly whencorrections with large error bars (corresponding to the largest threshold, i.e. 3σF ≤ 0.04 Ha/Bohr)are included.240 0.015 0.03 0.045 0.06 0.075 0.093 F020000400006000080000100000# counts0 0.015 0.03 0.045 0.063 F05001000150020002500# countsFigure 6: Distribution of three standard deviations for the corrected forces. The inset zooms-in itstail, to highlight the outliers. The vertical line indicates the 3σ threshold chosen in our application,i.e., 0.015 Ha/Bohr.10 2 100WF/WE0.600.650.700.750.800.85EEnergy10 2 100WF/WE0.00.10.20.30.4FForces10 2 100WF/WE0.00.20.40.6PPressure3 0.0043 0.0053 0.0153 0.04Figure 7: ∆X for energy, forces, and pressure, as a function of the force/energy weight ratio in theloss function used for training (expressed in atomic units), for models trained on different versionsof the corrected dataset. Each dataset corresponds to a different value of the 3σF threshold used toaccept/discard the correction. The thresholds in the legend are reported in Ha/Bohr. For pressure,a threshold of 1.5×10−5 a.u. is used. The weight on the pressure was set to zero for all models.25A.3 MACE models comparisonHere, we further support the conclusions made in Sec. 3.2 by carrying out the same analysis usingMACE.40 This is a completely different MLIP architecture, because it is a framework based onequivariant message-passing neural networks with high-body order messages. The scope of thiscomplementary analysis is to demonstrate that the behavior observed with the KRR models isgeneral and only due to the training set consistency, and not linked to any particular implementationof the MLIP. The models consider 256 inner invariant features and a cutoff of 3 Å (effectivelydoubled when considering the message passing step). For each dataset, both training and test setsare the same as the one considered before for the KRR models in Sec. 3.2. The Adam optimizer65is used to find the models parameters. In particular, at each step the gradient of the loss functionis evaluated on a subset of the training set ("batch") of Nb = 4 configurations. The results for thedifferent datasets are reported in Fig. 8, where we show the behavior of ∆X , defined in Eq. (12),for energy, forces, and pressure. They are consistent with those previously discussed in Sec. 3.2,10 2 10 1 100 101WF/WE0.50.60.70.8EEnergy10 2 10 1 100 101WF/WE0.20.00.20.4FForces10 2 10 1 100 101WF/WE0.20.00.20.40.6PPressureBiased (WP = 0) Corrected (WP = 0) Reference (WP = 0)Figure 8: Relative RMSE variation ∆X (Eq. 12) for energy, forces, and pressure, as a function ofthe force/energy weight ratio in the loss function (expressed in atomic units) for MACE MLIPstrained on the biased, corrected and reference datasets, respectively. The weight on the pressure isset to zero for all models.showing the effect of the SCE and the importance of correcting it regardless of the particular choicefor the MLIP framework.26A.4 Molecular dynamics simulation detailsIn this Appendix, we give further details on the MD simulations discussed in Sec. 3.3.At each step, the energy, forces, and pressure were calculated at the DFT level using theQUANTUM ESPRESSO package in its GPU accelerated version66–68 with the PBE functional, andthen summed with those predicted by the different MLIPs. For the DFT simulations, a 60 Ryplane-wave cutoff with a projector augmented wave pseudopotential69 was used together with a4×4×4 Monkhorst-Pack k-point grid. For the MD simulations we used a time step of 0.25 fs anda Langevin thermostat46,70 with damping γ = 0.13 fs−1.After equilibration, we ran each simulation for about 4 ps to obtain the equilibrium quantities.The behavior of pressure as a function of the simulation time for different densities and tempera-tures and for each MLIP is shown in Fig. 9.270.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0Time (ps)1617181920Pressure ( GPa )(a): rs = 2.16, T = 4000 KBiased Corrected Optimized0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5Time (ps)24262830Pressure ( GPa )(b): rs = 2.02, T = 6000 K0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5Time (ps)36384042Pressure ( GPa )(c): rs = 1.92, T = 8000 KFigure 9: Pressure as a function of simulation time for a system of 128 hydrogen atoms at differentthermodynamic conditions: (a) rs = 2.16 and temperature T = 4000K, (b) rs = 2.02 and tempera-ture T = 6000K, and (c) rs = 1.92 and temperature T = 8000K. Each solid line corresponds to anMLIP trained on the different datasets while the dashed lines are the average values.28References(1) Blank, T. B.; Brown, S. D.; Calhoun, A. W.; Doren, D. J. Neural network models of potentialenergy surfaces. J. Chem. Phys. 1995, 103, 4129–4137.(2) Behler, J.; Parrinello, M. Generalized Neural-Network Representation of High-DimensionalPotential-Energy Surfaces. Phys. Rev. Lett. 2007, 98, 146401.(3) Handley, C. M.; Popelier, P. L. A. Potential Energy Surfaces Fitted by Artificial Neural Net-works. J. Phys. Chem. A 2010, 114, 3371–3383.(4) Behler, J. Neural network potential-energy surfaces in chemistry: a tool for large-scale simu-lations. Phys. Chem. Chem. Phys. 2011, 13, 17930–17955.(5) Manzhos, S.; Dawes, R.; Carrington, T. Neural network-based approaches for buildinghigh dimensional and quantum dynamics-friendly potential energy surfaces. Int. J. QuantumChem. 2015, 115, 1012–1020.(6) Botu, V.; Batra, R.; Chapman, J.; Ramprasad, R. Machine Learning Force Fields: Construc-tion, Validation, and Outlook. J. Phys. Chem. C. 2017, 121, 511–522.(7) Deringer, V. L.; Caro, M. A.; Csányi, G. Machine Learning Interatomic Potentials as Emerg-ing Tools for Materials Science. Adv. Mater. 2019, 31, 1902765.(8) Wood, M. A.; Cusentino, M. A.; Wirth, B. D.; Thompson, A. P. Data-driven material modelsfor atomistic simulation. Phys. Rev. B 2019, 99, 184305.(9) Dral, P. O. Quantum Chemistry in the Age of Machine Learning. J. Phys. Chem. Lett. 2020,11, 2336–2347.(10) Noé, F.; Tkatchenko, A.; Müller, K.-R.; Clementi, C. Machine Learning for Molecular Sim-ulation. Annu. Rev. Phys. Chem. 2020, 71, 361–390.29(11) Behler, J. Four Generations of High-Dimensional Neural Network Potentials. Chem. Rev.2021, 121, 10037–10072.(12) Unke, O. T.; Chmiela, S.; Sauceda, H. E.; Gastegger, M.; Poltavsky, I.; Schütt, K. T.;Tkatchenko, A.; Müller, K.-R. Machine Learning Force Fields. Chem. Rev. 2021, 121,10142–10186.(13) Jia, W.; Wang, H.; Chen, M.; Lu, D.; Lin, L.; Car, R.; Weinan, E.; Zhang, L. Pushing the limitof molecular dynamics with ab initio accuracy to 100 million atoms with machine learning.SC20: International conference for high performance computing, networking, storage andanalysis. 2020; pp 1–14.(14) Martin, R. M. Electronic Structure: Basic Theory and Practical Methods; Cambridge Uni-versity Press, 2004.(15) Schleder, G. R.; Padilha, A. C. M.; Acosta, C. M.; Costa, M.; Fazzio, A. From DFT tomachine learning: recent approaches to materials science–a review. J. Phys.: Mater. 2019, 2,032001.(16) Olivares-Amaya, R.; Hu, W.; Nakatani, N.; Sharma, S.; Yang, J.; Chan, G. K.-L. The ab-initiodensity matrix renormalization group in practice. The Journal of Chemical Physics 2015, 142,034102.(17) Jensen, H. J. A.; Jørgensen, P.; Ågren, H. Efficient optimization of large scale MCSCF wavefunctions with a restricted step algorithm. Journal of Chemical Physics 1987, 87, 451–466.(18) Werner, H.-J.; Knowles, P. J. A second order multiconfiguration SCF procedure with optimumconvergence. The Journal of chemical physics 1985, 82, 5053–5063.(19) Rath, Y.; Booth, G. H. Interpolating numerically exact many-body wave functions for accel-erated molecular dynamics. Nature Communications 2025, 16, 2005.30(20) Foulkes, W. M. C.; Mitas, L.; Needs, R. J.; Rajagopal, G. Quantum Monte Carlo simulationsof solids. Rev. Mod. Phys. 2001, 73, 33–83.(21) Becca, F.; Sorella, S. Quantum Monte Carlo approaches for correlated systems; CambridgeUniversity Press, 2017.(22) Ramakrishnan, R.; Dral, P. O.; Rupp, M.; von Lilienfeld, O. A. Big Data Meets QuantumChemistry Approximations: The ∆-Machine Learning Approach. J. Chem. Theory Comput.2015, 11, 2087–2096.(23) Tirelli, A.; Tenti, G.; Nakano, K.; Sorella, S. High-pressure hydrogen by machine learningand quantum Monte Carlo. Phys. Rev. B 2022, 106, L041105.(24) Tenti, G.; Nakano, K.; Tirelli, A.; Sorella, S.; Casula, M. Principal deuterium Hugoniot viaquantum Monte Carlo and ∆-learning. Phys. Rev. B 2024, 110, L041107.(25) Tiihonen, J.; Clay, I., Raymond C.; Krogel, J. T. Toward quantum Monte Carlo forces onheavier ions: Scaling properties. J. Chem. Phys. 2021, 154, 204111.(26) Nakano, K.; Casula, M.; Tenti, G. Efficient calculation of unbiased atomic forces in ab initiovariational Monte Carlo. Phys. Rev. B 2024, 109, 205151.(27) Chmiela, S.; Tkatchenko, A.; Sauceda, H. E.; Poltavsky, I.; Schütt, K. T.; Müller, K.-R.Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 2017, 3,e1603015.(28) Christensen, A. S.; Von Lilienfeld, O. A. On the role of gradients for machine learning ofmolecular energies and forces. Mach. Learn.: Sci. Technol. 2020, 1, 045018.(29) Helled, R.; Mazzola, G.; Redmer, R. Understanding dense hydrogen at planetary conditions.Nat. Rev. Phys. 2020, 2, 562–574.(30) Hu, S. X.; Militzer, B.; Goncharov, V. N.; Skupsky, S. First-principles equation-of-state tableof deuterium for inertial confinement fusion applications. Phys. Rev. B 2011, 84, 224109.31(31) Craxton, R. S.; Anderson, K. S.; Boehly, T. R.; Goncharov, V. N.; Harding, D. R.;Knauer, J. P.; McCrory, R. L.; McKenty, P. W.; Meyerhofer, D. D.; Myatt, J. F.; Schmitt, A. J.;Sethian, J. D.; Short, R. W.; Skupsky, S.; Theobald, W.; Kruer, W. L.; Tanaka, K.; Betti, R.;Collins, T. J. B.; Delettrez, J. A.; Hu, S. X.; Marozas, J. A.; Maximov, A. V.; Michel, D. T.;Radha, P. B.; Regan, S. P.; Sangster, T. C.; Seka, W.; Solodov, A. A.; Soures, J. M.;Stoeckl, C.; Zuegel, J. D. Direct-drive inertial confinement fusion: A review. Phys. Plasmas2015, 22, 110501.(32) Hu, S. X.; Goncharov, V. N.; Boehly, T. R.; McCrory, R. L.; Skupsky, S.; Collins, L. A.;Kress, J. D.; Militzer, B. Impact of first-principles properties of deuterium–tritium on inertialconfinement fusion target designs. Phys. Plasmas 2015, 22, 056304.(33) Bonitz, M.; Vorberger, J.; Bethkenhagen, M.; Böhme, M. P.; Ceperley, D. M.; Filinov, A.;Gawne, T.; Graziani, F.; Gregori, G.; Hamann, P.; Hansen, S. B.; Holzmann, M.; Hu, S. X.;Kählert, H.; Karasiev, V. V.; Kleinschmidt, U.; Kordts, L.; Makait, C.; Militzer, B.; Mold-abekov, Z. A.; Pierleoni, C.; Preising, M.; Ramakrishna, K.; Redmer, R.; Schwalbe, S.;Svensson, P.; Dornheim, T. Toward first principles-based simulations of dense hydrogen.Phys. Plasmas 2024, 31, 110501.(34) Monacelli, L.; Casula, M.; Nakano, K.; Sorella, S.; Mauri, F. Quantum phase diagram ofhigh-pressure hydrogen. Nature Physics 2023, 19, 845–850.(35) Duvall, G. E.; Graham, R. A. Phase transitions under shock-wave loading. Rev. Mod. Phys.1977, 49, 523–579.(36) Bartók, A. P.; Payne, M. C.; Kondor, R.; Csányi, G. Gaussian Approximation Potentials: TheAccuracy of Quantum Mechanics, without the Electrons. Phys. Rev. Lett. 2010, 104, 136403.(37) Bartók, A. P.; Csányi, G. Gaussian approximation potentials: A brief tutorial introduction.Int. J. Quantum Chem. 2015, 115, 1051–1057.32(38) Rupp, M.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. Fast and Accurate Model-ing of Molecular Atomization Energies with Machine Learning. Phys. Rev. Lett. 2012, 108,058301.(39) Batzner, S.; Musaelian, A.; Sun, L.; Geiger, M.; Mailoa, J. P.; Kornbluth, M.; Molinari, N.;Smidt, T. E.; Kozinsky, B. E (3)-equivariant graph neural networks for data-efficient andaccurate interatomic potentials. Nat. commun. 2022, 13, 2453.(40) Batatia, I.; Kovacs, D. P.; Simm, G.; Ortner, C.; Csanyi, G. MACE: Higher Order Equivari-ant Message Passing Neural Networks for Fast and Accurate Force Fields. Adv. Neural Inf.Process. 2022; pp 11423–11436.(41) Bartók, A. P.; Kondor, R.; Csányi, G. On representing chemical environments. Phys. Rev. B2013, 87, 184115.(42) De, S.; Bartók, A. P.; Csányi, G.; Ceriotti, M. Comparing molecules and solids across struc-tural and alchemical space. Phys. Chem. Chem. Phys. 2016, 18, 13754–13769.(43) Assaraf, R.; Caffarel, M. Zero-variance zero-bias principle for observables in quantum MonteCarlo: Application to forces. The Journal of Chemical Physics 2003, 119, 10536–10552.(44) Umrigar, C. J. Two aspects of quantum monte carlo: Determination of accurate wavefunctionsand determination of potential energy surfaces of molecules. Int. J. Quantum Chem. 1989, 36,217–230.(45) Assaraf, R.; Caffarel, M. Computing forces with quantum Monte Carlo. J. Chem. Phys. 2000,113, 4028–4034.(46) Attaccalite, C.; Sorella, S. Stable Liquid Hydrogen at High Pressure by a Novel Ab InitioMolecular-Dynamics Calculation. Phys. Rev. Lett. 2008, 100, 114501.(47) Sorella, S.; Capriotti, L. Algorithmic differentiation and the calculation of forces by quantumMonte Carlo. J. Chem. Phys. 2010, 133, 234111.33(48) Filippi, C.; Assaraf, R.; Moroni, S. Simple formalism for efficient derivatives and multi-determinant expansions in quantum Monte Carlo. J. Chem. Phys. 2016, 144, 194105.(49) Nakano, K.; Raghav, A.; Sorella, S. Space-warp coordinate transformation for efficient ionicforce calculations in quantum Monte Carlo. J. Chem. Phys. 2022, 156, 034101.(50) Nakano, K.; Attaccalite, C.; Barborini, M.; Capriotti, L.; Casula, M.; Coccia, E.;Dagrada, M.; Genovese, C.; Luo, Y.; Mazzola, G.; Zen, A.; Sorella, S. TurboRVB: A many-body toolkit for ab initio electronic simulations by quantum Monte Carlo. J. Chem. Phys.2020, 152, 204121.(51) Dovesi, R.; Erba, A.; Orlando, R.; Zicovich-Wilson, C. M.; Civalleri, B.; Maschio, L.;Rérat, M.; Casassa, S.; Baima, J.; Salustro, S.; Kirtman, B. Quantum-mechanical condensedmatter simulations with CRYSTAL. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2018, 8, e1360.(52) Perdew, J. P.; Zunger, A. Self-interaction correction to density-functional approximations formany-electron systems. Phys. Rev. B 1981, 23, 5048–5079.(53) Kato, T. On the eigenfunctions of many-particle systems in quantum mechanics. Commun.Pure Appl. Math. 1957, 10, 151–177.(54) Azadi, S.; Cavazzoni, C.; Sorella, S. Systematically convergent method for accurate totalenergy calculations with localized atomic orbitals. Phys. Rev. B 2010, 82, 125112.(55) Sorella, S. Green Function Monte Carlo with Stochastic Reconfiguration. Phys. Rev. Lett.1998, 80, 4558–4561.(56) Toulouse, J. Introduction to the calculation of molecular properties by response theory.https://hal.science/hal-03934866/document, 2018.(57) Sorella, S.; Devaux, N.; Dagrada, M.; Mazzola, G.; Casula, M. Geminal embedding schemefor optimal atomic basis set construction in correlated calculations. J. Chem. Phys. 2015, 143,244112.34https://hal.science/hal-03934866/document(58) Holzmann, M.; Ceperley, D. M.; Pierleoni, C.; Esler, K. Backflow correlations for the electrongas and metallic hydrogen. Phys. Rev. E 2003, 68, 046707.(59) Slootman, E.; Poltavsky, I.; Shinde, R.; Cocomello, J.; Moroni, S.; Tkatchenko, A.; Filippi, C.Accurate Quantum Monte Carlo Forces for Machine-Learned Force Fields: Ethanol as aBenchmark. Journal of Chemical Theory and Computation 2024, 20, 6020–6027.(60) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized Gradient Approximation Made Simple.Phys. Rev. Lett. 1996, 77, 3865–3868.(61) Tenti, G.; Jäckl, B.; Nakano, K.; Rupp, M.; Casula, M. Hydrogen liquid-liquid transitionfrom first principles and machine learning. 2025; https://arxiv.org/abs/2502.02447.(62) Clay, R. C.; Desjarlais, M. P.; Shulenburger, L. Deuterium Hugoniot: Pitfalls of thermody-namic sampling beyond density functional theory. Phys. Rev. B 2019, 100, 75103.(63) Xie, S. R.; Rupp, M.; Hennig, R. G. Ultra-fast interpretable machine-learning potentials. npjComput. Mater. 2023, 9, 162.(64) Nakano, K.; Morresi, T.; Casula, M.; Maezono, R.; Sorella, S. Atomic forces by quan-tum Monte Carlo: Application to phonon dispersion calculations. Phys. Rev. B 2021, 103,L121110.(65) Kingma, D. P.; Ba, J. Adam: A Method for Stochastic Optimization. 3rd International Con-ference on Learning Representations (ICLR), San Diego, California, USA, May 7–9. 2015.(66) Giannozzi, P.; Baroni, S.; Bonini, N.; Calandra, M.; Car, R.; Cavazzoni, C.; Ceresoli, D.;Chiarotti, G. L.; Cococcioni, M.; Dabo, I.; Corso, A. D.; de Gironcoli, S.; Fabris, S.;Fratesi, G.; Gebauer, R.; Gerstmann, U.; Gougoussis, C.; Kokalj, A.; Lazzeri, M.; Martin-Samos, L.; Marzari, N.; Mauri, F.; Mazzarello, R.; Paolini, S.; Pasquarello, A.; Paulatto, L.;Sbraccia, C.; Scandolo, S.; Sclauzero, G.; Seitsonen, A. P.; Smogunov, A.; Umari, P.; Wentz-35https://arxiv.org/abs/2502.02447covitch, R. M. QUANTUM ESPRESSO: a modular and open-source software project forquantum simulations of materials. J. Phys.: Condens.Matter 2009, 21, 395502.(67) Giannozzi, P.; Andreussi, O.; Brumme, T.; Bunau, O.; Nardelli, M. B.; Calandra, M.; Car, R.;Cavazzoni, C.; Ceresoli, D.; Cococcioni, M.; Colonna, N.; Carnimeo, I.; Corso, A. D.;de Gironcoli, S.; Delugas, P.; DiStasio, R. A.; Ferretti, A.; Floris, A.; Fratesi, G.; Fugallo, G.;Gebauer, R.; Gerstmann, U.; Giustino, F.; Gorni, T.; Jia, J.; Kawamura, M.; Ko, H.-Y.;Kokalj, A.; Küçükbenli, E.; Lazzeri, M.; Marsili, M.; Marzari, N.; Mauri, F.; Nguyen, N. L.;Nguyen, H.-V.; de-la Roza, A. O.; Paulatto, L.; Poncé, S.; Rocca, D.; Sabatini, R.; Santra, B.;Schlipf, M.; Seitsonen, A. P.; Smogunov, A.; Timrov, I.; Thonhauser, T.; Umari, P.; Vast, N.;Wu, X.; Baroni, S. Advanced capabilities for materials modelling with Quantum ESPRESSO.J. Phys.: Condens.Matter 2017, 29, 465901.(68) Giannozzi, P.; Baseggio, O.; Bonfà, P.; Brunato, D.; Car, R.; Carnimeo, I.; Cavazzoni, C.;de Gironcoli, S.; Delugas, P.; Ferrari Ruffino, F.; Ferretti, A.; Marzari, N.; Timrov, I.;Urru, A.; Baroni, S. Quantum ESPRESSO toward the exascale. J. Chem. Phys. 2020, 152,154105.(69) H.PBE-KJPAW_PSL.1.0.0.UPF pseudopotential from http://pseudopotentials.quantum-espresso.org/legacy_tables/ps-library/h.(70) Ricci, A.; Ciccotti, G. Algorithms for Brownian dynamics. Mol. Phys. 2003, 101, 1927–1931.36http://pseudopotentials.quantum-espresso.org/legacy_tables/ps-library/hhttp://pseudopotentials.quantum-espresso.org/legacy_tables/ps-library/h Introduction Methods Generalities of MLIPs Self-consistency error in VMC and its correction Results Correcting the SCE bias in the Hugoniot dataset Comparison between VMC datasets MD results comparison Conclusions Acknowledgments Associated Content Data Availability Statement Appendix Kernel regression models parameters Cutoff in the correction error MACE models comparison Molecular dynamics simulation details References