# Fileset

[s41524-026-02013-0.pdf](https://mdr.nims.go.jp/filesets/2d0d548d-b4f9-45c9-bc8c-dab3efac9889/download)

## Creator

[Enda Xiao](https://orcid.org/0000-0002-4372-1575), [Terumasa Tadano](https://orcid.org/0000-0002-8132-2161)

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Accurate screening of functional materials with machine-learning potential and transfer-learned regressions: Heusler alloy benchmark](https://mdr.nims.go.jp/datasets/8b4bec3a-3a6c-4cf9-ac4e-04c558b69880)

## Fulltext

**Accurate screening of functional materials with machine-learning potential and transfer-learned regressions: Heusler alloy benchmark**

npj Computational Materials | Article. Published in partnership with the Shanghai Institute of Ceramics of the Chinese Academy of Sciences. https://doi.org/10.1038/s41524-026-02013-0

Enda Xiao¹ & Terumasa Tadano¹,²

We present a machine learning-accelerated high-throughput (HTP) workflow for the discovery of functional materials. As a test case, quaternary and all-d Heusler compounds were screened for stable compounds with large magnetocrystalline anisotropy energy (Eaniso). Structure optimization and evaluation of formation energy and energy above the convex hull were performed using the eSEN-30M-OAM interatomic potential, while local magnetic moments, phonon stability, magnetic stability, and Eaniso were predicted by eSEN models trained on our DXMag Heusler database. A frozen transfer learning strategy was employed to improve accuracy. Candidate compounds identified by the ML-HTP workflow were validated with density functional theory, confirming high predictive precision. We also benchmark the performance of different uMLIPs, discuss the fidelity of local magnetic moment prediction, and demonstrate generalization to unseen elements via transfer learning from a universal interatomic potential.

The high-throughput (HTP) screening approach has emerged as a powerful strategy for accelerating the discovery of novel materials by systematically exploring large chemical spaces computationally or experimentally [1–3]. Density functional theory (DFT)-based HTP workflows have been widely employed to identify materials with target properties. However, as the search space increases, the associated computational cost becomes unsustainable, often restricting screening efforts to a manageable subspace [4,5].
To address this bottleneck, machine learning (ML) offers a promising route by drastically reducing computational costs. In this work, we demonstrate the robust integration of state-of-the-art ML techniques into the HTP workflow (ML-HTP) through a practical case study focused on screening quaternary and all-d Heusler compounds for stable candidates with strong magnetic anisotropy energy (Eaniso).

Initial realizations of the ML-HTP paradigm relied on ML models that utilize compositional descriptors as input features [6–9]. These models directly map chemical formulas to target properties, offering efficiency and simplicity. However, composition-based models are inherently unable to distinguish compounds with identical stoichiometry but different atomic arrangements. One workaround involves assigning layer indices to atomic sites, but this approach fixes the number of sites and can yield inconsistent predictions for symmetry-equivalent structures [10,11]. Crystal graph-based models do not have such drawbacks since they explicitly incorporate structure information as input, capturing structure-property relationships more accurately [12,13]. However, crystal graph-based models introduce an additional computational step, as geometry optimization must precede property prediction.

Although a single DFT optimization typically requires only a few minutes, the cumulative cost of screening a large number of candidate compounds becomes prohibitively expensive, particularly for magnetic systems where multiple magnetic configurations must be considered. A promising solution lies in leveraging universal machine learning interatomic potentials (uMLIPs), which can accelerate structure optimization by several orders of magnitude relative to DFT. The uMLIP field has witnessed rapid advancements in recent years, with many crystal graph-based models proposed.
Despite the conceptual appeal, reliable and robust uMLIP-based structure optimization has only become practical recently, especially following the release of the large-scale and diverse Meta Open Materials 2024 (OMat24) training dataset in 2024 [14]. This is demonstrated in the current work by benchmarking several uMLIPs, ranging from early-stage implementations to state-of-the-art developments.

¹Research Center for Magnetic and Spintronic Materials, National Institute for Materials Science, Tsukuba, Ibaraki, Japan. ²Digital Transformation Initiative Center for Magnetic Materials (DXMag), National Institute for Materials Science, Tsukuba, Ibaraki, Japan. e-mail: Xiao.Enda@nims.go.jp; Tadano.Terumasa@nims.go.jp. npj Computational Materials | (2026) 12:133 | www.nature.com/npjcompumats

With optimized structures, properties can be predicted using a machine learning regression model (MLRM). This approach substantially reduces computational cost, particularly for properties that are expensive to compute via DFT, such as phonon spectra, conductivity, magnetic critical temperature (Tc), and Eaniso. However, training accurate MLRMs typically requires large, high-quality datasets. To overcome this challenge, transfer learning (TL) techniques can be employed to adapt pretrained ML models to new tasks [15–17]. TL leverages models that have already learned generalizable representations from extensive datasets and fine-tunes them using smaller, task-specific datasets.
This strategy enhances predictive accuracy while substantially reducing data requirements.

As a case study, we conducted an ML-HTP screening on Heusler compounds, which have garnered significant attention due to their diverse functional properties, technological potential, and structural complexity [18]. Numerous DFT-HTP screenings have been carried out to identify candidates with different desirable properties [1,4,5,19–21]. In our previous work, we developed the DXMag Computational Heusler Database (HeuslerDB), a comprehensive database encompassing nearly all conventional ternary Heusler compounds. The present study significantly extends the search space to include quaternary and all-d Heusler compounds, targeting stability and Eaniso as key screening criteria. In earlier DFT-HTP studies of Heusler compounds, 10 candidates with large Eaniso were identified out of 286 selected compositions, and 15 among 29,784 Co-based structures [4,5]. This low yield underscores the rarity of such materials and highlights the difficulty of this search problem, making it a stringent test case for ML-HTP approaches.

Previous studies attempted ML approaches to Eaniso in systems outside the Heusler family. They employed early ML models, such as crystal graph convolutional neural networks (CGCNN) and compositional-descriptor models, to predict Eaniso in Fe-Co-N alloys and physically motivated 2D materials selected based on domain knowledge, working on the order of hundreds of compounds [22–24]. Here, we employ state-of-the-art ML methods to extend the scope to hundreds of thousands of compounds with improved accuracy and practicality.

In this work, we demonstrate the use of uMLIP and TL-MLRMs as drop-in replacements for DFT structure optimization and property evaluation within the HTP framework, as illustrated in Fig. 1a. As a practical application, we employed this approach to identify conventional quaternary and all-d Heusler compounds with large Eaniso, while simultaneously satisfying thermodynamic, dynamic, and magnetic stability.
Structure optimization and thermodynamic stability evaluation were performed using the eSEN-30M-OAM uMLIP [25]. The subsequent predictions of local magnetic moment ({mi}), minimum phonon frequency (ωmin), Tc, and Eaniso were performed using MLRMs. The MLRMs were trained via frozen transfer learning, using the eSEN-30M-OAM uMLIP as the base model and fine-tuned using HeuslerDB data and newly computed data. ML-selected candidates were validated through DFT calculations to demonstrate the reliability of this ML-HTP approach. We further examine key factors that influence the performance of such ML-HTP workflows, including the accuracy of uMLIP-based structure optimization, the magnetic configuration prediction, the performance of the frozen transfer learning technique, and generalization to unseen elements. The underlying code for the ML-HTP workflow is made available as the open-source packages MLIP High-throughput Optimization and Thermodynamics (MLIP-HOT) and MLIP Frozen Transfer Learning (MLIP-FTL), which can be found on our group's website and Git repository.

### Results

To accumulate data for training the Eaniso MLRM, we first computed the Eaniso of conventional ternary Heusler compounds within HeuslerDB using DFT. The Eaniso of some Heusler compounds was reported in previous work, and the agreement between our DFT results and those reports is demonstrated in Fig. S1 [4,5]. Among all conventional ternary Heusler compounds, 2190 (7.9%) exhibit an Eaniso magnitude greater than 1 MJ/m³. When further screened for thermodynamic, dynamical, and magnetic stability, only 135 compounds (0.5%) meet both the high-Eaniso and stability criteria; these are presented in Table S2. These low percentages underscore the difficulty of identifying stable, high-Eaniso compounds and highlight the need for more efficient screening methods, as demonstrated in the current work as a case study. The ML-HTP workflow for this case study is summarized in Fig. 1b; detailed computational procedures are provided in Sec.
“Methods”.

For conventional quaternary compounds, we enumerated all combinations where X and Y are transition metals from the d-block (excluding Tc and Hg), and Z is a main-group element from groups 13, 14, or 15 of the p-block. In addition, La and Lu were included for X and Y because their 4f orbitals are either empty or fully filled. This exhaustive enumeration, accounting for symmetry constraints, yielded 131,544 unique compositions. For the all-d Heuslers, we extended the screening space to include d-block transition metals together with La and Lu across all four sites (X1, X2, Y, and Z), resulting in a separate set of 105,763 unique compositions. A schematic of the screened chemical space is presented in the Supplementary Information as Fig. S2.

Fig. 1 | Frozen transfer learning overview and ML-HTP workflow. a Schematic of the development of the MLRM via frozen transfer learning, using the eSEN-30M-OAM uMLIP as the base model. The uMLIP is used to perform structure optimization, formation energy calculation, and convex hull distance evaluation. The MLRM predicts properties from structures. b Workflow of the case study, which identified stable conventional quaternary and all-d Heusler compounds exhibiting strong Eaniso. Counts of quaternary and all-d compounds at each stage are reported as quaternary/all-d. Panel labels recoverable from the figure: in a, the uMLIP (input embedding, output layer for energy, force, and stress) is trained on OMat24, MPtrj, and sAlexandria, and the TL-MLRM (frozen layers, flexible layers, modified output layer) is trained on HeuslerDB to predict properties such as ωmin, Tc, {mi}, and Eaniso; in b, the steps are (a) optimize structure and calculate ΔE and ΔH using the uMLIP, (b) predict ωmin, Tc, {mi}, and Eaniso using the TL-MLRM, and (c) validate using DFT calculation. Decision nodes in b include ΔE < 0 eV/atom, ΔH < 0.22 eV/atom, tetragonal phase, magnetic, phonon stable, Tc > 300 K, and ∣Eaniso∣ > 1 MJ/m³. Stage counts shown in b: 131,544/105,763; 17,382/23,116; 11,988/15,217; 6,053/10,923; 2,532/5,485; 1,171/2,963; 334/924.

#### Validation of ML-HTP selected candidates

Using the eSEN-30M-OAM uMLIP in combination with MLRMs, we screened almost all conventional quaternary and all-d Heusler spaces for stable compounds with high Eaniso. As a result, 334 and 924 candidates were found, respectively. To evaluate the reliability of this ML workflow, all candidates were validated using DFT calculations. The results are summarized in Fig. 2 and detailed data are provided in Tables S3 and S4. The percentages of candidates whose DFT results satisfy the screening criteria (i.e., the precision) are shown as blue and yellow bars for conventional quaternary and all-d Heusler compounds, respectively. The selection criteria include c/a ratio, ΔE, ΔH, {mi}, ωmin, Tc, and Eaniso. For comparison, the precisions of the ML models, measured on the test set of conventional ternary compounds, are also provided as green bars.

In ML-HTP, the c/a ratio, ΔE, and ΔH are obtained from the optimized structure and the corresponding energy by the eSEN-30M-OAM uMLIP. During the structure optimization process, relaxations were performed starting from multiple initial structures, and the relaxed structure with the lowest energy was selected. Since a non-zero Eaniso requires the Heusler compound to adopt a tetragonal phase, we applied a screening threshold of ∣c/a − 1∣ > 0.01 to identify tetragonality. Notably, all ML-selected candidates remained tetragonal in DFT validation, confirming that eSEN-30M-OAM reliably distinguishes between cubic and tetragonal phases. A more detailed discussion of structure optimization and of the prediction performance for the lattice parameters a and c and the c/a ratio, along with the performance of other uMLIPs, is provided in Sec.
“uMLIP optimization performance”.

Using energies of candidate compounds, elements, and competing phases predicted by eSEN-30M-OAM, ΔE and ΔH were calculated. The criteria of ΔE < 0 eV/atom and ΔH < 0.22 eV/atom were employed to identify thermodynamically stable candidates, following the thresholds established in our previous DFT-HTP study [21]. Among ML-selected candidates, 99.1% of conventional quaternary and 97.8% of all-d Heusler compounds were validated to have ΔE(DFT) < 0 eV/atom. Similarly, 96.4% (quaternary) and 98.8% (all-d) of the compounds were found to have ΔH(DFT) < 0.22 eV/atom. These high validation rates demonstrate that a state-of-the-art uMLIP, such as eSEN-30M-OAM, can reliably assess thermodynamic stability.

It is important to note that the eSEN-30M-OAM uMLIP used is a general-purpose, pretrained model without any fine-tuning specific to the Heusler chemical space. Thus, these results highlight its strong generalization, making it an effective drop-in replacement for DFT-based optimization and thermodynamic stability assessment in the HTP workflow. Notably, the model achieves strong performance on the studied magnetic systems, despite not explicitly incorporating magnetic moments into its architecture or training. This strong performance and generalization are particularly valuable for the initial screening of novel material systems, where a uMLIP can greatly reduce the search space by rapidly and reliably estimating optimized structures and thermodynamic stability.

The properties {mi}, ωmin, Tc, and Eaniso were predicted using MLRMs applied to uMLIP-optimized structures. Because Eaniso is a magnetic property, {mi} was predicted, and a screening threshold of ∑∣mi∣ > 0.1 μB/f.u. was applied to identify magnetic compounds. DFT validations confirmed all ML-selected candidates to be magnetic. Moreover, the {mi} MLRM accurately predicts both the magnitude and sign of local moments, as discussed in detail in Sec. “Prediction of local magnetic moment”.
Magnetic system identification is a critical yet computationally demanding step in DFT-HTP, as multiple initial {mi} values must be tested, and the low fraction of magnetic systems in some material families can lead to substantial wasted computation. By incorporating the {mi} MLRM as a pre-screening step, the search space can be substantially reduced.

To identify compounds with dynamic stability, magnetic stability, and large Eaniso, we applied the criteria ωmin > −10 cm⁻¹, Tc > 300 K, and ∣Eaniso∣ > 1 MJ/m³. Among the ML-selected candidates, 89.2% of conventional quaternary and 93.1% of all-d Heusler compounds were validated to have ωmin above −10 cm⁻¹. For magnetic stability, 81.7% of conventional quaternary and 80.4% of all-d Heusler compounds were validated to have Tc above 300 K. For the target property Eaniso, the validation rates were 82.0% and 68.2%, respectively. To assess the sensitivity to the criteria values, we also evaluated the precision using a range of more stringent thresholds. The results, summarized in Fig. S3, show that the selection precision is not sensitive to the threshold values in the investigated range.

The MLRMs were trained exclusively on the training set of conventional ternary Heusler compounds, yet were applied to evaluate quaternary and all-d compositions. Comparing the validation rates with the precisions calculated on the test set of conventional ternary compounds shows that the MLRMs for {mi} and Tc generalize well to these expanded chemical spaces. In contrast, the Eaniso model exhibits lower performance for all-d compounds. This discrepancy can be attributed to the difference in chemical environments: while conventional quaternary compounds retain Z-site elements from the p-block (consistent with the training set), all-d compounds introduce Z elements from the d-block, which were absent during training.
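
The screening criteria quoted in this section can be collected into a single filter. The following is a minimal sketch rather than the MLIP-HOT or MLIP-FTL code: the class name `Predicted`, its field names, and the helper `precision` are hypothetical, and only the threshold values are taken from the text.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Predicted:
    """ML-predicted quantities for one compound (field names are illustrative)."""
    c_over_a: float           # c/a ratio of the uMLIP-relaxed cell
    delta_e: float            # formation energy, eV/atom
    delta_h: float            # energy above the convex hull, eV/atom
    moments: Sequence[float]  # local moments {m_i}, mu_B per formula unit
    omega_min: float          # minimum phonon frequency, cm^-1
    tc: float                 # magnetic critical temperature, K
    e_aniso: float            # anisotropy energy, MJ/m^3


def is_candidate(p: Predicted) -> bool:
    """Apply every threshold stated in the text; the order does not matter here."""
    return (
        abs(p.c_over_a - 1.0) > 0.01                # tetragonal phase
        and p.delta_e < 0.0                         # negative formation energy
        and p.delta_h < 0.22                        # close to the convex hull
        and sum(abs(m) for m in p.moments) > 0.1    # magnetic
        and p.omega_min > -10.0                     # phonon stable
        and p.tc > 300.0                            # magnetically stable
        and abs(p.e_aniso) > 1.0                    # large anisotropy
    )


def precision(dft_confirmed: Sequence[bool]) -> float:
    """Fraction of ML-selected candidates that DFT confirms for one criterion."""
    return sum(dft_confirmed) / len(dft_confirmed)
```

In this sketch, `precision` corresponds to the validation rates reported above: each entry of `dft_confirmed` records whether the DFT value of one ML-selected candidate also satisfies the criterion in question.
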
Eaniso is a sensitive property, influenced by subtle details of the electronic structure, and thus more susceptible to domain shifts than {mi} or Tc.

Relaxing the screening thresholds increases the pool of candidate compounds and might capture promising cases missed initially, at the expense of more false positives. Additionally, the curated list of compounds that meet the stability criteria and the magnetic-system criterion serves as an efficient starting point for the investigation of other functional properties. For readers interested in exploring an expanded candidate list, the full set of ML-predicted data for 131,544 conventional quaternary and 105,763 all-d Heusler compounds will be accessible through HeuslerDB.

Fig. 2 | DFT validation summary of ML-HTP selected compounds. For the ML-selected candidate lists of conventional quaternary (334) and all-d (924) Heusler compounds, the percentages of candidates whose DFT results satisfy the screening criteria (i.e., the ML-HTP precision) are shown as blue and yellow bars. For comparison, the precision of the ML models measured on the test set of conventional ternary compounds is also shown as green bars. The test set size for the c/a ratio, formation energy (ΔE), and energy above the convex hull (ΔH) is 10,000, and for {mi}, ωmin, Tc, and Eaniso it is 10% of the dataset size shown in Table 1.

#### Distribution of strong Eaniso candidates

In addition to validating predictive precision, it is essential to determine whether the ML models capture known physical trends. A well-established insight is that compounds containing 4d and 5d elements typically exhibit larger Eaniso than those composed of 3d elements, owing to the stronger spin–orbit coupling associated with the heavier atomic nuclei of 4d and 5d elements. This behavior is clearly reproduced in the ML-HTP results, as shown in Fig.
3, which presents the distribution of candidate compounds according to the presence of 4d/5d elements. The figure further highlights the specific 4d/5d elements that appear in the identified compounds. For comparison, Fig. 3 also includes the distribution of compounds with DFT-calculated Eaniso magnitudes exceeding 1 MJ/m³.

#### uMLIP optimization performance

In recent years, uMLIPs have advanced rapidly, with numerous new models proposed and trained. To identify the most suitable model for our screening workflow, we benchmarked representative uMLIP models developed since 2023. These include ALIGNN-FF, CHGNet, SevenNet-l3i5, SevenNet-MF-ompa, HIENet, MatterSim-v1, eqV2-S-OAM, eqV2-M-OAM, eqV2-L-OAM, and eSEN-30M-OAM [14,25–30]. The evaluation focused on structure optimization for 10,000 conventional ternary compounds randomly selected from the ground states in HeuslerDB. To identify the global minimum, 14 initial structures were generated by applying strain to the conventional cell (two formula units) and converting it to the primitive cell (one formula unit). Specifically, the a, b, and c axes were uniformly scaled by ±10% and ±30%, or the c-axis alone was varied by ±10%, ±20%, ±30%, ±40%, and ±50%. The lowest-energy structure from these relaxations was selected as the predicted ground state. Convergence tests for all evaluated models are shown in Fig. S4, demonstrating that the selected ground states are well-converged.

The performance of structure optimization was assessed by comparing the predicted lattice constants a and c, and the resulting c/a ratio, with the corresponding DFT values. The results are summarized in Fig. 4. The relative error (RE) is defined as the maximum of ∣xML/xDFT − 1∣ and ∣xDFT/xML − 1∣, where xML and xDFT denote the values predicted by ML and DFT, respectively. We report the fractions of compounds within 5% and 1% RE tolerances.
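
The strain set and the relative-error metric described above can be sketched in a few lines. This is an illustration under stated assumptions, not the benchmark code: the function names `initial_scalings`, `relative_error`, and `fraction_within` are hypothetical, and the scalings are expressed as multiplicative factors on the cell axes.

```python
def initial_scalings():
    """Return (scale_a, scale_b, scale_c) factors for the initial structures.

    Uniform scaling of all axes by +/-10% and +/-30% gives 4 structures;
    scaling the c-axis alone by +/-10% ... +/-50% gives 10 more.
    """
    uniform = [1 + pct / 100 for pct in (-30, -10, 10, 30)]
    c_only = [1 + pct / 100 for pct in (-50, -40, -30, -20, -10, 10, 20, 30, 40, 50)]
    return [(f, f, f) for f in uniform] + [(1.0, 1.0, f) for f in c_only]


def relative_error(x_ml, x_dft):
    """Symmetric relative error: max(|x_ML/x_DFT - 1|, |x_DFT/x_ML - 1|)."""
    return max(abs(x_ml / x_dft - 1.0), abs(x_dft / x_ml - 1.0))


def fraction_within(pairs, tolerance):
    """Fraction of (x_ML, x_DFT) pairs whose relative error is within tolerance."""
    hits = sum(1 for x_ml, x_dft in pairs if relative_error(x_ml, x_dft) <= tolerance)
    return hits / len(pairs)


if __name__ == "__main__":
    print(len(initial_scalings()))  # number of initial structures per compound
```

Taking the maximum of the two ratios makes the metric symmetric in the ML and DFT values, so an overestimate and an underestimate by the same factor are penalized equally.
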
Among the evaluated models, the eSEN-30M-OAM and eqV2 models achieved the highest accuracy at 5% RE, with eSEN-30M-OAM showing slightly better performance at the stricter 1% RE threshold. A key distinction between the two models is the number of local minima encountered: eqV2-L-OAM identified 91,585 local minima, whereas eSEN-30M-OAM identified 32,606. Both counts are much greater than 10,000, confirming the presence of multiple local minima, but are still significantly less than 10,000 × 14. This suggests that many different initial distortions ultimately converge to the same local minimum. Importantly, eSEN-30M-OAM found substantially fewer local minima than eqV2-L-OAM. This difference is attributed to the smoother energy landscape of eSEN-30M-OAM. Convergence tests also demonstrate that eSEN-30M-OAM achieves convergence with fewer initial structures, resulting in lower computational cost for HTP purposes.

The predictive performance for total energy (E), formation energy (ΔE), and energy above the convex hull (ΔH) using uMLIPs was assessed by comparing the uMLIP predictions with DFT values, using the absolute error (AE), ∣xML − xDFT∣. The fractions of compounds with AE below 0.01 and 0.05 eV/atom are shown in Fig. 4. Among the benchmarked models, eSEN-30M-OAM and the eqV2 variants showed the highest accuracy for ΔE and ΔH at the 0.05 eV/atom threshold, with eSEN-30M-OAM displaying a slight drop in accuracy at the more stringent 0.01 eV/atom level. Predicted total energies from the uMLIPs were found to be systematically lower than DFT values, reducing direct agreement; however, this offset also applies to elemental references and competing phases, so the relative quantities ΔE and ΔH remain in strong agreement with DFT. Given its robust performance in both structure optimization and thermodynamic stability, eSEN-30M-OAM was selected for integration into the ML-HTP workflow.

Fig. 3 | Distribution of ML-selected compounds by the elements they contain.
Distribution of ML-selected candidate compounds based on whether 4d or 5d elements are present, and their distribution over the 4d and 5d elements contained. The distribution of DFT-validated strong-Eaniso candidates is also shown.

#### Improvements over existing approaches

Previous studies have reported the performance of composition-based models in predicting lattice constants, ΔE, and ΔH for cubic Heusler compounds. For comparison, we evaluated metrics of eSEN-30M-OAM on the cubic Heusler subset, with results summarized in Table 1. The R² score for the lattice constant a is 0.994, surpassing the previously reported ranges of 0.80–0.94 across different Heusler types and the values of 0.94, 0.979, and 0.987 in other works [8,11,31,32]. Similarly, the R² for ΔE reaches 0.995, outperforming earlier results of 0.80–0.88, 0.93, and 0.982 [8,11,31]. The R² for ΔH is 0.98, exceeding prior values of 0.91 and 0.969 [11,33]. The root mean squared errors (RMSE) for a and ΔE are 0.023 Å and 0.029 eV/atom, respectively, which are significantly lower than the 0.11–0.12 Å and 0.117 eV/atom reported in earlier work [10].

The MLRMs used in screening were trained on the data from HeuslerDB, supplemented with newly computed Tc and Eaniso values using optimized structures in HeuslerDB. Test set metrics are summarized in Table 1 and benchmarked against previously reported results. The MLRM for {mi} achieved an R² score of 0.989. For comparison with prior studies that used total magnetization (mtotal) as the target property, our model yielded R² = 0.986 for mtotal, exceeding earlier values of 0.75–0.89, 0.82, and 0.927 [8,11,34].
For Tc, the model attained R² = 0.91 and a classification accuracy of 0.91, both substantially higher than the previously reported values of R² = 0.76 and 0.73 and an accuracy of 0.73 [6,35].

To the best of our knowledge, no previous study has predicted phonon stability or Eaniso of Heusler compounds using ML. The effectiveness of the ωmin and Eaniso models developed here is supported by the validation results in Sec. “Results”. While phonon stability could also be assessed using a uMLIP combined with phonon calculation methods, our regressor-based approach is motivated by both efficiency and accuracy. Only the minimum frequency is needed for phonon stability assessment; thus, a regressor is sufficient and much faster than calculating the full spectrum. We also evaluated uMLIP + phonon using CHGNet, MatterSim, and eSEN-30M-OAM on a test set of 1000 conventional ternary compounds, and found stability/instability classification accuracies of 62.5%, 74.9%, and 80.2%, respectively, which are substantially below the regressor's 93.6%. Notably, CHGNet misclassifies 67.2% of stable compounds as unstable, showing a strong tendency to underestimate stability, while eSEN weakly overestimates it and MatterSim exhibits a more balanced performance. Additional details, including example phonon spectra by uMLIPs and DFT, are in the Supplementary Information. The Eaniso model achieved an R² of 0.68, lower than those of {mi}, ωmin, and Tc, highlighting the higher sensitivity and complexity of Eaniso as a target property. Nevertheless, despite the reduced R², its classification accuracy remains satisfactory and sufficient for integration into the ML-HTP workflow.

Fig. 4 | Benchmark of uMLIP performance. Lattice constants a and c, c/a ratio, total energy (E), formation energy (ΔE), and convex hull distance (ΔH) predicted by various uMLIPs are benchmarked against DFT references. For each property, the fraction of compounds with predictions falling within specified relative error (RE) or absolute error (AE) thresholds is reported. Energetic quantities (E, ΔE, and ΔH) are expressed in eV/atom. The test set consists of 10,000 ground-state compounds randomly sampled from HeuslerDB.

Table 1 | Performance comparison of the eSEN uMLIP and MLRM with ALIGNN models and previously reported results. The size of the dataset used for each MLRM is also listed.

| Property | Metric | eSEN-30M-OAM | ALIGNN-FF | Previous reports |
| --- | --- | --- | --- | --- |
| a | R² | 0.994 | 0.128 | 0.80–0.94 (a), 0.94 (b), 0.987 (c), 0.979 (d) |
| a | RMSE (Å) | 0.023 | 0.330 | 0.11–0.12 (e) |
| ΔE | R² | 0.995 | 0.453 | 0.80–0.88 (a), 0.93 (b), 0.982 (c) |
| ΔE | RMSE (eV/atom) | 0.029 | 0.310 | 0.117 (e) |
| ΔH | R² | 0.980 | 0.330 | 0.91 (f), 0.969 (c) |

| Property | Metric | eSEN MLRM | ALIGNN MLRM | Previous reports | Dataset size |
| --- | --- | --- | --- | --- | --- |
| {mi} | R² | 0.989 | n/a | n/a | 27,864 |
| mtotal | R² | 0.986 | 0.904 | 0.75–0.89 (a), 0.82 (g), 0.927 (c) | 27,864 |
| ∑∣mi∣ | R² | 0.989 | 0.891 | n/a | 27,864 |
| ωmin | R² | 0.750 | 0.734 | n/a | 8198 |
| Tc | R² | 0.910 | 0.844 | 0.76 (h), 0.73 (i) | 2106 |
| Tc | Accuracy | 0.910 | n/a | 0.73 (h) | 2106 |
| Eaniso | R² | 0.680 | 0.592 | n/a | 6123 |

Table footnotes: (a) ref. 8, dataset size about 1000; (b) ref. 31, dataset size about 65,000; (c) ref. 11, dataset size about 500,000 for a and mtotal, and about 450,000 for ΔE and ΔH; (d) ref. 32, dataset size 143; (e) ref. 10, dataset size 16,272; (f) ref. 33, dataset size 426,148; (g) ref. 34, dataset size 1153; (h) ref. 6, dataset size 408; (i) ref. 35, dataset size 6500.

To benchmark advances in ML techniques since 2023, we applied the ALIGNN-FF uMLIP and ALIGNN MLRM to identify conventional quaternary candidate compounds and evaluated the validation rates of strong-Eaniso compounds [13,26]. In this test, the scalar quantity ∑∣mi∣ was used directly as the target property rather than being calculated from the {mi} prediction. The metrics of ALIGNN-FF and the ALIGNN MLRM are summarized in Table 1. Using all screening thresholds, only 17 compounds qualified as candidates.
To improve statistical robustness, we removed the phonon stability criterion, expanding the candidate list to 107 compounds, of which 26 (24.3%) exhibit ∣Eaniso∣ > 1 MJ/m³. While this yield is notably higher than the 7.9% obtained from direct DFT-HTP screening, it remains far below the 82.0% success rate achieved with the eSEN-based ML-HTP workflow. These results highlight the substantial improvements in screening precision enabled by the state-of-the-art eSEN model.

We further tested a hybrid workflow in which structure optimization was performed with the eSEN-30M-OAM uMLIP, while property prediction was carried out using ALIGNN MLRMs. This approach yielded 276 candidate compounds, of which 149 (54.0%) were confirmed by DFT to exhibit strong Eaniso. The improved yield relative to ALIGNN-FF-based optimization underscores the critical importance of accurate structure optimization with eSEN-30M-OAM for enhancing ML-HTP screening. However, the yield still falls short of the 82.0% achieved by the fully eSEN-based workflow, indicating that progress in both the uMLIP and MLRM components is essential for maximal efficiency. We also tested the inverse hybrid configuration, using the ALIGNN-FF uMLIP for structure optimization combined with eSEN MLRMs for property prediction. This workflow identified 243 candidates, of which only 76 (31.3%) were validated as strong-Eaniso compounds. This marked reduction in performance highlights the pivotal role of selecting an accurate uMLIP for structure optimization.

#### Prediction of local magnetic moment

Since the goal of this study is to identify compounds with large Eaniso, it is first necessary to determine whether a compound is magnetic. Relying solely on the total magnetization is inadequate, as it cannot capture antiferromagnetic (AFM) or low-moment ferrimagnetic (FiM) compounds.
To address this, we employed the total absolute magnetic moment, defined as ∑i∣mi∣, where mi denotes the local magnetic moment at atomic site i.

Local magnetic moments were predicted using an MLRM based on the eSEN architecture, trained to output the moment at each atomic site. Restricting to collinear configurations in which all moments are aligned along the z-axis, each moment is represented by a scalar whose sign encodes the direction, with ℓ = 0 per site in the output head. To account for the z-direction ambiguity, in which a configuration and its sign-inverted counterpart (i.e., all local moments flipped) are physically equivalent, we modified the loss function to compute losses for both the predicted {mi} and its sign-inverted counterpart and to take the smaller value as the loss. This ensures invariance under global spin inversion.

Figure 5a shows a scatter plot comparing {mi} from the MLRM and DFT for compounds in the test set, with histograms along the axes illustrating their distributions. Local moments at the atom site Z are omitted as they are nonmagnetic. Because 73.4% of the test compounds are nonmagnetic, both histograms exhibit a pronounced peak at zero. For magnetic systems, the global sign is adjusted so that the total magnetic moment is positive; since most magnetic compounds are ferromagnetic, positive moments dominate in the distribution. Nearly all points fall along the diagonal, and only 1.4% of points lie along either axis, demonstrating that the model accurately predicts both the magnitude and sign of local moments, and reliably distinguishes ferromagnetic (FM) and ferrimagnetic (FiM) systems. For the magnetic/nonmagnetic classification, we evaluated performance using receiver operating characteristic (ROC) and precision-recall (PR) curves, as shown in Fig. S6. The area under the curve (AUC) values are 0.98 and 0.97, respectively, indicating highly accurate classification.

Predicting {mi} is a common yet computationally demanding step in HTP studies of magnetic materials.
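
As an aside on the loss modification described earlier in this section, the sign-invariant loss can be sketched as follows. This is a minimal illustration, not the MLIP-FTL implementation: the function names are hypothetical, and the use of a mean squared error as the underlying per-compound loss is an assumption, since the text specifies only that the smaller of the two losses is taken.

```python
def mse(pred, target):
    """Mean squared error over the sites of one compound (assumed base loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def sign_invariant_loss(pred, target):
    """Loss that is unchanged when every predicted moment is flipped.

    The loss is evaluated for the prediction and for its globally
    sign-inverted copy, and the smaller of the two values is returned.
    """
    flipped = [-p for p in pred]
    return min(mse(pred, target), mse(flipped, target))


if __name__ == "__main__":
    target = [-1.8, 0.4, 0.0]   # illustrative reference moments, mu_B
    pred = [2.0, -0.5, 0.1]     # same pattern with the opposite global sign
    print(mse(pred, target), sign_invariant_loss(pred, target))
```

Only a global inversion is forgiven by this construction. A prediction in which a single site has the wrong relative sign is still penalized, which is the intended behavior for telling ferromagnetic and ferrimagnetic arrangements apart.
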
The approach developed in this work achieves high accuracy in {mi} prediction and is readily transferable to other systems. A central question, however, is how many training compounds are required to reach satisfactory accuracy. To address this, we performed a learning-curve analysis by training the MLRM on progressively larger subsets of the dataset and evaluating performance on a fixed test set of 5000 compounds, of which 1486 are magnetic. For evaluation, local moments at the X1, X2, and Y sites were concatenated across samples into a single array, while the Z site was excluded since it is nonmagnetic in conventional Heusler compounds.

Fig. 5 | Local magnetic moment prediction performance. (a) Scatter plot comparing ML-predicted {mi} with DFT values for the test set. (b) Learning curves for {mi} prediction. The top panel shows R² scores for both local moments and their magnitudes. The bottom panel reports the magnetic/nonmagnetic classification accuracy, and the fraction of compounds with absolute prediction error below 0.1 μB for all compounds and for the magnetic subset.

The learning curve is shown in Fig. 5b, illustrating how model performance improves with increasing training set size. In the top panel, the R² scores for both local moments and their magnitudes are presented. The gap between the two curves indicates that, while the model generally captures the magnitude accurately, it sometimes assigns the incorrect sign. For example, Mn₂ScGe with DFT-computed local moments {2.62, 3.03, −0.29, −0.10} μB is predicted as {−2.71, 3.04, 0.02, −0.02} μB by the MLRM. The bottom panel reports two key metrics: (i) classification accuracy for identifying magnetic systems and (ii) the fraction of local moments with absolute error below 0.1 μB, evaluated across all compounds and within the magnetic subset.
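The two bottom-panel metrics can be sketched as follows. This is only an illustration under stated assumptions: a compound is labelled magnetic when the summed absolute moment over the X1, X2, and Y sites exceeds 0.1 μB (the threshold quoted in the Methods), the error fraction is counted over concatenated local moments, and any global sign convention has already been applied. The first example compound reuses the first three Mn₂ScGe values quoted above, assuming they are ordered X1, X2, Y; the second compound is hypothetical.

```python
def is_magnetic(moments, threshold=0.1):
    """Label a compound magnetic if its total absolute moment exceeds threshold."""
    return sum(abs(m) for m in moments) > threshold


def classification_accuracy(pred_sets, dft_sets, threshold=0.1):
    """Fraction of compounds whose magnetic/nonmagnetic label is reproduced."""
    hits = sum(
        is_magnetic(p, threshold) == is_magnetic(d, threshold)
        for p, d in zip(pred_sets, dft_sets)
    )
    return hits / len(dft_sets)


def fraction_within(pred_sets, dft_sets, tol=0.1):
    """Fraction of concatenated local moments with absolute error below tol."""
    errors = [
        abs(p - d)
        for ps, ds in zip(pred_sets, dft_sets)
        for p, d in zip(ps, ds)
    ]
    return sum(e < tol for e in errors) / len(errors)


# Local moments in Bohr magnetons: one magnetic and one nonmagnetic compound.
dft = [[2.62, 3.03, -0.29], [0.0, 0.0, 0.0]]
pred = [[-2.71, 3.04, 0.02], [0.01, 0.0, -0.02]]

accuracy = classification_accuracy(pred, dft)
frac_all = fraction_within(pred, dft)

# Restrict the error fraction to compounds that DFT labels as magnetic.
magnetic = [i for i, d in enumerate(dft) if is_magnetic(d)]
frac_magnetic = fraction_within(
    [pred[i] for i in magnetic], [dft[i] for i in magnetic]
)
print(accuracy, frac_all, frac_magnetic)
```

In this toy case the classification is perfect while the error fraction drops sharply on the magnetic subset, which is the same qualitative pattern as in the learning curve: nonmagnetic compounds are easy to get right and inflate the all-compound fraction.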
With 5000 training samples, the model achieves a classification accuracy of 0.92, and 90% of all compounds fall within the 0.1 μB error threshold. However, this fraction decreases to 72% when restricted to the magnetic subset, indicating that the model identifies whether a site is magnetic with high reliability but remains less accurate in predicting exact {mi} values. Increasing the training set to 125,000 samples improves this fraction to 82%, while relaxing the threshold to 0.2 μB further raises it to 92%. Although performance improves with larger datasets, the gains become progressively smaller. These results highlight the critical role of dataset size in improving {mi} accuracy and inform the selection of training size in future work focusing on other magnetic systems.

#### Frozen transfer learning for MLRM construction

To improve the performance of the MLRM, we employed a frozen transfer learning strategy using the eSEN-30M-OAM uMLIP as the base model. The eSEN-30M-OAM uMLIP was trained on the OMat, MPtrj, and sAlexandria databases, providing a comprehensive training set spanning the periodic table [14,27,36–38]. Through this training, the embedding and the first several layers learn general chemical and structure representations. To leverage this, we transferred the parameters of the embedding and the first several layers into our MLRM and kept them fixed (frozen layers), while only updating the remaining layers and the output layer (flexible layers). This approach is analogous to a recent work in which the initial layers of ORB, EqV2, or MACE uMLIP were used to generate feature vectors that were subsequently passed to property prediction models such as MODNet, XGBoost, and MLP [39].

The eSEN-30M-OAM model consists of 10 layers. Figure 6a shows the R² scores of models trained with different numbers of frozen layers, denoted TL-uMLIP-n. In the n = 0 case, the embedding layer is also flexible. Results for ωmin, Tc, and Eaniso are presented.
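The TL-uMLIP-n naming can be made concrete with a small sketch that marks which blocks of a 10-layer model are frozen and which remain flexible for a given n. The block labels ("embedding", "layer_1", ..., "output") and the function name are hypothetical and are not identifiers from the eSEN or FAIRChem code; only the freezing rule itself is taken from the text.

```python
def trainable_flags(n_frozen: int, n_layers: int = 10) -> dict:
    """Return {block label: trainable?} for a TL-uMLIP-n style setup.

    The embedding is flexible only for n = 0, the first n_frozen layers are
    frozen, and the remaining layers and the output layer are updated.
    """
    if not 0 <= n_frozen <= n_layers:
        raise ValueError("n_frozen must lie between 0 and n_layers")
    flags = {"embedding": n_frozen == 0}
    for i in range(1, n_layers + 1):
        flags[f"layer_{i}"] = i > n_frozen
    flags["output"] = True
    return flags


flags = trainable_flags(n_frozen=7)
frozen = [name for name, trainable in flags.items() if not trainable]
flexible = [name for name, trainable in flags.items() if trainable]
print("frozen:  ", frozen)
print("flexible:", flexible)
```

In a PyTorch-based implementation, such flags would typically be applied by setting `requires_grad = False` on the parameters of the frozen blocks so that the optimizer leaves them unchanged; how the authors' modified FAIRChem code does this is not described in the text.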
Model performance improves as the number of frozen layers increases up to n = 7, after which it declines when more layers are frozen. This trend reflects the balance between transferring knowledge from the base model and maintaining sufficient flexibility to adapt to the new task. Based on this analysis, TL-uMLIP-7 was used in the ML-HTP case study for the ωmin, Tc, Eaniso, and {mi} MLRMs. For comparison, a model trained from scratch (w/o TL) is included, which yields lower R² scores and highlights the benefit of transfer learning. We also tested a variant initialized from a base model pre-trained on formation energy (ΔE) data from the HeuslerDB database (denoted TL-ΔE-0). TL-uMLIP-0 outperforms TL-ΔE-0, indicating that initialization from the eSEN-30M-OAM model, trained on a comprehensive dataset, is more advantageous.

Besides improving overall performance, transfer learning from a uMLIP can significantly enhance model generalization to unseen elements. To demonstrate this, we evaluated model performance using group-wise split analysis. For each group-wise split test, three neighboring elements within the same period were used as the holdout elements. Any compound containing at least one of the holdout elements was reserved as the holdout test data; the remaining compounds were split into train/val with a 9:1 ratio. This setup simulates realistic scenarios where models are applied to material systems containing elements not present in the training data, a typical domain shift in materials science. For direct comparison, we also measured performance using train/val/test sets created using random splits while keeping the counts of train/val/test the same as in the group-wise split. We tested six sets of d-block elements and four sets of p-block elements as holdout sets. We used Tc as the target property for efficiency. Results obtained using frozen transfer learning (FTL) and models trained from scratch are shown in Fig. 6b.

Using FTL, performance on group-wise splits is generally lower than on random splits, indicating that the model performance on unseen elements is reduced. The drop in performance for p-block holdouts is smaller, which is consistent with the fact that p-block elements are typically nonmagnetic in these compounds and thus less directly related to the target property Tc. For group-wise split tests with models trained from scratch, a much larger performance decrease is observed. This occurs because FTL transfers the general chemical representation learned by the uMLIP to property regressors, whereas models trained from scratch do not benefit from this prior knowledge. This highlights the advantage of transfer learning from a uMLIP for improving generalization to unseen elements. Variation in random-split performance is due to different train set sizes, since the number of holdout compounds varies with the chosen holdout elements.

It should be noted that TL is not always effective. It can be limited when the representations learned by the base model do not generalize to the target domain. For example, source and target may differ substantially in material class (e.g., bulk versus surface or molecular systems), or the downstream task may concern different physics (e.g., static energetics versus dynamic properties) [15,40,41]. Furthermore, a recent work, which transfers from a model trained using computational data to learn experimental data, revealed that the error of the transfer-learned model decreases according to a power law as the size of the computational data for the base model increases [42]. This suggests that transfer learning may also be less effective when the base model is trained on a small dataset, even if the base model performance within its own domain is good.

Fig. 6 | Frozen transfer learning performance and generalization. (a) R² of models initialized from the eSEN-30M-OAM uMLIP and trained with different numbers of frozen layers (denoted TL-uMLIP-n).
In the n = 0 case, the embedding layer is also trainable. A model trained from scratch (denoted w/o TL) and a transfer learning variant initialized from a base model pre-trained on ΔE data from the HeuslerDB database (denoted TL-ΔE-0) are included for comparison. (b) Comparison of model performance on group-wise splits versus random splits. Models were evaluated on predicting Tc. Results are shown for frozen transfer learning (FTL) and models trained from scratch (w/o TL). For each test, three neighboring elements within the same period were used as the holdout elements. Six sets of d-block elements and four sets of p-block elements were tested.

### Discussion

We demonstrated the feasibility of combining uMLIPs and MLRMs for HTP screening. As a case study, we identified 334 conventional quaternary Heusler compounds and 924 all-d Heusler compounds that exhibit thermodynamic, dynamical, and magnetic stability, together with large Eaniso. The precision of this workflow was confirmed through DFT validation of the candidate list.

For other material systems, if a database for a subset is available, the same workflow can be applied to explore the remaining chemical space. For novel materials, the uMLIP can be used directly to reduce the search space by filtering for thermodynamic stability, while target properties need to be computed for a representative subset of compounds to train MLRMs for ML-HTP screening of the broader space.

The MLRMs in this study involve a domain shift, as the training dataset does not contain quaternary or all-d Heusler compounds, yet the models are applied to these systems. While performance is satisfactory, further gains are possible using an active learning approach. Iteratively refining the models by selecting informative compounds can improve performance and help identify candidates missed by the current one-shot approach.
In such an iterative framework, current DFT validation results can be added to the training set, forming the first iteration of model refinement. This iterative strategy naturally extends the present work, enabling more thorough exploration of chemical space while progressively improving model accuracy. Such an approach is especially valuable when the target material space differs substantially from the training data.

In the framework of this study, uMLIPs perform the critical task of structure optimization, which is traditionally handled by DFT-derived methods, while MLRMs predict the properties of the optimized structures, tasks that are typically carried out using DFT or DFT-derived methods. Together, uMLIPs and MLRMs enable a drop-in replacement for DFT in conventional HTP workflows. This replacement is general and can be readily extended to other properties, material classes, and DFT-based HTP pipelines, enabling accelerated HTP screening and discovery of new materials.

### Methods

#### ML-HTP workflow

The ML-HTP workflow is schematically illustrated in Fig. 1(b) and described in detail below.

In step (a) of the ML-HTP workflow, the structures were optimized and ΔE and ΔH were calculated using the uMLIP. The initial lattice constant was estimated as the average value of known X₂YZ Heusler compounds in the HeuslerDB database that share two elemental species with the target composition. A conventional cell in the cubic phase with this estimated lattice constant was then constructed. To generate initial structures, the lattice parameters of this cell were systematically varied: a, b, and c were uniformly scaled by ±10% or ±30%, and, alternatively, c alone was scaled by ±10% or ±30%. All generated structures were subsequently converted to the primitive unit cell and relaxed using the eSEN-30M-OAM uMLIP. The structure with the lowest energy after relaxation was identified as the ground state.
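The distortion scheme used to seed the relaxations can be sketched as below. The sketch only enumerates the scaled lattice parameters and picks the lowest-energy result; the relaxation itself (performed with the uMLIP in the workflow) is not reproduced, and the function names, the 6.0 Å starting lattice constant, and the example energies are hypothetical. Whether the undistorted cell is also used as a starting point is not stated in the text, so it is left out here.

```python
def initial_cells(a0, scales=(0.10, 0.30)):
    """Return (a, b, c) tuples for the distorted starting cells.

    For each scale s, the cubic cell with lattice constant a0 is distorted
    in two ways: uniform scaling of a, b, c by (1 +/- s), and scaling of c
    alone by (1 +/- s).
    """
    cells = []
    for s in scales:
        for sign in (+1, -1):
            f = 1.0 + sign * s
            cells.append((a0 * f, a0 * f, a0 * f))  # uniform scaling
            cells.append((a0, a0, a0 * f))          # c-axis scaling only
    return cells


def ground_state(energies):
    """Index of the relaxed structure with the lowest energy."""
    return min(range(len(energies)), key=energies.__getitem__)


cells = initial_cells(6.0)  # hypothetical estimated lattice constant in angstrom
for a, b, c in cells:
    print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")

# Hypothetical energies per atom (eV) after relaxing three starting cells.
best = ground_state([-1.00, -1.25, -0.90])
```

Starting from both compressed and elongated c axes gives the relaxation a chance to fall into tetragonal minima that a single cubic starting point could miss under symmetry constraints.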
The choice of eSEN-30M-OAM was motivated by its superior performance relative to other uMLIPs, as discussed in Sec. “uMLIP optimization performance”. The selection of initial structures was validated by a convergence test, which is provided in Fig. S4.

For a compound to be thermodynamically stable against decomposition into its constituent elements or competing phases, the formation energy must be negative (ΔE < 0), and the distance to the convex hull must be zero (ΔH = 0). In practice, however, metastable phases (ΔH > 0) at 0 K may become stabilized at finite temperature [43]. Following our previous work, we adopt a practical stability criterion of ΔE < 0.0 eV/atom and ΔH < 0.22 eV/atom, which has been shown to effectively capture experimentally accessible compounds [21]. Using the energies of ground state candidates, constituent elements, and competing phases predicted by eSEN-30M-OAM, we computed ΔE and ΔH to assess thermodynamic stability.

In step (b) of the workflow, ωmin, {mi}, Tc, and Eaniso were predicted with MLRMs trained on the HeuslerDB and additional computed data. The uMLIP-optimized structures were used as inputs. The construction of these MLRMs is described in Sec. “Prediction of local magnetic moment” and Sec. “Frozen transfer learning for MLRM construction”. Compounds were classified as dynamically stable for ωmin > −10 cm⁻¹, magnetic for ∑i|mi| > 0.1 μB/f.u., magnetically stable for Tc > 300 K, and strong Eaniso candidates for |Eaniso| > 1 MJ/m³. Tetragonal compounds satisfying all of these conditions were designated as promising stable materials with strong Eaniso.

In step (c) of the workflow, the candidate list was validated with DFT calculations to assess the reliability of the ML-HTP workflow. Structure optimizations were performed starting from various initial spin configurations, consistent with our previous DFT-HTP work.
For conventional Heusler compounds, the magnetic moments at the X1, X2, and Y sites were initialized in configurations where they were either parallel or antiparallel to each other. For all-d Heuslers, spin arrangements on all four sites were considered. To capture possible high-spin and low-spin states, two initial magnitudes of the local moments (|mi| = 1 and 4 μB) were tested, along with a nonmagnetic configuration (|mi| = 0). The uMLIP-optimized structures served as the starting geometries. After structure relaxation, the ground state was identified by comparing total energies. For the resulting ground states, we computed ΔE, ΔH, Eaniso, phonons, and Tc using VASP, OQMD, ALAMODE, and SPRKKR [44–49].

#### Computational methods

The uMLIP-based structural optimizations were performed using the Atomic Simulation Environment (ASE) package [50]. The Fast Inertial Relaxation Engine (FIRE) optimizer was employed, with symmetry constraints enforced throughout the relaxation process [51]. To ensure consistency in reference energies for computing ΔE and ΔH, the elemental phases and competing phases were also optimized using the same uMLIP. Their initial geometries were taken from DFT-optimized structures in the Open Quantum Materials Database (OQMD) [44,45].

The MLRMs were developed to predict ωmin, {mi}, Tc, and Eaniso. To leverage prior knowledge, we employed a frozen transfer learning strategy, as illustrated in Fig. 1a. Each MLRM was initialized from the pre-trained eSEN-30M-OAM uMLIP, with the embedding layer and the first seven message-passing layers kept frozen, while the final three layers and the output layer were fine-tuned. This framework was implemented using a modified version of the FAIRChem package (v1) [52]. For the {mi} MLRM training, the loss function was adapted to address the global sign ambiguity of magnetic moments, as described in Sec. “Prediction of local magnetic moment”.
The training dataset consisted of HeuslerDB together with newly computed Eaniso and Tc values based on the optimized structures in HeuslerDB. The data were randomly partitioned into training, validation, and test sets in an 8:1:1 ratio. For {mi}, we used all ground-state entries, resulting in 27,864 data points. The ωmin data were available for all thermodynamically stable ground states, yielding 8198 entries. For Tc, 2106 data points were used, including 750 newly computed values. Since Eaniso data were not included in HeuslerDB, we calculated Eaniso for all magnetic tetragonal ground-state systems, obtaining 6123 entries.

DFT calculations were performed primarily with the Vienna ab initio Simulation Package (VASP) [53,54], using the projector augmented wave (PAW) method and the generalized gradient approximation (GGA) with the Perdew-Burke-Ernzerhof (PBE) functional [55,56]. ΔE, ΔH, phonons, and Tc were computed using OQMD, ALAMODE, and SPRKKR following the methodology of our previous DFT-HTP study of ternary Heusler compounds [21,44–49]. The agreement of magnetic moments between VASP and SPRKKR is demonstrated in Fig. S7. Phonon calculations for the all-d compounds MnOsMnRe and MnReMnRu failed due to convergence issues in DFT, and these compounds were treated as unstable in the validation rate analysis. Eaniso was calculated as Eaniso = E⊥ − E∥ using the force theorem [57–59]. Calculations were performed in the primitive cell with k-meshes generated using Python Materials Genomics (pymatgen) at a density of 6000 Å⁻³, and the tetrahedron method with Blöchl corrections was applied [60,61].
Input generation, structure manipulation, and symmetry analysis were carried out using pymatgen, ASE, ASE2SPRKKR, and spglib [45,50,60,62,63].

#### Computational cost

The computational cost for each task in the HTP workflow is summarized in Table 2. DFT timings are based on validation runs for ML-selected candidates, while ML timings are taken from the ML-HTP screening. For DFT, we report per-job statistics such as mean, median, and interquartile range (IQR). uMLIP structure optimization and MLRM predictions were performed in batches, and mean values are reported as individual per-job timings are not available. MLRM training times correspond to the wall time required to train a single model on one GPU. DFT calculations other than phonon calculations were performed on dual-socket Intel Xeon Platinum 8268 nodes (Cascade Lake, 24 cores per CPU, 2.9 GHz, 48 cores per node); phonon calculations were performed on Fujitsu A64FX processors (Armv8.2-A SVE 512 bit, 48 compute cores per CPU, 2.2 GHz); uMLIP structure optimizations were performed on dual-socket Intel Xeon Gold 6230 nodes (Cascade Lake, 20 cores per CPU, 2.1 GHz, 40 cores per node); and MLRMs were trained and applied on NVIDIA RTX 6000 Ada GPUs. All DFT calculations, except for phonon calculations, were performed on 2 nodes; phonon calculations were performed on 6 nodes. The uMLIP structure optimizations were performed on 1 node, and MLRM training and prediction were performed on 1 GPU.

In the ML-HTP case study, uMLIP structure optimization, MLRM training, and MLRM prediction used 1,835 node-hours, 23 GPU-hours, and 4 GPU-hours, respectively. DFT calculations of ML-selected candidates consumed 1,645 node-hours for optimizations and 99,680 node-hours for property evaluations. Ignoring the differences between node types, the total ML-HTP cost is 103,160 node-hours plus 27 GPU-hours. A DFT-HTP workflow screening the same chemical space would need about 256,160 node-hours for structure optimizations and 18,802,625 node-hours for property evaluations, estimated using statistics reported in Table 2.
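The totals quoted above can be combined with a few lines of arithmetic. All numbers are taken from the text; the variable names are illustrative, and, following the text, node-hours on different node types are added without weighting and the GPU-hours are left out of the ratios.

```python
# Node-hours quoted in the text (GPU-hours are tracked separately).
umlip_opt = 1_835             # uMLIP structure optimization
dft_validation_opt = 1_645    # DFT optimizations of ML-selected candidates
dft_validation_prop = 99_680  # DFT property evaluations of those candidates
ml_htp_with_validation = umlip_opt + dft_validation_opt + dft_validation_prop

dft_htp_opt = 256_160         # estimated DFT-HTP structure optimizations
dft_htp_prop = 18_802_625     # estimated DFT-HTP property evaluations
dft_htp_total = dft_htp_opt + dft_htp_prop

# Ratio of the all-DFT cost to the ML-HTP cost, with and without the
# final DFT validation of the candidate list.
speedup_with_validation = dft_htp_total / ml_htp_with_validation
speedup_without_validation = dft_htp_total / umlip_opt

gpu_hours = 23 + 4            # MLRM training plus MLRM prediction

print(f"ML-HTP with validation: {ml_htp_with_validation:,} node-hours")
print(f"DFT-HTP estimate:       {dft_htp_total:,} node-hours")
print(f"speed-up with validation:    {speedup_with_validation:.0f}")
print(f"speed-up without validation: {speedup_without_validation:.0f}")
```

The DFT validation of the candidate list dominates the ML-HTP budget, which is why the ratio changes by roughly two orders of magnitude depending on whether that step is counted.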
This comparison shows an estimated speed-up of 185 times for ML-HTP with DFT validation, or 10⁴ times if used without DFT validation at the end. Please note that these estimates are approximate, and waiting time for job execution and human time spent on debugging and workflow management are not included. The actual speed-ups also vary based on material systems, target properties, and computational resources.

### Data availability

The ML-HTP candidate list and DFT validation results are included in the Supplementary Information as Tables S3 and S4. The complete set of all screened compounds, along with ML-predicted properties, will be made available through the HeuslerDB database at https://www.nims.go.jp/group/spintheory/.

### Code availability

The developed packages MLIP-HOT and MLIP-FTL will be made available through the Spin Theory Group GitHub repository at https://github.com/nims-spin-theory and our group website at https://www.nims.go.jp/group/spintheory/.

Received: 2 September 2025; Accepted: 8 February 2026

### References

1. Sanvito, S. et al. Accelerated discovery of new magnets in the Heusler alloy family. Sci. Adv. 3, e1602241 (2017).
2. Zhang, H. High-throughput design of magnetic materials. Electron. Struct. 3, 033001 (2021).
3. Barwal, V. et al. Large magnetoresistance and high spin-transfer torque efficiency of Co₂MnₓFe₁₋ₓGe (0 ≤ x ≤ 1) Heusler alloy thin films obtained by high-throughput compositional optimization using combinatorially sputtered composition-gradient film. APL Mater. 12, 111114 (2024).
4. Faleev, S. V. et al. Heusler compounds with perpendicular magnetic anisotropy and large tunneling magnetoresistance. Phys. Rev. Mater. 1, 024402 (2017).
5. Hu, K. et al. High-throughput design of Co-based magnetic Heusler compounds. Acta Mater. 259, 119255 (2023).
6. Hilgers, R., Wortmann, D. & Blügel, S. Machine Learning-based estimation and explainable artificial intelligence-supported interpretation of the critical temperature from magnetic ab initio Heusler alloys data. Phys. Rev. Mater.
9, 044412 (2025).
7. Baigutlin, D. R., Sokolovskiy, V. V., Buchelnikov, V. D. & Taskaev, S. V. Machine learning algorithms for optimization of magnetocaloric effect in all-d-metal Heusler alloys. J. Appl. Phys. 136, 183903 (2024).
8. Mitra, S., Ahmad, A., Biswas, S. & Kumar Das, A. A machine learning approach to predict the structural and magnetic properties of Heusler alloy families. Comput. Mater. Sci. 216, 111836 (2023).
9. Liu, C. et al. Machine learning to predict quasicrystals from chemical compositions. Adv. Mater. 33, 2102507 (2021).
10. Xie, R., Crivello, J.-C. & Barreteau, C. Screening new quaternary semiconductor Heusler compounds by machine-learning methods. Chem. Mater. 35, 7615–7627 (2023).
11. Lu, Y., Sun, Y., Hou, C., Li, Z. & Ni, J. Explainable attention CNN for predicting properties of Heusler alloys. J. Phys. Chem. C 129, 14958–14967 (2025).
12. Xie, T. & Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018).
13. Choudhary, K. & DeCost, B. Atomistic line graph neural network for improved materials property predictions. npj Comput. Mater. 7, 1–8 (2021).
14. Barroso-Luque, L. et al. Open materials 2024 (OMat24) inorganic materials dataset and models. Preprint at arXiv https://doi.org/10.48550/arXiv.2410.12771 (2024).
15. Yamada, H. et al. Predicting materials properties with little data using shotgun transfer learning. ACS Cent. Sci. 5, 1717–1730 (2019).
16. Lee, J. & Asahi, R. Transfer learning for materials informatics using crystal graph convolutional neural network. Comput. Mater. Sci. 190, 110314 (2021).
17. Hoffmann, N., Schmidt, J., Botti, S. & Marques, M. A. L. Transfer learning on large datasets for the accurate prediction of material properties. Digit. Discov. 2, 1368–1379 (2023).
18. He, J., Rabe, K. M. & Wolverton, C. Computationally accelerated discovery of functional and structural Heusler materials. MRS Bull. 47, 559–572 (2022).
19. Noky, J., Zhang, Y., Gooth, J., Felser, C. & Sun, Y. Giant anomalous Hall and Nernst effect in magnetic cubic Heusler compounds. npj Comput. Mater. 6, 1–8 (2020).

Table 2 | Computational cost of each task in the HTP workflow, reported in node-minutes

| Task | DFT (mean/median/IQR) | ML training (mean) | ML prediction (mean) |
|---|---|---|---|
| Structure relaxation | 5.6/3.8/3.0 | – | 0.12 |
| {mi} | – | 876 | <1e-03 |
| Phonon stability | 4542/4008/1442 | 258 | <1e-03 |
| Tc | 70/66/14 | 78 | <1e-03 |
| Eaniso | 142/133/34 | 192 | <1e-03 |

DFT entries show per-job statistics (mean/median/IQR). ML entries show mean values. For structure relaxation, one job corresponds to a single relaxation from an initial distortion. For structure optimization, the cost per compound should be multiplied by the number of initial structures, which varies across material systems; here, we report the cost for a single structure relaxation. The uMLIP is pre-trained and used without fine-tuning, so its training time is excluded. DFT structure optimizations directly yield local magnetic moments, so no separate timing is reported.

20. Xing, G., Masuda, K., Tadano, T. & Miura, Y. Chemical-substitution-driven giant anomalous Hall and Nernst effects in magnetic cubic Heusler compounds. Acta Mater. 270, 119856 (2024).
21. Xiao, E. & Tadano, T. High-throughput computational screening of Heusler compounds with phonon considerations for enhanced material discovery. Acta Mater. 297, 121312 (2025).
22. Xie, Y., Tritsaris, G. A., Grånäs, O. & Rhone, T. D. Data-driven studies of the magnetic anisotropy of two-dimensional magnetic materials.
J. Phys. Chem. Lett. 12, 12048–12054 (2021).
23. Liao, T. et al. Predicting magnetic anisotropy energies using site-specific spin-orbit coupling energies and machine learning: application to iron-cobalt nitrides. Phys. Rev. Mater. 6, 024402 (2022).
24. Dutta, A. & Sen, P. Machine learning assisted hierarchical filtering: a strategy for designing magnets with large moment and anisotropy energy 10, 3404–3417 (2022).
25. Fu, X. et al. Learning smooth and expressive interatomic potentials for physical property prediction. Proc. Mach. Learn. Res. 267, 17875–17893 (2025).
26. Choudhary, K. et al. Unified graph neural network force-field for the periodic table: Solid state applications. Digit. Discov. 2, 346–355 (2023).
27. Deng, B. et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat. Mach. Intell. 5, 1031–1041 (2023).
28. Kim, J. et al. Data-efficient multifidelity training for high-fidelity machine learning interatomic potentials. J. Am. Chem. Soc. 147, 1042–1054 (2025).
29. Yan, K. et al. A materials foundation model via hybrid invariant-equivariant architectures. Preprint at arXiv https://doi.org/10.48550/arXiv.2503.05771 (2025).
30. Yang, H. et al. MatterSim: a deep learning atomistic model across elements, temperatures and pressures. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.04967 (2024).
31. Hu, X. et al. Searching high spin polarization ferromagnet in Heusler alloy via machine learning. J. Phys. Condens. Matter 32, 205901 (2020).
32. Miyazaki, H. et al. Machine learning based prediction of lattice thermal conductivity for half-Heusler compounds using atomic information. Sci. Rep. 11, 13410 (2021).
33. Kim, K. et al. Machine-learning-accelerated high-throughput materials screening: Discovery of novel quaternary Heusler compounds. Phys. Rev. Mater. 2, 123801 (2018).
34. Liu, K. et al. Machine learning assisted development of Heusler alloys for high magnetic moment. Comput. Mater. Sci. 250, 113692 (2025).
35. Hirohata, A. et al.
Machine learning for the development of new materials for a magnetic tunnel junction. npj Spintron. 3, 1–9 (2025).
36. Jain, A. et al. Commentary: The Materials Project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
37. Schmidt, J. et al. Machine-learning-assisted determination of the global zero-temperature phase diagram of materials. Adv. Mater. 35, 2210788 (2023).
38. Schmidt, J. et al. Improving machine-learning models in materials science through large datasets. Mater. Today Phys. 48, 101560 (2024).
39. Kim, S. Y., Park, Y. J. & Li, J. Leveraging neural network interatomic potentials for a foundation model of chemistry. Preprint at arXiv https://doi.org/10.48550/arXiv.2506.18497 (2025).
40. Chen, C., Ye, W., Zuo, Y., Zheng, C. & Ong, S. P. Graph networks as a universal machine learning framework for molecules and crystals. Chem. Mater. 31, 3564–3572 (2019).
41. Chang, R., Wang, Y.-X. & Ertekin, E. Towards overcoming data scarcity in materials science: unifying models and datasets with a mixture of experts framework. npj Comput. Mater. 8, 242 (2022).
42. Minami, S. et al. Scaling law of Sim2Real transfer learning in expanding computational materials databases for real-world predictions. npj Comput. Mater. 11, 146 (2025).
43. Sun, W. et al. The thermodynamic scale of inorganic crystalline metastability. Sci. Adv. 2, e1600225 (2016).
44. Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with high-throughput density functional theory: the open quantum materials database (OQMD). JOM 65, 1501–1509 (2013).
45. Bahn, S. & Jacobsen, K. An object-oriented scripting interface to a legacy electronic structure code. Comput. Sci. Eng. 4, 56–66 (2002).
46. Tadano, T. & Tsuneyuki, S. Self-consistent phonon calculations of lattice dynamical properties in cubic SrTiO₃ with first-principles anharmonic force constants. Phys. Rev. B 92, 054301 (2015).
47. Tadano, T., Gohda, Y. & Tsuneyuki, S.
Anharmonic force constants extracted from first-principles molecular dynamics: applications to heat transfer simulations. J. Phys. Condens. Matter 26, 225402 (2014).
48. Ebert, H., Ködderitzsch, D. & Minár, J. Calculating condensed matter properties using the KKR-Green’s function method – recent developments and applications. Rep. Prog. Phys. 74, 096501 (2011).
49. Liechtenstein, A. I., Katsnelson, M. I., Antropov, V. P. & Gubanov, V. A. Local spin density functional approach to the theory of exchange interactions in ferromagnetic metals and alloys. J. Magn. Magn. Mater. 67, 65–74 (1987).
50. Hjorth Larsen, A. et al. The atomic simulation environment – a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
51. Bitzek, E., Koskinen, P., Gähler, F., Moseler, M. & Gumbsch, P. Structural relaxation made simple. Phys. Rev. Lett. 97, 170201 (2006).
52. Chanussot, L. et al. Open Catalyst 2020 (OC20) dataset and community challenges. ACS Catal. 11, 6059–6072, https://doi.org/10.1021/acscatal.0c04525 (2021).
53. Kresse, G. & Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 6, 15–50 (1996).
54. Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
55. Kresse, G. & Joubert, D. From ultrasoft pseudopotentials to the projector augmented-wave method. Phys. Rev. B 59, 1758–1775 (1999).
56. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
57. Daalderop, G. H. O., Kelly, P. J. & Schuurmans, M. F. H. First-principles calculation of the magnetocrystalline anisotropy energy of iron, cobalt, and nickel. Phys. Rev. B 41, 11919–11937 (1990).
58. Xing, G., Miura, Y. & Tadano, T. Lattice dynamics and its effects on magnetocrystalline anisotropy energy of pristine and hole-doped YCo₅ from first principles. Phys. Rev.
B 105, 104427 (2022).
59. Xing, G., Miura, Y. & Tadano, T. First-principles prediction of phase transition of YCo₅ from self-consistent phonon calculations. Phys. Rev. B 108, 014304 (2023).
60. Ong, S. P. et al. Python materials genomics (pymatgen): a robust, open-source Python library for materials analysis. Comput. Mater. Sci. 68, 314–319 (2013).
61. Blöchl, P. E., Jepsen, O. & Andersen, O. K. Improved tetrahedron method for Brillouin-zone integrations. Phys. Rev. B 49, 16223–16233 (1994).
62. ASE2SPRKKR software package – ASE2SPRKKR documentation. https://ase2sprkkr.github.io/ase2sprkkr/.
63. Togo, A., Shinohara, K. & Tanaka, I. Spglib: a software library for crystal symmetry search. Sci. Technol. Adv. Mater. Methods 4, 2384822 (2024).

### Acknowledgements

This study used computational resources of the supercomputer Fugaku provided by the RIKEN Center for Computational Science (Project ID: hp250229), the computer resources provided by ISSP, U-Tokyo under the program of SCCMS, and the computer resources at NIMS Numerical Materials Simulator.
This study was supported by MEXT Program: Data Creation and Utilization-Type Material Research and Development Project (Digital Transformation Initiative Center for Magnetic Materials) Grant Number JPMXP1122715503 and as “Program for Promoting Researches on the Supercomputer Fugaku” (Data-Driven Research Methods Development and Materials Innovation Led by Computational Materials Science, JPMXP1020230327).

### Author contributions

T.T. conceptualized, designed, and supervised the project; reviewed and edited the manuscript. T.T. and E.X. developed the methodology and code implementation; performed the calculations and analysis; E.X. drafted the manuscript.

### Competing interests

The authors declare no competing interests.

### Additional information

**Supplementary information** The online version contains supplementary material available at https://doi.org/10.1038/s41524-026-02013-0.

**Correspondence** and requests for materials should be addressed to Enda Xiao or Terumasa Tadano.

**Reprints and permissions information** is available at http://www.nature.com/reprints

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2026