# Fileset

[s41524-026-02033-w.pdf](https://mdr.nims.go.jp/filesets/b05d68f2-dfcb-4365-84ac-765fd3e43767/download)

## Creator

Masato Ohnishi, Tianqi Deng, Pol Torres, Zhihao Xu, [Terumasa Tadano](https://orcid.org/0000-0002-8132-2161), Haoming Zhang, Wei Nong, Masatoshi Hanai, Zeyu Wang, Michimasa Morita, Zhiting Tian, Ming Hu, Xiulin Ruan, Ryo Yoshida, Toyotaro Suzumura, Lucas Lindsay, Alan J. H. McGaughey, Tengfei Luo, Kedar Hippalgaonkar, Junichiro Shiomi

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Database and deep-learning scalability of anharmonic phonon properties by automated brute-force first-principles calculations](https://mdr.nims.go.jp/datasets/204eb1aa-8754-4c07-8247-66db101cecfb)

## Fulltext

Database and deep-learning scalability of anharmonic phonon properties by automated brute-force first-principles calculations

npj Computational Materials | Article | Published in partnership with the Shanghai Institute of Ceramics of the Chinese Academy of Sciences | https://doi.org/10.1038/s41524-026-02033-w

Masato Ohnishi¹,², Tianqi Deng³,⁴, Pol Torres⁵, Zhihao Xu⁶, Terumasa Tadano⁷, Haoming Zhang³,⁴, Wei Nong⁸, Masatoshi Hanai⁹, Zeyu Wang¹⁰, Michimasa Morita¹⁰, Zhiting Tian¹¹, Ming Hu¹², Xiulin Ruan¹³, Ryo Yoshida²,¹⁴,¹⁵, Toyotaro Suzumura⁹, Lucas Lindsay¹⁶, Alan J. H. McGaughey¹⁷, Tengfei Luo⁶,¹⁸, Kedar Hippalgaonkar⁸,¹⁹ & Junichiro Shiomi¹,²,¹⁰,²⁰

Understanding the anharmonic phonon properties of crystal compounds, such as phonon lifetimes and thermal conductivities, is essential for investigating and optimizing their thermal transport behaviors. These properties also impact optical, electronic, and magnetic characteristics through interactions between phonons and other quasiparticles and fields. In this study, we develop an automated first-principles workflow to calculate anharmonic phonon properties and build a comprehensive database encompassing more than 6500 inorganic compounds. Utilizing this dataset, we train a graph neural network model to predict thermal conductivity values and spectra from structural parameters, demonstrating a scaling law in which prediction accuracy improves with increasing training data size. High-throughput screening with the model enables the identification of materials exhibiting extreme thermal conductivities, both high and low.
The resulting database offers valuable insights into the anharmonic behavior of phonons, thereby accelerating the design and development of advanced functional materials.

In recent years, the integration of traditional materials science approaches, rooted in fundamental principles, with data-driven methodologies, collectively known as Materials Informatics (MI), has rapidly advanced, leading to significant breakthroughs in the development of materials for batteries [1–3], catalysts [4], magnetic systems [5], and beyond. For inorganic materials, large-scale computational databases have served as the backbone of MI efforts, including the Materials Project (2013) [6–8] with data on 170,000 materials, OQMD (2013) [9,10] with 1.2 million materials, and AFLOW (2014) [11] with 3.5 million materials. More recently, a series of emerging databases have expanded this landscape, such as a database dedicated to Fm$\bar{3}$m cubic structures with over 200,000 entries, the Carolina Materials Database (2020) [12,13], DeepMind's GNoME containing 40 million novel crystal structures (2024) [14], and META's OMat24 with 1.1 billion density functional theory (DFT) calculation entries (2024) [15]. However, these databases primarily focus on crystal structures and properties derived from relatively straightforward calculations, such as electronic band structures and band gaps.

In contrast, databases centered on lattice thermal properties, which dominate heat transport in non-metallic materials, remain relatively scarce. Existing simulation-based resources largely provide harmonic phonon properties or lattice thermal conductivity estimates based on approximations; for example, Phonondb [16] offers harmonic properties for ~10,000 materials, and AFLOW employs the quasiharmonic Debye approximation [17]. Experiment-based databases, such as Starrydata [18] and AtomWork [19], compile thermal conductivity and thermoelectric data from the literature.
However, these data are significantly influenced by extrinsic factors such as grain size [20,21], carrier density, composition [22,23], impurities [24,25], defects, strain [26–29], and measurement uncertainty [30]. Such factors are often undocumented and difficult to control, posing challenges for reliable predictive modeling. Therefore, a first-principles-based database of anharmonic phonon properties is essential for accurately capturing intrinsic thermal behavior, including phonon lifetimes and thermal conductivity, without relying on empirical assumptions.

A full list of affiliations appears at the end of the paper. e-mail: masato.ohnishi.ac@gmail.com; shiomi@photon.t.u-tokyo.ac.jp. npj Computational Materials (2026) 12:150.

Complementing these efforts, a team at Microsoft has recently developed an extensive database of anharmonic phonon properties for ~246,000 materials [31], using machine learning potentials [32]. While this represents a significant step forward in material research, the available material space for machine learning potentials is limited to relatively simple systems due to the focus on high thermal conductivity as the target property: specifically, binary compounds with up to four atoms per primitive cell and ternary compounds composed of group 13–16 elements with up to seven atoms per primitive cell. Additionally, machine learning potentials are trained on data derived from first-principles calculations; therefore, their ability to inherently discover entirely new materials may be limited.
Therefore, there remains a need for a first-principles-based database that spans both simple and structurally complex materials.

A first-principles database of anharmonic phonon properties is valuable not only for predicting thermal behaviors of materials but also for understanding a wide range of other material properties. Phonons interact with various particles and excitations, such as electrons [33], magnons [34,35], photons [36,37], plasmons [38], and polaritons [39,40], thereby affecting mechanical, electrical, electronic, optical, and magnetic properties [39,40]. This highlights the importance of detailed phonon-property datasets that comprehensively capture vibrational properties of solids, particularly describing anharmonic phonon properties based on theoretical calculations using consistent computational approaches and parameters. Such a database will offer critical insights into diverse material behaviors and accelerate the discovery of novel functional materials.

First-principles approaches for calculating anharmonic phonon properties in condensed materials have been actively pursued for many years, triggered by the development of computational methods using DFT around 2010 [41–43]. In standard first-principles phonon analysis, three-phonon scattering rates are evaluated via quantum perturbative theories under the relaxation time approximation [44–47] to solve the Boltzmann transport equation (BTE) [48]. This approach has been widely applied and has become a rigorous and foundational numerical application for understanding and predicting thermal transport in materials. Building on this framework, a variety of methods have been developed or integrated into computational packages to enhance the accuracy of phonon property calculations, particularly for systems with extreme thermal transport behaviors.
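
For orientation, the relaxation time approximation mentioned above is commonly written in the following textbook form. This is a sketch rather than a formula quoted from the paper: the symbols ($N_q$ for the number of sampled q-points, $V$ for the primitive-cell volume, $c$, $v$, $\tau$ and $\Gamma$ for the mode heat capacity, group velocity, lifetime and linewidth, and $\mu,\nu$ for Cartesian directions) are assumptions introduced here.

$$
\kappa_\mathrm{p}^{\mu\nu} = \frac{1}{N_q V} \sum_{\mathbf{q}j} c_{\mathbf{q}j}\, v_{\mathbf{q}j}^{\mu}\, v_{\mathbf{q}j}^{\nu}\, \tau_{\mathbf{q}j},
\qquad
\tau_{\mathbf{q}j} = \frac{1}{2\,\Gamma_{\mathbf{q}j}}
$$

Here the sum runs over wavevectors $\mathbf{q}$ and branches $j$, and $\Gamma_{\mathbf{q}j}$ is the linewidth obtained from the three-phonon scattering rates.
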
Iterative [46,49,50] and direct [51,52] solutions to the BTE offer improved treatment of phonon-phonon interactions by considering the effects of both normal and Umklapp scattering rates, whereas the relaxation time approximation considers only Umklapp scattering as resistive. Furthermore, four-phonon interactions [53,54] in non-metallic systems have been shown to play a significant role in determining their thermal transport behaviors.

At finite temperatures, phonon renormalization modifies harmonic force constants, a process that can be accounted for using first-order self-consistent phonon theory [55–57] and its improved variant incorporating the bubble self-energy corrections [58]. The phonon gas model, which treats phonons as heat-carrying particles that scatter and propagate like molecules in a gas, is extended by the unified phonon theory, also known as the Wigner heat transport formulation [59], which provides a framework for analyzing phonon transport in both the particle (Peierls transport) and wave (coherent transport) pictures.

In addition to phonon-phonon interactions, other scattering mechanisms and intrinsic factors can also play a significant role in thermal transport. Electron-phonon interactions can be accurately analyzed using first-principles methods [60–62]. Weak and strong impurity scatterings can be effectively treated using the perturbative [24] or T-matrix approaches [63,64], respectively.
Additionally, intrinsic structural fluctuations at finite temperatures, particularly in complex compounds, can be captured through a combination of cluster expansion and Monte Carlo simulations [22,23,65]. Although this current study employs a fundamental approach based on three-phonon interactions within the relaxation time approximation, the resulting data provide a solid foundation for advanced calculations including more complex scattering effects.

With the advancement of computational methods, the development of thermofunctional materials has been accelerated through the integration of informatics and data science. Early studies in this field employed high-throughput calculations with simplified models to identify materials with Peierls lattice thermal conductivities $\kappa_\mathrm{p} \sim 1.0$ W m⁻¹ K⁻¹ [66], and Bayesian optimization techniques were used to discover materials with $\kappa_\mathrm{p} < 0.5$ W m⁻¹ K⁻¹ [67]. However, access to anharmonic phonon property data remains limited. To circumvent this, researchers have used harmonic phonon properties [68] and other material descriptors [69], focusing on specific materials such as half-Heuslers [70] and chalcogenides [71], and have developed thermal conductivity databases based on approximations [72], including the Callaway model [73] and minimum thermal conductivity model [74].

In parallel, various techniques have emerged to estimate higher-order force constants at a practical computational cost, as the number of displacement patterns required by the finite-displacement method increases rapidly with the order of the force constants. Approaches such as compressive sensing [56,75], projector-based methods for constructing orthonormal basis sets [76], and machine learning potentials [77,78] have been explored. Furthermore, fine-tuned models [79] derived from foundation models [80] have demonstrated improved accuracy. In addition to force constants, the analysis of high-order anharmonic phonon properties, such as four-phonon scattering and phonon renormalization, remains computationally intensive [81].
To address this, machine learning approaches have been introduced, including transfer learning to estimate four-phonon scattering rates using three-phonon scattering data [82].

Driven by this need for a first-principles-based anharmonic phonon property database and building on recent advancements in phonon analysis, we developed an automated computational framework for first-principles phonon calculations that streamlines the workflow and reduces computational complexity. Using this framework, we constructed a large-scale database comprising anharmonic phonon properties for over 6500 materials, systematically capturing phonon transport characteristics across a wide range of material classes. Leveraging this dataset, we applied machine learning techniques to predict key anharmonic phonon properties, including Peierls lattice thermal conductivity and its spectral distribution. This integrated approach not only deepens our understanding of anharmonic phonon behavior but also accelerates the data-driven discovery of novel functional materials across various application domains.

### Results

#### Automation of anharmonic phonon analysis

We developed automation software named "auto-kappa" (https://github.com/masato1122/auto-kappa) for performing first-principles anharmonic phonon calculations. Given the complexity of phonon analysis, the software automatically addresses key challenges, including precise structure optimization to minimize residual stress and procedures to eliminate imaginary frequencies associated with unstable phonon modes. Specifically, these include structure optimization using an equation of state and increasing the supercell size for force calculations. The automated workflow for anharmonic phonon calculations is summarized in Fig. 1a, with detailed computational procedures described in the Methods section.
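
The strategy of enlarging the supercell until imaginary frequencies disappear can be pictured as a simple retry loop. The following is a minimal sketch under stated assumptions, not auto-kappa's actual implementation: `find_stable_supercell`, the `min_frequency_for` callback, the tolerance value and the toy numbers are all hypothetical, and imaginary modes are assumed to be reported as negative frequencies.

```python
def find_stable_supercell(min_frequency_for, supercell_candidates, tolerance=-0.01):
    """Return the first supercell whose lowest phonon frequency is not imaginary.

    min_frequency_for: hypothetical callback mapping a supercell (tuple of
        repetitions along a, b, c) to the lowest phonon frequency, with
        imaginary modes reported as negative numbers.
    tolerance: hypothetical threshold; frequencies at or above it are
        treated as stable.
    """
    history = []
    for supercell in supercell_candidates:
        f_min = min_frequency_for(supercell)
        history.append((supercell, f_min))
        if f_min >= tolerance:
            return supercell, history
    # No candidate removed the imaginary modes.
    return None, history


# Toy stand-in for a real phonon calculation (made-up numbers).
toy_results = {(2, 2, 2): -0.8, (3, 3, 3): -0.1, (4, 4, 4): 0.0}
chosen, history = find_stable_supercell(toy_results.get, sorted(toy_results))
print(chosen, history)
```

In a real workflow the callback would launch force calculations and a harmonic phonon calculation for each supercell, which is where almost all of the cost lies.
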
While remarkable efforts have been made to automate similar processes for analyzing anharmonic phonon properties [83–85], several challenges were still encountered in the high-throughput calculations of this study. These included the need for automatic adjustment of VASP and ALAMODE parameters (e.g., the cutoff length for force constants and the treatment of the non-analytical correction), job parameters (e.g., the number of parallel processes and the type of parallelization), and the complexity of obtaining relaxed structures, as illustrated in Fig. 1a. To overcome these challenges, we enhanced auto-kappa to resolve them automatically. Using the developed software, we have calculated the Peierls lattice thermal conductivity ($\kappa_\mathrm{p}$) based on the relaxation time approximation as well as the coherence lattice thermal conductivity ($\kappa_\mathrm{c}$). Although the software includes an implementation of the self-consistent phonon approach to account for phonon renormalization, the dataset used in this study was generated using the conventional method based on three-phonon interactions within the relaxation time approximation.

Fig. 1 | Automation of anharmonic phonon property calculations using a first-principles approach. a Automated workflow implemented in the developed software, auto-kappa. b Example output generated by auto-kappa for rock salt NaCl (mp-22862). The results include phonon dispersion with participation ratio and DOS, representative atomic distances for force constants (FCs), temperature- and grain-size-dependent thermal conductivity, mode-dependent phonon scattering rates and lifetimes, spectral and cumulative thermal conductivity as functions of mean free path and frequency, and Grüneisen parameters. In addition, a computational time chart, thermodynamic properties, and various text files (such as displacement–force datasets, force constants, and input/output scripts for simulations) are generated.

Using the developed software, we constructed Phonix, a database of anharmonic phonon interactions, comprising 6641 materials and 7342 data points, including 701 duplicate material entries. The name Phonix highlights its broader scope, extending beyond the phonon–phonon interactions examined here to future coverage of interactions with diverse quasiparticles and nanostructures. The database includes input files, intermediate data, output results, and generated figures, as illustrated in Fig. 1b. These comprise: phonon dispersion with participation ratio, density of states (DOS), the relationship between force constants and their representative atomic distance (maximum distance among corresponding atoms for the anharmonic case), temperature- and grain-size-dependent thermal conductivity, mode-dependent phonon scattering rates and lifetimes, cumulative and spectral thermal conductivity as a function of mean free path and frequency, Grüneisen parameters, thermodynamic properties (temperature-dependent specific heat, entropy, and internal and free energies), harmonic and anharmonic force constants, displacement-force datasets used for calculating the force constants, and input/output scripts for first-principles (VASP [86]) and phonon (ALAMODE [45]) calculations. Naturally, for materials exhibiting imaginary frequencies, only harmonic property data are included.
The target materials in this study consist of all entries from the Phonondb dataset (version 2018-04-07), comprising 10,034 materials, and non-metallic, non-magnetic materials from the Materials Project (version 2022.10.28), comprising 11,418 materials after excluding overlaps with Phonondb. In total, the dataset includes 21,452 unique materials. Although the full phonon analysis has not yet been completed for every material, primarily due to the high computational cost associated with rigorous structural optimization and the use of larger supercells (see Methods for details), we have successfully calculated anharmonic phonon properties for over 6500 materials. While we have also obtained a significantly larger set of harmonic phonon properties, including those for materials with unstable phonon modes with imaginary frequencies, this study focuses exclusively on the anharmonic phonon properties, which represent the more compelling aspect of our database. The complete database will be made available on ARIM-mdx [87]. We would also like to emphasize that the database released with this paper represents only the first version, and we are continuously working to improve both the quality and quantity of the data.

#### Database analysis

First, we analyzed the crystal structures of the materials for which anharmonic phonon properties were computed. As shown in Fig. 2a, the dataset encompasses a wide range of materials. Among the most populated space groups, as shown at the top of Fig. 2a, space group 14 includes quartz-like structures such as SiO₂; space group 62 includes the anatase phase of TiO₂, commonly used as a photocatalyst; space group 166 contains well-known topological insulators and thermoelectric materials like Bi₂Te₃ and Bi₂Se₃; and space group 225 comprises rock salt structures such as NaCl, PbTe, and PbSe.

Although the current dataset is limited to non-metallic and non-magnetic materials, it is not constrained by the size of the primitive cell, as shown in the bottom panel of Fig. 2a.
Some materials include more than 100 atoms, with the maximum reaching 160 atoms. Among these, five out of seven materials with the highest atom counts belong to space group 62. However, most materials in the database contain fewer than 30 atoms, with half containing fewer than 16 atoms.

Regarding elemental diversity, the Phonix materials contain elements from a broad range of groups, as shown in Supplementary Fig. S1. While transition metals appear less frequently (likely due to the exclusion of magnetic materials in the current version of Phonix) and group 18 elements are present only as single-element systems, all elements from periods 1 to 6 and groups 1 to 17 of the periodic table, except for Po and At, are represented in the Phonix materials. The broad diversity in space groups and structural complexity highlights the versatility of the database as a platform for exploring and developing a wide spectrum of inorganic materials. Notably, only 287 out of 6641 crystal structures (4.3%) satisfy the search criteria employed by the Microsoft database (MatterK [31]), while the specific materials contained in their database are not publicly accessible. We believe that both types of databases play complementary and essential roles: databases based on first-principles calculations are crucial for expanding our knowledge toward unexplored materials, while those based on machine learning potentials are important for interpolating data within the known materials space.

Fig. 2 | Data analysis of the Phonix database, a database for anharmonic phonon interaction, comprising 6641 materials and 7342 data points, including 701 duplicate material entries. a Distribution of space groups and crystal systems (top), and the number of atoms in the primitive cell (bottom) for the crystal structures in the database. b Relationship between lattice thermal conductivity ($\kappa_\mathrm{lat}$) and volume per atom, along with the distribution of $\kappa_\mathrm{lat}$ at 300 K. c Comparison of the Peierls ($\kappa_\mathrm{p}$) and coherence ($\kappa_\mathrm{c}$) contributions to $\kappa_\mathrm{lat}$. Solid and dotted lines represent $\kappa_\mathrm{c} = \kappa_\mathrm{p}$ and $\kappa_\mathrm{c} = 0.1 \times \kappa_\mathrm{p}$, respectively.

Subsequently, the distribution of thermal conductivity was analyzed. Throughout this paper, we used the thermal conductivity at 300 K obtained using the densest q-mesh in auto-kappa (1500 q-points·Å³/atom) for all discussions. Lattice thermal conductivity ($\kappa_\mathrm{lat}$) generally decreases with increasing volume per atom ($V_\mathrm{atom}$) [67]. According to Phonix, $\kappa_\mathrm{lat}$ at 300 K, including both the Peierls ($\kappa_\mathrm{p}$) and coherence ($\kappa_\mathrm{c}$) contributions, exhibited the following relationship: $\log_{10}(\kappa_\mathrm{lat}) \propto \alpha \log_{10}(V_\mathrm{atom})$, where the coefficient $\alpha$ was estimated to be −1.89, as illustrated in Fig. 2b. The average $\kappa_\mathrm{lat}$ at 300 K was 2.4 W m⁻¹ K⁻¹, as shown in Fig. 2b. Half of the materials exhibited $\kappa_\mathrm{lat}$(300 K) values between 0.95 and 6.3 W m⁻¹ K⁻¹, while 95% fell within the range of 0.15 to 39 W m⁻¹ K⁻¹. Among the high-thermal-conductivity (high-κ) materials, 0.17% (11 materials) exhibited $\kappa_\mathrm{lat} > 1000$ W m⁻¹ K⁻¹, 0.38% (25) exceeded 500 W m⁻¹ K⁻¹, and 1.02% (67) exceeded 200 W m⁻¹ K⁻¹, as listed in Supplementary Table S1. In the list of calculated materials exhibiting $\kappa_\mathrm{lat} > 200$ W m⁻¹ K⁻¹, shown in Supplementary Fig. S2, a large fraction (28 out of 67) were polymorphs of carbon or SiC. Meanwhile, to the best of our knowledge, the following materials, including their polymorphs, have not been synthesized experimentally and have rarely been discussed as high-κ materials: triclinic Hg(BiS₂)₂ (ID: mp-554921, space group: 12, $\kappa_{\mathrm{p},\{xx,yy,zz\}} = 292$, 2.5, and 943 W m⁻¹ K⁻¹), cubic HC (mp-1079612, 199, $\kappa_\mathrm{p,ave} = 306$ W m⁻¹ K⁻¹), cubic BiB (mp-1006880, 216, $\kappa_\mathrm{p,ave} = 235$ W m⁻¹ K⁻¹), and trigonal CsHoS₂ (mp-505158, 166, $\kappa_{\mathrm{p},\{zz,xx(yy)\}} = 22$ and 657 W m⁻¹ K⁻¹).
It is also noteworthy that the triclinic Hg(BiS₂)₂ and trigonal CsHoS₂ exhibit highly anisotropic heat conduction with $\kappa_{\mathrm{p},zz}/\kappa_{\mathrm{p},yy} = 392$ and $\kappa_{\mathrm{p},xx(yy)}/\kappa_{\mathrm{p},zz} = 30$, respectively. The value of Hg(BiS₂)₂ is comparable to, or even exceeds, that of graphite [88]. On the other hand, among low-κ materials, 0.23% (15 materials) exhibited $\kappa_\mathrm{lat} < 0.1$ W m⁻¹ K⁻¹ (see Supplementary Fig. S3), 15% (966) exhibited $\kappa_\mathrm{lat} < 0.5$ W m⁻¹ K⁻¹, 28% (1815) exhibited $\kappa_\mathrm{lat} < 1.0$ W m⁻¹ K⁻¹, and 71% (4685) exhibited $\kappa_\mathrm{lat} < 5.0$ W m⁻¹ K⁻¹. Considering that finding materials with $\kappa_\mathrm{p} \sim 0.5$ W m⁻¹ K⁻¹ was challenging in pioneering studies [67], the obtained dataset provides a significant amount of information on low-κ materials. While phonon renormalization and four-phonon scattering should be considered for accurately calculating small $\kappa_\mathrm{lat}$, this analysis suggests that identifying low-κ materials may be relatively easier than finding high-κ materials, which remains a greater challenge.

Moreover, it is insightful to compare the Peierls and coherent contributions to the total lattice thermal conductivity. In most materials, particularly high-κ materials, the coherent contribution is smaller than the Peierls contribution or sometimes even negligible. However, we observed that a considerable number of materials exhibited a significant coherent contribution: $\kappa_\mathrm{c} \geq \kappa_\mathrm{p}$ in 8.4% of materials (purple regions in the top and bottom panels of Fig. 2c, bounded by solid lines), and $\kappa_\mathrm{c} \geq 0.1 \times \kappa_\mathrm{p}$ in 50%, nearly half of the dataset (bluish regions, bounded by dotted lines). While the relative contribution of the coherent component is known to have a significant effect when the Peierls contribution is small, a large $\kappa_\mathrm{c}$ was obtained for SiC polymorphs, which are located in the top-right corner ($\kappa_\mathrm{p} \approx 200$ to 500 and $\kappa_\mathrm{c} > 10$ W m⁻¹ K⁻¹) of the bottom panel of Fig.
2c. Although the relative contribution of $\kappa_\mathrm{c}$ remains small compared to the Peierls conductivity, it is interesting that high-κ materials, SiC [89–91], may exhibit a large coherent phonon conductivity (> 10 W m⁻¹ K⁻¹ and up to 60 W m⁻¹ K⁻¹ at 300 K). Since SiC has more than 200 polymorphs [92], and some of them contain a substantial number of atoms (>50), the densely packed phonon branches resulting from the large number of atoms lead to a large $\kappa_\mathrm{c}$, as shown in Supplementary Fig. S4. The developed database contains 15 polymorphs of SiC, among which Si₃₆C₃₆ exhibits the highest $\kappa_\mathrm{c}$ of 65 W m⁻¹ K⁻¹, while its $\kappa_\mathrm{p}$ reaches $\kappa_{\mathrm{p},xx(yy)} = 305$ and $\kappa_{\mathrm{p},zz} = 11$ W m⁻¹ K⁻¹.

#### Computational accuracy

To assess computational validity, we compared the results obtained in this study with those in Phonondb [16] as well as with experimental thermal conductivity data. As shown in Supplementary Fig. S5, the phonon dispersions calculated in this work exhibit excellent agreement with those reported in Phonondb. The remaining discrepancies are likely attributable to differences in the relaxed structures, particularly the lattice constants. Overall, this comparison further supports the reliability of both datasets. In addition to harmonic properties, an anharmonic phonon property, namely the lattice thermal conductivity at room temperature, was compared with experimental data for 103 single-crystal compounds. While calculated data deviate from experimental data for certain materials, calculated data overall show good agreement with the experimental results, as shown in Supplementary Fig. S6. To further reduce the discrepancies between computational and experimental values, additional factors should be considered in the simulations, including spin–orbit interaction, long-range interactions [93–95], four-phonon scattering, and others.
It is also worth noting that these discrepancies could potentially be reduced by employing machine-learned surrogate models, particularly in cases with substantial deviations, as informed by our experience.

Computational accuracy may be limited for materials with high lattice thermal conductivity and should be interpreted with caution, as the study prioritized generating a large dataset under constrained computational resources. The automated calculations occasionally produce excessively high thermal conductivity values, exceeding several thousand W m⁻¹ K⁻¹, which appear to be unrealistic at this point. These overestimations typically arise from flat phonon bands or acoustic branches. In some instances, phonon modes on flat optical branches exhibit abnormally long lifetimes, while in others, low-frequency acoustic modes display either excessively long lifetimes or unusually high group velocities, as illustrated in Supplementary Fig. S7. To achieve more accurate thermal conductivity estimates, larger supercell sizes (up to 200 atoms) and/or denser q-point meshes are required. Another crucial factor is the inclusion of four-phonon interactions, which are expected to reduce the overestimated phonon lifetimes. Although the direct calculation of four-phonon scattering rates is computationally demanding, employing machine learning techniques to predict their effects [82] represents a promising future direction for enhancing the database. In the subsequent machine learning analysis of anharmonic phonon properties, such implausible data have been excluded.
Details regarding the computational accuracy of first-principles phonon analysis, including the effects of supercell size and the methods used to obtain force constants, are provided in Supplementary Section VIII.

#### Deep learning scaling law for anharmonic phonon properties

Using the database developed in this study, we conducted machine learning predictions for anharmonic phonon properties and investigated how prediction accuracy scales with data size [14,96–99]. Our database enables the machine learning prediction of spectral thermal conductivity, not merely scalar values such as $\kappa_\mathrm{lat}$ at room temperature (300 K). Since modal lattice thermal conductivity depends on mean free path (MFP) and phonon frequency, predicting spectral thermal conductivity is essential for evaluating the effects of nanostructuring [100,101] and interactions with other particles and excitations, including electrons [33], photons [36,37], and magnons [34,35]. Here, we demonstrate predictions for Peierls thermal conductivity ($\kappa_\mathrm{p}$) and cumulative Peierls thermal conductivity ($\kappa_\mathrm{cumul}$) as functions of MFP ($\Lambda$) at 300 K. Additional examples of spectral thermal conductivity predictions as functions of frequency and the maximum phonon frequency are provided in the Supplementary Information (see Supplementary Fig. S10 and related discussion).

In this study, we employed the crystal graph convolutional neural network (CGCNN) [102] to predict scalar quantities, such as thermal conductivity, and the Euclidean neural network (e3nn) [103] to predict spectral functions. In CGCNN, atoms are represented by node features composed of one-hot encodings of nine atomic properties, including group number, period number, and electronegativity, while interatomic distances are encoded as discretized edge features. In e3nn, atomic species are represented by 118-dimensional mass-weighted one-hot vectors, and interatomic relations are described using relative position vectors.
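
To make the e3nn-style input description concrete, the following minimal sketch builds a 118-dimensional mass-weighted one-hot vector and a relative position vector in plain Python. It is an illustration under assumptions, not the authors' preprocessing code: placing the atomic mass at index Z − 1, the truncated `ATOM_DATA` table, the helper names and the example Na–Cl separation are all hypothetical, and periodic images are ignored.

```python
NUM_ELEMENTS = 118

# Truncated illustrative table: symbol -> (atomic number, atomic mass in u).
ATOM_DATA = {
    "C": (6, 12.011),
    "Na": (11, 22.990),
    "Si": (14, 28.085),
    "Cl": (17, 35.45),
}


def mass_weighted_one_hot(symbol):
    """118-dim vector that is zero except for the atomic mass at index Z - 1.

    The index convention is an assumption made for this sketch.
    """
    atomic_number, mass = ATOM_DATA[symbol]
    vector = [0.0] * NUM_ELEMENTS
    vector[atomic_number - 1] = mass
    return vector


def relative_position(position_i, position_j):
    """Vector pointing from atom i to atom j (periodic images ignored)."""
    return tuple(b - a for a, b in zip(position_i, position_j))


na_features = mass_weighted_one_hot("Na")
cl_features = mass_weighted_one_hot("Cl")
# Hypothetical Na-Cl separation along x, in angstrom.
edge_vector = relative_position((0.0, 0.0, 0.0), (2.82, 0.0, 0.0))
print(len(na_features), na_features[10], edge_vector)
```

Compared with a plain one-hot vector, weighting by mass lets the same node feature carry both the element identity and a quantity that directly enters the lattice dynamics.
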
The e3nn framework incorporates the SE(3)-Transformer [104], a state-of-the-art architecture for three-dimensional point clouds and graphs, which is equivariant under continuous 3D roto-translations and rigorously accounts for structural symmetries, including mirror (O(3)) and rotational (SO(3)) symmetries, both of which are crucial for phonon analysis. This method has recently been applied to the prediction of complex phonon properties, including DOS [105] and phonon dispersion [106,107]. Further methodological details are provided in the Methods section.

By performing machine learning predictions for $\kappa_\mathrm{p}$ and normalized $\kappa_\mathrm{cumul}$ ($\kappa_\mathrm{cumul}^\mathrm{norm}(\Lambda)$) at 300 K using various training dataset sizes ($N_\mathrm{train}$), we observed clear scaling behavior with respect to data size, as shown in the left panels of Fig. 3a, b. These results clearly demonstrate the enhancement in prediction accuracy enabled by our database. The relationship between mean absolute error (MAE) and $N_\mathrm{train}$ was fitted using the empirical formula [97]: $(\mathrm{error}) = (N_\mathrm{c}/N_\mathrm{train})^{\alpha}$ ($N_\mathrm{c}, \alpha > 0$), where $N_\mathrm{c}$ is a constant and $\alpha$ is the scaling factor indicating how effectively increased data improves predictive accuracy. The scaling factors were 0.17 for $\kappa_\mathrm{p}$ and 0.14 for $\kappa_\mathrm{cumul}$, as shown in Fig. 3a, b, and ranged from 0.075 to 0.28 for other properties, as illustrated in Supplementary Fig. S6. These values are comparable to those for large language models (0.095) [97] and force prediction tasks in crystalline materials (0.21) [14] (see Supplementary Fig. S6e). As the database continues to expand, the predictive accuracy of surrogate models for large-scale materials screening is expected to improve further. For example, according to the fitted scaling law, the MAE for $\log_{10}\kappa_\mathrm{p}$ is expected to decrease to 0.15 as the training dataset size approaches $2.5 \times 10^{5}$.
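
The power-law form above becomes a straight line in log-log space, so it can be fitted with ordinary least squares. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the data points are synthetic (generated from a made-up $N_\mathrm{c} = 3.0$ and $\alpha = 0.17$ rather than taken from the paper), and the fitting procedure actually used by the authors is not specified in the text.

```python
import math


def fit_scaling_law(train_sizes, errors):
    """Fit error = (N_c / N_train)**alpha by linear regression in log-log space.

    log(error) = alpha * log(N_c) - alpha * log(N_train), i.e. a line with
    slope -alpha and intercept alpha * log(N_c).
    """
    xs = [math.log(n) for n in train_sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    alpha = -slope
    n_c = math.exp(intercept / alpha)
    return n_c, alpha


def predicted_error(n_train, n_c, alpha):
    """Evaluate the scaling law at a given training set size."""
    return (n_c / n_train) ** alpha


def required_size(target_error, n_c, alpha):
    """Invert the scaling law: training size needed to reach target_error."""
    return n_c / target_error ** (1.0 / alpha)


# Synthetic, noise-free points generated from made-up parameters.
sizes = [100, 300, 1000, 3000]
maes = [predicted_error(n, 3.0, 0.17) for n in sizes]
n_c, alpha = fit_scaling_law(sizes, maes)
print(n_c, alpha, required_size(0.15, n_c, alpha))
```

Because the exponent is small, the inverse relation is steep: with $\alpha \approx 0.17$, halving the error requires roughly $2^{1/0.17} \approx 60$ times more training data, which is why extrapolated targets quickly reach the order of 10⁵ materials.
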
Nevertheless, brute-force calculations of anharmonic phonon properties for a number of materials on the order of 10⁵, particularly including higher-order effects such as four-phonon scattering and phonon renormalization, remain impractical. Therefore, further expansion of the database will require machine learning-based acceleration methods, such as machine learning potentials [31,77], to facilitate the efficient evaluation of phonon properties [76,82,108].

The right panels in Fig. 3a, b show representative test cases selected from 50 ensembles for each data size, chosen as those with MAE values closest to the average for the corresponding condition. For instance, when $N_\mathrm{all} \approx 1000$, where $N_\mathrm{all}$ denotes the total number of data points used for training, validation, and testing, the average MAE of $\log_{10}\kappa_\mathrm{p}$ was 0.37, as shown in the left panel of Fig. 3a. The middle panel on the right side of Fig. 3a displays a representative case with an MAE of 0.378. The prediction results in the right panel of Fig. 3a clearly demonstrate that the predicted data points cluster more closely around the parity line as $N_\mathrm{all}$ increases. Similarly, the right panel of Fig. 3b shows that the fluctuations in the predicted curve are reduced with increasing $N_\mathrm{all}$, and the predicted trend aligns more closely with the first-principles results (grey line) for larger datasets.

Fig. 3 | Deep learning scaling law for anharmonic phonon properties as a function of training data size. a Peierls thermal conductivity ($\kappa_\mathrm{p}$) and (b, c) normalized cumulative Peierls thermal conductivity ($\kappa_\mathrm{cumul}^\mathrm{norm}$) were predicted using graph neural networks. The left panels in (a) and (b) show the reduction of mean absolute error (MAE) with increasing data size, demonstrating clear scaling behavior. MAEs were evaluated using $\log_{10}\kappa_\mathrm{p}$ and $\kappa_\mathrm{cumul}^\mathrm{norm}$, respectively. The fitted scaling curve is shown as a grey line, with the corresponding equation displayed at the bottom of each panel. Error bars represent the 90% confidence interval based on 50 ensembles.
The right panels in (a) and (b) show prediction examples at different data sizes, selected based on MAE values closest to the ensemble average. In panel (a), blue, green, and red markers represent training, validation, and test data, respectively. In panel (b), colored lines indicate predicted results, while grey lines show data from first-principles calculations. (c) Prediction results for $\kappa_\mathrm{cumul}^\mathrm{norm}$ using the entire dataset ($N_\mathrm{all} \approx 5000$). The left panel presents the MAE distribution (dotted line) and its cumulative sum (solid line), color-coded by quartile. The right panels display multiple examples of predicted $\kappa_\mathrm{cumul}^\mathrm{norm}$ curves; colored lines indicate predictions, and black lines represent reference calculations.

The exceptional predictive performance for $\kappa_\mathrm{cumul}$ is emphasized in Fig. 3c. As shown in the left panel, 50% (75%) of the test data yielded an MAE for $\log_{10}\kappa_\mathrm{cumul}^\mathrm{norm}$ below 0.05 (0.08). This panel illustrates the MAE distribution, while the right panels provide prediction examples for individual materials. In the right panel, 50% of the predicted curves exhibit excellent agreement (green and blue regions) with the first-principles results (black line), while 75% demonstrate good agreement (orange region). Even for the final group, where the MAE exceeds 0.08 (red region), although the initial value of $\kappa_\mathrm{cumul}$ (i.e., the $\kappa_\mathrm{p}$ contribution from phonons with MFPs shorter than 1 nm) shows a discrepancy, the MFP range where $\kappa_\mathrm{cumul}$ begins to increase remains reasonably well predicted.

#### Screening using the Phonix database

Using prediction models developed from our database, we screened materials with high and low thermal conductivity from the GNoME database [14], which contains 381,000 novel crystal structures. The Peierls thermal conductivity ($\kappa_\mathrm{p}$) for all materials was evaluated as the average of 20 ensemble predictions.
Magnetic materials, including those containing transition metals, were included in the screening. Although magnetic effects can affect lattice thermal conductivity in three-dimensional systems with Curie temperatures close to room temperature [109], they are generally secondary to phonon–phonon scattering because of the abundance and strength of phonon–phonon interactions [95,110]. Each model was trained on 3000 anharmonic phonon data points, divided into 2400 for training, 300 for validation, and 300 for testing. Following the screening, phonon properties, including $\kappa_\mathrm{p}$, were computed for 169 selected materials (148 with the highest $\kappa_\mathrm{p}$ and 21 with the lowest) using the auto-kappa workflow.

An analysis of the validation results for the screened materials revealed several insights regarding prediction accuracy, as shown in Fig. 4a. The predicted $\kappa_\mathrm{p}$ values for low-thermal-conductivity materials in the GNoME database showed accuracy comparable to that of the full dataset (MAE: 0.27 for $\log_{10}\kappa_\mathrm{p}$), with low variability in the predictions, as illustrated in Fig. 3a. In contrast, the prediction accuracy for high-$\kappa_\mathrm{p}$ materials was notably lower (MAE: 0.68), and the predictions exhibited greater variability. Although definitive conclusions are limited by the relatively small number of computed data points, these results suggest that high-κ predictions are more challenging. From a machine learning standpoint, this difficulty likely stems from the simpler structural characteristics of high-κ materials, which typically contain fewer atoms and atomic species in their primitive cells. Consequently, these materials offer less structural information for learning compared to low-κ materials, which often have complex frameworks, such as skutterudites and clathrates [23,111,112]. Predicting material properties from such sparse structural information is inherently more difficult.
From a physical perspective, accurately estimating high-κ values demands rigorous treatment of anharmonic phonon interactions and highly converged computational parameters, such as dense q-point meshes, since even small errors in force constants can significantly impact the results. Nonetheless, the predicted candidates remain promising for high-κ applications.

Fig. 4 | Screening of high- and low-thermal-conductivity materials from the GNoME database [14], which includes approximately 381,000 novel structures. (a) Parity plot comparing predicted and calculated values of Peierls thermal conductivity ($\kappa_\mathrm{p}$). Blue and red markers represent materials predicted to exhibit high and low thermal conductivity, respectively, using models trained on the constructed Phonix database. Error bars indicate the 90% confidence interval from 20 ensemble predictions. The solid line denotes the parity line. (b) and (c) display crystal structures [135] with $\kappa_\mathrm{p}^{3\mathrm{ph}} > 200$ W m⁻¹ K⁻¹ and the four lowest-κ structures. For each material, the chemical formula, space group number (in parentheses), and GNoME database ID are provided. (d) and (e) present phonon properties of hexagonal NpPH and trigonal Cs₆Rb₂SnPbI₁₂, which exhibit the highest ($\kappa_\mathrm{lat}^{3\mathrm{ph}\,(3+4\mathrm{ph})} \approx 280\ (80)$ W m⁻¹ K⁻¹) and lowest ($\kappa_\mathrm{lat} \approx 0.15$ W m⁻¹ K⁻¹) lattice thermal conductivities ($\kappa_\mathrm{lat} = \kappa_\mathrm{p} + \kappa_\mathrm{c}$), respectively. In the case of high-κ materials, both three-phonon (3ph) and four-phonon (4ph) scattering were taken into account. The panels include phonon dispersion, total and partial DOS, phonon lifetime ($\tau$), spectral (green) and cumulative (blue) Peierls thermal conductivity for each, as well as labels such as the chemical formula, space group (in parentheses), material ID, and lattice thermal conductivities ($\kappa_\mathrm{p}$ and $\kappa_\mathrm{c}$) along different directions in units of W m⁻¹ K⁻¹. While the maximum phonon frequency of the high-κ material in (d) exceeds 1000 cm⁻¹, properties are shown up to 400 cm⁻¹. Full-range phonon properties are available in Supplementary Fig. S11.
Spectral and cumulative thermal conductivity are normalized by the maximum and total Peierls conductivities, respectively.

By screening materials with high and low κ, we identified three compounds with $\kappa_\mathrm{lat}^{3\mathrm{ph}} > 200$ W m⁻¹ K⁻¹ and nine with $\kappa_\mathrm{lat}^{3\mathrm{ph}} < 0.2$ W m⁻¹ K⁻¹, as shown in Supplementary Fig. S7, where the superscript "3ph" denotes three-phonon scattering. Among the predicted materials, the highest and lowest calculated lattice thermal conductivities ($\kappa_\mathrm{lat}^{3\mathrm{ph}} = \kappa_\mathrm{p}^{3\mathrm{ph}} + \kappa_\mathrm{c}^{3\mathrm{ph}}$) were 284 W m⁻¹ K⁻¹ for the xx and yy components of hexagonal NpPH, and 0.14 W m⁻¹ K⁻¹ for trigonal Cs₆Rb₂SnPbI₁₂, respectively, where $(\kappa_{\mathrm{p},\{xx/yy\}}, \kappa_\mathrm{c}) = (0.031, 0.11)$ W m⁻¹ K⁻¹. Although we did not find materials that surpassed known record values, the results highlight the potential for future discovery of record-breaking compounds. Importantly, the identified candidates offer valuable insights into the structural and compositional characteristics of both high- and low-κ materials. Discovering materials at the extremes of thermal conductivity is inherently challenging, as machine learning models typically excel at interpolation but struggle with extrapolation [113–115]. Therefore, further advancement in automated high-throughput calculations will be critical for identifying such extreme materials in future studies.

In the three-phonon calculations, high thermal conductivity values (≳200 W m⁻¹ K⁻¹) were observed in hydrogen-containing hexagonal ternary compounds belonging to space group 194 ($P6_3/mmc$), such as NpPH ($\kappa_{\mathrm{p},zz}^{3\mathrm{ph}} = 172$, $\kappa_{\mathrm{p},xx/yy}^{3\mathrm{ph}} = 277$, $\kappa_\mathrm{c}^{3\mathrm{ph}} = 6.9$ W m⁻¹ K⁻¹), PaPH ($\kappa_{\mathrm{p},zz}^{3\mathrm{ph}} = 173$, $\kappa_{\mathrm{p},xx/yy}^{3\mathrm{ph}} = 264$, $\kappa_\mathrm{c} = 0.0037$ W m⁻¹ K⁻¹), and PuHS ($\kappa_{\mathrm{p},xx/yy/zz} = 216$, $\kappa_\mathrm{c} = 0.012$ W m⁻¹ K⁻¹), as shown in Fig. 4b and Supplementary Fig. S11a. When four-phonon scattering is taken into account, the thermal conductivity is reduced to $\kappa_{\mathrm{p},xx/yy}^{3+4\mathrm{ph}} = 78$, 59, and 51 W m⁻¹ K⁻¹ for NpPH, PaPH, and PuHS, respectively.
The origin of their relatively high thermal conductivity nevertheless remains to be elucidated. These materials are characterized by heavy atoms surrounded by light atoms, including hydrogen. The phonon dispersion and DOS in Fig. 4d clearly show that phonon modes associated with heavy atoms (Np) and those associated with light atoms (P and H) are completely separated into different frequency ranges: modes of heavy atoms appear at low frequencies (< 200 cm⁻¹) and those of light atoms appear at high frequencies. This complete separation of phonon modes by different atomic species in energy space is expected to suppress anharmonic interactions between phonon modes within their respective frequency ranges, similar to other high-κ materials such as BAs [116,117]. Consequently, the phonon lifetimes of acoustic modes primarily composed of heavy atoms remain long, contributing dominantly to the overall heat transport, as shown in the last two panels of Fig. 4d. In contrast, the crystal structures of low-κ materials are significantly more complex, as illustrated by the examples in Fig. 4c such as Cs₆Rb₂SnPbI₁₂ ($\kappa_{\mathrm{p},zz} = 0.049$, $\kappa_{\mathrm{p},xx/yy} = 0.032$, $\kappa_\mathrm{c} = 0.11$ W m⁻¹ K⁻¹), CsAgS₆ ($\kappa_{\mathrm{p},xx/yy/zz} = 0.013$, $\kappa_\mathrm{c} = 0.141$ W m⁻¹ K⁻¹), K₃AgSe₁₃ ($\kappa_{\mathrm{p},xx/zz} = 0.030$, $\kappa_{\mathrm{p},yy} = 0.048$, $\kappa_\mathrm{c} = 0.17$ W m⁻¹ K⁻¹), and Cs₆K₂SnPbI₁₂ ($\kappa_{\mathrm{p},zz} = 0.057$, $\kappa_{\mathrm{p},xx/yy} = 0.039$, $\kappa_\mathrm{c} = 0.11$ W m⁻¹ K⁻¹). Notably, six of the nine discovered low-κ materials contain cesium, whose alloy (α-CsPbBr₃) is known for its intrinsically low thermal conductivity [51]. In these low-κ materials, phonon modes formed by a mixture of atomic species are distributed across a wide frequency range, as illustrated in Fig. 4e and Supplementary Fig. S11b, in stark contrast to the more localized mode behavior seen in high-κ materials.
Although several attempts have been made to synthesize related materials, including actinide hydrides [118–120], the compounds identified in this screening, particularly those with high $\kappa_\mathrm{lat}$, may present significant challenges for experimental synthesis. Nevertheless, the above discussion provides concrete insight into the synthesis of highly thermally conductive materials. For example, realizing similar phenomena with transition metals, rather than actinides, could enable high thermal conductivity in compounds that are more amenable to experimental synthesis.

### Discussion

In conclusion, we developed an automated software package, auto-kappa, and constructed a large-scale first-principles database for anharmonic phonon interactions (Phonix), encompassing more than 6500 materials with diverse crystal structures. Using this database, we demonstrated a clear scaling law linking dataset size to predictive performance for key anharmonic phonon properties, including lattice and spectral thermal conductivities. Furthermore, by screening a vast crystal structure database, we identified promising candidates for both high and low thermal conductivity applications. Although future improvements, such as the inclusion of higher-order anharmonic effects like four-phonon scattering and phonon renormalization, are necessary for more accurate assessments, this study establishes a strong foundation for data-driven discovery of thermofunctional materials with wide-ranging technological relevance, including applications in superconductivity, spintronics, and beyond.

### Methods

#### Automated workflow for anharmonic phonon calculations

Phonon calculations based on first-principles methods involve a considerably more complex workflow than typical calculations of total energy, electronic band structures, or electronic conductivity within the constant relaxation time approximation.
To facilitate the construction of an anharmonic phonon property database, we developed auto-kappa, a Python-based automation software for first-principles analysis of anharmonic phonon properties. Auto-kappa streamlines the intricate workflow (illustrated in Fig. 1a) for computing anharmonic phonon properties by integrating the Vienna Ab Initio Simulation Package (VASP) [86] for electronic structure calculations and the phonon analysis software ALAMODE [45].

Through automated calculations, the auto-kappa software utilizes various existing libraries and packages in addition to VASP (≥6.3.2) and ALAMODE (versions 1.4–1.5). Crystal structures were handled using the Atomic Simulation Environment (ASE) [121] (≥3.22) and Pymatgen [7] (≥2023.8.10). Symmetry operations were performed using Spglib [122] (≥2.3.1), Pymatgen [7], and modules from Phonopy [52,123] (≥2.20). VASP calculations, including input file generation and job submission, were managed using ASE and the Custodian package [7] (≥2023.10.9). The phonon dispersion path was determined using the SeeK-path library [122,124].

The integration of various libraries, such as those listed above, enables researchers to perform first-principles phonon calculations with significantly reduced manual effort. Using auto-kappa, the database was generated through the following procedure, which follows the workflow illustrated in Fig. 1a.

##### Step 1: Symmetry analysis of the crystal structure

The primitive, conventional, and supercells of the input crystal structure were first determined. The conventional cell was selected to have a compact shape while maintaining resemblance to a regular hexahedron. The supercell was then generated from the conventional cell, with a target of maximizing the number of atoms (up to a limit of 150 atoms) while maintaining geometric similarity to a regular hexahedron. The resulting supercell was used for force calculations required for both harmonic and cubic force constants (steps iv and vi, respectively).
However, when imaginary frequencies appeared, larger supercells were employed specifically for the harmonic force constant calculations.

##### Step 2: Structure optimization

The accurate calculation of atomic forces using supercells in a later step is crucial for obtaining a reliable phonon analysis. Therefore, the shape and atomic positions in the crystal structure were carefully optimized through a rigorous procedure. Although both primitive and conventional cells can be used for this purpose, we chose the conventional cell to ensure consistency in the basis wavefunctions with those used in the supercell-based phonon calculations. While the primitive cell offers computational efficiency and better symmetry preservation, the conventional cell provides a more consistent basis set across all simulation steps.

The structure optimization was performed in three steps: two successive full relaxations, allowing for optimization of both the cell shape/volume and atomic positions, followed by a final atomic relaxation with the cell shape and volume fixed. Because changes in the cell can affect the optimal basis set of wavefunctions, performing two full relaxations helps mitigate the impact of basis fluctuations. Once the cell shape and size were determined, the atomic positions were further relaxed in a single-step calculation.

##### Step 3: Calculation of Born effective charges

The Born effective charges were calculated using a first-principles approach to apply non-analytic corrections in subsequent phonon analyses. For harmonic phonon properties, such as phonon dispersion and DOS, the non-analytic correction was initially applied using the mixed-space approach [125]. This correction primarily affects the splitting between longitudinal optical (LO) and transverse optical (TO) modes (LO–TO splitting), but in some cases, it also influences the phonon stability of certain materials.
When imaginary phonon frequencies were observed, the method for applying the non-analytic correction was modified: first by using the damping method [126] and, if necessary, by switching to the Ewald method [127].

##### Step 4: Calculation of harmonic force constants

Harmonic interatomic force constants were calculated using the finite-displacement method (also known as the brute-force method), in which atomic displacement patterns were generated in a supercell, and the resulting atomic forces were computed for each pattern. For these calculations, a single atom was displaced within the supercell, and the displacement patterns were determined based on crystal symmetry. The number of displacement patterns required for harmonic force constants is relatively small compared to those needed for higher-order force constants, allowing the finite-displacement method to be directly applied. The displacement magnitude was set to a small value (0.01 Å) to minimize the influence of anharmonic effects. Harmonic force constants were then obtained using a least-squares fitting procedure. If the fitting error exceeded 10%, the data were excluded from the analyses presented in this paper. No cutoff distance was imposed on the harmonic force constants in order to account for all possible atomic interactions within the supercell.

To ensure accurate force calculations within the first-principles framework, it is important to evaluate the nonlocal part of the pseudopotential in reciprocal space rather than in real space. While using projector operators in real space can reduce computational cost for large supercells, it introduces aliasing errors due to wavefunction projection. Therefore, in our developed software, projector operators are consistently evaluated in reciprocal space by setting ‘LREAL=FALSE’ in the VASP calculations.

##### Step 5: Analysis for harmonic phonon properties

Using harmonic force constants, harmonic phonon properties, including phonon dispersion and DOS, were calculated.
As described in the section on the Born effective charge, different approaches were applied to include non-analytic corrections when necessary to eliminate imaginary frequencies. For the DOS calculation, the reciprocal space mesh density for the phonon wavevector (q-mesh) was set to 1500 q-points per reciprocal atom (q-points·Å³/atom). For example, the q-mesh for diamond-structured silicon was set to 21 × 21 × 21.

##### Step 6: Calculation of cubic force constants

If the structure exhibited no imaginary frequencies, the calculation of cubic force constants was performed following the harmonic phonon property analysis. To obtain cubic force constants, the finite-displacement method typically requires a significantly larger number of displacement patterns: on average, ~100 times more than those needed for harmonic force constants. Therefore, a cutoff distance was imposed for the cubic force constants, which was set to the larger of 4.3 Å and the third-shortest interatomic distance. Additionally, while the finite-displacement and least-squares methods were used when the number of required displacement patterns was below a predefined threshold (set to 100 patterns), the least absolute shrinkage and selection operator (LASSO) regression [128] was employed otherwise to estimate cubic force constants from randomly generated displacement patterns. The harmonic force constants were fixed to the values obtained from the previous calculation (step iv) during the LASSO regression. If the fitting error for the least-squares method or the residual force for the LASSO regression exceeded 10%, the data were excluded from the discussion, as was done for harmonic force constants.

The number of generated random displacement patterns was determined using the formula $N_\mathrm{pattern}^\mathrm{rand} = \alpha N_\mathrm{FC3}/N_\mathrm{atom}^\mathrm{sc}$, where $N_\mathrm{FC3}$ is the number of unique cubic force constants, $N_\mathrm{atom}^\mathrm{sc}$ is the number of atoms in the supercell, and $\alpha$ is a coefficient greater than $1/3$; in this study, it was set to 1.0.
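The pattern count above can be sketched in a few lines of Python. Each displacement pattern yields three force components per supercell atom, so $N_\mathrm{pattern}^\mathrm{rand}$ patterns provide about $3\alpha N_\mathrm{FC3}$ linear equations; the lower bound $\alpha > 1/3$ presumably keeps the number of equations above the number of unknown cubic force constants. The function names, the rounding up to an integer, and the example numbers are assumptions for illustration and do not reproduce auto-kappa's actual interface.

```python
import math


def n_random_patterns(n_fc3, n_atoms_supercell, alpha=1.0):
    """Number of random displacement patterns, alpha * N_FC3 / N_atom^sc.

    Rounding up to the next integer is an assumption of this sketch.
    """
    if alpha <= 1.0 / 3.0:
        raise ValueError("alpha must be greater than 1/3")
    return math.ceil(alpha * n_fc3 / n_atoms_supercell)


def n_force_equations(n_patterns, n_atoms_supercell):
    """Each pattern gives three force components for every supercell atom."""
    return 3 * n_atoms_supercell * n_patterns


# Hypothetical example: 3000 unique cubic force constants, 120-atom supercell.
n_fc3 = 3000
n_atoms = 120
n_patterns = n_random_patterns(n_fc3, n_atoms, alpha=1.0)
n_equations = n_force_equations(n_patterns, n_atoms)
print(n_patterns, n_equations)
```

In this hypothetical case the fit has three times as many force equations as unknowns, which is the overdetermination implied by $\alpha = 1.0$.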
To generate a random displacement pattern, a random displacement was applied to each atom. The displacement magnitude for cubic calculations was set to 0.01 or 0.03 Å per atom for both the finite-displacement method and the LASSO approach, which is larger than the value used for harmonic calculations.

##### Step 7: Analysis for anharmonic phonon properties

Using the cubic force constants obtained in the previous step, we analyzed anharmonic phonon properties. To assess convergence with respect to the q-mesh size, the q-mesh density was varied from 500 to 1000 to 1500 q-points·Å³/atom. The effect of three-phonon scattering was estimated by solving the phonon transport Boltzmann equation under the relaxation time approximation. Phonon scattering by natural isotopes was also considered and incorporated using Matthiessen's rule. Finally, various anharmonic phonon properties were obtained, including mode-dependent lifetimes; spectral and cumulative thermal conductivities ($\kappa_\mathrm{spec}$ and $\kappa_\mathrm{cumul}$) as functions of frequency and mean free path; and temperature-dependent thermal conductivities for both Peierls ($\kappa_\mathrm{p}$) and coherence ($\kappa_\mathrm{c}$) [59] contributions, as illustrated in Fig. 1b. For details, please refer to Section I of the Supplementary Information.

##### Step 8: Strict structure optimization

If imaginary frequencies were observed in the harmonic phonon analysis during process (iv), a strict structural optimization was performed. In this step, the volume of the crystal structure was modified by applying hydrostatic strain, and the corresponding structural energies were calculated. After evaluating energies at different volumes, the Birch-Murnaghan equation of state [129,130] was used to determine the volume that minimized the structural energy.
Once the newly optimized structure was obtained, the procedure was restarted from process (iii).

##### Step 9: Use of larger supercell for harmonic force constants

If the strictly optimized structure still exhibited imaginary frequencies, a larger supercell was used for calculating harmonic force constants. The maximum limit for this second harmonic force constant analysis was set to 200 atoms, an increase of 50 atoms from the original setting. If this step successfully eliminated imaginary frequencies, cubic force constants were then calculated. While a larger supercell was used for harmonic force constants in this case, the original supercell size (fewer than 150 atoms) was retained for estimating cubic force constants. The harmonic force constants obtained using the original supercell were kept fixed during the estimation of cubic force constants.

##### Step 10: Phonon renormalization

The process for phonon renormalization using self-consistent phonon (SCP) theory [55,56] was also implemented in auto-kappa, although this process was not performed in the present study. Using the SCP approach, temperature-dependent effective harmonic force constants can be calculated by incorporating the effects of phonon renormalization due to the fourth-order potential. Phonon renormalization can eliminate imaginary frequencies in certain cases [56,112], and should also be considered for accurately estimating low thermal conductivity.

#### Parameters for first-principles calculations

For all first-principles simulations described above, the following conditions were applied. The k-mesh was determined by $N_i = \max[1, \mathrm{int}(l_k \cdot |\mathbf{b}_i|)]$, following the method recommended by VASP. Here, $l_k$ is a length scale that determines the number of subdivisions along each reciprocal lattice direction and is set to 20 Å, and $\mathbf{b}_i$ is the reciprocal lattice vector along the $i$ direction ($i = k_x, k_y, k_z$).
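The k-mesh rule above can be illustrated with a short, dependency-free Python sketch. It assumes that the reciprocal lattice vectors are defined without the factor of $2\pi$ (so that $l_k |\mathbf{b}_i|$ is dimensionless for $l_k$ in Å) and that `int` truncates toward zero; both points, as well as the function names and the example cell, are assumptions of this sketch rather than details confirmed by the text.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def reciprocal_lengths(cell):
    """Lengths |b_i| of the reciprocal lattice vectors (no 2*pi factor).

    cell: three lattice vectors (rows) in angstrom.
    """
    a1, a2, a3 = cell
    volume = abs(sum(x * y for x, y in zip(a1, cross(a2, a3))))
    recip = (cross(a2, a3), cross(a3, a1), cross(a1, a2))
    return [math.sqrt(sum(c * c for c in b)) / volume for b in recip]


def kmesh(cell, l_k=20.0):
    """Subdivisions N_i = max(1, int(l_k * |b_i|)) along each direction."""
    return tuple(max(1, int(l_k * b)) for b in reciprocal_lengths(cell))


# Hypothetical tetragonal cell: a = b = 4 angstrom, c = 30 angstrom.
cell = ((4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 30.0))
mesh = kmesh(cell)
print(mesh)
```

The `max(1, ...)` guard matters for long cell axes such as supercells, where $l_k |\mathbf{b}_i|$ drops below one and a single k-point along that direction is used.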
The Γ-centered scheme was used to generate the k-mesh. The Perdew-Burke-Ernzerhof exchange-correlation functional revised for solids (PBEsol) [131] with the projector augmented wave (PAW) potential [132,133] was employed. The cutoff energy for VASP calculations was set to 1.3 times the recommended value provided in the VASP pseudopotential files.

#### Machine learning prediction of phonon properties

We employed the crystal graph convolutional neural network (CGCNN) [102] to predict the Peierls conductivity ($\kappa_\mathrm{p}$) and the graph neural network based on the Euclidean neural network (e3nn) [103,105] to predict spectral functions and cumulative Peierls conductivity ($\kappa_\mathrm{cumul}$) as a function of the phonon mean free path ($\Lambda$). In both graph neural network approaches, nodes and edges correspond to atoms and bonds within the crystal, respectively.

The node descriptors in CGCNN consist of one-hot encodings of nine atomic properties, including group number, period number, electronegativity, and covalent radius, as also described in the main text. In contrast, the e3nn approach employs a simpler node descriptor: a 118-dimensional mass-weighted one-hot encoding based solely on atomic species and their masses. For edge descriptors, CGCNN utilizes a 10-dimensional encoding based on interatomic distances categorized into discrete intervals, whereas e3nn encodes edges using full three-dimensional relative position vectors between neighboring atoms, explicitly capturing both geometric and directional information. The cutoff bond lengths were set to 6.0 Å and 4.3 Å for CGCNN and e3nn, respectively.

Both graph neural networks employ multiple convolutional layers to update atomic features by aggregating local atomic environments. In CGCNN, three graph convolutional layers sequentially update node features using information from up to 12 nearest neighbors.
A pooling layer aggregates atomic-level features into a global crystal representation, which is subsequently mapped to scalar material properties through fully connected layers. The e3nn approach utilizes convolutional layers constructed from spherical harmonics and learnable radial basis functions, designed to ensure equivariance under rotations, translations, and inversions. The network typically includes two equivariant convolutional layers followed by gated nonlinearity blocks tailored for tensorial data. After convolution and activation, atomic features are aggregated to form a global descriptor, which is directly mapped to continuous spectral functions, namely the cumulative ($\kappa_\mathrm{cumul}^\mathrm{norm}$) and spectral ($\kappa_\mathrm{spec}^\mathrm{norm}$) thermal conductivities.

The neural networks were trained using the Adam optimizer [134]. For CGCNN, the learning rate was set to 0.0001, and early stopping was applied with a patience of 50 epochs. While the prediction performance of CGCNN was relatively insensitive to hyperparameter choices, the hyperparameters for the e3nn approach, particularly the learning rate, were carefully tuned. The initial learning rate was set to $5.0/N_\mathrm{all}$ and decayed by a factor of 0.95 per epoch until it reached a minimum of $1.5/N_\mathrm{all}$, where $N_\mathrm{all}$ denotes the total number of data points, including training, validation, and test sets. Early stopping was applied with a patience of 100 epochs during e3nn training.

In both cases, the simulation dataset was split into training (80%), validation (10%), and test (10%) sets based on materials, as the Phonix database contains duplicated material entries. This material-based splitting was adopted to prevent data leakage in model training. The training data were used to develop the prediction model, while the validation data were used to tune hyperparameters and prevent overfitting. The test data were employed to evaluate the prediction error.
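A material-based split of this kind can be sketched as follows. Entries are first grouped by a material key, the keys are shuffled, and whole groups are assigned to one subset, so duplicated entries of the same material never straddle the training and test sets. The function name, the key format, and the choice to apply the 80/10/10 fractions to the number of materials (rather than the number of entries) are assumptions of this sketch, not the procedure used in the paper's scripts.

```python
import random
from collections import defaultdict


def split_by_material(entries, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split (material_key, record) pairs so that each material is in one subset.

    The third fraction is implied: materials not assigned to training or
    validation go to the test set.
    """
    groups = defaultdict(list)
    for material_key, record in entries:
        groups[material_key].append(record)

    keys = sorted(groups)                 # deterministic order before shuffling
    random.Random(seed).shuffle(keys)

    n_train = round(fractions[0] * len(keys))
    n_val = round(fractions[1] * len(keys))
    parts = {
        "train": keys[:n_train],
        "val": keys[n_train:n_train + n_val],
        "test": keys[n_train + n_val:],
    }
    return {
        name: [(key, rec) for key in members for rec in groups[key]]
        for name, members in parts.items()
    }


# Hypothetical dataset: 10 materials, each with two duplicated entries.
entries = [(f"mat-{i}", f"calc-{i}-{j}") for i in range(10) for j in range(2)]
splits = split_by_material(entries)
print({name: len(items) for name, items in splits.items()})
```

Splitting at the entry level instead would let near-identical structures appear on both sides of the split and make the test error look better than it is.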
The size of the simulation dataset was varied from 100 to the full dataset (~5000 samples), and 20 ensembles were generated to assess the fluctuation in prediction performance. Log scaling and normalization were applied to the target values for $\kappa_\mathrm{p}$ and $\kappa_\mathrm{cumul}(\Lambda)$, respectively. Therefore, if the absolute value of $\kappa_\mathrm{cumul}(\Lambda)$ is required, it can be reconstructed by combining the two predictions.

For the prediction of $\kappa_\mathrm{cumul}$, the data were prepared over a range from 1 nm to 100 µm, sampled at 51 logarithmically spaced points. The performance of the prediction model was evaluated using the mean absolute error (MAE). The MAE for each material was computed as $|\kappa_\mathrm{p}^\mathrm{calc} - \kappa_\mathrm{p}^\mathrm{pred}|$ for $\kappa_\mathrm{p}$, and as $\sum_{\Lambda} |\kappa_\mathrm{cumul}^\mathrm{calc}(\Lambda) - \kappa_\mathrm{cumul}^\mathrm{pred}(\Lambda)|$ for $\kappa_\mathrm{cumul}(\Lambda)$, where the superscripts "calc" and "pred" refer to the calculated and predicted values, respectively. The final MAE was obtained by averaging over the entire test dataset. After calculating the MAE for various training data sizes ($N_\mathrm{train}$), the scaling law was determined by fitting the relationship using the function $(\mathrm{MAE}) = (N_c/N_\mathrm{train})^{\alpha}$ $(N_c, \alpha > 0)$, where $N_c$ is a constant and $\alpha$ is the scaling factor indicating how efficiently increasing the data size improves prediction accuracy.

For data curation, we removed data with (i) excessively high thermal conductivity (>2000 W m⁻¹ K⁻¹), (ii) both a large phonon gap (>10 cm⁻¹) and high thermal conductivity (>500 W m⁻¹ K⁻¹), and (iii) large fitting errors in the harmonic and cubic force constants (>10%) for the analysis using the e3nn model. For the analysis using the CGCNN model, only the first criterion was applied. This difference in the applied criteria explains why the numbers of available data points differ between the two models (7308 for the CGCNN model and 7244 for the e3nn model), as shown in Fig. 3 and Supplementary Fig. S8.
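The three curation criteria can be written as simple predicates, as in the sketch below. The dictionary keys, the sample entries, and the reading of criterion (iii) as "either the harmonic or the cubic fitting error exceeds 10%" are assumptions made for illustration; the thresholds are those quoted in the text.

```python
def passes_cgcnn_curation(entry):
    """Criterion (i) only: drop excessively high thermal conductivity."""
    return entry["kappa"] <= 2000.0  # W m^-1 K^-1


def passes_e3nn_curation(entry):
    """Criteria (i)-(iii) as described for the e3nn analysis."""
    if entry["kappa"] > 2000.0:                                   # (i)
        return False
    if entry["phonon_gap"] > 10.0 and entry["kappa"] > 500.0:     # (ii) gap in cm^-1
        return False
    if max(entry["fit_error_fc2"], entry["fit_error_fc3"]) > 0.10:  # (iii)
        return False
    return True


# Hypothetical entries (not taken from the Phonix database).
entries = [
    {"id": "A", "kappa": 12.0, "phonon_gap": 0.0, "fit_error_fc2": 0.01, "fit_error_fc3": 0.03},
    {"id": "B", "kappa": 2500.0, "phonon_gap": 0.0, "fit_error_fc2": 0.01, "fit_error_fc3": 0.02},
    {"id": "C", "kappa": 800.0, "phonon_gap": 50.0, "fit_error_fc2": 0.02, "fit_error_fc3": 0.04},
    {"id": "D", "kappa": 3.0, "phonon_gap": 20.0, "fit_error_fc2": 0.02, "fit_error_fc3": 0.15},
]

cgcnn_ids = [e["id"] for e in entries if passes_cgcnn_curation(e)]
e3nn_ids = [e["id"] for e in entries if passes_e3nn_curation(e)]
print(cgcnn_ids, e3nn_ids)
```

Because the e3nn filter is stricter, its retained set is always a subset of the CGCNN set, consistent with the smaller e3nn data count reported above.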
The second criterion was applied, as high thermal conductivity is often suppressed by four-phonon scattering.

### Data availability

The dataset used for machine learning prediction, along with the Python scripts employed in this study, is available in the GitHub repository at https://github.com/masato1122/phonon_e3nn. Phonix, a database for anharmonic phonon interactions, will be made available on ARIM-mdx at https://phonix-db.org.

### Code availability

Software for the automated calculation of anharmonic phonon properties (auto-kappa), as well as for the machine learning prediction of these properties, will be made available in the GitHub repository at https://github.com/masato1122/auto-kappa.

Received: 29 April 2025; Accepted: 26 February 2026;

### References

1. Nishijima, M. et al. Accelerated discovery of cathode materials with prolonged cycle life for lithium-ion battery. Nat. Commun. 5, 4553 (2014).
2. Ling, C. A review of the recent progress in battery informatics. npj Comput. Mater. 8, 33 (2022).
3. Wang, Y. et al. Design principles for solid-state lithium superionic conductors. Nat. Mater. 14, 1026–1031 (2015).
4. Zavyalova, U., Holena, M., Schlögl, R. & Baerns, M. Statistical analysis of past catalytic data on oxidative methane coupling for new insights into the composition of high-performance catalysts. ChemCatChem 3, 1935–1947 (2011).
5. Kusne, A. G. et al. On-the-fly machine-learning for high-throughput experiments: search for rare-earth-free permanent magnets. Sci. Rep. 4, 6367 (2014).
6. Jain, A. et al. Commentary: the Materials Project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
7. Ong, S. P. et al. Python Materials Genomics (pymatgen): a robust, open-source python library for materials analysis. Comput. Mater. Sci.
68, 314–319 (2013).
8. Ong, S. P. et al. The Materials Application Programming Interface (API): a simple, flexible and efficient API for materials data based on REpresentational State Transfer (REST) principles. Comput. Mater. Sci. 97, 209–215 (2015).
9. Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with high-throughput density functional theory: the open quantum materials database (OQMD). JOM 65, 1501–1509 (2013).
10. Kirklin, S. et al. The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies. npj Comput. Mater. 1, 15010 (2015).
11. Taylor, R. H. et al. A RESTful API for exchanging materials data in the AFLOWLIB.org consortium. Comput. Mater. Sci. 93, 178–192 (2014).
12. Dan, Y. et al. Generative adversarial networks (GAN) based efficient sampling of chemical composition space for inverse design of inorganic materials. npj Comput. Mater. 6, 84 (2020).
13. Zhao, Y. et al. High-throughput discovery of novel cubic crystal materials using deep generative neural networks. Adv. Sci. 8, 2100566 (2021).
14. Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
15. Barroso-Luque, L. et al. Open Materials 2024 (OMat24) inorganic materials dataset and models. arXiv https://doi.org/10.48550/arxiv.2410.12771 (2024).
16. Togo, A. Phonondb. https://github.com/atztogo/phonondb.
17. Toher, C. et al. High-throughput computational screening of thermal conductivity, Debye temperature, and Grüneisen parameter using a quasiharmonic Debye model. Phys. Rev. B 90, 174107 (2014).
18. Katsura, Y. et al. Data-driven analysis of electron relaxation times in PbTe-type thermoelectric materials. Sci. Technol. Adv. Mater.
20,511–520 (2019).19. Xu, Y., Yamazaki, M. & Villars, P. Inorganic materials database forexploring the nature of material. Jpn. J. Appl. Phys. 50, 11RH02(2011).20. Poudel, B. et al. High-thermoelectric performanceof nanostructuredbismuth antimony Telluride bulk alloys. Science 320, 634–638(2008).21. Miura, A., Zhou, S. & Nozaki, T. Crystalline–amorphous siliconnanocomposites with reduced thermal conductivity for bulkthermoelectrics. ACS Appl. Mater. Interfaces 7, 13484–13489(2015).22. Ångqvist, M. & Erhart, P. Understanding chemical ordering inintermetallic clathrates fromatomic scale simulations.Chem.Mater.29, 7554–7562 (2017).23. Ohnishi, M. et al. Enhancing the thermoelectric performance of Si-based clathrates via carrier optimization considering finitetemperature effects. Chem. Mater. 36, 10595–10604 (2024).24. Tamura, S. Isotope scattering of dispersive phonons in Ge. Phys.Rev. B 27, 858–866 (1983).25. Protik, N. H. & Draxl, C. Beyond the Tamura model of phonon-isotope scattering. Phys. Rev. B 109, 165201 (2024).26. Ohnishi, M., Shiga, T. & Shiomi, J. Effects of defects onthermoelectric properties of carbon nanotubes. Phys. Rev. B 95,155405 (2017).27. Yamawaki, M., Ohnishi, M., Ju, S. & Shiomi, J. Multifunctionalstructural design of graphene thermoelectrics by Bayesianoptimization. Sci. Adv 4, eaar4192 (2018).28. Ohnishi, M. & Shiomi, J. Strain-induced band moudlation of thermalphonons in carbon nanotubes. Phys. Rev. B 104, 014306 (2021).29. Kodama, T. et al.Modulationof thermal and thermoelectric transportin individual carbon nanotubes by fullerene encapsulation. Nat.Mater. 16, 892–897 (2017).30. Heremans, J. P. & Martin, J. Thermoelectric measurements. Nat.Mater. 23, 18–19 (2024).31. Li, J. et al. Probing the limit of heat transfer in inorganic crystals withdeep learning. arXiv https://doi.org/10.48550/arxiv.2503.11568(2025).32. Yang, H. et al. MatterSim: A deep learning atomistic model acrosselements, temperatures and pressures. 
arXiv https://doi.org/10.48550/arxiv.2405.04967 (2024).33. Ziman, J. M. Electrons and Phonons: The Theory of TransportPhenomena inSolids (OxfordUniversityPress, 2001) https://doi.org/10.1093/acprof:oso/9780198507796.001.0001.34. Uchida, K. et al. Observation of the spin seebeck effect.Nature 455,778–781 (2008).35. Maekawa, S.,Maekawa, S., Valenzuela, S.O., Saitoh, E. &Kimura, T.Spin Current (Oxford University Press, 2012) https://doi.org/10.1093/acprof:oso/9780199600380.001.0001.36. Huang, K. & Rhys, A. Theory of light absorption and non-radiativetransitions in F-centres. Proc. R. Soc. Lond. Ser. A Math. Phys. Sci204, 406–423 (1950).37. Liang, F. et al.Multiphonon-assisted lasingbeyond the fluorescencespectrum. Nat. Phys. 18, 1312–1316 (2022).38. Törmä,P. &Barnes,W. L. Strong couplingbetween surfaceplasmonpolaritons and emitters: a review. Rep. Prog. Phys. 78, 013901(2015).39. Yang, F., Sambles, J. R. & Bradberry, G. W. Long-range surfacemodes supported by thin films. Phys. Rev. B 44, 5855–5872 (1991).40. Chen, D.-Z. A., Narayanaswamy, A. & Chen, G. Surface phonon-polariton mediated thermal conductivity enhancement ofamorphous thin films. Phys. Rev. B 72, 155435 (2005).41. Broido, D. A., Malorny, M., Birner, G., Mingo, N. & Stewart, D. A.Intrinsic lattice thermal conductivity of semiconductors from firstprinciples. Appl. Phys. Lett. 91, 231922 (2007).42. Esfarjani, K. & Stokes, H. T. Method to extract anharmonic forceconstants from first principles calculations.Phys. Rev. B 77, 144112(2008).43. Esfarjani, K., Chen, G. & Stokes, H. T. Heat transport in silicon fromfirst-principles calculations. Phys. Rev. B 84, 085204 (2011).44. Togo, A., Chaput, L. & Tanaka, I. Distributions of phonon lifetimes inBrillouin zones. Phys. Rev. B 91, 094306 (2015).45. Tadano, T., Gohda, Y. & Tsuneyuki, S. Anharmonic force constantsextracted from first-principles molecular dynamics: applications toheat transfer simulations. J. Phys. Condens. Matter 26, 225402(2014).46. 
Li, W., Carrete, J., Katcho, N. A. & Mingo, N. ShengBTE: a solver ofthe Boltzmann transport equation for phonons. Comput. Phys.Commun. 185, 1747–1758 (2014).47. Esfarjani, K. et al. ALATDYN:a set of AnharmonicLATticeDYNamicscodes to compute thermodynamic and thermal transport propertiesof crystalline solids. Comput. Phys. Commun. 312, 109575 (2025).48. McGaughey, A. J. H., Jain, A., Kim, H.-Y. & Fu, B. Phonon propertiesand thermal conductivity from first principles, lattice dynamics, andthe Boltzmann transport equation. J. Appl. Phys. 125, 011101(2019).49. Omini, M. & Sparavigna, A. An iterative approach to the phononBoltzmann equation in the theory of thermal conductivity. Phys. BCondens. Matter 212, 101–112 (1995).50. Ward, A., Broido, D. A., Stewart, D. A. &Deinzer,G. Ab initio theory ofthe lattice thermal conductivity in diamond.Phys. Rev. B 80, 125203(2009).51. Chaput, L. Direct solution to the linearized phonon Boltzmannequation. Phys. Rev. Lett. 110, 265506 (2013).52. Togo, A., Chaput, L., Tadano, T. & Tanaka, I. Implementationstrategies in phonopy and phono3py. J. Phys. Condens. Matter 35,353001 (2023).53. Feng, T. & Ruan, X. Quantummechanical prediction of four-phononscattering rates and reduced thermal conductivity of solids. Phys.Rev. 
B 93, 045202 (2016).https://doi.org/10.1038/s41524-026-02033-w Articlenpj Computational Materials |          (2026) 12:150 11https://doi.org/10.48550/arxiv.2410.12771https://doi.org/10.48550/arxiv.2410.12771https://doi.org/10.48550/arxiv.2410.12771https://github.com/atztogo/phonondbhttps://github.com/atztogo/phonondbhttps://doi.org/10.48550/arxiv.2503.11568https://doi.org/10.48550/arxiv.2503.11568https://doi.org/10.48550/arxiv.2405.04967https://doi.org/10.48550/arxiv.2405.04967https://doi.org/10.48550/arxiv.2405.04967https://doi.org/10.1093/acprof:oso/9780198507796.001.0001https://doi.org/10.1093/acprof:oso/9780198507796.001.0001https://doi.org/10.1093/acprof:oso/9780198507796.001.0001https://doi.org/10.1093/acprof:oso/9780199600380.001.0001https://doi.org/10.1093/acprof:oso/9780199600380.001.0001https://doi.org/10.1093/acprof:oso/9780199600380.001.0001www.nature.com/npjcompumats54. Feng, T., Lindsay, L. & Ruan, X. Four-phonon scattering significantlyreduces intrinsic thermal conductivity of solids. Phys. Rev. B 96,161201 (2017).55. Werthamer, N. R. Self-consistent phonon formulationof anharmoniclattice dynamics. Phys. Rev. B 1, 572–581 (1970).56. Tadano, T. & Tsuneyuki, S. Self-consistent phonon calculations oflattice dynamical properties in cubic SrTiO3 with first-principlesanharmonic force constants. Phys. Rev. B 92, 054301 (2015).57. Eriksson, F., Fransson, E. & Erhart, P. The hiphive package for theextraction of high-order force constants by machine learning. Adv.Theory Simul. 2, (2019).58. Tadano, T. & Saidi, W. A. First-principles phonon quasiparticletheory applied to a strongly anharmonic halide perovskite. Phys.Rev. Lett. 129, 185901 (2022).59. Simoncelli, M., Marzari, N. & Mauri, F. Unified theory of thermaltransport in crystals and glasses. Nat. Phys. 395, 1–813 (2019).60. Zhou, J. et al. Ab initio optimization of phonon drag effect for lower-temperature thermoelectric energy conversion. Proc. Natl. Acad.Sci. USA 112, 14777–14782 (2015).61. Liao, B. 
et al. Significant reduction of lattice thermal conductivity bythe electron-phonon interaction in silicon with high carrierconcentrations: a first-principles study.Phys.Rev. Lett.114, 115901(2015).62. Cepellotti, A.,Coulter, J., Johansson,A., Fedorova,N.S.&Kozinsky,B. Phoebe: a high-performance framework for solving phonon andelectron Boltzmann transport equations. J. Phys.: Mater. 5, 035003(2022).63. Mingo, N., Esfarjani, K., Broido, D. A. & Stewart, D. A. Clusterscattering effects on phonon conduction in graphene. Phys. Rev. B81, 045408 (2010).64. Katcho, N. A., Carrete, J., Li, W. & Mingo, N. Effect of nitrogen andvacancy defects on the thermal conductivity of diamond: an ab initioGreen’s function approach. Phys. Rev. B 90, 094117 (2014).65. Ångqvist, M. et al. ICET – a Python library for constructing andsampling alloy cluster expansions. Adv. Theory Simul. 2, 1900015(2019).66. Carrete, J., Li, W., Mingo, N., Wang, S. & Curtarolo, S. Findingunprecedentedly low-thermal-conductivity half-Heuslersemiconductors via high-throughputmaterialsmodeling.Phys. Rev.X 4, 011019 (2014).67. Seko, A. et al. Prediction of low-thermal-conductivity compoundswith first-principles anharmonic lattice-dynamics calculations andBayesian optimization. Phys. Rev. Lett. 115, 205901 (2015).68. Ju, S. et al. Exploring diamondlike lattice thermal conductivitycrystals via feature-based transfer learning. Phys. Rev. Mater. 5,053801 (2021).69. Qin, G. et al. Predicting lattice thermal conductivity fromfundamentalmaterial properties usingmachine learning techniques.J. Mater. Chem. A 11, 5801–5810 (2023).70. Miyazaki, H. et al. Machine learning based prediction of latticethermal conductivity for half-Heusler compounds using atomicinformation. Sci. Rep. 11, 13410 (2021).71. Zhu, T. et al. Charting lattice thermal conductivity for inorganiccrystals and discovering rare earth chalcogenides forthermoelectrics. Energy Environ. Sci. 14, 3559–3566 (2021).72. Yan, J. et al. 
Material descriptors for predicting thermoelectricperformance. Energy Environ. Sci. 8, 983–994 (2014).73. Callaway, J. Model for lattice thermal conductivity at lowtemperatures. Phys. Rev. 113, 1046–1051 (1958).74. Cahill, D. G., Watson, S. K. & Pohl, R. O. Lower limit to the thermalconductivity of disordered crystals. Phys. Rev. B 46, 6131–6140(1992).75. Zhou, F., Nielson,W., Xia, Y. & Ozoliņš, V. Lattice anharmonicity andthermal conductivity from compressive sensing of first-principlescalculations. Phys. Rev. Lett. 113, 185501 (2014).76. Seko, A. & Togo, A. Projector-based efficient estimation of forceconstants. Phys. Rev. B 110, 214302 (2024).77. Togo, A. & Seko, A. On-the-fly training of polynomial machinelearning potentials in computing lattice thermal conductivity. J.Chem. Phys. 160, 211001 (2024).78. Chen, C. & Ong, S. P. A universal graph deep learning interatomicpotential for the periodic table.Nat. Comput. Sci. 2, 718–728 (2022).79. Simoncelli, M., Marzari, N. & Mauri, F. Wigner formulation of thermaltransport in solids. Phys. Rev. X 12, 041011 (2022).80. Póta, B., Ahlawat, P., Csányi, G. & Simoncelli, M. Thermalconductivity predictions with foundation atomistic models. arXivhttps://doi.org/10.48550/arxiv.2408.00755 (2024).81. Kielar, S. et al. Anomalous lattice thermal conductivity increase withtemperature in cubic GeTe correlatedwith strengthening of second-nearest neighbor bonds. Nat. Commun. 15, 6981 (2024).82. Guo, Z. et al. Fast and accurate machine learning prediction ofphonon scattering rates and lattice thermal conductivity. npjComput. Mater. 9, 95 (2023).83. Plata, J. J., Posligua, V., Márquez, A. M., Sanz, J. F. & Grau-Crespo,R. Charting the lattice thermal conductivities of I–III–VI2 chalcopyritesemiconductors. Chem. Mater. 34, 2833–2841 (2022).84. Xia, Y. et al. High-throughput study of lattice thermal conductivity inbinary rocksalt and zinc blende compounds including higher-orderanharmonicity. Phys Rev X 10, 041029 (2020).85. 
Li, Z., Lee, H., Wolverton, C. & Xia, Y. High-throughputcomputational framework for high-order anharmonicthermaltransport in cubic and tetragonal crystals. npj Comput. Mater 12, 51(2026).86. Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initiototal-energy calculations using a plane-wave basis set.Phys. Rev. B54, 11169–11186 (1996).87. Hanai, M. et al. ARIM-mdx data system: towards a nationwide dataplatform for materials science. IEEE Int. Conf. Big Data 00,2326–2333 (2024).88. Slack, G. A. Anisotropic thermal conductivity of pyrolytic graphite.Phys. Rev. 127, 694–701 (1962).89. Protik, N. H. et al. Phonon thermal transport in 2H, 4H and 6H siliconcarbide from first principles.Mater. Today Phys 1, 31–38 (2017).90. Cheng,Z. et al.High thermal conductivity inwafer-scale cubic siliconcarbide crystals. Nat. Commun. 13, 7201 (2022).91. Zheng, Q. et al. Thermal conductivity of GaN, GaN71, and SiC from150 K to 850 K. Phys. Rev. Mater. 3, 014601 (2019).92. Fisher, G. R. & Barnes, P. Towards a unified view of polytypism insilicon carbide. Philos. Mag. Part B 61, 217–236 (1990).93. Zhang, Y., Ke, X., Chen,C., Yang, J. &Kent, P. R. C. Thermodynamicproperties of PbTe, PbSe, and PbS: First-principles study. Phys.Rev. B 80, 024304 (2009).94. Tian, Z. et al. Phonon conduction in PbSe, PbTe, and PbTe 1−xSexfrom first-principles calculations. Phys. Rev. B 85, 184303 (2012).95. Ju, S., Shiga, T., Feng, L. &Shiomi, J. Revisiting PbTe to identify howthermal conductivity is really limited.Phys. Rev. B97, 184305 (2018).96. Hestness, J. et al. Deep Learning Scaling is Predictable, Empirically.arXiv https://doi.org/10.48550/arxiv.1712.00409 (2017).97. Kaplan, J. et al. Scaling laws for neural language models. arXivhttps://doi.org/10.48550/arxiv.2001.08361 (2020).98. Minami, S. et al. Scaling law of Sim2Real transfer learning inexpanding computational materials databases for real-worldpredictions. arXiv https://doi.org/10.48550/arxiv.2408.04042(2024).99. Mikami, H. et al. 
Machine learning and knowledge discovery indatabases. In European Conference, ECML PKDD 2022,Proceedings, part III, 477–492, https://inspirehep.net/literature/2818406 (2023).100. Ohnishi, M. & Shiomi, J. Towards ultimate impedance of phonontransport by nanostructure interface. APL Mater 7, 013102 (2019).https://doi.org/10.1038/s41524-026-02033-w Articlenpj Computational Materials |          (2026) 12:150 12https://doi.org/10.48550/arxiv.2408.00755https://doi.org/10.48550/arxiv.2408.00755https://doi.org/10.48550/arxiv.1712.00409https://doi.org/10.48550/arxiv.1712.00409https://doi.org/10.48550/arxiv.2001.08361https://doi.org/10.48550/arxiv.2001.08361https://doi.org/10.48550/arxiv.2408.04042https://doi.org/10.48550/arxiv.2408.04042https://inspirehep.net/literature/2818406https://inspirehep.net/literature/2818406https://inspirehep.net/literature/2818406www.nature.com/npjcompumats101. Qian, X., Zhou, J. & Chen, G. Phonon-engineered extreme thermalconductivity materials. Nat. Mater. 20, 1188–1202 (2021).102. Xie, T. & Grossman, J. C. Crystal graph convolutional neuralnetworks for an accurate and interpretable prediction of materialproperties. Phys. Rev. Lett. 120, 145301 (2018).103. Geiger, M. & Smidt, T. e3nn: Euclidean Neural Networks. arXivhttps://doi.org/10.48550/arxiv.2207.09453 (2022).104. Fuchs, F. B., Worrall, D. E., Fischer, V. & Welling, M. SE(3)-transformers: 3D roto-translation equivariant attention networks.arXiv https://doi.org/10.48550/arxiv.2006.10503 (2020).105. Chen, Z. et al. Direct prediction of phonon density of states witheuclidean neural networks. Adv. Sci. 8, 2004214 (2021).106. Okabe, R. et al. Virtual node graph neural network for full phononprediction. Nat. Comput. Sci. 4, 522–531 (2024).107. Fang, S., Geiger, M., Checkelsky, J. G. & Smidt, T. Phononpredictions with E(3)-equivariant graph neural networks. arXivhttps://doi.org/10.48550/arxiv.2403.11347 (2024).108. Srivastava, Y. & Jain, A. 
Accelerating prediction of phonon thermalconductivity by an order of magnitude through machine learningassisted extraction of anharmonic force constants. Phys. Rev. B110, 165202 (2024).109. Zhang, F. et al. Room-temperature magnetic thermal switching bysuppressing phonon-magnon scattering. Phys. Rev. B 109, 184411(2024).110. Shao, H. et al. Phonon transport in Cu2GeSe3: effects of spin-orbitcoupling and higher-order phonon-phonon scattering. Phys. Rev. B107, 085202 (2023).111. Tadano, T., Gohda, Y. & Tsuneyuki, S. Impact of rattlers on thermalconductivity of a thermoelectric clathrate: a first-principles study.Phys. Rev. Lett. 114, 095501 (2015).112. Ohnishi, M., Tadano, T., Tsuneyuki, S. & Shiomi, J. Anharmonicphonon renormalization and thermal transport in the type-IBa8Ga16Sn30 clathrate from first principles. Phys. Rev. B 106,024303 (2022).113. Meredig, B. et al. Can machine learning identify the next high-temperature superconductor? Examining extrapolation performancefor materials discovery.Mol. Syst. Des. Eng. 3, 819–825 (2018).114. Xu, K. et al. How neural networks extrapolate: from feedforward tograph neural networks. arXiv https://doi.org/10.48550/arxiv.2009.11848 (2020).115. Noda, K., Wakiuchi, A., Hayashi, Y. & Yoshida, R. Advancingextrapolative predictions of material properties through learning tolearn. arXiv https://doi.org/10.48550/arxiv.2404.08657 (2024).116. Lindsay, L., Broido, D. A. & Reinecke, T. L. First-principlesdetermination of ultrahigh thermal conductivity of boron arsenide: acompetitor for diamond? Phys. Rev. Lett. 111, 025901 (2013).117. Qin, G., Xu, J., Wang, H., Qin, Z. & Hu, M. Activated lone-pairelectrons lead to low lattice thermal conductivity: a case study ofboron arsenide. J. Phys. Chem. Lett. 14, 139–147 (2023).118. Semenok, D. V. et al. Superconductivity at 161 K in thorium hydrideThH10: Synthesis and properties.Mater. Today 33, 36–44 (2020).119. Cort, B., Ward, J. W., Vigil, F. A. & Haire, R. G. 
Resistivity studies ofcubic americium hydrides from 20 to 300 K. J. Alloy. Compd. 224,237–240 (1995).120. Cendrowski-Guillaume, S. M., Lance, M., Nierlich, M., Vigner, J. &Ephritikhine, M. New actinide hydrogen transition metalcompounds. Synthesis of [K(C 12 H 24O 6)][(η-C 5 Me 5) 2 (Cl)UH 6Re(PPh 3) 2] and the crystal structure of its benzene solvate. J.Chem. Soc. Chem. Commun 0, 1655–1656 (1994).121. Larsen, A. H. et al. The atomic simulation environment—a Pythonlibrary for working with atoms. J. Phys. Condens. Matter 29, 273002(2017).122. Togo, A., Shinohara, K. & Tanaka, I. Spglib: a software library forcrystal symmetry search. Sci. Technol. Adv. Mater. Methods 4,2384822 (2024).123. Togo, A. First-principles phonon calculations with phonopy andphono3py. J. Phys. Soc. Jpn. 92, 012001 (2023).124. Hinuma, Y., Pizzi, G., Kumagai, Y., Oba, F. & Tanaka, I. Bandstructure diagram paths based on crystallography. Comput. Mater.Sci. 128, 140–184 (2017).125. Wang, Y. et al. A mixed-space approach to first-principlescalculations of phonon frequencies for polar materials. J. Phys.Condens. Matter 22, 202201 (2010).126. Parlinski, K., Li, Z. Q. & Kawazoe, Y. Parlinski, Li, and KawazoeReply. Phys. Rev. Lett. 81, 3298–3298 (1998).127. Gonze, X. & Lee, C. Dynamical matrices, Born effective charges,dielectric permittivity tensors, and interatomic force constants fromdensity-functional perturbation theory. Phys. Rev. B 55,10355–10368 (1997).128. Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat.Assoc. 101, 1418–1429 (2006).129. Birch, F. Finite elastic strainof cubic crystals.Phys.Rev.71, 809–824(1947).130. Murnaghan, F. D. The compressibility of media under extremepressures. Proc. Natl. Acad. Sci. 30, 244–247 (1944).131. Perdew, J. P. et al. Restoring the density-gradient expansion forexchange in solids and surfaces. Phys. Rev. Lett. 100, 136406 (2008).132. Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50,17953–17979 (1994).133. Kresse, G. 
### Acknowledgements

The authors thank C. Dames and Y. Sun for co-organizing the Workshop “Thermal Transport, Materials Informatics, and Quantum Computing” supported by the National Science Foundation (NSF) and the Japan Science and Technology Agency (JST), where this project was conceptualized. The authors also thank C. Wolverton, A. Togo, K. Esfarjani, and M. Kawamura for fruitful discussions. Numerical calculations were performed using the following supercomputers through the HPCI System Research Project (Project IDs: hp220151, jh230065, and hp240194): Grand Chariot at the Information Initiative Center, Hokkaido University; OCTOPUS and SQUID at the D3 Center, Osaka University; Oakbridge-CX and Wisteria/BDEC-01 at the Supercomputing Division, Information Technology Center, The University of Tokyo; and AOBA-B at the Cyberscience Center, Tohoku University. Additional resources were provided by the Supercomputer Center, Institute for Solid State Physics, The University of Tokyo, and MASAMUNE-IMR at the Center for Computational Materials Science, Institute for Materials Research, Tohoku University. This work was partially supported by CREST Grants No. JPMJCR21O2 and No. JPMJCR19I2 from the Japan Science and Technology Agency (JST), JSPS KAKENHI Grants No. 22H04950 and No. 24K07354 from the Japan Society for the Promotion of Science (JSPS), and a grant-in-aid from the Thermal and Electric Energy Technology Foundation. K.H. acknowledges funding from the MAT-GDT Program at A*STAR via the AME Programmatic Fund by the Agency for Science, Technology and Research under Grant No. M24N4b0034. L.L. acknowledges support for vibrational property calculations and database discussions from the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, Material Sciences and Engineering Division. T.D. acknowledges financial support from the National Natural Science Foundation of China (Grant No. 62204218) and the Zhejiang Provincial Natural Science Foundation of China (No. LJXSZ26A040002), and computational resources from the National Supercomputer Center in Tianjin. P.T. acknowledges financial support from the Catalan Government through the funding grant ACCIÓ-Eurecat (Project TRAÇA SMART-MAT).

### Author contributions

The project was conceptualized by T.L. and J.S. (together with Chris Dames and Ying Sun), and managed by M.O. and J.S. M.O., T.T., T.D., P.T., Z.X., Z.W., and M.M. contributed to code development. M.O., T.D., P.T., Z.X., H.Z., W.N., Z.W., and M.M. generated phonon property data through automated calculations. M.O., R.Y., and J.S. contributed to data analysis. M.O., M.H., Z.W., T.S., R.Y., and J.S. contributed to the machine learning and database construction. M.O., Z.W., and J.S. wrote the original manuscript, and all authors contributed to revising the manuscript.

### Competing interests

The authors declare no competing interests.

### Additional information

Supplementary information: The online version contains supplementary material available at https://doi.org/10.1038/s41524-026-02033-w.

Correspondence and requests for materials should be addressed to Masato Ohnishi or Junichiro Shiomi.

Reprints and permissions information is available at http://www.nature.com/reprints

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2026

1. Institute of Engineering Innovation, The University of Tokyo, Tokyo, Japan.
2. The Institute of Statistical Mathematics, Research Organization of Information and Systems, Tachikawa, Tokyo, Japan.
3. State Key Laboratory of Silicon and Advanced Semiconductor Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou, China.
4. Institute of Advanced Semiconductors, ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, China.
5. Eurecat, Technology Centre of Catalonia, Unit of Applied Artificial Intelligence, Cerdanyola del Vallès, Spain.
6. Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, IN, USA.
7. Research Center for Magnetic and Spintronic Materials, National Institute for Materials Science, Tsukuba, Japan.
8. School of Materials Science and Engineering, Nanyang Technological University, Singapore, Singapore.
9. Information Technology Center, The University of Tokyo, Tokyo, Japan.
10. Department of Mechanical Engineering, The University of Tokyo, Tokyo, Japan.
11. Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, NY, USA.
12. Department of Mechanical Engineering, University of South Carolina, Columbia, SC, USA.
13. School of Mechanical Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, IN 47907, USA.
14. The Graduate University for Advanced Studies, SOKENDAI, Tachikawa, Tokyo, Japan.
15. Advanced General Intelligence for Science Program (AGIS), RIKEN-TRIP, Wako, Saitama, Japan.
16. Materials Science and Technology Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
17. Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA.
18. Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN, USA.
19. Institute of Materials Research and Engineering, Agency for Science, Technology and Research, Innovis, Singapore, Singapore.
20. RIKEN Center for Advanced Intelligence Project, Tokyo, Japan.

e-mail: masato.ohnishi.ac@gmail.com; shiomi@photon.t.u-tokyo.ac.jp