# Fileset

[jpsj.87.113801 (1).pdf](https://mdr.nims.go.jp/filesets/188fcefe-ac90-4ace-8858-769ae5540356/download)

## Creator

Hieu Chi Dam, Viet Cuong Nguyen, Tien Lam Pham, Anh Tuan Nguyen, Kiyoyuki Terakura, [Takashi Miyake](https://orcid.org/0000-0003-2658-3470), [Hiori Kino](https://orcid.org/0000-0002-8912-686X)

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Important Descriptors and Descriptor Groups of Curie Temperatures of Rare-earth Transition-metal Binary Alloys](https://mdr.nims.go.jp/datasets/81093914-76bf-4fab-b30e-1b0a7d1af4fb)

## Fulltext

Important Descriptors and Descriptor Groups of Curie Temperatures of Rare-earth Transition-metal Binary Alloys

Hieu Chi Dam (1, 2, 3), Viet Cuong Nguyen (4), Tien Lam Pham (1, 5), Anh Tuan Nguyen (6), Kiyoyuki Terakura (1, 2), Takashi Miyake (2, 5, 7), and Hiori Kino (2, 5)

1. Japan Advanced Institute of Science and Technology, Nomi, Ishikawa 923-1292, Japan
2. CMI2, MaDIS, NIMS, Tsukuba, Ibaraki 305-0047, Japan
3. JST, PRESTO, Kawaguchi, Saitama 332-0012, Japan
4. HPC Systems Inc., Minato, Tokyo 108-0022, Japan
5. ESICMM, NIMS, Tsukuba, Ibaraki 305-0047, Japan
6. Hanoi Metropolitan University, 98 Duong Quang Ham, Cau Giay, Hanoi, Vietnam
7. CD-FMat, AIST, Tsukuba, Ibaraki 305-8568, Japan

(Received September 8, 2018; accepted October 2, 2018; published online October 29, 2018)

We analyze the Curie temperatures of rare-earth transition-metal binary alloys using machine learning. In order to select important descriptors and descriptor groups, we introduce a newly developed subgroup relevance analysis and adopt hierarchical clustering in the representation. We execute an exhaustive search and demonstrate that our approach results in the successful selection of important descriptors and descriptor groups. It helps us to choose the combination of descriptors and to understand the meaning of the selected combination of descriptors.

Magnets are now widely used and play an important role in energy savings.1,2) One of the most important applications of magnets is electric motors, whose performance depends significantly on the performance of the magnets. Nd–Fe–B based rare-earth magnets are the strongest among the existing permanent magnets, and are almost the only type of permanent magnet that meets the stringent performance requirements of recent electric motors. However, one of the problems with Nd–Fe–B magnets is their relatively low Curie temperature compared to the operating temperatures of the motors.
Therefore, many researchers have carried out studies to overcome this drawback, including the exploration of new magnets.

The Curie temperature ($T_C$) is one of the most important physical quantities of magnets, but unfortunately, it is one of the most difficult physical quantities to predict correctly. There are several theory-driven methods for evaluating the $T_C$ of magnetic materials.3) One of the basic approaches is to solve an (extended) Hubbard model by using various low-energy solvers. In principle, this method is expected to be accurate. However, Anisimov et al. showed that the results are sensitive to the effective parameters and to the details of the low-energy solver.4–6) Therefore, this approach is still at the level of testing the formalism for simple systems such as pure transition-metal magnets.

The atomistic spin model is the most common choice for practical application to more complex systems.3) The spin model is constructed from the magnetic moment at each atomic site and the intersite magnetic exchange couplings, based on the assumption that the magnitude of the spin moments is fixed. The parameters are evaluated using first-principles calculations.3) This method can be applied to rare-earth magnets. Usually, the model is simplified further and restricted to the TM-3d and RE-4f spins. $T_C$ is then evaluated, usually in the mean-field approximation. The mean-field approximation, however, usually overestimates $T_C$. Thus, there exist many sources of error in the $T_C$ evaluation using the atomistic spin model.
The development of theoretical methods for the estimation of $T_C$ is still underway.

In contrast to the deductive approaches described so far, there is now a movement toward utilizing inductive approaches, i.e., data-driven methods for estimating $T_C$, and there have been many reports of successful prediction of physical quantities using such methods.7–12) The data-driven approach accumulates data, prepares descriptors, builds a model with the descriptors, and finally predicts the values of physical quantities of new materials. One of the key points for successful prediction is the choice of descriptors. A typical example of descriptor selection can be seen in the work by Ghiringhelli et al., where a regression model is used to predict the energy difference between zinc blende or wurtzite and rocksalt structures.13) They used a linear regression model, and first prepared basic descriptors. However, a linear regression model with only the basic descriptors has low descriptive power. They therefore performed various operations on the basic descriptors and produced a number of nonlinear combinations of them. This resulted in an increase in the predictive power. They reduced the number of descriptors using LASSO and finally employed exhaustive search to find the best linear regression model. Their work shows that the combination of descriptors is important for increasing the accuracy of the regression model.

Usually, we select the best regression model and discard all the others (performance-optimized model). However, the exhaustive search shows that there exist many regression models whose combination of descriptors differs from that of the best-scoring model, but whose score is as good as the best one. (The best score means, for example, the largest $R^2$ value in the regression model.) There exists another strategy, where we choose a regression model whose score is not the best, but is high.
For example, we can choose low-cost descriptors, where "low cost" means easy or literally inexpensive to evaluate through experiments or calculations. This model is usually referred to as an operation-optimized model. Okada et al. devoted considerable effort to the latter problem. They showed the scores of regression models as a density of states to understand the overall structure in one way, and plotted the best scores as a function of the combinations in another way, such as the indicator diagram, to select the best combinations depending on the purpose of the analysis.14–16)

Journal of the Physical Society of Japan 87, 113801 (2018), Letters. https://doi.org/10.7566/JPSJ.87.113801. ©2018 The Author(s). This article is published by the Physical Society of Japan under the terms of the Creative Commons Attribution 4.0 License (http://creativecommons.org/licenses/by/4.0/). Any further distribution of this work must maintain attribution to the author(s) and the title of the article, journal citation, and DOI.

Yet, it is not easy to understand the relationships and structures among descriptors from a huge list of scores and descriptors. Informatics treatments usually ignore the meaning of the descriptors, even though they are physical parameters that physicists regard as important. We hope, however, that we can extract more information from such large data. In the present work, we introduce a well-defined subgroup concept to clarify the relationship among descriptors.
Our method can also elucidate how to choose combinations of descriptors systematically as well as how to understand the meaning of descriptors.

Our target variable is the experimental $T_C$ of the stoichiometric rare-earth transition-metal binary alloys considered in this study.17) We select the descriptors from the element-dependent categories (R for rare-earth elements and T for transition-metal elements), and utilize the knowledge of the conventional theory-driven methods. The key parameters of the effective theory-driven models are related to the properties of the constituent elements and/or structural parameters. For example, the orbital energy level becomes deeper as the atomic number $Z$ increases. The electron interaction becomes stronger as the atomic orbital becomes more localized. The magnetic exchange couplings are associated with the strength of the electron interaction and the transfer integrals. The coupling strength between TM-3d and RE-4f (through RE-5d) is crucial for discussing the RE dependence of magnetism. This strength is proportional to the 3d-4f effective exchange coupling and the 4f total spin projected onto the 4f total angular momentum $J_{4f}$. The latter quantity is given by $J_{4f}(1 - g_J)$, with $g_J$ being the Landé g-factor. We also add descriptors from the structure-related category (S) to describe the ratio of the elements as well as simple volume-dependent or spatially dependent variables that distinguish, e.g., the Th$_2$Zn$_{17}$ and Th$_2$Ni$_{17}$ polytypes. We list the descriptors in Table I, and give their detailed explanations in the supporting information.18)

As a regression model, we employ kernel ridge regression with the radial basis function kernel. Kernel ridge regression can include the nonlinear effects of the descriptors and has much stronger power to fit the target functions with the descriptors, although it has the drawback of taking much more time to fit and predict than linear regression does.
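
As an illustration of the regression model named above, the following is a minimal sketch of kernel ridge regression with a radial basis function kernel, scored by a leave-one-out $R^2$. It is a plain NumPy re-implementation for readability, not the authors' scripts; the data are synthetic, and the hyperparameters `alpha` and `gamma` as well as all function names are assumptions chosen for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Radial basis function kernel: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

def krr_fit(X, y, alpha, gamma):
    """Solve (K + alpha * I) c = y for the dual coefficients c."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)

def krr_predict(X_train, coef, X_new, gamma):
    """Predict targets for X_new from the fitted dual coefficients."""
    return rbf_kernel(X_new, X_train, gamma) @ coef

def loo_r2(X, y, alpha, gamma):
    """R^2 of leave-one-out predictions: each sample is predicted by a model
    trained on all the other samples."""
    n = len(y)
    pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = krr_fit(X[keep], y[keep], alpha, gamma)
        pred[i] = krr_predict(X[keep], coef, X[i:i + 1], gamma)[0]
    return 1.0 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()

# Synthetic stand-in data (NOT the alloy data set of the paper).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2
X = (X - X.mean(axis=0)) / X.std(axis=0)  # normalize each descriptor

# alpha and gamma are hypothetical values; in practice they would be tuned.
score = loo_r2(X, y, alpha=1e-3, gamma=1.0)
print(f"leave-one-out R^2 = {score:.3f}")
```

The explicit refit inside the loop makes the cost visible: every candidate descriptor set requires one linear solve per left-out sample, which is why this model is slower to evaluate than linear regression.
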
We used Python scripts with mpi4py, scipy, and scikit-learn.19–21) Our scores in the regression models are the $R^2$ values, which we evaluate by leave-one-out cross-validation.

First, we analyze the descriptors. We compute Pearson's correlation coefficient between the descriptors. For the T category, the absolute values of Pearson's correlation coefficient among the three descriptors $Z_T$, $r_T$, and $S_{3d}$ are all equal to 1, which means that their contributions to the regression model are identical after the normalization procedure. Therefore, the number of independent descriptors is reduced from 27 to 25. Then, we perform an exhaustive search over $2^{25} - 1 \approx 3.3 \times 10^{7}$ regression models with different combinations of descriptors, and evaluate their accuracy values (scores).

Usually, we evaluate the score of the regression model; however, we want to evaluate the importance of the descriptors. Therefore, we change the viewpoint from the regression model to the descriptor in order to discuss the importance of the latter. We use relevance analysis,22,23) which roughly corresponds to linear response theory with respect to the descriptors. (We explain the scores and relevance analysis in the supporting information.18)) It originally utilizes the change in the score when we remove or add a descriptor. The former corresponds to the leave-one-out experiment, while the latter corresponds to the add-one-in experiment. The descriptor is strongly or weakly relevant when the accuracy score changes meaningfully in the leave-one-out or the add-one-in experiment, respectively.

Our first relevance analysis is based on strong relevance. We found that only the descriptor $C_R$ is strongly relevant. We can verify the importance of $C_R$ by plotting $C_R$ vs $T_C$. Almost all the points are located in the bottom-left region of the $C_R$ vs $T_C$ panel of Fig. 1. Thus, it is clear that $C_R$ has a considerable influence on $T_C$. It should be noted that we will not be able to find such a relationship if we simply execute the regressions.

Table I. Transition metal, rare-earth, and structural descriptors. See also the supporting information.18)

| Category | Descriptors |
| --- | --- |
| Atomic properties of transition metals (T) | $Z_T$, $r_T$, $r_T^{cv}$, IP$_T$, $\chi_T$, $S_{3d}$, $L_{3d}$, $J_{3d}$ |
| Atomic properties of rare-earth metals (R) | $Z_R$, $r_R$, $r_R^{cv}$, IP$_R$, $\chi_R$, $S_{4f}$, $L_{4f}$, $J_{4f}$, $g_J$, $J_{4f} g_J$, $J_{4f}(1 - g_J)$ |
| Structural information (S) | $C_T$, $C_R$, $d_{T-T}$, $d_{T-R}$, $d_{R-R}$, $N_{T-R}$, $N_{R-R}$, $N_{R-T}$ |

Fig. 1. (Color online) Top panel: The blue line shows the best score for each number of descriptors. The orange dotted line shows the score when $C_R$ is removed. Bottom panel: $C_R$ (Å$^{-3}$) vs $T_C$ (°C).

We notice that relevance analysis can be done not only for a descriptor, but also for a subgroup of descriptors. The second relevance analysis is based on weak relevance, where, in the original prescription, we add another descriptor to a set of descriptors, which we must define. We therefore define groups and subgroups in this paragraph and make use of them in the relevance analysis. We utilize hierarchical clustering analysis, where the distance between descriptors is one minus the absolute value of Pearson's correlation coefficient. We can define the groups or subgroups of descriptors as clusters whose members are within a distance $d$ of each other. For example, we can define four groups at $d = 0.5$. Two of them have the same descriptors as those of the T and R categories, while the other two come from the original S category. (We call the original cluster a category and the cluster obtained by the hierarchical analysis a group.) The descriptor $d_{TR}$ constitutes a group by itself, while the other S-category descriptors constitute the other.
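
The distance and grouping described above can be sketched as follows. This is a minimal, dependency-free illustration: the distance is one minus the absolute Pearson correlation, as in the text, but the linkage rule is an assumption (single linkage, implemented as connected components of the graph that links descriptors with distance at most $d$), since the linkage is not specified here. The descriptor values are toy numbers, not the alloy data, and the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def distance(x, y):
    """Descriptor distance: one minus the absolute Pearson correlation."""
    return 1.0 - abs(pearson(x, y))

def groups_at(columns, d):
    """Single-linkage groups at threshold d (assumed linkage rule): descriptors
    end up together if a chain of pairwise distances <= d connects them."""
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distance(columns[a], columns[b]) <= d:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())

# Toy descriptor columns for six hypothetical samples (not real data).
columns = {
    "Z_T": [26, 27, 28, 26, 27, 28],
    "S_3d": [2.0, 1.5, 1.0, 2.0, 1.5, 1.0],  # exactly linear in Z_T
    "Z_R": [58, 58, 58, 64, 64, 64],
    "J_4f": [2.0, 3.0, 2.5, 6.0, 5.0, 5.5],
    "C_R": [4, 1, 4, 4, 1, 4],
}

duplicates = groups_at(columns, 1e-9)  # |r| = 1: identical after normalization
groups = groups_at(columns, 0.5)
print(duplicates)
print(groups)
```

In this toy set, `Z_T` and `S_3d` are perfectly anticorrelated, so they merge even at a threshold close to zero, mirroring how descriptors with $|r| = 1$ are counted as one independent descriptor. Raising the threshold to 0.5 additionally merges the strongly correlated pair `Z_R` and `J_4f`.
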
It is not surprising that the grouping at $d = 0.5$ is almost the same as the categories defined a priori as T, R, and S when we remember the definition of the descriptors of the materials. Here, we have successfully defined the groups and subgroups, where the groups are almost the same as the original categories but are clustered from the data themselves. (We redefine the group S as a result of this clustering. The group S, which does not include $d_{TR}$, is different from the category S.)

We can make further advances in this grouping. We notice that fixing the value of $d$ is unnecessary; we only have to specify a vertical line of the decomposition tree to define a subgroup, because the child nodes below that vertical line are the same. (See also Fig. 2. The vertical axis corresponds to $d$.) Thus, we are able to define many subgroups of the descriptors as sets of the child nodes of the dendrogram.

We apply the relevance analysis not to a descriptor but to a subgroup or group. We call this method subgroup relevance analysis. We plot the result in Fig. 2. The horizontal scores are evaluated in the leave-one-out experiment and are related to the strong relevance, while the vertical scores are evaluated in the add-one-in experiment and are related to the weak relevance. Note that the score of a subgroup belonging to a group is evaluated under the condition that we must use at least one descriptor in the subgroup, and any descriptors belonging to the other groups can be added in the weak relevance analysis.

In Fig. 2, the weak relevance values, or add-one-in values, are written as vertical values. The subgroup containing only $r_R$ has the score 0.89467, which is the highest score under the condition that we must take the subgroup $r_R$ in the group R and may take any descriptors in the other groups. (A subgroup containing a single descriptor is also a subgroup.)
The subgroup containing $r_R$, $Z_R$, and $r_R^{cv}$ has the score 0.95445, which is the highest score under the condition that we must take at least one descriptor in the subgroup $r_R$, $Z_R$, and $r_R^{cv}$ of the group R and may take any descriptors in the other groups, as explained in the previous paragraph.

Fig. 2. (Color online) $R^2$ scores of the subgroup relevance analysis on the hierarchical clustering of the descriptors. We include $T_C$ in the dendrogram. The group R (green) is from $L_{4f}$ to $r_R^{cv}$. The group T (red) is from IP$_T$ to $r_T$. The group S (cyan) is from $d_{TT}$ to $C_T$. The group $d_{TR}$ is made of the descriptor $d_{TR}$. The horizontal values are strong relevance values and the tilted values are weak relevance values. The vertical axis shows the distance $d$, whose values are one minus the absolute values of Pearson's correlation coefficient. The paths of the highest value (0.95445) are colored in yellow dashed lines. See also the main text for details.

The sole descriptor $Z_R$ in the group R has the highest score (0.95445). It means that $Z_R$ alone can represent the group R. This is also the case for the $C_R$ subgroup in the group S. However, the structure of the group T is different from those of the groups R and S. The subgroup made of $J_{3d}$, $\chi_T$, $r_T^{cv}$, $Z_T$ (and $r_T$ and $S_{3d}$) has the highest score (0.94876), but its child subgroups have smaller scores (0.92427 and 0.94650). It means that there exists no single descriptor that can represent the overall nature of the group T.
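
The strong and weak relevance values discussed above can be sketched with a small exhaustive search. This is an illustrative reading of the prescription in the text, not the authors' code: the weak relevance of a subgroup is taken as the best score over descriptor sets that contain at least one member of the subgroup, no other member of the same group, and any descriptors of the other groups; the strong relevance of a group is the best score when the whole group is left out. The groups, the `GAIN` table, and `toy_score` are hypothetical stand-ins for the cross-validated $R^2$ of a regression model.

```python
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    items = sorted(items)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def best_score(allowed, score_fn, required_any=None):
    """Best score over all non-empty subsets of `allowed`. If `required_any`
    is given, a subset must contain at least one of its members."""
    best = float("-inf")
    for subset in nonempty_subsets(allowed):
        if required_any is not None and not (subset & required_any):
            continue
        best = max(best, score_fn(subset))
    return best

def strong_relevance(groups, name, score_fn):
    """Leave-group-out value: best score without any descriptor of `name`."""
    others = set().union(*(g for n, g in groups.items() if n != name))
    return best_score(others, score_fn)

def weak_relevance(groups, name, subgroup, score_fn):
    """Add-subgroup-in value: at least one descriptor of `subgroup` is used,
    the rest of group `name` is excluded, other groups are unrestricted."""
    others = set().union(*(g for n, g in groups.items() if n != name))
    return best_score(others | set(subgroup), score_fn, frozenset(subgroup))

# Hypothetical groups and a toy score (stand-in for a cross-validated R^2).
GROUPS = {"R": {"Z_R", "r_R"}, "T": {"Z_T"}, "S": {"C_R"}}
GAIN = {"C_R": 0.5, "Z_T": 0.3, "Z_R": 0.15, "r_R": 0.1}

def toy_score(subset):
    """Group-R descriptors are redundant (only the best one counts); each
    extra descriptor costs a small penalty."""
    rare = [GAIN[d] for d in subset if d in GROUPS["R"]]
    rest = [GAIN[d] for d in subset if d not in GROUPS["R"]]
    return sum(rest) + max(rare, default=0.0) - 0.01 * len(subset)

strong = {name: strong_relevance(GROUPS, name, toy_score) for name in GROUPS}
weak_r_R = weak_relevance(GROUPS, "R", {"r_R"}, toy_score)
weak_R = weak_relevance(GROUPS, "R", {"Z_R", "r_R"}, toy_score)
print(strong, weak_r_R, weak_R)
```

With these toy numbers, leaving out group S lowers the best score the most, so S would be read as the most important group, and the single descriptor `Z_R` reproduces the weak relevance value of the whole group R. The enumeration is exponential in the number of descriptors, which is why the full search is expensive.
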
When we examine all the combinations made of $J_{3d}$, $\chi_T$, $r_T^{cv}$, $Z_T$, we find that $Z_T$ gives the best score (0.95450) if we choose only one of the descriptors among them, a set of $Z_T$ and $J_{3d}$ is the best (0.95339) for two descriptors, and a set of $Z_T$, $J_{3d}$, and $L_{3d}$ is the best (0.95445) for three descriptors. We note that the descriptor $Z_T$ has the same effect as $S_{3d}$. We discuss the interpretation of the result later.

We can also obtain the importance of the groups from the horizontal values above the yellow solid line in Fig. 2. They are the strong relevance values, or leave-one-out values, of the groups T, R, and S. For example, the group R has the value 0.87587, which is the best score when we remove all the descriptors of the group R. The better the score is, the less important the group is. The value 0.50682 is the smallest among them, which means that the group S is the most important among the groups. On the other hand, the least important group is R, the value of which is 0.87587. It means that the score remains high even if we exclude all the descriptors in the group R. Therefore, the importance of the group R is the lowest among T, S, and R.

We have added further explanation in Fig. 2. The descriptor $J_{4f}(1 - g_J)$ can represent the subgroup containing $g_J, \ldots, J_{4f} g_J$, but the score is 0.93296, which is lower than the score 0.95445 of $Z_R$. We have also added a comment on the group of $d_{TR}$. The strong relevance value is 0.95445 and the weak relevance value is 0.95382. The facts that their difference is small and that the weak relevance value is smaller than the strong relevance value mean that the existence of the group $d_{TR}$ makes the regression model worse.

Here, we compare the result of the subgroup relevance analysis shown in Fig. 2 with the best score having $n$ descriptors without the subgroup relevance analysis, which is shown in Table II. The set of $C_R$, $Z_R$, and $Z_T$ has the best score (0.94222) for $n = 3$. The set of $C_R$, $Z_R$, $Z_T$, and $J_{3d}$ has the best score (0.95339) for $n = 4$.
The set of $C_R$, $Z_R$, $Z_T$, $J_{3d}$, and $L_{3d}$ has the best score (0.95429) for $n = 5$. The descriptor sets are made of the most important descriptors in group R ($Z_R$), group S ($C_R$), and group T ($Z_T$ when we choose one descriptor; $J_{3d}$ and $Z_T$ when we choose two descriptors; and $J_{3d}$, $L_{3d}$, and $Z_T$ when we choose three descriptors). These combinations are the same as those found in the analysis above. Thus, the subgroup relevance analysis successfully illustrates the structure among the descriptors and their importance.

One may think that the differences in the scores are quite tiny. For example, the 99.0% value of the global best score is 0.944, which roughly corresponds to the best score with 12 descriptors (see also Table I in the supporting information).18) However, the predictive ability changes drastically. We plot the "RMSE" between the best models with $n$ descriptors in Fig. 2 in the supporting information.18) It can be clearly seen that the predictive abilities for $n = 3$ to 8 are qualitatively different from those for $n \geq 9$, but the difference of the score of the best model with 9 (10) descriptors from that of the global best model is only 0.1% (0.4%). The difference in the score looks tiny at a glance, but is meaningful for this data set and regression model. (One must also discuss the total density of states of the scores to discuss the meaningful difference of the scores, but this is beyond the scope of this study.14–16))

The ordering of the scores of the models (combinations of descriptors) can change according to the details of the regression scheme and noise in the data, because the differences in the scores are quite small (Table II in the main body and Table I in the supporting information).18) Thus, just showing the best models with $n$ descriptors may give us wrong information. However, the relevance analysis can give us more significant differences. The dendrogram, or grouping, does not depend on the scores of the models because it is made only of the distances between the descriptors.
Even if there exists noise in the data, which may affect the scores of the model, we can expect that similar descriptors will give similar scores. The subgroup relevance analysis can illustrate how the distances, or the similarities, between the descriptors affect the models.

Here, we further explain the advantage of the expression with the dendrogram. For example, when the importance is expressed as in Fig. 2, we can easily choose $r_R^{cv}$ if we do not want to use $Z_R$. It enables us to find the next best route, that is, to go upward and try a new branch downward in the tree structure. We believe that this expression is much better than simply providing a list, and it makes it much easier to find operation-optimized regression models.

We can conclude that the descriptor $C_R$ is strongly relevant when we define the subgroups at $d \simeq 0$ and execute the leave-one-out experiment. The original relevance analysis is the special case of the subgroup relevance analysis. Therefore, the subgroup relevance analysis is a natural extension of the original relevance analysis.

Here, we note the possible interpretation of the regression model in the context of condensed matter physics, where we know that the physics should depend not on $J_{4f}$ but on $J_{4f}(1 - g_J)$ in the effective model Hamiltonian. We, however, found more important descriptors, e.g., $Z_R$ and $r_R^{cv}$ in the group R and $J_{3d}$ in the group T. It is more plausible that the regression model found a relationship similar to the generalized Slater–Pauling curve for the Curie temperature as a function of $C_R$, $Z_T$, and $Z_R$, and that the other effects are only marginal.24) We introduced many descriptors that cannot appear in the atomic-scale effective model Hamiltonian, and the regression model simply selected the inter-scale regression model including the macro-scale parameter $C_R$ first and $Z_T$ and $Z_R$ next, which do not directly appear in the effective model Hamiltonian, because their relationships are more apparent. It should be noted that the number of data points, only about a hundred, is too small to discuss the details, because it can easily change the prediction accuracy, as discussed in the supporting information.18)

Table II. The best $R^2$ score and descriptors as a function of the number of descriptors $n$.

| $n$ | Score | Descriptor(s) |
| --- | --- | --- |
| 2 | 0.87015 | $C_R$, $Z_T$ |
| 3 | 0.94222 | $C_R$, $Z_R$, $Z_T$ |
| 4 | 0.95339 | $J_{3d}$, $C_R$, $Z_R$, $Z_T$ |
| 5 | 0.95429 | $L_{3d}$, $J_{3d}$, $C_R$, $Z_R$, $Z_T$ |
| 6 | 0.95439 | $L_{3d}$, $J_{3d}$, $\chi_T$, $C_R$, $Z_R$, $Z_T$ |
| 7 | 0.95445 | $L_{3d}$, $J_{3d}$, $\chi_T$, $C_R$, $Z_R$, $Z_T$, $r_T^{cv}$ |
| 8 | 0.95445 | $L_{3d}$, $J_{3d}$, $\chi_T$, IP$_T$, $C_R$, $Z_R$, $Z_T$, $r_T^{cv}$ |

We cannot avoid errors in the $T_C$ values because of experimental errors and human errors. The latter arise mainly because AtomWork does not allow web scraping. We examine the possibility of outlier detection using machine learning. We show a plot of experimental $T_C$ values versus predicted ones in the supporting information.18) The overall agreement is good from 0 K to ∼1300 K, but there exist a few outliers. We mainly checked the outliers and repeatedly fixed any errors that we found. We found three major errors and a minor error. After fixing these errors, we evaluated the cross-validation test scores again for the best $n$ descriptors of the original regression model. The best $R^2$ was 0.96688. By using machine learning, it may be possible to find data errors efficiently; however, it cannot detect erroneous data whose predictions happen to be consistent with the experimental values.

We employed Pearson's correlation coefficient to define the distance in this study. However, there exist many choices for the distance. Which representation is the most appropriate in the unsupervised learning part depends on the problem. One uses the similarity, or distance, between materials to find the regression model, but usually discards the similarity between descriptors when making the regression model.
We, however, utilized the latter similarity, and therefore took full advantage of the similarity of the data in this prescription.

We showed that the distances between the descriptors are useful to illustrate the importance of descriptors and descriptor groups. This result is not strange when the descriptors have some physical meaning. There exist, however, minor discrepancies in the subgroup containing $Z_R$, $J_{3d}$, and $L_{3d}$ in the dendrogram. This is a limitation of this theory; however, it is possible to overcome this difficulty. We used the distance between the descriptors to explain the scores of the relevance analysis, but its inverse problem is also possible. We can set the values of the distances between the descriptors, or the structure of the dendrogram, to be more consistent with the scores of the relevance analysis.

We can consider many variants of the subgroup relevance analysis. We took the best descriptor from the subgroup shown in yellow in Fig. 2. Thus, we were able to show the best descriptors in the subgroup. Another method is to take the best subgroup downstream of a specified subgroup. Then, we will be able to understand the relationship among subgroups, and we can easily change them depending on the purpose.

Note that the Monte Carlo tree search also utilizes the same nature of tree structures. There may be a route to find a nearly optimal regression model by utilizing subgroup decomposition without performing an expensive exhaustive search.

In summary, we studied the data-driven approach to the Curie temperature of stoichiometric rare-earth transition-metal alloys. We successfully built regression models that achieved high scores from our descriptors. We developed subgroup relevance analysis and successfully illustrated the importance, relationships, and structures among the descriptors from the huge list of exhaustive-search results.
In addition, it should be noted that our method makes full use of the similarity of the given data.

Acknowledgments This work was partly supported by PRESTO and by the "Materials Research by Information Integration" Initiative (MI2I) project of the Support Program for Starting Up Innovation Hub, both from the Japan Science and Technology Agency (JST), Japan; by the Elements Strategy Initiative Project under the auspices of MEXT; and also by MEXT as a social and scientific priority issue (Creation of New Functional Devices and High-Performance Materials to Support Next-Generation Industries; CDMSI) to be tackled by using a post-K computer. The calculations were partly carried out on Numerical Materials Simulator at NIMS.

1) S. Sugimoto, J. Phys. D 44, 064001 (2011). https://doi.org/10.1088/0022-3727/44/6/064001
2) S. Hirosawa, M. Nishino, and S. Miyashita, Adv. Nat. Sci.: Nanosci. Nanotechnol. 8, 013002 (2017). https://doi.org/10.1088/2043-6254/aa597c
3) T. Miyake and H. Akai, J. Phys. Soc. Jpn. 87, 041009 (2018), and references therein. https://doi.org/10.7566/JPSJ.87.041009
4) A. S. Belozerov, I. Leonov, and V. I. Anisimov, Phys. Rev. B 87, 125138 (2013). https://doi.org/10.1103/PhysRevB.87.125138
5) A. S. Belozerov and V. I. Anisimov, J. Phys.: Condens. Matter 26, 375601 (2014). https://doi.org/10.1088/0953-8984/26/37/375601
6) A. A. Katanin, A. S. Belozerov, and V. I. Anisimov, Phys. Rev. B 94, 161117 (2016). https://doi.org/10.1103/PhysRevB.94.161117
7) R. Potyrailo, K. Rajan, K. Stoewe, I. Takeuchi, B. Chisholm, and H. Lam, ACS Comb. Sci. 13, 579 (2011). https://doi.org/10.1021/co200007w
8) K. Rajan, Informatics for Materials Science and Engineering: Data-driven Discovery for Accelerated Experimentation and Application (Butterworth, Oxford, U.K., 2013).
9) A. Agrawal and A. Choudhary, APL Mater. 4, 053208 (2016). https://doi.org/10.1063/1.4946894
10) A. Jain, G. Hautier, S. P. Ong, and K. Persson, J. Mater. Res. 31, 977 (2016). https://doi.org/10.1557/jmr.2016.80
11) Y. Liu, T. Zhao, W. Ju, and S. Shi, J. Materiomics 3, 159 (2017). https://doi.org/10.1016/j.jmat.2017.08.002
12) W. Lu, R. Xiao, J. Yang, H. Li, and W. Zhang, J. Materiomics 3, 191 (2017). https://doi.org/10.1016/j.jmat.2017.08.003
13) L. M. Ghiringhelli, J. Vybiral, S. V. Levchenko, C. Draxl, and M. Scheffler, Phys. Rev. Lett. 114, 105503 (2015). https://doi.org/10.1103/PhysRevLett.114.105503
14) K. Nagata, J. Kitazono, S. Nakajima, S. Eifuku, R. Tamura, and M. Okada, IPSJ Trans. Math. Model. Appl. 8, 23 (2015).
15) T. Kuwatani, K. Nagata, M. Okada, T. Watanabe, Y. Ogawa, T. Komai, and N. Tsuchiya, Sci. Rep. 4, 7077 (2014). https://doi.org/10.1038/srep07077
16) H. Ichikawa, J. Kitazono, K. Nagata, A. Manda, K. Shimamura, R. Sakuta, M. Okada, M. K. Yamaguchi, S. Kanazawa, and R. Kakigi, Front. Hum. Neurosci. 8, 480 (2014). https://doi.org/10.3389/fnhum.2014.00480
17) The values of experimental $T_C$ are taken from the AtomWork database, http://crystdb.nims.go.jp/.
18) (Supplemental Material) More detailed explanations are available online. https://doi.org/10.7566/JPSJ.87.113801
19) F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, J. Mach. Learn. Res. 12, 2825 (2011).
20) T. E. Oliphant, Comput. Sci. Eng. 9, 10 (2007). https://doi.org/10.1109/MCSE.2007.58
21) K. J. Millman and M. Aivazis, Comput. Sci. Eng. 13, 9 (2011). https://doi.org/10.1109/MCSE.2011.36
22) L. Yu and H. Liu, J. Mach. Learn. Res. 5, 1205 (2004).
23) S. Visalakshi and V. Radha, IEEE Int. Conf. Computational Intelligence and Computing Research, 2014, p. 1. https://doi.org/10.1109/ICCIC.2014.7238499
24) For example, C. Takahashi, M. Ogura, and H. Akai, J. Phys.: Condens. Matter 19, 365233 (2007). https://doi.org/10.1088/0953-8984/19/36/365233