# Fileset

[s41524-025-01606-5.pdf](https://mdr.nims.go.jp/filesets/26eb6b00-c2db-41df-98c3-a23774d4c3ad/download)

## Creator

[Shunya Minami](https://orcid.org/0000-0002-3566-817X), [Yoshihiro Hayashi](https://orcid.org/0000-0002-7650-4083), [Stephen Wu](https://orcid.org/0000-0002-7847-8106), Kenji Fukumizu, Hiroki Sugisawa, [Masashi Ishii](https://orcid.org/0000-0003-0357-2832), [Isao Kuwajima](https://orcid.org/0000-0002-5994-3834), Kazuya Shiratori, [Ryo Yoshida](https://orcid.org/0000-0001-8092-0162)

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Scaling Law of Sim2Real transfer learning in expanding computational materials databases for real-world predictions](https://mdr.nims.go.jp/datasets/26be1168-7e03-4dc7-85b2-84ce7d70c18e)

## Fulltext

Scaling Law of Sim2Real transfer learning in expanding computational materials databases for real-world predictions

npj Computational Materials | Article | Published in partnership with the Shanghai Institute of Ceramics of the Chinese Academy of Sciences | https://doi.org/10.1038/s41524-025-01606-5

Shunya Minami<sup>1,5</sup>, Yoshihiro Hayashi<sup>1,2,5</sup>, Stephen Wu<sup>1,2</sup>, Kenji Fukumizu<sup>1,2</sup>, Hiroki Sugisawa<sup>3</sup>, Masashi Ishii<sup>4</sup>, Isao Kuwajima<sup>4</sup>, Kazuya Shiratori<sup>3</sup> & Ryo Yoshida<sup>1,2,4</sup>

To address the challenge of limited experimental materials data, extensive physical property databases are being developed based on high-throughput computational experiments, such as molecular dynamics simulations. Previous studies have shown that fine-tuning a predictor pretrained on a computational database to a real system can result in models with outstanding generalization capabilities compared to learning from scratch. This study demonstrates the scaling law of simulation-to-real (Sim2Real) transfer learning for several machine learning tasks in materials science. Case studies of three prediction tasks for polymers and inorganic materials reveal that the prediction error on real systems decreases according to a power-law as the size of the computational data increases. Observing the scaling behavior offers various insights for database development, such as determining the sample size necessary to achieve a desired performance, identifying equivalent sample sizes for physical and computational experiments, and guiding the design of data production protocols for downstream real-world tasks.

Machine learning holds great potential for revolutionizing the methodology of materials science. Recent studies have demonstrated that models trained using materials data can accurately predict various physicochemical properties for diverse material systems [1,2].
Conventionally, a model defines the mapping from compositional or structural features of a given material to its thermal, electrical, mechanical, and energetic properties, as well as higher-order structural features. Assessing a large library of candidate materials using such models has led to the discovery of various materials, such as polymers [3], inorganic crystalline compounds [4,5], high-entropy alloys [6], catalysts [7,8], and quasiperiodic materials [9–11]. The success of such data-driven research depends on the quantity and quality of the data, and researchers often face the critical issue of data scarcity. Generating experimental data requires time-consuming, multi-stage workflows involving synthesis, sample preparation, property measurements, phase identification, and other laborious trial-and-error processes. More critically, researchers lack the incentive to disclose their laboratory data to open communities due to concerns regarding information confidentiality [12], which hampers the co-creation of an open data foundation.

Large-scale databases based on computer experiments such as first-principles calculations and molecular dynamics (MD) simulations are being developed to overcome the barriers posed by limited experimental data. For inorganic compounds, extensive first-principles property databases, including tens of thousands or more of crystal structures, have been developed, such as Materials Project [13], AFLOWLIB [14], NOMAD [15], OQMD [16,17], and GNoME [4]. The QM9 database [18] comprises over 130,000 small organic molecules, providing molecular structures and their properties obtained from quantum mechanical calculations, which serves as a dataset for machine-learning-based property prediction tasks [1,3].
Although there is currently no comprehensive computational database for polymeric materials, RadonPy [19] is being developed as a Python library for fully automated all-atom classical MD simulations to generate data resources for machine learning.

<sup>1</sup>The Institute of Statistical Mathematics, Research Organization of Information and Systems, Tachikawa, Japan. <sup>2</sup>The Graduate Institute for Advanced Studies, SOKENDAI, Tachikawa, Japan. <sup>3</sup>Science & Innovation Center, Mitsubishi Chemical Corporation, Yokohama, Japan. <sup>4</sup>Research and Service Division of Materials Data and Integrated System, National Institute for Materials Science, Tsukuba, Japan. <sup>5</sup>These authors contributed equally: Shunya Minami, Yoshihiro Hayashi. e-mail: yoshidar@ism.ac.jp. npj Computational Materials (2025) 11:146, www.nature.com/npjcompumats

The methodology of transfer learning, particularly simulation-to-real (Sim2Real) transfer, enables the integration of extensive simulation data with limited quantitative experimental data [20–23]. Transfer learning is beneficial when training a model from scratch on the target task is impractical due to data scarcity; it leverages data or pretrained models from a source domain to enhance machine learning tasks in a target domain. This becomes increasingly advantageous as the relevance between the source and target domains increases. For example, in computer vision, Sim2Real transfer is crucial for adapting vision models trained in simulation environments to real-world applications, such as autonomous vehicles, by leveraging insights gained from simulated environments. Sim2Real transfer is also widely used in materials research. For instance, Wu et al. [3] developed a predictive model for the thermal conductivity of polymeric materials using experimentally observed data for 28 amorphous polymers. Leveraging a large dataset of specific heat capacity generated through quantum chemistry calculations as the source task, they successfully derived the Sim2Real-transferred model in the target domain.
Similarly, Aoki et al. [2] employed a machine learning framework called multitask learning to integrate a quantum chemistry dataset with biased and quantitatively limited experimental data, successfully building a predictive model for polymer–solvent miscibility for a wide range of chemical spaces. Ju et al. [24] employed transfer learning to build a model predicting lattice thermal conductivity of inorganic crystalline materials. With only 45 samples for the target property, ordinary supervised learning failed to meet accuracy requirements. To address this, they utilized the first-principles calculations of scattering phase space as the source task and applied transfer learning to achieve sufficient accuracy.

Given the inherent domain gap between computer experiments and real-world systems, it is uncertain whether increasing the volume of simulation data enhances the generalization performance of Sim2Real-transferred models. To clarify this, Mikami et al. [25] provided theoretical and experimental evidence showing that the generalization of Sim2Real transfer learning improves according to a power-law relationship with the expansion of simulation data. Specifically, experimental validation of the scaling law was demonstrated for the Sim2Real scenario in computer vision tasks. Observing the scaling behavior of Sim2Real transfer and estimating its convergence rate and asymptotic behavior offer valuable insights for advancing database development.

Here, we present a statistical measure for quantitatively evaluating the transferability and scalability of a growing computational materials database. Our work reveals the existence of a scaling law in transfer learning across diverse prediction tasks in materials research involving polymers and inorganic material systems.
Specifically, it encompasses three scenarios: (1) Sim2Real prediction of polymer properties by re-purposing neural networks pretrained with all-atom classical MD simulations; (2) multitask machine learning integrating expansive data from quantum chemistry calculations and a limited experimental dataset to predict the miscibility of polymer–solvent binary mixtures; and (3) validation of the Wiedemann–Franz (WF) law between thermal and electrical conductivities of inorganic material systems through transfer learning. Notably, since both the source and target datasets in the third case were obtained from real experiments, the concept shown here can extend beyond Sim2Real scenarios. By experimentally observing the scaling behavior of transferred predictors, we can estimate their expected generalization performance upon further increasing the volume of simulation data, serving as an indicator of the database's potential value. Moreover, multidimensional scaling, considering both physical and computer experiments, provides a statistical estimate for the equivalent sample size of experimental and computational data. This aids in decision-making for the design of data production protocols. Additionally, by observing the scaling behavior of individual materials, we can individualize database design guidelines and gain insights into the existence of material groups that share physical mechanisms across different material systems.

### Results

#### Outline

Sim2Real transfer learning involves adapting a predictive model that is pretrained in a virtual environment to real-world scenarios. In materials research, the predictor defines a mathematical mapping from a descriptor representing the composition or structural features of a given material to its physicochemical properties. In the source task, the model is trained using a dataset of size $n$ generated from computer experiments, such as first-principles ab initio calculations or MD simulations.
In the target task, this pretrained model is repurposed and transferred to predict experimentally observed properties, utilizing an experimental dataset of size $m$, where $m$ is typically much smaller than $n$. Mikami et al. [25] presented a general theory, under certain assumptions, stating that in the fine-tuning of neural networks, the generalization error $\mathbb{E}[L(f_{n,m})]$ with the squared loss $L(f)$ of a transferred model $f_{n,m}$ for the real-world system is bounded from above by a function $R(n, m)$:

$$R(n, m) := (A n^{-\alpha} + B)\, m^{-\beta} + \epsilon, \tag{1}$$

where $A, B, \alpha, \beta, \epsilon \ge 0$ are constants independent of $n$ and $m$. In particular, for a fixed number of experimental samples $m$, the upper bound for the generalization error is expressed as follows:

$$\mathbb{E}[L(f_{n,m})] \le R(n) := D n^{-\alpha} + C, \tag{2}$$

where $D := A m^{-\beta}$ and $C := B m^{-\beta} + \epsilon$. According to this law, as $n$ increases, the generalization error of predicting experimentally observed properties for the transferred network converges to a reachable limit $C \ge 0$, called the transfer gap, with a decay rate $\alpha \ge 0$. Mikami et al. [25] demonstrated that these power-law relations hold empirically in Sim2Real transfer in computer vision tasks.

While increasing the data size for pretraining, the reduction in the generalization error of the transferred model is measured experimentally, which can be used to evaluate whether the transferred model can attain the desired prediction performance in the target task or to estimate the required sample size based on the estimated $(D, \alpha, C)$. Additionally, the observed scaling behaviors provide guidelines for designing the source database. For a predefined set of downstream tasks leveraging the database, the simulation environment can be tailored to accelerate scaling to real systems, such as selecting empirical interatomic potentials or polymerization degrees in MD simulations.
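As a rough illustration of how $(D, \alpha, C)$ in Eq. (2) might be estimated from observed errors and then used to size a database, the sketch below scans candidate values of $C$ and, for each one, fits $\log(\text{error} - C)$ against $\log n$ by ordinary least squares. The fitting procedure, the function names, and the synthetic data are assumptions made for illustration; the text does not describe how its estimates were obtained.

```python
import math


def fit_power_law(ns, errs, num_c=200):
    """Fit errs ~ D * n**(-alpha) + C (illustrative procedure, not the paper's).

    Scans candidate transfer gaps C in [0, min(errs)) and, for each one,
    regresses log(err - C) on log(n) by ordinary least squares.
    Returns the (D, alpha, C) with the smallest squared error.
    """
    xs = [math.log(n) for n in ns]
    x_bar = sum(xs) / len(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    best = None
    for k in range(num_c):
        c = min(errs) * k / num_c  # candidate C, strictly below every error
        ys = [math.log(e - c) for e in errs]
        y_bar = sum(ys) / len(ys)
        slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
        d, alpha = math.exp(y_bar - slope * x_bar), -slope
        sse = sum((d * n ** (-alpha) + c - e) ** 2 for n, e in zip(ns, errs))
        if best is None or sse < best[0]:
            best = (sse, d, alpha, c)
    return best[1:]


def required_n(d, alpha, c, target):
    """n at which D * n**(-alpha) + C reaches `target`; inf if unreachable."""
    if target <= c or alpha <= 0:
        return math.inf
    return (d / (target - c)) ** (1.0 / alpha)


# Synthetic errors from a known curve (hypothetical values, not from the paper).
sizes = [round(100 * (71068 / 100) ** (i / 9)) for i in range(10)]
maes = [0.5 * n ** -0.3 + 0.05 for n in sizes]
D, alpha, C = fit_power_law(sizes, maes)
n_needed = required_n(D, alpha, C, target=0.06)
```

Any target at or below the fitted transfer gap $C$ is reported as unreachable, which mirrors the role of $C$ as the limit of the error as $n$ grows. A nonlinear least-squares routine would be a natural replacement for the grid scan when one is available.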
In their Sim2Real experiments on computer vision tasks, Mikami et al. [25] showed that intentionally increasing the diversity of the appearance, luminosity, and background in a synthetic image set leads to an increase in the scaling factor $\alpha$ and a partial improvement in $C$. Creating a data-generation scheme that results in a negligible $C$ is the ultimate objective in developing foundational source data. Intuitively, the consistency of simulations with real-world scenarios and the methodology employed in transfer learning mainly affect the magnitude of $C$.

In the following sections, we describe the benefits and utility of analyzing scaling laws in transfer learning, based on three case studies of different material systems and their databases.

#### Polymer property predictions with Sim2Real transfer learning using MD simulations

We demonstrate that the scaling law of Sim2Real transfer learning holds in polymer property prediction using all-atom classical MD simulations. The target properties to be predicted are the refractive index, density, specific heat capacity at constant pressure ($C_P$), and thermal conductivity. Using RadonPy [19], an open-source Python library developed to fully automate MD simulations of polymeric materials using LAMMPS (large-scale atomistic/molecular massively parallel simulator) [26], we generated a source dataset comprising the four physical properties of approximately $7 \times 10^4$ amorphous polymers (see Table S1 for the number of samples). Details of the MD calculations are provided in the Methods section. We randomly selected $n$ samples from this dataset for the pretraining of neural networks, where $n$ was varied across 10 equally spaced points on a logarithmic scale between 100 and the maximum number of samples.

The property predictor used a 190-dimensional descriptor vector that represents the compositional and structural features of the chemical structure of a polymer repeating unit.
This vectorized polymer was mapped to each property using a conventional fully connected multi-layer neural network (see Fig. 1a and the Methods section). With experimental data, we fine-tuned each pretrained neural network to a predictor of the experimental properties. The experimental datasets were extracted from the PoLyInfo database [27–29]. The number of polymers in each property dataset was 234 for refractive index, 607 for density, 104 for $C_P$, and 39 for thermal conductivity. To transfer a pretrained model to each target domain, we randomly selected 80% of the experimental datasets and evaluated the model's predictive performance on the remaining samples. This process was repeated 500 times independently for each $n$, and the scaling behavior was observed through the average of the mean absolute errors (MAEs), with 90% confidence intervals calculated by bootstrap sampling.

As shown in Fig. 1b, the empirical generalization error for the experimental refractive index decays almost linearly on a logarithmic scale across the observed range of $n$. The parameters for power-law scaling were estimated as $D = 0.0684$, $\alpha = 0.0145$, and $C = 1.75 \times 10^{-11}$. For the density, the prediction error also decreases linearly, and as $n$ grows infinitely large, the MAE is expected to approach zero. The prediction error for $C_P$ decreases linearly until around $n = 10^4$, after which the decay begins to slow. Regarding the thermal conductivity, the generalization performance improves rapidly while $n < 10^4$, followed by a plateau as $n$ further increases. In summary, performance on all tasks scales notably as the volume of MD-calculated data increases. Moreover, as shown in Fig.
1b, the generalization performance of transfer learning notably surpasses that of direct learning without transfer. The potential cross-domain transferability becomes more evident when contrasting direct and transfer learning based on scaling behavior rather than at a fixed $n$.

For the refractive index and density, the MD-calculated values exhibit remarkably high consistency with the experimental observations from our previous study [19]. The strong scaling observed is therefore expected, because increasing the amount of simulation data directly improves the generalization performance for real-world scenarios. The $C_P$ calculations with the classical MD simulations (neglecting quantum effects) introduced systematic biases compared to the corresponding experimental values, resulting in significant overestimation of the $C_P$ [19]. Furthermore, the effect of random sampling of the initial structures in the MD simulations is more pronounced for the MD-calculated $C_P$ than for the refractive index and density. Similarly, there are slight systematic biases and inherently large fluctuations in the MD-calculated thermal conductivity values. Hence, these findings suggest that simulation uncertainty due to the randomness of initial structures is one of the critical factors influencing the scaling strength. This aspect will be discussed in more detail later.

Furthermore, we investigated the relationship between the similarity of polymers in the source and target datasets and the scaling behavior. By artificially creating multiple source datasets with varying degrees of task similarity (Fig. S1 in the Supplementary Information), we observed the scaling behavior. The results showed no significant differences (Fig. S2 in the Supplementary Information). This suggests that a greater structural similarity between the datasets does not necessarily lead to stronger scaling.

Here, we discuss the multidimensional scaling of simulated and experimental data.
We examined the scaling behavior of Sim2Real predictions for the density while simultaneously varying the sizes of simulation and experimental datasets, as shown in Fig. 2. The empirical generalization errors for both types of data show a monotonic decreasing trend. In particular, the increase in the size of the experimental dataset results in a significantly larger gain than the increase in simulation dataset size. The power-law curve in Eq. (1) was estimated as follows:

$$R(n, m) = (0.0192 + 0.338\, n^{-0.265})\, m^{-0.239} + 0.0535. \tag{3}$$

As predicted theoretically, the scaling effect on the simulation data weakens as the experimental dataset size increases. Likewise, it was confirmed that the scaling effect on experimental data also decreases as the amount of simulation data increases.

Furthermore, by applying the concept of a marginal rate of substitution from microeconomics [30] to this estimated surface, we estimated the number of simulation samples equivalent to one experimental sample. For example, at the current maximum sample sizes of $n = 71{,}068$ and $m = 601$, the partial derivatives of the estimated error function, analogous to marginal utility in economic theory, are given as follows:

$$\left.\frac{\partial R}{\partial n}\right|_{n=71068,\, m=601} = -1.41 \times 10^{-8}, \qquad \left.\frac{\partial R}{\partial m}\right|_{n=71068,\, m=601} = -3.13 \times 10^{-6}.$$

Taking the ratio of these coefficients provides an estimate of the marginal rate of substitution between experiments and simulations. Specifically, on the set of $(m, n)$ pairs that maintain the same level of generalization error $R(n, m) = r$ (referred to as indifference curves in microeconomics), the marginal rate of substitution $dm/dn$ is given by

$$\frac{dm}{dn} = -\frac{\partial R / \partial n}{\partial R / \partial m}$$

(see the Methods section). In this case, 221 simulation samples are equivalent to one experimental sample.

Fig. 1 | Transfer learning of polymer property predictions using all-atom classical MD simulations. a Neural network architecture. b Scaling behavior of Sim2Real transfer for four different properties, namely refractive index, density, specific heat capacity ($C_P$), and thermal conductivity. The horizontal axis represents the simulation data size, and the vertical axis shows the MAE averaged over 500 independent trials with 90% confidence interval calculated by performing bootstrapping sampling. The dashed line is the estimated power-law with the estimated equation given at the bottom left, and the horizontal red line indicates the mean MAE for direct learning with no pretraining. (Labels in panel a: MD calculation, Experiment, Fine-tuning, Descriptor, Linear (512), Linear (256), Linear (128), and Linear (64), each with ReLU, and Linear (32).)

#### Sim2Real multitask learning for polymer–solvent miscibility

While the theoretical implications presented by Mikami et al. [25] were derived under the assumption of neural fine-tuning, here we explored the scaling behavior for Sim2Real multitask learning scenarios. The task is to predict the Flory–Huggins χ parameter between any given polymer and solvent, which is a critical dimensionless quantity governing the miscibility of polymer–solvent binary mixtures. The dataset comprises χ parameters for 9575 polymer–solvent pairs calculated via COSMO-RS simulations based on density functional calculations [31], which were generated in our previous work [2], and 1,190 experimentally observed χ parameters for 766 unique polymer–solvent pairs compiled from Orwoll and Arnold [32]. Aoki et al. [2] demonstrated that integrating both simulated and experimental χ parameters into multitask learning significantly enhanced the generalization capability of the resulting predictors for experimental χ parameters.
In particular, this strategy effectively addressed limitations of molecular diversity, data size constraints, and inherent distributional biases in the experimental dataset.

We used a slightly modified version of the model developed by Aoki et al. [2], a multitask network architecture inspired by domain knowledge of the Hansen solubility parameter, as shown in Fig. 3. The 325-dimensional descriptor encodes the chemical structure of a given polymer repeating unit or solvent. Specifically, it comprises a 190-dimensional kernel mean force field descriptor [33] and a 135-dimensional RDKit descriptor [34], where irrelevant features with zero variance within the given dataset were removed from the 207-dimensional RDKit descriptor. Additionally, we included a binary flag representing polymer (1) or solvent (0), temperature $T$, and its inverse $1/T$ as additional inputs. For a given polymer or solvent, the input descriptor was passed through three hidden layers to map it to a 32-dimensional latent space. The distance between the polymer and solvent in this latent space was calculated, and two separate head networks were employed to output the experimental and simulated χ parameters. See the Methods section for further details on the model structure.

To assess the generalization performances, 20% of the experimental data was randomly allocated as a test set, while the remaining samples, along with $n$ randomly selected simulation data points, were used for model training. This procedure was repeated 100 times independently with different data splits. The $n$ ranged across 10 evenly spaced points on a logarithmic scale within the interval [100, 9575].

Figure 4a shows the observed scaling curves, which exhibit strong linear decay in the generalization error on a logarithmic scale. The estimated $C$ was $4.52 \times 10^{-10}$, suggesting that expanding COSMO-RS simulations can yield high-performance predictors for real systems.
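The repeated-split evaluation above reports an average MAE with a 90% confidence interval obtained by bootstrapping. A minimal sketch of one way to compute such an interval is given below, assuming a percentile bootstrap over the per-trial MAEs; the exact bootstrap variant is not stated in the text, and the function name and the stand-in data are hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(values, level=0.90, n_boot=2000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    # Resample the trials with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    lo = means[int(n_boot * (1 - level) / 2)]
    hi = means[min(n_boot - 1, int(n_boot * (1 + level) / 2))]
    return statistics.fmean(values), lo, hi


# Hypothetical per-trial test MAEs standing in for 100 independent random splits.
trial_rng = random.Random(1)
trial_maes = [0.20 + trial_rng.gauss(0.0, 0.02) for _ in range(100)]
mean_mae, ci_low, ci_high = bootstrap_mean_ci(trial_maes)
```

Running this once per simulation dataset size $n$ would give one point and one error bar on a scaling curve of the kind shown in Fig. 4a.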
The power-law decay in Fig. 4a suggests that scaling behavior occurs even in multitask learning, although the Sim2Real scaling law was theoretically derived for fine-tuning scenarios in Mikami et al. [25].

Here, we discuss the multidimensional scaling of multitask learning. Figure 4b, c describe the observed scaling behaviors when simultaneously varying the sizes of simulation and experimental datasets. Interestingly, unlike the fine-tuning results shown in the previous section, in multitask learning, as the experimental dataset size increases, the absolute gradient of the scaling curve becomes steeper. Similarly, by increasing the simulation dataset size, the improvement in generalization performance per experimental sample also increases. In other words, the experimental results suggest a mechanism where simulations and experiments mutually enhance their impact on improving generalization performance through synergistic effects.

This observation differs from the theoretical implication of Eq. (1), most likely because multitask learning was used here instead of fine-tuning. Consequently, the parameter estimation for two-dimensional scaling with Eq. (1) is invalid. Instead, the equivalent sample size was calculated by approximating the gradient based on the observed increment of the MAE for increasing simulation and experimental dataset sizes.
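The gradient-ratio idea can be sketched as follows. This is only an illustration under stated assumptions: it differentiates one-dimensional power-law fits analytically, using the coefficients reported for the χ-parameter task in this section, and the function name `power_law_gradient` is hypothetical.

```python
def power_law_gradient(d, alpha, x):
    """Derivative of d * x**(-alpha) + c with respect to x (c drops out)."""
    return -alpha * d * x ** (-alpha - 1.0)


# Coefficients of the one-dimensional fits reported for the chi-parameter task.
grad_n = power_law_gradient(d=0.239, alpha=0.0348, x=9129)  # dR/dn at m = 612
grad_m = power_law_gradient(d=0.863, alpha=0.240, x=612)    # dR/dm at n = 9129

# Number of simulation samples giving the same error reduction as one experiment.
simulations_per_experiment = grad_m / grad_n
```

Applying the same ratio to the partial derivatives reported for the density task ($-3.13 \times 10^{-6}$ and $-1.41 \times 10^{-8}$) gives roughly 222, in line with the value of 221 quoted there.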
The estimated scaling curves and the gradients at the current dataset sizes ($n = 9129$ and $m = 612$) were computed respectively as follows:

$$R(n, 612) = 0.239\, n^{-0.0348} + 4.52 \times 10^{-10}, \qquad \left.\frac{\partial R(n, 612)}{\partial n}\right|_{n=9129} = -6.63 \times 10^{-7},$$

$$R(9129, m) = 0.863\, m^{-0.240} + 6.36 \times 10^{-16}, \qquad \left.\frac{\partial R(9129, m)}{\partial m}\right|_{m=612} = -7.25 \times 10^{-5}.$$

By taking the ratio of these gradients, the marginal rate of substitution was estimated to be 109, indicating that for the χ parameter prediction task, one real-world experiment is worth 109 COSMO-RS simulations.

Fig. 2 | Multidimensional scaling of Sim2Real transfer learning, illustrated by the density prediction of amorphous polymers. a Scaling to increase the amount of simulation data across various experimental dataset sizes, and b scaling to increase the amount of experimental data for different sizes of simulation datasets. Each line represents the MAE averaged over 500 independent trials.

Although we investigated whether the overall generalization performance scales across various materials as a whole, it is important to verify whether individual materials scale or not. Figure 5a summarizes the observed scaling behaviors to increasing $n$ for different polymer classes, where test instances of polymer–solvent pairs were classified into 11 classes based on structural features of the polymers. The generalization performances of chloropolymers, polyvinyls, polyacrylates, polyethers, and polystyrenes were strongly scaled, while other materials, e.g., polyacrylamides, showed almost no improvement. Observing the scalability of each material class provides valuable insights for planning data generation. In the development of a simulation database, limited computational resources should be allocated primarily to scalable polymer classes. For non-scalable polymer classes, some modifications are needed in the data production protocol to improve scalability.
To devise strategies, it is important to identify the governing factor that determines the scalability. Fig. 5b shows parity plots of χ parameters obtained from COSMO-RS simulations and experimental values for each polymer class. In comparison with the scaling behaviors shown in Fig. 5a, it is evident that the observed scalability of polymer species can be largely explained by the predictive capability of COSMO-RS simulations. However, in some polymer classes such as polystyrenes and polyesters, where the predictive ability of COSMO-RS simulations is weak, the generalization performance of transferred models scales strongly, reaching levels far beyond those of the simulations. This indicates the essence of Sim2Real transfer. Additionally, it is important to note that many of the non-scalable polymer classes had extremely limited experimental data (see Fig. 5a). For example, the $m$ values for celluloses and polyacrylamides are 55 and 29, respectively. In such cases, model training becomes extremely challenging. Furthermore, empirical generalization errors approximated with the small sample sets may significantly deviate from true generalization errors. Even if a polymer class appears to be non-scalable, it cannot be conclusively determined that there is no transferability or scalability.

Fig. 3 | Model architecture of Sim2Real multitask learning used for predicting the Flory–Huggins χ parameter. The multitasking network architecture was inspired by domain knowledge, known as the Hansen solubility parameter, by calculating the distance between the polymer and solvent in the 32-dimensional latent space. (Diagram labels: Polymer, Solvent, Flag, Temperature, Descriptor, Shared layers, Hidden feature, Distance, tanh, Computational χ parameter, Experimental χ parameter; layers labeled Linear (256), Linear (128), and Linear (32) with ReLU.)

Fig. 4 | Scaling law observed in the Flory–Huggins χ parameter prediction task. a Scaling behavior when increasing the simulation dataset size. The horizontal axis represents the number of polymer–solvent pairs used as the simulation dataset, and the vertical axis shows the average MAE of 100 independent trials with 90% confidence interval calculated via bootstrapping. The dashed line is the estimated power-law with the estimated equation given at the bottom left, and the horizontal red line indicates the average MAE for direct learning without pretraining. b Scaling behaviors across different sizes of experimental data, and c scaling to increase the experimental dataset for different simulation dataset sizes. Each line shows the average MAE over 100 trials.

#### Transfer learning for thermal and electrical conductivity of inorganic materials

By definition, the scaling laws of transfer learning hold not only for Sim2Real scenarios but also for real-to-real (Real2Real) transfer scenarios. Here, we highlight an interesting aspect of the scaling analysis by showing the Real2Real scaling behavior in transfer learning from thermal conductivity to electrical conductivity for inorganic compounds.

We compiled a dataset from Starrydata [35], comprising 5910 inorganic compounds with experimentally observed thermal conductivities and 3640 compounds with experimental electrical conductivities, all derived at 300 K. Starrydata is a comprehensive experimental database of thermoelectric materials that was collected from published papers. Figure 6b illustrates the dependency and discrepancy between the two physical properties across 1757 materials, where both thermal and electrical conductivity measurements were obtainable.
According to the WF law [36], the ratio of thermal conductivity ($\kappa$) to electrical conductivity ($\sigma$) of a metal is proportional to temperature ($T$), expressed as:

$$\frac{\kappa}{\sigma} = L T,$$

where $L = 2.44 \times 10^{-8}\ \mathrm{W\,\Omega\,K^{-2}}$ is the Lorenz number. The gray line in Fig. 6b depicts the WF law on the joint distribution of thermal and electrical conductivities at 300 K. While the WF law holds for metallic materials, where the free electrons are mainly responsible for both of these properties, it does not necessarily hold for non-metallic materials. Since the data includes both metallic and non-metallic materials, some of the samples deviated from this line.

For model building, we encoded the compositional features of an input compound into a 580-dimensional kernel mean descriptor [33]. Subsequently, a fully connected neural network was pretrained to learn the mapping from the vectorized composition to thermal conductivity. The network architecture is illustrated in Fig. 6a. The $n$ used to train the thermal conductivity predictor was increased logarithmically in 10 steps from 100 to 5910. During the transfer learning phase, 80% of the electrical conductivity data were randomly selected for fine-tuning, and the remaining 20% served as the test set to evaluate the performance of the transferred model. This procedure was repeated 500 times with different randomly selected sample sets.

Figure 6c shows the observed scaling behaviors. The predictive performance improved linearly on the logarithmic scale, with the estimated power-law function $1.44\, n^{-0.0836} + 8.05 \times 10^{-7}$. Since the WF law holds for metallic materials, the transfer is expected to be more successful for metallic materials than for non-metallic ones. To investigate the difference in the transferability for metallic and non-metallic materials, we extracted samples that followed the WF law (blue square dots in Fig. 6b) and those that deviate from it (green triangular dots in Fig. 6b). Fig. 6c shows the scaling behaviors separately for each of the two sample sets.
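To make the WF relation concrete, the sketch below computes the thermal conductivity implied by $\kappa = L \sigma T$ and flags whether a measured pair lies near the WF line. The tolerance of half a decade, the function names, and the example inputs are hypothetical; the text does not state the criterion used to separate samples that follow the law from those that deviate.

```python
import math

LORENZ_NUMBER = 2.44e-8  # W Ohm K^-2, value quoted in the text


def wf_thermal_conductivity(sigma, temperature=300.0):
    """Thermal conductivity (W m^-1 K^-1) implied by kappa = L * sigma * T.

    `sigma` is the electrical conductivity in S m^-1.
    """
    return LORENZ_NUMBER * sigma * temperature


def follows_wf_law(kappa, sigma, temperature=300.0, tol_decades=0.5):
    """True if measured kappa lies within `tol_decades` (log10) of the WF line.

    The tolerance is a hypothetical choice made for this illustration.
    """
    predicted = wf_thermal_conductivity(sigma, temperature)
    return abs(math.log10(kappa / predicted)) <= tol_decades


# Illustrative inputs: a copper-like metal and a poorly conducting compound.
metal_like = follows_wf_law(kappa=400.0, sigma=5.96e7)
insulator_like = follows_wf_law(kappa=2.0, sigma=1.0e2)
```

With these inputs the copper-like sample sits close to the WF line, while the poorly conducting one lies several decades above it, as expected when heat is carried mainly by the lattice instead of free electrons.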
As expected, strong scaling was observed for materials for which the WF law holds, but for non-metallic materials that deviate from this law, increasing the amount of thermal conductivity data did not improve predictive performance.

Furthermore, observing transferability individually for different materials allows us to infer the presence or absence of common physical mechanisms between different physical systems. Fig. 6e illustrates the scaling behaviors of all test cases, which clearly distinguishes material species where transfer does or does not scale. The presence or absence of scaling laws and the observed scaling strength could be used to characterize individual materials. Moreover, observing individual transferability provides valuable insights for planning data generation. The overall average performance transitioned into a plateau around $n = 4 \times 10^3$ (Fig. 6c). However, there were several material groups, such as metallic materials, where predictive performance continued to improve logarithmically (Fig. 6d, e). Intuitively, it would be efficient to halt the production of source data for material groups where improvement has plateaued and reallocate resources to groups more likely to scale. Analyzing only the overall average generalization performance overlooks the existence of material groups with the potential to scale even further.

Fig. 5 | Observation of Sim2Real scaling behaviors and predictive capability of COSMO-RS simulations for different polymer classes in the χ parameter prediction task. a Observation of Sim2Real scaling behaviors for different polymer classes in the χ parameter prediction task. Test instances of polymer–solvent pairs were classified into 11 classes based on structural features. The m value is denoted in the upper-right corner of each panel. b Predictive capability of COSMO-RS simulations (horizontal axis) against experimental values (vertical axis) for each of the 11 polymer classes in the χ parameter predictions. [Axis labels: number of simulation samples and MAE in a; simulated value and experimental value in b.]

### Discussion

This study discussed the significance and utility of analyzing the scalability in Sim2Real and Real2Real transfer learning in materials science. Across diverse case studies encompassing polymers and inorganic materials, it was consistently observed that as the size of the computational pretraining dataset increases, the prediction error relative to the experimental data improves according to a power-law relationship. These findings highlight the importance of synergistic effects between computational and experimental approaches. By observing the scaling law for Sim2Real transfer, we can estimate the size of computational datasets required to achieve the desired predictive performance in downstream real-world tasks. Additionally, we provide a microeconomic framework for determining the optimal allocation of computational and experimental resources during the creation of data platforms by analyzing multi-dimensional scaling behaviors. This approach guides decisions on allocating resources to data collection efforts for maximum impact.

The scaling laws of transfer learning provide guiding principles for designing computational databases. It is desirable to create transferable computational databases that scale the generalization performance of downstream tasks for specified target tasks in the real-world domain. Alternatively, it is important to discover real-world tasks and analytical workflows that can be transferred scalably from computational databases. While various computational material-property databases have been developed to date, there are no reported cases in which their value has been quantified from the perspective of scaling laws.
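As a hedged illustration of how a fitted scaling law can be used to size a source dataset, the sketch below inverts a power law of the form $D n^{-\alpha} + C$. The default coefficients are the ones reported above for the thermal-to-electrical conductivity case; the function names `power_law` and `required_source_size` are hypothetical, and the calculation assumes that the fitted curve remains valid when extrapolated beyond the observed range of $n$.

```python
import math


def power_law(n, d=1.44, alpha=0.0836, c=8.05e-7):
    """Predicted error D * n**(-alpha) + C for source sample size n."""
    return d * n ** (-alpha) + c


def required_source_size(target_error, d=1.44, alpha=0.0836, c=8.05e-7):
    """Smallest n at which the fitted power law reaches target_error.

    Returns math.inf when the target is at or below the floor C,
    because the curve never reaches it for finite n.
    """
    if target_error <= c:
        return math.inf
    return (d / (target_error - c)) ** (1.0 / alpha)


# Error predicted at the largest source size used in the study, and the
# source size that the fitted curve would require for a 10% lower error.
error_at_max = power_law(5910)
n_needed = required_source_size(0.9 * error_at_max)
print(f"predicted error at n=5910: {error_at_max:.4f}")
print(f"n for a 10% lower error: {n_needed:.0f}")
```

Because the exponent is small, even a modest reduction in error corresponds to a large multiplicative increase in $n$, which is why the plateau and the per-group analysis matter for deciding where to spend the data-generation budget.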
Strong scalability of transfer to diverse real-world tasks serves as a measure of the usefulness of the computational database.

Fig. 6 | Real2Real transfer learning from thermal to electrical conductivities. a Model architecture. b Parity plot showing values of electrical conductivity (horizontal axis) and thermal conductivity (vertical axis) at 300 K on a logarithmic scale. The dashed line represents the WF law. Blue square dots represent the top 10% of compounds with the smallest deviation from the WF law (WF samples), while green triangular dots correspond to the top 10% of compounds with the largest deviation (non-WF samples). c Scaling behavior. The horizontal axis represents the source data size of the thermal conductivity prediction task, while the vertical axis shows the average MAE for the electrical conductivity prediction task over 500 independent trials (solid line) with the standard deviation and 90% confidence interval calculated using bootstrapping. The black dashed line represents the estimated power-law (equation provided at the bottom left), and the red dashed line indicates the MAE for direct learning. d Scaling behavior for the two extracted datasets. The line color corresponds to the color of the dots in the parity plot in (b). e Scaling behavior for each of the 3640 compounds in the dataset. [Figure labels in a: Descriptor, Linear (512), Linear (256), Linear (128), Linear (64), Linear (32), ReLU, Thermal conductivity, Electrical conductivity, Fine-tuning.]

It is important to recognize that discrepancies always exist between simulated and experimental properties. Additionally, experimental data are subject to biases and fluctuations due to unobserved factors related to the experimental conditions, sample fabrication, noise in measurement systems, and selection bias of the researchers.
Therefore, transfer learning plays a key role in bridging the gap between complex and uncertain real-world scenarios and imperfect computational models. To this end, it is crucial to explicitly demonstrate the transferability and benefits of expanding datasets to downstream tasks. Finding a scheme in which Sim2Real transfer scales is a goal of developing materials databases using simulated data.

### Methods

#### PoLyInfo polymer property datasets

Experimental property datasets for refractive index, density, specific heat capacity ($C_P$), and thermal conductivity were extracted from the polymer property database PoLyInfo [27–29]. The data for density, specific heat capacity, and thermal conductivity were restricted to measurements in amorphous states near room temperature (273–323 K). The number of polymers in each property dataset was 39 for thermal conductivity, 104 for $C_P$, 607 for density, and 234 for refractive index.

#### RadonPy polymer property datasets

To construct the simulation datasets, all-atom classical MD simulations were conducted using RadonPy, a Python library that automates polymer property calculations through high-throughput MD simulations [19]. Input parameters include the chemical structure of polymer repeating units represented by a simplified molecular input line entry system (SMILES) [37] string, polymerization degree, number of polymer chains forming a simulation cell, temperature, and pressure.
The automated calculation workflow consists of the following steps:

1. conformation search for a monomer with the given repeating unit;
2. calculation of electronic properties such as atomic charges using the density functional theory (DFT) method;
3. search for the initial configuration of polymer chains using the self-avoiding random walk algorithm;
4. assignment of force field parameters using the general Amber force field version 2 (GAFF2);
5. generation of isotropic amorphous cells;
6. MD simulations for equilibration;
7. determination of whether equilibrium has been reached;
8. non-equilibrium MD (NEMD) simulations for thermal conductivity calculation;
9. property calculation in the post-processing stage.

The DFT calculations and the MD simulations were executed using Psi4 [38] and LAMMPS, respectively, within the RadonPy interface. An amorphous cell was created with 10 polymer chains comprising approximately 10,000 atoms. Following the initial configuration of polymer chains using the self-avoiding random walk and a 1 ns NVT simulation, the simulation cell was packed isotropically to achieve a density of 0.8 g cm$^{-3}$ at 700 K. The amorphous cell was equilibrated following Larsen's 21-step compression/decompression equilibration protocol [39], undergoing temperature cycles between 300 and 600 K, repeating the ascent and descent for stabilization. After completing the 21-step equilibration process, NpT simulations were conducted for over 5 ns at 300 K and 1 atm until reaching equilibrium. The property calculation methods for the density, specific heat capacity at constant pressure, refractive index, and thermal conductivity were described previously [19] and are detailed in the Supplementary Information.

In collaboration with an academia–industry consortium, we generated the property datasets of approximately $7 \times 10^4$ linear polymers in amorphous states using RadonPy (see Table S1).
The virtual polymers were generated using an N-gram-based polymer structure generator [40] for each of the 20 polymer classes, such as polyimides, polyesters, and polystyrenes, following the classification rule established by PoLyInfo. The chemical structure $X$ of an existing compound used in the training dataset is described by the SMILES representation, where $X$ is represented by a string of length $p$ as $X = x_1 x_2 \ldots x_p$. By using the string set of synthesized polymers belonging to each polymer class, an N-gram language model was trained to obtain a structure generator that mimicked the patterns, such as frequent fragments and appropriate chemical bonding rules, observed for the existing polymers. The 20 class-specific SMILES generators were used to create the virtual library. A list of the 20 polymer classes with their dataset sizes is provided in Table S1.

#### Equivalent sample size for experimental and simulation data

Differentiating the generalization error $R(n, m)$ in Eq. (1) with respect to $n$ and $m$, we obtain the following expression:

$$dR(n, m) = \frac{\partial R}{\partial n}\,dn + \frac{\partial R}{\partial m}\,dm. \tag{4}$$

On the set of equivalent samples $(n, m)$ that maintain the same level $R(n, m) = r$, $R(n, m)$ remains constant at $r$, thus satisfying $dR(n, m) = 0$. Therefore,

$$\frac{dm}{dn} = -\frac{\partial R}{\partial n} \bigg/ \frac{\partial R}{\partial m}$$

holds.

#### Polymer–solvent solubility datasets

Aoki et al. [2] used the experimental values of the χ parameter for 1190 polymer–solvent pairs, consisting of 46 different polymers and 140 different solvent molecules, to train the model. The data were compiled from a supplementary table of Orwoll and Arnold [32]. The dataset also included measurements of the χ parameter for different temperatures and polymer–solvent compositions. The molecular species of the polymers and solvents in the dataset were distributed over a limited region of the entire chemical space. In addition, in certain experimental systems, it is difficult to measure the χ parameters of the polymer–solvent system in an immiscible state, resulting in a significant bias in the distribution of the data.
Therefore, models trained using only this dataset generally have narrow predictive applicability.

To tackle this issue, we utilized the COSMO-RS simulation [41–44] to generate a dataset of χ parameters for 9129 pairs of polymers and solvents at the BP-SVP-AM1 level [2]. The calculations were performed using the TURBOMOLE [45] and COSMOtherm [46] software packages: the former created COSMO files by density functional calculations, and the latter calculated χ parameters from the COSMO files. For polymers, a structure comprising three repeating units was created, in which the two endpoints were replaced by methyl groups. After creating the COSMO files, the COSMOmeso function was executed to calculate the χ parameters using the activity coefficients obtained from the COSMO files.

#### Data preprocessing

In all experiments, variable transformations were applied to the model inputs and outputs to enhance the efficiency of machine-learning model training. The methods included logarithmic transformation, normalization, Yeo–Johnson transformation [47], and min–max transformation (i.e., scaling each feature to a range of [0, 1]).
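The chain of transformations can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' code: "normalization" is taken to mean z-score standardization, the Yeo–Johnson parameter `lam` is fixed by hand (in practice it is usually estimated from the data by maximum likelihood), and the function names are hypothetical.

```python
import math


def yeo_johnson(y, lam):
    """Yeo–Johnson power transformation of a single value y."""
    if y >= 0:
        return math.log1p(y) if lam == 0 else ((y + 1) ** lam - 1) / lam
    if lam == 2:
        return -math.log1p(-y)
    return -(((-y + 1) ** (2 - lam)) - 1) / (2 - lam)


def standardize(values):
    """Z-score standardization (assumed meaning of 'normalization')."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def min_max(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def preprocess(values, lam=0.5):
    """Apply log -> norm -> YJ -> 0-1 to a list of positive values."""
    logged = [math.log(v) for v in values]
    normed = standardize(logged)
    transformed = [yeo_johnson(v, lam) for v in normed]
    return min_max(transformed)


# Synthetic positive values spanning several orders of magnitude.
print(preprocess([1.0, 10.0, 100.0, 1000.0]))
```

All four steps are monotone, so the ordering of the samples is preserved while the scale and shape of the distribution change. The same statistics (mean, standard deviation, minimum, maximum) fitted on training data would normally be reused for test data; the sketch omits this for brevity.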
These methods were tailored for the input and output variables in each of the three applications, as summarized in Table 1.

Table 1 | Data preprocessing methods for input and output variables in three different applications: polymer property prediction using RadonPy (RadonPy), multitask learning of polymer–solvent miscibility (χ parameter), and transfer learning between thermal and electrical conductivities using Starrydata (Starrydata)

| | RadonPy | χ parameter | Starrydata |
|---|---|---|---|
| Input | log → norm → YJ | FFKM¹: log → norm → YJ → 0-1; RDKit + temp + inv-temp²: 0-1 | log → norm → YJ |
| Output | norm → YJ | norm → YJ | norm → YJ → 0-1 |

A combination of four methods, namely logarithmic transformation (log), normalization (norm), Yeo–Johnson transformation (YJ), and min–max transformation (0-1), was applied in the order indicated by the arrows.
¹ Force-field kernel mean (FFKM) descriptor [33].
² Concatenation of RDKit descriptor [34] (RDKit), thermodynamic temperature (temp), and inverse temperature (inv-temp).

#### Model fine-tuning

In Sim2Real transfer learning using RadonPy and in transfer learning between thermal and electrical conductivities, fine-tuning of neural networks was employed. Specifically, the weights of the neural networks pretrained on the source data (MD-calculated property data or experimental data for thermal conductivity) were used as initial values and updated with the target data (experimental observations of polymer properties or electrical conductivity data). The hyperparameters, such as learning rate and batch size, are listed in Table 2.

#### Multitask learning

For the χ parameter prediction task, we employed multitask learning with empirical risk minimization as follows:

$$\min_{f}\ \frac{\lambda}{|D_{\mathrm{sim}}|} \sum_{(\chi, T, p, s) \in D_{\mathrm{sim}}} \bigl(\chi - f(T, p, s)\bigr)^2 + \frac{1 - \lambda}{|D_{\mathrm{exp}}|} \sum_{(\chi, T, p, s) \in D_{\mathrm{exp}}} \bigl(\chi - f(T, p, s)\bigr)^2,$$

where $D_{\mathrm{sim}}$ and $D_{\mathrm{exp}}$ denote the dataset of χ parameters obtained by the COSMO-RS simulation and the experimental dataset, respectively.
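To make the structure of this objective concrete, the following sketch evaluates the weighted empirical risk for a given predictor. It is a simplified illustration: `multitask_risk`, the toy predictor, and the toy records are hypothetical, and the actual model is a neural network trained by gradient-based optimization rather than a fixed function.

```python
def multitask_risk(predict, sim_data, exp_data, lam=0.5):
    """Weighted sum of mean squared errors on simulated and experimental data.

    predict(T, p, s) returns a predicted chi value; each dataset is a list of
    (chi, T, p, s) records. lam weights the simulated term and (1 - lam) the
    experimental term, matching the objective above.
    """
    def mean_squared_error(data):
        return sum((chi - predict(T, p, s)) ** 2 for chi, T, p, s in data) / len(data)

    return lam * mean_squared_error(sim_data) + (1 - lam) * mean_squared_error(exp_data)


# Toy records (chi, T, polymer, solvent); values are synthetic.
sim = [(1.0, 300.0, "polymer-A", "solvent-X"), (0.0, 300.0, "polymer-A", "solvent-Y")]
exp = [(0.5, 300.0, "polymer-A", "solvent-X")]


def constant_predictor(T, p, s):
    """Stand-in for the network f(T, p, s): always predicts chi = 0.5."""
    return 0.5


print(multitask_risk(constant_predictor, sim, exp, lam=0.5))  # 0.125
```

Dividing each sum by its own dataset size keeps the much larger simulated dataset from dominating the loss, so the balance between the two terms is set by the weight alone.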
The neural network model $f(T, p, s)$ is a function of temperature $T$, polymer $p$, and solvent $s$. The first term fits the simulated dataset, while the second term fits the experimentally observed dataset. The hyperparameter $\lambda$ controls the relative importance of these two terms. In this study, we set $\lambda = 0.5$, which is consistent with the value employed by Aoki et al. [2], resulting in learning from both simulation and real systems with equal importance. Other hyperparameters are listed in Table 2.

Table 2 | Hyperparameter settings for model training in the three applications: RadonPy, χ parameter, and Starrydata

| | RadonPy | χ parameter | Starrydata |
|---|---|---|---|
| Optimizer | Adam [48] | Adam [48] | Adam [48] |
| Learning rate | 0.001 | 0.001 | 0.001 (source task), 0.0001 (target task) |
| Batch size | 32 | 16 | 32 |
| Early stopping patience | 5 | 10 | 10 |

In the fine-tuning experiments (RadonPy and Starrydata), the same hyperparameters were used for the source and target tasks, except for the learning rate in Starrydata.

### Data availability

The data supporting the findings of this study will be made available upon reasonable request to the corresponding author. The datasets of the experimental and computational χ parameters can be accessed via GitHub at https://github.com/yoshida-lab/MTL_ChiParameter. The datasets of thermal and electrical conductivity are accessible through the Starrydata project on Figshare at https://figshare.com/projects/Starrydata_datasets/155129.

### Code availability

The code for multitask learning for the χ parameter prediction task is available on GitHub at https://github.com/yoshida-lab/MTL_ChiParameter. Other codes are available upon reasonable request to the corresponding author.

Received: 30 August 2024; Accepted: 12 March 2025

### References

1. Yamada, H. et al. Predicting materials properties with little data using shotgun transfer learning. ACS Cent. Sci. 5, 1717–1730 (2019).
2. Aoki, Y. et al. Multitask machine learning to predict polymer–solvent miscibility using Flory–Huggins interaction parameters. Macromolecules 56, 5446–5456 (2023).
3. Wu, S. et al. Machine-learning-assisted discovery of polymers with high thermal conductivity using a molecular design algorithm. npj Comput. Mater. 5, 66 (2019).
4. Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
5. Szymanski, N. J. et al. An autonomous laboratory for the accelerated synthesis of novel materials. Nature 624, 86–91 (2023).
6. Rao, Z. et al. Machine learning–enabled high-entropy alloy discovery. Science 378, 78–85 (2022).
7. Zhong, M. et al. Accelerated discovery of CO2 electrocatalysts using active machine learning. Nature 581, 178–183 (2020).
8. Kim, M. et al. Artificial intelligence to accelerate the discovery of N2 electroreduction catalysts. Chem. Mater. 32, 709–720 (2019).
9. Liu, C. et al. Machine learning to predict quasicrystals from chemical compositions. Adv. Mater. 33, 2102507 (2021).
10. Liu, C. et al. Quasicrystals predicted and discovered by machine learning. Phys. Rev. Mater. 7, 093805 (2023).
11. Uryu, H. et al. Deep learning enables rapid identification of a new quasicrystal from multiphase powder diffraction patterns. Adv. Sci. 11, 2304546 (2024).
12. Martin, T. B. & Audus, D. J. Emerging trends in machine learning: a polymer perspective. ACS Polym. Au 3, 239–258 (2023).
13. Jain, A. et al. The materials project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).
14. Curtarolo, S. et al. AFLOW: an automatic framework for high-throughput materials discovery. Comput. Mater. Sci. 58, 218–226 (2012).
15. Draxl, C. & Scheffler, M. The NOMAD laboratory: from data sharing to artificial intelligence. J. Phys. Mater. 2, 036001 (2019).
16. Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials design and discovery with high-throughput density functional theory: the open quantum materials database (OQMD). JOM 65, 1501–1509 (2013).
17. Kirklin, S. et al. The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies. npj Comput. Mater. 1, 1–15 (2015).
18. Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 1–7 (2014).
19. Hayashi, Y., Shiomi, J., Morikawa, J. & Yoshida, R. RadonPy: automated physical property calculation using all-atom classical molecular dynamics simulations for polymer informatics. npj Comput. Mater. 8, 222 (2022).
20. Su, H., Qi, C. R., Li, Y. & Guibas, L. J. Render for CNN: viewpoint estimation in images using CNNs trained with rendered 3D model views. In Proc. IEEE International Conference on Computer Vision, 2686–2694 (IEEE, 2015).
21. Movshovitz-Attias, Y., Kanade, T. & Sheikh, Y. How useful is photo-realistic rendering for visual learning? In Computer Vision – ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14, 202–217 (Springer, 2016).
22. Georgakis, G., Mousavian, A., Berg, A. C. & Kosecka, J. Synthesizing training data for object detection in indoor scenes. Rob: Sci. Syst. 043 (2017).
23. Tremblay, J. et al. Training deep networks with synthetic data: bridging the reality gap by domain randomization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 969–977 (IEEE, 2018).
24. Ju, S. et al. Exploring diamondlike lattice thermal conductivity crystals via feature-based transfer learning. Phys. Rev. Mater. 5, 053801 (2021).
25. Mikami, H. et al. A scaling law for syn2real transfer: how much is your pre-training effective? In Machine Learning and Knowledge Discovery in Databases, 477–492 (Springer Nature Switzerland, 2023).
26. Thompson, A. P. et al. LAMMPS - a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
27. Otsuka, S., Kuwajima, I., Hosoya, J., Xu, Y. & Yamazaki, M. PoLyInfo: polymer database for polymeric materials design. 2011 International Conference on Emerging Intelligent Data and Web Technologies, 22–29 (IEEE, 2011).
28. Ishii, M., Ito, T., Sado, H. & Kuwajima, I. NIMS polymer database PoLyInfo (I): an overarching view of half a million data points. Sci. Technol. Adv. Mater. Methods 0, 2354649 (2024).
29. PoLyInfo. https://polymer.nims.go.jp/.
30. Varian, H. R. Intermediate Microeconomics with Calculus: A Modern Approach (WW Norton & Company, New York, NY, 2014).
31. Loschen, C. & Klamt, A. Prediction of solubilities and partition coefficients in polymers using COSMO-RS. Ind. Eng. Chem. Res. 53, 11478–11487 (2014).
32. Orwoll, R. A. & Arnold, P. A. Polymer–Solvent Interaction Parameter χ, pp. 233–257 (Springer New York, New York, NY, 2007).
33. Kusaba, M., Hayashi, Y., Liu, C., Wakiuchi, A. & Yoshida, R. Representation of materials by kernel mean embedding. Phys. Rev. B 108, 134107 (2023).
34. RDKit: Open-source cheminformatics. https://www.rdkit.org/.
35. Katsura, Y. et al. Data-driven analysis of electron relaxation times in PbTe-type thermoelectric materials. Sci. Technol. Adv. Mater. 20, 511–520 (2019).
36. Jones, W. & March, N. H. Theoretical Solid State Physics, vol. 35 (Courier Corporation, 1985).
37. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Computer Sci. 28, 31–36 (1988).
38. Smith, D. G. et al. PSI4 1.4: open-source software for high-throughput quantum chemistry. J. Chem. Phys. 152, 184108 (2020).
39. Larsen, G. S., Lin, P., Hart, K. E. & Colina, C. M. Molecular simulations of PIM-1-like polymers of intrinsic microporosity. Macromolecules 44, 6944–6951 (2011).
40. Ikebata, H., Hongo, K., Isomura, T., Maezono, R. & Yoshida, R. Bayesian molecular design with a chemical language model. J. Comput. Aided Mol. Des. 31, 379–391 (2017).
41. Klamt, A. Conductor-like screening model for real solvents: a new approach to the quantitative calculation of solvation phenomena. J. Phys. Chem. 99, 2224–2235 (1995).
42. Klamt, A., Jonas, V., Bürger, T. & Lohrenz, J. C. W. Refinement and parametrization of COSMO-RS. J. Phys. Chem. A 102, 5074–5085 (1998).
43. Eckert, F. & Klamt, A. Fast solvent screening via quantum chemistry: COSMO-RS approach. AIChE J. 48, 369–385 (2002).
44. Klamt, A. COSMO-RS: From Quantum Chemistry to Fluid Phase Thermodynamics and Drug Design (Elsevier Science, Amsterdam, 2005).
45. TURBOMOLE V7.5.1 2021, a development of University of Karlsruhe and Forschungszentrum Karlsruhe GmbH, 1989-2007, TURBOMOLE GmbH, since 2007. https://www.turbomole.org.
46. BIOVIA COSMOtherm. Release 2022; Dassault Systèmes. http://www.3ds.com.
47. Yeo, I.-K. & Johnson, R. A. A new family of power transformations to improve normality or symmetry. Biometrika 87, 954–959 (2000).
48. Kingma, D. & Ba, J. Adam: a method for stochastic optimization. In International Conference on Learning Representations (ICLR) (San Diego, 2015).

### Acknowledgements

We express our sincere gratitude to all members of ISM-MCC Frontier Materials Design Laboratory, a joint laboratory of Mitsubishi Chemical Corporation (MCC) and the Institute of Statistical Mathematics (ISM), for their valuable contributions to the discussion of this study. This research received support from MEXT as "Program for Promoting Researches on the Supercomputer Fugaku" (project ID: hp210264), JST CREST (Grant Numbers JPMJCR19I3, JPMJCR22O3, JPMJCR2332), MEXT/JSPS KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas (19H05820), Grant-in-Aid for Scientific Research (A) (19H01132), Grant-in-Aid for Research Activity Start-up (23K19980), and Grant-in-Aid for Scientific Research (C) (22K11949).
Computational resources were provided by Fugaku at the RIKEN Center for Computational Science, Kobe, Japan (hp210264) and the supercomputer at the Research Center for Computational Science, Okazaki, Japan (project: 23-IMS-C113, 24-IMS-C107).

### Author contributions

R.Y. and S.M. devised the project, the main conceptual ideas, and the proof outline. S.M. and Y.H. implemented the machine-learning algorithms and conducted the experiments with the support of R.Y., S.W., H.S., and K.S. Y.H. performed the MD simulations using RadonPy to generate the polymer property data. K.S. generated the χ parameter dataset using the COSMO-RS simulations. K.F. performed a theoretical analysis of the Sim2Real transfer learning. H.S. examined the results from a physicochemical point of view. M.I. and I.K. extracted and structured data from PoLyInfo. S.M. and R.Y. wrote the manuscript.

### Competing interests

The authors declare no competing interests.

### Additional information

Supplementary information: The online version contains supplementary material available at https://doi.org/10.1038/s41524-025-01606-5.

Correspondence and requests for materials should be addressed to Ryo Yoshida.

Reprints and permissions information is available at http://www.nature.com/reprints

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2025