# Fileset

[article.pdf](https://mdr.nims.go.jp/filesets/0158c2c0-7f38-48cc-93d8-3c3ad82f6e6d/download)

## Creator

[ISHIKAWA, Atsushi](https://orcid.org/0000-0001-6908-831X), [SODEYAMA, Keitaro](https://orcid.org/0000-0002-9228-0729), [IGARASHI, Yasuhiko](https://orcid.org/0000-0003-1042-6657), [NAKAYAMA, Tomofumi](https://orcid.org/0000-0003-1240-3571), [TATEYAMA, Yoshitaka](https://orcid.org/0000-0002-5532-6134), [OKADA, Masato](https://orcid.org/0000-0002-9040-8784)

## Rights



## Other metadata

[Machine learning prediction of coordination energies for alkali group elements in battery electrolyte solvents](https://mdr.nims.go.jp/datasets/dba43e9b-fd4a-466f-8e2e-27cec40e9daa)

## Fulltext

Machine learning prediction of coordination energies for alkali group elements in battery electrolyte solvents†

Atsushi Ishikawa\* (a, b, c, d), Keitaro Sodeyama\* (a, c, d), Yasuhiko Igarashi (a, e), Tomofumi Nakayama (e), Yoshitaka Tateyama (b, c, e) and Masato Okada (e)

Cite this: Phys. Chem. Chem. Phys., 2019, 21, 26399–26405. This journal is © the Owner Societies 2019.

We combined a data science-driven method with quantum chemistry calculations and applied it to the battery electrolyte problem. We performed quantum chemistry calculations on the coordination energy (Ecoord) of five alkali metal ions (Li, Na, K, Rb, and Cs) to electrolyte solvents, which is intimately related to ion transfer at the electrolyte/electrode interface. Three regression methods, namely, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), were employed to find the relationship between Ecoord and descriptors. Descriptors include both ion and solvent properties, such as the radius of metal ions or the atomic charge of solvent molecules. Our results clearly indicate that the ionic radius and the atomic charge of the oxygen atom that is connected to the metal ion are the most important descriptors. A prediction accuracy of 0.127 eV for Ecoord was obtained using ES-LiR, meaning that we can predict Ecoord for any alkali ion without performing quantum chemistry calculations for ion–solvent pairs.
Further improvement in the prediction accuracy was made by applying the exhaustive search with Gaussian process, which yields a prediction accuracy of 0.016 eV for Ecoord.

### Introduction

Electrolytes are indispensable components of rechargeable secondary batteries, and finding good electrolytes is a key issue in the development of next-generation batteries [1,2]. Currently, electrolyte solvents for Li ion batteries (LIBs) have been established, such as ethylene carbonate, propylene carbonate, dimethyl carbonate, diethyl carbonate, and ethyl methyl carbonate. However, we still have only limited knowledge about electrolyte solvents for other metal ions (Na, K, Mg, Ca, etc.). Considering the limited Li resources in the Earth's crust, it is necessary to develop alternative batteries that use more abundant metal ions. Thus, extending our knowledge of LIBs to other systems, such as Na or K ion batteries, is critical for future battery technology [3,4]. Ideal batteries should possess high voltage and high capacity, as well as fast charging/discharging. From an atomistic perspective, in the charging/discharging processes the ions are transferred between the anode and cathode, and thus ion diffusion between the electrode and electrolytes determines the charge–discharge rate.

The ion transfer between the electrolyte and the electrode has a large impact on the ion transport of the whole battery. The overall process of ion transfer between electrolyte and electrode is complicated, mainly because of the formation of the solid–electrolyte interface layer. Therefore, finding the direct relationship between ion transfer efficiency and the properties of isolated molecules is quite a challenging task. In spite of these difficulties, several studies have shown that the character of the single ion–solvent pair is useful for understanding the tendencies in the ion transfer at the electrolyte/electrode interface.
For example, the activation energy of electrolyte–electrode Li transfer is largely influenced by the desolvation energy of the ion from the electrolyte molecule [5,6]. This suggests that ion–solvent interaction is one of the important factors governing the ion transfer phenomenon. In this context, the coordination energy of the ion to the solvent (Ecoord) can be a good indicator for ion transfer at the electrolyte/electrode interface. Indeed, several studies have investigated Ecoord of Li, Na, and K with various solvent molecules using quantum chemistry methods [7,8]. For this reason, the search for battery electrolytes based on Ecoord would be an efficient and important approach.

Author affiliations: (a) PRESTO, Japan Science and Technology Agency (JST), 4-1-8 Honcho, Kawaguchi, Saitama 333-0012, Japan; (b) Center for Green Research on Energy and Environmental Materials (GREEN), and International Center for Materials Nanoarchitectonics, National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan, E-mail: ISHIKAWA.Atsushi@nims.go.jp, SODEYAMA.Keitaro@nims.go.jp; (c) Center for Materials Research by Information Integration (cMI2), Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), 1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan; (d) Elements Strategy Initiative for Catalysts & Batteries (ESICB), Kyoto University, 1-30 Goryo-Ohara, Nishikyo-ku, Kyoto 615-8245, Japan; (e) Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan.

† Electronic supplementary information (ESI) available. See DOI: 10.1039/c9cp03679b. Received 30th June 2019, accepted 17th November 2019. Open Access Article, published in PCCP on 18 November 2019 and licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence (http://creativecommons.org/licenses/by-nc/3.0/).

Recently, great advances have been made in machine learning-based or data science-driven approaches. These approaches, in combination with high-throughput theoretical calculations, have also been applied to battery electrolytes [9–15]. For example, a computational screening of over 12 000 materials has been reported for solid electrolytes in LIBs [16,17]. Existing studies have mainly focused on solid electrolytes, while investigations on liquid electrolytes are limited [18,19]. This is mainly because a solid system has a rather rigid structure, so extracting structural, electronic, and energetic information from it is straightforward. By comparison, a liquid system is much more flexible in terms of molecular structure, making the extraction of structural information more challenging.

In the present study, a machine learning-based technique, in combination with quantum chemistry calculations, was applied to the battery electrolyte problem, to derive an accurate and efficient method to predict values of Ecoord.
Here, we consider coordination of alkali metal ions (Li, Na, K, Rb, and Cs) to electrolyte solvents, and use Ecoord calculated by quantum chemistry methods as the target property. To the best of our knowledge, computational evaluation of Ecoord for such a wide range of alkali metals has not previously been reported. We expect that the combination of computational chemistry and data science-driven methods will be of great benefit in the search for electrolytes for next-generation batteries. Extending our knowledge of electrolyte solvents to metal ions other than Li would facilitate the computational screening of materials in post-LIBs.

### Theoretical background

#### Data science method

Predicting Ecoord with simple physical or chemical properties of the solvent has two main advantages: (i) reduced cost of quantum chemistry calculations, since the computation for the ion–solvent complex is avoided, and (ii) it provides a fundamental understanding of the ion–solvent interaction, because it shows which solvent properties are critical for estimating the Ecoord value. In the present case, we can regard Ecoord and the solvent properties as the target property and descriptors, respectively. Finding the relationship between these two sets is often called the variable selection problem.

Among several approaches for variable selection, the simplest one is multiple linear regression (MLR). However, MLR often suffers from redundant descriptors when their number becomes large. Sparseness of the variable space is useful to alleviate this redundancy and to avoid overfitting. Recently, sparse methods, such as the least absolute shrinkage and selection operator (LASSO), have been applied to many problems [20]. Despite its success, LASSO gives only one combination of descriptors, which is not guaranteed to be the best among all possible combinations of descriptors.
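
To make the sparsity argument concrete, the following is a minimal sketch, not the authors' implementation, of how an L1 penalty drives some regression coefficients exactly to zero while plain least squares keeps all of them non-zero. Everything in it is an assumption made for illustration: the data are synthetic, `lasso_cd` is a hypothetical helper written here with cyclic coordinate descent, and the penalty weight `lam` is chosen by hand rather than by cross-validation.

```python
import numpy as np

def soft_threshold(rho, t):
    """Shrink rho toward zero by t; values with |rho| <= t become exactly zero."""
    return np.sign(rho) * max(abs(rho) - t, 0.0)

def lasso_cd(X, z, lam, n_iter=200):
    """Minimize sum((z - X w)^2) + lam * sum(|w|) by cyclic coordinate descent."""
    n_var = X.shape[1]
    w = np.zeros(n_var)
    col_sq = np.sum(X ** 2, axis=0)
    for _ in range(n_iter):
        for j in range(n_var):
            # Residual with the contribution of variable j removed.
            residual = z - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual
            w[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return w

# Synthetic data (illustrative only): the target depends on x0 and x3.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
z = 0.7 * X[:, 0] - 0.2 * X[:, 3] + 0.05 * rng.normal(size=200)

w_mlr, *_ = np.linalg.lstsq(X, z, rcond=None)  # ordinary least squares (no penalty)
w_lasso = lasso_cd(X, z, lam=20.0)             # hand-picked penalty weight

print("MLR  :", np.round(w_mlr, 3))
print("LASSO:", np.round(w_lasso, 3))
```

In this toy setting the least-squares fit assigns small non-zero weights to the four irrelevant variables, whereas the penalized fit removes them at the cost of slightly shrinking the two relevant coefficients.
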
In order to analyze the stability of the chosen descriptor combination, examining combinations other than the optimal one is informative.

Recently, we showed that the exhaustive search with linear regression (ES-LiR) method, proposed and developed by Okada and co-workers, is quite useful in this context [21–23]. In the ES-LiR method, all combinations of variables are tested, guaranteeing that the best combination is found. Thus, the ES-LiR method is a new and powerful solution for variable selection.

Based on the above considerations, here, we applied the MLR, LASSO, and ES-LiR methods to find the relationship between Ecoord and solvent properties. The MLR was performed by minimizing the least-squares error

$$E = \sum_{\mu} \left( z^{\mu} - \sum_{i} w_i x_i^{\mu} \right)^2 \qquad (1)$$

where $z$ and $x_i$ ($i = 1, \ldots, N_\mathrm{var}$) are the target value and the $i$th explanatory variable, respectively, $N_\mathrm{var}$ is the total number of variables, the superscript $\mu$ indexes the data points, and $w_i$ are the regression coefficients. LASSO adds a penalty term, weighted by a parameter $\lambda$, to the error function:

$$E = \sum_{\mu} \left( z^{\mu} - \sum_{i} w_i x_i^{\mu} \right)^2 + \lambda \sum_{i} |w_i| \qquad (2)$$

If $\lambda$ is sufficiently large, some of the coefficients $w_i$ become zero. This makes the model sparse with respect to explanatory variables. To determine $\lambda$, we used the tenfold cross-validation error (CV error), that is, the whole data set was divided into training and validating data in ten different ways. The ES-LiR can be defined by introducing the indicator

$$c = (c_1, c_2, \ldots, c_N) \in \{0, 1\}^N \qquad (3)$$

where each variable $c_i$ is either 0 or 1.
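
The exhaustive search over indicators can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the published ES-LiR code: the data are synthetic, the helper names `cv_rmse` and `exhaustive_search` are hypothetical, the CV error is taken to be a root-mean-square error (the text does not specify the exact error measure), and no intercept is fitted because the synthetic variables are already centered.

```python
import itertools
import numpy as np

def cv_rmse(X, z, n_folds=10, seed=0):
    """Tenfold cross-validated RMSE of a least-squares fit of z on the columns of X."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(z)), n_folds)
    sq_err = 0.0
    for test_idx in folds:
        train = np.ones(len(z), dtype=bool)
        train[test_idx] = False
        w, *_ = np.linalg.lstsq(X[train], z[train], rcond=None)
        sq_err += np.sum((z[test_idx] - X[test_idx] @ w) ** 2)
    return float(np.sqrt(sq_err / len(z)))

def exhaustive_search(X, z):
    """Score every non-empty indicator c in {0,1}^N; return (error, c) sorted by error."""
    n_var = X.shape[1]
    results = []
    for c in itertools.product([0, 1], repeat=n_var):
        if not any(c):
            continue  # skip the empty model
        mask = np.array(c, dtype=bool)
        results.append((cv_rmse(X[:, mask], z), c))
    return sorted(results)

# Synthetic data (illustrative only): the target depends on x0 and x2.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
z = 0.7 * X[:, 0] - 0.2 * X[:, 2] + 0.05 * rng.normal(size=200)

ranking = exhaustive_search(X, z)
best_error, best_c = ranking[0]
print("best indicator:", best_c, "CV error:", round(best_error, 3))
```

Because `ranking` holds every combination and not only the winner, the same list can be used to inspect how the CV error and the fitted coefficients vary across near-optimal combinations. The cost grows as 2^N, so the approach is practical only for a modest number of descriptors.
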
An indicator represents a combination of non-zero explanatory variables, and using this indicator the error in the ES-LiR can be written as

$$E = \sum_{\mu} \left( z^{\mu} - \sum_{i} w_i c_i x_i^{\mu} \right)^2. \qquad (4)$$

After making an exhaustive search of the 0–1 combinations in $c_i$, $w_i$ is found by minimizing the tenfold CV error.

The exhaustive search with Gaussian process (ES-GP) is also an exhaustive search method, like ES-LiR [24]. In ES-LiR, the regression method is linear regression, while in ES-GP it is a Gaussian process (GP) [25]. In the GP, the error is written in terms of the predicted value as

$$E = \sum_{\mu} \left( z^{\mu} - k^{\mu}(c)^{T} \left\{ \left( K(c) + \sigma^2 I \right)^{-1} \right\}^{T} y \right)^2 \qquad (5)$$

where

$$k^{\mu}(c) = \left( k(x_1, x_\mu), \ldots, k(x_n, x_\mu) \right)^{T} \qquad (6)$$

$$k(x_\nu, x_\mu) = \exp\left( -\beta \, |x_\nu(c) - x_\mu(c)| \right)^2 \qquad (7)$$

is a kernel function with hyperparameter $\beta$, $(x_1, \ldots, x_n)$ are the training data, $n$ is the number of training data, $x_\nu(c)$ is the vector containing only the elements with $c_i = 1$ extracted from the $\nu$th sample $x_\nu$,

$$K(c) = \{ k(x_\nu, x_\xi) \}_{\nu,\xi} \quad (1 \le \nu, \xi \le n) \qquad (8)$$

is the kernel matrix, $\sigma^2$ is the variance of the noise, $I$ is an identity matrix and $y = (y_1, \ldots, y_n)^{T}$ is the target variable of the training data. By minimizing $E$, the optimal $\sigma$, $\beta$, and $c$ are found.

#### Quantum chemistry calculation

For the electrolyte solvent database, we selected 70 solvents taken from commercialized battery-grade materials from KISHIDA Chemical Co., Ltd. [26] The full list of electrolyte solvents examined is shown in Table S1 (ESI†). The electrolyte database is close to the one used in ref. 21. Some experimental data are included as descriptors here, namely, the melting point, boiling point, and density taken from ref. 26. The metal ions are described by their experimental ionic radii.

In the present study, Ecoord was defined by the following formula

$$E_\mathrm{coord} = E_\mathrm{ion\text{–}solv} - (E_\mathrm{solv} + E_\mathrm{ion}) \qquad (9)$$

where $E_\mathrm{ion\text{–}solv}$, $E_\mathrm{solv}$, and $E_\mathrm{ion}$ are the total energies of the ion–solvent system, the solvent, and the ion, respectively. The total energy is defined as the sum of the electronic and nuclear repulsion energies.

Density functional theory (DFT) was used in the electronic structure calculation. M06-2X was used for the exchange–correlation functional, since this functional is reported to accurately predict the thermodynamic properties of main group elements [27,28]. The Def2-SVP basis set was used for all the elements, and the pseudo-potential was used for K, Rb, and Cs [29]. Another alkali ion, Fr, is omitted in this work because it is unstable and radioactive, and thus not relevant for batteries. Atomic charges were calculated by the natural population analysis method proposed by Weinhold et al., using the NBO 6 program [30]. All the calculations were performed with Gaussian16 [31].

For the descriptors or explanatory variables, the following were used as 'computational' descriptors: energies of the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO), dipole moment, natural bond orbital (NBO) charge of the O atom that coordinates to the metal ion, total energy (i.e. electronic energy plus nuclear repulsion), and total dipole moment. From an atomic/molecular perspective, the ion–solvent interaction can be understood as an acid–base interaction, since the ion works as a hard acid and the solvent works as a hard or soft Lewis base. Common organic electrolyte solvents have alkoxy or carbonyl groups, and in these cases the O atom works as the Lewis base site. For this reason, we assumed that the ion coordinated to this O atom. Also, the NBO charge on the coordinating O atom was included in the descriptors.
For the optimized geometries of the cation-coordinated system, see Fig. S1 in the ESI.† The computational properties of the solvent are obtained by DFT calculation of the pure solvent, i.e. without ions. All the experimental and computational descriptors for the solvent molecules are shown in Table 1.

### Results and discussion

First, we discuss the accuracy of the three methods to estimate the true (i.e. DFT-calculated) Ecoord values. Here, the data set includes all the Ecoord data (i.e. coordination of Li, Na, K, Rb, and Cs to solvent molecules). In other words, solvent descriptors and ion descriptors were independently made and combined to form the whole data set. Since we have 70 solvents, the Ecoord data set consists of 5 × 70 = 350 points [32].

Our calculated Ecoord values for Li, Na, K, Rb, and Cs are summarized in the bar chart in Fig. 1, and selected numerical values for Ecoord are shown in Table 2. The ranges of Ecoord for the five ions are: Li −1.32 to −2.91 eV (mean value: −2.20 eV), Na −0.88 to −2.18 eV (−1.60 eV), K −0.61 to −1.73 eV (−1.20 eV), Rb −0.55 to −1.60 eV (−1.11 eV), and Cs −0.46 to −1.44 eV (−0.98 eV). Thus, the magnitude of Ecoord for the metal ions can be ranked as Li > Na > K ~ Rb > Cs.

Next, we examined the regression of Ecoord from the solvent and ion properties. Fig. 2 demonstrates a good correlation between Ecoord values calculated by DFT and those estimated by ES-LiR. The CV error for ES-LiR in Fig. 2 was 0.127 eV. This is only 5.7% of the average Li coordination energy, indicating that the regression formula from ES-LiR gives accurate results. We also observe that the prediction accuracy tends to be lower at Ecoord < −2.5 eV. As we shall see later, the important descriptors are the O charge and the total dipole. The deviation from this regression formula indicates other effects; for example, large distortion of the ion–solvent complex would contribute to large Ecoord values.

The accuracy of the estimation methods can be evaluated by the CV errors.
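
The way ion and solvent descriptors are built independently and then combined into a 350-point data set can be sketched as follows. This is only an illustration of the bookkeeping: the descriptor values are random placeholders rather than the paper's data, the counts of three ion descriptors and ten solvent descriptors follow Table 1, and the final standardization step is an assumption that the text does not state explicitly.

```python
import numpy as np

ion_names = ["Li", "Na", "K", "Rb", "Cs"]
n_ions, n_solvents = len(ion_names), 70

# Placeholder descriptor tables (random numbers, NOT the paper's values).
rng = np.random.default_rng(0)
ion_desc = rng.normal(size=(n_ions, 3))           # e.g. ionic radius, electronegativity, atomic weight
solvent_desc = rng.normal(size=(n_solvents, 10))  # e.g. NBO charge, HOMO, LUMO, dipole, ...

# Every ion is paired with every solvent: rows 0-69 are Li, rows 70-139 are Na, and so on.
X = np.hstack([
    np.repeat(ion_desc, n_solvents, axis=0),  # ion block, each row repeated 70 times
    np.tile(solvent_desc, (n_ions, 1)),       # solvent block, whole table repeated 5 times
])

# Assumed preprocessing: zero mean and unit variance per descriptor, so that
# fitted coefficients can be compared across descriptors.
X = (X - X.mean(axis=0)) / X.std(axis=0)

print(X.shape)  # (350, 13)
```

With this layout, no descriptor in a row depends on the ion–solvent pair itself, which is what allows a prediction for a new pair without a DFT calculation on the complex.
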
The smallest CV errors calculated with the MLR, LASSO, and ES-LiR methods were 0.1280, 0.1278, and 0.1271 eV, respectively. These values are shown in Table 3, together with selected combinations of descriptors. Values in Table 3 suggest that ES-LiR gives the smallest CV error and thus the best prediction accuracy, although the differences between the three methods are moderate.

Table 1. Descriptors of solvents and ions used in the present study. 'Experimental' and 'computational' mean descriptors taken from experimental values or calculated by DFT, respectively.

| Type | Descriptors |
|---|---|
| Experimental | Cations: ionic radius, electronegativity, atomic weight. Solvents: boiling point, melting point, flashing point, density |
| Computational | Solvents: NBO charge on coordinating O atom, HOMO energy, LUMO energy, total dipole moment, total energy, molecular weight |

Fig. 1. Ecoord values of 70 solvents and five ions (Li, Na, K, Rb, and Cs).

It is well known that the CV error is intimately related to the choice of descriptors. Since the ES-LiR examines all combinations of descriptors, it is always guaranteed to choose the best combination. In all three regression formulae, the ionic radius of the metal ion has the largest coefficient and thus it is the most important descriptor. This can be understood in terms of Pearson's hard–soft acid–base rule, which states that the smaller ion has hard acid character. The positive coefficient of ionic radius in Table 3 indicates that smaller ions give smaller (more negative) Ecoord values, and thus a stronger ion–solvent interaction.
After the ionic radius, the NBO charge on the O atom coordinating to the ion has the second largest coefficient. Since the ion–solvent interaction mainly has an electrostatic cationic–anionic character, a more negative O charge leads to a stronger interaction and thus a larger magnitude of Ecoord. This conclusion is the same as in our previous work, in which the O atomic charge was the most important descriptor for Li coordination to electrolyte solvent molecules [21]. We also found that the total dipole has a relatively large coefficient. The charge–dipole interaction adds to the charge–charge electrostatic interaction, so it also contributes to the ion–solvent interaction.

Another important difference among the three regression methods is the sparseness of the regression formula. In MLR and LASSO, all descriptors have non-zero coefficients, and thus these methods are the least sparse among the three. In contrast, ES-LiR gives a sparser regression formula because three descriptors (HOMO energy, melting point, and density) have zero coefficients. This indicates that the regression formula given by ES-LiR is the most accurate of the three methods, and at the same time its physical and chemical meanings are the easiest to interpret.

Up to now, our discussion has been based on the optimal combination of descriptors that minimizes the CV error.
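
The coefficient ranking discussed above can be checked with a few lines of arithmetic. The sketch below simply sorts the ES-LiR coefficients reported in Table 3 of this article by absolute value; the variable names are illustrative, and treating the absolute coefficient as a measure of importance presumes that the descriptors were put on a comparable scale before fitting, which is an assumption here.

```python
# ES-LiR coefficients copied from Table 3 of this article.
es_lir = {
    "Ionic radius": 0.6637,
    "Electronegativity": 0.1612,
    "Atomic weight": -0.0986,
    "NBO charge of O atom": 0.1860,
    "HOMO energy": 0.0000,
    "LUMO energy": 0.0273,
    "Total dipole": -0.1475,
    "Total energy": -0.1476,
    "Boiling point": -0.0977,
    "Flashing point": 0.1182,
    "Melting point": 0.0000,
    "Molecular weight": -0.1215,
    "Density": 0.0000,
}

# Rank descriptors by the magnitude of their coefficient.
ranked = sorted(es_lir.items(), key=lambda kv: abs(kv[1]), reverse=True)
selected = [name for name, w in ranked if w != 0.0]
dropped = [name for name, w in es_lir.items() if w == 0.0]

for name, w in ranked[:3]:
    print(f"{name:22s} {w:+.4f}")
print("dropped:", dropped)
```

The ordering reproduces the statements in the text: the ionic radius dominates, the NBO charge of the coordinating O atom comes second, and three descriptors are removed from the ES-LiR formula.
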
Estimation accuracy for other descriptor combinations can also be found using the ES-LiR, because this method examines all combinations of descriptors. The number of descriptor combinations within a fixed CV error range is summarized by the histogram in Fig. 3, where descriptor combinations that reduce the CV error to below 0.14 eV are rather rare. From this, we can infer that the combination of particular descriptors is important for achieving accuracy.

Table 2. DFT-calculated Ecoord (eV) of Li, Na, K, Rb, and Cs for 23 selected solvents.

| Solvent | Li | Na | K | Rb | Cs |
|---|---|---|---|---|---|
| Ethylene carbonate | −2.343 | −1.747 | −1.365 | −1.272 | −1.135 |
| Propylene carbonate | −2.399 | −1.789 | −1.397 | −1.307 | −1.165 |
| Vinylene carbonate | −2.179 | −1.610 | −1.246 | −1.157 | −1.025 |
| Fluoroethylene carbonate | −2.128 | −1.569 | −1.210 | −1.129 | −1.001 |
| Dimethyl carbonate | −2.068 | −1.454 | −1.078 | −0.968 | −0.842 |
| Diethyl carbonate | −2.130 | −1.492 | −1.106 | −1.010 | −0.877 |
| Ethyl methyl carbonate | −2.114 | −1.488 | −1.108 | −1.006 | −0.878 |
| Furan | −1.320 | −0.884 | −0.605 | −0.545 | −0.461 |
| Tetrahydrofuran | −2.047 | −1.454 | −1.065 | −0.978 | −0.851 |
| Ethyl acetate | −2.206 | −1.574 | −1.185 | −1.083 | −0.950 |
| Isopropyl acetate | −2.222 | −1.585 | −1.187 | −1.093 | −0.958 |
| Methyl propionate | −2.138 | −1.524 | −1.133 | −1.030 | −0.896 |
| Methyl formate | −2.011 | −1.444 | −1.082 | −0.981 | −0.861 |
| Vinyl acetate | −2.052 | −1.454 | −1.076 | −0.984 | −0.857 |
| Sulfolane | −2.481 | −1.879 | −1.450 | −1.350 | −1.200 |
| Dimethyl sulfoxide | −2.905 | −2.183 | −1.725 | −1.590 | −1.427 |
| Cyclohexanone | −2.259 | −1.654 | −1.265 | −1.158 | −1.025 |
| Benzaldehyde | −2.177 | −1.570 | −1.188 | −1.085 | −0.958 |
| Benzyl benzoate | −2.758 | −2.139 | −1.682 | −1.591 | −1.441 |
| Diphenyl ether | −1.625 | −1.120 | −0.758 | −0.738 | −0.638 |
| Acetone | −2.190 | −1.600 | −1.219 | −1.117 | −0.987 |
| Chloroacetone | −1.938 | −1.399 | −1.047 | −0.964 | −0.845 |
| Methyl acrylate | −2.195 | −1.570 | −1.178 | −1.069 | −0.938 |

Fig. 2. Comparison between Ecoord calculated by DFT (x-axis) and that predicted by ES-LiR (y-axis). The diagonal line corresponds to a perfect match.

Table 3. Coefficients of descriptors in the three regression formulae (MLR, LASSO, and ES-LiR) with the smallest CV error, and their CV errors (eV).

| Descriptor | MLR | LASSO | ES-LiR |
|---|---|---|---|
| Ionic radius | 0.6637 | 0.6542 | 0.6637 |
| Electronegativity | 0.1612 | 0.1569 | 0.1612 |
| Atomic weight | −0.0986 | −0.0930 | −0.0986 |
| NBO charge of O atom | 0.1832 | 0.1751 | 0.1860 |
| HOMO energy | 0.0121 | 0.0111 | 0.0000 |
| LUMO energy | 0.0260 | 0.0248 | 0.0273 |
| Total dipole | −0.1467 | −0.1420 | −0.1475 |
| Total energy | −0.1384 | −0.1261 | −0.1476 |
| Boiling point | −0.0956 | −0.0941 | −0.0977 |
| Flashing point | 0.1154 | 0.1034 | 0.1182 |
| Melting point | −0.0202 | −0.0151 | 0.0000 |
| Molecular weight | −0.1156 | −0.1051 | −0.1215 |
| Density | 0.0249 | 0.0270 | 0.0000 |
| CV error | 0.1280 | 0.1278 | 0.1271 |

Fig. 3. The number of counts for the CV error (i.e. histogram) for various descriptor combinations. The orange, green, and red symbols show the smallest CV errors for ES-LiR, LASSO, and MLR, respectively.

This issue can be analyzed with the linear coefficients of the accurate regression formulae. This is another important piece of information obtained by ES-LiR. The plot of linear coefficients for ten descriptor combinations that give low CV errors is shown in Fig. 4.
We call this the 'weight diagram', where each color represents the magnitude of the fitted coefficient. Since the diagram shows the contribution of each descriptor for several descriptor combinations, the stability of the important descriptors can be read from it. We consider that analysis with several regression formulae is important, because multicollinearity often occurs in the linear regression model; inspecting the descriptor weights for multiple regression models is more robust than analysis based on a single regression model.

In the weight diagram, the ionic radius has the largest contribution to the regression formula in all top 20 descriptor combinations. Thus, as stated above, it is both the most important and the most stable descriptor for Ecoord prediction in the present descriptor set. The next most important descriptor is the NBO charge of the coordinating O atom, which is also a stable descriptor among the 20 combinations. Other descriptors, such as dipole moment, boiling point, and density, are also important, but their stability is not as high as that of the ionic radius or the solvent O NBO charge.

We also note that the atomic weights of the cation species have a large weight. The atomic weight works as a secondary factor for the ionic radius, as can be confirmed by carrying out the ES-LiR without the ionic radius; in this case the atomic weights have the largest weight in the regression formula. However, the calculated CV error is considerably higher (0.2807 eV), indicating that the ionic radius does much better in the linear regression model.

Finally, we applied the ES-GP method for Ecoord prediction. The ES-GP method, like ES-LiR, examines all the possible combinations of descriptors, while regression of the target value is done with the Gaussian process.
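
A compact way to see what changes when linear regression is swapped for a Gaussian process inside the exhaustive search is sketched below. It is a simplified illustration, not the authors' ES-GP code: the data are synthetic and non-linear, the helper names are hypothetical, the kernel is assumed to be the common squared-exponential form exp(−β|x − x′|²), and the hyperparameters `beta` and `sigma2` are fixed by hand, whereas the method described in this paper optimizes σ, β, and c together.

```python
import itertools
import numpy as np

def kernel(A, B, beta):
    """Squared-exponential kernel exp(-beta * |a - b|^2) between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-beta * d2)

def gp_mean(X_train, y_train, X_test, beta, sigma2):
    """GP predictive mean k^T (K + sigma^2 I)^{-1} y for each test row."""
    K = kernel(X_train, X_train, beta) + sigma2 * np.eye(len(X_train))
    return kernel(X_test, X_train, beta) @ np.linalg.solve(K, y_train)

def cv_rmse_gp(X, y, beta=1.0, sigma2=1e-2, n_folds=10, seed=0):
    """Tenfold cross-validated RMSE of the GP predictive mean (fixed hyperparameters)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sq_err = 0.0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        pred = gp_mean(X[train], y[train], X[test_idx], beta, sigma2)
        sq_err += np.sum((y[test_idx] - pred) ** 2)
    return float(np.sqrt(sq_err / len(y)))

# Synthetic non-linear target (illustrative only): depends on x0 and x1.
rng = np.random.default_rng(2)
X = rng.uniform(-2.0, 2.0, size=(150, 4))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=150)
y = y - y.mean()  # the sketch uses a zero-mean GP prior

# Exhaustive search over all non-empty indicators c in {0,1}^4.
ranking = sorted(
    (cv_rmse_gp(X[:, np.array(c, dtype=bool)], y), c)
    for c in itertools.product([0, 1], repeat=4)
    if any(c)
)
best_error, best_c = ranking[0]
print("best indicator:", best_c, "CV error:", round(best_error, 3))
```

A linear model cannot represent the sine and quadratic terms of this toy target, while the kernel-based predictor can, which mirrors the motivation for moving from ES-LiR to ES-GP.
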
The Gaussian process includes non-linear terms of the descriptors, which were not taken into account in the ES-LiR method. Owing to this feature, we can expect higher prediction accuracy with ES-GP, as was already shown in our previous study [24]. Here, the same data set used for ES-LiR was used for ES-GP. We used the following seven descriptors in the ES-GP: ionic radius, NBO charge, total dipole moment, total energy, boiling point, melting point, and density. We selected these descriptors as they minimize the CV error of the ES-GP prediction; the dependence of the CV error on the number of descriptors is shown in Fig. S2 in the ESI.†

In Fig. 5, we compare the Ecoord values calculated by DFT and predicted by ES-GP. The CV error for ES-GP was 0.016 eV, which is significantly better than that for ES-LiR (0.127 eV). The accuracy of the ES-GP method corresponds to 1.54 kJ mol⁻¹, which is sufficient for most purposes in battery-related studies. From these results, we can conclude that the combined use of ES-LiR and ES-GP is advantageous in obtaining good physical or chemical intuition and achieving high prediction accuracy.

Fig. 4. Weight diagram for the descriptors of the top 20 combinations with small CV error in ES-LiR. Descriptors with coefficients smaller than $10^{-10}$ are shown as white boxes.

Fig. 5. Comparison between Ecoord calculated by DFT (x-axis) and predicted by ES-GP (y-axis). The diagonal line corresponds to a perfect match.

### Conclusions

Exploration of new electrolyte solvents is key for next-generation batteries.
To this end, data science-driven techniques combined with computational chemistry are an emerging and powerful tool. In the present study, the coordination energy (Ecoord) of alkali metal ions to battery electrolyte solvents was calculated by DFT for Li, Na, K, Rb, and Cs ions and 70 solvents. The calculated Ecoord was then used as the target property in regressions using the MLR, LASSO, and ES-LiR methods. This enables the prediction of Ecoord for various ion species using only the properties of the ion and the solvent.

Ecoord values calculated with DFT using M06-2X show that the strength of the ion–solvent interaction follows the order Li > Na > K ~ Rb > Cs, with mean Ecoord values of −2.20, −1.60, −1.20, −1.11, and −0.98 eV. We then constructed regression models to predict Ecoord from ion and solvent descriptors (melting point, flashing point, HOMO energy, LUMO energy, NBO atomic charge, total energy, total dipole moment, and metal ionic radius). We found that the ES-LiR gives the best accuracy for Ecoord among the three linear methods, with a cross-validation error of 0.127 eV. Even higher accuracy (0.016 eV) can be obtained with ES-GP. This suggests that accurate prediction of Ecoord is possible even if solvent descriptors and ion descriptors are independently formed. The ionic radius is the most important descriptor since it has the largest coefficient in the regression formula. Other descriptors, such as the NBO charge on the solvent O atom or the total dipole, are also important. This result can be easily understood, as the ion–solvent interaction is mainly electrostatic in nature. The weight diagram from ES-LiR revealed that the importance of the ionic radius and the O atom NBO charge as descriptors is stable over many regression formulae.

This study has shown that the combined use of computational chemistry and data-driven science can be an efficient and accurate tool for coordination energy prediction. We succeeded in showing that this approach is applicable to any alkali metal ion coordination.
The constructed regression models areaccurate enough for practical use in the search for batteryelectrolytes. These features will be important in developingpost-Li next-generation batteries.Conflicts of interestThere are no conflicts to declare.AcknowledgementsThis work was supported in part by JST ‘‘Materials research byInformation Integration’’ Initiative (MI2I) project, by JSPSKAKENHI Grant Number JP15H05701, and by MEXT as ‘‘PriorityIssue (No. 5) on Post K Computer’’. The calculations werecarried out at the supercomputer center of NIMS. The work alsoused computational resources of the K computer at the RIKENthrough the HPCI System Research Projects (project IDs:hp160174, hp170198, and hp180134).Notes and references1 K. Xu, Nonaqueous Liquid Electrolytes for Lithium-BasedRechargeable Batteries, Chem. Rev., 2004, 104(10), 4303–4418.2 M. D. Bhatt and C. O’Dwyer, Recent Progress in Theoreticaland Computational Investigations of Li-ion Battery Materi-als and Electrolytes, Phys. Chem. Chem. Phys., 2015, 17(7),4799–4844.3 N. Yabuuchi, K. Kubota, M. Dahbi and S. Komaba, ResearchDevelopment on Sodium-Ion Batteries, Chem. Rev., 2014,114(23), 11636–11682.4 A. Eftekhari, Z. Jian and X. Ji, Potassium Secondary Batteries,ACS Appl. Mater. Interfaces, 2017, 9(5), 4404–4419.5 Y. Yamada, F. Sagane, Y. Iriyama, T. Abe and Z. Ogumi,Kinetics of Lithium-Ion Transfer at the Interface betweenLi0.35La0.55TiO3 and Binary Electrolytes, J. Phys. Chem. C,2009, 113(32), 14528–14532.6 T. Abe, F. Sagane, M. Ohtsuka, Y. Iriyama and Z. Ogumi,Lithium-Ion Transfer at the Interface Between Lithium-IonConductive Ceramic Electrolyte and Liquid Electrolyte-A Keyto Enhancing the Rate Capability of Lithium-Ion Batteries,J. Electrochem. Soc., 2005, 152(11), A2151–A2154.7 M. Okoshi, Y. Yamada, A. Yamada and H. Nakai, TheoreticalAnalysis on De-Solvation of Lithium, Sodium, and MagnesiumCations to Organic Electrolyte Solvents, J. Electrochem. Soc., 2013,160(11), A2160–A2165.8 M. Okoshi, Y. Yamada, S. 
Komaba, A. Yamada and H. Nakai,Theoretical Analysis of Interactions between Potassium Ionsand Organic Electrolyte Solvents: A Comparison withLithium, Sodium, and Magnesium Ions, J. Electrochem.Soc., 2017, 164(2), A54–A60.9 R. Ramprasad, R. Batra, G. Pilania, A. Mannodi-Kanakkithodiand C. Kim, Machine Learning in Materials Informatics: RecentApplications and Prospects. npj Computational, Materials, 2017,3(1), 54.10 L. Cheng, R. S. Assary, X. Qu, A. Jain, S. P. Ong, N. N. Rajput,K. Persson and L. A. Curtiss, Accelerating Electrolyte Dis-covery for Energy Storage with High-Throughput Screening,J. Phys. Chem. Lett., 2015, 6(2), 283–291.11 M. D. Halls and K. Tasaki, High-Throughput QuantumChemistry and Virtual Screening for Lithium Ion BatteryElectrolyte Additives, J. Power Sources, 2010, 195(5), 1472–1478.12 G. Hautier, A. Jain and S. P. Ong, From the Computer to theLaboratory: Materials Discovery and Design using First-Principles Calculations, J. Mater. Sci., 2012, 47(21), 7317–7340.13 S. Curtarolo, G. L. W. Hart, M. B. Nardelli, N. Mingo,S. Sanvito and O. Levy, The High-Throughput Highway toComputational Materials Design, Nat. Mater., 2013, 12, 191.Paper PCCPOpen Access Article. Published on 18 November 2019. Downloaded on 9/24/2021 7:06:48 AM.  This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence.View Article Onlinehttp://creativecommons.org/licenses/by-nc/3.0/http://creativecommons.org/licenses/by-nc/3.0/https://doi.org/10.1039/c9cp03679bThis journal is© the Owner Societies 2019 Phys. Chem. Chem. Phys., 2019, 21, 26399--26405 | 2640514 M. Korth, Large-Scale Virtual High-Throughput Screeningfor the Identification of New Battery Electrolyte Solvents:Evaluation of Electronic Structure Theory Methods, Phys.Chem. Chem. Phys., 2014, 16(17), 7919–7926.15 T. Husch, N. D. Yilmazer, A. Balducci and M. 
Korth, Large-Scale Virtual High-Throughput Screening for the Identification of New Battery Electrolyte Solvents: Computing Infrastructure and Collective Properties, Phys. Chem. Chem. Phys., 2015, 17(5), 3394–3401.
16. Z. Ahmad, T. Xie, C. Maheshwari, J. C. Grossman and V. Viswanathan, Machine Learning Enabled Computational Screening of Inorganic Solid Electrolytes for Suppression of Dendrite Formation in Lithium Metal Anodes, ACS Cent. Sci., 2018, 4(8), 996–1006.
17. A. D. Sendek, Q. Yang, E. D. Cubuk, K. A. N. Duerloo, Y. Cui and E. J. Reed, Holistic Computational Structure Screening of More Than 12 000 Candidates for Solid Lithium-Ion Conductor Materials, Energy Environ. Sci., 2017, 10(1), 306–320.
18. X. Chen, X. Shen, B. Li, H.-J. Peng, X.-B. Cheng, B.-Q. Li, X.-Q. Zhang, J.-Q. Huang and Q. Zhang, Ion–Solvent Complexes Promote Gas Evolution from Electrolytes on a Sodium Metal Anode, Angew. Chem., Int. Ed., 2018, 57(3), 734–737.
19. X. Chen, H.-R. Li, X. Shen and Q. Zhang, The Origin of the Reduced Reductive Stability of Ion–Solvent Complexes on Alkali and Alkaline Earth Metal Anodes, Angew. Chem., Int. Ed., 2018, 57(51), 16643–16647.
20. R. Tibshirani, Regression Shrinkage and Selection via the Lasso, J. Roy. Stat. Soc. B, 1996, 58(1), 267–288.
21. K. Sodeyama, Y. Igarashi, T. Nakayama, Y. Tateyama and M. Okada, Liquid Electrolyte Informatics using an Exhaustive Search with Linear Regression, Phys. Chem. Chem. Phys., 2018, 20(35), 22585–22591.
22. Y. Igarashi, K. Nagata, T. Kuwatani, T. Omori, Y. Nakanishi-Ohno and M. Okada, Three Levels of Data-Driven Science, in International Meeting on High-Dimensional Data-Driven Science, ed. T. Obuchi, T. Kasai, M. J. Miyama, M. Ohzeki and M. Uemura, 2016, vol. 699.
23. Y. Igarashi, H. Takenaka, Y. Nakanishi-Ohno, M. Uemura, S. Ikeda and M. Okada, Exhaustive Search for Sparse Variable Selection in Linear Regression, J. Phys. Soc. Jpn., 2018, 87, 4.
24. T. Nakayama, Y. Igarashi, K. Sodeyama and M.
Okada, Material Search for Li-ion Battery Electrolytes Through an Exhaustive Search with a Gaussian Process, Chem. Phys. Lett., 2019, 731, 136622.
25. C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning, The MIT Press, Cambridge, MA, 2005.
26. KISHIDA CHEMICAL Co., Ltd., KISHIDA Product Information, 2016.
27. N. Mardirossian and M. Head-Gordon, How Accurate Are the Minnesota Density Functionals for Noncovalent Interactions, Isomerization Energies, Thermochemistry, and Barrier Heights Involving Molecules Composed of Main-Group Elements?, J. Chem. Theory Comput., 2016, 12(9), 4303–4325.
28. Y. Zhao and D. G. Truhlar, The M06 Suite of Density Functionals for Main Group Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States, and Transition Elements: Two New Functionals and Systematic Testing of Four M06-Class Functionals and 12 Other Functionals, Theor. Chem. Acc., 2008, 120(1–3), 215–241.
29. F. Weigend and R. Ahlrichs, Balanced Basis Sets of Split Valence, Triple Zeta Valence and Quadruple Zeta Valence Quality for H to Rn: Design and Assessment of Accuracy, Phys. Chem. Chem. Phys., 2005, 7(18), 3297–3305.
30. A. E. Reed, R. B. Weinstock and F. Weinhold, Natural-Population Analysis, J. Chem. Phys., 1985, 83(2), 735–746.
31. M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, G. A. Petersson, H. Nakatsuji, X. Li, M. Caricato, A. V. Marenich, J. Bloino, B. G. Janesko, R. Gomperts, B. Mennucci, H. P. Hratchian, J. V. Ortiz, A. F. Izmaylov, J. L. Sonnenberg, D. Williams-Young, F. Ding, F. Lipparini, F. Egidi, J. Goings, B. Peng, A. Petrone, T. Henderson, D. Ranasinghe, V. G. Zakrzewski, J. Gao, N. Rega, G. Zheng, W. Liang, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, K. Throssell, J. A. Montgomery, J. E. Peralta, F. Ogliaro, M. Bearpark, J. J. Heyd, E. Brothers, K. N. Kudin, V. N. Staroverov, T. A. Keith, R. Kobayashi, J.
Normand, K. Raghavachari, A. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, J. M. Millam, M. Klene, C. Adamo, R. Cammi, J. W. Ochterski, R. L. Martin, K. Morokuma, Ö. Farkas, J. B. Foresman and D. J. Fox, Gaussian 16, Revision A.03, Gaussian Inc., Wallingford, CT, 2016.
32. In our calculation, the descriptors are not fully independent because the solvent descriptors are the same for all five cations. However, the Ecoord values were calculated individually for all solvent–cation pairs (i.e. 350 systems). Therefore, the whole dataset (descriptors and target values) is independent and identically distributed.

Open Access Article. Published on 18 November 2019. This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence (http://creativecommons.org/licenses/by-nc/3.0/). https://doi.org/10.1039/c9cp03679b