# Fileset

[d2ma00881e.pdf](https://mdr.nims.go.jp/filesets/fcaeedc2-0180-470c-af42-ec87e7356505/download)

## Creator

[Yukinori Koyama](https://orcid.org/0000-0002-7090-4430), Hidekazu Ikeno, Masamichi Harada, Shiro Funahashi, [Takashi Takeda](https://orcid.org/0000-0003-2510-4562), Naoto Hirosaki

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Rapid discovery of new Eu2+-activated phosphors with a designed luminescence color using a data-driven approach](https://mdr.nims.go.jp/datasets/135a07a6-821d-4480-8f76-44d867b2e074)

## Fulltext

Rapid discovery of new Eu2+-activated phosphors with a designed luminescence color using a data-driven approach†

Yukinori Koyama* (a), Hidekazu Ikeno (b), Masamichi Harada (c), Shiro Funahashi (c), Takashi Takeda (c) and Naoto Hirosaki (c)

(a) Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science, Tsukuba, Ibaraki, 305-0044, Japan. E-mail: KOYAMA.Yukinori@nims.go.jp

(b) Department of Materials Science, Graduate School of Engineering, Osaka Metropolitan University, Sakai, Osaka, 599-8570, Japan

(c) Research Center for Functional Materials, National Institute for Materials Science, Tsukuba, Ibaraki, 305-0044, Japan

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2ma00881e

Materials Advances, Paper. Mater. Adv., 2023, 4, 231–239. Cite this: Mater. Adv., 2023, 4, 231. Received 1st September 2022, Accepted 10th November 2022. DOI: 10.1039/d2ma00881e. rsc.li/materials-advances

© 2023 The Author(s). Published by the Royal Society of Chemistry. Open Access Article. Published on 29 November 2022. Downloaded on 1/6/2023 2:11:53 AM. This article is licensed under a Creative Commons Attribution 3.0 Unported Licence (http://creativecommons.org/licenses/by/3.0/).

ORCID: https://orcid.org/0000-0002-7090-4430, https://orcid.org/0000-0002-3840-4049, https://orcid.org/0000-0003-2510-4562

For rapid and efficient development of new phosphors, a suitable method that proposes promising candidates is expected to focus time-consuming trial-and-error experiments. A data-driven approach to discover new phosphor materials with a designed luminescence color is demonstrated in this paper. To screen compounds for a desirable luminescence color, a machine learning model has been developed for predicting emission peak wavelengths from a dataset composed of 129 Eu2+-activated phosphors. General-purpose compositional and structural features are used to represent host compounds of phosphors. Bootstrap aggregation with the gradient boosted regression trees method is adopted to obtain high predictive performance and to avoid overfitting. The predictive performance of the machine learning model is estimated to be 25 nm of mean absolute error (MAE) and 33 nm of root mean squared error (RMSE) by 10-fold cross validation. To discover new green-emitting Eu2+-activated phosphors, twenty candidate compounds have been selected to have predicted emission peak wavelengths of about 500–550 nm from a materials database, and the candidates have been synthesized and characterized by experiments. Three new Eu2+-activated phosphors, Li2Ca4Si4O13:Eu2+, Na2Ca2Si2O7:Eu2+, and SrLaGaO4:Eu2+, successfully show green or blue-green emissions as designed.

### Introduction

Phosphor-converted white light-emitting-diodes (pc-wLEDs), which are composed of blue or near-ultraviolet LED chips as a primary light source, and phosphors as down-conversion luminescent materials, are one of the indispensable lighting technologies today because of their high luminous efficiency, cost effectiveness, environment-friendliness, and spectral design flexibility [1]. For pc-wLED applications, phosphors have various requirements such as strong absorption of the LED light, suitable emission spectrum, high quantum efficiency, small thermal quenching/degradation, high chemical stability, and small luminance saturation. Ce3+ and Eu2+ ions are often selected as activators of the phosphors for the pc-wLEDs. These lanthanide ions utilize parity allowed 4f–5d transitions, which are often characterized by high radiative emission probability, short lifetime, and relatively broad absorption and emission spectra in contrast to parity forbidden 4f–4f transitions [2]. Furthermore, because their 5d-states are strongly influenced by the host lattices, their luminescence properties can be tuned by variation of the hosts. However, it requires time-intensive trial-and-error experiments to explore and optimize new phosphors. Even though several strategies have been proposed for efficient development of new phosphors [3,4], an effective method to select candidate compounds for desirable properties is expected to focus the time-intensive experiments upon promising candidates.

Recently, data-driven approaches have been reported for the rapid discovery and development of new phosphors using screening of materials databases, high-throughput density functional theory (DFT) calculations, and machine learning on luminescence properties [5–10]. The emission spectrum is one of the most important characteristics of phosphors because it determines their luminescence color. The emission spectrum is often characterized by its peak top and full width at half maximum (FWHM). A relationship among host compounds, the absorption spectrum, and the emission spectrum has been investigated empirically or semi-empirically for Ce3+ and Eu2+-activated phosphors so far [11]. Ab initio multi-configurational quantum chemical calculations have been performed to quantitatively calculate configuration coordinate diagrams and absorption spectra [12]. Constrained DFT calculations have also been conducted to evaluate absorption and emission energies [13,14]. However, these theoretical methods require time-consuming calculations at both the ground and excited states. Because of the high computational cost, high-throughput theoretical calculations to screen candidate compounds are not currently feasible.

Machine learning to predict emission spectra instead of the theoretical calculations has been investigated recently [6–8]. Sohn and his coworkers reported the pioneering machine-learning study on a relationship among emission peak wavelength, FWHM, and local environments of substitution sites in host lattices [5], and recently reported comprehensive machine learning to predict band gap, excitation energy, and emission energy for Eu2+-activated phosphors [6]. Nakano et al.
reported machine learning to predict emission peak energy from chemical compositions of the host compounds for Eu2+-activated phosphors [7]. The reported prediction accuracy is not directly comparable among the theoretical calculations and the machine learning studies because they used different datasets. But the results suggest that the machine learning models [6,7] have comparable prediction accuracy to the DFT calculations [14].

Based on the successful machine-learning studies to date, it is expected that new phosphors with desirable luminescence properties will be developed using machine learning. Although several research groups have reported new phosphors by data-driven approaches [9], discovery of new phosphors with a designed luminescence color is still a big challenge. In this paper, we report the discovery of three new green or blue-green emitting phosphors, which a machine-learning model has proposed as green emitting phosphors. First, we developed a machine learning model to predict the emission peak wavelengths of Eu2+-activated phosphors from an in-house phosphor dataset. Next, we explored a materials database and collected candidate host compounds predicted to show green emissions by the machine learning model. Then, we synthesized and characterized the candidates, and finally discovered the three new Eu2+-activated phosphors, Li2Ca4Si4O13:Eu2+, Na2Ca2Si2O7:Eu2+, and SrLaGaO4:Eu2+. The results clearly demonstrate the power of the machine learning on the emission peak wavelength for rapid and efficient development of new phosphors with a designed luminescence color.

### Methods

#### Data collection

Even though phosphors have been intensively investigated so far, there is no readily available dataset of phosphor materials and luminescence properties. Therefore, a dataset of host compounds and emission peak wavelengths of Eu2+-activated phosphors was collected from the literature [1,15]. Only host compounds with typical oxidation states and containing Ca, Sr, or Ba elements were selected.
These alkaline earth metals are considered as substitution sites for Eu2+ ions because they have the same valence and close ionic radii to Eu2+. Crystal structures of the hosts were collected from the inorganic crystal structure database (ICSD) [16] and AtomWork-Adv [17]. Some structure data were modified as follows. (1) Structure data with chemical compositions that deviate from the ideal compositions of the hosts, for example containing Eu2+, were corrected to have the ideal compositions of the hosts. (2) Structure data with partially occupied sites and different site occupancies were modified to have high occupancy sites only. Partially occupied sites cause ambiguity in the representation of local environments of the substitution sites. Host compounds with awkward site occupancy, which cannot be simply discretized as described above, were dropped.

Emission peak wavelength is used as a target variable in this study because the emission spectra of phosphors are usually measured and reported in wavelength. The emission peak wavelengths depend on the concentrations of activators and other factors. The conditions in the literature are inconsistent, and the reported values vary more or less. If multiple emission peak wavelengths are reported for a single phosphor material and the reported values differ by more than 30 nm, the phosphor is eliminated. In our opinion, a deviation of 10 nm or more in the emission peak wavelength is conceivable due to the different conditions.

Finally, a dataset composed of 129 Eu2+-activated phosphors was prepared. The distribution and statistics of the emission peak wavelengths are respectively shown in Fig. 1a and Table 1. Constituent elements of the host compounds are summarized in Fig. 1b. Among the constituent elements, sulfur appeared as both a cation (S6+) and an anion (S2−). N, O, F, Cl, Br, and I elements were anions, and the other elements were cations.

#### Host representation

Two sets of features were used to represent host compounds of Eu2+-activated phosphors.
The first set is a representation of chemical compositions (compositional features, hereafter), and the second set is a representation of crystal structures, particularly local environments of substitution sites for Eu2+ activators, from both geometrical and chemical aspects (structural features, hereafter).

As the compositional features, general-purpose features [18] were adopted. The general-purpose features were a set of statistics of elemental features to represent various aspects of chemical compositions. Nakano et al. used the same scheme for their machine learning [7]. In this study, 22 elemental features and seven statistics, namely, weighted arithmetic mean, weighted geometric mean, weighted harmonic mean, weighted standard deviation, minimum, maximum, and range, were used. The elemental features and the statistics are respectively listed in Tables S1 and S2 in the ESI.† In addition to the elemental features, oxidation states were considered. As oxidation states are both positive and negative values and satisfy charge neutrality, the weighted arithmetic, geometric, and harmonic means were excluded. Instead, the seven statistics of absolute oxidation states were additionally included. As the hosts in this study are all ionic compounds, the statistics of the elemental features and the absolute oxidation states were also evaluated for each of the cations only and the anions only. The compositional features consisted of 487 features.

To represent the local environments of the substitution sites, Park et al.
used geometrical and elemental features of activator-anion and activator-cation polyhedra [6]. This idea was generalized, inspired by the general-purpose compositional features. The structural features used in this study consisted of three groups of features. The first group was a geometrical aspect of the substitution sites. The numbers of neighboring anions and cations, average distances to their neighboring anions and cations, distortion index [19], and bond valence sum [20] were evaluated for individual Ca, Sr, and Ba sites. The neighboring anions were determined using the CrystalNN method [21]. The neighboring cations were determined so that they shared neighboring anions with the substitution sites. As some of the host compounds used in this study have multiple substitution sites, the average and standard deviation of each feature among the substitution sites were evaluated and used as features of the host structures. The number of symmetrically inequivalent substitution sites was also included. The second group was analogous to the compositional features but specialized for the local environments of the substitution sites. The seven statistics of the 22 elemental features and the absolute oxidation states were calculated for the neighboring anions and the neighboring cations of individual Ca, Sr, and Ba sites. The average and standard deviation among the substitution sites were used as the features of the hosts. Besides the features of the substitution sites, density and numerical density were added as the third group. The structural features consisted of 659 features.

The features were evaluated using the Pymatgen package [22] and a customized version of the XenonPy package [23].

#### Machine learning

The general-purpose features used in this study were systematically calculated to represent various aspects of the host compounds, and thus a part of them were redundant and irrelevant to the emission peak wavelength. Therefore, feature selection was adopted before regression.
First, features with low variance were dropped, and the passed features were standardized so that the means were zero and standard deviations were one. After the standardization, the features were roughly selected in the order of mutual information with the emission peak wavelength. The features were further narrowed down using recursive feature elimination (RFE) based on the importance of each feature obtained by a regression model. Finally, regression was conducted. The ridge, automatic relevance determination (ARD), random forest (RF), gradient boosted regression trees (GB), and bootstrap aggregation (bagging) of GB methods were applied for the regression. The regression method used in RFE was the same as the final regression, except for the bagging of GB regression. For the bagging of GB regression, a single GB model was used in RFE to reduce computation time.

Table 1 Statistics of emission peak wavelengths of Eu2+-activated phosphors used in this study

| Statistic | Value |
| --- | --- |
| Count | 129 |
| Mean (nm) | 495 |
| Median (nm) | 478 |
| Minimum (nm) | 368 |
| Maximum (nm) | 681 |
| Standard deviation (nm) | 80 |
| Mean absolute deviation (nm) | 66 |

Fig. 1 (a) Histogram of emission peak wavelengths and (b) frequency of constituent elements of Eu2+-activated phosphors used in this study. S is a cation (S6+) and an anion (S2−). N, O, F, Cl, Br, and I are anions. The other elements are cations.
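The feature-selection and regression steps described above can be sketched with scikit-learn, which the paper cites for its machine learning. The block below is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, and the variance threshold, the numbers of retained features, and all model parameters are placeholder values (the optimized settings are reported in Table S3 of the ESI, which is not reproduced here). Following the text, a single GB model drives the RFE step while the final regressor is a bagging of GB models.

```python
from functools import partial

import numpy as np
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.feature_selection import (
    RFE,
    SelectKBest,
    VarianceThreshold,
    mutual_info_regression,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for host features and emission peak wavelengths (nm).
rng = np.random.default_rng(0)
n_samples, n_features = 60, 40
X = rng.normal(size=(n_samples, n_features))
X[:, 0] = 1.0  # constant column, removed by the variance filter
y = 500.0 + 40.0 * X[:, 1] - 30.0 * X[:, 2] ** 2 + rng.normal(scale=5.0, size=n_samples)

# Placeholder sizes chosen for illustration; not the values used in the paper.
N_AFTER_MUTUAL_INFO = 20
N_AFTER_RFE = 8

pipeline = Pipeline(
    steps=[
        # 1. Drop low-variance features (here: zero variance only).
        ("variance", VarianceThreshold(threshold=0.0)),
        # 2. Standardize to zero mean and unit standard deviation.
        ("scale", StandardScaler()),
        # 3. Rough selection ranked by mutual information with the target.
        (
            "mutual_info",
            SelectKBest(
                partial(mutual_info_regression, random_state=0),
                k=N_AFTER_MUTUAL_INFO,
            ),
        ),
        # 4. Recursive feature elimination driven by a single GB model.
        (
            "rfe",
            RFE(
                GradientBoostingRegressor(n_estimators=50, random_state=0),
                n_features_to_select=N_AFTER_RFE,
                step=2,
            ),
        ),
        # 5. Final regression: bagging of GB models.
        (
            "regressor",
            BaggingRegressor(
                GradientBoostingRegressor(n_estimators=50, random_state=0),
                n_estimators=5,
                random_state=0,
            ),
        ),
    ]
)

pipeline.fit(X, y)
predictions = pipeline.predict(X)
n_selected = pipeline.named_steps["rfe"].n_features_
print(f"features kept after RFE: {n_selected}")
```

Wrapping every step in one `Pipeline` means that, inside cross validation, the selection steps are refitted on each training fold only, so no information from the validation fold reaches the feature selection.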
The Scikit-learn package [24] was used for the machine learning.

The predictive performance of the machine learning models was evaluated by 10-fold cross validation by means of the mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination (R²). The scores were averaged among the folds. The parameters of the regression models and the numbers of selected features were selected to minimize the average RMSE for the validation data. The parameter search was performed in a manner of Bayesian optimization using the Hyperopt [25] and scikit-optimize [26] packages with 1000 iterations for each method. Default parameters were used for the regression models used in RFE to reduce the computation time for the parameter search. The pipelines of the machine-learning models and the optimized parameters are summarized in Table S3 in the ESI.†

#### Experiments

Candidates of Eu2+-activated phosphors proposed by a machine learning model were synthesized and characterized by experiments. The phosphors were synthesized by a solid-state method. The starting materials (oxides or carbonates) of the host compounds were mixed with Eu2O3. The amount of Eu element was fixed at 2 at% of the substitution sites, namely, Ca, Sr, and Ba, in the hosts. The starting materials were fired in air, and then fired in a reducing atmosphere (in a carbon heater furnace filled with nitrogen). The firing temperatures and time were altered depending on the host compounds.

The products were first characterized using a powder X-ray diffractometer (XRD) (Bruker, D8 ADVANCE, Cu Kα radiation) and a spectrofluorometer (JASCO, FP-8600). The powder XRD analysis indicated that some products were mixtures of the target compounds and impurity phases. As the photoluminescence (PL) spectra of the powder samples are largely influenced by impurity phases with bright luminescence, it was not clear whether the PL spectra of the mixture products were derived from the target compounds or the impurity phases.
Therefore, after the first screening using the powder samples, well-crystallized particles were picked up from the products and characterized by single crystal XRD and microspectroscopy in a manner of the single-particle diagnosis approach [4]. The single crystal XRD data of the picked particles were collected using a diffractometer (Bruker-AXS, SMART APEX II Ultra) with Mo Kα radiation. The data were integrated and corrected for absorption using SADABS. The crystal structures were solved and refined with SHELX. The PL spectra of the particles were obtained using a spectrometer (Otsuka electronics, MCPD7700) through a microscope (Olympus, BX51M) under 365 nm LED excitation.

### Results and discussion

#### Comparison of regression methods

Regression methods are compared in this section. MAE, RMSE, and R² for the training and validation data in the cross validation are summarized in Table 2. Fig. 2 illustrates predicted emission peak wavelengths with respect to the reported values in the cross validation. The ridge regression is the baseline model in this study. The R² of the ridge regression to the validation data, 0.74, suggests that the prediction accuracy was comparable to the previous studies [6,7], although the results are not directly comparable due to the use of the different datasets.

To improve the predictive performance, other regression methods were applied. The ARD regression is a Bayesian linear model with an intrinsic feature selection capability, and this method resulted in a slightly higher prediction accuracy to the validation data compared with the ridge regression. The ridge and ARD models showed relatively large fitting errors to the training data. This indicates that the relationship between the general-purpose features used in this study and the emission peak wavelength is basically nonlinear, although the general-purpose features are numerous and diverse.
The small differences in the predictive performance scores between the training and validation data of these linear models imply that the obtained predictive performance almost reached the optimum for linear models.

Table 2 Mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination (R²) of the machine learning models for the training and validation data in the cross validation. The scores were averaged among the folds of the cross validation. Standard deviations among the folds are shown in parentheses

| Regression method | MAE (nm), training | MAE (nm), validation | RMSE (nm), training | RMSE (nm), validation | R², training | R², validation |
| --- | --- | --- | --- | --- | --- | --- |
| Ridge | 24 (1) | 29 (7) | 31 (1) | 36 (9) | 0.85 (0.01) | 0.74 (0.13) |
| ARD | 25 (1) | 28 (6) | 31 (1) | 34 (9) | 0.84 (0.02) | 0.77 (0.12) |
| RF | 10 (0) | 27 (7) | 14 (1) | 35 (11) | 0.97 (0.00) | 0.75 (0.19) |
| GB | 0 (0) | 24 (9) | 0 (0) | 31 (12) | 1.00 (0.00) | 0.79 (0.18) |
| Bagging of GB | 10 (0) | 25 (7) | 14 (1) | 33 (10) | 0.97 (0.00) | 0.77 (0.17) |

Nonlinear regression methods were applied to obtain a higher predictive performance. The RF model showed slightly smaller MAE but larger RMSE to the validation data than the ARD model. The GB model showed much smaller MAE and RMSE to the validation data than the ARD and RF models. However, the fitting errors of the GB model to the training data were almost zero, and overfitting was a concern. To dispel the concerns about the overfitting of the GB model, the bagging technique was applied to the GB regression. The bagging technique is also used in the RF regression and is expected to suppress the overfitting.
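The gap between training and validation scores that motivates bagging here can be inspected with a cross-validation loop that also records the training scores. The sketch below assumes scikit-learn and synthetic data; the model settings are illustrative and are not the optimized parameters of the paper, so the printed numbers are not expected to match Table 2.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_validate

# Synthetic, noisy, nonlinear data standing in for the phosphor dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))
y = (
    500.0
    + 40.0 * X[:, 0]
    + 25.0 * np.sin(3.0 * X[:, 1])
    + rng.normal(scale=15.0, size=120)
)

models = {
    "GB": GradientBoostingRegressor(n_estimators=100, random_state=0),
    "Bagging of GB": BaggingRegressor(
        GradientBoostingRegressor(n_estimators=100, random_state=0),
        n_estimators=5,
        random_state=0,
    ),
}

# 10-fold cross validation scored by MAE, RMSE, and R2, as in the text.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scoring = {
    "mae": "neg_mean_absolute_error",
    "rmse": "neg_root_mean_squared_error",
    "r2": "r2",
}

results = {}
for name, model in models.items():
    scores = cross_validate(
        model, X, y, cv=cv, scoring=scoring, return_train_score=True
    )
    # Error scorers are negated by scikit-learn; flip the sign back.
    results[name] = {
        "train_mae": -scores["train_mae"].mean(),
        "val_mae": -scores["test_mae"].mean(),
        "train_rmse": -scores["train_rmse"].mean(),
        "val_rmse": -scores["test_rmse"].mean(),
        "train_r2": scores["train_r2"].mean(),
        "val_r2": scores["test_r2"].mean(),
    }
    print(name, {key: round(value, 2) for key, value in results[name].items()})
```

A training error far below the validation error is the pattern the text reads as a sign of overfitting. Each bagged GB model sees only a bootstrap sample of the training fold, so the ensemble is generally not expected to reproduce the training data exactly.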
The bagging of the GB model showed intermediate predictive performance to the validation data between the GB and RF models. The better predictive performance of the bagging of the GB model compared with the RF model is probably due to the higher predictive capacity of the GB regression as a base learner compared with that of the regression trees in the RF model.

The RF, GB, and bagging of GB models showed large prediction errors for some specific compounds in the validation folds. A plausible cause of these large prediction errors is that the phosphor dataset used in this study is not sufficiently large with respect to the diverse phosphor materials. If a host compound is unique in the dataset and is put in the validation data in a fold of the cross validation, the training data does not contain compounds like the unique host, resulting in a large prediction error. Another possible cause of the large prediction errors is the quality of the reported emission peak wavelengths. Some phosphor materials have a deviation of tens of nm or more in the reported emission peak wavelengths. Phosphors with large deviations have been eliminated from the dataset as mentioned in the Methods section, but the data might not be fully curated yet. Further investigation of the large prediction errors is beyond the scope of this study, whereas obtaining a high-quality dataset that covers diverse materials is a big issue in the data-driven materials research.

Emission peak wavelength is used as the target variable in this study, while the energy of the emission peak was used as the target variable in the previous studies [6,7]. Note that in principle, correction of intensity is required to convert an emission spectrum from the wavelength to energy and vice versa, and its peak top shifts. For comparison with previous studies, the emission peak wavelengths were simply converted into energy without such intensity correction, and regression on the converted energy was conducted. The bagging of the GB method was used.
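The simple peak-position conversion mentioned above follows E = hc/λ. The sketch below is an illustration only: the function names and the constant name are hypothetical, it converts peak positions without any intensity correction of the spectrum, and the first-order error estimate is an added convenience rather than a step described in the paper. The example wavelengths are the minimum, mean, and maximum from Table 1.

```python
# Product of the Planck constant and the speed of light, in eV nm.
HC_EV_NM = 1239.84198


def wavelength_nm_to_energy_ev(wavelength_nm: float) -> float:
    """Convert a peak position from wavelength (nm) to photon energy (eV)."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


def energy_error_ev(wavelength_nm: float, wavelength_error_nm: float) -> float:
    """First-order estimate |dE| = hc * |d(lambda)| / lambda**2, in eV."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM * abs(wavelength_error_nm) / wavelength_nm**2


# Minimum, mean, and maximum emission peak wavelengths from Table 1.
for label, nm in [("minimum", 368.0), ("mean", 495.0), ("maximum", 681.0)]:
    print(f"{label}: {nm:.0f} nm -> {wavelength_nm_to_energy_ev(nm):.3f} eV")

# Energy equivalent of a 25 nm wavelength error near the dataset mean.
print(f"25 nm at 495 nm -> {energy_error_ev(495.0, 25.0):.3f} eV")
```

Because the energy scale is inversely proportional to wavelength, the same wavelength error corresponds to a larger energy error at the blue end of the dataset than at the red end.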
The prediction accuracy and the plot of the predicted values with respect to the reported ones are shown in Table S4 and Fig. S1 in the ESI.† The present results (0.13 eV MAE, 0.16 eV RMSE) are slightly smaller (better) than those in ref. 7 (0.139 eV MAE, 0.183 eV RMSE), and slightly larger (worse) than ref. 6 (0.020 eV² MSE corresponding to 0.14 eV RMSE). Only the features derived from the chemical composition were used in ref. 7, whereas features derived from the structure were also considered in ref. 6 and in this study. This would have resulted in the slightly poorer predictive performance in ref. 7. In ref. 6, the data were restricted to phosphors with only a single substitution site and to examples of the critical activator concentrations corresponding to concentrations showing the highest PL intensity. In contrast, some phosphors in the present dataset had multiple substitution sites and the activator concentrations depended on the literature. The restriction in ref. 6 might have suppressed the data variability and reduced the RMSE, but it also limited the coverage of the machine learning model.

Fig. 2 Predicted emission peak wavelengths with respect to reported values for the training (blue) and validation (red) data in the cross validation using (a) ridge, (b) automatic relevance determination (ARD), (c) random forest (RF), (d) gradient boosted regression trees (GB), and (e) bagging of GB methods.

#### Test with additional literature data

To develop new phosphor materials, the AtomWork-Adv materials database was explored and candidate host compounds of oxides, nitrides, and oxynitrides composed of main elements and containing Ca, Sr, or Ba elements were collected. Emission peak wavelengths of the collected compounds were predicted using the bagging of GB model that was rebuilt using the whole phosphor dataset with the optimized parameters. Compounds with predicted wavelengths of about 500–550 nm were selected as candidates of green-emitting phosphors. Some of the collected compounds had already been reported as Eu2+-activated phosphors, while they were not in the phosphor dataset. Therefore, an additional test was performed on the machine learning model with additional 21 Eu2+-activated phosphors.

The predicted and reported emission peak wavelengths of the additional 21 phosphors are illustrated in Fig. 3, which are overlaid on the cross-validation results (Fig. 2e). MAE and RMSE to the test data were 33 nm and 42 nm, respectively. The distribution of the prediction errors looks comparable with that for the validation data in the cross validation, but the MAE and RMSE were much larger than the values estimated by the cross validation. The test data contained Sr2GeO4:Eu2+, which looked like an outlier. Sr2GeO4:Eu2+ showed the largest prediction error: 515 nm of the prediction versus 620 nm reported in ref. 27. This host compound contains Ge element, which was not in the phosphor dataset as shown in Fig. 1b. MAE and RMSE to the other 20 test data except Sr2GeO4:Eu2+ were respectively 30 nm and 37 nm, which were comparable to the results from the cross validation.
These suggest that it is essential to extend the phosphor dataset to cover the diverse phosphor materials for a higher predictive performance over a wide range of candidate compounds.

Fig. 3 Predicted emission peak wavelengths with respect to reported values for the test data of the additionally collected Eu2+-activated phosphors (green) using the bagging of the gradient boosted regression trees method. The plot is overlaid on the cross-validation results (Fig. 2e).

#### Exploration of new phosphor materials

As described in the previous section, oxides, nitrides, and oxynitrides composed of main elements and containing Ca, Sr, or Ba elements were collected from the AtomWork-Adv materials database to develop new phosphors. Twenty candidate compounds were selected by removing high-pressure phases and selecting compounds with predicted emission peak wavelengths of about 500–550 nm. As a result, the 20 candidates were all oxides. These candidates were synthesized and characterized by experiments. Compositions, space groups, predicted wavelengths, and experimental results are summarized in Table 3.

Table 3 Compositions and space groups of candidate compounds, predicted emission peak wavelengths, and summary of experimental results. Multiple lines for a single composition denote that the candidate composition has polytypes. The space groups and predictions for the polytypes of the synthesized products are underlined

| Index | Composition | Space group | Predicted wavelength (nm) | Experimental results |
| --- | --- | --- | --- | --- |
| 1 | Ba2MgGe2O7 | P-421m (113) | 501 | No luminescence |
| 2 | Ba2ZnGe2O7 | P-421m (113) | 500 | No luminescence |
| 3 | Ca2Ga2GeO7 | P-421m (113) | 513 | No luminescence |
| 4 | Ca2GeO4 | P63mc (186) | 518 | No luminescence |
| | | Pnma (62) | 517 | |
| 5 | Ca2ZnGe2O7 | P-421m (113) | 510 | Low-purity products |
| 6 | Ca3Al2Ge3O12 | Ia-3d (230) | 486 | Eu3+ luminescence |
| 7 | Ca5Ge3O11 | C2/m (12) | 517 | Eu3+ luminescence |
| | | P-1 (2) | 523 | |
| 8 | CaGa2O4 | Pna21 (33) | 512 | No luminescence |
| | | P21/c (14) | 515 | |
| 9 | K4BaSi3O9 | Ama2 (40) | 520 | Eu3+ luminescence |
| 10 | K4CaGe3O9 | Pa-3 (205) | 524 | Eu3+ luminescence |
| 11 | K4SrSi3O9 | Pa-3 (205) | 524 | Eu3+ luminescence |
| | | Ama2 (40) | 519 | |
| 12 | Li2Ca4Si4O13 | P-1 (2) | 529 | Eu2+ luminescence, 520 nm |
| 13 | Na2Ca2Si2O7 | C2/c (15) | 544 | Eu2+ luminescence, 527 nm |
| 14 | Na2SrSi2O6 | R-3m (166) | 519 | Eu3+ luminescence |
| 15 | Na4SrSi3O9 | C2 (5) | 527 | No luminescence |
| 16 | Sr2Al2GeO7 | P-421m (113) | 508 | Eu3+ luminescence |
| 17 | Sr2MgGe2O7 | P-421m (113) | 494 | No luminescence |
| 18 | Sr3Ga4O9 | P-1 (2) | 516 | No luminescence |
| 19 | SrGeO3 | C2/c (15) | 485 | No luminescence |
| | | P-1 (2) | 496 | |
| 20 | SrLaGaO4 | I4/mmm (139) | 548 | Eu2+ luminescence, 502 nm |

Some candidate compounds had polytypes. The prediction was done for all the polytypes.
The predicted wavelengths for all the polytypes are listed together in the table. Data for the polytypes of the synthesized products are underlined.

The target compounds were synthesized with a purity of 70 wt% or more estimated by the powder XRD analysis, except for Ca2ZnGe2O7. Ca2ZnGe2O7 was obtained only as low-purity products, and further characterization was not conducted on it. PL spectra were measured for the powder products of the remaining 19 candidates. No luminescence was observed from 9 products. Only sharp 4f–4f emission spectra derived from Eu3+ activators were observed from another 7 products. Finally, emission spectra from the Eu2+ activators were observed from three products, Li2Ca4Si4O13, Na2Ca2Si2O7, and SrLaGaO4. XRD patterns of the powder products, which are shown in Fig. S2 in the ESI,† indicated that the products were mixtures of the target compounds and impurity phases. Because the PL spectra of the powder samples are largely influenced by impurity phases with bright luminescence, it is essential to verify that the observed Eu2+ luminescence derives from the target compounds. Therefore, well-grown particles were picked up from the products of these three candidates and characterized in a manner of the single-particle diagnosis [4]. The single crystal XRD confirmed that the crystal structures of the picked particles were identical to Li2Ca4Si4O13 [28], Na2Ca2Si2O7 [29], and SrLaGaO4 [30], respectively. Crystallographic information by the single crystal XRD is listed in the ESI.† Fig. 4 displays photo images and emission spectra of the picked particles of these three phosphors under 365 nm LED excitation.

Li2Ca4Si4O13 was predicted to have an emission peak wavelength of 529 nm, and the peak was observed at 520 nm. Na2Ca2Si2O7 was predicted to have a peak wavelength of 544 nm, and the peak was observed at 527 nm.
These two phosphors show green emissions as designed, and the prediction errors were as small as 9 nm and 17 nm, respectively. In both cases, the FWHMs were very large, 140 nm or more. There are many possible substitution sites for Eu2+ in both structures. The luminescence properties depend on the substitution site, and the observed emission spectra are a superposition of those from the individual sites. The machine learning prediction worked well even for such complex structures. In Na2Ca2Si2O7, weak Eu3+ luminescence was also observed. Some substitution sites might be suitable for Eu3+ even in the synthesis under a reducing atmosphere.

The Eu-doped SrLaGaO4 particle showed both a blue-green emission derived from Eu2+ activators and a characteristic red emission from Eu3+ activators. As for the Eu2+ luminescence, the predicted emission peak wavelength was 548 nm, whereas the peak was observed at 502 nm with a FWHM of 83 nm. The prediction error of 46 nm was large, but would be acceptable given the prediction accuracy of the present machine learning model. As Sr and La atoms occupy the same crystallographic site in SrLaGaO4, Eu atoms might occupy this site in a mixed valence of Eu2+ and Eu3+, resulting in the simultaneous Eu2+ and Eu3+ luminescence.

Fig. 4 Photo images (upper panels) and emission spectra (lower panels) of particles of Eu-doped (a) Li2Ca4Si4O13, (b) Na2Ca2Si2O7, and (c) SrLaGaO4 under 365 nm LED excitation.

Three new Eu2+-activated phosphors were successfully discovered, but most of the candidates failed to show Eu2+ luminescence. The products were annealed in a reducing atmosphere to obtain Eu2+, but the reduction process
seemed insufficient for some compounds. The relative stability of Eu2+ and Eu3+ in the host lattices depends on the redox potential of the substituted Eu ions and on the annealing conditions, but the annealing conditions were restricted for each host compound to prevent melting or decomposition. Even if Eu2+ is stable in the hosts, the luminescence may be quenched if the energy levels of the Eu2+ excited states overlap with or lie close to the conduction bands of the hosts. These are likely the reasons why many candidates did not exhibit Eu2+ luminescence. At present, it is hard to predict the valence and energy levels of the substituted Eu ion in the host, or the synthesis conditions needed to obtain Eu2+, prior to synthesis. Predicting these factors is also important for the efficient development of new Eu2+-activated phosphors and is a future task.

### Conclusions

To rapidly discover new Eu2+-activated phosphors with a designed luminescence color, a machine learning model to predict the emission peak wavelength was developed from a phosphor dataset composed of 129 Eu2+-activated phosphors. General-purpose compositional and structural features were used to represent host compounds. The bagging technique with the gradient boosted regression trees method was adopted to obtain high predictive performance for the nonlinear relationship between the features and the emission peak wavelength, and to avoid overfitting with the small phosphor dataset.
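The combination of bagging, gradient boosted regression trees, and 10-fold cross validation described here can be sketched as follows. This is a minimal illustration that assumes scikit-learn (ref. 24) is available. The features and target are synthetic stand-ins, the hyperparameters are placeholders, and none of the values reproduce the authors' dataset, settings, or results.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in for the phosphor dataset: 129 hosts x 8 made-up features.
X = rng.normal(size=(129, 8))
# Made-up nonlinear target in nm; this is NOT real emission data.
y = (520 + 40 * np.tanh(X[:, 0]) + 20 * X[:, 1] * X[:, 2]
     + rng.normal(scale=10, size=129))

# Bagging: each of the 10 GBRT models is fitted on a bootstrap resample of the
# training data, and their predictions are averaged.
# All hyperparameter values below are placeholders chosen for this sketch.
base = GradientBoostingRegressor(n_estimators=50, max_depth=3, random_state=0)
model = BaggingRegressor(base, n_estimators=10, random_state=0)

# 10-fold cross validation: every sample is predicted by a model that was
# trained without it.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
y_pred = cross_val_predict(model, X, y, cv=cv)

mae = mean_absolute_error(y, y_pred)
rmse = float(np.sqrt(mean_squared_error(y, y_pred)))
print(f"10-fold CV: MAE = {mae:.1f} nm, RMSE = {rmse:.1f} nm")
```

Averaging over bootstrap resamples reduces the variance of the individual boosted models, which is one common motivation for bagging when the training set is small.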
The predictive performance of the built machine learning model was comparable to that in previous studies (refs. 6 and 7). The results of the cross validation and the additional test suggest that it is essential to extend the phosphor dataset to cover diverse phosphor materials in order to achieve higher predictive performance over a wide range of candidate compounds.

Using the machine learning model, new green-emitting Eu2+-activated phosphors were searched for in the AtomWork-Adv materials database. Among twenty candidate compounds predicted to have emission peak wavelengths of about 500–550 nm, three new phosphors, namely Eu-doped Li2Ca4Si4O13, Na2Ca2Si2O7, and SrLaGaO4, were successfully synthesized. Li2Ca4Si4O13:Eu2+ and Na2Ca2Si2O7:Eu2+ showed green Eu2+ luminescence as designed. Eu-doped SrLaGaO4 showed simultaneous Eu2+ and Eu3+ luminescence, with a blue-green emission derived from the Eu2+ activators. These results clearly demonstrate that machine learning of the emission peak wavelength is useful for the rapid and efficient development of new Eu2+-activated phosphors with a designed luminescence color.

### Author contributions

Y. K. and N. H. devised the basic idea of this study. N. H. collected the phosphor data. Y. K. conducted the machine learning, and H. I., T. T. and N. H. contributed to refining the models. M. H. and S. F. performed the synthesis and characterization under the guidance of T. T. and N. H. Y. K. and T. T. prepared the original draft of the manuscript. All the authors approved the final version of the manuscript.

### Conflicts of interest

There are no conflicts to declare.

### Acknowledgements

This work was supported in part by the Japan Science and Technology Agency (JST), CREST Grant Number JPMJCR19J2.

### References

1. S. Ye, F. Xiao, Y. X. Pan, Y. Y. Ma and Q. Y. Zhang, Mater. Sci. Eng., R, 2010, 71, 1; Z. G. Xia and Q. L. Liu, Prog. Mater. Sci., 2016, 84, 59; L. Wang, R. J. Xie, T. Suehiro, T. Takeda and N. Hirosaki, Chem. Rev., 2018, 118, 1951.
2. X. Qin, X. W. Liu, W. Huang, M. Bettinelli and X. G. Liu, Chem. Rev., 2017, 117, 4488.
3. X. F. Luo and R. J. Xie, J. Rare Earths, 2020, 38, 464; X. D. Sun, K. A. Wang, Y. Yoo, W. G. Wallace-Freedman, C. Gao, X. D. Xiang and P. G. Schultz, Adv. Mater., 1997, 9, 1046; E. Danielson, J. H. Golden, E. W. McFarland, C. M. Reaves, W. H. Weinberg and X. D. Wu, Nature, 1997, 389, 944; J. S. Wang, Y. Yoo, C. Gao, I. Takeuchi, X. D. Sun, H. Y. Chang, X. D. Xiang and P. G. Schultz, Science, 1998, 279, 1712; K. S. Sohn, J. M. Lee and N. S. Shin, Adv. Mater., 2003, 15, 2081; W. B. Park, N. Shin, K. P. Hong, M. Pyo and K. S. Sohn, Adv. Funct. Mater., 2012, 22, 2258.
4. N. Hirosaki, T. Takeda, S. Funahashi and R. J. Xie, Chem. Mater., 2014, 26, 4280.
5. W. B. Park, S. P. Singh, M. Kim and K. S. Sohn, ACS Comb. Sci., 2015, 17, 317.
6. C. Park, J. W. Lee, M. Kim, B. D. Lee, S. P. Singh, W. B. Park and K. S. Sohn, Inorg. Chem. Front., 2021, 8, 4610.
7. H. Nakano, K. Tanaka, T. Miyao, K. Funatsu, R. Shirasawa and S. Tomiya, Chem. Lett., 2017, 46, 1482.
8. S. Q. Lai, M. Zhao, J. W. Qiao, M. S. Molokeev and Z. G. Xia, J. Phys. Chem. Lett., 2020, 11, 5680.
9. J. M. Ha, Z. B. Wang, E. Novitskaya, G. A. Hirata, O. A. Graeve, S. P. Ong and J. McKittrick, J. Lumin., 2016, 179, 297; Z. B. Wang, J. Ha, Y. H. Kim, W. B. Im, J. McKittrick and S. P. Ong, Joule, 2018, 2, 914; Y. Zhuo, A. M. Tehrani, A. O. Oliynyk, A. C. Duke and J. Brgoch, Nat. Commun., 2018, 9, 4377; S. X. Li, Y. H. Xia, M. Amachraa, N. T. Hung, Z. B. Wang, S. P. Ong and R. J. Xie, Chem. Mater., 2019, 31, 6286.
10. S. X. Li and R. J. Xie, ECS J. Solid State Sci. Technol., 2019, 9, 016013.
11. L. G. Van Uitert, J. Lumin., 1984, 29, 1; P. Dorenbos, ECS J. Solid State Sci. Technol., 2013, 2, R3001.
12. Z. Barandiarán, J. Joos and L. Seijo, Luminescent Materials: A Quantum Chemical Approach for Computer-Aided Discovery and Design, Springer, Cham, 2022; J. L. Pascual, J. Schamps, Z. Barandiarán and L. Seijo, Phys. Rev. B: Condens. Matter Mater. Phys., 2006, 74, 104105; Z. Barandiarán, A. Meijerink and L. Seijo, Phys. Chem. Chem. Phys., 2015, 17, 19874.
13. Y. C. Jia, A. Miglio, S. Poncé, X. Gonze and M. Mikami, Phys. Rev. B, 2016, 93, 155111.
14. Y. C. Jia, A. Miglio, S. Poncé, M. Mikami and X. Gonze, Phys. Rev. B, 2017, 96, 125132; Y. C. Jia, A. Miglio, S. Poncé, M. Mikami and X. Gonze, Phys. Rev. B, 2020, 101, 089902.
15. Y. Li, M. Gecevicius and J. R. Qiu, Chem. Soc. Rev., 2016, 45, 2090; W. M. Yen and M. J. Weber, Inorganic Phosphors: Compositions, Preparation and Optical Properties, CRC Press, Boca Raton, 2004.
16. Inorganic Crystal Structure Database (ICSD). FIZ Karlsruhe GmbH, Germany. https://icsd.products.fiz-karlsruhe.de/.
17. AtomWork-Adv. National Institute for Materials Science, Japan. https://atomwork-adv.nims.go.jp/.
18. L. Ward, A. Agrawal, A. Choudhary and C. Wolverton, npj Comput. Mater., 2016, 2, 16028; A. Seko, H. Hayashi, K. Nakayama, A. Takahashi and I. Tanaka, Phys. Rev. B, 2017, 95, 144110.
19. W. H. Baur, Acta Crystallogr., Sect. B: Struct. Crystallogr. Cryst. Chem., 1974, 30, 1195.
20. M. O'Keeffe and N. E. Brese, J. Am. Chem. Soc., 1991, 113, 3226.
21. N. E. R. Zimmermann and A. Jain, RSC Adv., 2020, 10, 6063.
22. S. P. Ong, W. D. Richards, A. Jain, G. Hautier, M. Kocher, S. Cholia, D. Gunter, V. L. Chevrier, K. A. Persson and G. Ceder, Comput. Mater. Sci., 2013, 68, 314.
23. XenonPy. https://github.com/yoshida-lab/XenonPy/.
24. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot and É. Duchesnay, J. Mach. Learn. Res., 2011, 12, 2825.
25. Hyperopt. https://hyperopt.github.io/hyperopt/.
26. Scikit-optimize. https://scikit-optimize.github.io/.
27. K. Fiaczyk and E. Zych, RSC Adv., 2016, 6, 91836.
28. M. E. Villafuerte-Castrejón, A. Dago and R. Pomés, J. Solid State Chem., 1994, 112, 438.
29. V. Kahlenberg and A. Hösch, Z. Kristallogr., 2002, 217, 155.
30. J. F. Britten, H. A. Dabkowska, A. B. Dabkowski, J. E. Greedan, J. L. Campbell and W. J. Teesdale, Acta Crystallogr., Sect. C: Cryst. Struct. Commun., 1995, 51, 1975.