# Fileset

[d2ma00881e1.pdf](https://mdr.nims.go.jp/filesets/fe3b6d6d-5773-4d9d-975a-8bc02d6062c5/download)

## Creator

[Yukinori Koyama](https://orcid.org/0000-0002-7090-4430), Hidekazu Ikeno, Masamichi Harada, Shiro Funahashi, [Takashi Takeda](https://orcid.org/0000-0003-2510-4562), Naoto Hirosaki

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Rapid discovery of new Eu2+-activated phosphors with a designed luminescence color using a data-driven approach](https://mdr.nims.go.jp/datasets/135a07a6-821d-4480-8f76-44d867b2e074)

## Fulltext

Supplementary Information

Rapid Discovery of New Eu2+-Activated Phosphors with a Designed Luminescence Color by a Data-Driven Approach

Yukinori Koyama (a, \*), Hidekazu Ikeno (b), Masamichi Harada (c), Shiro Funahashi (c), Takashi Takeda (c) and Naoto Hirosaki (c)

(a) Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science, Tsukuba, Ibaraki 305-0044, Japan

(b) Department of Materials Science, Graduate School of Engineering, Osaka Metropolitan University, Sakai, Osaka 599-8570, Japan

(c) Research Center for Functional Materials, National Institute for Materials Science, Tsukuba, Ibaraki 305-0044, Japan

\* Email: KOYAMA.Yukinori@nims.go.jp

Electronic Supplementary Material (ESI) for Materials Advances. This journal is © The Royal Society of Chemistry 2022

Table S1. List of elemental features used for the general-purpose features. The elemental features were obtained from the XenonPy package (Ref. 23).

| Elemental feature |
| --- |
| Atomic number |
| Period |
| Group |
| Atomic mass |
| Number of valence electrons |
| Number of valence s electrons |
| Number of valence p electrons |
| Number of valence d electrons |
| Number of valence f electrons |
| Number of unoccupied valence states |
| Number of unoccupied valence s states |
| Number of unoccupied valence p states |
| Number of unoccupied valence d states |
| Number of unoccupied valence f states |
| Atomic radius |
| Covalent radius |
| Van der Waals radius |
| Electronegativity |
| Electron affinity |
| First ionization energy |
| Mendeleev number |
| Polarizability |

Table S2. List of statistics used for the general-purpose features. $f_i$ and $w_i$ ($\sum_i w_i = 1$) denote an elemental feature and the atomic fraction of element $i$, respectively. $n$ denotes the number of elements.

| Statistic | Equation |
| --- | --- |
| Weighted arithmetic mean | $f_{\text{mean}} = \sum_{i=1}^{n} w_i f_i$ |
| Weighted geometric mean | $f_{\text{g-mean}} = \prod_{i=1}^{n} f_i^{w_i}$ |
| Weighted harmonic mean | $f_{\text{h-mean}} = \dfrac{1}{\sum_{i=1}^{n} w_i / f_i}$ |
| Weighted standard deviation | $f_{\text{sd}} = \sqrt{\sum_{i=1}^{n} w_i \left(f_i - f_{\text{mean}}\right)^2}$ |
| Minimum | $f_{\text{min}} = \min\{f_i\}$ |
| Maximum | $f_{\text{max}} = \max\{f_i\}$ |
| Range | $f_{\text{range}} = f_{\text{max}} - f_{\text{min}}$ |

Table S3. Feature-selection and regression pipeline, parameter ranges, and optimized values of the a) ridge, b) automatic relevance determination (ARD), c) random forest (RF), d) gradient boosted regression trees (GB), and e) bagging of GB models. Classes and functions in the scikit-learn package are listed without their module names.

a) Ridge

| Estimator | Parameter | Range | Optimized value |
| --- | --- | --- | --- |
| VarianceThreshold | threshold | Fixed | 1.0e-7 |
| StandardScaler | | | |
| SelectKBest | score_func | Fixed | mutual_info_regression |
| | k | 100, 150, …, 350 | 300 |
| RFE | estimator | Fixed | Ridge (default parameters) |
| | n_features_to_select | 10, 20, …, 100 | 90 |
| | step | Fixed | 10 |
| Ridge | alpha | [1.0e-6, 1.0e+6] (log-uniform) | 35.1 |

b) ARD

| Estimator | Parameter | Range | Optimized value |
| --- | --- | --- | --- |
| VarianceThreshold | threshold | Fixed | 1.0e-7 |
| StandardScaler | | | |
| SelectKBest | score_func | Fixed | mutual_info_regression |
| | k | 100, 150, …, 350 | 350 |
| RFE | estimator | Fixed | ARDRegression (default parameters) |
| | n_features_to_select | 10, 20, …, 100 | 60 |
| | step | Fixed | 10 |
| ARDRegression | alpha_1 | [0.0, 1.0] (uniform) | 0.734 |
| | alpha_2 | [1.0e-6, 1.0e+6] (log-uniform) | 1.73e-5 |
| | lambda_1 | [0.0, 1.0] (uniform) | 0.335 |
| | lambda_2 | [1.0e-6, 1.0e+6] (log-uniform) | 0.494 |
| | threshold_lambda | [1.0e+2, 1.0e+6] (log-uniform) | 4.65e+4 |

c) RF

| Estimator | Parameter | Range | Optimized value |
| --- | --- | --- | --- |
| VarianceThreshold | threshold | Fixed | 1.0e-7 |
| StandardScaler | | | |
| SelectKBest | score_func | Fixed | mutual_info_regression |
| | k | 100, 150, …, 350 | 350 |
| RFE | estimator | Fixed | RandomForestRegressor (default parameters) |
| | n_features_to_select | 10, 20, …, 100 | 100 |
| | step | Fixed | 10 |
| RandomForestRegressor | max_depth | 1, 2, …, 20 | 13 |
| | min_samples_leaf | 1, 2, 3 | 1 |
| | n_estimators | 50, 60, …, 200 | 180 |

d) GB

| Estimator | Parameter | Range | Optimized value |
| --- | --- | --- | --- |
| VarianceThreshold | threshold | Fixed | 1.0e-7 |
| StandardScaler | | | |
| SelectKBest | score_func | Fixed | mutual_info_regression |
| | k | 100, 150, …, 350 | 350 |
| RFE | estimator | Fixed | GradientBoostingRegressor (default parameters) |
| | n_features_to_select | 10, 20, …, 100 | 70 |
| | step | Fixed | 10 |
| GradientBoostingRegressor | learning_rate | [0.01, 0.5] (log-uniform) | 0.111 |
| | max_depth | 1, 2, …, 5 | 3 |
| | n_estimators | 100, 200, …, 1000 | 900 |

e) Bagging of GB

| Estimator | Parameter | Range | Optimized value |
| --- | --- | --- | --- |
| VarianceThreshold | threshold | Fixed | 1.0e-7 |
| StandardScaler | | | |
| SelectKBest | score_func | Fixed | mutual_info_regression |
| | k | 100, 150, …, 350 | 350 |
| RFE | estimator | Fixed | GradientBoostingRegressor (default parameters) |
| | n_features_to_select | 10, 20, …, 100 | 100 |
| | step | Fixed | 10 |
| BaggingRegressor | base_estimator | Fixed | GradientBoostingRegressor |
| | n_estimators | Fixed | 25 |
| GradientBoostingRegressor (base_estimator) | learning_rate | [0.01, 0.5] (log-uniform) | 0.213 |
| | max_depth | 1, 2, …, 5 | 3 |
| | n_estimators | 100, 200, …, 1000 | 400 |

Table S4. Mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination ($R^2$) of the machine-learning model for the emission peak energy, using the bagging of gradient boosted regression trees method, for the training and validation data in the cross validation. The scores were averaged over the folds of the cross validation. Standard deviations among the folds are shown in parentheses.

| Score | Training | Validation |
| --- | --- | --- |
| MAE / eV | 0.05 (0.00) | 0.13 (0.03) |
| RMSE / eV | 0.07 (0.00) | 0.16 (0.05) |
| $R^2$ | 0.97 (0.00) | 0.78 (0.16) |

Figure S1. Predicted emission peak energies with respect to reported values for the training (blue) and validation (red) data in the cross validation using the bagging of gradient boosted regression trees method.

Figure S2. XRD patterns of powder samples of Eu-doped (a) Li2Ca4Si4O13, (b) Na2Ca2Si2O7, and (c) SrLaGaO4.
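The statistics in Table S2 can be illustrated with a short sketch. This is not the authors' code or the XenonPy implementation. The function name `composition_statistics` and the choice to return NaN for the geometric and harmonic means when a feature value is not strictly positive are assumptions made for illustration; the source does not say how such cases were handled. The example uses the atomic numbers of Sr, La, Ga and O with the atomic fractions of SrLaGaO4.

```python
import math


def composition_statistics(values, fractions):
    """Compute the seven Table S2 statistics for one elemental feature.

    values    -- elemental feature f_i of each element in the composition
    fractions -- atomic fractions w_i, which must sum to 1
    (Hypothetical helper; names and NaN handling are assumptions.)
    """
    if not values or len(values) != len(fractions):
        raise ValueError("values and fractions must be non-empty and equal in length")
    if not math.isclose(sum(fractions), 1.0, abs_tol=1e-9):
        raise ValueError("atomic fractions must sum to 1")

    pairs = list(zip(values, fractions))
    mean = sum(w * f for f, w in pairs)
    sd = math.sqrt(sum(w * (f - mean) ** 2 for f, w in pairs))

    # Geometric and harmonic means are only defined here for strictly
    # positive feature values; returning NaN otherwise is an assumption.
    if all(f > 0 for f in values):
        g_mean = math.exp(sum(w * math.log(f) for f, w in pairs))  # prod f_i ** w_i
        h_mean = 1.0 / sum(w / f for f, w in pairs)
    else:
        g_mean = h_mean = math.nan

    f_min, f_max = min(values), max(values)
    return {
        "mean": mean,
        "g_mean": g_mean,
        "h_mean": h_mean,
        "sd": sd,
        "min": f_min,
        "max": f_max,
        "range": f_max - f_min,
    }


if __name__ == "__main__":
    # SrLaGaO4: atomic numbers of Sr, La, Ga, O and their atomic fractions.
    atomic_numbers = [38, 57, 31, 8]
    atomic_fractions = [1 / 7, 1 / 7, 1 / 7, 4 / 7]
    for name, value in composition_statistics(atomic_numbers, atomic_fractions).items():
        print(f"{name}: {value:.4f}")
```

Applying the seven statistics to each of the 22 elemental features in Table S1 would give 154 values per composition, although how these are combined with any other descriptors is not specified in this supplementary file.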