# Fileset

[Dieb-Tsuda2018_Chapter_MachineLearning-BasedExperimen.pdf](https://mdr.nims.go.jp/filesets/eaea4853-5181-4189-a1f2-d6056cec78a1/download)

## Creator

[Tsuda, Koji](https://orcid.org/0000-0002-4288-1606), [Dieb, Thaer M.](https://orcid.org/0000-0002-8111-2009)

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Machine Learning-Based Experimental Design in Materials Science](https://mdr.nims.go.jp/datasets/bdfcceb0-aeb3-4d80-835a-bc91df27172f)

## Fulltext

### Chapter 4: Machine Learning-Based Experimental Design in Materials Science

Thaer M. Dieb and Koji Tsuda

Abstract: In materials design and discovery processes, optimal experimental design (OED) algorithms are becoming more popular. OED is often modeled as the optimization of a black-box function. In this chapter, we introduce two machine learning-based approaches for OED: Bayesian optimization (BO) and Monte Carlo tree search (MCTS). BO is based on a relatively complex machine learning model and has proven effective in a number of materials design problems. MCTS is a simpler and more efficient approach that showed significant success in computer Go. We review existing OED applications in materials science and discuss future directions.

Keywords: Materials design ⋅ Optimal experiment design ⋅ Machine learning

T. M. Dieb ⋅ K. Tsuda (✉), National Institute for Materials Science, Tsukuba, Japan. K. Tsuda e-mail: tsuda@k.u-tokyo.ac.jp. T. M. Dieb e-mail: MOUSTAFADIEB.Thaer@nims.go.jp

T. M. Dieb ⋅ K. Tsuda, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan

K. Tsuda, Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan

© The Author(s) 2018. I. Tanaka (ed.), Nanoinformatics, https://doi.org/10.1007/978-981-10-7617-6_4

#### 4.1 Introduction

Materials design and discovery is a fundamental issue in materials science and engineering. The design of a composite material structure that achieves certain quality metrics is often a problem of selecting the optimal solution from a search space [1, 2]. Traditionally, this process depends on personal experience and expensive trial-and-error experiments. To accelerate this process, several optimal experimental design (OED) algorithms have been proposed, aiming to reduce the number of required experiments [3–8]. Figure 4.1 illustrates the materials design process by an optimal experimental design approach. Given a space of candidates S, OED aims to find the best candidate that optimizes a black-box function f(s), whose evaluation is possible only by an experiment. Starting from a random set of candidate solutions, an OED algorithm iteratively selects a set of candidate solutions for experiments. Experimental results are fed back to the OED algorithm to make further decisions. In many cases, experiments are replaced by simulators such as first-principles calculations.

Fig. 4.1 Optimal experimental design (OED) algorithm process. For a predetermined number of iterations, the OED algorithm selects a candidate set from the candidate space for experimentation. The experimental outcomes are then exploited for a better selection in the next iteration.

In this chapter, we review the applications of two OED algorithms in the materials science domain. The first is Bayesian optimization (BO) [9], which has proven effective in many materials design and discovery studies [1, 2, 6, 7, 10–13]. In BO methods, a machine learning model is employed to reconstruct the black-box function f(s). In addition, the uncertainty of the prediction is taken into consideration in candidate selection. The second is Monte Carlo tree search (MCTS), which showed exceptional performance in computer Go [14]. MCTS explores a tree-shaped search space and is more efficient than BO in most cases. In a recent study [8], MCTS was applied to a Si-Ge alloy design problem and shown to be applicable to large-scale design problems.

This chapter is organized into four sections. Section 4.2 discusses the Bayesian optimization method and its applications in materials design and discovery, while Sect. 4.3 is dedicated to Monte Carlo tree search. Section 4.4 concludes this chapter with a brief look at other available OED approaches.

#### 4.2 Bayesian Optimization

In machine learning communities, Bayesian optimization (BO), also known as kriging, has recently become a very popular tool for optimization problems [15–17]. BO is a sequential design strategy to optimize an expensive black-box function f(s). Derivatives of f are not required.
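
To illustrate how a prediction and its uncertainty can jointly drive candidate selection, the following minimal sketch scores a few candidates with expected improvement, one widely used selection rule for maximization problems. The candidate names, predicted means, standard deviations, and the helper functions are all hypothetical stand-ins for the output of a trained model; this is not code from any particular BO package.

```python
import math


def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Cumulative distribution of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean: float, std: float, best: float) -> float:
    """Expected improvement over `best` when f(s) is to be maximized."""
    if std <= 0.0:
        # No uncertainty left: the improvement is known exactly.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)


# Hypothetical model output: candidate -> (predicted mean, predictive std).
predictions = {
    "s1": (0.90, 0.01),  # confident prediction slightly below the best value
    "s2": (0.80, 0.30),  # lower prediction, but very uncertain
    "s3": (0.95, 0.00),  # already measured, nothing left to learn
}
best_so_far = 0.95

scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in predictions.items()
}
next_candidate = max(scores, key=scores.get)
print(next_candidate)  # s2: its uncertainty outweighs its lower prediction
```

A rule that ranked candidates by predicted mean alone would never pick `s2`; accounting for the predictive uncertainty is what makes it the most promising query here.
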
The difference between Bayesian optimization and earlier models that used regression [18] is that BO methods not only consider the predicted merit of candidates but also quantify uncertainty as the predictive variance. Based on this variance, BO can determine where to query f(s) next to achieve maximum performance. In this section, we briefly describe a basic BO method, then review several applications in the domain of materials design and discovery.

##### 4.2.1 Method

Assume that each candidate is represented using a set of N descriptors. The candidate set is then described as a set of points $S = \{s_1, \ldots, s_m\}$ in an N-dimensional space. We are looking for the best point $s_{\mathrm{opt}} \in S$ that maximizes a target black-box function f(s). It is very common, particularly in the materials science and engineering domain, that the cost of querying f(s) is very high. It is therefore necessary to find the optimal solution $s_{\mathrm{opt}}$ with as few queries as possible.

Bayesian optimization methods maintain a probabilistic model of f(s), most commonly a Gaussian process (GP) [19] (Fig. 4.2). Initially, a number of candidates are randomly selected and f(s) is obtained for each of them. The GP is trained using these data, and the user obtains a nonlinear regression function and its predictive variance. In BO, an acquisition function quantifies how promising a candidate is, and depends on both the regression function and the predictive variance. There are three typical choices: maximum probability of improvement, maximum expected improvement, and Thompson sampling [9]. The acquisition function is applied to all remaining candidates, and the one with the largest value is selected for the next experiment.

Fig. 4.2 Illustration of Bayesian optimization (BO). The Gaussian process provides a regression function (red curve) and its variance (blue curves). Candidate points are shown as red triangles. The next candidate is selected based on an acquisition function. (Axes: explanatory variable versus measured value; the current maximum is marked.)

The importance of uncertainty evaluation was investigated by Balachandran et al. [2]. They aimed to find the optimal design of the M₂AX family of compounds, where the interest is focused on elastic properties [bulk (B), shear (G), and Young's (E) modulus]. Balachandran et al. compared BO with selection based on the predicted values of support vector machines and showed that using uncertainty led to better performance.

##### 4.2.2 COMBO: Bayesian Optimization Package

With the increasing popularity of applications of Bayesian optimization to materials design problems, there was a need for an efficient tool to support this process. We implemented an open source package for Bayesian optimization in Python (COMBO: COMmon Bayesian Optimization library, https://github.com/tsudalab/combo) [11]. Thompson sampling, random feature maps, and rank-one Cholesky updates make it particularly suitable for handling large training datasets. It was shown that COMBO is more efficient than a GP implementation in scikit-learn (http://scikit-learn.org). To make it usable by non-experts, COMBO is parameter-free and can easily be used in various materials design problems. COMBO was first applied to optimize crystalline interface structures [10], where the aim is to find the translation parameters with the lowest grain boundary energy. A speedup of more than 50 times over random design was reported.

##### 4.2.3 Designing Phonon Transport Nanostructures

Fig. 4.3 Si-Ge interfacial structure between two Si leads. In this case, the interface region is made up of 16 atoms.

In a recent paper, Ju et al. [7] studied thermal conductivity in Si-Ge nanostructures. They applied COMBO to search for the maximum and minimum interfacial thermal conductance (ITC) across all configurations of silicon and germanium (Fig. 4.3). A binary representation was used to describe the position of each atom in the structure: 1 and 0 represent a Ge and a Si atom, respectively.
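
The binary representation can be sketched as follows. The 16 sites come from Fig. 4.3; the constraint that exactly half of them hold Ge is an assumption made for this illustration, and the function names `encode` and `candidates` are hypothetical.

```python
from itertools import combinations
from math import comb

N_SITES = 16  # atoms in the interface region (Fig. 4.3)
N_GE = 8      # assumption for this sketch: half of the sites hold Ge


def encode(ge_sites, n_sites=N_SITES):
    """Binary descriptor of a structure: 1 marks a Ge atom, 0 a Si atom."""
    ge = set(ge_sites)
    return tuple(1 if site in ge else 0 for site in range(n_sites))


def candidates(n_sites=N_SITES, n_ge=N_GE):
    """Yield every configuration with exactly `n_ge` Ge atoms."""
    for ge_sites in combinations(range(n_sites), n_ge):
        yield encode(ge_sites, n_sites)


first = next(candidates())
n_candidates = sum(1 for _ in candidates())

print(first)         # (1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0)
print(n_candidates)  # 12870, i.e. comb(16, 8)
```

Under this equal-composition assumption the candidate space contains C(16, 8) = 12870 configurations, each a 16-dimensional binary descriptor that a surrogate model can take as input.
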
It is reported that the optimal solution was reached after exploring only 3.4% of the total number of candidates (12870).

#### 4.3 Monte Carlo Tree Search

Large-scale problems are not rare in materials design and discovery. For example, finding the optimal configuration of two elements in a crystal structure with x sites involves exploring a search space of size $2^x$. When x = 10, the size of the space is 1024. The space size increases exponentially with the number of sites x (for x = 20, the size becomes 1048576). Since BO applies an acquisition function to all candidates, the computational time becomes prohibitive for large x.

The significant success of Monte Carlo tree search (MCTS) [20] in computer Go [14] inspired researchers to develop similar approaches in different research areas, including other types of games [21–24]. MCTS is a guided-random best-first search method that models the search space as a gradually expanded tree. Additionally, MCTS does not involve costly matrix operations like GP, making it very scalable for large search spaces. We recently applied MCTS to the atom assignment problem of Fig. 4.3 and showed that MCTS is more efficient than BO in large-scale problems [8].

##### 4.3.1 Method

Assume a material structure s with p positions. Each position has to be assigned an atom from a set A. We are looking for the best assignment of length p from the set of all possible assignments. The evaluation of a structure is given by a black-box function f(s) corresponding to either an experiment or a simulation.

MCTS uses a tree data structure to represent the search space (Fig. 4.4). A node at level n of the tree corresponds to the assignment of an atom a ∈ A to the n-th position. The maximum depth of the tree is p. A solution is defined by a path from the root to a leaf node at level p.
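
As a small illustration of this tree view, the sketch below counts the solutions and the nodes of a complete assignment tree and turns a root-to-leaf path into an assignment. The helper names and the convention of describing a path by child indices are assumptions made for this example only.

```python
from itertools import product


def n_solutions(n_atom_types: int, p: int) -> int:
    """Number of leaves at level p, i.e. complete assignments."""
    return n_atom_types ** p


def n_tree_nodes(n_atom_types: int, p: int) -> int:
    """Root plus every node on levels 1..p of the complete tree."""
    return sum(n_atom_types ** level for level in range(p + 1))


def solution_from_path(path, atoms):
    """Map a path (child index chosen at each level) to an assignment."""
    return [atoms[index] for index in path]


atoms = ["Si", "Ge"]
print(solution_from_path([0, 1, 1], atoms))     # ['Si', 'Ge', 'Ge']
print(n_solutions(2, 10), n_solutions(2, 20))   # 1024 1048576
print(n_tree_nodes(2, 20))                      # 2097151
print(len(list(product(atoms, repeat=3))))      # 8 leaves for p = 3
```

Even for two atom types and 20 positions, the complete tree has more than two million nodes, so building it in full quickly becomes impractical as p grows.
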
MCTS constructs only the top part of the search tree and expands it gradually toward promising areas. At a node at depth n < p, only a part of the solution is obtained. To obtain a full solution, MCTS uses a technique called rollout, i.e., completing the solution by random assignment of atoms to the remaining positions. After a full solution is made, f(s) is evaluated and recorded as the immediate merit of the node from which the rollout started.

At the beginning, only the root node exists. The search continues until a predetermined number of iterations is finished. In each iteration, MCTS has four steps (Fig. 4.4): selection, expansion, simulation, and backpropagation. The pseudo-code of MCTS is shown as Algorithm 1. In the selection step, MCTS starts from the root and traverses down following the path of the most promising child. Children of a node can be scored with different methods. The most common one is the upper confidence bound (UCB) score [20],

$$\mathrm{ucb}_i = \frac{z_i}{v_i} + C \sqrt{\frac{2 \ln v_{\mathrm{parent}}}{v_i}}, \tag{4.1}$$

where $z_i$ is the accumulated merit of the node, i.e., the sum of the immediate merits of all downstream nodes, $v_i$ is the visit count of the node, $v_{\mathrm{parent}}$ is the visit count of the parent node, and C is a constant that balances exploration and exploitation. In the expansion step, one or more children (depending on the implementation) are created under the selected node. For each expanded child, a full solution is obtained through rollout, then evaluated using f(s) and recorded in the simulation step. Finally, in the backpropagation step, the node information $z_i$, $v_i$ is updated along the path back to the root, to be used for better selection in the next iteration.

Fig. 4.4 Monte Carlo tree search (MCTS) for a three atom assignment problem (panels: Selection, Expansion, Simulation, Backpropagation). Atoms are to be assigned to a set of available positions. The search space is modeled as a decision tree where each node denotes a possible assignment. MCTS repeats four steps in each iteration: In the selection step, a promising leaf node is chosen by following the child with the best score. The expansion step adds a number of children nodes to the selected one. In simulation, a full solution is created by random rollout for each expanded node. The backpropagation step updates nodes' information along the path back to the root for a better selection in the next iteration.

##### 4.3.2 MDTS: A Python Package for MCTS

We developed a Python package of the MCTS algorithm that solves atom assignment problems [8]. The package, named MDTS (Materials Design using Tree Search), is available at https://github.com/tsudalab/MDTS. MDTS is a parameter-free tool that automatically sets the only hyperparameter of the MCTS algorithm (C) to obtain the best performance for the target application. Following a similar idea to [25], MDTS controls C adaptively at each node as follows:

$$C = \frac{\sqrt{2}\,J}{4}\,(f_{\max} - f_{\min}), \tag{4.2}$$

where J is a meta-parameter initially set to one and increased whenever the algorithm encounters a so-called dead-end leaf, to allow more exploration. $f_{\max}$ and $f_{\min}$ are the maximum and minimum immediate merits in downstream nodes.

To investigate the efficiency of MDTS, we compared MDTS with an efficient Bayesian optimization package [11] in designing optimal silicon-germanium (Si-Ge) alloy interfacial structures (Si:Ge = 1:1) to achieve both minimum and maximum thermal conductance [7].
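
For readers who prefer running code, the four MCTS steps and the UCB score of Eq. (4.1) can be sketched in plain Python as below. This is an illustrative sketch, not the MDTS implementation: C is a fixed constant rather than the adaptive value of Eq. (4.2), a selected full-depth leaf is simply re-scored (an assumption of this sketch), and `toy_merit` is a hypothetical stand-in for an experiment or simulation.

```python
import math
import random


class Node:
    """Tree node; `atom` is the atom assigned at this node's level."""

    def __init__(self, atom=None, parent=None):
        self.atom = atom
        self.parent = parent
        self.depth = 0 if parent is None else parent.depth + 1
        self.children = []
        self.z = 0.0  # accumulated merit
        self.v = 0    # visit count

    def partial_solution(self):
        """Atoms assigned along the path from the root to this node."""
        atoms, node = [], self
        while node.parent is not None:
            atoms.append(node.atom)
            node = node.parent
        return atoms[::-1]


def ucb(child, c):
    """UCB score of Eq. (4.1); every scored child has v >= 1."""
    explore = math.sqrt(2.0 * math.log(child.parent.v) / child.v)
    return child.z / child.v + c * explore


def mcts(f, atoms, p, iterations, c=1.0, seed=0):
    """Maximize f over assignments of length p; returns (solution, merit)."""
    rng = random.Random(seed)
    root = Node()
    best_solution, best_merit = None, -math.inf
    for _ in range(iterations):
        # Selection: follow the best-scoring child until a childless node.
        node = root
        while node.children:
            node = max(node.children, key=lambda child: ucb(child, c))
        # Expansion: one child per atom, unless the node is a full solution.
        if node.depth < p:
            node.children = [Node(atom, node) for atom in atoms]
            starts = node.children
        else:
            starts = [node]  # assumption: re-score a full-depth leaf
        for start in starts:
            # Simulation: complete the partial solution by random rollout.
            solution = start.partial_solution()
            solution += [rng.choice(atoms) for _ in range(p - start.depth)]
            merit = f(solution)
            if merit > best_merit:
                best_solution, best_merit = solution, merit
            # Backpropagation: update z and v on the path back to the root.
            current = start
            while current is not None:
                current.z += merit
                current.v += 1
                current = current.parent
    return best_solution, best_merit


def toy_merit(solution):
    """Hypothetical merit: number of neighbouring sites with different atoms."""
    return sum(a != b for a, b in zip(solution, solution[1:]))


best, merit = mcts(toy_merit, atoms=["Si", "Ge"], p=8, iterations=200)
print(best, merit)
```

Because every newly created child is rolled out immediately, each child has been visited at least once before its UCB score is computed, which avoids a division by zero in Eq. (4.1).
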
The total computation time was divided into design time and simulation time. The former is the time needed by the OED algorithm to select the next candidates, and the latter is the time needed to query the target function f(s), i.e., the time to compute the thermal conductance of the candidate solution in this particular application.

Algorithm 1: Monte Carlo tree search

    make root node root          ▷ each node has two values, z: accumulated merit, v: visit count
    solutions_set ← ∅
    while within number of iterations do
        n ← SELECTION(root)
        if n is not a maximum-depth leaf then
            children ← EXPANSION(n)
            for all child ∈ children do
                solution ← SIMULATION(child)
                e ← evaluate solution using experiment or computation
                BACKPROPAGATION(child, e)
                solutions_set ← [solutions_set, solution]
            end for
        end if
    end while
    return argmax(solutions_set)

    function SELECTION(node)
        if node has no children then
            return node
        else
            bst_child ← argmax over children c of node of ( c.z / c.v + C √(2 ln(node.v) / c.v) )
            return SELECTION(bst_child)
        end if
    end function

    function EXPANSION(node)
        for all possible children do
            make node child
            add child to children of the node
        end for
        return all children of the node
    end function

    function SIMULATION(node)
        structure ← the path from the root to node
        if node is not a maximum-depth leaf then
            structure ← complete the solution randomly      ▷ random rollout
        end if
        return structure
    end function

    function BACKPROPAGATION(node, e)
        node.z ← node.z + e
        node.v ← node.v + 1
        if parent is not None then                          ▷ parent is the parent of node
            return BACKPROPAGATION(parent, e)
        end if
    end function

When the number of positions is smaller than 24, Bayesian optimization showed better efficiency due to its sophisticated machine learning algorithm.
However, for larger problems, the design time of BO becomes prohibitively long, and MDTS was better at finding the best solution quickly.

##### 4.3.3 Discussion

The use of rollouts is the basis of MCTS. It enables systematic exploration of the space without generating the whole search space. In MDTS, the rollout is random, but it could possibly be improved using machine learning. For example, Yee et al. proposed an MCTS algorithm with machine learning in continuous action spaces [26], where the UCB score is modified using kernel regression. It should be possible to apply this approach to materials science as well.

It is important to consider the balance between design time and simulation time. MCTS methods are most useful when the simulation time is short. The long design time of a less efficient machine learning-based approach can appear less problematic when the simulation time is longer [8].

#### 4.4 Concluding Remarks

Optimal experimental design (OED) methods have recently been gaining importance in the field of materials science and engineering due to the widespread need to reduce the cost of materials design and discovery. In this chapter, we presented two OED methods and their applications in materials design. Bayesian optimization (BO) is a well-established method with several successful applications; however, it struggles with large-scale problems. A new approach using Monte Carlo tree search (MCTS) has emerged with competitive search efficiency and superior scalability. In the future, a hybrid approach combining machine learning and MCTS may achieve even better design efficiency.

Other available OED methods include evolutionary algorithms such as genetic algorithms [27, 28]. Such methods are scalable, but they have many parameters to tune (such as crossover and mutation rates). With limited data available a priori, as in most cases in materials design and discovery, tuning parameters may be difficult. Other sequential learning (SL) methodologies have been proposed.
For example, Ling et al. have implemented a new OED approach based on random forests with uncertainty estimates [29]. The proposed framework is scalable to high-dimensional parameter spaces. Wang et al. proposed a nested-batch-mode sequential learning method that suggests experiments in batches [30]. To increase the efficiency of BO, Jenatton et al. proposed a new surrogate model that combines independent Gaussian processes with a linear model encoding a tree-based dependency structure, which can transfer information between overlapping decision sequences [31]. They also designed a specialized two-step acquisition function that explores the search space more effectively.

Acknowledgements: This work was supported by a Grant-in-Aid for Scientific Research on Innovative Areas 'Nano Informatics' (Grant No. 25106005) from the Japan Society for the Promotion of Science (JSPS).

#### References

1. A. Seko, A. Togo, H. Hayashi, K. Tsuda, L. Chaput, I. Tanaka, Phys. Rev. Lett. 115, 205901 (2015)
2. P.V. Balachandran, D. Xue, J. Theiler, J. Hogden, T. Lookman, Sci. Rep. 6, 19660 (2016)
3. D. Reker, S.G. Drug, Discov. Today 20, 458 (2015)
4. A.R. Oganov, C.W. Glass, J. Chem. Phys. 124, 244704 (2006)
5. M. Ahmadi, M. Vogt, P. Iyer, J. Bajorath, H. Fröhlich, J. Chem. Inf. Model. 53, 553 (2013)
6. A. Seko, T. Maekawa, K. Tsuda, I. Tanaka, Phys. Rev. B 89, 054303 (2014)
7. S. Ju, T. Shiga, L. Feng, Z. Hou, K. Tsuda, J. Shiomi, Phys. Rev. X 7, 021024 (2017)
8. T.M. Dieb, S. Ju, K. Yoshizoe, Z. Hou, J. Shiomi, K. Tsuda, Sci. Tech. Adv. Mater. 18, 498 (2017)
9. J. Snoek, H. Larochelle, R. Adams, Advances in Neural Information Processing Systems, pp. 2951–2959, 2012
10. S. Kiyohara, H. Oda, K. Tsuda, T. Mizoguchi, Jpn. J. Appl. Phys. 55, 045502 (2016)
11. T. Ueno, T. Rhone, Z. Hou, T. Mizoguchi, K. Tsuda, Mater. Discov. 4, 18 (2016)
12. R. Aggarwal, M.J. Demkowicz, Y.M. Marzouk, Modelling Simul. Mater. Sci. Eng. 23, 015009 (2015)
13. T. Lookman, F. Alexander, K. Rajan (eds.), Information Science for Materials Discovery and Design, Springer Series in Materials Science, vol. 225 (Springer International Publishing, Switzerland, 2016)
14. D. Silver, A. Huang, C. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot et al., Nature 529, 484 (2016)
15. D.R. Jones, M. Schonlau, W.J. Welch, J. Glob. Optim. 13, 455 (1998)
16. S. Streltsov, P. Vakili, J. Glob. Optim. 14, 283 (1999)
17. M.J. Sasena, Flexibility and efficiency enhancement for constrained global design optimization with kriging approximations. Ph.D. thesis, University of Michigan, 2002
18. D. Couling, R. Bernot, K.M. Docherty, J.K. Dixon, E.J. Maginn, Green Chem. 8, 82 (2006)
19. C.E. Rasmussen, C.K.I. Williams (eds.), Gaussian Processes for Machine Learning (MIT Press, 2006)
20. C. Browne, E. Powley, D. Whitehouse, S. Lucas, P. Cowling, P. Rohlfshagen et al., IEEE Trans. Comput. Intell. AI Games 4(1), 1 (2012)
21. B. Arneson, R.B. Hayward, P. Henderson, IEEE Trans. Comput. Intell. AI Games 2, 251 (2010)
22. J. Mehat, T. Cazenave, IEEE Trans. Comput. Intell. AI Games 2, 271 (2010)
23. A. Rimmel, F. Teytaud, T. Cazenave, Appl. Evol. Comput. 501–510 (2011)
24. C. Mansley, A. Weinstein, M.L. Littman, Int. Conf. Automat. Plan. Sched. 335–338 (2011)
25. L. Kocsis, C. Szepesvári, Machine Learning: ECML 2006 (Springer, Berlin, 2006), pp. 282–293
26. T. Yee, V. Lisy, M. Bowling, in International Joint Conference on Artificial Intelligence, pp. 690–696, 2016
27. T.K. Patra, V. Meenakshisundaram, J. Hung, D. Simmons, ACS Comb. Sci. 19(2), 96 (2017). https://doi.org/10.1021/acscombsci.6b00136
28. W. Paszkowicz, K.D. Harris, R.L. Johnston, Comput. Mater. Sci. 45(1), ix (2009). https://doi.org/10.1016/j.commatsci.2008.07.008
29. J. Ling, M. Hutchinson, E. Antono, S. Paradiso, B. Meredig, Integr. Mater. Manuf. Innov. (2017). https://doi.org/10.1007/s40192-017-0098-z
30. Y. Wang, K.G. Reyes, K.A. Brown, C.A. Mirkin, W.B. Powell, SIAM J. Sci. Comput. 37, B361 (2015)
31. R. Jenatton, C. Archambeau, J. Gonzalez, M. Seeger, in International Conference on Machine Learning, pp. 1655–1664, 2017

Open Access: This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.