Description:
(abstract) Materials science research benefits from powerful machine-learning (ML) surrogate models, but it is also limited by ML's implicit requirement for sufficiently large and balanced data distributions. In this paper, we propose a model to obtain more credible results for small and imbalanced materials data sets as well as chemical knowledge. Taking two imbalanced bandgap data sets as examples, we demonstrate the usability and performance of our model compared with common ML models that use normal sampling and resampling methods.
Rights:
Keyword: cost-sensitive, iterative machine-learning method, small and imbalanced materials data sets, chemical knowledge, CSIML
Date published: 2024-05-02
Publisher: Oxford University Press (OUP)
Journal:
Funding:
Manuscript type: Publisher's version (Version of record)
MDR DOI:
First published URL: https://doi.org/10.1093/chemle/upae090
Related item:
Other identifier(s):
Contact agent:
Updated at: 2024-11-28 16:30:28 +0900
Published on MDR: 2024-11-28 16:30:29 +0900
| Filename | Type | Size |
|---|---|---|
| upae090 (1).pdf | application/pdf | 3.35 MB |