Yuna Oikawa; Guillaume Deffrennes; Rintaro Shimayoshi; Taichi Abe; Ryo Tamura; Koji Tsuda
Description:
(abstract) Large language models (LLMs) are general-purpose tools with wide-ranging applications, including in materials science. In this work, we introduce aLLoyM, a fine-tuned LLM specifically trained on alloy compositions, temperatures, and their corresponding phase information. To develop aLLoyM, we curated question-and-answer (Q&A) pairs for binary and ternary phase diagrams using the open-source Computational Phase Diagram Database (CPDDB) and assessments based on CALPHAD (CALculation of PHAse Diagrams). We fine-tuned Mistral, an open-source pre-trained LLM, for two distinct Q&A formats: multiple-choice and short-answer. Benchmark evaluations demonstrate that fine-tuning substantially enhances performance on multiple-choice phase diagram questions. Moreover, the short-answer model of aLLoyM can generate novel phase diagrams from its components alone, suggesting that it may aid the discovery of new materials systems. To promote further research and adoption, we have publicly released the short-answer fine-tuned version of aLLoyM, along with the complete benchmarking Q&A dataset, on Hugging Face.
Rights:
Keyword: LLM, phase diagram
Date published: 2026-01-22
Publisher: Springer Science and Business Media LLC
Journal:
Funding:
Manuscript type: Publisher's version (Version of record)
MDR DOI:
First published URL: https://doi.org/10.1038/s41524-026-01966-6
Related item:
Other identifier(s):
Contact agent:
Updated at: 2026-04-01 13:56:07 +0900
Published on MDR: 2026-04-01 16:26:13 +0900
| Filename | Size |
|---|---|
| s41524-026-01966-6 (2).pdf (application/pdf) | 3.33 MB |