Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii
Description:
Abstract: This study assesses the capabilities of large language models (LLMs) such as GPT-3.5-Turbo, GPT-4, and GPT-4-Turbo for extracting structured information from scientific documents in materials science. To this end, we primarily focus on (i) named entity recognition (NER) of studied materials and physical properties and (ii) relation extraction (RE) between these entities. The performance of the LLMs on these tasks is benchmarked against traditional models based on BERT and on rule-based approaches. Notably, GPT-4 and GPT-4-Turbo display remarkable reasoning and relation-extraction capabilities after being provided with merely a couple of examples.
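To make the few-shot setting described in the abstract concrete, the sketch below assembles a prompt for materials/property NER and RE from a couple of worked examples and parses a JSON-formatted model reply. The example sentences, JSON schema, and function names are illustrative assumptions, not the authors' actual prompts or pipeline.

```python
# Hedged sketch of few-shot prompting for materials NER + RE.
# The schema and examples are assumptions for illustration only.
import json

FEW_SHOT_EXAMPLES = [
    {
        "text": "MgB2 shows superconductivity below 39 K.",
        "entities": [
            {"span": "MgB2", "type": "material"},
            {"span": "superconductivity below 39 K", "type": "property"},
        ],
        "relations": [["MgB2", "superconductivity below 39 K"]],
    },
]

def build_prompt(sentence: str) -> str:
    """Assemble a few-shot prompt: task instruction, worked examples, query."""
    parts = [
        "Extract materials and physical properties from the text, and "
        "link each property to its material. Answer in JSON."
    ]
    for ex in FEW_SHOT_EXAMPLES:
        parts.append(f"Text: {ex['text']}")
        parts.append("Answer: " + json.dumps(
            {"entities": ex["entities"], "relations": ex["relations"]}))
    parts.append(f"Text: {sentence}")
    parts.append("Answer:")
    return "\n".join(parts)

def parse_answer(reply: str) -> dict:
    """Parse the model's JSON reply into entities and relations."""
    return json.loads(reply)

prompt = build_prompt("YBa2Cu3O7 has a critical temperature of 92 K.")
```

The prompt string would then be sent to a chat-completion endpoint; only the prompt assembly and reply parsing are shown, since the model call itself depends on the chosen API.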
Keywords: Large language models, benchmark, NER, TDM, evaluation, materials science
Date published: 2024-12-31
Publisher: Informa UK Limited
Manuscript type: Publisher's version (Version of record)
First published URL: https://doi.org/10.1080/27660400.2024.2356506
Updated at: 2024-10-24 16:30:23 +0900
Published on MDR: 2024-10-24 16:30:24 +0900
| Filename | Type | Size |
|---|---|---|
| Mining experimental data from materials science literature with large language models an evaluation study.pdf | application/pdf | 3.19 MB |