# Development of LLM-assisted data curation tools for the Starrydata materials science database

https://mdr.nims.go.jp/datasets/df425f93-ab4c-4ca1-8cc9-0d7171b0c1ad

## Files

- [Development of LLM-assisted data curation tools for the Starrydata materials science database.pdf](https://mdr.nims.go.jp/filesets/8ff96c58-a62d-41ea-8fe6-02cefcaae988/download) ([Detail](https://mdr.nims.go.jp/filesets/8ff96c58-a62d-41ea-8fe6-02cefcaae988.md))

## Id

df425f93-ab4c-4ca1-8cc9-0d7171b0c1ad

## Local identifier



## Visibility

open_to_public

## State

published

## Created at

2026-01-15T09:35:03.653562Z

## Updated at

2026-01-20T00:35:27.185877Z

## Published at

2026-01-20T03:23:04.022488Z

## Doi



## First published url

https://doi.org/10.1080/27660400.2025.2590811

## Date published

2025-12-31

## Recorded date published

2025-12-31

## Resource type

journal_article

## Manuscript type

vor

## Collection



## Title

- title: Development of LLM-assisted data curation tools for the Starrydata materials
    science database
  title_type: original
  lang: en

## Description

- description: 'We developed two LLM-assisted tools to accelerate data collection
    from materials science publications for the Starrydata database. The first tool,
    Starrydata Auto-Suggestion, generates concise English descriptions from abstracts
    and methods that conform to our database schema, integrated into the Starrydata2
    platform using lightweight models. The second tool is a dual-component system:
    Auto-Summary GPT processes PDF files to generate comprehensive JSON output capturing
    all figures, tables, and samples, while Auto-Summary Viewer transforms this into
    interactive tables for efficient curator review. These tools enhance curation
    efficiency and advance automated scientific database construction.'
  description_type: abstract
  lang: und
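
The abstract describes a two-step flow: Auto-Summary GPT emits JSON covering figures, tables, and samples, and Auto-Summary Viewer turns that JSON into tables for curator review. As a rough illustration only, the sketch below flattens a small JSON document into one row per sample and figure pair and renders it as a plain-text table. The JSON layout, every field name (`samples`, `figures`, `sample_id`, `figure_id`, `composition`, `y_axis`), the placeholder values, and the functions `to_rows` and `render_table` are assumptions made for this example; this record does not specify the actual schema or the Viewer's implementation.

```python
import json

# Hypothetical Auto-Summary-style output. The real schema is not given in
# this record; all keys and values here are placeholders for illustration.
EXAMPLE_JSON = """
{
  "samples": [
    {"sample_id": "S1", "composition": "placeholder-A", "figures": ["Fig. 2a"]},
    {"sample_id": "S2", "composition": "placeholder-B", "figures": ["Fig. 2a", "Fig. 3"]}
  ],
  "figures": [
    {"figure_id": "Fig. 2a", "y_axis": "property X"},
    {"figure_id": "Fig. 3", "y_axis": "property Y"}
  ]
}
"""


def to_rows(summary):
    """Flatten the summary into one row per (sample, figure) pair."""
    figures = {f["figure_id"]: f for f in summary.get("figures", [])}
    rows = []
    for sample in summary.get("samples", []):
        for fig_id in sample.get("figures", []):
            fig = figures.get(fig_id, {})
            rows.append({
                "sample_id": sample["sample_id"],
                "composition": sample.get("composition", ""),
                "figure_id": fig_id,
                "y_axis": fig.get("y_axis", ""),
            })
    return rows


def render_table(rows):
    """Render rows as a fixed-width text table; empty input gives ''."""
    if not rows:
        return ""
    headers = list(rows[0].keys())
    widths = [max(len(h), max(len(str(r[h])) for r in rows)) for h in headers]
    lines = [" | ".join(h.ljust(w) for h, w in zip(headers, widths))]
    lines.append("-+-".join("-" * w for w in widths))
    for r in rows:
        lines.append(" | ".join(str(r[h]).ljust(w) for h, w in zip(headers, widths)))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table(to_rows(json.loads(EXAMPLE_JSON))))
```

Keeping one row per sample and figure pair lets a reviewer scan which measured curve belongs to which sample without reading nested JSON, which is the kind of review step the abstract attributes to the Viewer.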

## Creator

- name: Yukari Katsura
  role: author
  orcid: https://orcid.org/0000-0002-8905-2995
  organization: National Institute for Materials Science
- name: Tomoya Mato
  role: author
  orcid: https://orcid.org/0000-0002-0918-6468
  organization: National Institute for Materials Science
  department: Center for Basic Research on Materials
- name: Yu Takada
  role: author
  orcid: https://orcid.org/0009-0002-0709-1817
  organization: National Institute for Materials Science
  department: Center for Basic Research on Materials
- name: Eiji Koyama
  role: author
  organization: National Institute for Materials Science (NIMS)
  department: Center for Basic Research on Materials
- name: Dewi Yana
  role: author
  organization: National Institute for Materials Science (NIMS)
  department: Center for Basic Research on Materials
- name: Atsumi Tanaka
  role: author
  organization: National Institute for Materials Science (NIMS)
  department: Center for Basic Research on Materials
- name: Masaya Kumagai
  role: author
  organization: Sakura Internet Inc.
  department: Sakura Internet Research Center

## Contact agent



## Publisher

organization: Informa UK Limited

## Managing organization



## Keyword

- subject: Materials informatics
  schema: not_defined
- subject: materials database
  schema: not_defined
- subject: data curation
  schema: not_defined
- subject: literature data mining
  schema: not_defined
- subject: automated data extraction
  schema: not_defined
- subject: automated knowledge extraction
  schema: not_defined
- subject: large language model
  schema: not_defined

## Rights

- identifier: https://creativecommons.org/licenses/by/4.0/

## Other identifier(s)



## Data origin

- data_origin_type: other

## Embargo



## Journal

- title: 'Science and Technology of Advanced Materials: Methods'
  issn: '27660400'
  volume: '5'
  issue: '1'
  article_number: '2590811'

## Conference



## Related item



## Funding

- identifier: Grant JPMJCR19J1
  funder_name: JST-CREST
- funder_name: Kazuchika Okura Memorial Foundation

## Instrument



## Instrument operator



## Instrument managing organization



## Measurement method



## Specimen



## Chemical composition



## Structure for specimen



## Structural feature for specimen



## Specific property for specimen



## Process for specimen treatment



## Computational method



## Energy level/transition state



## Software



## Custom property



## Fileset

- id: 8ff96c58-a62d-41ea-8fe6-02cefcaae988
  filename: Development of LLM-assisted data curation tools for the Starrydata materials
    science database.pdf
  content_type: application/pdf
  size: 13056193
  md5: a38ce569b9699c1c451818cae4da30d7

## Thumbnail

fileset_id: 8ff96c58-a62d-41ea-8fe6-02cefcaae988
filename: Development of LLM-assisted data curation tools for the Starrydata materials
  science database.pdf