# CSIML: a cost-sensitive and iterative machine-learning method for small and imbalanced materials data sets

https://mdr.nims.go.jp/datasets/3afe81ed-686e-4f83-be7c-68ac7cf1fdff

## Files

- [upae090 (1).pdf](https://mdr.nims.go.jp/filesets/de16ee8d-8e03-4880-8721-376e99cf2fde/download) ([Detail](https://mdr.nims.go.jp/filesets/de16ee8d-8e03-4880-8721-376e99cf2fde.md))

## Id

3afe81ed-686e-4f83-be7c-68ac7cf1fdff

## Local identifier



## Visibility

open_to_public

## State

published

## Created at

2024-11-26T07:19:54.001926Z

## Updated at

2024-11-28T07:30:28.976911Z

## Published at

2024-11-28T07:30:29.322667Z

## Doi



## First published url

https://doi.org/10.1093/chemle/upae090

## Date published

2024-05-02

## Recorded date published

2024-5-2

## Resource type

journal_article

## Manuscript type

vor

## Collection



## Title

- title: 'CSIML: a cost-sensitive and iterative machine-learning method for small
    and imbalanced materials data sets'
  title_type: original
  lang: en

## Description

- description: Materials science research benefits from powerful machine-learning
    (ML) surrogate models, but it is also limited by ML's implicit requirement for
    sufficiently large and balanced data sets. In this paper, we propose
    a model to obtain more credible results, as well as chemical knowledge, for
    small and imbalanced materials data sets. Taking two imbalanced bandgap data sets as
    examples, we demonstrate the usability and performance of our model compared
    with common ML models using normal sampling and resampling methods.
  description_type: abstract
  lang: und

## Creator

- name: Shengzhou Li
  role: author
  orcid: https://orcid.org/0000-0001-6973-3825
- name: Ayako Nakata
  role: author
  orcid: https://orcid.org/0000-0002-3311-6283

## Contact agent



## Publisher

organization: Oxford University Press (OUP)

## Managing organization



## Keyword

- subject: cost-sensitive
  schema: not_defined
- subject: iterative machine-learning method
  schema: not_defined
- subject: small and imbalanced materials data sets
  schema: not_defined
- subject: chemical knowledge
  schema: not_defined
- subject: CSIML
  schema: not_defined

## Rights

- identifier: https://creativecommons.org/licenses/by-nc/4.0/

## Other identifier(s)



## Data origin



## Embargo



## Journal

- title: Chemistry Letters
  issn: '03667022'
  volume: '53'
  issue: '5'

## Conference



## Related item



## Funding

- identifier: JP20H05883
  funder_name: JSPS
- identifier: JP20H05878
  funder_name: JSPS
- identifier: JPMJPR20T4
  funder_name: JST PRESTO

## Instrument



## Instrument operator



## Instrument managing organization



## Measurement method



## Specimen



## Chemical composition



## Structure for specimen



## Structural feature for specimen



## Specific property for specimen



## Process for specimen treatment



## Computational method



## Energy level/transition state



## Software



## Custom property



## Fileset

- id: de16ee8d-8e03-4880-8721-376e99cf2fde
  filename: upae090 (1).pdf
  content_type: application/pdf
  size: 3507695
  md5: e43ec047bde0f8df5e596a2d1d3e04f4
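
The `size` and `md5` values recorded above can be used to check that a downloaded copy of the PDF is intact. The following is a minimal sketch, not part of the record itself: the helper names `md5_of_file` and `matches_record` are hypothetical, and it assumes the file has already been saved locally.

```python
import hashlib
import os

# Values copied from the Fileset entry of this record.
EXPECTED_MD5 = "e43ec047bde0f8df5e596a2d1d3e04f4"
EXPECTED_SIZE = 3507695  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if both the file size and the MD5 digest match (hypothetical helper)."""
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

Calling `matches_record("upae090 (1).pdf")` on the saved file would then return `True` only if the size and checksum both agree with the record. The size is compared first because it is cheap and catches truncated downloads before any hashing is done.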

## Thumbnail

fileset_id: de16ee8d-8e03-4880-8721-376e99cf2fde
filename: upae090 (1).pdf