Dataset: A material dictionary database to extract information on permanent magnets from scientific articles

Akira Suzuki (Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science)

Collection

Citation
Akira Suzuki. A material dictionary database to extract information on permanent magnets from scientific articles. https://doi.org/10.48505/nims.3857

Description:

(abstract)

In this study, we developed a new information extraction method using a material dictionary database (MDDB), which parses scientific articles and collects related phrases from various expressions. We used magnetic properties as an illustrative case to analyze how the proposed system works. Structured terms comprising sub-phrases, tagged words, and their relationships enabled automatic annotation and information extraction. The MDDB was constructed on a pre-built knowledge base that includes information categories and related keywords. These categories can be hierarchically structured and flexibly updated to extract a wide range of information on the associations between magnetic materials and properties, along with the measurement systems used, structural analyses performed, and theoretical foundations. Herein, we propose preliminary rule-based phrase collection methods and label pattern extraction for phrases, which make it easy to add new structured terms. We found 1,136 new phrases by label pattern-matching, which enabled more related expressions to be retrieved from the text and improved the accuracy of information extraction. Approximately 350 relationships among the material types, properties, and values were extracted from the manually modified annotations of 40 articles on permanent magnets. Our method can be applied to other research domains, where it can be used to build knowledge bases for any topic in the field.
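The dictionary tagging and label pattern-matching described in the abstract can be illustrated with a minimal sketch. This is not the MDDB implementation: the dictionary entries, the unit regular expression, and the names `DICTIONARY`, `tag_text`, and `match_label_pattern` are all hypothetical and chosen only for illustration. The sketch tags keywords by category, then looks for a fixed sequence of labels to form a (property, material, value) relationship.

```python
import re
from typing import NamedTuple

# Hypothetical mini dictionary (category label -> keywords); not taken from the MDDB.
DICTIONARY = {
    "MATERIAL": ["Nd-Fe-B", "Sm-Co"],
    "PROPERTY": ["coercivity", "remanence"],
}

# Assumed pattern for numeric values with a few magnetic units.
VALUE_RE = re.compile(r"\b\d+(?:\.\d+)?\s*(?:kA/m|kOe|T)\b")


class Tag(NamedTuple):
    start: int
    end: int
    label: str
    text: str


def tag_text(text):
    """Tag dictionary keywords and numeric values, ordered by position."""
    tags = []
    for label, keywords in DICTIONARY.items():
        for keyword in keywords:
            for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
                tags.append(Tag(m.start(), m.end(), label, m.group()))
    for m in VALUE_RE.finditer(text):
        tags.append(Tag(m.start(), m.end(), "VALUE", m.group()))
    return sorted(tags)


def match_label_pattern(tags, pattern=("PROPERTY", "MATERIAL", "VALUE")):
    """Return every run of consecutive tags whose labels equal `pattern`."""
    n = len(pattern)
    hits = []
    for i in range(len(tags) - n + 1):
        window = tags[i:i + n]
        if tuple(t.label for t in window) == pattern:
            hits.append({t.label: t.text for t in window})
    return hits


sentence = "The coercivity of the Nd-Fe-B magnet was 1200 kA/m."
relations = match_label_pattern(tag_text(sentence))
print(relations)
```

In this toy version a new expression is supported by adding a keyword to a category or by adding another label pattern, which loosely mirrors the flexible updating the abstract describes.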

Nature of data: informatics_and_data_science

Rights:

Keywords: database, magnetic property, materials informatics, natural language processing, ontology, information extraction

Date of publication:

Publisher: NIMS

Journal:

Funding:

  • Ministry of Education, Culture, Sports, Science and Technology (MEXT) JPMXP1122715503 (Data Creation and Utilization-Type Material Research and Development Project (Digital Transformation Initiative Center for Magnetic Materials))

Manuscript type: Data other than articles

MDR DOI: https://doi.org/10.48505/nims.3857

Public URL: https://github.com/suzuki-akira3/MDDB.git

Related materials:

Other identifiers:

Contact:

Updated at: 2023-01-31 22:30:50 +0900

Published on MDR at: 2023-02-03 11:36:20 +0900

Computational method

Description:

Category: https://matvoc.nims.go.jp/entity/Q21 (MatVoc)

Category description:

Calculated at:

Files (name, media type, size):

  • MDDB.owl (application/rdf+xml, 2.37MB)
  • Table_2_all_data.csv (text/csv, 3.89KB)
  • Table_3_all_data.csv (text/csv, 38KB)
  • Table_5_all_data.csv (text/csv, 100KB)
  • Table_6_all_data.csv (text/csv, 4.97KB)
  • Table_7_all_data.csv (text/csv, 3.3KB)
  • Table_8_all_data.csv (text/csv, 43.8KB)
  • Table_9_all_data.csv (text/csv, 102KB)
  • Table_10_all_data.csv (text/csv, 24.2KB)