# Fileset

[RDA-20201110tanifuji.pdf](https://mdr.nims.go.jp/filesets/bf26f178-dacb-4294-9777-42b4c8d6dfb8/download)

## Creator

[Tanifuji, Mikiko](https://orcid.org/0000-0001-5284-6364)

## Rights

Creative Commons BY-ND Attribution-NoDerivatives 4.0 International

## Other metadata

[Materials Data Platform overview: metadata, vocabulary, and repository](https://mdr.nims.go.jp/datasets/1c086cdc-4eb7-43bf-9be0-719c4e9fa25d)

## Fulltext

### Materials Data Platform overview: metadata, vocabulary, and repository

Common metadata and domain-specific metadata in the Materials Data Repository

Mikiko Tanifuji
Materials Data Platform Center, Div. of Materials Data and Integrated System (MaDIS), National Institute for Materials Science

RDA2020 RDA Breakout 6: Materials IG session: Data Infrastructure for Collaborations in Materials Research, November 12, 2020

### Who we are

- NIMS: National Institute for Materials Science. Established in 2001 by merger of two national institutes (metals + inorganic materials). Now covers materials in general.
- MaDIS: Research & Services Division of Materials Data and Integrated System. Established in 2017 to focus on materials data and integration.
- DPFC: Materials Data Platform Center. Budget 2017–2020, 3 billion yen.

### Who we are: a Facebook

Data-driven people at MaDIS. 66 staff at Materials Data Platform Center.

Data Science

- Sparse modeling (model selection)
- Image analysis: pattern recognition, deep learning
- Regression techniques: machine learning, Bayesian inference
- Optimization techniques: active learning, etc.
- Natural language processing

Materials Architectures

- Computational simulation: automation
- SIP-MI: end-to-end prediction starting from the process
- SMILES X: artificial molecular design
- Measurement informatics

Materials Development

- Polymer materials design
- Thermal control materials
- Metallic structural materials
- Semiconductors
- Porous materials

Data infrastructure

- Data structure and modeling
- Data curation
- Data collection and FAIRable
- Data mining from publications
- Data system technology and development

### Materials Data Platform at NIMS

Four actions: Create the data, Use the data, Store the data, Publish the data.

Diagram labels: Text/Data mining; Experiments & Calculations; Analysis & Integration; Repository; Store and manage.

Source: NIMS NOW, 19 (1), 2019

### Materials Data Platform overview

Diagram labels:

- Publishing, data linking, open science
- HPC Server; Analysis environment; User authentication
- Data collection; Building advanced databases; Large-scale facilities; Analyses and materials integration
- Closed data / Open data
- Collaboration: Industry, Academia
- Publishing, Federation with other repositories
- Text and data mining
- MatNavi Database; Other database
- Data Collection System; IoT data transfer system
- M-DaC Data Conv Tools; Machine learning system; SIP Mint system
- Research Data Management; Materials Data Repository
- Common metadata model; API Framework; Materials Vocabulary Wiki
- Data Management Plan; Data policy
- Data Provider; Data Center; Data Science
- Academia-Private sectors; Data Cloud; Materials data hub (2021 - )

Image credit: Koji101, pngimg.com (CC)

### Four actions mapped to the platform components

Four actions: Create the data, Use the data, Store the data, Publish the data.

Diagram labels: Text and data mining; Data Collection System; IoT data transfer system; Materials integration; Analysis environment; MatNavi Database; Other database; Research Data Management; Materials Data Repository; Common metadata model; Materials Vocabulary Wiki; Server cluster; Data policy; User auth.

### DICE Common Message Format

Common metadata model (mandatory metadata): Bibliographic metadata + Administrative metadata + Subject material.

| Domain | Domain-specific metadata (examples) | Primary parameters | Payload |
| --- | --- | --- | --- |
| Characterization | Method, Environment… | Characterization primary params | Data |
| Specimen | Material type, Structure… | Specimen primary params | Data |
| Property | Physical properties, Units… | Property primary params | Data |
| Synthesis/Process | Processed date, Temperature… | Synthesis/Process primary params | Data |
| Calculation | Computer software, Version… | Calculation primary params | Data |

Diagram labels: Mandatory metadata; Domain-specific metadata; Primary parameters; METADATA; DATA; Implemented as data model; Save as files.

After lots of discussions (still ongoing), schema file published at https://dice.nims.go.jp/

### DICE Common Message Format

- Various datasets in various systems are all described using the common metadata model
- Bibliographic metadata is mapped to standard vocabulary on MDR using RDF for FAIRness

Diagram labels: Data Collection System; Files; Direct MDR upload; RDM; Metadata form; Experimental data files; Importer; Research Data Management; Files; Invoice; Common metadata model; dice.nims.go.jp

### Ongoing discussion about "primary parameters"

- Highly domain-specific parameters such as:
  - Which absorption edge for a XAFS measurement?
  - Which basis set for a Gaussian calculation?

As a 2 x n key-value CSV:

| Key | Value |
| --- | --- |
| Software | Gaussian09 |
| Calculation | B3LYP |
| Basis set | cc-pVDZ |

As a Schema.org-like JSON-LD file (files: data.dat, metadata.jsonld):

    { "@graph": [
      { "@id": "./",
        /* Bibliographic metadata */
        "name": ...,
        "author": ...,
        /* Scientific metadata */
        "variableMeasured": ...,
        "hasPart": {"@id": "data.dat"}
      }, ......
    ]}

Extremely difficult to agree how to include in the common schema! At least, save as files?

### Vocabulary system overview

Diagram labels:

- MatVoc-aware applications: Contribute (GUI); Import (API); Read MatVoc; Query MatVoc with SPARQL
- MatVoc-unaware applications: Authority file; Periodic batch
- NIMS Users; Importer script
- Query Service; JSON API; SPARQL API
- MatVoc.nims; NIMS MDPF applications
- "Wikibase Ecosystem": Federated SPARQL queries across distributed endpoints
- Materials Vocabulary Wiki

### Bird's eye view of our vocabulary now

Figure: Materials Vocabulary Wiki.

### Vocabulary-based metadata transfer between systems

Diagram labels: Data Collection: Research Data Express; Common message format; deposit; UploadReq.json; Materials Data Repository; RDE local dictionary; MatVoc; synced; API query; refer; MatVoc ID: Q386.

M. Ishii, H. Nagao, A. Matsuda, K. Tanabe, H. Yoshikawa, 23rd XAFS Forum (2020)

### Automatic data collection using WiFi-SD cards towards FAIR data

Diagram labels: Instruments; Offline PC; Wi-Fi; IoT Server; Wi-Fi capable programmable SD card; Will launch Dec, 2020; LAN; Closed data / Open data; Materials Data Repository; Research Data Management Storage Server; Will launch 2021 March.

### Data Collection System for efficient measurement data collection and automatic conversion

- Binary raw data is converted to a text file by a text conversion program made with the cooperation of the instrument manufacturers.
- The converted numeric data is easier for humans to read.
- A Python program (parser) interprets the converted numeric data and visualizes it as a graph.
- A program developed in-house handles format conversion and controlled metadata, producing XML metadata and "readable" data.
- "Schema-on-Read" data registration with user-customizable formats.
- Spectral data is automatically sparse-modeled (figure label: Ni3p).
- Data Conversion Tools on GitHub.com/nims-dpfc

### Collecting experimental data

Stages: Raw binary from instruments; Format conversion; Readable; Identifiable; Metadata annotated; Auto-analyzed/visualized.

Raw binary example: 1011001010100011

Converted table:

| X | Y |
| --- | --- |
| 1 | 76 |
| 2 | 77 |
| 3 | 82 |

Annotated file:

    HEADER
    Energy: 150.0 eV
    Mode:   T
    Date:   20190528
    X      Y
    1.00   76.01
    1.10   77.25
    1.20   81.95
    Sample:  A01
    Process: PQR
    Operator: Me

Diagram labels: Automatic data transfer using IoT; Electronic lab notebook; DB; Data Collection System.

### Datasets, publications, and images published on the Materials Data Repository

Materials Data Repository, transition stage 1. Launched June, 2020.

- Old publications repo and old images repo: import
- New publications and new datasets: user deposit
- Metadata types: Datasets, Publications, Images

### Integration <–> FAIRable <–> Accumulation

Materials Data Repository, transition stage 2. Launched June, 2020.

- Applications to collect and store raw data at RDM: Data Collection System; Cloud storage (Google Drive, Dropbox)
- Applications to publish and analyze research data via directory service: Data-mining applications; Visualization applications (Researchers directory with ORCID integration, https://samurai.nims.go.jp/)
- materials vocabulary
- DOI (planned)

### Summary

- Materials Data Platform, DICE as FAIRable platform
- Public and internal services will be launched 2020-2021
- DICE as R&D platform for academia and industries
- DICE aims to be a Japan-wide data hub with:
  - a library of data models and metadata schemas for target materials
  - a data quality guideline (recommendation) that balances what data scientists need against what materials scientists can cope with

### Special Thanks to

- Dr. Takuya Kadohira
- Dr. Kosuke Tanabe
- Dr. Asahiko Matsuda
- Dr. Hideki Yoshikawa
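The "primary parameters" slide shows the same parameters twice: as a 2 x n key-value CSV and as a Schema.org-like JSON-LD file. The following is a minimal Python sketch of how one form might be turned into the other. It is not the DICE schema or any NIMS tool. The function name `primary_params_to_jsonld`, the two-column reading of "2 x n", the use of `PropertyValue` for each parameter, the `MediaObject` node for the data file, and the dataset name are all assumptions made for illustration.

```python
import csv
import io
import json


def primary_params_to_jsonld(csv_text, data_file, name):
    """Build a Schema.org-like JSON-LD dict from a two-column key-value CSV.

    Hypothetical helper: the layout is an illustrative assumption,
    not the published DICE schema.
    """
    variables = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        key, value = (cell.strip() for cell in row)
        # Assumption: each primary parameter becomes a schema.org PropertyValue.
        variables.append({"@type": "PropertyValue", "name": key, "value": value})

    return {
        "@context": "https://schema.org/",
        "@graph": [
            {
                "@id": "./",
                "@type": "Dataset",
                "name": name,  # bibliographic metadata
                "variableMeasured": variables,  # scientific metadata
                "hasPart": {"@id": data_file},
            },
            # Assumption: the data file is described as a separate node.
            {"@id": data_file, "@type": "MediaObject"},
        ],
    }


# Values taken from the slide; the dataset name is a placeholder.
example_csv = """Software,Gaussian09
Calculation,B3LYP
Basis set,cc-pVDZ
"""

doc = primary_params_to_jsonld(example_csv, "data.dat", "Example calculation")
print(json.dumps(doc, indent=2))
```

Keeping the parameters as free-form name/value pairs sidesteps the question the slide raises about agreeing on a common schema: nothing domain-specific has to be modeled up front, at the cost of weaker validation.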