# Fileset

[ing5006.pdf](https://mdr.nims.go.jp/filesets/b5f5a66a-7de3-4dba-bae1-acf2a1e192d1/download)

## Creator

[Masashi Ishii](https://orcid.org/0000-0003-0357-2832), [Asahiko Matsuda](https://orcid.org/0000-0001-5989-027X), Koichi Sakamoto, Shohei Yamashita, [Yasuhiro Niwa](https://orcid.org/0000-0001-5808-5594), [Yasuhiro Inada](https://orcid.org/0000-0001-5772-4788)

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Global cross-database search system for X-ray absorption spectra](https://mdr.nims.go.jp/datasets/1eda8f10-5a5b-467c-bf5b-fbd840a203b0)

## Fulltext

Global cross-database search system for X-ray absorption spectra

Research papers, J. Synchrotron Rad. (2025). 32, ISSN 1600-5775, https://doi.org/10.1107/S1600577525002206

Received 7 October 2024. Accepted 11 March 2025. Edited by R. Ingle, University College London, United Kingdom.

Keywords: International XAFS DB portal; cross-database search; terminology; ontology; semantics.

Published under a CC BY 4.0 licence.

Masashi Ishii (a)\*, Asahiko Matsuda (b), Koichi Sakamoto (a), Shohei Yamashita (c), Yasuhiro Niwa (c) and Yasuhiro Inada (d)

(a) Center for Basic Research on Materials, National Institute for Materials Science (NIMS), 1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan; (b) Materials Data Platform, National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan; (c) Institute of Materials Structure Science, High Energy Accelerator Research Organization (KEK), 1-1 Oho, Tsukuba, Ibaraki 305-0801, Japan; (d) College of Life Sciences, Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577, Japan. \*Correspondence e-mail: ishii.masashi@nims.go.jp

While the importance of a systematic overview of scientific data and the demand for data integration are increasing, data capitalization and confidentiality are emerging as competing concerns. X-ray absorption spectroscopy has a strong tradition of data sharing, as cross-referencing data enhances the detailed understanding of the obtained spectra. While physically integrating databases in various formats is impractical, a system has been successfully developed that allows cross-searching among Japanese, US and European databases in cyberspace. This achievement is made possible through the realization of ‘vocabulary unification’ and ‘knowledge unification’ on a global scale, implemented in publicly accessible endpoints.
This paper provides a summary of the concepts of terminology, ontology and semantics for X-ray spectroscopy behind this system, and presents a pilot case study along with future directions for data integration in synchrotron radiation science.

### 1. Introduction

Currently, there are limited examples of global-scale integrated management for materials data, most of which are closely tied to industrial applications and are typically highly confidential. One notable example of effective integrated scientific data management is the Protein Data Bank (PDB) (PDB, 2024). The pioneering efforts in data sharing that began in an era when digitization seemed almost inconceivable have been thoroughly documented (Strasser, 2019); therefore, a detailed review is unnecessary here. Although systematically storing and sharing materials data following the PDB model poses significant challenges, this study has developed a practical approach for data sharing while protecting the rights of each data holder. This approach has been implemented as a web-accessible database portal (IXDB, 2024). The portal focuses on spectral data from a widely used synchrotron radiation technique, X-ray absorption fine structure (XAFS) (Kincaid & Eisenberger, 1975; Rehr & Albers, 2000). With this portal, XAFS users can search across databases in Japan, the USA and Europe equally, and can immediately access these linked databases.

As a foundational work for this study, it is essential to reference the MDR XAFS DB project (Ishii et al., 2021). XAFS is a synchrotron radiation experiment that precisely provides chemical bonding states and local structures at the atomic level.
Since the atomic-level local information is inherently less confidential and the data interpretation can be enriched by comparing it with other spectra, there is a potential demand for data sharing. Against this background, the MDR XAFS DB project was launched in 2018 to aggregate XAFS data from leading synchrotron-radiation-related institutes in Japan and make it available as a database on the Materials Data Repository (MDR) of the National Institute for Materials Science (NIMS).
Since the publication of a paper on this initiative (Ishii et al., 2023), the number of participating institutions has expanded to six, and the database now provides access to 2263 XAFS spectra. The database features 56 unique absorption elements and 74 absorption edges. Importantly, the specimen names, which were not consistent when the data were provided, have been standardized through the NIMS Materials Vocabulary management system, MatVoc (MatVoc, 2024). This standardization ensures equal data findability, unifies accessibility, and eliminates institutional discrepancies.

The next goal is to extend this cross-search functionality globally, which will require addressing several challenges, starting with the international sharing of MatVoc. The target databases for this global cross-search include MDR XAFS DB, XASLIB in the USA (XASLIB, 2024), SSHADE/FAME (Kieffer & Testemale, 2024) and LISA XAS Database (Puri, 2024) in Europe. The number of spectra to be integrated into this worldwide framework includes 277 from XASLIB, 517 from SSHADE/FAME and 48 from LISA XAS Database; combined with the 2263 spectra from the MDR XAFS database, the total reaches 3104.

The web application developed for this cross-database search is henceforth referred to as the International XAFS Database Portal (IXDB). The practical construction policy of IXDB is as follows:

(i) Cross-searching all data across the four databases using two consistent terms, absorption edge and specimen name, that are uniformly included in XAFS measurement metadata.

(ii) To respect the rights and functions of each database, IXDB will only display links to the databases and their providing institutions, without hosting any data itself.

(iii) Data retrieval and processing are performed by external shared endpoints, creating a framework that allows anyone to easily build a customized user interface (UI) and helping to increase data findability through economical portals.

### 2. Definition of academic requirements for IXDB

The academic requirements necessary for realization of IXDB are as follows:

(i) Introduction of ‘vocabulary unification’ to standardize database content (terminology).

(ii) Introduction of ‘knowledge unification’ to harmonize various data sources (ontology).

(iii) Development of protocols for sharing vocabulary and knowledge (semantics).

The requirements outlined here are not specific to IXDB alone, but can serve as general guidelines for systems designed for extensive data collaboration and utilization. The following subsections provide a detailed description of each of these requirements.

#### 2.1. Vocabulary unification (terminology)

In recent data utilization platforms, attaching metadata to experimental data has become standard practice. It is essential to unify metadata items (keys) for reuse of experimental data with reproducibility and reliability (Ravel & Newville, 2016). For XAFS, unified metadata may encompass the operating conditions of the storage ring in synchrotron radiation facilities, the insertion device, the optical configuration of the monochromator, focusing and higher-order light elimination mirrors, and the detailed conditions of the signal detection system surrounding the specimen (XAFS Metadata Schema, 2024). On the other hand, as outlined in the construction policy, the only keys used for cross-database searches in IXDB are absorption edge and specimen name, both of which are consistently recorded in XAFS measurements. The limited keys indicate that the aim of such searches is to retrieve a broad array of spectra based on a few common conditions, rather than pinpointing a single spectrum with highly detailed conditions. The essential challenge here is not the unification of keys but rather the standardization of their descriptions (values).
Specifically, specimen names must be standardized across institutions; otherwise, IXDB will function more like a traditional individual specimen search than an effective cross-database search. Vocabulary unification can be achieved through a dictionary that maps various local specimen names to unique material names.

#### 2.2. Knowledge unification (ontology)

Although there are variations in database services, when dealing with XAFS spectral data, there should be a common physical concept, referred to as ‘XAFS knowledge’, that is independent of the service styles. This XAFS knowledge can be made ontologically machine-understandable. Obviously, it is not easy to rigorously describe the XAFS knowledge involved in the X-ray excitation process. The knowledge here does not treat XAFS as a word in natural language, but rather means scientifically relating it to other vocabularies. In other words, it states that the absorption edge is determined by the element to be excited and the electron to be excited. From a materials science perspective, it also defines that there is an excited element in the target sample and that there are excited electrons within it. Strictly speaking, we would also need to define dynamic states during the excitation, and this would be useful for advanced measurements and spectral interpretation. However, such advanced concepts are not useful for the cross-searching dealt with in this article. Therefore, it is sufficient to set ontological constraints based on the vocabulary linkages described above. This machine-readable notation is explained in detail in Section 3. Additionally, there is attribution information (hereafter referred to as the ‘attribution knowledge’) that is crucial for data reuse, such as the URL to the spectrum and the name of the data holder. Practically, when selecting from a list of spectra, one may consider the
strengths and limitations of the data-providing institutions, such as the photon energy range of intense X-rays from their storage ring, which is generally known even if not explicitly stated in the metadata. While the URL linking to the spectrum is not experimental data itself, it is essential for data access. We refer to the combination of XAFS knowledge and attribution knowledge as ‘dataset knowledge’. Regardless of whether the database is MDR XAFS DB or another, dataset knowledge should be uniquely determined. This uniqueness enables universal cross-search across databases using a single algorithm.

#### 2.3. Sharing vocabulary and knowledge (semantics)

The vocabularies and knowledge discussed in Sections 2.1 and 2.2 must be shared and interoperable across various systems. Additionally, version forking of the actual data (instances) must be avoided. Ideally, both vocabulary and knowledge should be centralized in a master data endpoint that anyone can refer to, with updates from this endpoint being immediately reflected in all linked services, including IXDB. Once such endpoints are established, the connected service systems will operate as an economical and reliable network, always providing consistent data. More specifically, this means that a database (the ‘master’ in Fig. 1) is available on the Internet, accepting queries written in a semantic language and returning appropriate responses. Additionally, there is an open port (the ‘endpoint’ in Fig. 1) to receive these queries. In modern databases, queries are typically sent from the UI to the main database within a closed system through an application programming interface (API), with the results displayed on the UI. Now, considering that the database and UI are spatially separated and both are located on the Internet, the mechanism in Fig. 1 is easier to understand. Since all UIs on the Internet (the ‘services’ in Fig. 1) can access the common ‘master’ database and retrieve the original data at any time, there is no risk of data discrepancies.
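
The master/endpoint/service arrangement can be pictured with a toy model. The sketch below is only an illustration under simplifying assumptions: the class names `MasterEndpoint` and `Service` are hypothetical, the "query" is a plain dictionary lookup rather than a semantic query, and the relabelling of Q1426 is invented for the demonstration.

```python
class MasterEndpoint:
    """Single source of truth for vocabulary labels (the 'master' in Fig. 1)."""

    def __init__(self):
        self._labels = {}

    def update(self, qid, label):
        """Register or correct a label at the master."""
        self._labels[qid] = label

    def query(self, qid):
        """Stand-in for a semantic query arriving at the endpoint."""
        return self._labels.get(qid)


class Service:
    """A UI that keeps no copy of the data and always asks the master."""

    def __init__(self, name, endpoint):
        self.name = name
        self.endpoint = endpoint

    def show(self, qid):
        return f"{self.name}: {self.endpoint.query(qid)}"


master = MasterEndpoint()
master.update("Q1426", "Copper")
portal_a = Service("portal-a", master)
portal_b = Service("portal-b", master)

# A change made once at the master is visible to every service at once,
# because no service holds its own copy (the new label is invented here).
master.update("Q1426", "Copper (Cu)")
print(portal_a.show("Q1426"))
print(portal_b.show("Q1426"))
```

Because neither service stores labels locally, there is nothing to synchronize, which is the property the text describes as the absence of data discrepancies.
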
Consequently, IXDB functions as one such service, allowing multiple users to access data through semantic conversations via endpoints, which minimizes data management costs. While data sharing is often equated with publishing data on a public server, the mechanism illustrated in Fig. 1 represents what we consider true data sharing.

### 3. Implementation and discussion

This section describes the following three implementations corresponding to the requirements defined in Section 2 and the considerations obtained from them.

(i) XAFS dictionary creation.

(ii) XAFS knowledge creation.

(iii) Portal creation.

#### 3.1. XAFS dictionary creation

As discussed in Section 2.1, the role of the dictionary is to provide unique material names for various local specimen name inputs. For machine readability, unique material names are managed using IDs rather than text. A well known machine-readable material ID is the CAS registry number. Although the CAS number is widely used, the difficulty in nomenclature for different instances may result in multiple IDs for a single material, some of which have been reorganized or integrated over time. For example, the CAS numbers for FeOOH, which has polymorphic forms, are 1310-14-1 and 20344-49-4, both of which have had their numbers deleted or replaced in the past. This variability indicates that substance identification can differ across domains, highlighting the need for a dictionary specific to XAFS. Consequently, for international cross-database search, we have strengthened the NIMS XAFS DB Project Materials Dictionary within MatVoc (MatVoc, 2024), which was developed as part of the domestic initiative described in Section 1 (hereafter referred to as the ‘XAFS dictionary’). In MatVoc, when a set of representative names (labels), definitions (descriptions) and broader categories (upper classes) is registered, a vocabulary ID (a ‘QID’, written as Q followed by a number) categorized in the XAFS dictionary is assigned automatically.
Synonyms can also be registered, enhancing the system’s robustness against variations in search terms. At present, the XAFS dictionary contains 955 material names and over 9100 synonyms. Note that these vocabularies are defined within the MatVoc namespace (https://matvoc.nims.go.jp/entity/) and have shareable URIs (uniform resource identifiers), i.e. global IDs that can be accessed from anywhere. These URIs are used for semantic database searches in IXDB.

The XAFS dictionary forms a closed lexical space in MatVoc with the following four classes: Chemicals (Q2828), X-ray absorption edge (Q2487), Element (Q2392) and Electron (Q2823). For instance, ‘Chemicals (Q2828)’ indicates that the class is identified by the QID (Q2828) and is represented by the name ‘Chemicals’. This nomenclature allows us to view a conceptual hierarchy. The actual hierarchy is considerably more detailed; material names are subclasses of Chemicals (Q2828), and X-ray absorption edges are subclasses of X-ray absorption edge (Q2487). The implementation of these lower classes will be discussed in Sections 3.1.1 and 3.1.2, respectively.

Figure 1. A mechanism for sharing master data on a worldwide scale. The database is managed in one place, and each service can design its services economically.

##### 3.1.1. Implementation of lower classes for Chemicals (Q2828)

The three subclasses under Q2828 are Organic materials (Q714), Inorganic materials (Q715) and Biomaterials (Q3735). Fig. 2 illustrates a schematic diagram of the chemical hierarchy, including representative subclasses for each category. For the complete hierarchy, refer to MatVoc (https://matvoc.nims.go.jp/explore/en/results/Q2828).
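
To make the role of QIDs, synonyms and upper classes concrete, here is a minimal in-memory sketch of such a dictionary. It is not how MatVoc is implemented. The QIDs for the classes and for copper (Q1426) are taken from this paper, but placing copper directly under Inorganic materials, the listed synonyms, and forming a URI by appending the QID to the namespace are assumptions made for illustration. The names `Entry`, `resolve` and `ancestors` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

MATVOC_NS = "https://matvoc.nims.go.jp/entity/"


@dataclass
class Entry:
    qid: str
    label: str                      # representative name
    parent: Optional[str] = None    # QID of the upper class
    synonyms: Set[str] = field(default_factory=set)

    @property
    def uri(self) -> str:
        # Assumption: the shareable URI is the namespace followed by the QID.
        return MATVOC_NS + self.qid


ENTRIES = [
    Entry("Q2828", "Chemicals"),
    Entry("Q714", "Organic materials", parent="Q2828"),
    Entry("Q715", "Inorganic materials", parent="Q2828"),
    Entry("Q3735", "Biomaterials", parent="Q2828"),
    # Illustrative guess: copper's position in the hierarchy and its synonyms.
    Entry("Q1426", "Copper", parent="Q715", synonyms={"Cu", "銅"}),
]

BY_QID: Dict[str, Entry] = {e.qid: e for e in ENTRIES}
# Index every label and synonym, case-insensitively, to its QID.
BY_NAME: Dict[str, str] = {
    name.casefold(): e.qid for e in ENTRIES for name in {e.label, *e.synonyms}
}


def resolve(name: str) -> Optional[str]:
    """Map a local specimen name to a QID, or None if it is unknown."""
    return BY_NAME.get(name.strip().casefold())


def ancestors(qid: str) -> List[str]:
    """Walk the upper classes from the entry up to the root class."""
    chain = []
    parent = BY_QID[qid].parent
    while parent is not None:
        chain.append(parent)
        parent = BY_QID[parent].parent
    return chain


print(resolve("cu"), BY_QID["Q1426"].uri, ancestors("Q1426"))
```

Searching by QID rather than by text is what lets a Japanese synonym, an English name and a chemical formula all reach the same entry.
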
In particular, organic materials (Q714) are often restricted to organic compounds containing heavy metals that can be excited by hard X-rays, meaning that its subclasses do not cover organic chemistry in general. Despite this limitation, our strategy is to create a materials dictionary specifically for XAFS, and to explicitly link (bridge) it to other dictionaries that cover the entire range of materials. In fact, we previously reported (Ishii et al., 2023) that a bridge to PubChem, a comprehensive low-molecular-weight substance dictionary, can be established using skos:closeMatch. Here, skos is the Simple Knowledge Organization System, where predicates for describing a body of knowledge are compiled by W3C (SKOS, 2024). For more details, the definition is shown at https://www.w3.org/2004/02/skos/core#closeMatch.

Even for Inorganic materials (Q715), which have relatively simple and comprehensive molecular structures, the hierarchical structure of XAFS often does not align with the macroscopic structure of the material. Specifically, when observing impurities in bulk, XAFS, which is sensitive to local structures, focuses on the impurity rather than the extensive bulk surrounding it. For example, ‘Mouse lung exposed to CeO2 particles’ in SSHADE/FAME (Chaurand & Collin, 2017) would, from a medical point of view, be in the category of ‘mouse lung’ or ‘lung’. However, XAFS is focused on CeO2, specifically on the local structure of cerium within it. Consequently, it does not identify the lungs as an organ but assesses the detection capability in measuring the distribution of CeO2 in the lungs (Chaurand et al., 2018). In fact, no other lung data are available in the databases beyond this specimen, rendering cross-database searches ineffective for this particular context. On the other hand, 11 CeO2-related spectra, which are the target of the cross-search, are provided from 7 different databases.
Clearly, the XAFS perspective necessitates a different approach to cross-database search compared with the medical perspective. Here are a few examples. When examining the Co absorption edge of cobalt-doped aluminium oxide (Q3785) using XAFS (Vichery & Maurin, 2012), the local structure around cobalt may be classified as a metal oxide complex like cobaltate, Oxide_ate (Q736), despite Al2O3 being an oxide. In contrast, when analyzing chondrites (e.g. Garenne et al., 2013), the primary focus is on component ratios rather than local structures like additives, leveraging the high transmissivity of X-rays (Garenne et al., 2019). In this scenario, having an astronomical category, such as Carbonaceous chondrite (Q3744), is advantageous. For instance, 57 carbonaceous chondrite specimens constitute a sufficient category for cross-database searches in XAFS. Ultimately, dictionaries must be tailored to the research purpose and are not universally applicable. Of course, as with organic compounds, it may be possible to achieve a universal material classification by linking through skos:closeMatch. It is important to recognize that a purpose-oriented dictionary, rather than a general one, will facilitate cross-database searches for XAFS spectra.

##### 3.1.2. Implementation of lower classes for X-ray absorption edge (Q2487)

In the constructed XAFS dictionary, absorption edges are directly categorized under Q2487 and encompass the K-edge and L-edge, spanning from hydrogen to plutonium. For instance, the Cu K-edge, a standard in XAFS, is defined by Q2516. It is important to note that an absorption edge is defined by the combination of an absorption element and an inner-shell electron, with these vocabularies being classified as Element (Q2392) and Electron (Q2823), respectively (see the four hierarchical categories outlined in Section 3.1).
Although the absorption edges are fewer in number compared with material names, the key objective is to establish machine-readable descriptions that link absorption edges with electronic states, thereby integrating XAFS spectra into broader physical knowledge. Fig. 3 illustrates the machine-readable description (schema) applied to the Cu K-edge (Q2516), formatted according to the internationally standardized Resource Description Framework (RDF) (RDF, 2024). RDF is a semantic data representation framework that depicts information as subject–predicate–object combinations, known as ‘triples’.

Figure 2. Three subclasses under Q2828: Chemicals. Q714: Organic materials; Q715: Inorganic materials; Q3735: Biomaterials. Representative subclasses in each hierarchy are also shown.

A visual representation of the schema using a directed graph is shown in Fig. 4 of Section 3.2, which includes human-readable labels to enhance understanding of the predicates in these triples. For non-experts in semantics, the schema in Fig. 3 is explained as follows. It asserts that Q2516 (Cu K-edge) has attributes Q2421 (excited element: Cu) and Q2824 (excited electron: K-shell electron), with ‘absorption edge’, ‘excited element’ and ‘excited electron’ being defined in the MDR-XAFS ontology under the following namespace: https://dice.nims.go.jp/ontology/mdr-xafs-ont/Schema#. For example, for the absorption edge (mdr-xafs:AbsorptionEdge), the definition can be found at the following web-accessible URI: https://dice.nims.go.jp/ontology/mdr-xafs-ont/Schema#AbsorptionEdge.
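
For readers unfamiliar with triples, the statement about the Cu K-edge can be written out as plain subject, predicate, object tuples. This is a hedged sketch, not a reproduction of Fig. 3. The QIDs and the two namespaces come from the text, whereas the predicate local names `excitedElement` and `excitedElectron`, the use of `rdf:type` to attach the class, and the assumption that an entity URI is the MatVoc namespace plus the QID are guesses made for illustration.

```python
MATVOC = "https://matvoc.nims.go.jp/entity/"
MDR_XAFS = "https://dice.nims.go.jp/ontology/mdr-xafs-ont/Schema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each tuple is one triple: (subject, predicate, object).
# Predicate local names below are hypothetical, not copied from the ontology.
CU_K_EDGE_TRIPLES = [
    (MATVOC + "Q2516", RDF_TYPE, MDR_XAFS + "AbsorptionEdge"),
    (MATVOC + "Q2516", MDR_XAFS + "excitedElement", MATVOC + "Q2421"),
    (MATVOC + "Q2516", MDR_XAFS + "excitedElectron", MATVOC + "Q2824"),
]


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def objects(triples, subject, predicate):
    """Follow one edge of the directed graph: all objects for (subject, predicate)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(to_ntriples(CU_K_EDGE_TRIPLES))
print(objects(CU_K_EDGE_TRIPLES, MATVOC + "Q2516", MDR_XAFS + "excitedElement"))
```

Reading the list as a graph, Q2516 is a node with two outgoing edges, one to the element Cu (Q2421) and one to the K-shell electron (Q2824).
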
As will be shown later, the general physical knowledge of inner-shell excitations provides a data link between XAFS and other synchrotron radiation experiments.

#### 3.2. Creation of XAFS common knowledge

The dictionary discussed in Section 3.1 can be used to convert vocabularies from human-readable to machine-readable expressions. Given that databases have varying table structures and search algorithms, achieving cross-database search requires not only a unified vocabulary QID but also unified dataset knowledge. As described in Fig. 3, the representation of knowledge using RDF (‘triples’) can be illustrated mathematically as a directed graph. In other words, if the subject and object in a triple are represented as nodes and connected by a predicate as an edge, a chain of data can be expressed, and ultimately knowledge can be constructed. Just as arbitrary nodes and edges can be manipulated in graph theory, arbitrary information can be extracted from this knowledge graph. Fig. 4 illustrates the directed graph structure of the knowledge designed in this study; XAFS knowledge and attribution knowledge, as described in Section 2.2, are displayed on the left and right sides of the figure, respectively. The key points for knowledge design are as follows:

(i) We define the XAFS dataset creation process as ‘Work’. Here, both XAFS and attribution knowledge are associated with the Work.

(ii) In the attribution knowledge, URL, DOI and data holders are specified.

(iii) In this Work, the specimen is referred to as a ‘participant’ to emphasize that it is central to the study, rather than merely an object to be measured. This distinction clarifies that the primary aim of this Work is not the development of instruments or similar aspects.

(iv) Several roles are required to conduct the Work, one of which is the data holder role.
This role is fulfilled by the participating institution.

(v) The absorption edge is an outcome of the Work, and, as explained in Section 3.1.2, the XAFS knowledge specifying the absorption edge includes two attributes: the absorption element and the inner-shell electron. Therefore, this figure incorporates Fig. 3, which is represented using a directed graph.

This dataset knowledge meets the requirements necessary for cross-database search and is applicable to any XAFS database. The primary goal of introducing dataset knowledge is to enable batch searches of this unified structure using a single query based on the SPARQL Protocol and RDF Query Language (SPARQL) (SPARQL, 2024). This SPARQL query plays the role of extracting information from the directed graph mentioned above. This semantic schema can be expanded later without changing the query. For example, the crystal structure can be used as an attribute of the specimen. The ontological linkage with other vocabularies (ExPaNDS, 2024) will ultimately lead to the construction of a large scientific knowledge base.

#### 3.3. Creation of the portal

The IXDB portal has a straightforward yet practical role: providing links to databases worldwide. It is built on a Python-based application that enables anyone, anywhere, to quickly create similar services (see Fig. 1). Currently, IXDB is accessible to the public as an official service under the sub-domain
of the Japanese XAFS Society: https://ixdb.jxafs.org/. The main page is shown in Fig. 5. Users can perform cross-database searches by material name and absorption edge. Search results display the corresponding QID and representative name. By selecting the desired material name, users can obtain a URL link to the databases holding the data. This screen transition is familiar to XAFS users, ensuring that they will not experience confusion during operation. Since MatVoc includes Japanese synonyms, it is possible to search XAFS databases in Europe and the USA using Japanese terms. Although this serves as a symbolic demonstration of synonym search capability, cross-database searches are also feasible with English common names and chemical formulas. While an ideal solution would be a collaborative editing wiki system to enrich synonyms, its implementation remains a future challenge, given that the purpose-oriented dictionary discussed in Section 3.1 also requires control over polysemy.

Figure 3. An example of a machine-readable description of an absorption edge (Q2516: Cu K-edge).

Figure 4. Directed graph structure of ‘knowledge’ combining ‘XAFS knowledge’ (left side) and ‘attribute knowledge’ (right side).

#### 3.4. Maintaining and expanding the portal

While the services connected to the endpoints will no longer require data maintenance, managing the vocabulary and knowledge as a master resource will still be necessary.
Particularly, when aiming to extend cross-database searches beyond the four databases currently implemented to include others worldwide, substantial effort will be required. Specifically, the primary tasks will involve merging the names of measured samples, assigning new QIDs to new materials in MatVoc, and associating them with absorption edges. For instance, this process was completed in about a month for roughly 500 spectra from SSHADE/FAME. Ideally, this work could be carried out collaboratively by the community. To support this effort, a MatVoc wiki system is needed, along with clearly defined rules and guidelines. Our goal is to develop a collaborative system with the community to strengthen the data infrastructure.

### 4. Operation procedure of IXDB

Finally, we summarize the actual search flow in IXDB. The search flow consists of the following three steps:

(1) Machine-readable conversion of vocabulary (QID conversion).

(2) Cross-database search using knowledge.

(3) Indication of links to datasets and their data holders.

Fig. 6 is a schematic diagram of the system, including the query-and-response flows. The numbers in the figure correspond to the step numbers above. The principle of each process is described below.

(1) Machine-readable conversion of vocabulary (QID conversion). In many cases, different specimen names are used across databases. For example, ‘Cobalt-iron oxide’ in MDR XAFS DB corresponds to ‘CoFe2O4’ in SSHADE/FAME, ‘Sodium selenate’ in SSHADE/FAME is the same as ‘Na2SeO3’ in XASLIB, and ‘Pyrolusite’ in XASLIB is equivalent to ‘Manganese (IV) oxide’ in MDR XAFS DB. Even though it should be possible to search using any term, when individual databases are used separately, human knowledge is required to recognize that they are the same material. Advanced databases like SSHADE/FAME have internal synonym dictionaries to manage variations in search terms, but for the international cross-database search implemented here the created XAFS dictionary is shared among all target databases.
This conversion returns the representative name of the material to IXDB along with its QID for visibility, though the QID is the primary reference used in subsequent searches. In the example shown in Fig. 6, the representative name is ‘Copper’ and the QID is Q1426. The MatVoc SPARQL endpoint used for this QID conversion is: https://matvoc.nims.go.jp/graph/sparql. Appendix A gives an example of the curl (client for URL) command including SPARQL that obtains the representative name and corresponding QID for chemical formulas containing ‘Cu’.

Figure 5. International XAFS DB Portal as a service of Japanese XAFS Society (https://ixdb.jxafs.org/).

Figure 6. Schematic diagram of the IXDB system including query-and-responses flows.

(2) Cross-database search using ‘knowledge’. Once a QID is assigned, as described in (1), the SPARQL query used to search for knowledge is consistent across all databases, facilitating cross-database searches. In this study, we customized the SPARQL to return both the name of the data holder and links to the data for each QID. The developed knowledge including the MDR-XAFS ontology is stored on the NIMS public SPARQL endpoint, which is https://materials-open-rdf.nims.go.jp/sparql. Appendix B gives an example of the curl command including SPARQL that actually obtains the name of the data holder and links to the data for the Cu K-edge of copper. In essence, although all of the query descriptions here are represented using QIDs, the contents describe the procedure to retrieve information by tracing the directed graph in Fig. 4. Specifically, the steps are as follows:

(a) Select Works associated with the normalized name of copper (Q1426).
(b) These Works should output the absorption edge, where the excited atom is copper (Q2421), and the K-shell electron (Q2824) is excited.

(c) Finally, the data holder for each Work and the corresponding URLs where the specific data are located should be indicated.

This process enables the display of 21 links to spectra from all the databases involved in this project on the IXDB portal.

This procedure can be adapted to derive the material name by specifying a different absorption edge, such as the L3-edge (Q2599), or to generate a list of data provided by a specific data holder. This is a key advantage of semantic data representation, which allows the user to obtain the desired information as though engaging in the semantic conversation with the endpoint. The high degree of flexibility in designing SPARQL queries allows IXDB to incorporate additional functionalities. By generalizing the knowledge to inner-shell excitations, other measurement methods with similar physical processes to XAFS can be integrated into the same system for cross-database searches. For instance, IXDB currently demonstrates linkage with Hard X-ray Photoelectron Spectroscopy (HAXPES).

(3) Indication of links and data holding institutions. As a result of (2), IXDB provides links to spectra based on the user’s input of material name or absorption edge. For example, if the material name is specified as ‘silver’ (Q1292) alone, a list of 12 data holders and links will be displayed. If only the absorption edge is specified as ‘Fe K-edge’ (Q2513), a list of 291 spectrum data holders and links will be obtained. These links direct users to the relevant databases, with IXDB’s role being solely to make the data discoverable, while the individual databases manage the remaining functions related to data utilization. Users can access the data only after being redirected to the individual databases, ensuring that IXDB does not infringe upon the rights of data holders.
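
The retrieval in steps (a) to (c), and the material-only or edge-only searches just described, can be mimicked on a toy in-memory list. This is a sketch under stated assumptions, not the SPARQL query used by IXDB: the `Work` tuple and `find_links` are hypothetical names, the data holders and example.org URLs are invented, and the identifiers beginning with `QX-` are placeholders for QIDs that the text does not give. Only Q1426, Q1292, Q2516 and Q2513 come from the paper.

```python
from typing import List, NamedTuple, Optional, Tuple


class Work(NamedTuple):
    material: str   # QID of the participant (specimen)
    edge: str       # QID of the absorption edge output by the Work
    holder: str     # data holder (attribution knowledge)
    url: str        # link to the spectrum (attribution knowledge)


# Invented records; only the four real QIDs named above are from the text.
WORKS = [
    Work("Q1426", "Q2516", "Holder A", "https://example.org/a/cu-foil"),
    Work("Q1426", "Q2516", "Holder B", "https://example.org/b/cu-foil"),
    Work("Q1292", "QX-edge", "Holder A", "https://example.org/a/ag-foil"),
    Work("QX-material", "Q2513", "Holder C", "https://example.org/c/fe-sample"),
]


def find_links(
    works: List[Work],
    material: Optional[str] = None,
    edge: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return (data holder, URL) pairs; a criterion left as None is not applied."""
    hits = []
    for work in works:
        if material is not None and work.material != material:
            continue  # step (a): keep Works whose participant matches
        if edge is not None and work.edge != edge:
            continue  # step (b): keep Works that output the requested edge
        hits.append((work.holder, work.url))  # step (c): attribution only
    return hits


print(find_links(WORKS, material="Q1426", edge="Q2516"))
print(find_links(WORKS, material="Q1292"))
print(find_links(WORKS, edge="Q2513"))
```

The function returns only holders and links, never spectra, mirroring the policy that the portal makes data discoverable while the individual databases serve the data.
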
Moreover, it should be emphasized that (1) and (2) utilize endpoints external to IXDB, with IXDB responsible only for managing data linkage. Consequently, IXDB functions as a cost-effective system without database capabilities. The clear segmentation of roles and the complete systemic and spatial separation of ID conversion, search and display functions mean that not only IXDB but also other portals can be built and operated by anyone, anywhere, and can run simultaneously. Provided that the query design in the portals is accurate, any portal will display the most current information without discrepancies. For example, a service that has been reorganized from the perspective of X-ray photoelectron spectroscopy (XPS) rather than XAFS is being released on a trial basis at https://materiage.org/. Detailed specifications of the endpoints will be published in the MDR.

5. Summary and prospects

The International XAFS DB (IXDB) portal for global cross-database searches in X-ray absorption spectroscopy has been developed and made publicly accessible. This portal has achieved unification of vocabulary and knowledge, addressing variations in search terms and normalizing database structures, thus facilitating the findability of all XAFS data. The knowledge developed is machine-understandable for inner-shell excitation and can be extended to other synchrotron radiation data. Notably, IXDB has implemented cross-database searches for XAFS and HAXPES. This strongly indicates the potential for integrating searches across synchrotron radiation data and other measurement data by further extending the unification of vocabulary and knowledge.

APPENDIX A: An example of vocabulary unification using the MatVoc endpoint

An example of a curl command including SPARQL that obtains the representative name and corresponding QID for chemical formulae containing ‘Cu’. Line breaks have been inserted into the SPARQL for visibility.
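As a rough Python illustration of the kind of request involved (not the authors' curl command), the sketch below builds a SPARQL query and the corresponding HTTP GET URL for the MatVoc endpoint named in the text, and parses a response in the standard SPARQL JSON results format. The MatVoc data model is an assumption here: the sketch supposes that representative names are stored as `rdfs:label` and chemical formulae as `skos:altLabel`, and that the QID is the last path segment of the item URI. The function names and the example.org URI in the test data are hypothetical.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode

# Endpoint given in the text.
MATVOC_ENDPOINT = "https://matvoc.nims.go.jp/graph/sparql"


def build_name_query(fragment: str, limit: int = 20) -> str:
    """Build a query for items whose alias contains `fragment`.

    ASSUMPTION: names are rdfs:label and formulae are skos:altLabel;
    the real MatVoc schema may use different properties.
    """
    # Escape backslashes and double quotes for a SPARQL string literal.
    escaped = fragment.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
        "SELECT DISTINCT ?item ?name WHERE {",
        "  ?item rdfs:label ?name ;",
        "        skos:altLabel ?alias .",
        f'  FILTER(CONTAINS(STR(?alias), "{escaped}"))',
        "}",
        f"LIMIT {int(limit)}",
    ])


def build_get_url(endpoint: str, query: str) -> str:
    """URL-encode the query into the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


def parse_bindings(results: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Extract (QID, name) pairs from a SPARQL JSON results document.

    ASSUMPTION: the QID is the last path segment of the item URI.
    """
    rows = []
    for binding in results["results"]["bindings"]:
        uri = binding["item"]["value"]
        qid = uri.rstrip("/").rsplit("/", 1)[-1]
        rows.append((qid, binding["name"]["value"]))
    return rows


if __name__ == "__main__":
    print(build_get_url(MATVOC_ENDPOINT, build_name_query("Cu")))
```

The resulting URL could be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header; whether the endpoint serves that format has not been checked here.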
SPARQL can also be submitted from an online editor at https://matvoc.nims.go.jp/query/.

APPENDIX B: An example of semantic conversation using dataset knowledge via NIMS public endpoint

An example of a curl command including SPARQL that obtains the name of the data holder and links to the data for the Cu K-edge of copper. Line breaks have been inserted into the SPARQL for visibility. SPARQL can also be submitted from an online editor at https://materials-open-rdf.nims.go.jp/sparql/.

Acknowledgements

This effort is based on the help of many people from the participating institutions of the MDR XAFS DB. The core members of this project are: Kiyotaka Asakura, Masao Kimura, Masao Tabuchi, Yasuhiro Inada, Takahiro Matsumoto and Eiichi Kobayashi. We thank Matthew Newville, Mauro Rovezzi, Francesco d’Acapito and Alessandro Puri for their assistance in the construction of IXDB.

Conflict of interest

I declare that there are no conflicts of interest.

Data availability

The data for the implemented application, vocabulary dictionary and knowledge base can be accessed via the URLs given in the paper.

Funding information

This work is partly supported by the MEXT Program: Data Creation and Utilization-Type Material Research and Development Project (Digital Transformation Initiative Center for Magnetic Materials), Grant Number JPMXP1122715503, and by the Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP), the 3rd period of SIP ‘Materials Informatics Infrastructure Linkage and Human Resource Development for Fostering Material Unicorns’ (Funding agency: NIMS).

References

Chaurand, P. & Collin, B. (2017).
SSHADE database infrastructure, https://doi.org/10.26302/SSHADE/EXPERIMENT_PC_20180420_001.

Chaurand, P., Liu, W., Borschneck, D., Levard, C., Auffan, M., Paul, E., Collin, B., Kieffer, I., Lanone, S., Rose, J. & Perrin, J. (2018). Sci. Rep. 8, 4408.

ExPaNDS (2024). ExPaNDS-experimental-techniques-ontology, https://github.com/ExPaNDS-eu/ExPaNDS-experimental-techniques-ontology.

Garenne, A., Beck, P., Montes-Hernandez, G., Bonal, L., Quirico, E., Proux, O. & Hazemann, J. L. (2019). Meteorit. Planet. Sci. 54, 2652–2665.

Garenne, A., Bonal, L., Beck, P. & Proux, O. (2013). SSHADE database infrastructure, https://doi.org/10.26302/SSHADE/EXPERIMENT_LB_20191211_005.

IXDB (2024). International XAFS Database Portal, https://ixdb.jxafs.org/.

Ishii, M., Nagao, H., Tanabe, K., Matsuda, A. & Yoshikawa, H. (2021). MDR XAFS DB, https://doi.org/10.48505/nims.1447.

Ishii, M., Tanabe, K., Matsuda, A., Ofuchi, H., Matsumoto, T., Yaji, T., Inada, Y., Nitani, H., Kimura, M. & Asakura, K. (2023). Sci. Technol. Adv. Mater. Methods, 3, 2197518.

Kieffer, I. & Testemale, D. (2024). SSHADE: the Solid Spectroscopy database infrastructure, https://www.sshade.eu/doi/10.26302/SSHADE/FAME.

Kincaid, B. M. & Eisenberger, P. (1975). Phys. Rev. Lett. 34, 1361–1364.

MatVoc (2024). NIMS XAFS DB Project Materials Dictionary, https://matvoc.nims.go.jp/wiki/Item:Q713.

PDB (2024). Protein Data Bank, https://www.rcsb.org/#Category-welcome.

Puri, A. (2024). LISA XAS Database, https://zenodo.org/doi/10.5281/zenodo.10778068.

Ravel, B. & Newville, M. (2016). J. Phys. Conf. Ser. 712, 012148.

Rehr, J. J. & Albers, R. C. (2000). Rev. Mod. Phys. 72, 621–654.

RDF (2024). Resource Description Framework (RDF), https://www.w3.org/2001/sw/wiki/RDF.

SKOS (2024). SKOS Simple Knowledge Organization System Primer, https://www.w3.org/TR/skos-primer/.

SPARQL (2024). SPARQL Query Language for RDF, https://www.w3.org/2001/sw/wiki/SPARQL.

Strasser, B. J. (2019). Collecting Experiments: Making Big Data Biology. The University of Chicago Press.

Vichery, C.
& Maurin, I. (2012). SSHADE database infrastructure, https://doi.org/10.26302/SSHADE/EXPERIMENT_IM_20120926_001.

XAFS Metadata Schema (2024). xafs-db/xafs-schema, https://github.com/xafs-db/xafs-schema.

XASLIB (2024). IXAS X-ray Absorption Data Library, https://xaslib.xrayabsorption.org/.