# Fileset

[applsci-15-10511-with-cover.pdf](https://mdr.nims.go.jp/filesets/5b201afa-c36b-4a17-804c-56df35ed975a/download)

## Creator

[Michiko Yoshitake](https://orcid.org/0000-0002-0973-5666), [Takahiro Nagata](https://orcid.org/0000-0002-8591-2943)

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[A Method for LLM-Based Construction of a Materials Property Knowledge Graph: A Case Study](https://mdr.nims.go.jp/datasets/ff9d8c80-a789-4295-bf5c-5372ef11be63)

## Fulltext

A Method for LLM-Based Construction of a Materials Property Knowledge Graph: A Case Study

Michiko Yoshitake 1,2,* and Takahiro Nagata 1

1 National Institute for Material Science, Tsukuba 305-0047, Japan

2 MatQ-lab, Chiba 271-0092, Japan

\* Correspondence: yoshitake.michikol@nims.go.jp or materials.curationl@gmail.com; Tel.: +81-29-863-5496

Article. Appl. Sci. 2025, 15, 10511. https://doi.org/10.3390/app151910511

Special Issue: Applications of Natural Language Processing to Data Science (https://www.mdpi.com/journal/applsci/special_issues/8TIJ9V695W), edited by Prof. Dr. Vincenza Carchiolo and Dr. Michele Malgeri

Academic Editors: Vincenza Carchiolo and Michele Malgeri

Received: 29 July 2025; Revised: 16 September 2025; Accepted: 24 September 2025; Published: 28 September 2025

Citation: Yoshitake, M.; Nagata, T. A Method for LLM-Based Construction of a Materials Property Knowledge Graph: A Case Study. Appl. Sci. 2025, 15, 10511. https://doi.org/10.3390/app151910511

Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

### Abstract

In the field of materials science, experimental data or simulation results on material properties are often unevenly distributed. In addition to the vast unexplored material space, properties of lesser interest have not been measured even for well-studied materials, as exemplified by the discovery of the superconductivity of the long-known MgB2.
To overcome such challenges, utilizing relationships among material properties based on scientific principles can be beneficial. We have been constructing a knowledge graph of material property relationships using natural language processing techniques for years. Now, with the surprising development of large language models, constructing a knowledge graph has become much easier. This article explains what a knowledge graph of material property relationships is, presents several types of applications for the knowledge graph, and describes how the constructed knowledge graph can be implemented in machine learning for predicting material property values. We also demonstrate the construction of a knowledge graph of material property relationships and a search system using ChatGPT, without any programming, which will be made publicly available.

Keywords: materials property relationship; knowledge graph; graph search; data interpolation; generative AI

### 1. Introduction

Materials informatics was initially developed with numerical data such as electrical conductivity values and process temperatures, aiming to predict material property values (e.g., electrical conductivity) or to optimize process conditions such as chemical composition or heating temperature. Numerical data, including simulated data, are now in practical use in many industrial settings. For textual data, patents were the first category to be utilized in data science due to their relatively well-defined literary format.
The utilization of scientific papers and textbooks has lagged behind because of their unstructured format. Before the recent explosive development of generative large language models (generative LLMs), the main uses of LLMs in materials science were collecting and finding targeted reference documents from vast patent databases or scientific articles, and extracting numerical values from tables or texts in scientific articles to construct material databases.

With the emergence of generative LLMs, entity extraction from scientific papers can now be performed without coding. Data analysis by generative LLMs [1], prompt engineering for chemistry [2], AI scientists [3], and the development of AI agents [4] have all emerged. All these techniques are based on LLMs.

The above developments are mostly based on a question–answer type of response from generative LLMs. However, to grasp the overall picture or place an issue in context, other forms of visualization can be more powerful. One of these represents relationships among pieces of information using a knowledge graph. A knowledge graph is a type of information representation in which the connections between data are emphasized, rather than numerical values or the contents of text. It is explained in Wikipedia [5] as, “A knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data.
Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the free-form semantics or relationships underlying these entities.” An example of a knowledge graph is shown in Figure 1, where the relationships among leading global semiconductor companies are visualized. This was obtained by inputting the following prompt to ChatGPT-o4-mini: “Please generate an image of a knowledge graph showing the relationships among leading global semiconductor companies, including their suppliers.” Company names are categorized as Foundries (blue), Equipment Suppliers (orange), Wafer Material Suppliers (yellow), Chip Designers and Fabless Companies (green), and Memory Manufacturers (light green).

Figure 1. An example of a knowledge graph, which was obtained by inputting the following prompt to ChatGPT-o4-mini: “Please generate an image of a knowledge graph showing the relationships among leading global semiconductor companies, including their suppliers.”

The output shown in Figure 1 may differ from one input to the next, for example in the company names that appear in the graph. However, Figure 1 is intended only to show what a knowledge graph is, so reproducibility is not an issue here.

There are many articles on general aspects of knowledge graphs, such as general guides to knowledge graphs [6], use cases [7,8], industrial applications [9], and applications in data analytics [10]. Following the emergence of generative language AI models, various types of knowledge graphs have been created in the field of materials science [11–15], including those generated from correlations between numerical data and those constructed through entity extraction from scientific papers.

In addition to constructing knowledge graphs with LLMs, such graphs can also be used to adjust generative LLMs to specific domains.
A well-known technique for domain adaptation is RAG (Retrieval-Augmented Generation) [16,17], in which a generative LLM references vectors derived from domain-specific texts. Here, similarity between texts is measured in a vector space. As an alternative, another technique has recently emerged that adapts LLMs using the connections among items in knowledge graphs, known as either KAG (Knowledge Augmented Generation) [18] or graph RAG (Graph Retrieval-Augmented Generation) [19].

While the applications of LLMs and knowledge graphs have advanced in general, practical use in materials science appears to be limited mainly to simple applications of LLMs for searching and extracting numerical information. Knowledge graphs are not familiar to materials scientists. For these techniques to be used in practice in materials science, it is key that users can apply them to their own targets without coding skills. In this article, we describe the usability of knowledge graphs, especially for material property relationships, and show an example of how to construct knowledge graphs of the material property relationships that interest users with generative LLMs and without coding.

### 2. Knowledge Graph on Materials Property Relationships

Among knowledge graphs in materials science, we focused on a graph representing relationships between material properties that are derived from scientific principles, rather than from correlations between numerical data. Because these relationships are based on scientific principles rather than empirical data, they can be applied to materials that have not yet been reported. Figure 2a shows a schematic example of a knowledge graph of material property relationships, where only relationships, not property values, are stored as information.

Figure 2. (a) Schematic example of a material property relationship knowledge graph. Nodes shown: formation enthalpy, HOMO-LUMO gap, band gap, optical adsorption/transmission spectra, electrical conductivity, redox potential, Gibbs energy, adsorption energy, permittivity, thermal conductivity. (b) Schematic example of relationship extraction from texts for the connectivity between electrical conductivity and thermal conductivity. The textbook excerpt shown in the figure reads: “Next, we return to a statement which we made in Chapter 18. We pointed out there that Wiedemann and Franz observed that good electrical conductors are also good thermal conductors. We are now in a position to compare the thermal conductivity (21.12) with the electrical conductivity (7.15),” followed by the textbook's equation (21.13), which is not reproduced here.

We proposed the utilization of a knowledge graph of material property relationships [20] well before the release of ChatGPT, or even that of BERT [21], which uses a transformer model with self-attention and was the state-of-the-art model prior to ChatGPT. Techniques for utilizing the knowledge graph were patented under NIMS [22]. The construction of the knowledge graph on material property relationships using NLP, along with the development of a prototype search system, was carried out in collaboration with a company using several textbooks in materials science [23]. Due to the technical limitations of natural language processing at the time, the prototype was built using older techniques such as morphological analysis and parsing. However, this approach has one significant advantage that holds even in the current era of generative AI such as ChatGPT: it allows citation of references for any relationship within the documents used to construct the graph, even when the source texts are not in HTML format. The method for extracting relationships from texts is schematically illustrated in Figure 2b. For example, from the phrase “compare the thermal conductivity (21.12) with the electrical conductivity”, the material properties “electrical conductivity” and “thermal conductivity” are identified as related.
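The example above can be illustrated with a minimal sketch that links two properties whenever both names occur in the same sentence and keeps the sentence as evidence. This is a simplified stand-in, not the prototype's method (which relied on morphological analysis and parsing); the dictionary `PROPERTY_NAMES`, the function `extract_pairs`, and the naive sentence splitter are assumptions made for illustration.

```python
import itertools
import re

# Hypothetical dictionary of material property names (these become graph nodes).
PROPERTY_NAMES = ["electrical conductivity", "thermal conductivity", "band gap"]


def extract_pairs(text, property_names=PROPERTY_NAMES):
    """Return {(name_a, name_b): [sentences]} for names that share a sentence."""
    evidence = {}
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        found = sorted(name for name in property_names if name in lowered)
        for pair in itertools.combinations(found, 2):
            evidence.setdefault(pair, []).append(sentence.strip())
    return evidence


TEXT = (
    "We are now in a position to compare the thermal conductivity (21.12) "
    "with the electrical conductivity (7.15). The band gap is discussed later."
)

pairs = extract_pairs(TEXT)
for (name_a, name_b), sentences in pairs.items():
    print(name_a, "<->", name_b, "|", sentences[0])
```

Keeping the source sentence with each pair mirrors the advantage noted above, namely that every edge can be traced back to the text it came from.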
In the prototype, after preprocessing the documents (e.g., converting PDFs to text, removing unnecessary content such as page numbers, and performing entity matching), phrases that connect two material property names were automatically extracted by NLP techniques, using a predefined dictionary of material property names that serve as graph nodes. The two material properties were then connected, creating an edge between the two nodes.

Examples of search results in the prototype system [23] are shown in Figure 3. In general, there are two types of searches in graph data: path-based and connectivity-based. Figure 3a shows an example of a path search, where the shortest path between two material properties, “dielectric constant” and “thermal expansion coefficient”, is displayed. An example of a connection search is shown in Figure 3b, where material properties connected to “dielectric constant” are sequentially retrieved (dielectric constant -> polarizability -> hardness -> many properties shown in pale yellow nodes). Path search is particularly useful when addressing trade-offs between two properties: identifying material properties that lie along the paths between the two can provide insights into why these two properties are in a trade-off, or how such a trade-off might be avoided. Connection search, on the other hand, is useful for identifying material properties that can potentially substitute for the original material properties, especially in cases where no numerical data are available.

Figure 3. Examples of search results in the prototype system: (a) result of shortest path search between two material properties “dielectric constant” and “thermal expansion coefficient”; (b) sequential connection search (dielectric constant -> polarizability -> hardness). (Figures from [24]). The Japanese label at the top of the left box in both figures indicates that the colors below denote different materials science categories.
The Japanese label second from the top of the left box in Figure 3a means “trade-off”; checking the small square next to it displays hints for avoiding the trade-off. Details are in ref. [24].

It should be noted that textbooks, not scientific articles, are used to construct this knowledge graph. Textbooks describe material property relationships based not on numerical data but on scientific principles or scientific reasoning. These scientific principles were, of course, established with the help of experimental numerical data in the past, but textbooks give scientific reasons to explain the relationships. Relationships backed by scientific reasoning provide great advantages: (1) the relationships can be applied beyond the materials for which numerical data exist; (2) the materials to which a relationship does not apply are clearly defined. Furthermore, clicking an edge in the prototype knowledge graph shows the textbook sentence that describes the corresponding relationship [23], so users can see the reasoning behind the relationship and the limits of its applicability by reading the corresponding paragraphs in the textbooks.

### 3. Application of Materials Property Knowledge Graph to Data Science

The simplest application of the materials property knowledge graph is probably the estimation of missing numerical data using known relationships. For example, there is a linear relationship between electrical conductivity and thermal conductivity in metallic materials, because electrons are the main heat carrier, as illustrated in Figure 2b from a physical principles perspective. Indeed, a strong linear correlation is observed in the experimentally obtained numerical data for these two properties (data taken from [25–31]), as shown in Figure 4.

One of the biggest problems in materials informatics using numerical values is the lack of numerical data.
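For metals, the linear relationship mentioned above is commonly written as the Wiedemann–Franz law, κ = LσT, where L = π²k_B²/(3e²) is the Sommerfeld value of the Lorenz number. The sketch below uses it to estimate a missing thermal conductivity from a measured electrical conductivity. The function name and the copper input value are illustrative assumptions, not taken from the article, and real alloys can deviate from the ideal Lorenz number.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant in J/K (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)

# Sommerfeld value of the Lorenz number, in W Ohm K^-2.
LORENZ = (math.pi ** 2 / 3) * (K_B / E_CHARGE) ** 2


def estimate_thermal_conductivity(sigma, temperature):
    """Estimate thermal conductivity in W/(m K) via the Wiedemann-Franz law.

    sigma: electrical conductivity in S/m; temperature: absolute temperature in K.
    """
    return LORENZ * sigma * temperature


# Illustrative input: approximate room-temperature conductivity of pure copper.
sigma_cu = 5.96e7
kappa_cu = estimate_thermal_conductivity(sigma_cu, 293.0)
print(f"Lorenz number: {LORENZ:.3e} W Ohm K^-2")
print(f"Estimated thermal conductivity: {kappa_cu:.0f} W/(m K)")
```

The estimate lands close to the commonly tabulated thermal conductivity of copper (roughly 400 W/(m K)), which is the kind of agreement that makes the relationship usable when only the electrical conductivity has been measured.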
One typical case of such missing data concerns thermal conductivity. Alloying Cu with small amounts of additive elements is a common technique for improving the strength of electric wire in thin film form. Although the electrical conductivity of the alloyed material is measured (because it is used as an electric wire), there are no numerical data on thermal conductivity for alloys in which each additive is present only in a small amount. This is because measuring thermal conductivity is far more difficult than measuring electrical conductivity, owing to the difficulty of thermal isolation from the environment, and as a result experimental data are generally lacking for alloys containing many minor additive elements. However, thermal conductivity is important in electric wire because the heat generated by the electric current should be released through thermal conduction to avoid heat damage to surrounding devices. Therefore, it is advantageous to know the scientific relationship between electrical conductivity and thermal conductivity, which holds without numerical data on related materials, so that the thermal conductivity values of alloyed materials with small additives can be estimated. The biggest merit of using a material property relationship knowledge graph is that the relationship extends to materials for which there is little or no numerical data, even in similar or related materials.

This example demonstrates that the interpolation of numerical values is possible using the materials property knowledge graph, even when experimental data are nearly absent.

Figure 4. Correlation of experimental values between electrical conductivity and thermal conductivity.

Another application of the materials property knowledge graph is identifying a material property that can be used as a descriptor for machine learning. Since the values of “work function” in carbon-deficient transition metal carbides are not available in a database and are difficult to measure, the author attempted to find an alternative property.
Initially, Vickers hardness was identified as a viable alternative and was successfully used to explain and predict the work function values of carbon-deficient transition metal carbides [32]. However, through the materials property knowledge graph, the author discovered that “density” could also serve as an alternative property, as it is connected to “work function” via “bonding energy”, as shown in Figure 5a. In fact, experimental data revealed that density variations with carbon deficiency closely resemble the variations in hardness [33], as illustrated in Figure 5b,c, suggesting that density could also be an alternative descriptor for the work function.

Work function is a very important property in electronics, and transition metal carbides are good electrode materials, whose function is determined by the work function. Transition metal carbides often have a carbon deficiency, and work function values are greatly influenced by it. However, the influence has been measured and calculated for only two materials, TaC0.5 and HfC0.6: the work function increased with carbon deficiency for TaC, while it decreased for HfC. Even the direction of the influence is opposite between the two. This situation arises from the difficulty of reliable and repeatable work function measurements, and it makes correlation analysis of numerical data impossible. However, with the help of the logic in the calculations for the two materials, we could find a reason for the influence of carbon deficiency on work function values and relate it to hardness [32]. With the help of the knowledge graph, the work function can be related to density, a more commonly measured property than hardness. Hardness is measured only when researchers are interested in mechanical applications, while density (mostly calculated from XRD) is measured for almost all crystalline materials regardless of the researchers' interest. Therefore, far more numerical data are available for density than for hardness.
There are many similar advantages in referring to the knowledge graph of material property relationships to overcome the shortage of experimental and simulated data.

Figure 5. (a) The result of the sequential connection search relating to “work function” shows a connection with “density” (Figure from [24]). The Japanese label at the top of the left box in Figure 5a indicates that the colors below denote different materials science categories. (b) Correlation between relative density and carbon deficiency (stoichiometry). (c) Correlation between hardness and carbon deficiency (stoichiometry).

The materials property graph can also be applied to the properties of organic materials. Figure 6a demonstrates how “solubility parameter” relates to other properties, showing that “glass transition” is one of the connected properties [24]. From a polymer database [34], it is evident that the solubility parameter correlates strongly with the glass transition temperature, as shown in Figure 6b. This correlation suggests that the “glass transition temperature” can be used as an alternative descriptor for the “solubility parameter”, and vice versa.

Figure 6. (a) Results of sequential connection search from “solubility parameter” showing connection with “glass transition” (Figure from [24]). The Japanese label at the top of the left box in Figure 6a indicates that the colors below denote different materials science categories. (b) Correlation of experimental values between glass transition temperature and solubility parameter. The green circle is a guide to the eye for the correlation. Blue and red dots denote neat resin and composite/compound, respectively.

### 4. Generation of Materials Property Knowledge Graph and Its Search Tool Using ChatGPT

The prototype developed in collaboration with the company is no longer available after the termination of the partnership.
Therefore, the author attempted to develop a new materials property knowledge graph and a corresponding search system with the help of ChatGPT. Here, we demonstrate how a knowledge graph and its search tool can be generated.

To begin with, a list of material property names should be prepared. It is technically possible to construct a material property knowledge graph without such a list by performing entity extraction and relation extraction simultaneously, simply by asking a generative AI, “Extract material property names and their mutual relationship from the uploaded textbook and make a knowledge graph from the extracted relationship”. However, preparing a list in advance results in much cleaner and more accurate knowledge graphs, with fewer errors and less noise. To create this list, generative language AIs such as ChatGPT can be employed by providing several examples of material property names (e.g., “electrical conductivity” and “thermal conductivity”). In this demonstration, we prepared a list of one hundred material property names by asking ChatGPT, “Output a list of hundred material property names such as ‘electrical conductivity’ and ‘dielectric constant’ as a text file.” The names in the list were then manually checked to see whether they were appropriate. The names in the list vary slightly from one input to the next; however, all names appeared to be appropriate for the purpose of the demonstration. The list used for the demonstration (List S1) is attached as a Supplementary file of the article. It is, of course, also possible to make a list manually without the help of generative LLMs. Users should upload a list that matches their own interests when generating their knowledge graph.
In the second step, the text file containing the prepared material property names and a PDF file (or multiple files) of materials science textbooks, in which relationships among material properties are described, were uploaded to ChatGPT-4o (or a more advanced model). The following prompt was used: “Please extract pairs of material property name listed in the uploaded xxx file (name of text file of the list) among which there are relationship described in the uploaded xxxx.pdf (name of textbook file). Output the extracted pairs as a csv file to be downloaded.” In this demonstration, [35] was used as a textbook, and its PDF file was uploaded. Figure 7 shows an example of this prompt and the response. A downloadable CSV file was successfully generated. In the CSV file, “property name 1” and “property name 2” are stored in the first and second columns. We eliminated pairs in which property name 1 and property name 2 are identical (this step is unnecessary if the prompt instructs the model to exclude them). The resulting CSV file is also attached as a Supplementary file (List S2). Repeating the input showed that ChatGPT-4o always output the same pairs. We also asked ChatGPT-4o to output, in addition to the two property names, the sentences in which it found the relationship between the two material properties. These sentences allow us to confirm the correctness of the extracted pairs of material property names. Furthermore, the extracted relationships were exactly the same as those obtained with the prompt in Figure 7 (which does not ask for the sentences in which the LLM found the relationship). Therefore, the accuracy and reproducibility appear very good for this task, possibly because it does not require the model to “generate” but only to “compare” words in two files.
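The clean-up step described above (removing pairs whose two names are identical) can be sketched in a few lines. The column headers “property name 1” and “property name 2” follow the description in the text; the sample rows, the function name `load_property_pairs`, and the extra normalization (lower-casing and treating A–B and B–A as the same undirected pair) are assumptions added for illustration.

```python
import csv
import io


def load_property_pairs(csv_text):
    """Return a sorted list of unique, undirected (name_a, name_b) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs = set()
    for row in reader:
        name_a = row["property name 1"].strip().lower()
        name_b = row["property name 2"].strip().lower()
        # Skip empty cells and self-pairs such as (x, x).
        if not name_a or not name_b or name_a == name_b:
            continue
        # Store each undirected pair once, in alphabetical order.
        pairs.add(tuple(sorted((name_a, name_b))))
    return sorted(pairs)


# Hypothetical stand-in for the CSV file returned by the LLM.
SAMPLE_CSV = """property name 1,property name 2
electrical conductivity,thermal conductivity
Thermal conductivity,electrical conductivity
dielectric constant,dielectric constant
dielectric constant,polarizability
"""

pairs = load_property_pairs(SAMPLE_CSV)
for name_a, name_b in pairs:
    print(name_a, "--", name_b)
```

In practice the same function could be pointed at the downloaded file by reading its contents with `open(path, encoding="utf-8").read()`.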
OCR did not appear to cause any problems in this extraction, possibly because the PDF used here is provided as a structured PDF.

Figure 7. Example of the prompt and the response for making a database of material property relationships. Annotations in the figure: the PDF file used for the extraction of material property relationships; the list of material property names; the instruction to extract pairs of material property names whose relationship is described in the PDF file and to output the list of pairs as a CSV file; the downloadable CSV file.

Next, by uploading the CSV file and prompting ChatGPT to draw a network of material properties using the properties in each pair as nodes and their relationships as edges, a graph such as the one shown in Figure 8a was produced. Figure 8b shows the response to a prompt requesting all shortest paths between “glass transition temperature” and “thermal expansion coefficient” (the generated graph is undirected). Once the relationship graph is generated, both types of searches, path search (as in the example above) and connectivity search, are easily performed. In this case, since no specific modules were designated, ChatGPT used the Python package “networkx” by default, as indicated in the response when the source code for the analysis was displayed. It should be noted that the arrangement of nodes and the lengths of edges differ for each input. However, the nodes and edges are the same, because their information is given as a file and the graph is generated by the Python package, where no statistics of generative LLMs are involved. To obtain Figure 8a, the additional instructions to locate the two nodes “glass transition temperature” and “thermal expansion coefficient” at the left and right sides and to use orange for the two nodes were given.

Figure 8. (a) Knowledge graph representation of the material property relationships obtained by ChatGPT in Figure 7.
(b) ChatGPT's output upon the instruction to draw a graph of all shortest paths between “glass transition temperature” and “thermal expansion coefficient”.

To enable others to conduct similar knowledge graph searches, a MyGPT instance called “property graph-EN” was created. MyGPT is a customizable GPT service available to ChatGPT Plus ($20/month) users, allowing them to build original GPT models with file upload capabilities. In the “property graph-EN” developed for searching material property relationships, users are prompted to choose either path search or connectivity search. Once a search type is selected, users are asked to input one or two material properties of interest (two for path search, one for connectivity search). The CSV file storing the pairs of related material properties is uploaded to the “property graph-EN”, together with written instructions to output a partial graph according to the user's request. Figure 9a shows an example output of a path search between “glass transition temperature” and “thermal expansion coefficient”, and Figure 9b shows an example of a connectivity search centered on “dielectric constant”. Since the original knowledge graph (CSV file) is the same, Figures 8b and 9a show the same paths, as expected, though the arrangement of nodes is different. The “property graph-EN” will be made publicly accessible upon the publication of this article.

Figure 9. Output of the MyGPT “property graph-EN”: (a) all shortest paths between “glass transition temperature” and “thermal expansion coefficient” and (b) connectivity around “dielectric constant”.

It should be noted that the “property graph-EN” was created solely for demonstration purposes and is not intended for commercial use.
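The two search types offered by the tool, path search and connectivity search, can be sketched on an undirected graph built from property pairs. ChatGPT reported using networkx for this; the version below is a standard-library stand-in based on breadth-first search so that it runs without extra packages. The function names are assumptions, and the edges are hypothetical examples (only the chain dielectric constant -> polarizability -> hardness is taken from the Figure 3b description).

```python
from collections import defaultdict, deque


def build_graph(pairs):
    """Build an undirected adjacency map from (name_a, name_b) pairs."""
    graph = defaultdict(set)
    for name_a, name_b in pairs:
        graph[name_a].add(name_b)
        graph[name_b].add(name_a)
    return graph


def all_shortest_paths(graph, source, target):
    """Path search: return every shortest path between two properties."""
    if source not in graph or target not in graph:
        return []
    dist = {source: 0}
    parents = defaultdict(list)  # predecessors lying on some shortest path
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)
    if target not in dist:
        return []

    def backtrack(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in parents[node] for path in backtrack(p)]

    return sorted(backtrack(target))


def connected_within(graph, start, depth):
    """Connectivity search: map each property within `depth` hops to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue
        for nxt in graph[node]:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]
    return seen


# Hypothetical edges for illustration only.
PAIRS = [
    ("glass transition temperature", "free volume"),
    ("free volume", "thermal expansion coefficient"),
    ("glass transition temperature", "heat capacity"),
    ("heat capacity", "thermal expansion coefficient"),
    ("dielectric constant", "polarizability"),
    ("polarizability", "hardness"),
]

graph = build_graph(PAIRS)
paths = all_shortest_paths(graph, "glass transition temperature", "thermal expansion coefficient")
nearby = connected_within(graph, "dielectric constant", 2)
for path in paths:
    print(" -> ".join(path))
print(nearby)
```

Because the searches are deterministic graph operations on a fixed edge list, repeated runs give the same nodes and edges, which is consistent with the reproducibility noted for Figures 8b and 9a.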
This custom GPT was developed for researchers who are interested in utilizing material property relationships but who may lack the programming experience needed to create their own knowledge graphs as described in published references and GitHub repositories. The number of nodes and edges in this version is limited. Significantly more advanced analyses are possible using additional features and functions covered by patents held by a Japanese government institution. For commercial use of the patented technologies, including the utilization of material property relationship knowledge graphs in machine learning applications, a license agreement is required, as outlined in the relevant patents.

It should be noted that a similar knowledge graph can be constructed starting from a different list of material property names for a specific domain, such as magnetism or ferroelectrics. In such cases, textbook(s) suitable for the chosen domain should be uploaded. Users can then make their own knowledge graphs for their interests, including specific properties in magnetism, ferroelectrics, properties related to chemical reactions, and so forth, without any coding. Furthermore, knowledge graphs can be generated not only for material property relationships but also for other relationships. For example, making a list of chemical compounds and asking generative LLMs to find pairs of different chemical compounds in the list from the uploaded literature (in this case, not necessarily textbooks) would result in a knowledge graph of chemical reactions. By selecting appropriate chemical compounds for the list and appropriate literature to upload, various knowledge graphs can be generated for each purpose.

Furthermore, users who subscribe to a paid plan and learn how to use a MyGPT-like service (Google, Anthropic, and other companies that supply generative LLM services also provide similar functions) can search such a custom-tuned knowledge graph as if operating ordinary software.

### 5. Discussion

The most important issue in knowledge graph generation is determining what types of information should be chosen as nodes (entities) and what types of relationships should be defined as edges (connections). Depending on how nodes and edges are defined, completely different knowledge graphs can be generated from the same information source, leading to diverse applications.

The knowledge graph of material property relationships demonstrated here was constructed with material property names as nodes and relationships among material properties as edges, where the relationships were taken from a textbook of materials science. Therefore, it is oriented toward scientific principles and does not focus on specific material categories such as metals, oxides, or organic materials. Scientific principles are generally applicable to all materials, regardless of their categories or intended applications. This makes the material property relationship knowledge graph applicable to estimating missing numerical data of material properties by interpolation and to substituting one material property with another, as described in the examples in Section 3. Due to this applicability, the knowledge graph can serve as background infrastructure for machine learning. When data for a certain material property are missing, automatic interpolation is possible using alternative material properties that are linked to the intended property. If many data points are missing for a given property, the knowledge graph can be used to automatically identify an alternative descriptor for use in machine learning. It is also possible to reduce the number of input descriptors (material properties) by identifying and removing properties that are strongly correlated, thereby eliminating redundancy.
All these processes can be handled in the background, without the user being explicitly aware of the underlying operations.

The material property relationship knowledge graph can also be used in a graph RAG framework to support large language models in materials science contexts, as is commonly practiced in generative AI in general [36–38].

The material property relationship knowledge graph is unique among knowledge graphs in materials science because it is based on scientific principles rather than numerical data, so its relationships are generally applicable to all materials. Because of this uniqueness, it enables users to think beyond existing frameworks without being constrained by the uneven distribution of experimental or computational data across materials. A schematic representation of this advantage is shown in Figure 10 [39]. As mentioned in Section 3 in connection with thermal conductivity and work function, the materials for which numerical data on a specific property have been reported are very limited. This limitation is shown schematically as a green plane in Figure 10, where the whole material search space is expressed as a three-dimensional space. While machine learning is a powerful tool for finding the optimum material when numerical data are available, there is a huge space that has not been explored and for which no numerical data exist. Occasionally, revolutionary materials, such as the famous YBCO-like oxide superconductors [40], have been discovered outside the green exploration space. However, it is known that these revolutionary materials do not lie outside already known scientific principles.
Therefore, the knowledge graph of material property relationships based on scientific principles has the potential to help users think in an interdisciplinary way and search for materials beyond the exploration space.

Figure 10. Knowledge graph of material property relationships helps one think without being constrained by the uneven distribution of materials with available experimental or computational data. (Labels in the figure: trial-and-error in this area; local optimum; already explored space; exploration space for machine learning with numerical data (inside the plane); expand exploration space based on scientific knowledge; machine learning is a tool to efficiently find local minimum.)

This knowledge graph also has potential for many applications beyond those already mentioned. For example, it can help identify previously unconsidered applications of known materials: if material-a is used for application-A because of favorable characteristics in property-x, property-x has a strong positive correlation with property-y, and property-y is known to be important for application-B, then it can be suggested that material-a may also be suitable for application-B. Although the same inference might be derived using a material–application-specific knowledge graph, the material property relationship graph allows for broader and more flexible exploration. Other applications involving machine learning or algorithmic inference are also possible.

Because nodes and edges can easily be added to or removed from graph-structured data, combining the material property relationship knowledge graph with other domain-specific knowledge graphs can be especially powerful [32]. For example, a new subgraph related to “ferroelectricity” can be merged to extend the knowledge graph’s relevance to ferroelectric materials. Likewise, a graph detailing characterization methods for various material properties, where each method is connected to the material property it can measure, can be added [41].
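The merging described above amounts to taking the union of nodes and edges. The following minimal sketch illustrates the idea with plain Python dictionaries; the three small graphs, their node names, and the helper functions are hypothetical examples chosen for illustration and do not reproduce the graphs used in this study.

```python
# Hypothetical example graphs, each stored as {node: set(neighbours)}.
PROPERTY_GRAPH = {
    "electrical conductivity": {"thermal conductivity"},
    "dielectric constant": {"refractive index"},
}
FERROELECTRIC_SUBGRAPH = {
    "spontaneous polarization": {"Curie temperature", "dielectric constant"},
}
# Characterization methods, each linked to a property it can measure.
METHOD_GRAPH = {
    "four-point probe": {"electrical conductivity"},
    "Kelvin probe": {"work function"},
}


def merge_graphs(*graphs):
    """Return the union of several undirected graphs.

    Nodes with the same name are treated as the same entity, so a shared
    property name is what stitches the graphs together."""
    merged = {}
    for graph in graphs:
        for node, neighbours in graph.items():
            merged.setdefault(node, set())
            for other in neighbours:
                merged[node].add(other)
                merged.setdefault(other, set()).add(node)
    return merged


def methods_for(merged, prop, method_nodes):
    """List the characterization methods directly linked to a property."""
    return sorted(n for n in merged.get(prop, ()) if n in method_nodes)


merged = merge_graphs(PROPERTY_GRAPH, FERROELECTRIC_SUBGRAPH, METHOD_GRAPH)
methods = methods_for(merged, "electrical conductivity", set(METHOD_GRAPH))
```

Because merging relies on matching node names, the property names in each graph would need to be normalized to a common vocabulary (for example, the property name list used for extraction) before the union is taken.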
Many other types of simple, specific knowledge graphs can be integrated depending on the intended use.

To compare the knowledge graph of material property relationships with other knowledge graphs in materials science, it is useful to divide knowledge graphs into two categories. The first is the rather general knowledge graph, with different kinds of entities as nodes and various types of relationships as edges, such as those in [13–15]. For example, entities in the “material”, “application”, and “property” categories may serve as nodes, with edges representing different relationships: between the “Cu” node in the “material” category and the “electrical conductor” node in the “application” category, or between the “Li battery” node in the “application” category and the “ion conductivity” node in the “property” category. This type of knowledge graph collects a broad range of material-related information from a wide range of scientific articles. These graphs are often massive, containing approximately 70,000 to 163,000 nodes and between 0.7 and 5.4 million edges. In such graphs, material properties are directly connected to applications, synthesis methods, characterization techniques, and materials, but not to other material properties as in the knowledge graph demonstrated in this article. These massive graphs are effective for searching for alternative materials or processes among known options and are often used as background data structures in machine learning or graph RAG applications [42]. Since the information in these general graphs is extensive and rapidly evolving, frequent updates are desirable for practical use.

The second type of knowledge graph is constructed with specific nodes and relationships, like the knowledge graph demonstrated here. Other examples of this type are the knowledge graphs representing relationships between specific catalytic reactions and catalysts extracted from scientific articles [43,44]. In this type of knowledge graph, the type of information to be used as nodes and edges is clearly defined.
Therefore, they are relatively simple, focus on specific issues of interest, and are useful for targeted applications.

The knowledge graph of material property relationships also has clear, narrow definitions of nodes and edges and focuses on specific information for its construction. The biggest difference between the knowledge graph demonstrated here and others of both types is that its relationships were extracted from textbooks, which provide scientific reasoning for the relationships. No similar knowledge graph appears to have been reported. Since the relationships are scientifically reasoned and do not rely on correlations in numerical data, they extend to groups of materials for which no numerical data exist, which is the largest advantage of this knowledge graph. This advantage yields two main merits in applications, as in the examples described in Section 3 and above. The first is that when there are not enough numerical data on an input property for machine learning, the graph can suggest (1) the possibility of interpolation using other numerical data or (2) the possibility of replacing the input property with another property for which more numerical data are available. The second merit is that it enables thinking beyond existing frameworks, without being constrained by the uneven distribution of experimental or computational data across materials. This kind of thinking is very important for discovering revolutionary materials.

Regarding the method of knowledge graph construction, most knowledge graphs in materials science were constructed before ChatGPT-4o and primarily used entity extraction techniques with different LLMs and pre-treatments, implemented through complicated coding. To construct a huge, general knowledge graph on materials science, such complicated coding with LLMs and pre-treatments may be unavoidable.
However, this study revealed that a simple knowledge graph can be constructed with a current generative LLM without coding, provided that the information to be extracted as nodes is specifically defined or a list of possible nodes is given in advance, the type of information to be used as edges is specifically defined, and the reference from which relationships are to be extracted is supplied to the generative LLM. It is good news that materials scientists can construct knowledge graphs of their own interests, for example regarding properties in magnetism, ferroelectrics, or properties related to chemical reactions, without coding.

6. Conclusions

Although numerical data analysis has played a central role in materials informatics, the shortage of numerical data has long been an issue in practice. Thanks to rapidly emerging LLMs, extracting information from textual data and retrieving scientific articles are now possible with high accuracy. With the remarkable development of generative LLMs, the construction of a knowledge graph, even without coding, has become feasible. In this article, we discussed the features of a knowledge graph on material property relationships extracted from textbooks, where the relationships are scientifically reasoned; we demonstrated an example of a material property relationship knowledge graph, its applications for estimating missing numerical data and replacing a material property with another one, and the code-free construction of such a graph using ChatGPT and a graph search system with MyGPT.

The main advantage of the knowledge graph based on scientific reasoning is that its relationships apply beyond the materials for which numerical data already exist. Two merits arise from this advantage. One is the possibility of interpolating the value of the required property from that of another property, or of replacing the input property with another property in data analysis.
The other is the possibility of searching for materials beyond existing frameworks, without being constrained by the uneven distribution of experimental or simulation data across materials.

Regarding the construction of the knowledge graph of material property relationships with generative LLMs without coding, it was revealed that construction with few errors and good reproducibility is possible when a list of possible nodes and a reference from which to extract relationships are provided. This result suggests that materials scientists who are not familiar with programming can make their own knowledge graphs of interest.

7. Patents

The techniques for the utilization of knowledge graphs of material property relationships are patented; all of the following patents have been granted. JP: Nos. 6719748, 6876344, 7169685, 7352313, 7142325, 7026973, 7111354, 7082414, 7186436, 7396619, 7411977, and 7352315. US: Nos. US 11,138,772 B2, US 11,163,829 B2, US 11,449,552 B2, US 11,544,295 B2, and US 12,105,741 B2. EP: No. EP3812923 B1.

Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app151910511/s1, List S1: list of materials property names; List S2: list of extracted materials property name pairs having relationship.

Author Contributions: Project administration and administration, T.N.; all other contributions, M.Y. All authors have read and agreed to the published version of the manuscript.

Funding: This research was partially funded by the Japan Science and Technology Agency (JST) Mirai Program ‘Materials Exploration space Expansion Platform (MEEP)’ [Grant No. JPMJMI21G2].

Data Availability Statement: The original contributions presented in this study are included in the article.
Further inquiries can be directed to the corresponding author.

Conflicts of Interest: The authors declare no conflicts of interest.

References

1. Irvine, D.J.; Halloran, L.J.S.; Brunner, P. Opportunities and limitations of the ChatGPT Advanced Data Analysis plugin for hydrological analyses. Hydrol. Process. 2023, 37, e15015. https://doi.org/10.1002/hyp.15015
2. Hatakeyama, S.K.; Yamane, N.; Igarashi, Y.; Nabae, Y.; Hayakawa, T. Prompt engineering of GPT-4 for chemical research: What can/cannot be done? Sci. Technol. Adv. Mater. Meth. 2023, 3, 2260300. https://doi.org/10.1080/27660400.2023.2260300
3. Lu, C.; Lu, C.; Lange, R.T.; Foerster, J.; Clune, J.; Ha, D. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv 2024, arXiv:2408.06292v3. https://doi.org/10.48550/arXiv.2408.06292
4. Shir, O.M. Towards AI Research Agents in the Chemical Sciences. ChemRxiv, 23 January 2024. https://doi.org/10.26434/chemrxiv-2024-lf2xx
5. Wikipedia. Knowledge Graph. Available online: https://en.wikipedia.org/wiki/Knowledge_graph (accessed on 25 July 2025).
6. Dilmegani, C. In-Depth Guide to Knowledge Graph: Use Cases 2025. AIMultiple, 10 July 2025. Available online: https://research.aimultiple.com/knowledge-graph/ (accessed on 25 July 2025).
7. Tesfaye, L. Top Graph Use Cases and Enterprise Applications (with Real World Examples). Enterprise Knowledge Newsletter, 22 February 2023. Available online: https://enterprise-knowledge.com/top-graph-use-cases-and-enterprise-applications-with-real-world-examples/ (accessed on 25 July 2025).
8. Shakudo. Top 9 Knowledge Graphs Use Cases. Available online: https://cdn.prod.website-files.com/625447c67b621ab49bb7e3e5/67a3c0688035b75e2f4ca37a_pdf-knowledge%20graph%20use%20cases.pdf (accessed on 25 July 2025).
9. Sajid, H. 20 Real-World Industrial Applications of Knowledge Graphs. Wisecube, 16 November 2022. Available online: https://www.wisecube.ai/blog/20-real-world-industrial-applications-of-knowledge-graphs/ (accessed on 25 July 2025).
10. Mishram, C. Popular and Unique Knowledge Graph Use Cases for Data Analytics. SCIKIQ, 23 February 2023. Available online: https://scikiq.com/blog/popular-and-unique-knowledge-graph-use-cases-for-data-analytics/ (accessed on 25 July 2025).
11. Mrdjenovich, D.; Horton, M.K.; Montoya, J.H.; Legaspi, C.M.; Dwaraknath, S.; Tshitoyan, V.; Jain, A.; Persson, K.A. propnet: A Knowledge Graph for Materials Science. Matter 2020, 2, 464–480. https://doi.org/10.1016/j.matt.2019.11.013
12. Zhao, X.; Greenberg, J.; McClellan, S.; Hu, Y.-J.; Lopez, S.; Saikin, S.K.; Hu, X.; An, Y. Knowledge Graph-Empowered Materials Discovery. In Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 15–18 December 2021. https://doi.org/10.1109/BigData52589.2021.9671503
13. Venugopal, V.; Pai, S.; Olivetti, E. The Largest Knowledge Graph in Materials Science: Entities, Relations, and Link Prediction through Graph Representation Learning. In Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, LA, USA, 28 November–9 December 2022; Available online: https://openreview.net/forum?id=xyJ_0-WCIZN (accessed on 23 July 2025).
14. Ye, Y.; Ren, J.; Wang, S.; Wan, Y.; Razzak, I.; Hoex, B.; Wang, H.; Xie, T.; Zhang, W. Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model. In Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Vancouver, BC, Canada, 9–15 December 2024.
15. Venugopal, V.; Olivetti, E. MatKG: An autonomously generated knowledge graph in Material Science. Sci. Data 2024, 11, 217. https://doi.org/10.1038/s41597-024-03039-z
16. Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; Yih, W.; Rocktäschel, T.; et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv 2021, arXiv:2005.11401v4.
17. IBM Newsletter. What Is Retrieval-Augmented Generation? 22 August 2023. Available online: https://research.ibm.com/blog/retrieval-augmented-generation-RAG?ref=blog.zatrok.com (accessed on 25 July 2025).
18. Liang, L.; Sun, M.; Gui, Z.; Zhu, Z.; Jiang, Z.; Zhong, L.; Qu, Y.; Zhao, P.; Bo, Z.; Yang, J.; et al. KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation. arXiv 2024, arXiv:2409.13731v3.
19. Procko, T.T.; Ochoa, O. Graph Retrieval-Augmented Generation for Large Language Models: A Survey. In Proceedings of the 2024 Conference on AI, Science, Engineering, and Technology (AIxSET), Laguna Hills, CA, USA, 30 September–2 October 2024; pp. 166–169. https://doi.org/10.1109/AIxSET62544.2024.00030
20. Yoshitake, M. Searching System on Network of Various Materials Properties for Materials Curation. In Proceedings of the 63rd JSAP Spring Meeting, Tokyo, Japan, 29–22 March 2016. Presentation #21p-S322-2.
21. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805. https://doi.org/10.48550/arXiv.1810.04805
22. NIMS patents: See patents section.
23. Yoshitake, M.; Kawano, H. Material Curation® Support System: Prototype. Jxiv 2023. (In Japanese) https://doi.org/10.51094/jxiv.246
24. Yoshitake, M. Materials Curation® Support System: Case studies. Jxiv 2023. https://doi.org/10.51094/jxiv.391
25. Wikipedia. List of Thermal Conductivities. Available online: https://en.wikipedia.org/wiki/List_of_thermal_conductivities (accessed on 25 July 2025).
26. Wikipedia. 1370 Aluminium Alloy. Available online: https://en.wikipedia.org/wiki/1370_aluminium_alloy (accessed on 25 July 2025).
27. Thermtest. Materials Thermal Properties Database. Available online: https://thermtest.com/thermal-resources/materials-database (accessed on 25 July 2025).
28. Engineering ToolBox. Thermal Conductivity of Metals and Alloys: Data Table & Reference Guide. Available online: https://www.engineeringtoolbox.com/thermal-conductivity-metals-d_858.html (accessed on 25 July 2025).
29. Wikipedia. Titanium. Available online: https://en.wikipedia.org/wiki/Titanium (accessed on 25 July 2025).
30. Wikipedia. Tungsten. Available online: https://en.wikipedia.org/wiki/Tungsten (accessed on 25 July 2025).
31. Wikipedia. Platinum. Available online: https://en.wikipedia.org/wiki/Platinum (accessed on 25 July 2025).
32. Yoshitake, M. Generic trend of work functions in transition-metal carbides and nitrides. J. Vac. Sci. Technol. 2014, A32, 061403. https://doi.org/10.1116/1.4901014
33. Yoshitake, M. Tool for Designing Breakthrough Discovery in Materials Science. Materials 2021, 14, 6946. https://doi.org/10.3390/ma14226946
34. PoLyInfo. National Institute for Materials Science (NIMS). Available online: https://polymer.nims.go.jp/ (accessed on 25 July 2025).
35. Callister, W.D., Jr.; Rethwisch, D.G. Materials Science and Engineering: An Introduction, 8th ed.; John Wiley & Sons: Hoboken, NJ, USA, 2010.
36. Yoshitake, M. Utilizing Knowledge on Scientific Principles on Material Properties for Materials R&D. J. Surf. Anal. 2019, 26, 134–135. https://doi.org/10.1384/jsa.26.134
37. Lü, J.; Wen, G.; Lu, R.; Wang, Y.; Zhang, S. Networked Knowledge and Complex Networks: An Engineering View. IEEE/CAA J. Autom. Sin. 2022, 9, 1366–1383. https://doi.org/10.1109/JAS.2022.105737
38. Gao, Y.; Xiong, Y.; Gao, X.; Jia, K.; Pan, J.; Bi, Y.; Dai, Y.; Sun, J.; Wang, H. Retrieval-Augmented Generation for Large Language Models: A Survey. arXiv 2023, arXiv:2312.10997v1.
39. Zhang, Q.; Chen, S.; Bei, Y.; Yuan, Z.; Zhou, H.; Hong, Z.; Dong, J.; Chen, H.; Chang, Y.; Huang, X. A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models. arXiv 2025, arXiv:2501.13958v1.
40. Cava, R.J. Oxide Superconductors. J. Am. Ceram. Soc. 2000, 83, 5–28. https://doi.org/10.1111/j.1151-2916.2000.tb01142.x
41. National Institute for Materials Science, Japan. Retrieval System and Retrieval Method. Patent JP: No. 7186436, 9 December 2022. (In Japanese)
42. Ye, Y.; Ren, J.; Wang, S.; Wan, Y.; Razzak, I.; Hoex, B.; Wang, H.; Xie, T.; Zhang, W. Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model. arXiv 2024, arXiv:2404.03080v5.
43. Gao, Y.; Wang, L.; Chen, X.; Du, Y.; Wang, B. Revisiting Electrocatalyst Design by a Knowledge Graph of Cu-Based Catalysts for CO2 Reduction. ACS Catal. 2023, 13, 8525–8534. https://doi.org/10.1021/acscatal.3c00759
44. Behr, A.S.; Chernenko, D.; Koßmann, D.; Neyyathala, A.; Hanf, S.; Schunk, S.A.; Kockmann, N. Generating knowledge graphs through text mining of catalysis research related literature. Catal. Sci. Technol. 2024, 14, 5699–5713. https://doi.org/10.1039/D4CY00369A

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.