# Fileset

[nimsweek2019.poster.pdf](https://mdr.nims.go.jp/filesets/6b30fa99-078a-4ca4-a7a9-87eb6393b3aa/download)

## Creator

[AMANO, Kou](https://orcid.org/0000-0002-8079-4941)

## Rights



## Other metadata

[tq : A Comprehensive Disciplinary Language for Materials Science](https://mdr.nims.go.jp/datasets/c7070aed-7374-4409-8d09-9c755d6fd174)

## Fulltext

tq: A Comprehensive Disciplinary Language for Materials Science

Kou Amano†, Koichi Sakamoto††

National Institute for Materials Science

### Introduction

Materials science is based on multi-scale and multi-physical disciplines (scientific discipline); therefore, in this field, there are many types of data, models, and terms with various meanings, making it difficult to operate on data under a unified discipline (data discipline).

However, a well-defined unified language that handles multimodal forms can help such operations.

Therefore, we are developing a language, named "tq", that can parse tree or graph structures, enabling the operation of several data formats, models, and dictionaries for materials science.

### Objective

tq should satisfy the following:

- parsing tree structures
- parsing graph structures
- searching a dictionary
- matching terms using a dictionary
- reforming unstructured data into structured data
- conversion to other well-known formats such as JSON
- matching or searching tree or graph structures
- Term Rewriting by Network Similarity (TRNS)
- daemonizing the dictionary system
- parallelizing

### The language

#### Short example

- Input: `#1$Op$Name($#1[1])`
- Command: `tq in=/dev/stdin -FT -Pin data=test.csv`
- Output: `#1$Op$Name($#1[1]@@#1$Op$Name(Length))`

| Token | Meaning |
|---|---|
| `#1` | label |
| `$Op$` | operator |
| `Name` | name |
| `$#1` | reference |
| `[1]` | data bind dimension |
| `@@` | bind mark |
| `#1$Op$Name` | bound object |
| `Length` | bound data |

#### Data structure

Table: Members of the data structure

Columns: Lv, Adr, PAd, Ref, LT, LN, Hpt, H, D, VC, VSt, Cj, NC

Values: `0 0 14153344 0 0 h 1 2 #1$Op$Name 0 0 11 1 14154608 14153344 14153344 -1 0 $#1[1] [1 1 Length 0 0`

### Parsing

#### Parsing tree

- Input: `A(B(#1C),$#1(D))`
- Output: `A(B(#1C),$#1@#1C(D))`

Interpreted structure (figure): tree with nodes A, B, #1C, $#1, D.

#### Parsing graph

##### Implicit graph

- Input: `A(B(#1C),$#1(D))`

Adjacency matrix:

| Row | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | `[:A:0]` | 1 | 2 | 3 | |
| 1 | | `[:B:1]` | 2 | | |
| 2 | | | `[:#1C:2]` | | |
| 3 | | | | `[:$#1:3→2]` | 4 |
| 4 | | | | | `[:D:4]` |

Interpreted structure (figure): graph with nodes A, B, C, D.

##### Explicit graph

- Input: `$G$($V$(#0,#1,#2),$E$($#0($#1,$#2),$#1($#0,$#2),$#2($#0,$#1)))`

Adjacency matrix (rows 6 to 14 of the 15 × 15 matrix; rows 0 to 5 are not present in the extracted text):

| Row | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | | | | 3 | 4 | | `[:$#0:6→2]` | 7 | 8 | | | | | | |
| 7 | | | | | | | | `[:$#1:7→3]` | | | | | | | |
| 8 | | | | | | | | | `[:$#2:8→4]` | | | | | | |
| 9 | | | 2 | | 4 | | | | | `[:$#1:9→3]` | 10 | 11 | | | |
| 10 | | | | | | | | | | | `[:$#0:10→2]` | | | | |
| 11 | | | | | | | | | | | | `[:$#2:11→4]` | | | |
| 12 | | | 2 | 3 | | | | | | | | | `[:$#2:12→4]` | 13 | 14 |
| 13 | | | | | | | | | | | | | | `[:$#0:13→2]` | |
| 14 | | | | | | | | | | | | | | | `[:$#1:14→3]` |

Interpreted structure (figure): graph; the surviving node labels are $#1 and $#2.

### Binding and reforming data

- Input: `(#1$1[2],#2$2[2],$3[3](#4$4[2]));$PI$($#1,Quantity($#4,$#2))`
- Data (one row per item): `Length,Weight`; `mm,kg`; `1,2`; `322,4`; `5,68`
- Output: `(((Length,Quantity(1,mm)),(Weight,Quantity(2,kg))),((Length,Quantity(322,mm)),(Weight,Quantity(4,kg))),((Length,Quantity(5,mm)),(Weight,Quantity(68,kg))))`

### Development status

#### Program construction

- tq parser (designated as "tq"): done
- converter: in development
- analyzer: TBD

#### Supported forms

- tq
- S (Lisp)
- JSON
- Wolfram

#### Performance

Table: Parsing performance @ E5-2650

| Program | Size (nodes) | Time (min:sec) | Memory (bytes) |
|---|---|---|---|
| tq (stable) | 124,653,854 | 0:45 | 26G |
| tq (exptl.) | 124,653,854 | 0:27 | 12G |
| jq | 124,653,854 | 2:25 | 36G |
| tq (stable) | 498,615,417 | 3:30 | 104G |
| tq (exptl.) | 498,615,417 | 1:55 | 51G |
| jq | 498,615,417 | 10:50 | 144G |

Table: Parsing and converting performance

| Form | Size (nodes) | Time (min:sec) | Memory (bytes) |
|---|---|---|---|
| JSON | 124,653,854 | 1:21 | 29G |
| JSON | 498,615,417 | 5:35 | 136G |

### Future plan

As a next step, we are restructuring the data structure of tq to allow parallelization. In the current structure, the tree structure and the node properties are tightly coupled, which makes parallelization difficult. We are attempting to divide the data structure into a tree structure and a node table.

### Conclusion

tq can handle various types of data in a uniform manner, especially in the field of materials science. Adopting the syntax of S-expressions, tq incorporates binding and node-referencing mechanisms to represent a graph structure that defines input and output data formats. Owing to this expressive power, users can write a set of rules that reform unstructured data (e.g. CSV) into an arbitrary format as needed, such as a tensor format for machine learning.
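
The CSV reforming described above (see "Binding and reforming data") can be illustrated outside tq with a short Python sketch. This is not tq's implementation and it does not interpret the tq rule syntax; it hard-codes the mapping that the example's rule expresses. It assumes the data file holds one row of names, one row of units, and then one row per record, a layout inferred from the example's output. The names `reform` and `DATA` are hypothetical.

```python
import csv
import io

# Data from the "Binding and reforming data" example. The row breaks are an
# assumption inferred from the example's output.
DATA = """Length,Weight
mm,kg
1,2
322,4
5,68
"""


def reform(text: str) -> str:
    """Reform CSV text (names row, units row, value rows) into a nested expression."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text))
        if row  # skip blank lines
    ]
    names, units, records = rows[0], rows[1], rows[2:]
    reformed = []
    for record in records:
        # Pair each value with its column name and unit: (name,Quantity(value,unit))
        pairs = [
            f"({name},Quantity({value},{unit}))"
            for name, unit, value in zip(names, units, record)
        ]
        reformed.append("(" + ",".join(pairs) + ")")
    return "(" + ",".join(reformed) + ")"


if __name__ == "__main__":
    print(reform(DATA))
```

Run as a script, this prints the same nested structure as the Output of the example. In tq the mapping would be written as a rule rather than coded by hand, so the sketch only shows the shape of the transformation.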