# Fileset

[TSTM-2025-0072_data.zip](https://mdr.nims.go.jp/filesets/12b4f626-f7e9-4e4f-96c4-8cfd2d7e4216/download)

## Creator

Masaki Imamura, Kazutoshi Takahashi

## Rights

[Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Non-negative matrix factorization analysis of spatially-resolved photoemission spectra for epitaxially grown graphene on SiC](https://mdr.nims.go.jp/datasets/1add5ee5-c498-4812-9b9b-36a476334e05)

## Fulltext

### TSTM-2025-0072_data/fig2b.csv

| Number of Base Vectors | Summation of Squared Error |
| --- | --- |
| 2 | 69164496.14606424 |
| 3 | 49025791.93100633 |
| 4 | 31877185.11278905 |
| 5 | 21005010.735295996 |
| 6 | 17615038.235243995 |
| 7 | 14470198.91312809 |
| 8 | 12290967.140212148 |
| 9 | 11529972.94138222 |
| 10 | 10448912.07284665 |
| 11 | 10084115.078691473 |
| 12 | 8965277.995861966 |
| 13 | 7873670.937406655 |
| 14 | 7030323.884355624 |
| 15 | 6900672.768780998 |
| 16 | 4990258.235634069 |
| 17 | 4855006.971976918 |
| 18 | 4605309.882153125 |
| 19 | 4492318.8190450175 |
| 20 | 4309051.080711518 |
| 21 | 4082484.911435497 |
| 22 | 3871536.330675771 |
| 23 | 3778701.8567163087 |
| 24 | 3621061.0370459883 |

### TSTM-2025-0072_data/fig3d.csv

| Position | W0_activation_ratio | W3_activation_ratio |
| --- | --- | --- |
| A | 3 | 78 |
| B | 18 | 58 |
| C | 26 | 48 |
| D | 46 | 43 |
| E | 65 | 19 |

### TSTM-2025-0072_data/fig4d.csv

| Position | W1_activation_ratio | W4_activation_ratio |
| --- | --- | --- |
| F | 60 | 0 |
| G | 29 | 40 |
| H | 16 | 60 |
| I | 52 | 12 |
| J | 34 | 31 |

### TSTM-2025-0072_data/README.txt

Minimal datasets to reproduce the line/point graphs in the manuscript "Machine learning based approach to ... NMF analysis of spatially-resolved ARPES of graphene on SiC" (STAM Methods, accepted).

#### Files

**fig2b.csv**

Reproduces Fig. 2(b): Summation of squared error between experimental ARPES spectra and NMF-reconstructed spectra, as a function of the number of NMF basis vectors (k = 2-24).

Columns:

- `Number of Base Vectors`: number of NMF components k
- `Summation of Squared Error`: sum over all measurement positions of the mean-squared reconstruction error (normalized intensity scale, see Methods)

**fig3d.csv**

Reproduces Fig. 3(d): contribution ratio of the activation vector H for the basis vectors W0 and W3 (NMF, k = 7) at the five representative positions A-E indicated in Fig. 3(a),(b) (x = 14.15-16.15 mm, z = 1.0 mm).

Columns:

- `Position`: map position label (A-E), corresponds to Fig. 3(a),(b)
- `W0_activation_ratio`: H(W0) normalized to percent of total activation at that position
- `W3_activation_ratio`: H(W3) normalized to percent of total activation at that position

**fig4d.csv**

Reproduces Fig. 4(d): contribution ratio of the activation vector H for the basis vectors W1 and W4 (NMF, k = 7) at the five representative positions F-J indicated in Fig. 4(a),(b) (x = 15.15 mm, z = -6.6 to -2.6 mm).

Columns:

- `Position`: map position label (F-J), corresponds to Fig. 4(a),(b)
- `W1_activation_ratio`: H(W1) normalized to percent of total activation at that position
- `W4_activation_ratio`: H(W4) normalized to percent of total activation at that position

#### Methods (summary)

NMF (k = 7, init='random', max_iter=2000, random_state=1) was applied to the PCA-reconstructed (n=390, i.e. full-rank), normalized spatially-resolved ARPES dataset of graphene on 6H-SiC(0001) (390 spatial positions, each a 52 x 123 binding-energy x angle map after 3x3 rebinning). For Fig. 2(b), NMF was run for k = 1-24 and the reconstruction error was evaluated on a 0-10000 normalized intensity scale. For Figs. 3(d)/4(d), the activation matrix H (k=7) was normalized at each spatial position to percentages summing to 100.
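The per-position percentage normalization described in the Methods summary can be illustrated with a minimal sketch. It uses only the Python standard library and inlines the contents of fig3d.csv so that it runs on its own. The helper names `load_ratios` and `normalize_to_percent` and the `example_h` values are hypothetical and introduced here for illustration only; they are not taken from the authors' code.

```python
import csv
import io

# Contents of fig3d.csv as listed in this fileset, inlined so the sketch is self-contained.
FIG3D_CSV = """Position,W0_activation_ratio,W3_activation_ratio
A,3,78
B,18,58
C,26,48
D,46,43
E,65,19
"""


def load_ratios(text):
    """Parse a figure CSV into {position: {column_name: value}} (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["Position"]: {k: float(v) for k, v in row.items() if k != "Position"}
        for row in reader
    }


def normalize_to_percent(activations):
    """Scale one position's non-negative activations so they sum to 100 (hypothetical helper)."""
    total = sum(activations)
    if total == 0:
        raise ValueError("activations sum to zero; cannot normalize")
    return [100.0 * a / total for a in activations]


ratios = load_ratios(FIG3D_CSV)

# Percentage of the total activation not carried by W0 or W3 at each position,
# i.e. the share left for the other five basis vectors when k = 7.
remainder = {pos: 100.0 - sum(vals.values()) for pos, vals in ratios.items()}

# Illustrative (made-up) activation values for one position with k = 7 components.
example_h = [1.0, 0.0, 2.0, 5.0, 0.5, 1.0, 0.5]
example_percent = normalize_to_percent(example_h)
```

Because each position's activations are normalized to sum to 100, the two columns in fig3d.csv and fig4d.csv need not add up to 100 on their own; `remainder` is the part attributed to the basis vectors that are not listed.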