# Fileset

[Physica Status Solidi  b - 2025 - Guimarães - Noise Reduction in Nano‐Raman Spectroscopy Using Principal Component Analysis.pdf](https://mdr.nims.go.jp/filesets/bf6b184b-a964-4ecf-844c-71f744d50fce/download)

## Creator

Jane Elisa Guimarães, Rafael Nadas, Wenjin Zhang, Takahiko Endo, [Kenji Watanabe](https://orcid.org/0000-0003-3701-8119), [Takashi Taniguchi](https://orcid.org/0000-0002-1467-3105), Riichiro Saito, [Yasumitsu Miyata](https://orcid.org/0000-0002-9733-5119), [Ado Jorio](https://orcid.org/0000-0002-5978-2735)

## Rights

[Creative Commons BY Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/)

## Other metadata

[Noise Reduction in Nano‐Raman Spectroscopy Using Principal Component Analysis](https://mdr.nims.go.jp/datasets/dd7c4397-9b4a-417e-9d5b-5d17f76adeab)

## Fulltext

Noise Reduction in Nano-Raman Spectroscopy Using Principal Component Analysis

Jane Elisa Guimarães, Rafael Nadas, Wenjin Zhang, Takahiko Endo, Kenji Watanabe, Takashi Taniguchi, Riichiro Saito, Yasumitsu Miyata, and Ado Jorio*

Research Article, www.pss-b.com. Phys. Status Solidi B 2026, 263, e202500291. DOI: 10.1002/pssb.202500291

Tip-enhanced Raman spectroscopy (TERS) combined with principal component analysis (PCA) offers a robust approach for enhancing signal quality and uncovering spectroscopic features otherwise concealed by noise. This study demonstrates that integrating TERS with PCA in large-scale datasets effectively reduces noise and enhances the extraction of weak Raman signals that are often obscured by random spectral fluctuations. The methodology was applied to hyperspectral datasets acquired from MoSe2 monolayers exhibiting nanoscale surface features. Through this approach, previously hidden nano-Raman peaks were successfully isolated, enabling reliable chemical identification at the nanoscale. The combined use of TERS and PCA significantly improves sensitivity and resolution in the spectroscopic analysis of 2D materials, advancing their characterization with respect to interfacial and environmental effects.

J. E. Guimarães, A. Jorio: Departamento de Física, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, MG, Brazil. E-mail: adojorio@fisica.ufmg.br

R. Nadas: Institut für Physik, Humboldt-Universität zu Berlin, Newtonstraße 15, 12489 Berlin, Germany

W. Zhang, T. Endo, R. Saito, Y. Miyata: Department of Physics, Tokyo Metropolitan University, Tokyo 192-0397, Japan

W. Zhang, T. Endo, T. Taniguchi, Y. Miyata: Research Center for Materials Nanoarchitectonics, National Institute for Materials Science, Tsukuba 305-0044, Japan

K. Watanabe: Research Center for Electronic and Optical Materials, National Institute for Materials Science, Tsukuba 305-0044, Japan

R. Saito: Department of Physics, Tohoku University, Sendai 980-8578, Japan

The ORCID identification number(s) for the author(s) of this article can be found under https://doi.org/10.1002/pssb.202500291.

© 2025 The Author(s). physica status solidi (b) basic solid state physics published by Wiley-VCH GmbH. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

### 1. Introduction

When tip-enhanced Raman spectroscopy (TERS) mapping is performed, the acquired hyperspectral data, comprising hundreds or even thousands of individual spectra, enables robust statistical analysis. TERS combines Raman spectroscopy with scanning probe microscopy (SPM), using a metallic tip to enhance local electromagnetic fields and achieve nanoscale spatial resolution.[1–4] This technique enables chemical characterization beyond the diffraction limit.[5–8]

In this work, 4096 spectra were collected in a single TERS measurement, corresponding to a 64 × 64 pixel map from a monolayer MoSe2 flake. This extensive dataset allows the application of techniques such as principal component analysis (PCA).[9,10] PCA is particularly advantageous in this context, as it efficiently extracts the relevant spectral components and substantially minimizes noise contributions.[11–13] Consequently, this approach becomes a powerful tool for highlighting the most significant spectroscopic features and interpreting the results with greater accuracy and confidence.

When applied to large datasets, the PCA method seeks to identify a set of principal components (PCs) that capture the most significant variance in the data.[10,13,14] This process involves computing the eigenvalues and eigenvectors of the covariance matrix of the variables. Each eigenvector defines the direction of a PC, while its corresponding eigenvalue quantifies the amount of explained variance. The components are then ranked in decreasing order of variance, allowing the selection of the most significant ones. By projecting the original data onto the new set of axes that capture the largest variation, defined by the number of PCs chosen, the dataset can be effectively represented in a lower-dimensional space that retains the essential structure and variability of the original data.

The integration of Raman spectroscopy with PCA has proven efficient in extracting detailed chemical information from Raman mappings across a wide range of materials.[13,15,16] This work presents a reproducible method for reducing noise in large nano-Raman hyperspectral datasets. By applying PCA and carefully selecting the appropriate number of PCs for data reconstruction, it becomes possible to enhance the signal-to-noise ratio and reveal spectral features that were previously obscured. This approach not only facilitates the detection of weak or hidden peaks but also improves the reliability and interpretability of complex spectroscopic data.

### 2. Methodology

The sample studied in this work was prepared using the dry-stamping technique, where MoSe2 grains, grown on a SiO2/Si substrate via salt-assisted chemical vapor deposition, were retrieved with a hexagonal boron nitride flake.
The resulting heterostructure was then transferred onto a glass slide using a polymer stamp.[17,18]

Hyperspectral data were obtained by TERS in the porto laboratory prototype system, which operates in an atomic force microscope in non-contact mode using a tuning fork and plasmon-tunable tip pyramid (PTTP) probes.[19–22] The excitation source was a radially polarized He-Ne laser, and the spectrometer was an Andor Shamrock 303i with a 600 l/mm grating.

PCA was conducted using an automated routine implemented in the portoflow analysis software. The objective was to enhance data quality by reconstructing the dataset using only the five PCs carrying the largest contribution to the variance, thus preserving the most relevant information within the data. The following section details the method used to select the specific number of PCs. Subsequently, the original dataset was reconstructed using the inverse transformation as a linear combination of only the five selected PCs.

Before the PCA-based noise reduction process, spikes resulting from cosmic rays or experimental artifacts were carefully removed from each individual spectrum presenting such anomalies. This step was crucial, as it prevents these anomalies from affecting the decomposition into the PCs. After applying PCA, both the full spectra and relevant spectral regions were selected using the software, and background subtraction was applied to each region using a built-in routine that sets minimum values to zero, thereby minimizing baseline contributions. This processing resulted in spectra with well-defined peaks, enabling the generation of corresponding intensity maps.

### 3. Results

The region analyzed was a 1 × 1 μm² area of monolayer MoSe2,[23] scanned as a 64 × 64 pixel map, resulting in a hyperspectral dataset comprising 4096 spectra. A representative spectrum is shown in Figure 1a. After applying the PCA-based noise reduction procedure described in this section, the same spectrum appears as shown in Figure 1b.
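As an illustration of the reconstruction and background-subtraction steps described in the Methodology, the following is a minimal sketch in Python with NumPy. It is not the routine of the analysis software used in this work. It assumes the spectra are stored one per row, and the function names `pca_denoise` and `subtract_minimum`, the synthetic data, and all numerical values are hypothetical.

```python
import numpy as np


def pca_denoise(spectra: np.ndarray, n_components: int = 5) -> np.ndarray:
    """Reconstruct spectra from their leading principal components.

    spectra has shape (n_spectra, n_channels), one spectrum per row.
    """
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # Rows of vt are the principal directions (eigenvectors of the covariance
    # matrix); the singular values in s are sorted in decreasing order.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = n_components
    # Inverse transformation restricted to the first k components.
    return (u[:, :k] * s[:k]) @ vt[:k] + mean


def subtract_minimum(region: np.ndarray) -> np.ndarray:
    """Shift each spectrum so that its minimum within the region is zero."""
    return region - region.min(axis=1, keepdims=True)


# Synthetic stand-in for a 64 x 64 map (assumption, not measured data):
# one Gaussian peak whose amplitude varies from pixel to pixel, plus noise.
rng = np.random.default_rng(0)
shift = np.linspace(100.0, 400.0, 300)  # hypothetical Raman shift axis
peak = np.exp(-0.5 * ((shift - 240.0) / 4.0) ** 2)
amplitude = rng.uniform(0.5, 1.5, size=(64 * 64, 1))
clean = amplitude * peak
noisy = clean + rng.normal(scale=0.2, size=clean.shape)

denoised = pca_denoise(noisy, n_components=5)
rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((denoised - clean) ** 2))
print(f"RMS error before: {rms_before:.3f}, after: {rms_after:.3f}")
```

The singular value decomposition of the mean-centered data is used here because its right singular vectors coincide with the eigenvectors of the covariance matrix, so it yields the same PCs as the eigen-decomposition described in the Introduction.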
The signal-to-noise ratio, considering the most intense A1g mode of MoSe2, was initially 48. After applying the PCA-based noise reduction procedure, it increased to 1040, representing a signal-to-noise improvement by a factor of ≈21.7.

To evaluate whether the five components used to reconstruct the data are sufficient for accurate Raman hyperspectral imaging, we can compare a Raman map built from the MoSe2 data before and after the PCA-based noise reduction. The resulting difference between these maps, as shown in Figure 1c, consists only of noise, which confirms that the reconstructed spectra indeed keep the spectral features that truly define the map image, and only noise remains after subtraction. Therefore, the essential features of the data were effectively captured by the five selected PCs.

Although the well-known peaks of MoSe2 were readily observed even in the as-measured data prior to PCA, the aim was to visualize features that could be associated with the nanometric structures, namely the nanoprotuberances, since the presence of noise can obscure weak peaks that might otherwise be detected, particularly those not related to the intense, well-known MoSe2 features below 650 cm⁻¹. To address this, PCA was applied to decompose the dataset into components representing the main sources of spectral variation and potential artifacts, thereby enhancing the detection of weak TERS peaks that would otherwise remain hidden. Figure 2a shows the proportion of variance considering the first 10 PCs, from PC1 to PC10. The plot indicates that the first five components (PC1 to PC5) together account for more than 90% of the total variance, which is the rationale behind selecting these five. If the spectra are reconstructed with the inverse PCA transformation, using only these five components, most of the representative spectral features will be kept, while the random spectral variation related to noise will be removed. One way to illustrate this is to compare the variance of PC2 to PC5 with that of PC1.
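The selection criterion mentioned above, keeping the leading PCs until their cumulative variance reaches 90%, can be sketched as follows. This is a hedged illustration in Python with NumPy rather than the procedure implemented in the analysis software. The function names are hypothetical, and the variance fractions in the usage lines are made-up numbers, not the values plotted in Figure 2a.

```python
import numpy as np


def explained_variance_ratio(spectra: np.ndarray) -> np.ndarray:
    """Fraction of the total variance carried by each PC, in decreasing order."""
    centered = spectra - spectra.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    variance = s ** 2  # proportional to the covariance eigenvalues
    return variance / variance.sum()


def components_for_threshold(ratio: np.ndarray, threshold: float = 0.90) -> int:
    """Smallest number of leading PCs whose cumulative variance reaches threshold."""
    cumulative = np.cumsum(ratio)
    n = int(np.searchsorted(cumulative, threshold)) + 1
    return min(n, len(ratio))


# Made-up variance fractions for illustration only.
ratio = np.array([0.50, 0.20, 0.10, 0.07, 0.05, 0.04, 0.04])
n_keep = components_for_threshold(ratio, threshold=0.90)
print(n_keep)
```

With these illustrative fractions the cumulative variance first passes 90% at the fifth component, so five PCs would be kept. A different threshold, or a visual inspection of the variance plot, may be more appropriate for other datasets.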
Figures 2b–f present the projections of the 4096 data points onto each of the first five PCs, with each plot showing a specific PC plotted against PC1. Figure 2b displays PC1 plotted against itself. Similarly, Figure 2c presents PC2 versus PC1, and so forth for the subsequent components. This analysis demonstrates that the variance associated with each component progressively diminishes up to PC5, which is the last component included in the dataset reconstruction.

Figure 1. a) Representative as-measured spectrum from MoSe2, indicated in (c) by the open white circle. b) The same spectrum after PCA-based noise reduction. c) Integrated Raman intensity map after PCA-based noise reduction, subtracted from the map built using the as-measured data. The inset shows a typical subtracted spectrum, where one can only see residual noise. (Axes in panels a and b: intensity versus Raman shift [cm⁻¹].)

Since the dataset analyzed in this work has a sufficient volume, the application of the PCA method significantly enhances the detection of spectral peaks associated with contaminants, as shown in Figure 3. Before the analysis, these peaks were obscured by noise, even after background subtraction, making it difficult to distinguish them from the random variations originating from noise.
After applying PCA, several peaks become prominent and, when their intensity is mapped, a distinct pattern is revealed, corresponding to the spatial distribution of the contaminants. This improved spectral clarity and enabled a more precise correlation between the chemical features and the observed morphological structures.

Figure 2. a) Normalized variance considering the first ten PCs obtained from the PCA decomposition of the hyperspectral TERS data, and respective cumulative variance. b–f) Projections of the dataset onto the first five PCs, with each plot representing a specific PC (PC1-PC5) against PC1. Scale bars are the same for (c–f).

Figure 3. Comparison between as-measured (top row, Before PCA) and PCA-based noise reduction processed (bottom row, After PCA) TERS spectra for two distinct peaks. The intensity maps were obtained by selecting the spectral region highlighted in yellow in the spectra. Triangles represent the pixels from which the spectra were acquired. The scale bars correspond to 250 nm in all images, and the color-coded bars represent the intensities in arbitrary units. The maps were acquired from two different regions of the sample. (Panels: peaks at 1362 cm⁻¹ and 1168 cm⁻¹; spectra axis: intensity [arb. units].)

Figure 3 presents a direct comparison between the spectra before (first row) and after (second row) PCA-based noise reduction, clearly demonstrating that noise initially obscures relevant features, which only become discernible after data processing. The corresponding intensity maps obtained for each region before and after PCA-based noise reduction reveal relevant features associated with chemical signatures likely originating from contamination, further validating our methodology and findings.

### 4. Conclusions

PCA proved to be an effective tool for removing noise from TERS hyperspectra. Together, TERS and PCA enable the detection of weak nano-Raman peaks that would otherwise remain undetectable due to both spectral noise and the limitations of conventional optical microscopy in visualizing features at the nanoscale. The comparison of spectra before and after PCA application demonstrates an improvement in data quality. Furthermore, reconstruction using only five PCs was sufficient to preserve all relevant information, indicating that the evaluation of cumulative variance is a reliable criterion to determine the appropriate number of components. This is further supported by the observation that subtracting the original data from the reconstructed data yields only residual noise. This methodological approach significantly enhances the capacity for nanoscale chemical and morphological characterization and can be extended to other complex systems, for which weak signals are typically masked by noise.
Therefore, PCA stands out as a powerful and versatile tool for large-scale data analysis.

Although PCA is a powerful tool for enhancing these weak signals, it is important to note that subtle spectral features with low variance, represented by discarded PCs, may be lost along with the noise. While these features may not represent significant variance overall, they can be crucial for capturing specific details in the sample.[12] Furthermore, it is important to emphasize that PCA has limitations in capturing complex nonlinear variances, and its effectiveness is highly dependent on the data volume to obtain reliable results. Moreover, PCs are mathematical constructs rather than actual spectra, which limits the direct physical interpretation of their significance.[11,13] In conclusion, all data processing should always be performed with due care.

### Acknowledgements

The authors acknowledge financial support from FAPEMIG (APQ-04852-23, APQ-01860-22, RED-00081-23, APQ-01402-23, RED-00079-23), the Japan Science and Technology Agency (JST), the JST FOREST Program (grant no. JPMJFR213X), the CREST (grant no. JPMJCR24A5), Kakenhi Grants-in-Aid (grant nos. JP21H05232, JP21H05233, JP21H05234, JP22H00283, JP22H04957, and JP23H02052) from the Japan Society for the Promotion of Science (JSPS), World Premier International Research Center Initiative (WPI), MEXT, Japan, and software resources and technical assistance provided by FabNS. R.S. acknowledges a JSPS KAKENHI Grant (grant no.
JP22H00283), Japan, and the Yushan Fellow Program by the Ministry of Education (MOE), Taiwan.

The Article Processing Charge for the publication of this research was funded by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) (ROR identifier: 00x0ma614).

### Conflict of Interest

The authors declare no conflict of interest.

### Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

### Keywords

MoSe2, nano-Raman spectroscopy, principal component analysis, tip-enhanced Raman spectroscopy

Received: May 30, 2025
Revised: July 13, 2025
Published online:

[1] R. M. Stockle, Y. D. Suh, V. Deckert, R. Zenobi, Chem. Phys. Lett. 2000, 318, 1.

[2] N. Hayazawa, Y. Inouye, Z. Sekkat, S. Kawata, Opt. Commun. 2000, 183, 1.

[3] M. S. Anderson, Appl. Phys. Lett. 2000, 76, 21.

[4] N. Kumar, S. Mignuzzi, W. Su, D. Roy, EPJ Tech. Instrum. 2015, 2, 1.

[5] A. C. Gadelha, D. A. Ohlberg, C. Rabelo, E. G. Neto, T. L. Vasconcelos, J. L. Campos, J. S. Lemos, V. Ornelas, D. Miranda, R. Nadas, F. C. Santana, Nature 2021, 590, 7846.

[6] F. B. Sousa, R. Nadas, R. Martins, A. P. Barboza, J. S. Soares, B. R. Neves, I. Silvestre, A. Jorio, L. M. Malard, Nanoscale 2024, 16, 27.

[7] A. Jorio, R. Nadas, A. G. Pereira, C. Rabelo, A. C. Gadelha, T. L. Vasconcelos, W. Zhang, Y. Miyata, R. Saito, M. D. Costa, L. G. Cançado, 2D Mater. 2024, 11, 3.

[8] C. Höppener, J. Aizpurua, H. Chen, S. Gräfe, A. Jorio, S. Kupfer, Z. Zhang, V. Deckert, Nat. Rev. Methods Primers 2024, 4, 1.

[9] I. T. Jolliffe, Principal Component Analysis, Springer, New York, NY 2002.

[10] H. Abdi, L. J. Williams, Wiley Interdiscip. Rev.: Comput. Statist. 2010, 2, 4.

[11] G. Rusciano, G. Zito, R. Isticato, T. Sirec, E. Ricca, E. Bailo, A. Sasso, ACS Nano 2014, 8, 12300.

[12] S. Jiang, X. Zhang, Y. Zhang, C. Hu, R. Zhang, Y. Zhang, Y. Liao, Z. J. Smith, Z. Dong, J. G. Hou, Light: Sci. Appl. 2017, 6, e17098.

[13] J. L. E. Campos, H. Miranda, C. Rabelo, E. Sandoz-Rosado, S. D. Pandey, J. Riikonen, A. G. Cano-Márquez, A. Jorio, J. Raman Spectrosc. 2018, 49, 1.

[14] M. Greenacre, P. J. Groenen, T. Hastie, A. I. d’Enza, A. Markos, E. Tuzhilina, Nat. Rev. Methods Primers 2022, 2, 1.

[15] H. Shinzawa, K. Awa, W. Kanematsu, Y. Ozaki, J. Raman Spectrosc. 2009, 40, 12.

[16] Y. Luo, X. Zhang, Z. Zhang, R. Naidu, C. Fang, Anal. Chem. 2022, 94, 7.

[17] S. Masubuchi, M. Sakano, Y. Tanaka, Y. Wakafuji, T. Yamamoto, S. Okazaki, K. Watanabe, T. Taniguchi, J. Li, H. Ejima, T. Sasagawa, Sci. Rep. 2022, 12, 1.

[18] H. Naito, Y. Makino, W. Zhang, T. Ogawa, T. Endo, T. Sannomiya, M. Kaneda, K. Hashimoto, H. E. Lim, Y. Nakanishi, K. Watanabe, T. Taniguchi, K. Matsuda, Y. Miyata, Nanoscale Adv. 2023, 5, 18.

[19] T. L. Vasconcelos, B. S. Archanjo, B. Fragneaud, B. S. Oliveira, J. Riikonen, C. Li, D. S. Ribeiro, C. Rabelo, W. N. Rodrigues, A. Jorio, C. A. Achete, ACS Nano 2015, 9, 6.

[20] T. L. Vasconcelos, B. S. Archanjo, B. S. Oliveira, R. Valaski, R. C. Cordeiro, H. G. Medeiros, C. Rabelo, A. R. Ribeiro, P. Ercius, C. A. Achete, A. Jorio, L. G. Cançado, Adv. Opt. Mater. 2018, 6, 20.

[21] C. Rabelo, H. Miranda, T. L. Vasconcelos, L. G. Cançado, A. Jorio, in 2019 4th Int. Symp. on Instrumentation Systems, Circuits and Transducers (INSCIT), São Paulo, Brazil 2019, pp. 1–6.

[22] H. Miranda, C. Rabelo, T. L. Vasconcelos, L. G. Cançado, A. Jorio, Phys. Status Solidi-rapid Res. Lett. 2020, 14, 9.

[23] J. Guimarães, R. Nadas, R. Alves, W. Zhang, T. Endo, K. Watanabe, T. Taniguchi, R. Saito, Y. Miyata, B. Neves, A. Jorio, 2025, ArXiv preprint arXiv:2505.19224.