Study on Raman Spectral Characteristics of Breast Cancer Based on
Multivariable Spectral Data Analysis Methods
ZHANG Bao-ping1, NING Tian1, ZHANG Fu-rong1, CHEN Yi-shen1, ZHANG Zhan-qin2, WANG Shuang1*
1. Institute of Photonics and Photon-Technology, Northwest University, Xi’an 721710, China
2. The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China
Abstract: Compared with cell and sliced tissue samples, blood samples are easier to collect, and their biochemical composition can show relevant changes before clinical pathological symptoms appear. Raman spectroscopy provides molecular-level information about biochemical constituents in a rapid, label-free, nondestructive and noninvasive way, and therefore holds significant promise for blood sample-based diagnosis. In this study, we present a reliable method for detecting breast cancer that combines serum Raman spectroscopy with multivariate analysis methods. The serum samples were divided into healthy, early cancer, and advanced cancer groups on the basis of clinical pathological diagnosis. Using a quartz capillary tube as the sample holder, Raman spectra were acquired to characterize the biochemical composition of the serum samples. Spectral classification models built on principal component analysis (PCA), linear discriminant analysis (LDA), support vector machines (SVM) and partial least squares discriminant analysis (PLS-DA) were used to reveal the spectral differences among the investigated groups, and leave-one-out cross-validation (LOOCV) was adopted to evaluate the classification performance of each model. We observed the resonance Raman spectral features of carotenoids in serum and identified spectral variations in protein and lipid contents during breast cancer progression. The multivariate analysis methods identified the representative spectral features. The PCA-LDA model reached a spectral classification accuracy of 99%. Among the three kernel-based PCA-SVM models, the linear kernel model reached 92% accuracy with parameter c=0.003, the RBF kernel model reached 94% with parameters c=0.125 and γ=256, and the polynomial kernel model reached 92% with parameters c=0.003 and d=11. The spectral classification accuracy of PLS-DA was 80%. These results could lay a theoretical and experimental foundation for serum Raman spectroscopy-based early screening and diagnosis of breast cancer.
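The leave-one-out cross-validation procedure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the spectra are synthetic, the band heights and noise level are assumed values, and a simple nearest-centroid classifier stands in for the PCA-LDA, PCA-SVM and PLS-DA models so that the example runs with the Python standard library alone. All function names are hypothetical.

```python
import math
import random


def make_synthetic_spectra(n_per_class=10, n_points=21, seed=0):
    """Generate toy 'spectra': one Gaussian band whose height depends on the class.

    Purely synthetic stand-in data; not the serum measurements from the study.
    """
    rng = random.Random(seed)
    band_height = {"healthy": 1.0, "early": 1.5, "advanced": 2.0}  # assumed values
    center = n_points // 2
    X, y = [], []
    for label, height in band_height.items():
        for _ in range(n_per_class):
            spectrum = [
                height * math.exp(-(((j - center) / 2.0) ** 2)) + rng.gauss(0.0, 0.02)
                for j in range(n_points)
            ]
            X.append(spectrum)
            y.append(label)
    return X, y


def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean training spectrum is closest (squared Euclidean)."""
    sums, counts = {}, {}
    for spectrum, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(spectrum))
        for i, value in enumerate(spectrum):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        n = counts[label]
        dist = sum((xi - a / n) ** 2 for xi, a in zip(x, acc))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, predict):
    """Hold out each sample once, train on the rest, return the fraction classified correctly."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_synthetic_spectra()
    accuracy = loocv_accuracy(X, y, nearest_centroid_predict)
    print(f"LOOCV accuracy on synthetic data: {accuracy:.2%}")
```

The `predict` argument is the only model-specific part. A PCA-LDA or kernel SVM classifier could presumably be substituted for the nearest-centroid rule without changing the hold-one-out loop, provided any dimensionality reduction is refitted on the training portion of each split.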
Key words: Raman spectroscopy technology; Multivariate data classification models; Breast cancer; Serum; Carotene
张宝萍,宁 甜,张富荣,陈一申,张占琴,王 爽. 基于多变量光谱数据分析方法的乳腺癌血清拉曼光谱特征研究[J]. 光谱学与光谱分析, 2023, 43(02): 426-434.
ZHANG Bao-ping, NING Tian, ZHANG Fu-rong, CHEN Yi-shen, ZHANG Zhan-qin, WANG Shuang. Study on Raman Spectral Characteristics of Breast Cancer Based on Multivariable Spectral Data Analysis Methods. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2023, 43(02): 426-434.