Diagnosis of Lung Cancer by Human Serum Raman Spectroscopy
Combined With Six Machine Learning Algorithms
NI Qin-ru1, OU Quan-hong1*, SHI You-ming2, LIU Chao3, ZUO Ye-hao1, ZHI Zhao-xing1, REN Xian-pei4, LIU Gang1
1. College of Physics and Electronic Information, Yunnan Normal University, Kunming 650500, China
2. College of Physics and Electronic Engineering, Qujing Normal University, Qujing 655011, China
3. Department of Nuclear Medicine, Yunnan Cancer Hospital, Kunming 650118, China
4. College of Physics and Electronic Engineering, Sichuan University of Science & Engineering, Zigong 643000, China
Abstract: Lung cancer is a serious threat to human health, and its incidence in China has been increasing in recent years. Imaging and histopathological examinations are the main screening methods for lung cancer. Imaging is widely used for preliminary screening, but its results carry some uncertainty. Histopathological examination is accurate and is therefore the "gold standard" for lung cancer diagnosis; however, acquiring tissue samples can cause traumatic lung injury. A reliable and minimally invasive method for lung cancer diagnosis is therefore needed. Serum samples are more convenient and less invasive to acquire than pathological tissue samples. Raman spectroscopy offers simple operation, rapid and sensitive measurement, and the ability to provide biochemical information on serum samples. This study obtained serum Raman spectra from 155 healthy subjects and 92 lung cancer patients. Curve fitting was applied to the Raman spectra, and characteristic differences between healthy subjects and lung cancer patients were found in the range of 1 800~800 cm⁻¹. Compared with healthy subjects, the area percentages of the sub-peaks around 1 005 and 1 091 cm⁻¹ in lung cancer patients increased by 3.36% and 5.32%, respectively, whereas the area percentages of the sub-peaks around 964, 1 522, and 1 586 cm⁻¹ decreased by 2.3%, 2.82%, and 5.6%, respectively. These preliminary curve-fitting results indicated that carotenoids, amino acids, ribose, and nucleic acids in the serum of lung cancer patients were altered. To further investigate the serum Raman spectral characteristics of healthy subjects and lung cancer patients, machine learning methods were used to extract the hidden information in the Raman spectral data.
First, principal component analysis (PCA) was used to extract the characteristic variables of the spectra. These variables were then fed to support vector machine (SVM), random forest (RF), k-nearest neighbors (kNN), logistic regression classification (LRC), decision tree (DT), and Bayesian algorithms to build classification models. The predictive performance of the models was evaluated by leave-one-out cross-validation. The SVM model gave the best classification of serum Raman spectra, with accuracy, sensitivity, specificity, and F1 score of 98%, 94.44%, 100%, and 97.14%, respectively. The mean area under the ROC curve (AUC) over 9-fold cross-validation for the SVM model was 0.94, indicating good predictive performance. These results show that serum Raman spectroscopy combined with machine learning methods can effectively diagnose lung cancer. The technique is minimally invasive and highly accurate, making it a potential diagnostic technology for lung cancer.
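The four figures reported for the SVM model are all derived from a binary confusion matrix. The sketch below shows how they relate to one another. The counts (17 true positives, 1 false negative, 0 false positives, 32 true negatives) are hypothetical: the abstract does not report a confusion matrix, and these values were chosen only because they reproduce the stated percentages, assuming the metrics were computed on a single held-out set with lung cancer as the positive class. The function name `classification_metrics` is likewise illustrative.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, specificity and F1 as fractions.

    tp/fn: lung cancer samples classified correctly/incorrectly
    tn/fp: healthy samples classified correctly/incorrectly
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall for the cancer class
    specificity = tn / (tn + fp)   # recall for the healthy class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (an assumption, not data from the study).
metrics = classification_metrics(tp=17, fn=1, fp=0, tn=32)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts the script prints 98.00%, 94.44%, 100.00%, and 97.14%, matching the reported values. Other confusion matrices could round to the same percentages, so the counts should not be read as the study's actual test set.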
倪钦如,欧全宏,时有明,刘 超,左烨豪,智兆星,任先培,刘 刚. 人体血清拉曼光谱结合六种机器学习算法对肺癌的诊断研究[J]. 光谱学与光谱分析, 2025, 45(03): 685-691.
NI Qin-ru, OU Quan-hong, SHI You-ming, LIU Chao, ZUO Ye-hao, ZHI Zhao-xing, REN Xian-pei, LIU Gang. Diagnosis of Lung Cancer by Human Serum Raman Spectroscopy Combined With Six Machine Learning Algorithms. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2025, 45(03): 685-691.