Research on Feature Extraction of Near-Infrared Spectroscopy Based on Joint Matrix Local Preserving Projection
HU Shan-ke1, QIN Yu-hua1*, DUAN Ru-min2, WU Li-jun2, GONG Hui-li3
1. College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
2. Technical Research Center, China Tobacco Yunnan Industrial Co., Ltd., Kunming 650024, China
3. College of Information Science and Engineering, China Ocean University, Qingdao 266100, China
Abstract: The high dimensionality, high noise, band overlap and nonlinearity of near-infrared spectra seriously degrade modeling accuracy. To address this problem, this paper proposes a feature extraction method based on joint matrix local preserving projection (JMLPP). First, cluster-based spectral feature selection is used to extract effective features. According to several kinds of indicators that are strongly correlated with classification, the samples are divided into several different clustering modes. Following the principle of strong intra-class correlation and large inter-class difference, the intra-class threshold and the inter-class threshold are determined by adjusting the intra-class and inter-class parameters. Spectral feature regions are selected under each clustering mode to obtain the feature matrices, and a joint matrix is generated by taking their union. Cluster-based feature extraction eliminates features with low intra-class correlation and high inter-class correlation, and thereby removes noise information from the spectrum. Second, the local preserving projection (LPP) algorithm is improved in two respects. The geodesic distance is introduced to construct the neighborhood distance matrix, which expresses the topology of high-dimensional sample data better than the Euclidean distance. The edge weight matrix is also improved, which resolves the uncertainty caused by sample sparseness and avoids the loss of effective information. Finally, the improved LPP algorithm is used to reduce the dimensionality of the joint matrix, yielding the optimal spectral feature subset in the low-dimensional mapping. To verify the effectiveness of the JMLPP algorithm, JMLPP is first compared with PCA and LPP from the perspective of spectral projection.
The results show that JMLPP has better discriminating ability: the tobacco samples are clearly separated in the projection space, and the separation is visibly better than with PCA and LPP. The classification results of the models are also compared. Classification models were established using the full spectra and using the dimension-reduced features from PCA, LPP and JMLPP. The experimental results show that the accuracy of the classification model established with the JMLPP algorithm is 93.8%. The sensitivities for the five tobacco grade categories are 95.2%, 93.1%, 94.2%, 92.1% and 92.5%, and the specificities are 99.3%, 98.4%, 98.6%, 97.5% and 97%, respectively. The accuracy, sensitivity and specificity of this model are significantly higher than those of the other three methods. By combining cluster-based feature extraction with the local preserving projection algorithm, JMLPP effectively extracts the information useful for classification while maintaining the local linear relationships of the original data. The resulting model shows good stability and accuracy.
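As a rough illustration of the second step described in the abstract (LPP with geodesic rather than Euclidean neighborhood distances), the following is a minimal sketch and not the authors' implementation. The function names `geodesic_distances` and `geodesic_lpp`, the parameters `n_neighbors`, `t` and `reg`, and the heat-kernel form of the edge weights are all assumptions made for illustration; the improved edge weight matrix of the paper is not specified in the abstract, and the cluster-based feature selection step is not included.

```python
import numpy as np


def geodesic_distances(X, n_neighbors=5):
    """Return (d_geo, adjacent): shortest-path distances on a k-nearest-
    neighbour graph and the boolean adjacency of that graph."""
    n = X.shape[0]
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    d_euc = np.sqrt(d2)              # pairwise Euclidean distances
    np.fill_diagonal(d_euc, 0.0)

    # Symmetric kNN adjacency (column 0 of the argsort is the point itself).
    idx = np.argsort(d_euc, axis=1)[:, 1:n_neighbors + 1]
    adjacent = np.zeros((n, n), dtype=bool)
    adjacent[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = True
    adjacent |= adjacent.T

    # Floyd-Warshall: geodesic distance = shortest path along graph edges.
    d_geo = np.where(adjacent, d_euc, np.inf)
    np.fill_diagonal(d_geo, 0.0)
    for k in range(n):
        d_geo = np.minimum(d_geo, d_geo[:, k:k + 1] + d_geo[k:k + 1, :])
    return d_geo, adjacent


def geodesic_lpp(X, n_components=2, n_neighbors=5, t=None, reg=1e-6):
    """Project the rows of X (samples x features) with an LPP variant whose
    edge weights are computed from geodesic distances."""
    d_geo, adjacent = geodesic_distances(X, n_neighbors)
    if t is None:
        t = np.mean(d_geo[adjacent] ** 2)   # heuristic kernel width (assumption)

    # Heat-kernel weights on neighbouring pairs only (assumed form).
    W = np.zeros_like(d_geo)
    W[adjacent] = np.exp(-d_geo[adjacent] ** 2 / t)

    D = np.diag(W.sum(axis=1))              # degree matrix
    L = D - W                               # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X
    # Small ridge term keeps B positive definite.
    B += reg * np.trace(B) / B.shape[0] * np.eye(B.shape[0])

    # Solve A a = lambda B a by Cholesky whitening, B = C C^T.
    C = np.linalg.cholesky(B)
    Ci = np.linalg.inv(C)
    M = Ci @ A @ Ci.T
    _, vecs = np.linalg.eigh((M + M.T) / 2)  # eigenvalues in ascending order
    P = Ci.T @ vecs[:, :n_components]        # directions of smallest eigenvalues
    return X @ P, P
```

A call such as `Y, P = geodesic_lpp(X, n_components=2, n_neighbors=6)` returns the low-dimensional embedding `Y` and the projection matrix `P`. For real spectra, where the number of wavelengths usually exceeds the number of samples, the matrix `B` is rank deficient and relies on the ridge term; a preliminary reduction such as PCA is a common workaround, and the dense Floyd-Warshall loop is only practical for a few hundred samples.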
胡善科,秦玉华,段如敏,吴丽君,宫会丽. 联合矩阵局部保持投影的近红外光谱特征提取[J]. 光谱学与光谱分析, 2020, 40(12): 3772-3777.
HU Shan-ke, QIN Yu-hua, DUAN Ru-min, WU Li-jun, GONG Hui-li. Research on Feature Extraction of Near-Infrared Spectroscopy Based on Joint Matrix Local Preserving Projection. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2020, 40(12): 3772-3777.