Research on Feature Extraction of Near-Infrared Spectroscopy Based on Joint Matrix Local Preserving Projection
HU Shan-ke1, QIN Yu-hua1*, DUAN Ru-min2, WU Li-jun2, GONG Hui-li3
1. College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
2. Technical Research Center, China Tobacco Yunnan Industrial Co., Ltd., Kunming 650024, China
3. College of Information Science and Engineering, China Ocean University, Qingdao 266100, China
Abstract: The high dimensionality, high noise, band overlap, and nonlinearity of near-infrared spectra seriously degrade modeling accuracy. To address this problem, this paper proposes a feature extraction method based on joint matrix local preserving projection (JMLPP). First, cluster-based spectral feature selection is used to extract effective features. According to several indicators that correlate strongly with the classification, the samples are divided under several different clustering modes. Following the idea of strong intra-class correlation and large inter-class difference, the intra-class threshold and the inter-class threshold are determined by adjusting an intra-class parameter and an inter-class parameter. Spectral feature regions are selected under each clustering mode to obtain the feature matrices, and a joint matrix is generated from them by the union operation. This cluster-based selection removes features with low intra-class correlation and high inter-class correlation, and thereby eliminates noise information from the spectrum. Second, the local preserving projection (LPP) algorithm is improved in two respects. Geodesic distance is introduced to construct the neighborhood distance matrix, which expresses the topology of the high-dimensional sample data better than Euclidean distance does. The edge weight matrix is also improved, which resolves the uncertainty caused by sample sparseness and avoids the loss of effective information. Finally, the improved LPP algorithm reduces the dimensionality of the joint matrix, yielding the optimal low-dimensional spectral feature subset. To verify the effectiveness of JMLPP, it is first compared with PCA and LPP in terms of spectral projection. The results show that JMLPP separates the classes better: the tobacco samples are clearly separated in the projection space, and the effect is clearly better than with PCA or LPP. The classification results of the models are also compared. Classification models were built on the full spectra and on the reduced features obtained with PCA, LPP, and JMLPP. The model built on JMLPP features reaches an accuracy of 93.8%. Its sensitivities for the five tobacco grade categories are 95.2%, 93.1%, 94.2%, 92.1%, and 92.5%, and the specificities are 99.3%, 98.4%, 98.6%, 97.5%, and 97%, respectively. The accuracy, sensitivity, and specificity of this model are significantly higher than those of the other three methods. By combining cluster-based feature extraction with local preserving projection, JMLPP effectively extracts the information useful for classification and maintains the local linear relationship of the original data, giving a model with good stability and accuracy.
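The dimensionality-reduction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes an Isomap-style geodesic distance (shortest paths over a k-nearest-neighbour graph), a heat-kernel edge weight with bandwidth set to the mean squared geodesic distance, and a small ridge term to keep the generalized eigenproblem well posed. The abstract does not specify the improved edge weight matrix, so all of these choices, and the names `geodesic_lpp`, `n_neighbors`, `n_components`, and `reg`, are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist


def geodesic_lpp(X, n_neighbors=5, n_components=2, reg=1e-6):
    """LPP-style projection whose edge weights use geodesic distances.

    X has shape (n_samples, n_features). Returns the projected samples
    and the projection matrix. Illustrative sketch only.
    """
    n, d = X.shape
    euclid = cdist(X, X)

    # k-nearest-neighbour graph with Euclidean edge lengths.
    # For dense input, scipy treats zero entries as "no edge".
    graph = np.zeros((n, n))
    idx = np.argsort(euclid, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    graph[rows, idx.ravel()] = euclid[rows, idx.ravel()]
    graph = np.maximum(graph, graph.T)  # make the graph symmetric

    # Geodesic distance = shortest path along the neighbourhood graph.
    geo = shortest_path(graph, method="D", directed=False)

    # Assumed weighting: heat kernel on geodesic distances between all
    # pairs that are connected in the graph (unreachable pairs get 0).
    connected = np.isfinite(geo) & (geo > 0)
    t = np.mean(geo[connected] ** 2)
    W = np.zeros((n, n))
    W[connected] = np.exp(-geo[connected] ** 2 / t)

    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian

    # Standard LPP eigenproblem: X^T L X a = lambda X^T D X a.
    A = X.T @ L @ X
    B = X.T @ D @ X
    B = B + reg * np.trace(B) / d * np.eye(d)  # assumed ridge term

    # eigh returns eigenvalues in ascending order; the eigenvectors of
    # the smallest ones give the projection directions.
    _, vecs = eigh(A, B)
    P = vecs[:, :n_components]
    return X @ P, P
```

Calling `Y, P = geodesic_lpp(X, n_neighbors=5, n_components=2)` on a feature matrix `X` returns the two-dimensional embedding `Y` and the projection matrix `P`, which can then be applied to new spectra as `X_new @ P`. In the pipeline the abstract describes, `X` would be the joint matrix rather than the full spectrum.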
Received: 2019-11-10
Accepted: 2020-03-19
Corresponding Author:
QIN Yu-hua
E-mail: yuu71@163.com