Study on an Algorithm for Near Infrared Spectrum Multiclass Identification and Measurement Based on Feature Hierarchical Selection and Sample Fusion Degree
ZHU Cheng, GONG Hui-li*, DING Xiang-qian, HOU Rui-chun
College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China

Abstract To address the difficulty of obtaining the best feature subset from high-dimensional data and the low identification accuracy of existing models, this paper proposes an algorithm for near infrared spectrum identification and measurement based on feature hierarchical selection and sample fusion degree. First, the paper introduces the concept of jump degree and proposes a feature hierarchical method that divides all features into subsets according to their importance to the samples, which avoids the complicated process of deleting unrelated features one by one when constructing a feature subset from the original feature data. Second, the paper improves the sample fusion degree and uses it, in place of probability, as the category judgment criterion of an improved KNN algorithm, which increases the precision of multiclass identification and alleviates the problem of low identification accuracy. To verify the validity of the algorithm, 382 representative tobacco samples of five kinds were chosen to build tobacco producing area identification models, and 64 tobacco samples were chosen as test samples. Finally, the algorithm was compared with other algorithms, using Root Mean Square Error of Prediction (RMSEP), Root Mean Square Error of Cross Validation (RMSECV) and correlation coefficient (r) as the evaluation indices of stability, and identification accuracy as the evaluation standard. The experimental results show that the model constructed by the proposed algorithm has better stability, with lower RMSEP (0.117), lower RMSECV (0.106) and higher r (0.973), and the highest identification accuracy, reaching 98.44%. The proposed algorithm shows excellent identification performance for high-dimensional spectral data.
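The evaluation indices named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of root mean square error, the Pearson correlation coefficient, and classification accuracy. The function names and the toy data are hypothetical, and the abstract does not state exactly how the authors computed these values. RMSEP and RMSECV share the same formula and differ only in which predictions are supplied (held-out test predictions versus cross-validation predictions).

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values.

    Passing test-set predictions gives RMSEP; passing cross-validation
    predictions gives RMSECV (assumed standard usage).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient r between two sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return correct / len(labels_true)


# Toy illustration only (not the paper's data): 64 test samples, one wrong.
true_labels = [i % 5 for i in range(64)]
pred_labels = list(true_labels)
pred_labels[0] = (pred_labels[0] + 1) % 5
print(f"accuracy = {accuracy(true_labels, pred_labels):.2%}")
```

As an arithmetic check, 63 correct out of 64 test samples gives 63/64 = 0.984375, which rounds to the reported 98.44%. The abstract does not give the number of correctly identified samples, so this is an inference from the reported figures.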

Received: 2015-12-04
Accepted: 2016-05-16

Corresponding Author: GONG Hui-li
E-mail: huiligong@163.com