Research Advance of Variable Selection Algorithms in Near Infrared Spectroscopy Analysis
SONG Xiang-zhong, TANG Guo, ZHANG Lu-da, XIONG Yan-mei, MIN Shun-geng*
College of Science, China Agricultural University, Beijing 100193, China
Abstract With a growing understanding of near infrared spectroscopy, researchers have come to realize that near infrared analysis models can be simplified by removing redundant variables from the full spectrum. A simplified model built on the retained informative variables is easier to interpret. Moreover, numerous applications have shown that variable selection can improve both the prediction performance and the robustness of calibration models. Variable selection has therefore become a critical step in constructing near infrared spectroscopy analysis models, and chemometricians have developed many variable selection algorithms and derivatives of them. To give researchers in near infrared spectroscopy analysis a quick overview, this article reviews the variable selection algorithms commonly used in the field, including their main rationales and characteristics. The algorithms are divided into five categories according to what they are based on: parameters of the partial least squares (PLS) model, intelligent optimization algorithms, the successive projections strategy, the model population analysis strategy, and spectral intervals. In surveying the literature, we find two main development trends: first, newly proposed algorithms keep increasing in complexity; second, combinations of different algorithms are becoming more and more popular. We also summarize several practical problems that may occur when variable selection algorithms are applied in near infrared spectroscopy analysis. For example, how do different spectral pretreatment methods affect the performance of a variable selection algorithm? How can the poor stability and reliability of some variable selection algorithms be addressed?
Received: 2015-11-29
Accepted: 2016-04-06
Corresponding Author:
MIN Shun-geng
E-mail: minsg@cau.edu.cn