Random Forests-Based Hybrid Feature Selection Algorithm for Soil Potassium Content Inversion Using Hyperspectral Technology
WANG Xuan-hui1, 2, ZHENG Xi-lai1*, HAN Zhong-zhi2, WANG Xuan-li3, WANG Juan4
1. Key Lab of Marine Environmental Science and Ecology, Ministry of Education, College of Environmental Science and Engineering, Ocean University of China, Qingdao 266100, China
2. Science and Information College, Qingdao Agricultural University, Qingdao 266109, China
3. Information Engineering and Automation Department, Shanxi Institute of Technology, Yangquan 045000, China
4. The Environmental Monitoring Center of North China Sea, State Oceanic Administration, Qingdao 266033, China
Abstract: Key features are difficult to extract from hyperspectral data of soil available potassium, which lowers prediction performance. To address this problem, this paper proposes a novel hybrid feature selection algorithm based on Random Forests. First, wrapper-based feature selection methods are applied to rapidly remove redundant features and preserve relevant ones. Second, an Improved-RF feature selection algorithm is applied to the pre-selected feature sets to select wavelength variables more precisely. In this step, characteristic wavelengths with strong robustness and discriminative power are selected by increasing the degree of separation between key and redundant features and by using an iterative feature selection method. The hybrid algorithm is thereby intended to improve the prediction performance of the soil available potassium inversion model. To verify the algorithm, 124 representative soil samples collected from the Dagu River Basin were used. The algorithm selected an optimal feature subset of 13 sensitive bands, which was used to build an inversion model for soil available potassium content. Model performance was compared across the full bands, existing feature selection algorithms, and the proposed algorithm. The results indicate that the proposed algorithm not only selects the fewest wavelength features, reducing the dimensionality relative to the full bands, but also achieves better prediction performance, with a lower RMSEP (9.6615), a higher R (0.9369), and a higher RPD (2.14). As an effective method for building soil available potassium inversion models, the proposed algorithm can provide a theoretical basis for the design of real-time soil nutrient sensors.
Key words: Soil available potassium content; Hyperspectral; Characteristic wavelength selection; Hybrid feature selection; Random forests
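The two-stage idea described in the abstract (a fast pre-selection followed by an iterative, Random-Forest-driven refinement) can be illustrated with a minimal sketch. This is not the authors' Improved-RF algorithm, whose details are not given here. It assumes a simple correlation ranking as a stand-in for the first stage and a generic backward elimination on Random Forest importances, scored by out-of-bag R², for the second. The function names, parameter values, and synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def prefilter_by_correlation(X, y, top_k):
    """Stage 1 (assumed stand-in): keep the top_k bands most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant bands get a correlation of 0 instead of a division by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom == 0, np.inf, denom)
    return np.argsort(corr)[::-1][:top_k]


def iterative_rf_selection(X, y, candidates, min_features=3, drop_frac=0.2, seed=0):
    """Stage 2 (generic sketch): repeatedly drop the least important bands
    and remember the subset with the best out-of-bag R^2."""
    current = np.asarray(candidates)
    best_subset, best_score = current, -np.inf
    while True:
        rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=seed)
        rf.fit(X[:, current], y)
        if rf.oob_score_ > best_score:
            best_subset, best_score = current, rf.oob_score_
        if len(current) <= min_features:
            break
        n_drop = max(1, int(drop_frac * len(current)))
        n_keep = max(min_features, len(current) - n_drop)
        order = np.argsort(rf.feature_importances_)[::-1]  # most important first
        current = current[order[:n_keep]]
    return np.sort(best_subset), best_score


# Synthetic stand-in data: 124 samples (as in the study) and 60 made-up "bands".
rng = np.random.default_rng(0)
X = rng.normal(size=(124, 60))
y = 3.0 * X[:, 5] - 2.0 * X[:, 20] + X[:, 41] + rng.normal(scale=0.3, size=124)

candidates = prefilter_by_correlation(X, y, top_k=20)
selected, oob_r2 = iterative_rf_selection(X, y, candidates)
print("selected bands:", selected, "OOB R^2:", round(oob_r2, 3))
```

The out-of-bag score is used here only because it gives a validation estimate without a separate hold-out set; a cross-validated error would serve the same purpose.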
王轩慧,郑西来,韩仲志,王轩力,王 娟. 混合式随机森林的土壤钾含量高光谱反演[J]. 光谱学与光谱分析, 2018, 38(12): 3883-3889.
WANG Xuan-hui, ZHENG Xi-lai, HAN Zhong-zhi, WANG Xuan-li, WANG Juan. Random Forests-Based Hybrid Feature Selection Algorithm for Soil Potassium Content Inversion Using Hyperspectral Technology. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2018, 38(12): 3883-3889.
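The three figures of merit reported in the abstract (RMSEP, R, and RPD) can be computed as in the sketch below. It assumes the definitions commonly used in chemometrics: RMSEP as the root mean squared error on the prediction set, R as the Pearson correlation between measured and predicted values, and RPD as the sample standard deviation of the measured values divided by RMSEP. The exact formulas used in the paper are not stated in the abstract, and the sample values are made up for illustration.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation between measured and predicted values."""
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD of measured values over RMSEP.
    Assumes the sample SD (n - 1 denominator); some authors use n instead."""
    n = len(y_true)
    mt = sum(y_true) / n
    sd = math.sqrt(sum((t - mt) ** 2 for t in y_true) / (n - 1))
    return sd / rmsep(y_true, y_pred)


# Hypothetical measured and predicted values, for illustration only.
measured = [100.0, 120.0, 140.0, 160.0, 180.0]
predicted = [105.0, 115.0, 145.0, 155.0, 185.0]

print("RMSEP:", rmsep(measured, predicted))
print("R:", pearson_r(measured, predicted))
print("RPD:", rpd(measured, predicted))
```

A lower RMSEP and higher R and RPD indicate better predictions, which is the sense in which the abstract compares the models.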