Hyperspectral Estimation of Soil Lead Content Based on Random Frog Band Selection Algorithm
AN Bai-song1, 2, WANG Xue-mei1, 2*, HUANG Xiao-yu1, 2, KAWUQIATI Bai-shan1, 2
1. College of Geographic Science and Tourism of Xinjiang Normal University, Urumqi 830054, China
2. Xinjiang Uygur Autonomous Region Key Laboratory “Xinjiang Arid Lake Environment and Resources Laboratory”, Urumqi 830054, China
Abstract: Hyperspectral data contain a large amount of redundant information, which greatly reduces the accuracy of hyperspectral estimation. This study seeks the best algorithm for screening feature bands so that the content of the heavy metal lead in soil can be monitored accurately, and aims to provide a reference for soil pollution prevention and control. Lead contents and spectral data from oasis soils of the Weigan-Kuqa river delta in Xinjiang were used as the data source. Ninety-two valid soil samples were identified with the Monte Carlo cross-validation (MCCV) algorithm, and the spectra processed by the first-order differential transformation of the reciprocal logarithm were selected through correlation analysis. The random frog (RF) algorithm was combined with the competitive adaptive reweighted sampling (CARS) algorithm, the iteratively retains informative variables (IRIV) algorithm and the successive projections algorithm (SPA) to construct the RF-CARS, RF-IRIV and RF-SPA algorithms for band screening. With the reflectance of the feature bands as the independent variable and the soil lead content as the dependent variable, extreme gradient boosting (XGBoost) and geographically weighted regression (GWR) were used to build estimation models of soil lead content. The results show that: (1) Spectral transformation effectively enhances the sensitivity of the spectrum to lead content. The spectral features are pronounced after the first-order differential transformation of the reciprocal logarithm, and the correlation coefficient can reach 0.620 (p<0.001). (2) The RF-CARS, RF-IRIV and RF-SPA algorithms extract 6, 9 and 7 feature bands, respectively, from the hyperspectral data, all located in the near-infrared spectral region. All three algorithms show strong feature extraction ability and greatly reduce redundant information in the spectral data. (3) The soil lead content estimation models built on the RF-IRIV algorithm are more accurate and more stable than those built on RF-CARS and RF-SPA, indicating that RF-IRIV more accurately retains the bands related to soil lead content. In addition, the GWR model performs better than the XGBoost model, and the resulting RF-IRIV-GWR model has good predictive ability and can serve as the optimal estimation model for soil lead content in the study area. On the validation set, the R², RMSE and RPD of the RF-IRIV-GWR model are 0.892, 0.825 mg·kg⁻¹ and 3.09, respectively. Combining the RF and IRIV band selection algorithms with GWR modeling has certain advantages in estimating soil lead content quickly and accurately, and the approach can be used for dynamic monitoring of soil heavy metal pollution.
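The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of a base-10 logarithm, the finite-difference derivative via `numpy.gradient`, and the definition of RPD as the sample standard deviation of the measured values divided by RMSE are all assumptions made for illustration; the abstract does not specify these details. Band selection (RF-CARS, RF-IRIV, RF-SPA) and the XGBoost and GWR models are not reproduced here.

```python
import numpy as np


def log_reciprocal_first_derivative(reflectance, wavelengths):
    """First-order derivative of log10(1/R) along the band axis.

    reflectance: array of shape (n_samples, n_bands), values in (0, 1].
    wavelengths: array of shape (n_bands,), band centers (e.g. in nm).
    Assumption: base-10 logarithm and a finite-difference derivative.
    """
    absorbance = np.log10(1.0 / reflectance)
    return np.gradient(absorbance, wavelengths, axis=1)


def band_correlations(spectra, target):
    """Pearson correlation between each band and the target variable.

    spectra: array of shape (n_samples, n_bands).
    target: array of shape (n_samples,), e.g. measured lead content.
    Bands with zero variance are not handled in this sketch.
    """
    xc = spectra - spectra.mean(axis=0)
    yc = target - target.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (xc * yc[:, None]).sum(axis=0) / denom


def evaluate(y_true, y_pred):
    """Return (R2, RMSE, RPD) for measured and predicted values.

    Assumption: RPD = sample standard deviation of y_true / RMSE.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    rmse = np.sqrt(np.mean(residual ** 2))
    r2 = 1.0 - (residual ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()
    rpd = np.std(y_true, ddof=1) / rmse
    return r2, rmse, rpd
```

In use, the transformed spectra would be passed to `band_correlations` together with the measured lead contents to rank bands, and `evaluate` would be applied to the validation-set predictions of whichever model is fitted on the selected bands.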
AN Bai-song, WANG Xue-mei, HUANG Xiao-yu, KAWUQIATI Bai-shan. Hyperspectral Estimation of Soil Lead Content Based on Random Frog Band Selection Algorithm[J]. Spectroscopy and Spectral Analysis, 2023, 43(10): 3302-3309.