Prediction of Oil Content in Oil Shale by Near-Infrared Spectroscopy Based on Stacking Ensemble Learning
LI Quan-lun1, CHEN Zheng-guang1*, JIAO Feng2
1. College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Daqing 163319, China
2. College of Agriculture, Heilongjiang Bayi Agricultural University, Daqing 163319, China
Abstract To overcome the difficulty of further improving the prediction accuracy of a single model, this study adopted a heterogeneous ensemble learning model based on the Stacking framework, combined with near-infrared (NIR) spectroscopy, to determine the oil content of oil shale. A total of 230 oil shale core samples, collected from a block in the Songliao Basin, were taken as the research object. Their oil content was measured by the low-temperature dry distillation method, and the NIR spectrum of each sample was acquired at the same time. The Monte Carlo algorithm was employed to eliminate outlier samples, and the remaining 213 samples were randomly divided into a training set and a test set at a ratio of 3:1. Detrending coupled with baseline correction was used to eliminate the influence of noise and baseline drift in the spectral data. The random forest (RF) algorithm was then used to extract characteristic wavelengths according to wavelength importance, and the competitive adaptive reweighted sampling (CARS) algorithm was applied to the selected wavelengths to further reduce the data dimension. Finally, PLS, SVM, RF and GBDT, whose parameters were optimized by grid search, were adopted as primary learners, and a PLS regression model was adopted as the secondary learner to build the stacking ensemble learning model. The accuracy of the single and ensemble learning models for oil shale oil content prediction was compared using R² and RMSE as evaluation indicators. The results show that the RF-CARS method can effectively screen important wavelengths, thereby improving model efficiency. The heterogeneous ensemble learning model based on Stacking has better predictive performance and greater stability than the single models (SVM, PLS) and the homogeneous ensemble learning models (RF, GBDT). Over multiple random divisions of the data set, the average R² of the Stacking ensemble learning model is 0.8942, an average increase of 0.0623 over the other models; its RMSEP of 0.5869 is on average 0.1474 lower than that of the other models. These results show that the heterogeneous ensemble learning model based on Stacking can combine the advantages of the primary learners to predict the oil content of oil shale quickly and accurately, providing a new fast and portable method for oil shale oil content detection.
Received: 2022-02-07
Accepted: 2022-05-04
Corresponding Authors:
CHEN Zheng-guang
E-mail: ruzee@sina.com