Identification of Boletus Species Based on Discriminant Analysis of Partial Least Squares and Random Forest Algorithm
CHEN Feng-xia1, YANG Tian-wei2, LI Jie-qing1, LIU Hong-gao3, FAN Mao-pan1*, WANG Yuan-zhong4*
1. College of Resources and Environmental Sciences, Yunnan Agricultural University, Kunming 650201, China
2. Yunnan Institute for Tropical Crops Research, Jinghong 666100, China
3. College of Agronomy and Biotechnology, Yunnan Agricultural University, Kunming 650201, China
4. Institute of Medicinal Plants, Yunnan Academy of Agricultural Sciences, Kunming 650200, China
Abstract Boletus is a well-known wild edible mushroom of considerable dietary and economic value. It comprises many species that are difficult to tell apart, so a rapid, effective and reliable species identification method would help safeguard the quality of boletus products. In this study, 683 samples of 7 wild bolete species were collected from different regions of Yunnan, their mid-infrared (MIR) and ultraviolet (UV) spectra were acquired, and the average spectral characteristics of the species were analyzed. Single-spectrum data treated with several preprocessing combinations (SNV+SG, 2D+MSC+SNV, 1D+MSC+SNV+SG, MSC+2D) were combined with two feature extraction methods, principal component analysis (PCA) and latent variables (LVs), and partial least squares discriminant analysis (PLS-DA) and random forest (RF) models were built with data fusion strategies to identify the species. The results show: (1) The average spectra of the different species differ only slightly in absorption peaks and subtly in absorbance, in both the MIR and the UV spectra. (2) Appropriate preprocessing improves the information content of the spectral data. The best preprocessing combinations for the MIR and UV data with the PLS-DA and RF models are, in the order reported, 2D+MSC+SNV, SNV+SG, 2D+MSC+SNV and 1D+MSC+SNV+SG. (3) Among the single-spectrum models, the MIR models outperform the UV models. With the best MIR preprocessing combination, 2D+MSC+SNV, the PLS-DA model has an accuracy of 99.78% on the training set and 99.12% on the validation set. The RF model has an accuracy of 93.20% on the training set and 99% on the validation set. (4) Data fusion improves classification accuracy. With low-level fusion, the PLS-DA model reaches 100% on the training set and 99.12% on the validation set, and the RF model reaches 92.32% and 99.14%. (5) For the RF model with intermediate (mid-level) data fusion, LVs give 92.76% on the training set and 96% on the validation set, and PCA gives 97.15% and 100%. (6) For the PLS-DA model with intermediate data fusion, LVs give 100% on the training set and 99.56% on the validation set, and PCA gives 100% on both sets. PLS-DA and RF combined with data fusion therefore identify boletus species satisfactorily, and PLS-DA with intermediate data fusion based on PCA can serve as a low-cost, efficient technique for identifying boletus species.
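As an illustration of two steps named in the abstract, the following is a minimal sketch of SNV preprocessing followed by low-level fusion, in which the preprocessed MIR and UV spectra of each sample are concatenated into one feature vector. It is not the authors' implementation: the function names, the toy data and the use of the sample standard deviation are assumptions, and applying SNV alone to both blocks is a simplification of the preprocessing combinations reported above.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum
    by its own mean and standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # assumption: sample standard deviation (n - 1)
    return [(x - m) / s for x in spectrum]


def low_level_fusion(mir_spectra, uv_spectra):
    """Concatenate the preprocessed MIR and UV spectra sample by sample."""
    if len(mir_spectra) != len(uv_spectra):
        raise ValueError("both blocks must hold the same samples in the same order")
    return [snv(m) + snv(u) for m, u in zip(mir_spectra, uv_spectra)]


# Hypothetical toy data: 2 samples, 4 MIR points and 3 UV points each.
mir = [[0.10, 0.20, 0.40, 0.30], [0.20, 0.40, 0.80, 0.60]]
uv = [[1.0, 2.0, 3.0], [2.0, 2.5, 3.0]]

fused = low_level_fusion(mir, uv)  # one fused row per sample
```

The fused rows would then be passed to a classifier such as PLS-DA or random forest. Intermediate fusion differs in that each block is first reduced to a few features (PCA scores or LVs) and those features, rather than the full spectra, are concatenated.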
Received: 2021-01-04
Accepted: 2021-02-01
Corresponding Authors:
FAN Mao-pan, WANG Yuan-zhong
E-mail: mpfan@126.com;boletus@126.com