Geographic Origin Discrimination of Wood Using NIR Spectroscopy Combined With Machine Learning Techniques

doi:10.3964/j.issn.1000-0593(2023)11-3372-08

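Abstract (Chinese-language version): Identifying the geographic origin of wood with near-infrared spectroscopy depends on the spectral preprocessing method and the calibration model, yet most studies that use near-infrared spectroscopy for this purpose rely on classical linear models. Building a geographic traceability system for wood helps promote the healthy development of the timber market, combat illegal logging, and protect endangered tree species. To improve the efficiency of wood origin identification, a method combining near-infrared spectroscopy with machine learning is proposed. First, spectral datasets of wood origin were built: spectra were collected from Mongolian Scots pine (樟子松), paulownia (泡桐), zelkova (榉木), teak (柚木), linden (椴木), and tree of heaven (臭椿), each from two different origins, with each species forming one dataset, and the feature dimension was reduced to two to explore the data distribution of each dataset. Second, feature engineering was applied to the raw spectra: principal component analysis and linear discriminant analysis were each used to reduce the dimensionality of the high-dimensional spectral data in order to improve model generalization, and the effects of the two techniques on model accuracy were compared. Finally, wood origin discrimination models were built. Six algorithms, namely support vector machine, logistic regression, K-nearest neighbors, naive Bayes, random forest, and artificial neural network, were chosen to represent nonlinear, regression, classification, probabilistic, ensemble, and deep learning approaches, respectively. Learning curves, grid search, and K-fold cross-validation were used to tune model parameters and so improve accuracy and robustness, and the models were evaluated on both accuracy and running time. The results show that near-infrared spectroscopy combined with machine learning is an effective means of identifying the geographic origin of wood: the accuracies for Mongolian Scots pine, paulownia, zelkova, teak, linden, and tree of heaven reached 98.3%, 100%, 100%, 100%, 100%, and 98.3%, respectively, with corresponding model running times of 0.183, 0.182, 0.181, 0.182, 11.424, and 12.969 s. A comprehensive analysis of the six models across the datasets shows that the nonlinear support vector machine and artificial neural network models have an advantage over the others. In particular, the model based on the artificial neural network performed best, achieving the highest recognition rate on every dataset, but its running time was far longer than that of the other algorithms.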

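Keywords (Chinese-language version): machine learning; near-infrared spectroscopy; wood origin identification; principal component analysis; linear discriminant analysis; artificial neural network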

Abstract: The illegal logging of valuable tree species is driven mainly by the global market for logs, lumber, veneers, and furniture. Rapid and reliable identification of the country of origin of protected timbers is one measure for combating illegal logging. There is a global need for a wood origin identification system to ensure the integrity of the wood supply and to control the trade, exploitation, and smuggling of these products. Near-infrared spectroscopy (NIRS) is a promising technique for rapid, calibration-based species identification. In the present work, NIRS combined with machine learning techniques was used to discriminate the geographic origin of six wood species (Pinus massoniana, Paulownia fortunei, Zelkova schneideriana, Tectona grandis, Tilia amurensis, Ailanthus altissima), each sampled from two regions. First, a spectral dataset of wood origins was created by collecting spectra of these six species from two distinct origins, with each species constituting one dataset; the feature dimensionality was then reduced to two to investigate the data distribution within each dataset. Second, the high-dimensional spectral data were reduced in dimension using principal component analysis and linear discriminant analysis, respectively, to improve the models' generalization and to compare the effects of the two techniques on model accuracy. Finally, six machine learning algorithms, namely support vector machine, logistic regression, K-nearest neighbors, naïve Bayes, random forest, and artificial neural network, were trained on the spectra of these wood samples, and their discrimination performance was assessed. The results showed that the highest accuracies for Pinus massoniana, Paulownia fortunei, Zelkova schneideriana, Tectona grandis, Tilia amurensis, and Ailanthus altissima were 98.3%, 100%, 100%, 100%, 100%, and 98.3%, and the shortest running times were 0.183, 0.182, 0.181, 0.182, 11.424, and 12.969 s, respectively. We evaluated and compared the performance of six models based on different machine learning algorithms for predicting the geographic origin of the wood. The artificial neural network gave the best results of the six models, but it ran longer than the other algorithms and required more parameters to be tuned and optimized. Both the linear and non-linear algorithms gave good results, with the non-linear models performing slightly better. The study shows that NIRS assisted by machine learning is suitable for the rapid identification and discrimination of wood origin and can be an essential tool for tracing the origins of wood, contributing to authentication that is quick, relatively cheap, and non-destructive.

Key words: Machine learning; Near-infrared spectroscopy; Wood origin identification; Principal component analysis; Linear discriminant analysis; Artificial neural network
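
The workflow summarized in the abstract (dimensionality reduction with PCA or LDA, six classifiers tuned by grid search with K-fold cross-validation, and evaluation by accuracy and running time) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, which the abstract does not name. The synthetic spectra, the function names `make_synthetic_spectra` and `evaluate`, the number of PCA components, the parameter grids, and the choice to time the grid-search fit are all assumptions made for illustration, not the authors' data or settings.

```python
import time

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_origin=100, n_wavelengths=50, seed=0):
    """Simulate spectra of one species from two origins (NOT real NIR data)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_wavelengths)
    baseline = np.exp(-((grid - 0.4) ** 2) / 0.02)              # band shared by both origins
    origin_band = 0.15 * np.exp(-((grid - 0.7) ** 2) / 0.005)   # band present only in origin 1
    X, y = [], []
    for origin in (0, 1):
        noise = 0.05 * rng.standard_normal((n_per_origin, n_wavelengths))
        X.append(baseline + origin * origin_band + noise)
        y.append(np.full(n_per_origin, origin))
    return np.vstack(X), np.concatenate(y)


def evaluate(X, y, seed=0):
    """Return {(reducer, model): (test accuracy, grid-search time in seconds)}."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    reducers = {
        "PCA": PCA(n_components=5),            # component count is an assumed value
        "LDA": LinearDiscriminantAnalysis(),   # two classes give one discriminant axis
    }
    # Hypothetical, deliberately small parameter grids.
    models = {
        "SVM": (SVC(), {"clf__C": [0.1, 1, 10]}),
        "LR": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1, 10]}),
        "KNN": (KNeighborsClassifier(), {"clf__n_neighbors": [3, 5, 7]}),
        "NB": (GaussianNB(), {"clf__var_smoothing": [1e-9, 1e-6]}),
        "RF": (RandomForestClassifier(random_state=seed),
               {"clf__n_estimators": [50, 100]}),
        "ANN": (MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                              random_state=seed),
                {"clf__alpha": [1e-4, 1e-2]}),
    }
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    results = {}
    for r_name, reducer in reducers.items():
        for m_name, (clf, grid) in models.items():
            pipe = Pipeline(
                [("scale", StandardScaler()), ("reduce", reducer), ("clf", clf)]
            )
            search = GridSearchCV(pipe, grid, cv=cv, scoring="accuracy")
            start = time.perf_counter()
            search.fit(X_tr, y_tr)             # K-fold grid search on training data only
            elapsed = time.perf_counter() - start
            results[(r_name, m_name)] = (search.score(X_te, y_te), elapsed)
    return results


X, y = make_synthetic_spectra()
results = evaluate(X, y)
for (reducer, model), (acc, secs) in sorted(results.items()):
    print(f"{reducer}+{model}: accuracy={acc:.3f}, tuning time={secs:.2f} s")
```

Placing the scaler and the reducer inside the pipeline means they are refit on each training fold, so the cross-validated scores are not influenced by held-out samples. Because each dataset has only two origins, LDA yields at most one discriminant axis in this sketch, whereas the number of PCA components is a free choice.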

Received: 2022-05-14 Revised: 2022-10-08

Chinese Library Classification (CLC) number: O439

Funding: Supported by the National Key Research and Development Program of China (2016YFD0600703)

Corresponding author: NA Bin E-mail: nabin8691@126.com

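About the author: LUO Li, born in 1998, master's student at the College of Materials Science and Engineering, Nanjing Forestry University. E-mail: luoli0044@163.com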

Cite this article:

骆立，王静仪，徐兆军，那斌. 基于近红外光谱技术建立木材产地鉴别模型[J]. 光谱学与光谱分析, 2023, 43(11): 3372-3379.
LUO Li, WANG Jing-yi, XU Zhao-jun, NA Bin. Geographic Origin Discrimination of Wood Using NIR Spectroscopy Combined With Machine Learning Techniques. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2023, 43(11): 3372-3379.

Link to this article:

https://www.gpxygpfx.com/CN/10.3964/j.issn.1000-0593(2023)11-3372-08 or https://www.gpxygpfx.com/CN/Y2023/V43/I11/3372