Abstract: The illegal logging of valuable tree species is driven mainly by the global market for logs, lumber, veneers, and furniture. Rapid and reliable identification of the country of origin of protected timbers is one of the measures for combating illegal logging. There is a global need for a wood origin identification system that ensures the integrity of the wood supply and helps control the trade, exploitation, and smuggling of these products. Near-infrared spectroscopy (NIRS) is a promising technique for calibration-based and rapid species identification. In the present work, NIRS combined with machine learning techniques was used to discriminate the origin of six wood species (Pinus massoniana, Paulownia fortunei, Zelkova schneideriana, Tectona grandis, Tilia amurensis, Ailanthus altissima), each sourced from two regions. First, a spectral dataset of tree origins was created by collecting spectra of these six wood species from two distinct origins, with each species constituting one dataset. Second, the feature dimensionality was reduced to two dimensions to investigate the data distribution within each dataset. Third, the high-dimensional spectral data were reduced using principal component analysis and linear discriminant analysis, respectively, to improve the models' generalization and to compare the effects of the two techniques on model accuracy. Finally, six machine learning algorithms, namely support vector machine, logistic regression, K-nearest neighbors, naïve Bayes, random forest, and artificial neural network, were trained on the spectra of these wood samples and their discrimination performance was assessed. The results showed that the highest accuracies for Pinus massoniana, Paulownia fortunei, Zelkova schneideriana, Tectona grandis, Tilia amurensis, and Ailanthus altissima were 98.3%, 100%, 100%, 100%, 100%, and 98.3%, and the shortest running times were 0.183, 0.182, 0.181, 0.182, 11.424, and 12.969 s, respectively.
We evaluated and compared the performance of six models based on different machine learning algorithms for predicting the geographic origin of wood. Of the six models, the artificial neural network gave the best results, but it ran longer than the other algorithms and required more parameters to be tuned and optimized. Both the linear and the non-linear algorithms performed well, with the non-linear models appearing slightly better. The study shows that NIRS assisted by machine learning is suitable for the rapid identification and discrimination of wood origin. It can serve as an essential tool for tracing the origin of wood and offers a quick, relatively cheap, and non-destructive authentication method.
Keywords: Machine learning; Near-infrared spectroscopy; Wood origin identification; Principal component analysis; Linear discriminant analysis; Artificial neural network
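The workflow summarized in the abstract (dimensionality reduction by PCA or LDA, followed by six classifiers compared on accuracy and running time) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the spectra are synthetic stand-ins generated by a hypothetical `make_synthetic_spectra` helper, and the number of components, the network size, and all other hyperparameters are assumptions chosen only to make the example run.

```python
import time

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_origin=100, n_wavelengths=40, seed=0):
    """Hypothetical stand-in for NIR spectra of one species from two origins."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_wavelengths)
    baseline = np.exp(-((grid - 0.5) ** 2) / 0.05)            # band shared by both origins
    origin_band = 0.15 * np.exp(-((grid - 0.7) ** 2) / 0.01)  # band present only in origin 1
    spectra, labels = [], []
    for origin in (0, 1):
        noise = rng.normal(scale=0.05, size=(n_per_origin, n_wavelengths))
        spectra.append(baseline + origin * origin_band + noise)
        labels.append(np.full(n_per_origin, origin))
    return np.vstack(spectra), np.concatenate(labels)


def compare_models(X, y, cv=5):
    """Cross-validate every reducer/classifier pair; return accuracy and wall time."""
    reducers = {
        "PCA": lambda: PCA(n_components=5),  # assumed component count
        # With two origins, LDA yields at most one discriminant axis.
        "LDA": lambda: LinearDiscriminantAnalysis(n_components=1),
    }
    classifiers = {
        "SVM": lambda: SVC(kernel="rbf"),
        "LR": lambda: LogisticRegression(max_iter=1000),
        "KNN": lambda: KNeighborsClassifier(n_neighbors=5),
        "NB": lambda: GaussianNB(),
        "RF": lambda: RandomForestClassifier(n_estimators=100, random_state=0),
        "ANN": lambda: MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                     random_state=0),
    }
    results = {}
    for reducer_name, make_reducer in reducers.items():
        for clf_name, make_clf in classifiers.items():
            # Scaling and reduction are fitted inside each fold to avoid leakage.
            model = make_pipeline(StandardScaler(), make_reducer(), make_clf())
            start = time.perf_counter()
            scores = cross_val_score(model, X, y, cv=cv)
            elapsed = time.perf_counter() - start
            results[(reducer_name, clf_name)] = (scores.mean(), elapsed)
    return results


if __name__ == "__main__":
    X, y = make_synthetic_spectra()
    for (reducer_name, clf_name), (acc, secs) in compare_models(X, y).items():
        print(f"{reducer_name}+{clf_name}: accuracy={acc:.3f}, time={secs:.3f} s")
```

Placing the reducer inside the pipeline means PCA and LDA are refitted on the training portion of each fold, so the reported accuracy is not inflated by information from the held-out spectra. The timings printed here cover the whole cross-validation run and are not comparable to the running times reported in the abstract.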