Application of Near Infrared Spectroscopy Combined with Comparative Principal Component Analysis for Pesticide Residue Detection in Fruit
CHEN Shu-yi¹, ZHAO Quan-ming¹, DONG Da-ming²*
1. Hebei University of Technology, Tianjin 300401, China
2. Beijing Research Center for Intelligent Equipment for Agriculture, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Abstract: Near-infrared (NIR) spectroscopy is considered a promising chemical analysis technique because it is convenient, non-destructive, and fast. However, band assignment and structural interpretation of near-infrared spectra involve many unknown factors, which makes it difficult to extract characteristic spectral information. Although a variety of dimensionality reduction methods are widely used for spectral data, traditional methods are restricted to a single dataset, and their results are often unsatisfactory when many factors contribute to the variation in that dataset. This makes it very hard to build a dimensionality reduction model for near-infrared spectra. Comparative principal component analysis (cPCA), more commonly called contrastive PCA, is an improved algorithm based on principal component analysis (PCA); it originated in contrastive learning and has been applied to genomic data analysis. Its advantage is that it performs dimensionality reduction across two related datasets. In this paper, the cPCA algorithm is applied to near-infrared spectroscopy for the first time to establish an accurate spectral dimensionality reduction model. In the experiment, we used cPCA to analyze spectra from the surfaces of two types of fruit (apples and pears) with and without pesticide residues. The results showed that PCA distinguished only the fruit types, whereas cPCA, constrained by the background dataset, separated fruits with pesticide residues from those without. This shows that cPCA outperforms PCA for dimensionality reduction of near-infrared spectra: it addresses the single-dataset limitation and the difficulty of extracting feature information, and it can establish an accurate spectral dimensionality reduction model.
Keywords: Near-infrared spectroscopy; cPCA; Data dimensionality reduction; Model establishment
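
The idea of reducing dimensionality across two related datasets can be illustrated with a minimal sketch. It assumes the commonly published formulation of contrastive PCA, in which the projection directions are the leading eigenvectors of the target covariance minus a weight alpha times the background covariance. The function name cpca, the parameter alpha, and the synthetic three-feature data are hypothetical choices made for illustration; they are not the authors' code, spectra, or parameter settings.

```python
import numpy as np


def cpca(target, background, alpha=1.0, n_components=2):
    """Sketch of contrastive PCA: top eigenvectors of C_target - alpha * C_background."""
    # Center each dataset on its own mean before forming covariances.
    target_centered = target - target.mean(axis=0)
    background_centered = background - background.mean(axis=0)

    # Sample covariance matrices, shape (n_features, n_features).
    cov_target = np.cov(target_centered, rowvar=False)
    cov_background = np.cov(background_centered, rowvar=False)

    # Contrast matrix: variance present in the target but not in the background.
    contrast = cov_target - alpha * cov_background

    # The contrast matrix is symmetric, so eigh applies (eigenvalues ascending).
    eigenvalues, eigenvectors = np.linalg.eigh(contrast)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order]

    # Project the centered target data onto the contrastive directions.
    return target_centered @ components, components


# Synthetic stand-in for spectra (assumption, not measured data):
# feature 0 carries large "fruit type" variation shared by both datasets,
# feature 1 carries a small "residue" signal that exists only in the target.
rng = np.random.default_rng(0)
n = 200
background = np.column_stack([
    rng.normal(0.0, 5.0, n),
    rng.normal(0.0, 0.1, n),
    rng.normal(0.0, 0.1, n),
])
residue = rng.integers(0, 2, n)  # hypothetical label: 1 = residue present
target = np.column_stack([
    rng.normal(0.0, 5.0, n),
    2.0 * residue + rng.normal(0.0, 0.1, n),
    rng.normal(0.0, 0.1, n),
])

# alpha = 0 reduces to ordinary PCA on the target; alpha > 0 penalizes
# directions that are also high-variance in the background.
_, pca_dirs = cpca(target, background, alpha=0.0, n_components=1)
_, cpca_dirs = cpca(target, background, alpha=2.0, n_components=1)
print("PCA loading magnitudes: ", np.abs(pca_dirs[:, 0]).round(2))
print("cPCA loading magnitudes:", np.abs(cpca_dirs[:, 0]).round(2))
```

In this toy setting the ordinary PCA direction is dominated by the shared high-variance feature, while the contrastive direction is dominated by the feature that varies only in the target. The value of alpha is a tuning choice; in practice several values are usually tried.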