Study of Modeling Samples Selection Method Based on Near Infrared Spectrum
JIN Zhao-xi1, ZHANG Xiu-juan2, LUO Fu-yi2, AN Dong1, 3*, ZHAO Sheng-yi1, RAN Hang1, YAN Yan-lu1
1. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
2. Dezhou Municipal Bureau of Agriculture, Dezhou 253016, China
3. Key Laboratory of Agricultural Information Acquisition Technology (Beijing), Ministry of Agriculture, Beijing 100083, China
Abstract: We use near-infrared spectroscopy for qualitative analysis in the problem of classifying many wheat varieties. Enlarging the modeling sample set adds information to the model, but it also introduces redundancy, which increases modeling time and storage space; the modeling set therefore needs to be reduced through sample selection. Selecting samples blindly, however, loses information and degrades the model. Building on traditional selection methods, we propose k nearest neighbor-density sample selection. The experiments use near-infrared diffuse reflectance spectra of wheat seeds collected over many days. The original spectra are first preprocessed and passed through feature extraction; modeling samples are then selected by three methods (random sampling, k nearest neighbor, and k nearest neighbor-density); finally, BPR (Biomimetic Pattern Recognition) and BPRI (Biomimetic Pattern Recognition Improved) models are built. With the BPR model, k nearest neighbor-density selection gives the best results while also greatly reducing the modeling sample size. With the BPRI model, k nearest neighbor-density selection performs much better than random sampling and slightly better than k nearest neighbor, while using a much smaller modeling sample set than k nearest neighbor. These results show that k nearest neighbor-density sample selection greatly reduces the modeling sample size while preserving model quality, and that it is clearly effective for wheat variety classification.
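The abstract does not spell out how k nearest neighbor-density selection works, so the following is only a rough sketch of one way density-aware pruning could look, not the authors' algorithm. It assumes that the samples of one variety are already preprocessed feature vectors, that local density is estimated from the distance to the k-th nearest neighbor, and that samples lying inside the neighborhood of an already selected sample are treated as redundant. The function name `knn_density_select` and the parameter `k` are hypothetical.

```python
import numpy as np


def knn_density_select(X, k=3):
    """Return indices of a reduced modeling set (illustrative sketch only).

    X : (n_samples, n_features) feature vectors of a single class.
    k : neighbour rank used for the local density estimate (assumed parameter).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n <= k:
        # Too few samples to estimate a k-NN radius; keep everything.
        return np.arange(n)

    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Distance to the k-th nearest neighbour; index 0 of each sorted row is
    # the sample itself (distance 0). A small radius means a dense region.
    kth = np.sort(dist, axis=1)[:, k]

    # Visit samples from densest to sparsest.
    order = np.argsort(kth, kind="stable")
    removed = np.zeros(n, dtype=bool)
    selected = []
    for i in order:
        if removed[i]:
            continue
        selected.append(i)
        # Everything inside sample i's k-NN radius (including i) is
        # considered represented by i and is not selected later.
        removed |= dist[i] <= kth[i]
    return np.array(sorted(selected))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for extracted spectral features: two tight groups.
    features = np.vstack([
        rng.normal(0.0, 0.1, size=(20, 2)),
        rng.normal(10.0, 0.1, size=(20, 2)),
    ])
    kept = knn_density_select(features, k=3)
    print(f"kept {kept.size} of {features.shape[0]} samples")
```

In this sketch a larger `k` widens each neighborhood and prunes more aggressively; in a multi-class setting the function would be applied to each variety separately so that no class is emptied.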
JIN Zhao-xi, ZHANG Xiu-juan, LUO Fu-yi, AN Dong*, ZHAO Sheng-yi, RAN Hang, YAN Yan-lu. Study of Modeling Samples Selection Method Based on Near Infrared Spectrum [J]. Spectroscopy and Spectral Analysis, 2016, 36(12): 3920-3925.