Abstract: In the present paper, a simple but novel method based on the maximum linearly independent group (a maximal set of linearly independent vectors) was introduced into near-infrared (NIR) spectral analysis for selecting representative calibration samples. The experimental material comprised 2 652 tobacco powder samples, of which 1 001 were randomly selected as the prediction set; the remaining 1 651 formed the candidate set from which the calibration sample set was selected. Representative samples were chosen by locating a maximum linearly independent group among the spectral vectors of the candidate set, computed with the Matlab function rref(X,q). The spectral vectors in this group were taken as the calibration sample set. Different values of the calculation precision q yielded different numbers of representative samples. The selected calibration sample set was used to build a partial least squares (PLS) regression model for the total sugar content of tobacco powder, and the model was then applied to the 1 001 samples in the prediction set. With 32 representative samples, the model showed good predictive accuracy, with a mean relative prediction error of 3.621 0% and a correlation coefficient of 0.964 3. A paired-samples t-test showed no significant difference (α=0.05) between the predictions of the model built from 32 samples and those of the model built from 146 samples. Random selection of calibration samples was also compared with maximum linearly independent selection in terms of model prediction performance. For this comparison, six calibration sample sets were selected, containing 28, 32, 41, 76, 146 and 163 samples respectively. Maximum linearly independent selection turned out to be clearly better than random selection.
The results indicate that the proposed method not only makes NIR spectral analysis more cost-effective, by reducing the number of samples that require tedious and expensive chemical measurement, but also improves analysis accuracy. In conclusion, the method can be applied to the selection of representative samples in near-infrared spectral analysis.
Key words: NIRS; Representative sample selection; Maximum linearly independent group
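The selection step described in the abstract can be illustrated with a small sketch. The abstract states only that rref(X,q) in Matlab was used; everything below is an assumption made for illustration: that each spectrum is placed as a column of X, that the pivot columns of the row reduction identify the selected samples, and that q acts as a tolerance below which a remaining entry is treated as zero. The function names `pivot_columns` and `select_representative` and the toy spectra are hypothetical, and the code is a plain-Python stand-in, not Matlab's implementation.

```python
def pivot_columns(matrix, tol):
    """Return indices of the pivot columns found by Gauss-Jordan elimination.

    A column becomes a pivot only if, after earlier pivots have been
    eliminated, its largest remaining entry is greater than `tol`.
    """
    a = [[float(v) for v in row] for row in matrix]  # work on a copy
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    pivots = []
    r = 0  # next row that can hold a pivot
    for c in range(n_cols):
        if r >= n_rows:
            break
        # Partial pivoting: pick the largest entry in column c, rows r and below.
        p = max(range(r, n_rows), key=lambda i: abs(a[i][c]))
        if abs(a[p][c]) <= tol:
            # Column is (numerically) dependent on earlier pivot columns.
            for i in range(r, n_rows):
                a[i][c] = 0.0
            continue
        a[r], a[p] = a[p], a[r]
        pivot = a[r][c]
        a[r] = [v / pivot for v in a[r]]
        # Eliminate column c from every other row.
        for i in range(n_rows):
            if i != r and a[i][c] != 0.0:
                f = a[i][c]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    return pivots


def select_representative(spectra, tol):
    """Select sample indices whose spectra form a linearly independent set.

    `spectra` is a list of samples, each a list of absorbance values.
    Samples are placed as columns (assumed layout) before row reduction.
    """
    columns_as_samples = [list(row) for row in zip(*spectra)]
    return pivot_columns(columns_as_samples, tol)


# Toy data (hypothetical): 4 samples measured at 3 wavelengths.
toy_spectra = [
    [1.0, 0.0, 0.0],    # sample 0
    [2.0, 0.0, 0.0],    # sample 1: exact multiple of sample 0
    [0.0, 1.0, 0.0],    # sample 2
    [1.0, 1.0, 0.001],  # sample 3: almost sample 0 + sample 2
]

print(select_representative(toy_spectra, 1e-6))  # [0, 2, 3]
print(select_representative(toy_spectra, 0.01))  # [0, 2]
```

In this toy case a stricter tolerance keeps sample 3 as independent, while a looser one drops it, which mirrors the abstract's observation that different values of q give different numbers of representative samples. The number of selected samples can never exceed the number of wavelengths, since that bounds the rank of X.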