Study on Rapid Recognition of Microplastics Based on Infrared Spectroscopy
WU Xue1, 2, FENG Wei-wei2, 3, 4*, CAI Zong-qi2, 3, WANG Qing2, 3
1. Harbin Institute of Technology, Weihai, Weihai 264209, China
2. Key Laboratory of Coastal Environmental Process and Ecological Restoration (Yantai Institute of Coastal Zone), Chinese Academy of Sciences, Yantai 264003, China
3. Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao 266071, China
4. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: Combining spectroscopy with machine learning for the rapid identification of microplastics provides strong technical support for in-field microplastic detection, an emerging area that has attracted considerable attention. Near-infrared spectroscopy (NIRS) is fast, highly sensitive, and non-destructive, requires no sample pretreatment, and is widely used in chemical analysis, quality inspection, and other fields. This paper compares two machine learning classification algorithms, support vector machine (SVM) and Extreme Gradient Boosting (XGBoost), applied to near-infrared spectra, in order to build a classification model for fast and effective recognition of microplastics. Spectra of 20 standard microplastic samples were collected with a miniature near-infrared spectrometer: acrylonitrile butadiene styrene (ABS), polyacrylonitrile (PAN), polycarbonate (PC), polyethylene terephthalate (PET), polymethyl methacrylate (PMMA), polypropylene (PP), polystyrene (PS), polyvinyl chloride (PVC), thermoplastic polyurethane (TPU), ethylene-vinyl acetate copolymer (EVA), polybutylene terephthalate (PBT), polycaprolactone (PCL), polyethersulfone (PES), polylactic acid (PLA), polyoxymethylene (POM), polyphenylene oxide (PPO), polyphenylene sulfide (PPS), polytetrafluoroethylene (PTFE), polyvinyl alcohol (PVA), and styrenic block copolymer (SBS). To prevent overfitting, 1 260 microplastic sample spectra were collected, each containing 512 data points. The XGBoost algorithm was used to rank the importance of the data points, and the 65 data points with the greatest influence on recognition accuracy were extracted.
SVM and XGBoost were each used to build a rapid microplastic recognition model on the 65 data points retained after dimensionality reduction. GridSearchCV was used to tune the hyperparameters with the greatest influence on XGBoost; the optimal values of n_estimators, learning_rate, min_child_weight, max_depth, and gamma were 700, 0.07, 1, 1, and 0.0, respectively. To assess the model's stability, recognition rate, and generalization ability, 10-fold cross-validation and a confusion matrix were used for evaluation. The results show that the recognition accuracy of the XGBoost model was 97%, compared with 95% for the SVM model. Overall, the XGBoost model outperformed the SVM model, providing technical support for the rapid identification of microplastics in real samples.
Key words: Microplastics; Near-infrared spectrum; XGBoost; SVM
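The workflow summarized in the abstract (rank data points by importance, keep the most informative ones, then evaluate a classifier with 10-fold cross-validation and a confusion matrix) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the spectra are synthetic, the importance score is a simple variance-of-class-means proxy standing in for XGBoost feature importance, and a nearest-centroid classifier stands in for the SVM and XGBoost models. The grid search step is not reproduced here.

```python
import random
from collections import defaultdict


def make_synthetic_spectra(n_classes=4, per_class=30, n_points=32, seed=0):
    """Hypothetical stand-in for measured spectra: noise plus one class-specific peak."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(n_classes):
        peak = (c + 1) * n_points // (n_classes + 1)
        for _ in range(per_class):
            spectrum = [rng.gauss(0.0, 0.1) for _ in range(n_points)]
            spectrum[peak] += 1.0
            X.append(spectrum)
            y.append(c)
    return X, y


def class_means(X, y):
    """Mean spectrum of each class, keyed by label."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}


def importance_proxy(X, y):
    """Variance of the class means at each point (a crude proxy, not XGBoost importance)."""
    means = list(class_means(X, y).values())
    scores = []
    for col in zip(*means):
        mu = sum(col) / len(col)
        scores.append(sum((v - mu) ** 2 for v in col) / len(col))
    return scores


def top_k_features(scores, k):
    """Indices of the k highest-scoring data points, returned in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def stratified_folds(y, n_folds, seed=0):
    """Deal the shuffled indices of each class round-robin into n_folds folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, i in enumerate(indices):
            folds[j % n_folds].append(i)
    return folds


def predict_nearest_centroid(centroids, row):
    """Label of the centroid with the smallest squared Euclidean distance to row."""
    return min(centroids, key=lambda label: sum(
        (a - b) ** 2 for a, b in zip(row, centroids[label])))


def cross_validate(X, y, n_folds=10):
    """Return (confusion matrix, accuracy) pooled over all held-out folds."""
    labels = sorted(set(y))
    confusion = [[0] * len(labels) for _ in labels]  # rows: true, columns: predicted
    for fold in stratified_folds(y, n_folds):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        centroids = class_means([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            predicted = predict_nearest_centroid(centroids, X[i])
            confusion[labels.index(y[i])][labels.index(predicted)] += 1
    correct = sum(confusion[k][k] for k in range(len(labels)))
    return confusion, correct / len(y)


X, y = make_synthetic_spectra()
selected = top_k_features(importance_proxy(X, y), k=4)
X_reduced = [[row[i] for i in selected] for row in X]
confusion, accuracy = cross_validate(X_reduced, y, n_folds=10)
print("selected data points:", selected)
print("10-fold accuracy:", round(accuracy, 3))
```

For brevity the sketch selects data points on the full data set before cross-validation; repeating the selection inside each training fold would give a stricter estimate of generalization. With real spectra, the stand-ins would typically be replaced by library implementations such as `xgboost.XGBClassifier`, and scikit-learn's `SVC`, `GridSearchCV`, `StratifiedKFold`, and `confusion_matrix`.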
吴 雪,冯巍巍,蔡宗岐,王 清. 近红外光谱的海水微塑料快速识别[J]. 光谱学与光谱分析, 2022, 42(11): 3501-3506.
WU Xue, FENG Wei-wei, CAI Zong-qi, WANG Qing. Study on Rapid Recognition of Microplastics Based on Infrared Spectroscopy. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2022, 42(11): 3501-3506.