Abstract: Because it is simple, rapid, and non-destructive, Raman spectroscopy is well suited to mineral classification and identification. Model-fitting methods for Raman spectra require neither a reference spectral database nor complex spectral matching, which is an advantage in mineral classification. However, existing model-fitting methods based on machine learning and deep learning have not been compared comprehensively, because each study uses only a narrow range of learning models. To this end, this paper comprehensively evaluates model-fitting classification methods for mineral Raman spectra using the RRUFF mineral Raman spectrum dataset. It compares the classification performance of four traditional machine learning methods (KNN, XGBoost, SVM, and RF) and three deep learning models (CNN, DNN, and RNN), and examines how four data preprocessing methods and the sample size affect classification. To improve classification performance, we also propose a data preprocessing method based on Raman spectral intensity curvature: the curvature of the baseline-corrected Raman intensity sequence is computed and used as a constructed feature, so that the model can extract the positions of spectral peaks more effectively. The experimental results show that data preprocessing greatly improves the classification performance of the traditional machine learning models but has little effect on the deep learning models. Sample size is also a key factor in model performance. When the sample size is large, the deep learning models outperform the traditional machine learning models; when it is small, the deep learning models struggle to show their advantages, and the traditional machine learning models combined with data preprocessing work better.
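The abstract does not give the formula used for the intensity curvature feature. As a minimal sketch, assuming the standard curvature of a plane curve y(x), κ = y'' / (1 + y'²)^(3/2), applied to a spectrum that has already been baseline-corrected, the feature could be computed as follows. The function name `intensity_curvature` and the synthetic single-peak spectrum are hypothetical and serve only as illustration; they are not taken from the paper.

```python
import numpy as np


def intensity_curvature(wavenumber, intensity):
    """Signed curvature of the intensity curve y(x).

    Assumption: the standard plane-curve formula
    kappa = y'' / (1 + y'^2)^(3/2); the paper's exact definition
    may differ (for example in sign convention or scaling).
    """
    x = np.asarray(wavenumber, dtype=float)
    y = np.asarray(intensity, dtype=float)
    dy = np.gradient(y, x)    # first derivative by finite differences
    d2y = np.gradient(dy, x)  # second derivative
    return d2y / (1.0 + dy ** 2) ** 1.5


# Hypothetical baseline-corrected spectrum: one Gaussian band on a flat baseline.
x = np.linspace(100.0, 1200.0, 1101)         # Raman shift axis, 1 cm^-1 steps
y = np.exp(-0.5 * ((x - 464.0) / 8.0) ** 2)  # band centred at 464 cm^-1

kappa = intensity_curvature(x, y)

# A peak maximum is where the curve bends downward most sharply,
# so it shows up as the most negative curvature value.
peak_position = x[np.argmin(kappa)]
```

Curvature computed this way depends on the units of both axes, so under this assumption the intensities would normally be scaled consistently (for example to a maximum of 1) before the feature is passed to a classifier.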