Application of Unsupervised Learning AE and MVO-DBSCAN Combined with LIF in Mine Water Inrush Recognition

doi:10.3964/j.issn.1000-0593(2019)08-2437-06


Abstract: Quick and accurate identification of the type and source of water inrush is of great significance for safe coal mining. Laser-induced fluorescence (LIF) is fast and sensitive; applying LIF to water inrush detection in coal mines and combining it with a pattern recognition algorithm allows the inrush source to be identified quickly. Current algorithms for identifying water sample spectra depend too heavily on a pre-established spectral database of water samples, so when the inrush source is not in that database, misidentification is likely. The unsupervised learning algorithm DBSCAN requires no label or category information for the sample set when clustering, which can reduce the misidentification of unknown categories. DBSCAN is therefore used to identify the LIF spectra of water inrush, and MVO is used to optimize the DBSCAN parameters, eliminating the tedious manual parameter search. In the experiment, four water samples were taken from water collection points at Xieqiao Coal Mine, and their fluorescence spectra were collected with a 2 048-pixel USB2000+ spectrometer, 30 sets of spectral data per water sample. First, the unsupervised learning algorithm autoencoder (AE) reduces the dimension of the raw spectral data, to lessen the influence of redundant information on clustering. The AE designed here is a multi-layer network between shallow and deep, and it reduces the raw spectral data to 2 dimensions. To make the dimensionality reduction model sparse, a Dropout layer is added to the traditional AE; the experiments show that the model with the Dropout layer converges faster.

Next, the multi-verse optimizer (MVO) is used to optimize the DBSCAN parameters. During parameter optimization, the highest recognition rate of DBSCAN on the dimension-reduced water sample spectra is 97.5%, and the corresponding range of the parameter Eps is [0.023 66, 0.040 65]. To verify the effectiveness of AE dimensionality reduction, the normalized spectra without dimensionality reduction are also clustered with DBSCAN; the highest recognition rate on these original spectra is 95%, 2.5 percentage points lower than on the dimension-reduced spectra. The results show that reducing the dimension of the spectral data with AE can improve DBSCAN's recognition rate for different spectra. Finally, the supervised learning algorithm K-nearest neighbor (KNN) is used to identify the dimension-reduced spectra, and its results are compared with those of DBSCAN. The training set uses three water samples and the test set uses four. On the test set, the supervised algorithm accurately identifies only the water sample categories contained in the training set and misidentifies every spectrum of the category absent from it, whereas DBSCAN accurately identifies the spectra of the water sample not in the training set. The nonlinear dimensionality reduction algorithm AE can reduce the dimension of high-dimensional water sample spectra, and using MVO-DBSCAN for LIF spectral identification of coal mine water inrush sources can effectively reduce the misidentification caused by an incomplete mine water source spectral database.
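The contrast drawn at the end of the abstract, between a supervised classifier that can only answer with labels it was trained on and a density-based clusterer that needs no labels, can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the authors' implementation: the 2-D points are synthetic stand-ins for AE-reduced spectra (tight grids around four hypothetical centres), the DBSCAN routine is a minimal pure-Python version, `min_pts = 4` is arbitrary, KNN is simplified to 1-nearest-neighbour, and `eps = 0.03` is fixed by hand rather than found by MVO.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, -1 meaning noise."""
    n = len(points)
    # Precompute eps-neighbourhoods (each point counts as its own neighbour).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None = not yet visited
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


def nearest_neighbour_label(train_points, train_labels, query):
    """1-nearest-neighbour prediction (KNN with k = 1)."""
    best = min(range(len(train_points)),
               key=lambda i: math.dist(train_points[i], query))
    return train_labels[best]


def make_cluster(cx, cy):
    """30 synthetic 2-D points on a tight 6 x 5 grid starting at (cx, cy)."""
    return [(cx + 0.004 * i, cy + 0.004 * j) for i in range(6) for j in range(5)]


# Hypothetical cluster centres standing in for four water samples.
centres = [(0.2, 0.2), (0.8, 0.2), (0.2, 0.8), (0.8, 0.8)]
points, true_labels = [], []
for k, (cx, cy) in enumerate(centres):
    cluster_points = make_cluster(cx, cy)
    points.extend(cluster_points)
    true_labels.extend([k] * len(cluster_points))

# Supervised baseline: the training set contains only water samples 0, 1 and 2.
train_points = [p for p, y in zip(points, true_labels) if y != 3]
train_labels = [y for y in true_labels if y != 3]
knn_pred_unseen = {
    nearest_neighbour_label(train_points, train_labels, p)
    for p, y in zip(points, true_labels) if y == 3
}

# Unsupervised alternative: cluster all 120 points without any labels.
cluster_labels = dbscan(points, eps=0.03, min_pts=4)
n_clusters = len(set(cluster_labels) - {-1})

print("KNN labels assigned to the unseen sample:", sorted(knn_pred_unseen))
print("DBSCAN clusters found:", n_clusters)
```

In this toy setting the nearest-neighbour classifier can only return one of the three training labels for the fourth sample, so every one of those predictions is wrong by construction, while DBSCAN places the fourth sample in a cluster of its own. Real spectra are noisier than these grids, which is why the choice of Eps matters and why the paper searches for it automatically.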

Keywords: Mine water inrush; Laser-induced fluorescence; Spectral recognition; Density-based spatial clustering of applications with noise (DBSCAN); Multi-verse optimizer; Autoencoder; Dropout

Received: 2018-12-19    Revised: 2019-04-30

CLC number: O657.3

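Funding: Key Project of the National Science and Technology Support Program during the 12th Five-Year Plan (2013BAK06B01); Science and Technology Project on Key Technologies for the Prevention and Control of Major Work Safety Accidents (anhui-0001-2016AQ); National Natural Science Foundation of China (51174258); General Program of the Anhui Provincial Natural Science Foundation (1808085MF202); Major Project of Scientific Research in Universities of Anhui Province (KJ2018ZD036)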

Corresponding author: ZHOU Meng-ran    E-mail: mrzhou8521@163.com

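About the author: LAI Wen-hao, born in 1992, doctoral student at the School of Electrical and Information Engineering, Anhui University of Science and Technology    E-mail: whlai9@163.com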

Cite this article:

LAI Wen-hao, ZHOU Meng-ran, LI Da-tong, WANG Ya, HU Feng, ZHAO Shun, GU Yu-lin. Application of Unsupervised Learning AE and MVO-DBSCAN Combined with LIF in Mine Water Inrush Recognition[J]. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2019, 39(08): 2437-2442.

Link to this article:

https://www.gpxygpfx.com/CN/10.3964/j.issn.1000-0593(2019)08-2437-06 or https://www.gpxygpfx.com/CN/Y2019/V39/I08/2437