Outlier Detection of Time Series Three-Dimensional Fluorescence Spectroscopy
YU Shao-hui1, ZHANG Yu-jun2, ZHAO Nan-jing2
1. School of Mathematics and Statistics, Hefei Normal University, Hefei 230061, China
2. Key Laboratory of Environmental Optics & Technology, Anhui Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Hefei 230031, China
Abstract: Qualitative and quantitative analysis of time series three-dimensional fluorescence spectra is often disturbed by outliers. In this work, an efficient outlier detection method is proposed that exploits the characteristics of both the time dimension and the spectral dimension. First, the wavelength points most likely to contain outliers are extracted using the variance along the time dimension. Second, based on an analysis of the forms in which outliers occur and of the similarity score between any two samples, a cumulative similarity is introduced in the spectral dimension. Finally, the fluorescence intensity at each wavelength of all samples is modified by the time-dimension correction matrix, and outlier detection is completed according to the cumulative similarity scores. Applying the correction matrix not only improves the validity of the method but also reduces the computation, because only a characteristic region of the correction matrix needs to be selected. Numerical experiments show that outliers can still be detected when only 50 percent of all points in the spectral dimension are used.
Key words: Time series; Three-dimensional fluorescence spectra; Outliers; Characteristic region
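The three steps summarized in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not define the similarity score or the correction matrix, so cosine similarity and a 0/1 mask over the highest-variance wavelength points are used here as stand-ins. The names `cumulative_similarity_outliers`, `keep_fraction`, and `n_outliers` are hypothetical.

```python
import numpy as np


def cumulative_similarity_outliers(spectra, keep_fraction=0.5, n_outliers=1):
    """Flag samples with the lowest cumulative similarity.

    spectra: array of shape (n_samples, n_excitation, n_emission),
             with samples ordered along the time dimension.
    Returns (scores, flagged_indices).
    """
    n_samples = spectra.shape[0]
    flat = spectra.reshape(n_samples, -1).astype(float)

    # Step 1: variance of each wavelength point along the time dimension.
    variance = flat.var(axis=0)

    # Assumption: the "correction matrix" is approximated by a 0/1 mask that
    # keeps only the highest-variance points (the characteristic region).
    n_keep = max(1, int(round(keep_fraction * flat.shape[1])))
    keep = np.argsort(variance)[-n_keep:]
    region = flat[:, keep]

    # Step 2: pairwise similarity between samples.
    # Assumption: cosine similarity stands in for the paper's similarity score.
    norms = np.linalg.norm(region, axis=1, keepdims=True)
    unit = region / np.where(norms == 0, 1.0, norms)
    similarity = unit @ unit.T
    np.fill_diagonal(similarity, 0.0)  # ignore self-similarity

    # Cumulative similarity: total similarity of each sample to all others.
    scores = similarity.sum(axis=1)

    # Step 3: samples least similar to the rest are flagged as outliers.
    flagged = np.sort(np.argsort(scores)[:n_outliers])
    return scores, flagged
```

With `keep_fraction=0.5`, only half of the spectral points enter the similarity computation, which mirrors the 50 percent setting mentioned in the abstract. How many samples to flag (or what score threshold to apply) is left as a free parameter here, since the abstract does not state a decision rule.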