A Convolutional Neural Network With Feature-Space Attention for Online Near-Infrared Detection of Tartaric Acid
LI Zhi-hao1, 2, XIAO Jin-feng1*, ZHANG Hong-ming2*, LÜ Bo2, 3*, YIN Xiang-hui1, LI Xiao-xing1, 2, ZHAO Ming4, MA Fei5
1. College of Electrical Engineering, University of South China, Hengyang 421001, China
2. Institute of Plasma Physics, Hefei Institutes of Physical Sciences, Chinese Academy of Sciences, Hefei 230031, China
3. Science Island Branch, Graduate School, University of Science and Technology of China, Hefei 230031, China
4. College of Biological and Food Engineering, Anhui Polytechnic University, Hefei 241000, China
5. College of Food and Biological Engineering, Hefei University of Technology, Hefei 230009, China
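Abstract: Tartaric acid is an important organic acid found widely in wine, fruit juice, carbonated beverages, and some confectionery, and its concentration directly affects a product's sweet-sour balance and flavor stability. During production, the tartaric acid concentration fluctuates with raw-material differences and formulation adjustments, so a method that monitors it online in real time is essential for product quality and production consistency. Existing methods such as titration and HPLC respond too slowly for real-time online monitoring. Because industrial processes are multivariate, nonlinear, and dynamic, building an accurate concentration prediction model places higher demands on methodology. This work therefore combines a one-dimensional convolutional neural network (1D-CNN) with a feature-space attention (FSA) mechanism to build a hybrid CNN-FSA model, collects near-infrared spectra to drive tartaric acid concentration detection experiments, and explores the model's potential for improving detection speed and robustness, offering a new reference method for real-time online concentration monitoring of solution-phase chemical processes. Outliers were first removed from the spectral data using principal component analysis (PCA) combined with Mahalanobis distance, standard normal variate (SNV) transformation was applied to correct for scattering and baseline drift, and the proposed CNN-FSA model and a PLSR model were then trained and evaluated on the dataset. Model performance was assessed with the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). Six rounds of experiments were designed; in each round the initial base material was 500 g of a mixed liquid (water, ethanol, glucose, malic acid, and citric acid) and the feed was 500 g (475 g water + 25 g tartaric acid). Data from the first four rounds were randomly split into training and test sets at a 7:3 ratio, and data from the last two rounds served as an independent prediction set for a strict evaluation of generalization. On the independent prediction set, the CNN-FSA model achieved R² = 0.9896, RMSE = 0.000702, and MAE = 0.000580, whereas the PLSR model achieved R² = 0.9688, RMSE = 0.001214, and MAE = 0.001059. Relative to PLSR, CNN-FSA lowered RMSE by 42.17% and MAE by 45.23% on the independent prediction set. The results show that CNN-FSA significantly outperforms PLSR in modeling tartaric acid concentration and generalizes more robustly on the prediction set.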
Keywords: Near-infrared spectroscopy; PLSR; Convolutional neural network; Tartaric acid; Online detection
Abstract: Tartaric acid is an important organic acid that is widely present in wine, fruit juice, carbonated beverages, and certain confectionery products. Its concentration directly influences the balance between sweetness and acidity as well as the stability of flavor. During food production, the tartaric acid concentration may fluctuate due to variations in raw materials and formulation adjustments. Therefore, establishing a method for real-time online monitoring of tartaric acid concentration is crucial for ensuring product quality and production consistency. However, conventional detection methods (e.g., titration, HPLC) suffer from response delays and are unsuitable for real-time online monitoring. Because industrial processes are multivariate, nonlinear, and dynamic, building an accurate concentration prediction model places higher demands on methodology. To address this, we integrate a one-dimensional convolutional neural network (1D-CNN) with a feature-space attention (FSA) mechanism to form a hybrid CNN-FSA model. Using near-infrared (NIR) spectra to drive tartaric acid concentration detection experiments, this study explores the potential of CNN-FSA to improve detection speed and model robustness, thereby providing a new reference approach for real-time online concentration monitoring of solution-phase chemical processes. Outliers were first removed from the spectral data using principal component analysis (PCA) combined with Mahalanobis distance, and standard normal variate (SNV) transformation was then applied to correct for scattering effects and baseline drift. Subsequently, the proposed CNN-FSA model and the traditional partial least squares regression (PLSR) model were trained and evaluated on the dataset. Model performance was assessed using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE).
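The preprocessing order described above (PCA with Mahalanobis distance for outlier removal, then SNV) can be illustrated with a minimal NumPy sketch. The abstract does not report the number of principal components, the distance threshold, or the spectral dimensions, so `n_components=3`, `threshold=3.0`, and the synthetic spectra below are illustrative assumptions, not the authors' settings.

```python
import numpy as np


def pca_mahalanobis_mask(spectra, n_components=3, threshold=3.0):
    """Return a boolean mask that is True for spectra kept as inliers.

    n_components and threshold are assumed values for illustration only.
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each wavelength channel
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # PCA scores
    # PCA scores are zero-mean and mutually uncorrelated, so their covariance
    # is diagonal and the Mahalanobis distance is a variance-scaled norm.
    variances = scores.var(axis=0, ddof=1)
    distances = np.sqrt((scores ** 2 / variances).sum(axis=1))
    return distances <= threshold


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std


# Synthetic stand-in for NIR spectra (40 samples x 50 channels), hypothetical data.
rng = np.random.default_rng(0)
channel = np.arange(50)
baseline = np.linspace(1.0, 2.0, 50)
peak = np.exp(-((channel - 25) / 5.0) ** 2)
concentration = np.linspace(0.0, 0.025, 40)
raw = baseline + 10.0 * concentration[:, None] * peak
raw = raw + rng.normal(0.0, 0.005, size=raw.shape)
raw[-1, 10:20] += 3.0  # corrupt the last spectrum so it acts as an outlier

keep = pca_mahalanobis_mask(raw)  # step 1: flag outliers on the raw spectra
clean = snv(raw[keep])            # step 2: SNV on the retained spectra
```

After SNV every retained spectrum has zero mean and unit standard deviation across channels, which removes additive offsets and multiplicative scaling between samples before modeling.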
Six rounds of experiments were designed. Each round started with 500 g of a mixed solution (water, ethanol, glucose, malic acid, and citric acid) as the initial base material, to which 500 g of feed solution (475 g water + 25 g tartaric acid) was added. Data from the first four rounds were randomly split into training and test sets at a 7:3 ratio, while data from the last two rounds served as an independent prediction set to rigorously evaluate the model's generalization ability. On the independent prediction set, the CNN-FSA model achieved R² = 0.9896, RMSE = 0.000702, and MAE = 0.000580, whereas the PLSR model yielded R² = 0.9688, RMSE = 0.001214, and MAE = 0.001059. Compared with PLSR, CNN-FSA reduced RMSE by 42.17% and MAE by 45.23% on the independent prediction set. These results indicate that the CNN-FSA model significantly outperforms PLSR in tartaric acid concentration prediction and generalizes more robustly to independent data.
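The three evaluation metrics and the reported relative improvements can be checked with a short pure-Python sketch. The function names are illustrative choices, and only the RMSE and MAE values quoted in the abstract are used as inputs. No raw predictions are available here, so the metric functions are shown for reference only.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Values reported for the independent prediction set (PLSR vs. CNN-FSA).
rmse_drop = relative_reduction(0.001214, 0.000702)
mae_drop = relative_reduction(0.001059, 0.000580)
print(f"RMSE reduction: {rmse_drop:.2f}%")
print(f"MAE reduction: {mae_drop:.2f}%")
```

The computed reductions agree with the 42.17% and 45.23% stated in the abstract.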