Local Preserving Projection Similarity Measure Method Based on Kernel Mapping and Rank-Order Distance
QIN Yu-hua¹, ZHANG Meng¹*, YANG Ning², SHAN Qiu-fu³
1. College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
2. Qingdao Lanzhi Modern Service Industry Digital Engineering Research Center, Qingdao 266071, China
3. China Tobacco Yunnan Industrial Co., Ltd., Technical Research Center, Kunming 650024, China
Abstract: Near-infrared spectra are high-dimensional, highly redundant, nonlinear and usually available only in small samples, so measuring spectral similarity suffers from the curse of dimensionality. To address this, a locality preserving projection algorithm based on kernel mapping and rank-order distance (KRLPP) is proposed. First, the spectral data are mapped to a higher-dimensional space through a kernel transformation, which preserves the nonlinear characteristics of the manifold structure. Then the dimensionality of the data is reduced by the locality preserving projections (LPP) algorithm, in which the rank-order distance replaces the traditional Euclidean or geodesic distance; because it uses the information shared among neighboring points, it yields a more accurate local neighborhood relationship. Finally, spectral similarity is measured by calculating distances in the low-dimensional space. The method overcomes the failure of distance measures in high-dimensional space and improves the accuracy of similarity measurement. To verify the effectiveness of KRLPP, the best parameters, namely the number of nearest neighbors k and the dimensionality d of the reduced space, were first determined from the variation of the residuals of the dataset before and after dimension reduction. KRLPP was then compared with the PCA, LPP and INLPP algorithms in terms of the projection of the dimension-reduced spectra and the classification ability of the model. The results show that KRLPP distinguishes tobacco positions better, and that both its dimension reduction and its correct identification of different tobacco positions are significantly better than those of PCA, LPP and INLPP. Finally, five representative tobaccos from the formula of a certain cigarette brand were selected as target tobaccos, and PCA, LPP and KRLPP were each used to find a similar tobacco for every target among the 300 tobacco samples used for formula maintenance. The tobaccos and cigarette formulas before and after replacement were evaluated in terms of chemical composition and sensory quality. LPP and KRLPP used the same dimensionality-reduction parameters, and 6 principal components were selected for PCA. Compared with PCA and LPP, the replacement tobaccos and replacement formulas selected by KRLPP differed least from the originals in the chemical components (total sugar, reducing sugar, total nicotine and total nitrogen) and in sensory indexes such as aroma, smoke and taste, so KRLPP gave the most accurate similarity measurement. The method can be applied to the search for alternative raw materials for formula products and can help enterprises maintain product quality.
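The three steps named in the abstract (kernel mapping, LPP with rank-order neighborhoods, distance in the reduced space) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' implementation: an RBF kernel, binary k-nearest-neighbor graph weights, and solving kernel LPP as centered kernel PCA followed by linear LPP. The rank-order distance follows the definition of Zhu, Wen and Sun (CVPR 2011). The function names `rank_order_distance`, `krlpp_embed` and `most_similar` are hypothetical, and the data are synthetic.

```python
import numpy as np


def rank_order_distance(dist):
    """Symmetric rank-order distance (after Zhu et al., 2011) from a distance matrix."""
    n = dist.shape[0]
    d = dist.copy()
    np.fill_diagonal(d, -1.0)                      # each sample is rank 0 in its own list
    order = np.argsort(d, axis=1, kind="stable")   # order[a, i]: i-th nearest sample to a
    rank = np.empty_like(order)
    rank[np.arange(n)[:, None], order] = np.arange(n)  # rank[a, x]: position of x in a's list
    R = np.zeros((n, n))
    for a in range(n):
        for b in range(a + 1, n):
            # Sum, over a's neighbours up to and including b, of their ranks in b's list.
            d_ab = rank[b, order[a, : rank[a, b] + 1]].sum()
            d_ba = rank[a, order[b, : rank[b, a] + 1]].sum()
            R[a, b] = R[b, a] = (d_ab + d_ba) / min(rank[a, b], rank[b, a])
    return R


def krlpp_embed(X, k=5, d=2, gamma=None, tol=1e-8):
    """Sketch: kernel mapping + LPP on a rank-order kNN graph (assumed formulation)."""
    n = X.shape[0]
    gamma = 1.0 / X.shape[1] if gamma is None else gamma
    g = X @ X.T
    sq = np.maximum(np.diag(g)[:, None] + np.diag(g)[None, :] - 2.0 * g, 0.0)
    K = np.exp(-gamma * sq)                        # RBF kernel (an assumption)
    # Distances in the kernel-induced feature space.
    kd = np.diag(K)
    feat = np.sqrt(np.maximum(kd[:, None] + kd[None, :] - 2.0 * K, 0.0))
    # Binary, symmetrised kNN graph built from rank-order distances.
    R = rank_order_distance(feat)
    np.fill_diagonal(R, np.inf)
    nn = np.argsort(R, axis=1, kind="stable")[:, :k]
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], nn] = 1.0
    W = np.maximum(W, W.T)
    Dg = np.diag(W.sum(axis=1))
    L = Dg - W                                     # graph Laplacian
    # Centered kernel PCA coordinates Z (explicit features spanning the kernel space).
    H = np.eye(n) - 1.0 / n
    s, U = np.linalg.eigh(H @ K @ H)
    keep = s > tol * s.max()
    Z = U[:, keep] * np.sqrt(s[keep])
    # Linear LPP on Z: solve Z'LZ a = lambda Z'DZ a via Cholesky whitening.
    C = np.linalg.cholesky(Z.T @ Dg @ Z)
    Ci = np.linalg.inv(C)
    M = Ci @ (Z.T @ L @ Z) @ Ci.T
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)      # eigenvalues in ascending order
    return Z @ (Ci.T @ vecs[:, :d])                # d smallest eigenvalues give the embedding


def most_similar(Y, target, top=3):
    """Indices of the samples closest to `target` in the reduced space."""
    dist = np.linalg.norm(Y - Y[target], axis=1)
    dist[target] = np.inf                          # exclude the target itself
    return np.argsort(dist)[:top]


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 8))                  # synthetic stand-in for spectra
Y_demo = krlpp_embed(X_demo, k=5, d=2, gamma=0.1)
print(most_similar(Y_demo, target=0))
```

Centering the kernel removes the constant direction, so the smallest eigenvectors are not the trivial all-equal embedding. With an RBF kernel the feature-space distance is a monotone function of the Euclidean distance, so the neighbor rankings are unchanged by the mapping; a different kernel would generally change them.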
Received: 2020-09-24
Accepted: 2021-01-30
Corresponding Authors:
ZHANG Meng
E-mail: 1427193350@qq.com