Spectroscopy and Spectral Analysis (光谱学与光谱分析)
Parallel PLS Algorithm Using MapReduce and Its Application in Spectral Modeling
YANG Hui-hua¹, DU Ling-ling², LI Ling-qiao², TANG Tian-biao², GUO Tuo², LIANG Qiong-lin³, WANG Yi-ming³, LUO Guo-an³
1. School of Electric Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China
2. School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin 541004, China
3. Analysis Center, Tsinghua University, Beijing 100084, China
Abstract Partial least squares (PLS) is widely used in spectral analysis and modeling, but it is computationally intensive and time-consuming when applied to massive data sets. To address this problem, a parallel PLS algorithm based on MapReduce is proposed. It consists of two procedures: parallel data standardization and parallel computation of the principal components. Using NIR spectral modeling as an example, experiments were conducted on a Hadoop cluster built from ordinary computers. The experimental results show that the proposed parallel PLS algorithm can handle massive spectral data sets, significantly reduces modeling time, achieves approximately linear speedup, and scales up easily.
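The abstract names the two parallelized procedures but does not spell out how they are split into map and reduce steps. The following is a minimal single-machine sketch of one plausible decomposition, not the authors' Hadoop implementation. It assumes that each data split emits partial column sums in the map step and that the reduce step adds them element-wise; the same pattern is then reused to accumulate X^T y, from which the first PLS weight vector is obtained. All function names (`map_stats`, `reduce_stats`, `column_mean_std`, `map_xty`, `first_weight_vector`) and the toy data are hypothetical.

```python
from functools import reduce
from math import sqrt


def map_stats(partition):
    """Map step (assumed): emit count, column sums and column sums of squares."""
    n_cols = len(partition[0])
    sums = [0.0] * n_cols
    sq_sums = [0.0] * n_cols
    for row in partition:
        for j, value in enumerate(row):
            sums[j] += value
            sq_sums[j] += value * value
    return len(partition), sums, sq_sums


def reduce_stats(a, b):
    """Reduce step (assumed): merge two partial results element-wise."""
    return (a[0] + b[0],
            [x + y for x, y in zip(a[1], b[1])],
            [x + y for x, y in zip(a[2], b[2])])


def column_mean_std(partitions):
    """Column means and sample standard deviations from the merged sums."""
    n, sums, sq_sums = reduce(reduce_stats, map(map_stats, partitions))
    means = [s / n for s in sums]
    stds = [sqrt((q - n * m * m) / (n - 1)) for q, m in zip(sq_sums, means)]
    return means, stds


def map_xty(partition, y_part, means, stds, y_mean):
    """Map step (assumed): partial X^T y over the standardized rows of one split."""
    acc = [0.0] * len(means)
    for row, y in zip(partition, y_part):
        for j, value in enumerate(row):
            acc[j] += (value - means[j]) / stds[j] * (y - y_mean)
    return acc


def first_weight_vector(partitions, y_partitions):
    """First PLS weight vector w = X^T y / ||X^T y|| on standardized X, centered y."""
    means, stds = column_mean_std(partitions)
    n = sum(len(p) for p in y_partitions)
    y_mean = sum(sum(p) for p in y_partitions) / n
    partials = [map_xty(p, yp, means, stds, y_mean)
                for p, yp in zip(partitions, y_partitions)]
    xty = reduce(lambda a, b: [u + v for u, v in zip(a, b)], partials)
    norm = sqrt(sum(v * v for v in xty))
    return [v / norm for v in xty]


# Toy data: four "spectra" with two variables each, stored as two splits.
X_SPLITS = [[[1.0, 2.0], [2.0, 4.0]], [[3.0, 7.0], [4.0, 7.0]]]
Y_SPLITS = [[1.0, 2.0], [3.0, 4.0]]

if __name__ == "__main__":
    print(column_mean_std(X_SPLITS))
    print(first_weight_vector(X_SPLITS, Y_SPLITS))
```

Because the map outputs are plain sums, the reduce step is associative and commutative, so the splits can be processed on separate nodes in any order. Later components would repeat the same accumulation on the deflated data. The one-pass sum-of-squares formula is used here only for brevity and can lose precision when column means are large relative to the spread.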
Received: 2012-03-08
Accepted: 2012-06-20
Corresponding Authors:
YANG Hui-hua
E-mail: yanghuihua@tsinghua.edu.cn