Classification of Hyperspectral Remote Sensing Images by Joint Hybrid Convolution and Cascaded Group Attention Mechanisms
WANG Xiao-yan1, LIANG Wen-hui2, BI Chu-ran1, LI Jie3*, WANG Xi-yu2
1. School of Systems Science and Statistics, Beijing Wuzi University, Beijing 101149, China
2. School of Information, Beijing Wuzi University, Beijing 101149, China
3. School of Electromechanical and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102616, China
Abstract The rich spectral information in hyperspectral remote sensing images provides reliable data support for ground-object classification. However, the high dimensionality and redundancy of spectral data, the difficulty of associating spatial and spectral features, and insufficient spectral feature extraction all challenge deep-learning-based classification of hyperspectral remote sensing images. Convolutional neural networks (CNNs) and Vision Transformers (ViTs) are two deep learning architectures widely used in computer vision, each with its own advantages and limitations. CNNs are good at capturing local features and spatial hierarchies and can handle translation invariance in images. ViTs capture global dependencies and model complex patterns in images more effectively. To improve the classification accuracy of hyperspectral remote sensing images and exploit the strengths of both models, this paper combines the local feature extraction capability of CNNs with the global context understanding capability of ViTs. It introduces a 3D Efficient ViT module into hybrid convolution and proposes EVIT3D_HSN, a hyperspectral remote sensing image classification algorithm that combines hybrid convolution with a cascaded group attention mechanism. The algorithm adds the 3D Efficient ViT module to a network in which 3D convolution extracts the joint features of hyperspectral remote sensing images and 2D convolution extracts the spatial features. This improves generalization across datasets and captures the image features of hyperspectral data more comprehensively, which enhances classification performance without increasing model complexity.
To validate the algorithm, EVIT3D_HSN is compared with 1DCNN, 2DCNN, 3DFCN, 3DCNN, and the original HybridSN algorithm in ablation experiments on three hyperspectral remote sensing image classification datasets: Indian Pines (IP), Pavia University (PU), and Salinas (SA). On these three datasets, EVIT3D_HSN achieves an overall accuracy (OA) of 97.66%, 99.00%, and 99.65% and a Kappa coefficient of 97.3%, 98.6%, and 99.6%, respectively. Compared with 1DCNN, classification accuracy is improved by 37.12%, 25.09%, and 33.67%, respectively; compared with 2DCNN, by 59%, 57.43%, and 46.92%; compared with 3DFCN, by 45.36%, 24.5%, and 29.72%; compared with 3DCNN, by 28.05%, 14.26%, and 34.29%; and compared with HybridSN, by 3.76%, 1.85%, and 2.57%. In addition, EVIT3D_HSN has the highest F1 value for 37 ground-object classes in total, that is, for every class except Stone-Steel-Towers in the IP dataset, Painted metal sheets and Shadows in the PU dataset, and Stubble in the SA dataset. The experimental results show that EVIT3D_HSN outperforms the five compared hyperspectral remote sensing image classification algorithms in model accuracy and generalization ability, and that the model has good practical value.
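The three metrics reported above (OA, the Kappa coefficient, and per-class F1) can all be derived from a confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' evaluation code; the function names and the 3-class toy matrix are hypothetical and are not data from the paper.

```python
# Illustrative sketch only: standard definitions of OA, Cohen's kappa and
# per-class F1, computed from a confusion matrix cm where cm[i][j] counts
# samples of true class i predicted as class j. Names and data are assumed.

def overall_accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = overall_accuracy(cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (p_o - p_e) / (1.0 - p_e)


def per_class_f1(cm):
    """F1 per class: 2*TP / (true count + predicted count)."""
    n = len(cm)
    scores = []
    for k in range(n):
        true_count = sum(cm[k])
        pred_count = sum(cm[i][k] for i in range(n))
        denom = true_count + pred_count
        scores.append(2.0 * cm[k][k] / denom if denom else 0.0)
    return scores


# Hypothetical 3-class confusion matrix (toy numbers, not from the paper).
toy_cm = [
    [50, 2, 0],
    [3, 45, 2],
    [0, 5, 43],
]

oa = overall_accuracy(toy_cm)
kappa = cohen_kappa(toy_cm)
f1 = per_class_f1(toy_cm)
print(f"OA={oa:.4f}  Kappa={kappa:.4f}  F1={[round(s, 4) for s in f1]}")
```

Kappa is lower than OA on the same matrix because it discounts the agreement expected by chance, which is why the two are usually reported together for datasets with unbalanced classes.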
Received: 2024-06-29
Accepted: 2024-12-04
Corresponding Authors:
LI Jie
E-mail: lijie@bucea.edu.cn