Skin lesion classification based on multi-level fusion of Swin-T and ConvNeXt (Journal of Biomedical Engineering, 《生物醫學工程學雜志》)

Authors:

Wang Zetong (王澤彤), Zhang Junhua (張俊華), Wang Xiao (王肖)

School of Information, Yunnan University (Kunming 650500)

Keywords:

Swin-T; ConvNeXt; multi-level attention mechanism; hierarchical inverted residual fusion module; skin lesion images

DOI:

10.7507/1001-5515.202305025


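Skin cancer is a significant public health problem, and computer-aided diagnosis can effectively reduce this burden. When computer-aided diagnosis is used, accurate identification of the skin lesion type is critical. To this end, this paper proposes a multi-level attention hierarchical fusion model based on Swin-T and ConvNeXt. Hierarchical Swin-T and ConvNeXt extract global and local features respectively, and residual channel attention and spatial attention modules are proposed to further extract effective features. A multi-level attention mechanism processes the multi-scale global and local features. To address the loss of shallow features caused by their distance from the classifier, a hierarchical inverted residual fusion module, built on the idea of level-by-level aggregation, dynamically adjusts the extracted feature information. A balanced sampling strategy and focal loss are used to address the class imbalance among skin lesion types. On the ISIC2018 and ISIC2019 datasets, the accuracy, precision, recall and F1-score are 96.01%, 93.67%, 92.65% and 93.11%, and 92.79%, 91.52%, 88.90% and 90.15%, respectively. Compared with Swin-T, accuracy improves by 3.60 and 1.66 percentage points, respectively; compared with ConvNeXt, it improves by 2.87 and 3.45 percentage points. The experiments show that the proposed method classifies skin lesion images accurately and offers a new solution for the diagnosis of skin cancer.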
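The abstract names focal loss as one of the two remedies for class imbalance but does not give its form here. The following is a minimal sketch of the commonly used multi-class focal loss, $-\alpha_t (1 - p_t)^{\gamma} \log p_t$, for a single sample. The function names, the default `gamma=2.0` and the optional per-class `alpha` weights are illustrative assumptions, not values taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative (assumed) settings, not the paper's.
    With gamma=0 and alpha=None this reduces to plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    alpha_t = 1.0 if alpha is None else alpha[target]
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct sample versus a confidently wrong one (class 0 is true).
easy = focal_loss([4.0, 0.0, 0.0], target=0)
hard = focal_loss([0.0, 4.0, 0.0], target=0)
# The same easy sample under plain cross-entropy (gamma = 0).
ce_easy = focal_loss([4.0, 0.0, 0.0], target=0, gamma=0.0)
```

The factor $(1 - p_t)^{\gamma}$ shrinks the loss of well-classified samples, so `easy` is much smaller than `ce_easy`, while misclassified samples such as `hard` keep most of their weight. This is why the loss is a natural fit for datasets dominated by one class.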

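Cite this article: Wang Zetong, Zhang Junhua, Wang Xiao. Skin lesion classification based on multi-level fusion of Swin-T and ConvNeXt. Journal of Biomedical Engineering, 2024, 41(3): 544-551. doi: 10.7507/1001-5515.202305025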

0 Introduction

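Skin cancer is a common and highly dangerous malignancy whose incidence is rising year by year^[1]. Common skin lesions fall into eight types. Melanocytic nevus (NV), benign keratosis (seborrheic keratosis, BKL), actinic keratosis (actinic keratosis/early intraepidermal carcinoma, AKEIC), vascular lesion (VASC) and dermatofibroma (DF) are benign lesions, whereas melanoma (MEL), basal cell carcinoma (BCC) and squamous cell carcinoma (SCC) are malignant. Malignant melanoma is the most dangerous, causing 10 000 deaths each year in the United States alone. Lesions found early can be cured by simple surgery, but late diagnosis carries a higher risk of death. In clinical practice, dermoscopy is an effective means of detecting melanoma and other pigmented skin lesions. Specialists can distinguish lesion types from the appearance and morphological features of dermoscopic images, but the subjectivity and variability of this assessment can lead to misdiagnosis and missed diagnosis^[2].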

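To address misdiagnosis and missed diagnosis of skin diseases, computer-aided diagnosis is gradually becoming mainstream. Traditional methods usually involve lesion segmentation, feature extraction and classification. Celebi et al.^[3] used a threshold-based segmentation algorithm with a support vector machine (SVM) classifier and achieved good results. Ganster et al.^[4] extracted features using the ABCD rule and raised melanoma recognition accuracy to 85%~91%, markedly improving early recognition. Deep convolutional neural networks (CNN) perform well in natural image classification, and their use in medical image analysis has become increasingly prominent. Yu et al.^[5] used a deep CNN to recognize melanoma with an average accuracy of 85.5%. Codella et al.^[6] combined sparse coding, deep learning and SVM, raising the accuracy of three-class skin lesion image classification to 93.1%. Esteva et al.^[7] applied transfer learning and fine-tuned Inception V3 to classify three classes of skin lesion images, reaching an accuracy of 71.2%. Kawahara et al.^[8] used a pretrained AlexNet for melanoma detection with an accuracy of 85.8%. Although CNNs extract important local feature information, they are limited in extracting global features. Ayas^[9] combined Swin-T with transfer learning and identified eight skin lesion types in the ISIC-2019 dataset with an accuracy of 82.3%.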

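This paper proposes a Swin-T^[10] and ConvNeXt^[11] multi-level attention hierarchical fusion (Swin-T–ConvNeXt multi-level attention hierarchical fusion, S-C-MAHF) model for skin lesion classification. Hierarchical Swin-T and ConvNeXt extract global and local features respectively, and residual channel attention (Res-CA) and residual spatial attention (Res-SA) modules are proposed to further extract effective features. A multi-level attention (MLA) mechanism processes the multi-scale global and local features. To address the loss of shallow features caused by their distance from the classifier, a hierarchical inverted residual fusion (HIRF) module, built on the idea of level-by-level aggregation, dynamically adjusts the extracted feature information. The algorithm aims to overcome the accuracy and difficulty bottlenecks of existing skin lesion classification methods and to give physicians reliable analytical support for accurately diagnosing skin lesions.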

1 Algorithm description

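The structure of S-C-MAHF is shown in Figure 1. The model consists of a hierarchical ConvNeXt, a hierarchical Swin-T, MLA and HIRF. The hierarchical Swin-T and ConvNeXt encoders use transfer learning and are initialized with weights pretrained on the ImageNet-21K dataset. MLA comprises two modules, multi-level spatial attention (MLSA) and multi-level channel attention (MLCA). MLSA uses Res-SA to dynamically adjust the weights among local feature maps, which strengthens the extraction of the spatial locations of skin lesion features. MLCA uses Res-CA to dynamically adjust the weights among global feature maps, which strengthens the relationships between the channels of skin lesion features. HIRF fuses skin lesion features by merging multi-scale features level by level.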
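The internal design of Res-SA and Res-CA is not given in this part of the text. As a rough illustration only, the sketch below assumes a CBAM-style gate (pooled descriptors passed through a sigmoid) wrapped in a residual connection, written in NumPy for a single feature map of shape (C, H, W). The function names, the weight shapes and the 1x1 mixing used in place of a spatial convolution are all assumptions and may differ from the authors' modules.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def res_channel_attention(x, w1, w2):
    """Assumed Res-CA sketch. x: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    avg = x.mean(axis=(1, 2))              # (C,) average-pooled descriptor
    mx = x.max(axis=(1, 2))                # (C,) max-pooled descriptor

    def mlp(v):
        # Shared two-layer bottleneck with ReLU: C -> C//r -> C.
        return w2 @ np.maximum(w1 @ v, 0.0)

    gate = sigmoid(mlp(avg) + mlp(mx))     # (C,) per-channel weight in (0, 1)
    return x + x * gate[:, None, None]     # residual connection keeps the input

def res_spatial_attention(x, w):
    """Assumed Res-SA sketch. x: (C, H, W); w: (2,) mixing weights."""
    avg = x.mean(axis=0)                   # (H, W) mean over channels
    mx = x.max(axis=0)                     # (H, W) max over channels
    gate = sigmoid(w[0] * avg + w[1] * mx) # (H, W) per-location weight in (0, 1)
    return x + x * gate[None, :, :]        # residual connection keeps the input

# Stand-in feature maps; in the model these would come from the two encoders.
rng = np.random.default_rng(0)
local_feat = rng.standard_normal((8, 4, 4))    # e.g. a ConvNeXt stage output
global_feat = rng.standard_normal((8, 4, 4))   # e.g. a Swin-T stage output
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))

local_out = res_spatial_attention(local_feat, np.array([0.5, 0.5]))
global_out = res_channel_attention(global_feat, w1, w2)
```

Because the gate lies between 0 and 1 and is added on top of the identity path, each output value is the input scaled by a factor between 1 and 2. The attention can therefore emphasize locations or channels without suppressing the original signal, which is the usual motivation for the residual form.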

Figure 1. Overall network structure diagram


Dataset	AKEIC	BCC	BKL	DF	MEL	NV	SCC	VASC
ISIC 2018	8	1	1	1	10	1	0	1
ISIC 2019	6	1	4	10	1	1	6	1

Dataset	Module	Accuracy (%)	Precision (%)	Recall (%)	F1 score (%)
ISIC2018	M1	93.14	88.71	88.69	88.57
	M2	93.87	89.71	89.45	89.51
	M3	92.41	84.83	85.57	84.89
	M4	94.54	88.37	92.28	90.10
	M5	95.01	89.51	92.18	90.51
	M6	95.34	92.37	92.64	92.48
	M7	96.01	93.67	92.65	93.11
ISIC2019	M1	89.34	84.25	85.71	84.85
	M2	90.17	88.78	84.65	86.45
	M3	91.13	89.71	83.37	86.10
	M4	91.53	89.25	86.13	87.59
	M5	91.90	89.93	88.03	88.95
	M6	92.40	91.05	88.75	89.83
	M7	92.79	91.52	88.90	90.15
Note: bold numbers indicate the best values.
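The table reports accuracy, precision, recall and F1 without stating the averaging scheme in this part of the text. The sketch below shows how the four scores follow from a confusion matrix, assuming macro averaging over classes (an assumption, not confirmed by the source). The function name `macro_metrics` and the toy 2x2 matrix are illustrative only.

```python
def macro_metrics(cm):
    """Accuracy and macro-averaged precision, recall and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Macro averaging is an assumption made for this sketch.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))   # column sum
        actual = sum(cm[k])                           # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Toy two-class example: rows are true classes, columns are predictions.
scores = macro_metrics([[8, 2], [1, 9]])
```

Under macro averaging, F1 is the mean of the per-class F1 scores, so it generally differs from the harmonic mean of the macro precision and macro recall.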

Dataset	Method	Accuracy (%)	Precision (%)	Recall (%)	F1 score (%)
ISIC2018	DenseNet121	91.48	85.21	82.33	83.19
	DenseNet201	92.14	89.48	84.63	86.64
	ResNet101	90.41	83.19	85.91	84.18
	ResNet152	91.61	85.70	82.97	83.74
	ConvNeXt	93.14	88.71	88.69	88.57
	Swin-T	92.41	84.83	85.57	84.89
	Ours	96.01	93.67	92.65	93.11
ISIC2019	DenseNet121	88.03	85.75	79.55	82.27
	DenseNet201	89.69	87.67	82.72	84.95
	ResNet101	88.74	86.07	79.87	82.46
	ResNet152	89.92	89.06	83.72	86.14
	ConvNeXt	89.34	84.25	85.71	84.85
	Swin-T	90.17	88.78	84.65	86.45
	Ours	92.79	91.52	88.90	90.15
Note: bold numbers indicate the best values.

Method	Floating-point operations/MFLOPs	Parameters	Frame rate/FPS
Swin-T	101 810.80	205.52e6	53.35
ConvNeXt	91 533.92	196.04e6	83.71
Ours	151 272.84	309.46e6	44.04
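The cost of the fusion is easier to judge as ratios than as raw figures. The short sketch below recomputes relative compute, model size and throughput from the three rows of the table above. The dictionary layout and the helper name `relative_cost` are illustrative, and the latency figure is simply the reciprocal of the reported frame rate.

```python
# Values copied from the table above (MFLOPs, parameters, frames per second).
models = {
    "Swin-T":   {"mflops": 101810.80, "params": 205.52e6, "fps": 53.35},
    "ConvNeXt": {"mflops": 91533.92,  "params": 196.04e6, "fps": 83.71},
    "Ours":     {"mflops": 151272.84, "params": 309.46e6, "fps": 44.04},
}

def relative_cost(name, baseline):
    """Ratios of `name` to `baseline` for compute, size and throughput."""
    a, b = models[name], models[baseline]
    return {
        "flops_ratio": a["mflops"] / b["mflops"],
        "params_ratio": a["params"] / b["params"],
        "fps_ratio": a["fps"] / b["fps"],
    }

for baseline in ("Swin-T", "ConvNeXt"):
    r = relative_cost("Ours", baseline)
    print(f"Ours vs {baseline}: "
          f"{r['flops_ratio']:.2f}x FLOPs, "
          f"{r['params_ratio']:.2f}x parameters, "
          f"{r['fps_ratio']:.2f}x frame rate")

# Per-image latency implied by the reported frame rate, in milliseconds.
latency_ms = {name: 1000.0 / m["fps"] for name, m in models.items()}
```

By these figures the fused model needs fewer FLOPs and fewer parameters than the two backbones added together, while running at a lower frame rate than either one alone.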

1.	Kumari S, Choudhary P K, Shukla R, et al. Recent advances in nanotechnology based combination drug therapy for skin cancer. J Biomater Sci Polym Ed, 2022, 33(11): 1435-1468.
2.	Steiner A, Binder M, Schemper M, et al. Statistical evaluation of epiluminescence microscopy criteria for melanocytic pigmented skin lesions. J Am Acad Dermatol, 1993, 29(4): 581-588.
3.	Celebi M E, Iyatomi H, Schaefer G, et al. Lesion border detection in dermoscopy images. Comput Med Imaging Graph, 2009, 33(2): 148-153.
4.	Ganster H, Pinz P, Rohrer R, et al. Automated melanoma recognition. IEEE Trans Med Imaging, 2001, 20(3): 233-239.
5.	Yu L, Chen H, Dou Q, et al. Automated melanoma recognition in dermoscopy images via very deep residual networks. IEEE Trans Med Imaging, 2017, 36(4): 994-1004.
6.	Codella N, Cai J, Abedini M, et al. Deep learning, sparse coding, and SVM for melanoma recognition in dermoscopy images// International Workshop on Machine Learning in Medical Imaging. Munich: MLMI, 2015: 118-126.
7.	Esteva A, Kuprel B, Novoa R A, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 2017, 542(7639): 115-118.
8.	Kawahara J, BenTaieb A, Hamarneh G. Deep features to classify skin lesions// 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI). Prague: IEEE, 2016: 1397-1400.
9.	Ayas S. Multiclass skin lesion classification in dermoscopic images using swin transformer model. Neural Comput Appl, 2023, 35(9): 6713-6722.
10.	Liu Z, Lin Y, Cao Y, et al. Swin transformer: Hierarchical vision transformer using shifted windows// Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 10012-10022.
11.	Liu Z, Mao H, Wu C Y, et al. A convnet for the 2020s// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 11976-11986.
12.	Woo S, Park J, Lee J Y, et al. Cbam: Convolutional block attention module// Proceedings of the European Conference on Computer Vision (ECCV). Munich: ECCV, 2018: 3-19.
13.	Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4510-4520.
14.	Romdhane T F, Alhichri H, Ouni R, et al. Electrocardiogram heartbeat classification based on a deep convolutional neural network and focal loss. Comput Biol Med, 2020, 123: 103866.
15.	Tschandl P, Rosendahl C, Kittler H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci Data, 2018, 5(1): 180161.
16.	Combalia M, Codella N C F, Rotemberg V, et al. BCN20000: Dermoscopic lesions in the wild. arXiv, 2019, 2019: 1908.02288.
17.	Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convolutional networks// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 4700-4708.
18.	He K, Zhang X, Ren S, et al. Deep residual learning for image recognition// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
19.	Ali R, Hardie R C, Narayanan B N, et al. Deep learning ensemble methods for skin lesion analysis towards melanoma detection// 2019 IEEE National Aerospace and Electronics Conference (NAECON). Dayton: IEEE, 2019: 311-316.
20.	Pacheco A G C, Ali A R, Trappenberg T. Skin cancer detection based on deep learning and entropy to detect outlier samples. arXiv, 2019, 2019: 1909.04525.
21.	Ali R, Hardie R C, De Silva M S, et al. Skin lesion segmentation and classification for ISIC 2018 by combining deep CNN and handcrafted features. arXiv, 2019, 2019: 1908.05730.
22.	Ahmed S A A, Yanikoğlu B, Göksu Ö, et al. Skin lesion classification with deep CNN ensembles// 2020 28th Signal Processing and Communications Applications Conference (SIU). Gaziantep: IEEE, 2020: 1-4.
23.	Almaraz-Damian J A, Ponomaryov V, Sadovnychiy S, et al. Melanoma and nevus skin lesion classification using handcraft and deep learning feature fusion via mutual information measures. Entropy, 2020, 22(4): 484.
24.	Guissous A E. Skin lesion classification using deep neural network. arXiv, 2019, 2019: 1911.07817.
25.	Zhang J, Xie Y, Xia Y, et al. Attention residual learning for skin lesion classification. IEEE Trans Med Imaging, 2019, 38(9): 2092-2103.
26.	Benyahia S, Meftah B, Lézoray O. Multi-features extraction based on deep learning for skin lesion classification. Tissue Cell, 2022, 74: 101701.
27.	Afza F, Sharif M, Khan M A, et al. Multiclass skin lesion classification using hybrid deep features selection and extreme learning machine. Sensors, 2022, 22(3): 799.

ISIC2018	Accuracy (%)	ISIC2019	Accuracy (%)
Ref. [19]	93.60	Ref. [20]	89.00
Ref. [21]	84.10 (recall)	Ref. [22]	90.60
Ref. [23]	92.40	Ref. [24]	91.00
Ref. [25]	93.40	Ref. [26] (DenseNet201 with Fine KNN)	91.71
Ref. [27]	94.36	Ref. [26] (DenseNet201 with Cubic SVM)	92.34
Ours	96.01	Ours	92.79



