Dual-Branch Multi-Task Regressor and Transformer Model for Endoscopic Image Classification.
Endoscopy plays a crucial role in the early diagnosis of colon cancer. Manual review of images by skilled endoscopists is time-consuming, which makes automatic image classification highly valuable. We propose a novel multi-label classification method that integrates complementary learning from both local and global approaches. The model comprises a Swin Transformer branch for global feature extraction and a modified VGG16-based CNN branch for local feature analysis. The learning capability of the CNN branch is enhanced by concatenating a saliency map and by predicting a texture feature vector within a multi-task learning framework. The proposed method outperformed state-of-the-art techniques, achieving an F1-score of 96.08% and an accuracy of 96.06% on the classification of the Kvasir-v2 dataset of endoscopic images.
Clinical Relevance: Experimental results demonstrate the superiority of the proposed model for classifying endoscopic images, paving the way for enhanced diagnostic performance in clinical settings.
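As a rough illustration of the multi-task idea described in the abstract, the sketch below combines a classification loss with a regression loss on a predicted texture feature vector. The abstract does not state the loss functions or their weighting, so the use of cross-entropy, mean squared error, the function names, and the `texture_weight` parameter are all assumptions made for this example, not the authors' formulation.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Classification loss for one image: negative log-probability of the true class."""
    return -math.log(softmax(logits)[target_index])


def mse(pred, target):
    """Regression loss: mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_task_loss(class_logits, class_target, texture_pred, texture_target,
                    texture_weight=0.5):
    """Weighted sum of the classification and texture-regression losses.

    texture_weight is a hypothetical hyperparameter; the abstract gives no value.
    """
    return (cross_entropy(class_logits, class_target)
            + texture_weight * mse(texture_pred, texture_target))
```

Under this assumed form, the auxiliary texture term acts as an extra training signal for the CNN branch, while only the class prediction is used at inference time.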
Authors
Sobhaninia, Mirmahboub, Abharian, Karimi, Shirani, Samavi