DEEP METRIC LEARNING FOR FEW-SHOT PLANT DISEASES IMAGE CLASSIFICATION.

Elassiouti, Hosam Sherif; El-Saadawy, Hadeer; ElBery, Maryam; ElGamal, Mahmoud

doi:10.21608/ijicis.2025.401650.1409

Document Type : Original Article

Authors

¹ Scientific Computing, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt

² Scientific Computing Department, Faculty of Computers and Information Sciences, Ain Shams University, Cairo, Egypt

Abstract

Image classification is a powerful and widely used technique for distinguishing objects across various benchmarks, but it has important limitations. First, a conventional classifier cannot recognize or adapt to categories it did not see during training, which makes it unsuitable for real-world applications where new categories frequently emerge at test time. Second, traditional classification models assume that training and test data are drawn from the same distribution, as is the case in most benchmarks. In real-world scenarios, however, images of the same category may be captured under different environmental conditions and challenging settings, so a well-trained classifier can become ineffective on out-of-distribution (OOD) data. Few-shot learning addresses both problems: few-shot models can adapt to unseen categories and generalize better to OOD data using only a small labeled support set at test time. In this paper, we present a resource-efficient deep metric learning network for plant leaf disease recognition in few-shot scenarios, targeting real-world conditions in which new diseases may emerge and field conditions can vary significantly. Specifically, we introduce a lightweight triplet network built on efficient embedding backbones, MobileNetV2 and MobileViT-S, and optimize it with the triplet loss. Experiments are conducted on the PlantVillage dataset, where the model is trained on 28 categories and evaluated on 10 unseen categories. With MobileViT-S as the embedding backbone, our approach achieves a top-1 few-shot classification accuracy of 87.18% on the unseen categories.
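
The two ideas the abstract combines, a triplet loss for training and a small labeled support set for classifying unseen categories, can be illustrated with a minimal sketch. The abstract does not specify the distance metric, the margin, or the rule used to match a query to the support set, so the Euclidean distance, the margin of 1.0, and the nearest-class-mean rule below are all assumptions made for illustration, as are the function names and the toy 2-D embeddings that stand in for backbone outputs.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin of 1.0 is an illustrative default, not a value from the paper.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def classify_few_shot(query, support):
    """Assign the query to the class whose mean support embedding is nearest.

    `support` maps a class label to a list of embedding vectors. The
    nearest-class-mean rule is an assumption; the paper may match queries
    to the support set differently.
    """
    best_label, best_dist = None, float("inf")
    for label, vectors in support.items():
        # Average the support embeddings of this class, dimension by dimension.
        centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
        dist = euclidean(query, centroid)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy 2-D embeddings (hypothetical values) for two unseen classes.
support = {
    "healthy": [[0.0, 0.0], [0.2, 0.0]],
    "blight": [[3.0, 3.0], [3.0, 3.4]],
}
prediction = classify_few_shot([2.5, 3.0], support)

# A well-separated triplet incurs no loss; a violated one is penalized.
loss_easy = triplet_loss([0.0, 0.0], [0.2, 0.0], [3.0, 3.0])
loss_hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In this sketch the loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is what pushes same-class embeddings together and different-class embeddings apart. New categories then need only a handful of labeled embeddings at test time, with no retraining of the backbone.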

Keywords