Large-scale Histopathological Colon Cancer Annotation Model Using Machine Learning Techniques

Document Type: Original Article

Authors

1 Basic Science Department, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt

2 Department of Scientific Computing, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, 11566, Egypt

3 Department of Information Systems, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, 11566, Egypt

4 Elnarges Buildings, 5th Settlement, New Cairo

Abstract

Colon cancer is among the leading causes of morbidity and mortality in adults. Histopathological diagnosis is one of the main components in determining the type of cancer. This study presents a computer-aided diagnosis system for adenocarcinomas of the colon that uses machine learning (ML) to analyze digital pathology images. A dataset of 10,000 images was drawn from the LC25000 collection, with 5,000 images for each class. A Convolutional Neural Network combined with a Light Gradient Boosting Machine (CNN-LightGBM), run with multiple threads, was used as the classification model, and the system was evaluated against other ML algorithms. The reported diagnostic accuracy for colon cancer exceeded 90%, outperforming recent ML algorithms in disease classification accuracy. However, this accuracy was lower than that obtained for lung cancer classification with the same approach. The study demonstrates the potential of ML to improve the accuracy and efficiency of medical diagnosis and highlights the need for further research to improve the accuracy of colon cancer diagnosis.

Keywords