Automatic Dialect Identification of Spoken Arabic Speech Using Deep Neural Networks

Document Type : Original Article

Authors

1 Faculty of Computer and Information Sciences, Ain Shams University

2 Department of Information Systems, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, 11566, Egypt

Abstract

Dialect identification is a subtask of language identification, and it is generally considered the harder problem because dialects of the same language are linguistically similar. This paper introduces a novel approach for identifying three of the most widely used Arabic dialects: Egyptian, Levantine, and Gulf. Four experiments were conducted using classification approaches ranging from simple classifiers, such as Gaussian Naïve Bayes and Support Vector Machines, to more complex classifiers based on Deep Neural Networks (DNNs). A feature vector of 13 Mel-frequency cepstral coefficients (MFCCs) extracted from the audio signals was used to train the classifiers on a multi-dialect parallel corpus. The experimental results showed that the proposed convolutional neural network-based classifier outperformed the other classifiers on all three dialects. It achieved average improvements in precision, recall, and F1-score, respectively, of 0.16, 0.19, and 0.19 for the Egyptian dialect; 0.07, 0.13, and 0.10 for the Gulf dialect; and 0.52, 0.35, and 0.49 for the Levantine dialect.
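
The per-dialect precision, recall, and F1-score figures reported above can be illustrated with a minimal sketch of how such metrics are typically computed from true and predicted labels. The function name `per_dialect_scores` and the toy labels are assumptions made for illustration only; they are not the authors' code or data, and the paper's evaluation procedure may differ.

```python
from collections import Counter

# The three dialect classes named in the abstract.
DIALECTS = ("Egyptian", "Gulf", "Levantine")


def per_dialect_scores(y_true, y_pred, labels=DIALECTS):
    """Return {label: (precision, recall, f1)} for each dialect label.

    Hypothetical helper: standard one-vs-rest definitions, not taken
    from the paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1      # correctly identified utterance
        else:
            fp[guess] += 1      # wrongly attributed to `guess`
            fn[truth] += 1      # missed for the true dialect
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy labels invented for illustration; not data from the paper.
y_true = ["Egyptian", "Egyptian", "Gulf", "Gulf", "Levantine", "Levantine"]
y_pred = ["Egyptian", "Gulf", "Gulf", "Gulf", "Levantine", "Egyptian"]
scores = per_dialect_scores(y_true, y_pred)
for dialect, (p, r, f1) in scores.items():
    print(f"{dialect}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

An improvement such as those quoted in the abstract would then be the difference between two classifiers' scores for the same dialect and metric.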

Keywords