Document Type : Original Article
Authors
1 Information Systems Department, Faculty of Computer and Information Sciences, Ain Shams University
2 Information Systems, Faculty of Computer and Information Sciences, Ain Shams University
3 Department of Information Systems, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, 11566, Egypt
Abstract
Sarcasm detection has become a crucial task in natural language processing (NLP) as social media has grown into a fundamental channel of communication, enabling billions of users to interact, share information, and express opinions. Platforms such as Facebook, Instagram, and Twitter have transformed how news and entertainment are consumed, often replacing traditional media outlets. Unlike traditional sentiment analysis, sarcasm detection requires understanding the deeper context behind a statement, as the literal meaning often contrasts with the intended sentiment. This difficulty is compounded by the absence of the facial expressions and vocal cues that typically aid sarcasm detection in face-to-face conversation. Accurately identifying sarcastic content on social media therefore demands models that capture both the contextual and emotional subtleties of a conversation. In this work, we propose an enhanced approach to sarcasm detection that combines conversational context with emotional cues. Our method extracts emotions from the main post and from each comment surrounding a response tweet, summarizes the conversation to reduce the context size while preserving key information, and feeds both the response and the summarized, emotion-enriched context into a RoBERTa-based model for classification. We evaluate our approach on a Twitter dataset. Experimental results show that combining summarized context with emotional cues achieves an F1-score of 0.8374, outperforming models that use only the response or rely solely on summarized context. Furthermore, our approach reduces the data size by 41%, lowering memory usage and addressing the computational challenges posed by large conversational contexts.
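To make the abstract's pipeline concrete, the sketch below shows only the input-construction step: tagging context turns with emotion cues and pairing the summarized context with the response tweet in the two-segment format RoBERTa-style models commonly use. The bracketed emotion labels, the helper names, and the `</s></s>` separator convention are illustrative assumptions, not the paper's actual implementation; the paper's emotion extraction and summarization are learned components omitted here.

```python
# Illustrative sketch of building a RoBERTa-style classifier input from an
# emotion-tagged, summarized conversation and a response tweet.
# Helper names and the bracketed emotion cues are hypothetical placeholders.

def tag_with_emotion(text: str, emotion: str) -> str:
    """Prepend a bracketed emotion cue to a post or comment (illustrative)."""
    return f"[{emotion}] {text}"

def build_model_input(response: str, summarized_context: list[str]) -> str:
    """Join the emotion-tagged summarized context with the response tweet,
    using the two-segment separator convention of RoBERTa pair inputs."""
    context = " ".join(summarized_context)
    return f"{context} </s></s> {response}"

# Toy conversation: main post and one comment, each tagged with an emotion,
# standing in for the summarized, emotion-enriched context.
context = [
    tag_with_emotion("Another Monday, another delay on line 4.", "anger"),
    tag_with_emotion("They promised fixes last month.", "disappointment"),
]
model_input = build_model_input("Wow, the trains are SO reliable.", context)
print(model_input)
```

In an actual run, `model_input` would be tokenized and passed to the fine-tuned RoBERTa classifier; the point here is only how the response and its emotion-enriched context share one input sequence.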
Keywords