Evaluating Privacy-Level Metrics in Privacy-Preserving Data Mining

Document Type: Original Article

Authors

1 Planning Techniques Center, Institute of National Planning, Cairo, Egypt

2 Information Systems Department, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt

3 Dean, Faculty of Computer Science and Engineering, Galala University, Suez, Egypt

Abstract

The increasing collection and analysis of sensitive personal data necessitates robust Privacy-Preserving Data Mining (PPDM) methods. PPDM techniques are essential for extracting valuable insights from sensitive data while preserving individuals’ privacy. A critical aspect of implementing PPDM is assessing how effectively these techniques safeguard privacy. However, despite the growing significance of PPDM, the metrics used to evaluate the effectiveness of PPDM techniques, particularly with respect to privacy preservation, are still not comprehensively understood. This paper addresses this research gap by presenting an extensive study of privacy-level metrics for PPDM methods. The study examines data privacy metrics, which quantify the uncertainty faced by adversaries attempting to infer original sensitive data from transformed datasets. It also analyzes results privacy metrics, which assess the risk of sensitive information disclosure from data mining outputs. In addition, the paper presents a new classification of privacy-level metrics based on the phase of the PPDM process in which they can be used. The study further provides a detailed analytical discussion of these metrics, examining their strengths and limitations and their implications for practical applications. Finally, the paper highlights several considerations and challenges in measuring privacy across different PPDM methods in the absence of a universally accepted definition of privacy. By providing a comprehensive overview of existing privacy-level metrics, this study establishes a foundation for evaluating PPDM methods and contributes to the advancement of responsible and trustworthy data mining practices.

Keywords