Article

Privacy-preserving data mining

ACM SIGMOD Record 06/2004; 29:439-450.
Source: CiteSeer
  • Source
    Article: A trust‐based noise injection strategy for privacy protection in cloud
    ABSTRACT: Cloud promises users that they can present and deploy IT services in a pay-as-you-go fashion in an open and virtualized cloud environment while saving huge capital investment in their own IT infrastructure. In this sense, protection of users' privacy is critical and has become one of the most concerned issues as otherwise users may eventually lose the confidence and passion of deploying cloud in practice. Under some special cloud circumstances, some users' privacy, such as plans or habits, could be induced from their service requests by service providers without permissions from users. In this regard, obfuscation strategy can protect this kind of privacy by injecting ‘noise’ service requests to confuse potential ‘immoral’ service providers. However, existing noise obfuscation strategies focus on single noise injection whereas investigation of noise injection architecture has been neglected. Especially, a common service pattern in inter-clouds environment, the cooperative service process including different service providers, makes the risk of privacy serious and uncontrollable by the spread of users' privacy. To address this, we present a novel trust-based noise injection strategy for privacy protection in cloud. To support the strategy, we describe our noise injection architecture in cloud which specializes in the relations between various service roles in inter-clouds based on our trust model. The simulation can demonstrate that our noise injection strategy could significantly improve the effectiveness of privacy protection. Copyright © 2011 John Wiley & Sons, Ltd.
    Software Practice and Experience 03/2012; 42(4):431 - 445. · 0.52 Impact Factor
  • Source
    Conference Proceeding: Effects of Data Anonymization on the Data Mining Results
    ABSTRACT: This article examines the possibility of publication of students’ data, such as secondary school success, state graduation exam scores and success during their first year of university study for analyses. In order to discover data patterns and relationships using data mining techniques, the data must be released in the form of original tuples, instead of pre-aggregated statistics. These records contain sensitive and even confidential personal information, which implies significant privacy concerns regarding the disclosure of such data. Removing explicit identifiers prior to data release cannot guarantee anonymity, since the datasets still contain information that can be used for linking the released records with publicly available collections that include students’ identities. One of the privacy preserving techniques proposed in the literature is the k-anonymization. The process of anonymizing a data set usually involves generalizing data records and, consequently, it incurs loss of relevant information. In the primary research undertaken in the University of Dubrovnik’s students’ database the effect of anonymization has been measured by comparing the results of mining the original data set with the results of mining the altered data set to determine if it is possible to use anonymized data for research purposes.
    35. International Convention MIPRO/miproBIS, Opatija, Croatia; 05/2012
  • Source
    Article: A practical approximation algorithm for optimal k-anonymity
    ABSTRACT: k-Anonymity is a privacy preserving method for limiting disclosure of private information in data mining. The process of anonymizing a database table typically involves generalizing table entries and, consequently, it incurs loss of relevant information. This motivates the search for anonymization algorithms that achieve the required level of anonymization while incurring a minimal loss of information. The problem of k-anonymization with minimal loss of information is NP-hard. We present a practical approximation algorithm that enables solving the k-anonymization problem with an approximation guarantee of O(ln k). That algorithm improves an algorithm due to Aggarwal et al. (Proceedings of the international conference on database theory (ICDT), 2005) that offers an approximation guarantee of O(k), and generalizes that of Park and Shim (SIGMOD ’07: proceedings of the 2007 ACM SIGMOD international conference on management of data, 2007) that was limited to the case of generalization by suppression. Our algorithm uses techniques that we introduce herein for mining closed frequent generalized records. Our experiments show that the significance of our algorithm is not limited only to the theory of k-anonymization. The proposed algorithm achieves lower information losses than the leading approximation algorithm, as well as the leading heuristic algorithms. A modified version of our algorithm that issues ℓ-diverse k-anonymizations also achieves lower information losses than the corresponding modified versions of the leading algorithms. Keywords: Privacy-preserving data mining; k-Anonymity; ℓ-Diversity; Approximation algorithms for NP-hard problems; Frequent generalized itemsets
    Data Mining and Knowledge Discovery 04/2012; 25(1):134-168. · 1.54 Impact Factor
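
The two k-anonymity abstracts above describe generalizing records until each one is indistinguishable from at least k - 1 others on its quasi-identifiers. The following is a minimal sketch of that idea, not the algorithm from either paper. The sample records, the field names (`age`, `zip`, `grade`), and the helpers `is_k_anonymous` and `generalize_age` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width=10):
    """Replace an exact age with a range of the given width (hypothetical rule)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical student records; "age" and "zip" act as quasi-identifiers.
records = [
    {"age": 21, "zip": "20000", "grade": "A"},
    {"age": 24, "zip": "20000", "grade": "B"},
    {"age": 35, "zip": "20000", "grade": "A"},
    {"age": 38, "zip": "20000", "grade": "C"},
]
quasi = ["age", "zip"]

# Each exact age is unique, so the raw table is not 2-anonymous.
raw_ok = is_k_anonymous(records, quasi, k=2)

# After generalizing ages to decades, each group holds two records.
generalized = [generalize_age(r) for r in records]
gen_ok = is_k_anonymous(generalized, quasi, k=2)
```

The generalized table satisfies the check only because exact ages were coarsened to decades, which is the information loss both abstracts refer to.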
