InterPARES Trust - Terminology - Citations (English)

Citations

Rajalaxmi and Natarajan 2012 (†724)
Rajalaxmi, R. R., and A. M. Natarajan, "Effective Sanitization Approaches to Hide Sensitive Utility and Frequent Itemsets," Intelligent Data Analysis 16:6 (2012), pp. 933-951.

data sanitization (p.934): Data sanitization approaches hide the sensitive knowledge by modifying the original database. Usually, these approaches hide either frequent itemsets or utility itemsets, but not both. Also, frequent itemset hiding considers the presence or absence of items, whereas utility itemset hiding deals with internal and external utility of items. When support and utility of the itemsets are combined, it produces itemsets with high utility and support. When the data owner intends to hide sensitive utility and frequent itemsets, it is not possible to use frequent itemset hiding approaches since they reveal certain sensitive itemsets even after sanitization. (†1651)
data sanitization (p.936): ... There are subtle differences between data perturbation and data sanitization. First, data perturbation mainly focuses on individual data privacy whereas data sanitization methods aim to protect sensitive knowledge. In data perturbation, data utility is measured with the accurate aggregate statistical information while data sanitization measures data utility based on the ability to discover non-sensitive patterns. Also, data perturbation techniques have assumptions about the data distribution whereas data sanitization does not consider the distribution. (†1652)
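
Illustrative note (not part of the cited source): the p.934 excerpt separates support, which depends only on whether items are present in a transaction, from utility, which combines internal utility (taken here to be a purchased quantity) with external utility (taken here to be a unit profit). The Python sketch below computes both measures on a made-up four-transaction database, then hides one itemset by deleting an item from supporting transactions. The data, the function names, and the deletion strategy are all assumptions chosen for illustration; this is a naive baseline, not the algorithm proposed by Rajalaxmi and Natarajan.

```python
from typing import Dict, FrozenSet, List

# A transaction maps each item to its quantity (assumed "internal utility").
Transaction = Dict[str, int]


def contains(t: Transaction, itemset: FrozenSet[str]) -> bool:
    """True if every item of the itemset is present in the transaction."""
    return all(item in t for item in itemset)


def support_count(db: List[Transaction], itemset: FrozenSet[str]) -> int:
    """Number of transactions containing the itemset (presence/absence only)."""
    return sum(1 for t in db if contains(t, itemset))


def utility(db: List[Transaction], itemset: FrozenSet[str],
            profit: Dict[str, int]) -> int:
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    return sum(t[i] * profit[i] for t in db if contains(t, itemset) for i in itemset)


def sanitize(db: List[Transaction], itemset: FrozenSet[str],
             max_support: int) -> List[Transaction]:
    """Naive hiding: delete one item of the itemset from supporting
    transactions until its support count is at most max_support.
    Returns a modified copy; the original database is left untouched."""
    out = [dict(t) for t in db]
    victim = sorted(itemset)[0]  # arbitrary but deterministic choice
    for t in out:
        if support_count(out, itemset) <= max_support:
            break
        if contains(t, itemset):
            del t[victim]
    return out


# Hypothetical example data: four transactions and unit profits.
db = [
    {"a": 2, "b": 1},
    {"a": 1, "b": 3, "c": 1},
    {"b": 2, "c": 2},
    {"a": 1, "c": 4},
]
profit = {"a": 5, "b": 2, "c": 1}
sensitive = frozenset({"a", "b"})

sanitized = sanitize(db, sensitive, max_support=1)
print(support_count(db, sensitive), utility(db, sensitive, profit))
print(support_count(sanitized, sensitive), utility(sanitized, sensitive, profit))
```

In this sketch the itemset {a, b} is hidden by lowering its support count, and its utility drops only as a side effect. A method that targets just one of the two measures gives no guarantee about the other, which is the gap the p.934 excerpt points to when both sensitive utility and frequent itemsets must be hidden.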