Anatomy: Simple and Effective Privacy Preservation

 

Xiaokui Xiao and Yufei Tao

 

In Very Large Data Bases conference (VLDB), 2006

 
Abstract

This paper presents a novel technique, anatomy, for publishing sensitive data. Anatomy releases all the quasi-identifier and sensitive values directly in two separate tables. Combined with a grouping mechanism, this approach protects privacy, and captures a large amount of correlation in the microdata. We develop a linear-time algorithm for computing anatomized tables that obey the l-diversity privacy requirement, and minimize the error of reconstructing the microdata. Extensive experiments confirm that our technique allows significantly more effective data analysis than the conventional publication method based on generalization. Specifically, anatomy permits aggregate reasoning with average error below 10%, which is lower than the error obtained from a generalized table by orders of magnitude.
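The two-table idea in the abstract can be illustrated with a small Python sketch. It is not the paper's linear-time algorithm: it uses a simple greedy grouping chosen for readability, and it treats l-diversity in a simplified way, as "every group holds at least l tuples with pairwise distinct sensitive values". The function name `anatomize_sketch`, the row layout, and the sample rows are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def anatomize_sketch(rows, l):
    """Split rows into a quasi-identifier table and a sensitive table.

    rows: iterable of (tuple_id, qi_values, sensitive_value).
    Returns (qit, st), where qit holds (tuple_id, qi_values, group_id)
    and st holds (group_id, sensitive_value, count).
    Illustrative greedy grouping only; no attempt is made to be linear-time.
    """
    # Bucket the tuples by sensitive value.
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[2]].append(row)

    # While at least l buckets are non-empty, form a group by taking
    # one tuple from each of the l largest buckets.
    groups = []
    while sum(1 for b in buckets.values() if b) >= l:
        non_empty = [v for v in buckets if buckets[v]]
        largest = sorted(non_empty, key=lambda v: len(buckets[v]), reverse=True)[:l]
        groups.append([buckets[v].pop() for v in largest])

    # Each leftover tuple joins a group that does not yet contain its value.
    for value, bucket in buckets.items():
        for row in bucket:
            target = next((g for g in groups if all(r[2] != value for r in g)), None)
            if target is None:
                raise ValueError("cannot reach the requested diversity with this data")
            target.append(row)

    # Publish the quasi-identifiers and the sensitive values separately,
    # linked only through the group id.
    qit, st = [], []
    for gid, group in enumerate(groups, start=1):
        qit.extend((tid, qi, gid) for tid, qi, _ in group)
        counts = Counter(s for _, _, s in group)
        st.extend((gid, s, c) for s, c in counts.items())
    return qit, st


# Made-up sample rows: (tuple_id, (age, gender), disease).
sample = [
    (1, (23, "M"), "flu"),
    (2, (27, "M"), "cancer"),
    (3, (35, "F"), "flu"),
    (4, (59, "F"), "dyspepsia"),
    (5, (61, "F"), "cancer"),
    (6, (65, "M"), "dyspepsia"),
]
qit, st = anatomize_sketch(sample, l=2)
for record in qit:
    print("QIT", record)
for record in st:
    print("ST ", record)
```

In this sketch the quasi-identifier values are released unchanged, which is what distinguishes the approach from generalization: an observer can link a tuple to its group, but within the group each sensitive value is equally plausible.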

Paper download

    

The long version of this paper, currently under journal submission, can be downloaded here.
 
Implementation and datasets


Before you proceed with downloading, please read and agree to the terms of using our implementation.
 
Download the source code of anatomy and of multidimensional k-anonymity generalization (implemented by Xiaokui Xiao). The source code was compiled with Microsoft Visual Studio. For a Python implementation of anatomy, click here (implemented by Daniel Kifer).

Real datasets used in our experiments: SAL, OCC (each zip file is a package that contains 5 datasets with cardinalities 100k, 200k, ..., 500k)

Dataset formats:
SAL: Each line corresponds to the personal information of an American adult, in the form of:
    tuple-id <s> age <s> gender <s> education <s> marital-status <s> race <s> work-class <s> native-country <s> salary-class
where <s> represents a space.

OCC: Each line corresponds to the personal information of an American adult, in the form of:
    tuple-id <s> age <s> gender <s> education <s> marital-status <s> race <s> work-class <s> native-country <s> occupation
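The two line formats above can be read with a few lines of Python. This is a minimal sketch, assuming only what is stated here: nine space-separated fields per line, with the last field being salary-class for SAL and occupation for OCC. The names `SAL_FIELDS`, `OCC_FIELDS`, `parse_record`, and `read_dataset` are hypothetical, and values are kept as strings because their encoding is not specified on this page.

```python
# Field names follow the format description; underscores replace hyphens.
SAL_FIELDS = (
    "tuple_id", "age", "gender", "education", "marital_status",
    "race", "work_class", "native_country", "salary_class",
)
OCC_FIELDS = SAL_FIELDS[:-1] + ("occupation",)


def parse_record(line, fields):
    """Turn one dataset line into a dict mapping field name to raw string."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(
            f"expected {len(fields)} fields, got {len(values)}: {line!r}"
        )
    return dict(zip(fields, values))


def read_dataset(path, fields):
    """Yield one record per non-blank line of the file at `path`."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_record(line, fields)


# Made-up example line, not taken from the real datasets.
record = parse_record("7 34 1 12 2 0 3 1 40", SAL_FIELDS)
print(record["age"], record["salary_class"])
```

In both datasets the last field is the one treated as sensitive, so `fields[-1]` can serve as the sensitive attribute name when feeding the records into an anonymization routine.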

 
