Geometric Data Perturbation for Privacy-preserving Data Classification
Keke Chen and Ling Liu
This project investigates a data-perturbation approach based on random
geometric transformations for privacy-preserving data classification. The
goal of this perturbation approach is two-fold: preserving the utility of the
data for classification modeling, and preserving the privacy of the data. To
achieve the first goal, we observe that many classification models rely on
the geometric properties of datasets, which are preserved by geometric
transformations. We prove that three well-known types of classifiers deliver
the same performance on the geometrically perturbed dataset as on the
original dataset. As a result, this perturbation approach guarantees no loss
of accuracy for these three popular classification methods. To reach the
second goal, we propose a multi-column privacy model that addresses the
problem of evaluating privacy quality for multidimensional perturbation, and
we develop an attack-resilient perturbation optimization method. Using the
proposed privacy metric, we analyze three types of inference attacks: naive
estimation, ICA-based reconstruction, and distribution-based attacks. Based
on this attack analysis, we develop a randomized optimization method to
optimize the perturbation. Our initial experiments show that this approach
can provide a high privacy guarantee while preserving accuracy for the
discussed classifiers. We will investigate further related geometric
transformations to meet the requirements of different privacy-preserving
mining tasks and models.
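The utility claim above rests on geometric transformations leaving the geometry of the data intact. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the transformation is a random orthogonal matrix plus a random translation, samples the matrix via QR decomposition (one common choice among several), and checks that pairwise Euclidean distances are unchanged. The helper names `random_orthogonal` and `pairwise_distances` are hypothetical, and the data is synthetic.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Sample a d x d orthogonal matrix (assumed sampling scheme: QR of a Gaussian matrix)."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    # Flip column signs so the diagonal of r is positive; q stays orthogonal.
    return q * np.sign(np.diag(r))

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))   # synthetic data: 50 records, 4 columns

R = random_orthogonal(4, rng)      # random orthogonal transformation
t = rng.standard_normal(4)         # random translation, same for every record
Y = X @ R.T + t                    # perturbed dataset

# Orthogonal maps and translations preserve Euclidean distances between records.
distances_preserved = np.allclose(pairwise_distances(X), pairwise_distances(Y))
print(distances_preserved)
```

Because the distances between records are unchanged, any classifier that depends only on those distances (a nearest-neighbor rule, for example, used here purely as an illustration) would make the same decisions on `Y` as on `X`. Additive noise, if used, would break this exact equality and is left out of the sketch.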
Representative papers: