
Data Privacy

Privacy Preserving Data Mining (PPDM)

Statistical Disclosure Control (SDC)


Transparency: Transparency in data privacy is the principle that data should be published together with all information about how it has been processed and protected. This means disclosing both the data protection method used and its parameters. Because adversaries can use this information to refine their attacks (i.e., to mount transparency attacks), disclosure risk measures must also take it into account. We have developed specific (ad hoc) disclosure risk measures that use information about the method and its parameters.

Transparency principle: (similar to Kerckhoffs's principle in cryptography)

``Given a privacy model, a masking method should be compliant with this privacy model even if everything about the method is public knowledge'' (Torra, 2017, p. 17)

Our publications: Transparency and microaggregation
  • Torra, V., Miyamoto, S. (2004) Evaluating fuzzy clustering algorithms for microdata protection, Privacy in Statistical Databases, 2004. (Lecture Notes in Computer Science 3050 175-186) PDF @ Springer Link
    • In this paper we present a heuristic approach for microaggregation that blurs the microclusters so that the risk of disclosure is decreased. The heuristic approach is based on fuzzy clustering. It was introduced to avoid attacks on microaggregation when the transparency principle is applied. We also discuss how easy it is to detect that a file has been microaggregated.
  • Nin, J., Torra, V. (2009) Analysis of the Univariate Microaggregation Disclosure Risk. New Generation Computing, Springer. PDF @ Springer
    • In this paper we attack a file protected with univariate microaggregation, showing that intruders implementing an ad hoc attack (dedicated software for attacking the database) can reidentify many more records than with a standard, generic approach. In addition, we show that in this case there is no uncertainty about whether a record has been reidentified. Note that some approaches to reidentification only give a probability of reidentification. This type of analysis is needed in order to apply data privacy with transparency.
  • Torra, V. (2017) Fuzzy microaggregation for the transparency principle, J. Applied Logic 23 70-80. PDF @ Elsevier
    • In this paper we present a microaggregation algorithm that avoids transparency attacks. The algorithm is based on fuzzy clustering.
  • Nin, J., Herranz, J., Torra, V. (2008) On the Disclosure Risk of Multivariate Microaggregation. Data and Knowledge Engineering (DKE), Elsevier, 67:3 399-412. Paper @ ScienceDirect
    • In this paper we attack a file protected using multivariate microaggregation. As in the univariate case above, we show that more records can be reidentified. Again, this type of analysis is needed to apply masking methods with transparency (i.e., informing the user how data has been protected).
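To make the idea of a transparency attack on microaggregation concrete, here is a minimal Python sketch. It is not the algorithm from the papers above: it assumes a simple fixed-size univariate microaggregation applied to each attribute independently, and a worst-case intruder who holds the original values and knows the group size k. The function names (microaggregate, mask_table, transparency_attack) are hypothetical. For each attribute, knowing the method and k narrows a target down to the k records sharing its centroid; intersecting these candidate sets across attributes can leave a single record.

```python
def microaggregate(values, k):
    """Fixed-size univariate microaggregation (simplified assumption):
    sort the values, cut them into consecutive groups of k (the last
    group absorbs any remainder), replace each value by its group mean."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    masked = [0.0] * len(values)
    n_groups = max(1, len(values) // k)
    for g in range(n_groups):
        start = g * k
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        idx = order[start:end]
        mean = sum(values[i] for i in idx) / len(idx)
        for i in idx:
            masked[i] = mean
    return masked


def mask_table(rows, k):
    """Microaggregate every attribute (column) independently."""
    columns = list(zip(*rows))
    masked_columns = [microaggregate(list(col), k) for col in columns]
    return [tuple(row) for row in zip(*masked_columns)]


def transparency_attack(original_rows, published_rows, k, target):
    """Return the indices of published rows compatible with the target.

    Because the method and k are public, the intruder can recompute the
    centroid of the target's group for each attribute, then intersect
    the per-attribute candidate sets."""
    expected = mask_table(original_rows, k)[target]
    candidates = set(range(len(published_rows)))
    for a, centroid in enumerate(expected):
        candidates &= {j for j, row in enumerate(published_rows)
                       if row[a] == centroid}
    return candidates


# Toy data: two attributes whose rankings disagree.
original = [(1, 40), (2, 10), (3, 30), (4, 20)]
masked = mask_table(original, k=2)
# The file is published in a different record order.
published = [masked[i] for i in (2, 0, 3, 1)]

for t in range(len(original)):
    print(t, transparency_attack(original, published, 2, t))
```

In this toy file each attribute alone leaves two candidates per target, yet the intersection is a single published record, so the link is certain rather than probabilistic.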
Transparency and rank swapping:
  • Nin, J., Herranz, J., Torra, V. (2008) Rethinking Rank Swapping to Decrease Disclosure Risk, Data and Knowledge Engineering, 64:1 346-364. PDF @ ScienceDirect
    • This paper describes an attack on rank swapping under the assumption that the transparency principle is applied. Two variants of rank swapping are introduced that do not suffer from this type of attack.
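The following Python sketch illustrates why publishing the rank swapping parameter helps an intruder. It is a simplified illustration under stated assumptions, not the procedure from the paper: p is taken here as a number of rank positions (rather than a percentage), all values within an attribute are assumed distinct, and the function names (rank_swap, rank_swap_attack) are hypothetical. Since a value can move at most p ranks, each attribute restricts the target to a window of at most 2p+1 published records, and the intruder intersects these windows across attributes.

```python
import random


def rank_swap(values, p, rng):
    """Simplified rank swapping: walking through the values in rank
    order, swap each not-yet-swapped value with a randomly chosen
    unswapped value at most p ranks ahead."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    masked = list(values)
    swapped = set()
    for r, i in enumerate(order):
        if i in swapped:
            continue
        partners = [j for j in order[r + 1:r + 1 + p] if j not in swapped]
        if not partners:
            continue  # no partner left in the window; value stays as is
        j = rng.choice(partners)
        masked[i], masked[j] = values[j], values[i]
        swapped.update((i, j))
    return masked


def rank_swap_attack(masked_rows, target_values, p):
    """Return indices of masked rows compatible with the target's
    original values, given that p is public knowledge."""
    candidates = set(range(len(masked_rows)))
    for a, x in enumerate(target_values):
        column = [row[a] for row in masked_rows]
        ranked = sorted(column)
        # Swapping only permutes a column, so x still appears in it.
        r = ranked.index(x)
        window = set(ranked[max(0, r - p):r + p + 1])
        candidates &= {j for j, v in enumerate(column) if v in window}
    return candidates


rng = random.Random(0)
n, p = 50, 3
# Three attributes with distinct values each (assumption of this sketch).
columns = [rng.sample(range(1000), n) for _ in range(3)]
original = list(zip(*columns))
masked = list(zip(*[rank_swap(list(col), p, rng) for col in columns]))

sizes = [len(rank_swap_attack(masked, original[t], p)) for t in range(n)]
print("average candidate set size:", sum(sizes) / n)
```

The true masked record always survives the intersection, and the candidate set can never exceed 2p+1 records, which is typically far smaller than the file.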


Cite this site as:
V. Torra, Data privacy, Springer, 2017. Associated website: http://www.ppdm.cat/dp/

Vicenç Torra. Last modified: 11:47, September 14, 2017.