
Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • Module: Utilities

    Description

      Story

      As a data scientist, I want to perform anonymization operations on my data, so that I can prepare it for input to predictive analytics algorithms. I also want to be able to de-anonymize my data.

      This feature is especially relevant given the EU General Data Protection Regulation (GDPR):
      https://eugdpr.org/

      Proposed functionality:

      • Create a conversion table for anonymization.
      • Create an anonymized version of a table.
      • Create a de-anonymized version of a table.
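
The three operations above can be sketched in a few lines of Python. This is only an illustration of the idea, not a proposed implementation for this module: the function names (build_conversion_table, anonymize, deanonymize), the sample rows, and the choice of HMAC-SHA256 as the masking function are all assumptions made for the example.

```python
import hashlib
import hmac


def build_conversion_table(values, salt):
    """Map each distinct original value to a salted pseudonym (hypothetical helper)."""
    table = {}
    for value in values:
        if value not in table:
            digest = hmac.new(salt, str(value).encode("utf-8"), hashlib.sha256)
            table[value] = digest.hexdigest()
    return table


def anonymize(rows, column, table):
    """Return a copy of rows with `column` replaced by its pseudonym."""
    return [{**row, column: table[row[column]]} for row in rows]


def deanonymize(rows, column, table):
    """Invert the conversion table to restore the original values."""
    reverse = {masked: original for original, masked in table.items()}
    return [{**row, column: reverse[row[column]]} for row in rows]


# Illustrative data only; the salt would be a secret in practice.
salt = b"example-secret-salt"
patients = [
    {"name": "alice", "visits": 3},
    {"name": "bob", "visits": 1},
    {"name": "alice", "visits": 2},
]

table = build_conversion_table((row["name"] for row in patients), salt)
masked = anonymize(patients, "name", table)
restored = deanonymize(masked, "name", table)
```

In this sketch the conversion table is the only way back to the original values, so whoever holds it (and the salt) controls de-anonymization.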

      Requirements:

      • Multiple columns in a table can be anonymized.
      • Datasets still join correctly, even on masked columns.
      • Aggregates on masked columns match those on the original columns.
      • A salt can be added to the hash function for better security.
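
A small Python sketch shows why a deterministic salted hash can satisfy these points. It assumes HMAC-SHA256 as the masking function and uses hypothetical helpers (mask, mask_columns, join, sum_by) and made-up tables; none of this comes from the ticket. Because equal inputs always produce equal masked values under the same salt, joins and group-wise aggregates line up exactly as they do on the original data. Note that this only holds when every table is masked with the same salt.

```python
import hashlib
import hmac
from collections import defaultdict

SALT = b"example-secret-salt"  # hypothetical; must be shared by all tables to be joined


def mask(value, salt=SALT):
    """Deterministic salted hash: equal inputs give equal outputs."""
    return hmac.new(salt, str(value).encode("utf-8"), hashlib.sha256).hexdigest()


def mask_columns(rows, columns, salt=SALT):
    """Mask every listed column, leaving the other columns untouched."""
    return [
        {k: (mask(v, salt) if k in columns else v) for k, v in row.items()}
        for row in rows
    ]


def join(left, right, key):
    """Simple inner join on a single key column."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index[l[key]]]


def sum_by(rows, key, value):
    """Sum `value` grouped by `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Illustrative data only.
orders = [
    {"customer": "alice", "city": "Oslo", "amount": 10},
    {"customer": "bob", "city": "Rome", "amount": 5},
    {"customer": "alice", "city": "Oslo", "amount": 7},
]
customers = [
    {"customer": "alice", "tier": "gold"},
    {"customer": "bob", "tier": "basic"},
]

masked_orders = mask_columns(orders, {"customer", "city"})
masked_customers = mask_columns(customers, {"customer"})

joined_plain = join(orders, customers, "customer")
joined_masked = join(masked_orders, masked_customers, "customer")

totals_plain = sum_by(orders, "customer", "amount")
totals_masked = sum_by(masked_orders, "customer", "amount")
```

Here the masked join returns the same number of rows as the plain join, and the per-customer totals are identical apart from the key being a pseudonym. Changing the salt changes every masked value, which is what makes precomputed lookup tables less useful to an attacker.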

      References

      [1] PDL tools
      http://pivotalsoftware.github.io/PDLTools/group__grp__anonymization.html

      [2] General information on anonymization
      https://en.wikipedia.org/wiki/Data_anonymization

      [3] Blog on hashing
      https://crackstation.net/hashing-security.htm

            People

              Assignee: Unassigned
              Reporter: Frank McQuillan (fmcquillan)
              Votes: 0
              Watchers: 3
