Directory Client API / DIRAPI-225

Add an LDIF anonymizer that takes an LDIF file and replaces the values with random text

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0-M28
    • Fix Version/s: 1.0.0-M29
    • Labels: None

      Description

      From time to time, we have to ask users for their LDIF files, or users have to send LDIF files to someone else for test purposes. It is important to be able to produce anonymized files, so that no sensitive information is leaked.

      The idea is to read the original LDIF and replace all the values with random, but syntactically correct, values.

      It should also be configurable (i.e., the list of attributes to anonymize should be extensible).

      We have to take care of DNs too, and of attributes whose values are DNs pointing to some of the base entries (like member).
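A minimal sketch of the value replacement described above, in Python rather than the project's own code. It assumes that "syntactically correct" can be approximated by keeping the shape of each value (length, case, digit positions, punctuation); a real implementation would need to honor each attribute's LDAP syntax. The names `anonymize_value` and `anonymize_entry`, and the dict-based entry model, are hypothetical.

```python
import random
import string


def anonymize_value(value, rng):
    """Replace letters and digits with random ones of the same class.

    Length, case, and punctuation are kept, so a mail address still looks
    like a mail address and a phone number still looks like a phone number.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.isalpha():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keep '@', '.', '+', spaces, and so on
    return "".join(out)


def anonymize_entry(entry, attributes, rng):
    """Anonymize the listed attributes of one entry.

    entry: dict mapping attribute name -> list of string values (assumed model).
    attributes: names to anonymize, matched case-insensitively.
    """
    targets = {name.lower() for name in attributes}
    result = {}
    for name, values in entry.items():
        if name.lower() in targets:
            result[name] = [anonymize_value(v, rng) for v in values]
        else:
            result[name] = list(values)
    return result
```

Passing the list of attributes as a parameter is what makes the behavior extensible: callers can add their own attribute names without touching the replacement logic.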

        Activity

        elecharny Emmanuel Lecharny added a comment -

        Done

        akiran Kiran Ayyagari added a comment -

        Emmanuel Lecharny, using a slightly modified LdifPartition might help in solving this, provided the DNs are anonymized first using the rename operation. One additional pass at the beginning is still required to collect the groups, though.

        mheyman Marty Heyman added a comment -

        Yeah, I'm not sure we can afford to maintain workable LDIF. Most of the anonymization has potentially problematic global implications. DN is the worst.

        elecharny Emmanuel Lecharny added a comment -

        Anonymizing attributes whose values are DNs is quite a challenge. An anonymized DN must still be a valid DN that exists in the DIT. We already transform an entry's DN when we anonymize the AT that is part of the RDN, but if this DN is referenced elsewhere in the LDIF file, then the reference should use the anonymized version too. The problem is that we may refer to DNs that we have not yet processed...

        That would require that we parse the LDIF file first and keep track of all the DNs in it, each associated with its anonymized form (which requires that we also anonymize the AT during this phase). We can even think of cycles between entries...
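The two-pass idea can be sketched as follows, in Python and under stated assumptions: DNs are split naively on commas (no escaped commas, no multi-valued RDNs), entries are modeled as `(dn, attributes)` pairs, and the helper names `build_dn_map` and `rewrite_entries` are hypothetical. Because the whole map is built before any reference is rewritten, forward references and cycles between entries need no special handling.

```python
import itertools


def normalize_dn(dn):
    """Naive normalization: trim each RDN and lowercase the whole DN."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def build_dn_map(dns, rdn_types):
    """Pass 1: map every DN in the file to its anonymized form.

    The same (type, value) RDN always gets the same placeholder, so a parent
    and its children keep a consistent suffix.
    """
    counter = itertools.count(1)
    targets = {t.lower() for t in rdn_types}
    rdn_cache = {}
    dn_map = {}
    for dn in dns:
        new_rdns = []
        for rdn in dn.split(","):  # assumption: no escaped commas
            attr, _, value = rdn.strip().partition("=")
            if attr.lower() in targets:
                key = (attr.lower(), value.lower())
                if key not in rdn_cache:
                    rdn_cache[key] = "anon%d" % next(counter)
                value = rdn_cache[key]
            new_rdns.append("%s=%s" % (attr, value))
        dn_map[normalize_dn(dn)] = ",".join(new_rdns)
    return dn_map


def rewrite_entries(entries, dn_map, dn_attrs):
    """Pass 2: rewrite entry DNs and DN-valued attributes using the map.

    References to DNs that are not in the file are left unchanged here; a
    real tool would have to decide how to treat them.
    """
    dn_targets = {a.lower() for a in dn_attrs}
    result = []
    for dn, attrs in entries:
        new_attrs = {}
        for name, values in attrs.items():
            if name.lower() in dn_targets:
                values = [dn_map.get(normalize_dn(v), v) for v in values]
            new_attrs[name] = list(values)
        result.append((dn_map[normalize_dn(dn)], new_attrs))
    return result
```

For example, two entries that name each other as `manager` are both rewritten correctly, since each lookup hits a map that already contains both DNs.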

        mheyman Marty Heyman added a comment -

        Looks good.

        elecharny Emmanuel Lecharny added a comment -

        We can default to a list of attributes that are always anonymized, no matter what:

        • userPassword
        • displayName
        • givenName
        • surName
        • homePhone
        • homePostalAddress
        • jpegPhoto
        • labeledURI
        • mail
        • manager
        • mobile
        • organizationName
        • pager
        • photo
        • secretary
        • uid
        • userCertificate
        • userPKCS12
        • userSMIMECertificate
        • x500UniqueIdentifier
        • carLicense
        • host
        • locality
        • organizationalUnitName
        • seeAlso
        • homeDirectory
        • uidNumber
        • gidNumber
        • commonName
        • gecos
        • description
        • memberUid
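One possible way to hold this default list while keeping it extensible, sketched in Python. The names `DEFAULT_ANONYMIZED`, `effective_attributes`, and `should_anonymize` are hypothetical, and the case-insensitive matching and the stripping of attribute options such as `;binary` are assumptions about how attribute names appear in LDIF files.

```python
# Default attributes taken from the list above, stored lowercased because
# LDAP attribute names are case-insensitive.
DEFAULT_ANONYMIZED = frozenset(name.lower() for name in (
    "userPassword", "displayName", "givenName", "surName", "homePhone",
    "homePostalAddress", "jpegPhoto", "labeledURI", "mail", "manager",
    "mobile", "organizationName", "pager", "photo", "secretary", "uid",
    "userCertificate", "userPKCS12", "userSMIMECertificate",
    "x500UniqueIdentifier", "carLicense", "host", "locality",
    "organizationalUnitName", "seeAlso", "homeDirectory", "uidNumber",
    "gidNumber", "commonName", "gecos", "description", "memberUid",
))


def effective_attributes(extra=()):
    """Defaults are always included; callers can only add to them."""
    return DEFAULT_ANONYMIZED | {name.lower() for name in extra}


def should_anonymize(attribute, attributes=DEFAULT_ANONYMIZED):
    """Match on the base name, ignoring options such as ';binary'."""
    return attribute.split(";", 1)[0].lower() in attributes
```

Note that the sketch matches on the names exactly as listed; short aliases such as `cn` or `sn` would need to be added explicitly or resolved through the schema.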

          People

          • Assignee: Unassigned
          • Reporter: elecharny Emmanuel Lecharny
          • Votes: 0
          • Watchers: 3
