Hadoop Map/Reduce: MAPREDUCE-4491

Encryption and Key Protection


Details

    Description

      When dealing with sensitive data, the data must be kept encrypted wherever it is stored. A common use case is to pull encrypted data out of a data source and store it in HDFS for analysis. The keys are stored in an external keystore.

      The feature adds a customizable framework for integrating different types of keystores, support for the Java KeyStore, the ability to read keys from keystores, and a way to transport keys from the JobClient to Tasks.
      The feature also adds PGP encryption as a codec, along with additional utilities for encryption-related steps.
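
As an illustration only, reading a key from a Java KeyStore through a pluggable provider could look roughly like the sketch below. The `KeyProvider` interface, the `JavaKeyStoreProvider` class, the alias and the password are hypothetical names chosen for this example and are not taken from the patch or the design document. The sketch uses only the standard `java.security.KeyStore` API with an in-memory JCEKS store, and it does not cover transporting the key from the JobClient to Tasks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.Key;
import java.security.KeyStore;
import java.util.Arrays;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

class KeyStoreSketch {

    // Hypothetical abstraction: one implementation per keystore type.
    interface KeyProvider {
        byte[] getKey(String alias) throws Exception;
    }

    // Hypothetical implementation backed by a Java KeyStore (JCEKS type).
    static final class JavaKeyStoreProvider implements KeyProvider {
        private final KeyStore keyStore;
        private final char[] password;

        JavaKeyStoreProvider(byte[] storeBytes, char[] password) throws Exception {
            this.keyStore = KeyStore.getInstance("JCEKS");
            this.keyStore.load(new ByteArrayInputStream(storeBytes), password);
            this.password = password.clone();
        }

        @Override
        public byte[] getKey(String alias) throws Exception {
            Key key = keyStore.getKey(alias, password);
            if (key == null) {
                throw new IllegalArgumentException("No key for alias: " + alias);
            }
            return key.getEncoded();
        }
    }

    // Builds an in-memory keystore holding one secret key, standing in
    // for the external keystore mentioned in the description.
    static byte[] createStore(String alias, SecretKey key, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, password); // initialize an empty store
        ks.setEntry(alias, new KeyStore.SecretKeyEntry(key),
                new KeyStore.PasswordProtection(password));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, password);
        return out.toByteArray();
    }

    // Stores a freshly generated AES key, reads it back through the
    // provider, and reports whether the bytes match.
    static boolean roundTrip() {
        try {
            char[] password = "changeit".toCharArray(); // example password only
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            SecretKey original = generator.generateKey();

            byte[] store = createStore("job-key", original, password);
            KeyProvider provider = new JavaKeyStoreProvider(store, password);
            return Arrays.equals(original.getEncoded(), provider.getKey("job-key"));
        } catch (Exception e) {
            throw new IllegalStateException("Keystore round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("keys match: " + roundTrip());
    }
}
```

Keeping the provider behind a small interface is one way to let other keystore types be plugged in later without changing the code that consumes the key.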

      The design document is attached. It explains the requirement, design and use cases.
      Kindly review and comment. Collaboration is very much welcome.

      I have a tested patch for this against 1.1 and will upload it soon as initial work for further refinement.

      Update: The patches are uploaded to subtasks.

      Attachments

        1. Hadoop_Encryption.pdf
          565 kB
          Benoy Antony
        2. Hadoop_Encryption.pdf
          566 kB
          Benoy Antony
        3. crypto_abstractions.zip
          12 kB
          Haifeng Chen

        Issue Links

        Activity

          People

            benoyantony Benoy Antony
            benoyantony Benoy Antony
            Votes: 1
            Watchers: 32

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: 504h
              Remaining: 504h
              Logged: Not Specified
