Hadoop Map/Reduce / MAPREDUCE-778

[Rumen] Need a standalone JobHistory log anonymizer

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0, 2.0.0-alpha
    • Fix Version/s: 0.23.1
    • Component/s: tools/rumen
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added an anonymizer tool to Rumen. Anonymizer takes a Rumen trace file and/or topology as input. It supports persistence and plugins to override the default behavior.
    • Tags:
      rumen anonymization

      Description

      Job history logs contain a rich set of information that can help understand and characterize cluster workload and individual job execution. Examples of work that parses or utilizes job history include HADOOP-3585, MAPREDUCE-534, HDFS-459, MAPREDUCE-728, and MAPREDUCE-776. Some of the parsing tools developed in previous work already contain a component to anonymize the logs. It would be nice to combine these efforts into a common standalone tool that anonymizes job history logs while preserving as much of the file structure as possible, so that existing tools built on top of job history logs continue to work without modification.
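
      To illustrate what "anonymize but preserve structure" could mean in practice, here is a minimal Python sketch. It is not the Rumen anonymizer and not the attached anonymizer.py; the KEY="value" line format, the field names in SENSITIVE, and the ConsistentAnonymizer class are all assumptions made for illustration. The idea is that each distinct sensitive value is replaced by a stable placeholder, so the same user or host maps to the same token everywhere, and the mapping can be saved and reloaded between runs.

```python
import json
import re


class ConsistentAnonymizer:
    """Hypothetical helper: maps each distinct value to a stable placeholder."""

    def __init__(self, state=None):
        # state has the shape {category: {original_value: placeholder}}
        self.state = state if state is not None else {}

    def anonymize(self, category, value):
        mapping = self.state.setdefault(category, {})
        if value not in mapping:
            # The placeholder is the category name plus a running counter.
            mapping[value] = f"{category}{len(mapping)}"
        return mapping[value]

    def dumps(self):
        # Serialize the mapping so a later run can reuse the same placeholders.
        return json.dumps(self.state, sort_keys=True)

    @classmethod
    def loads(cls, text):
        return cls(json.loads(text))


# Assumed line format: space-separated KEY="value" pairs.
FIELD = re.compile(r'(\w+)="([^"]*)"')

# Assumed set of sensitive keys and the placeholder category for each.
SENSITIVE = {"USER": "user", "JOBNAME": "job", "HOSTNAME": "host"}


def anonymize_line(line, anonymizer):
    """Replace sensitive values in one line; leave everything else untouched."""

    def repl(match):
        key, value = match.group(1), match.group(2)
        category = SENSITIVE.get(key)
        if category is None:
            return match.group(0)  # not sensitive: keep the field as-is
        return f'{key}="{anonymizer.anonymize(category, value)}"'

    return FIELD.sub(repl, line)
```

      Because only the quoted values of selected keys change, a parser that expects the original field layout should still accept the output. Saving `dumps()` after one run and passing it to `loads()` before the next keeps placeholders consistent across files.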

      1. anonymizer.py
        10 kB
        Guanying Wang
      2. ASF.LICENSE.NOT.GRANTED--anonymizer.patch
        20 kB
        Guanying Wang
      3. MAPREDUCE-778_branch0.23.patch
        220 kB
        Alejandro Abdelnur
      4. mapreduce-778-v1.14-12.patch
        220 kB
        Amar Kamat
      5. mapreduce-778-v1.14-14.patch
        220 kB
        Amar Kamat
      6. mapreduce-778-v1.2-2.patch
        91 kB
        Amar Kamat
      7. same.py
        2 kB
        Guanying Wang

          Activity

          Allen Wittenauer made changes -
          Affects Version/s 2.0.0-alpha [ 12320354 ]
          Affects Version/s 0.24.0 [ 12317654 ]
          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Vinod Kumar Vavilapalli made changes -
          Fix Version/s 0.24.0 [ 12317654 ]
          Vinod Kumar Vavilapalli made changes -
          Fix Version/s 0.23.1 [ 12318883 ]
          Affects Version/s 0.23.0 [ 12315570 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue is related to HADOOP-7470 [ HADOOP-7470 ]
          Alejandro Abdelnur made changes -
          Attachment MAPREDUCE-778_branch0.23.patch [ 12511305 ]
          Eli Collins made changes -
          Target Version/s 0.24.0 [ 12317654 ]
          Amar Kamat made changes -
          Link This issue relates to MAPREDUCE-3580 [ MAPREDUCE-3580 ]
          Amar Kamat made changes -
          Link This issue relates to MAPREDUCE-3581 [ MAPREDUCE-3581 ]
          Amar Kamat made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Release Note Added an anonymizer tool to Rumen. Anonymizer takes a Rumen trace file and/or topology as input. It supports persistence and plugins to override the default behavior.
          Target Version/s 0.24.0 [ 12317654 ]
          Resolution Fixed [ 1 ]
          Amar Kamat made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Amar Kamat made changes -
          Attachment mapreduce-778-v1.14-14.patch [ 12507529 ]
          Amar Kamat made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Affects Version/s 0.24.0 [ 12317654 ]
          Fix Version/s 0.24.0 [ 12317654 ]
          Amar Kamat made changes -
          Attachment mapreduce-778-v1.14-12.patch [ 12507028 ]
          Amar Kamat made changes -
          Attachment mapreduce-778-v1.2-2.patch [ 12486169 ]
          Amar Kamat made changes -
          Assignee Amar Kamat [ amar_kamat ]
          Amar Kamat made changes -
          Labels anonymization derby_triage10_5_2 rumen anonymization rumen
          Amar Kamat made changes -
          Summary Need a standalone JobHistory log anonymizer [Rumen] Need a standalone JobHistory log anonymizer
          Labels anonymization derby_triage10_5_2 rumen
          Tags rumen anonymization
          Component/s tools/rumen [ 12313617 ]
          Guanying Wang made changes -
          Attachment anonymizer.patch [ 12441863 ]
          Guanying Wang made changes -
          Field Original Value New Value
          Attachment anonymizer.py [ 12440436 ]
          Attachment same.py [ 12440437 ]
          Hong Tang created issue -

            People

            • Assignee:
              Amar Kamat
              Reporter:
              Hong Tang
            • Votes:
              0
              Watchers:
              7
