Hadoop HDFS / HDFS-4680

Audit logging of delegation tokens for MR tracing

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha
    • Fix Version/s: 2.1.1-beta
    • Component/s: namenode, security
    • Labels: None

      Description

      HDFS audit logging tracks the HDFS operations performed by each user, e.g. creation and deletion of files. This is useful for after-the-fact root cause analysis and for security. However, logging only the username is insufficient for many use cases. For instance, it is common for a single user to run multiple MapReduce jobs (I believe this is the case with Hive). In this scenario, it is difficult to trace a particular audit log entry back to the MR job or task that generated it.

      I see a number of potential options for implementing this.

      1. Make an optional "client name" field part of the NN RPC format. We already pass a clientName as a parameter in many RPC calls, so this would essentially make it standardized. MR tasks could then set this field to the job and task ID.
      2. This could be generalized to a set of optional key-value tags in the NN RPC format, which would then be audit logged. This has standalone benefits beyond just identifying MR task IDs.
      3. Neither of the above two options securely verifies that MR clients are who they claim to be. Doing this securely requires the JobTracker to sign MR task attempts and the NN to verify this signature. However, this is substantially more work, and could be built on top of option 2.
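
      To make option 2 concrete, here is a minimal sketch of how optional key-value tags could be appended to an audit log line. It is purely illustrative and is not taken from any of the attached patches: the class name `AuditTagSketch`, the method names, the tag keys (`jobId`, `taskId`), and the tab-separated `key=value` layout are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not part of HDFS and not the patch attached to this issue.
class AuditTagSketch {

    // Builds a tab-separated key=value audit line. The fixed fields come first,
    // followed by any client-supplied tags in insertion order.
    static String formatAuditLine(String user, String cmd, String src,
                                  Map<String, String> tags) {
        StringBuilder sb = new StringBuilder();
        sb.append("ugi=").append(user)
          .append("\tcmd=").append(cmd)
          .append("\tsrc=").append(src);
        for (Map.Entry<String, String> e : tags.entrySet()) {
            sb.append('\t')
              .append(sanitize(e.getKey()))
              .append('=')
              .append(sanitize(e.getValue()));
        }
        return sb.toString();
    }

    // Tags are untrusted client input, so separators that could forge
    // extra fields (tab, CR, LF, '=') are replaced with '_'.
    static String sanitize(String s) {
        return s.replaceAll("[\\t\\r\\n=]", "_");
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("jobId", "job_0001");            // assumed tag key and value
        tags.put("taskId", "task_0001_m_000003"); // assumed tag key and value
        System.out.println(
            formatAuditLine("alice", "create", "/tmp/out/part-00000", tags));
    }
}
```

      Because the tags are supplied by the client, this only helps with tracing; as option 3 notes, it does not prove that the client is the task it claims to be.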

      Thoughts welcomed.
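
      For option 3, the sign-and-verify step could look roughly like the sketch below, which uses an HMAC over the task attempt ID. This is an assumption-laden illustration rather than a proposed design: the class `TaskAttemptSigner`, the choice of HmacSHA256, and the idea of a secret shared between the JobTracker and the NN are all hypothetical, and key distribution is not addressed.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: the signer would run in the JobTracker,
// the verifier in the NameNode, both holding the same secret.
class TaskAttemptSigner {
    private static final String ALGO = "HmacSHA256"; // assumed algorithm
    private final SecretKeySpec key;

    TaskAttemptSigner(byte[] sharedSecret) {
        this.key = new SecretKeySpec(sharedSecret, ALGO);
    }

    // Computes the HMAC of the task attempt ID under the shared secret.
    byte[] sign(String taskAttemptId) {
        try {
            Mac mac = Mac.getInstance(ALGO);
            mac.init(key);
            return mac.doFinal(taskAttemptId.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    // Recomputes the HMAC and compares it with the presented signature.
    boolean verify(String taskAttemptId, byte[] signature) {
        return MessageDigest.isEqual(sign(taskAttemptId), signature);
    }

    public static void main(String[] args) {
        TaskAttemptSigner signer =
            new TaskAttemptSigner("demo-secret".getBytes(StandardCharsets.UTF_8));
        byte[] sig = signer.sign("attempt_0001_m_000003_0"); // assumed ID format
        System.out.println(signer.verify("attempt_0001_m_000003_0", sig));
        System.out.println(signer.verify("attempt_0001_m_000004_0", sig));
    }
}
```

      With a verified signature, the NN could audit log the task attempt ID knowing that it was issued by the JobTracker rather than merely asserted by the client.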

      1. hdfs-4680-1.patch
        5 kB
        Andrew Wang
      2. hdfs-4680-2.patch
        12 kB
        Andrew Wang
      3. hdfs-4680-3.patch
        11 kB
        Andrew Wang
      4. hdfs-4680-4.patch
        14 kB
        Andrew Wang
      5. hdfs-4680-5.patch
        17 kB
        Andrew Wang

          Activity

          Andrew Wang created issue -
          Sandy Ryza made changes -
            (format: Field: Original Value -> New Value)
            Link: (none) -> This issue is related to HADOOP-9775 [ HADOOP-9775 ]
          Andrew Wang made changes -
            Attachment: (none) -> hdfs-4680-1.patch [ 12595129 ]
          Andrew Wang made changes -
            Status: Open [ 1 ] -> Patch Available [ 10002 ]
            Target Version/s: 3.0.0 [ 12320356 ] -> 3.0.0, 2.3.0 [ 12320356, 12324588 ]
          Andrew Wang made changes -
            Summary: Audit logging of client names -> Audit logging of delegation tokens for MR tracing
          Andrew Wang made changes -
            Link: (none) -> This issue relates to MAPREDUCE-5379 [ MAPREDUCE-5379 ]
          Andrew Wang made changes -
            Attachment: (none) -> hdfs-4680-2.patch [ 12596376 ]
          Andrew Wang made changes -
            Attachment: (none) -> hdfs-4680-3.patch [ 12596951 ]
          Andrew Wang made changes -
            Attachment: (none) -> hdfs-4680-4.patch [ 12601099 ]
          Andrew Wang made changes -
            Attachment: (none) -> hdfs-4680-5.patch [ 12602089 ]
          Andrew Wang made changes -
            Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
            Fix Version/s: (none) -> 2.3.0 [ 12324588 ]
            Resolution: (none) -> Fixed [ 1 ]
          Andrew Wang made changes -
            Fix Version/s: (none) -> 2.1.1-beta [ 12324809 ]
            Fix Version/s: 2.3.0 [ 12324588 ] -> (none)
            Target Version/s: 3.0.0, 2.3.0 [ 12320356, 12324588 ] -> 3.0.0, 2.3.0, 2.1.1-beta [ 12320356, 12324588, 12324809 ]
          Arun C Murthy made changes -
            Status: Resolved [ 5 ] -> Closed [ 6 ]

            People

            • Assignee: Andrew Wang
            • Reporter: Andrew Wang
            • Votes: 0
            • Watchers: 17
