MAPREDUCE-5706 (Hadoop Map/Reduce)

toBeDeleted parent directories aren't being cleaned up

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.1
    • Component/s: security
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When security is enabled on 0.22, MRAsyncDiskService doesn't always delete the parent directories under toBeDeleted.

      MRAsyncDiskService goes through toBeDeleted and creates "tasks" to delete the directories under there using the LinuxTaskController. It chooses which user to run as by looking at who owns that directory.
      For example:

      ls -al /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0
      total 12
      drwxr-xr-x 3 mapred mapred 4096 Jul  5 05:37 .
      drwxr-xr-x 5 mapred mapred 4096 Dec 19 10:15 ..
      drwxr-s--- 4 test   mapred 4096 Jul  2 02:54 test
      

      It would create a task that uses the "test" user to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0/test (there could be more directories in there for other users). It then creates a task that uses the "mapred" user to delete /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0.
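The owner-based planning described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the MRAsyncDiskService code: the class DeletionPlanSketch, its Task type, and the plan method are hypothetical names, and it uses java.nio.file (Java 7+) to look up owners.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from Hadoop.
class DeletionPlanSketch {

    /** One planned deletion: which user should remove which path. */
    static final class Task {
        final String user;
        final Path path;

        Task(String user, Path path) {
            this.user = user;
            this.path = path;
        }

        @Override
        public String toString() {
            return user + " -> " + path;
        }
    }

    /**
     * Plans one task per child directory (to run as that child's owner),
     * followed by one task for the timestamped parent itself (to run as
     * the parent's owner, "mapred" in the example above).
     */
    static List<Task> plan(Path timestampedDir) throws IOException {
        List<Task> tasks = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(timestampedDir)) {
            for (Path child : children) {
                tasks.add(new Task(Files.getOwner(child).getName(), child));
            }
        }
        // The parent goes last so that it is empty by the time it is removed.
        tasks.add(new Task(Files.getOwner(timestampedDir).getName(), timestampedDir));
        return tasks;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a timestamped directory under toBeDeleted.
        Path dir = Files.createTempDirectory("toBeDeleted-sketch");
        Files.createDirectory(dir.resolve("test"));
        for (Task task : plan(dir)) {
            System.out.println(task);
        }
    }
}
```

In this sketch every task would go through the same impersonation path, including the last one for the parent directory, which is the step that goes wrong in the scenario described here.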

      The problem is that we normally configure "mapred" as a user that the LinuxTaskController is not allowed to run as, in /etc/hadoop/conf.cloudera.mapreduce1/taskcontroller.cfg, so the task that runs as "mapred" fails and the timestamped directory is left behind. The permissions on the toBeDeleted dir are drwxr-xr-x mapred:mapred, which means that only "mapred" can delete things in it (i.e. the timestamped dirs). However, MRAsyncDiskService is already running as the mapred user, so there's no reason to use the LinuxTaskController for impersonation anyway; we can delete the directory directly from the Java code.
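As a rough illustration of deleting "directly from the Java code", the sketch below removes a directory tree in-process, with no impersonation, as whatever user the JVM runs as. It is an assumption-laden example rather than the code in the patch: DirectDeleteSketch and deleteRecursively are hypothetical names, and it relies on java.nio.file (Java 7+), which the 0.22 code base may not have used.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical illustration; not part of Hadoop.
class DirectDeleteSketch {

    /** Deletes root and everything under it as the current process user. */
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // listing the directory failed
                }
                // Children were visited first, so the directory is empty now.
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an emptied timestamped directory owned by this user.
        Path dir = Files.createTempDirectory("toBeDeleted-sketch");
        Files.createFile(Files.createDirectory(dir.resolve("leftover")).resolve("file"));
        deleteRecursively(dir);
        System.out.println("still exists: " + Files.exists(dir));
    }
}
```

This only works for paths the current user is permitted to remove, which matches the case above: the timestamped directories sit in a directory owned by "mapred", and the service already runs as "mapred".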

      Another issue is that MRAsyncDiskService#deletePathsInSecureCluster expects an absolute file path (e.g. /mapred/local/toBeDeleted/2013-07-05_05-37-49.052_0), but MRAsyncDiskService#moveAndDeleteRelativePath passes in a relative path (e.g. toBeDeleted/2013-07-05_05-37-49.052_0).
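The relative/absolute mismatch can be illustrated with a small sketch that resolves a relative path against a volume root before handing it to code that expects an absolute path. The names AbsolutePathSketch, toAbsolute, and requireAbsolute are hypothetical, and the volume root /mapred/local is simply taken from the example listing above; the real methods' signatures are not shown in this issue.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; not the MRAsyncDiskService API.
class AbsolutePathSketch {

    /** Resolves pathName against volumeRoot unless it is already absolute. */
    static Path toAbsolute(Path volumeRoot, String pathName) {
        Path path = Paths.get(pathName);
        return path.isAbsolute()
                ? path.normalize()
                : volumeRoot.resolve(path).normalize();
    }

    /** Guard a callee could use to reject relative paths early. */
    static Path requireAbsolute(Path path) {
        if (!path.isAbsolute()) {
            throw new IllegalArgumentException("expected an absolute path: " + path);
        }
        return path;
    }

    public static void main(String[] args) {
        Path volumeRoot = Paths.get("/mapred/local");
        String relative = "toBeDeleted/2013-07-05_05-37-49.052_0";
        System.out.println(requireAbsolute(toAbsolute(volumeRoot, relative)));
    }
}
```

Failing fast on a relative path, as requireAbsolute does here, is one way to make this kind of mismatch visible instead of silently skipping the deletion.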

        Activity

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-22-branch #116 (See https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/116/)
        MAPREDUCE-5706. toBeDeleted parent directories aren't being cleaned up. (Robert Kanter via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593894)

        • /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
        • /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/util/MRAsyncDiskService.java
        kkambatl Karthik Kambatla (Inactive) added a comment -

        Thanks Robert. Just committed this to branch-0.22.

        kkambatl Karthik Kambatla (Inactive) added a comment -

        +1

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12621395/MAPREDUCE-5706.patch
        against trunk revision .

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4480//console

        This message is automatically generated.

        rkanter Robert Kanter added a comment -

        The patch fixes the two issues:

        1. The parent directory is now deleted using a "plain" PathDeletionContext which is backed by simple Java code rather than the TaskController.DeletionContext which is backed by the LinuxTaskController and does the impersonation.
        2. deletePathsInSecureCluster is now always called with an absolute path. I also renamed the argument to make that more clear.

        I don't have any unit tests, but I verified this in a cluster.


          People

          • Assignee: rkanter Robert Kanter
          • Reporter: rkanter Robert Kanter
          • Votes: 0
          • Watchers: 6
