Hadoop Common
HADOOP-6304

Use java.io.File.set{Readable|Writable|Executable} where possible in RawLocalFileSystem

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.20.1
    • Fix Version/s: None
    • Component/s: fs
    • Labels:
      None
    • Release Note:
      Dramatic reduction in the number of system fork() calls when permissions are set.

      Description

      Using java.io.File.set{Readable|Writable|Executable} where possible in RawLocalFileSystem, when the group and other permission bits are the same, saves a lot of fork() system calls.
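The pattern the description refers to can be sketched as below. This is a hypothetical illustration, not the actual HADOOP-6304 patch: `java.io.File` setters can only distinguish "owner" from "everybody", so a mode such as 0644 or 0755 is expressible only because the group and other bits match. Note the clear-then-grant ordering, which is the non-atomicity criticized later in the thread.

```java
import java.io.File;
import java.io.IOException;

public class FilePermSketch {
    /**
     * Apply a mode whose group and other bits are identical (e.g. 0644, 0755)
     * using only java.io.File setters -- no fork/exec of an external chmod.
     * Clears all bits first, then grants world bits, then owner-only bits.
     */
    static boolean setPermission(File f, int ownerBits, int groupOtherBits) {
        boolean ok = true;
        // Clear everything first (this is the transient "000" window).
        ok &= f.setReadable(false, false);
        ok &= f.setWritable(false, false);
        ok &= f.setExecutable(false, false);
        // Grant bits for owner, group, and other (ownerOnly = false).
        if ((groupOtherBits & 4) != 0) ok &= f.setReadable(true, false);
        if ((groupOtherBits & 2) != 0) ok &= f.setWritable(true, false);
        if ((groupOtherBits & 1) != 0) ok &= f.setExecutable(true, false);
        // Grant the remaining owner-only bits (ownerOnly = true).
        if ((ownerBits & 4) != 0) ok &= f.setReadable(true, true);
        if ((ownerBits & 2) != 0) ok &= f.setWritable(true, true);
        if ((ownerBits & 1) != 0) ok &= f.setExecutable(true, true);
        return ok;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("perm-sketch", ".tmp");
        f.deleteOnExit();
        boolean ok = setPermission(f, 6, 4); // 0644: owner rw-, group/other r--
        System.out.println(ok + " " + f.canRead() + " " + f.canWrite());
    }
}
```

Each setter maps to a single chmod(2) call instead of forking a `chmod` process, which is where the release note's "dramatic reduction in fork() calls" comes from.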

        Issue Links

          Activity

          Arun C Murthy added a comment -

          Emergency bug-fix to yahoo hadoop20 distribution - I'll upload one for trunk shortly.

          Tom White added a comment -

          Arun - it would be good to get this into trunk for the 0.22 release.

          Todd Lipcon added a comment -

          I am -1 on this patch. As seen in MAPREDUCE-2238, this pattern is dangerous as it temporarily drops directories into u-x or u-r territory which can screw up other processes working inside. I think HADOOP-7110 is the right solution for performance, and forking as we're doing now is fine for people who don't care about performance.
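For contrast, a single-syscall chmod avoids the transient-permissions hazard entirely. HADOOP-7110 proposes doing this via JNI; the sketch below instead uses the later NIO.2 API (Java 7+) purely to illustrate the atomic behavior on a POSIX filesystem — it is not the HADOOP-7110 implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class AtomicChmodSketch {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("atomic-chmod", ".tmp");
        // One chmod(2) call: no fork, and no intermediate 000 state --
        // and it works even when group and other bits differ (e.g. 0750).
        Files.setPosixFilePermissions(p, PosixFilePermissions.fromString("rwxr-x---"));
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(p);
        System.out.println(perms.contains(PosixFilePermission.GROUP_EXECUTE)
                + " " + perms.contains(PosixFilePermission.OTHERS_READ));
        Files.delete(p);
    }
}
```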

          Konstantin Shvachko added a comment -

          I disagree with your assessment of the dangers seen in MAPREDUCE-2238. See my comment in HADOOP-7110.
          I also disagree with your aggressive -1-ing without proper discussions and proposing alternatives within the same context.

          Todd Lipcon added a comment -

          I also disagree with your aggressive -1-ing without proper discussions and proposing alternatives within the same context.

          Apologies, I certainly didn't mean to come off as aggressive. To me, "-1" means "this shouldn't be committed as is until changes are made" with the same force as +1 means "commit away as is". I suppose it has become a loaded term in this community.

          The alternative proposed is HADOOP-7110.

          Arun C Murthy added a comment -

          Todd, this was a stop-gap. Anyway, a foggy day, could you please explain to me the link between using java.io.File.set{Readable|Writable|Executable} and MAPREDUCE-2238. Thanks.

          I agree that HADOOP-7110 will obviate a need for this. But, please be careful with JNI...

          Todd Lipcon added a comment -

          Hey Arun. MAPREDUCE-2238 was caused by the use of these same APIs in Localizer.PermissionsHandler.setPermissions. What was happening was something like the following:

          • Thread A: setPermissions("/path/to/userlogs", 755)
          • Thread B: setPermissions("/path/to/userlogs/attempt_x", 755)

          The way they got interleaved was something like:

          • B: set userlogs/attempt_x to 000
          • A: set userlogs/ to 000
          • B: try to restore permissions on attempt_x, but fail since it can't traverse the path
          • A: set userlogs/ back to 755
          • The attempt_x directory is left at 000 or some other "incomplete" permissions where any following Hudson runs won't be able to delete it.

          This same problem can happen in a real cluster too.

          So I think the pattern used in this patch (same as the one in Localizer.PermissionsHandler.setPermissions) is dangerous because it's not atomic. This was just one manifestation of the issue.

          I agree JNI is difficult but this is a very simple use of it. No buffer management, no complicated errors to handle, no strange system APIs. Let's have the JNI discussion over in HADOOP-7110 though.
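The interleaving described above can be replayed deterministically — no real threads are needed, the steps are simply forced into the bad order. A sketch, assuming a non-root POSIX environment; the directory names are illustrative, not Hadoop's actual paths:

```java
import java.io.File;

public class PermRaceReplay {
    // Start of the clear-then-set sequence: drops the path to mode 000.
    static void clearAll(File f) {
        f.setReadable(false, false);
        f.setWritable(false, false);
        f.setExecutable(false, false);
    }

    public static void main(String[] args) {
        File parent = new File(System.getProperty("java.io.tmpdir"), "userlogs-demo");
        File child = new File(parent, "attempt_x");
        child.mkdirs();

        clearAll(child);   // B: attempt_x goes to 000
        clearAll(parent);  // A: userlogs/ goes to 000 -- nobody can traverse it now

        // B: tries to restore attempt_x, but chmod must traverse the parent,
        // which currently has no execute bit even for the owner.
        boolean restored = child.setReadable(true, false);

        // A: restores userlogs/; attempt_x is left with incomplete permissions.
        parent.setExecutable(true, true);
        parent.setReadable(true, true);
        parent.setWritable(true, true);

        System.out.println("restored=" + restored + " childReadable=" + child.canRead());

        // Clean up so the demo is re-runnable.
        child.setExecutable(true, true);
        child.setReadable(true, true);
        child.setWritable(true, true);
        child.delete();
        parent.delete();
    }
}
```

When run as a normal user, B's restore attempt fails (EACCES on path traversal) and the child directory stays locked down after A finishes, matching the Hudson failure mode described above.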

          Arun C Murthy added a comment -

          Ah, I was looking at each set* and thinking they are fine since each is a JNI call individually... thanks for the explanation.

          +1 for marking this invalid in favour of HADOOP-7110.

          Arun C Murthy added a comment -

          To be fixed via HADOOP-7110.


            People

            • Assignee:
              Arun C Murthy
              Reporter:
              Arun C Murthy
            • Votes:
              0
              Watchers:
              10
