Details

      Description

      The current implementation persists and ACL bit in FSImage and editlogs. Moreover, the security decisions also depend on whether the bit is set.

      The problem here is that we have to maintain the implicit invariant, which is the ACL bit is set if and only if the the inode has AclFeature. The invariant has to be maintained everywhere otherwise it can lead to a security vulnerability. In the worst case, an attacker can toggle the bit and bypass the ACL checks.

      The jira proposes to treat the ACL bit as a transient bit. The bit should not be persisted onto the disk, neither it should affect any security decisions.

      1. HDFS-5923.004.patch
        61 kB
        Chris Nauroth
      2. HDFS-5923.003.patch
        33 kB
        Haohui Mai
      3. HDFS-5923.002.patch
        58 kB
        Chris Nauroth
      4. HDFS-5923.001.patch
        30 kB
        Haohui Mai
      5. HDFS-5923.000.patch
        29 kB
        Haohui Mai

        Issue Links

          Activity

          Hide
          Chris Nauroth added a comment -

          +1 for the patch. Thanks for addressing the feedback.

          In addition to the automated tests, I manually tested upgrading a NameNode with edits from a trunk build to a HDFS-4685 build. The latest patch loaded the existing OP_ADD and OP_MKDIR ops with no problem.

          I've committed this to the HDFS-4685 branch.

          Show
          Chris Nauroth added a comment - +1 for the patch. Thanks for addressing the feedback. In addition to the automated tests, I manually tested upgrading a NameNode with edits from a trunk build to a HDFS-4685 build. The latest patch loaded the existing OP_ADD and OP_MKDIR ops with no problem. I've committed this to the HDFS-4685 branch.
          Hide
          Chris Nauroth added a comment -

          Here is v4, merging back in the test changes. Reviewing now...

          Show
          Chris Nauroth added a comment - Here is v4, merging back in the test changes. Reviewing now...
          Hide
          Haohui Mai added a comment -

          Renew the v2 patch into v3 to avoid confusion

          Show
          Haohui Mai added a comment - Renew the v2 patch into v3 to avoid confusion
          Hide
          Haohui Mai added a comment -

          The v2 patch checks the layoutversion for the editlog so that the NN can consume the old edit logs.

          Show
          Haohui Mai added a comment - The v2 patch checks the layoutversion for the editlog so that the NN can consume the old edit logs.
          Hide
          Haohui Mai added a comment -

          The version has been bumped as a part of HDFS-5914:

              PROTOBUF_FORMAT(-52, "Use protobuf to serialize FSImage"),
              EXTENDED_ACL(-53, "Extended ACL"),
              RESERVED_REL2_4_0(-54, -51, "Reserved for release 2.4.0", true,
                  PROTOBUF_FORMAT, EXTENDED_ACL);
          
          Show
          Haohui Mai added a comment - The version has been bumped as a part of HDFS-5914 : PROTOBUF_FORMAT(-52, "Use protobuf to serialize FSImage" ), EXTENDED_ACL(-53, "Extended ACL" ), RESERVED_REL2_4_0(-54, -51, "Reserved for release 2.4.0" , true , PROTOBUF_FORMAT, EXTENDED_ACL);
          Hide
          Chris Nauroth added a comment -

          This patch looks mostly good. Thanks, Haohui.

          I see the edit log serialization format now adds a flag at the end of OP_ADD and OP_MKDIR to indicate the presence of ACLs, since there is no longer an ACL bit to act as the flag. Then, in HDFS-5933 this flag gets converted to a count of the ACL entries as part of the optimization work. When loading an old edit log (pre-ACLs), the flag/count will not be present for existing instances of OP_ADD and OP_MKDIR, so I expect we'd fail to read. Can we change this logic to consider layout version?

          Show
          Chris Nauroth added a comment - This patch looks mostly good. Thanks, Haohui. I see the edit log serialization format now adds a flag at the end of OP_ADD and OP_MKDIR to indicate the presence of ACLs, since there is no longer an ACL bit to act as the flag. Then, in HDFS-5933 this flag gets converted to a count of the ACL entries as part of the optimization work. When loading an old edit log (pre-ACLs), the flag/count will not be present for existing instances of OP_ADD and OP_MKDIR , so I expect we'd fail to read. Can we change this logic to consider layout version?
          Hide
          Chris Nauroth added a comment -

          There are a lot more assertions on the permission bits throughout the tests. Instead of asking you to hunt for them, I just went ahead and made the test code changes on top of your v1 patch. I'm attaching that as v2. I also refactored a little to create a shared assertion method in AclTestHelpers, and I changed the logic of that assertion to check the short value directly. That way, our tests won't be sensitive to the implementation details of FsPermission#equals.

          I'll review the patch now.

          Show
          Chris Nauroth added a comment - There are a lot more assertions on the permission bits throughout the tests. Instead of asking you to hunt for them, I just went ahead and made the test code changes on top of your v1 patch. I'm attaching that as v2. I also refactored a little to create a shared assertion method in AclTestHelpers , and I changed the logic of that assertion to check the short value directly. That way, our tests won't be sensitive to the implementation details of FsPermission#equals . I'll review the patch now.
          Hide
          Haohui Mai added a comment -

          The v1 patch fixes an invalid value of FsPermission in FSAclBaseTest.

          Show
          Haohui Mai added a comment - The v1 patch fixes an invalid value of FsPermission in FSAclBaseTest .
          Hide
          Chris Nauroth added a comment -

          +1 for the new proposal too. If a future use case motivates it, we can safely re-introduce the ACL bit, and it would be backwards-compatible.

          Show
          Chris Nauroth added a comment - +1 for the new proposal too. If a future use case motivates it, we can safely re-introduce the ACL bit, and it would be backwards-compatible.
          Hide
          Haohui Mai added a comment -

          The v0 patch takes a more aggressive approach, which removes the ACL bit completely. The rationale is the following:

          1. Some applications might assume that FsPermission stay within the range of 0~0777. Changing FsPermission might lead to unexpected issues.
          2. There are not many users care about whether the file has ACL except for ls. Since ls is not in the critical path, ls can make a separate getAclStatus() call to calculate the ACL bit.
          Show
          Haohui Mai added a comment - The v0 patch takes a more aggressive approach, which removes the ACL bit completely. The rationale is the following: Some applications might assume that FsPermission stay within the range of 0~0777. Changing FsPermission might lead to unexpected issues. There are not many users care about whether the file has ACL except for ls. Since ls is not in the critical path, ls can make a separate getAclStatus() call to calculate the ACL bit.
          Hide
          Chris Nauroth added a comment -

          Hi Fengdong,

          We committed HDFS-5914 yesterday to the HDFS-4685 branch to take care of ACL serialization after the protobuf merge.

          +1 for the proposal. We have a choice between 2 possible code maintenance scenarios:

          1. Persist the ACL bit. Guarantee that all code paths accepting an FsPermission from the client don't trust it and don't allow it to change the persisted version. The benefit is that we don't need to do outbound translation to toggle on the ACL bit for APIs like getFileStatus. The drawback is that we need to remember to do inbound translation to maintain the persisted value of the ACL bit for APIs like setPermission.
          2. Do not persist the ACL bit. Guarantee that all code paths returning an FsPermission to the client toggle on the ACL bit if the inode has an AclFeature. The benefit is that we don't need to do inbound translation. The drawback is that we need to remember to do outbound translation.

          The HDFS-4685 branch currently implements #1, but I agree that #2 is superior, because it reduces risk. Bugs in strategy #1 could result in toggling the ACL bit on or off incorrectly, which impacts permission enforcement. Bugs in strategy #2 would only return incorrect results to a client, but would not compromise permission enforcement.

          Thanks for proposing the change, Haohui.

          Show
          Chris Nauroth added a comment - Hi Fengdong, We committed HDFS-5914 yesterday to the HDFS-4685 branch to take care of ACL serialization after the protobuf merge. +1 for the proposal. We have a choice between 2 possible code maintenance scenarios: Persist the ACL bit. Guarantee that all code paths accepting an FsPermission from the client don't trust it and don't allow it to change the persisted version. The benefit is that we don't need to do outbound translation to toggle on the ACL bit for APIs like getFileStatus . The drawback is that we need to remember to do inbound translation to maintain the persisted value of the ACL bit for APIs like setPermission . Do not persist the ACL bit. Guarantee that all code paths returning an FsPermission to the client toggle on the ACL bit if the inode has an AclFeature . The benefit is that we don't need to do inbound translation. The drawback is that we need to remember to do outbound translation. The HDFS-4685 branch currently implements #1, but I agree that #2 is superior, because it reduces risk. Bugs in strategy #1 could result in toggling the ACL bit on or off incorrectly, which impacts permission enforcement. Bugs in strategy #2 would only return incorrect results to a client, but would not compromise permission enforcement. Thanks for proposing the change, Haohui.
          Hide
          Fengdong Yu added a comment -

          Thanks Zhao Jing.

          Another question, HDFS-5968 has serialized FsImage using Protobuf, then does that also serialized ACL states? I don't think we've done it. because HDFS-4685 not merged to trunk yet.

          Show
          Fengdong Yu added a comment - Thanks Zhao Jing. Another question, HDFS-5968 has serialized FsImage using Protobuf, then does that also serialized ACL states? I don't think we've done it. because HDFS-4685 not merged to trunk yet.
          Hide
          Jing Zhao added a comment -

          Hi Fengdong, here Haohui refers to the ACL bit, not the whole ACL state. The ACL information will still be persisted in editlog and fsimage.

          Show
          Jing Zhao added a comment - Hi Fengdong, here Haohui refers to the ACL bit, not the whole ACL state. The ACL information will still be persisted in editlog and fsimage.
          Hide
          Fengdong Yu added a comment -

          Does that all Acl setting disappeaered after NN restart if we don't persisit ACL state in the fsImage?

          Show
          Fengdong Yu added a comment - Does that all Acl setting disappeaered after NN restart if we don't persisit ACL state in the fsImage?

            People

            • Assignee:
              Haohui Mai
              Reporter:
              Haohui Mai
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development