Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.2
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels:
      None

      Description

      Seen in a code review. Notable that if true, this got by all the FS contract tests —showing we missed a couple.

      S3AFilesystem.create() does not examine its parent paths to verify that there does not exist one which is a file. It looks for the destination path if overwrite=false (see HADOOP-13188 for issues there), but it doesn't check the parent for not being a file, or the parent of that path.

      It must go up the tree, verifying that either a path does not exist, or that the path is a directory. The scan can stop at the first entry which is is a directory, thus the operation is O(empty-directories) and not O(directories).

        Issue Links

          Activity

          Hide
          stevel@apache.org Steve Loughran added a comment -

          HADOOP-12667 adds a createNonRecursive operation (and test), which fails if the parent isn' there. The S3a create code could support this option too; albeit with a check for the parent existing as a directory, not a file. the current create() code would have to b factored out into something which both create() and createNonRecursive could call, the startup check being slightly different.

          Notable that the createNonRecursive(), because it explicitly only checks the parent dir, is guaranteed to be O(1); with 1..3 HTTP requests made of S3 to guarantee that the parent dir exists and is a directory

          Show
          stevel@apache.org Steve Loughran added a comment - HADOOP-12667 adds a createNonRecursive operation (and test), which fails if the parent isn' there. The S3a create code could support this option too; albeit with a check for the parent existing as a directory, not a file. the current create() code would have to b factored out into something which both create() and createNonRecursive could call, the startup check being slightly different. Notable that the createNonRecursive() , because it explicitly only checks the parent dir, is guaranteed to be O(1); with 1..3 HTTP requests made of S3 to guarantee that the parent dir exists and is a directory
          Hide
          stevel@apache.org Steve Loughran added a comment -

          This has the potential to be very expensive. it could perhaps be done asynchronously, with the create() call starting a check which surfaces as a failure in a subsequent write() operation. Even there, given that s3a doesn't write files until close(), there's a race condition. A create() check may pass, but if a file is later created further up the directory tree, the final close() would still be created.

          Show
          stevel@apache.org Steve Loughran added a comment - This has the potential to be very expensive. it could perhaps be done asynchronously, with the create() call starting a check which surfaces as a failure in a subsequent write() operation. Even there, given that s3a doesn't write files until close(), there's a race condition. A create() check may pass, but if a file is later created further up the directory tree, the final close() would still be created.
          Hide
          stevel@apache.org Steve Loughran added a comment -

          Moved to S3a Phase III, marking as a dependent of HADOOP-13695. We could make this another scheduled future of a block write

          Show
          stevel@apache.org Steve Loughran added a comment - Moved to S3a Phase III, marking as a dependent of HADOOP-13695 . We could make this another scheduled future of a block write
          Hide
          mackrorysd Sean Mackrory added a comment -

          Attaching a test for this originally written as part of HADOOP-14457. I agree with all your feedback on that portion of the patch, but given that it's less urgent than the rest of the patch and impacts other filesystems, I'd suggest we just address it after HADOOP-13345 is merged. Perhaps we can make the S3Guard-specific part asynchronous at that point as well.

          Also, I did some testing of other FSs this morning:

          The following already pass the test:

          • TestAzureNativeContractCreate (WASB)
          • TestHDFSContractCreate
          • TestLocalFSContractCreate
          • TestRawlocalContractCreate

          The following tests fail:

          • TestAdlContractCreateLive: It fails with an ACL error. All other tests in the ContractCreate class pass, so I suspect this is actually the same bug, but manifesting in ACLs instead of storage metadata explicitly.
          • ITestS3NContractCreate, ITestS3AContractCreate (hence this JIRA getting opened in the first place).

          The following tests were skipped, and I'll need to do some setup to run them unless someone else can do so expediently

          • TestFTPContractCreate
          • TestSwiftContractCreate
          • TestAliyunOSSContractCreate
          Show
          mackrorysd Sean Mackrory added a comment - Attaching a test for this originally written as part of HADOOP-14457 . I agree with all your feedback on that portion of the patch, but given that it's less urgent than the rest of the patch and impacts other filesystems, I'd suggest we just address it after HADOOP-13345 is merged. Perhaps we can make the S3Guard-specific part asynchronous at that point as well. Also, I did some testing of other FSs this morning: The following already pass the test: TestAzureNativeContractCreate (WASB) TestHDFSContractCreate TestLocalFSContractCreate TestRawlocalContractCreate The following tests fail: TestAdlContractCreateLive: It fails with an ACL error. All other tests in the ContractCreate class pass, so I suspect this is actually the same bug, but manifesting in ACLs instead of storage metadata explicitly. ITestS3NContractCreate, ITestS3AContractCreate (hence this JIRA getting opened in the first place). The following tests were skipped, and I'll need to do some setup to run them unless someone else can do so expediently TestFTPContractCreate TestSwiftContractCreate TestAliyunOSSContractCreate
          Hide
          stevel@apache.org Steve Loughran added a comment -

          good catch of the Adl problem; shows how the same assumption "who would create a file under a file" reoccurs in independent codepaths. We all bring the same assumption about what a sensible FS would do, and code for it.

          For S3N, I propose a skip(). For S3A, well, that's the tough one, isn't it? I'd go with a direct parent check, but the cost of checking an entire path is potentially very expensive. And we have never, ever, seen this problem arise in the field..it's a corner case which production code doesn't attempt. I think that's partially because production code tends to call mkdirs() before creating files, and mkdirs does contain the check

          Show
          stevel@apache.org Steve Loughran added a comment - good catch of the Adl problem; shows how the same assumption "who would create a file under a file" reoccurs in independent codepaths. We all bring the same assumption about what a sensible FS would do, and code for it. For S3N, I propose a skip(). For S3A, well, that's the tough one, isn't it? I'd go with a direct parent check, but the cost of checking an entire path is potentially very expensive. And we have never, ever, seen this problem arise in the field..it's a corner case which production code doesn't attempt. I think that's partially because production code tends to call mkdirs() before creating files, and mkdirs does contain the check

            People

            • Assignee:
              mackrorysd Sean Mackrory
              Reporter:
              stevel@apache.org Steve Loughran
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:

                Development