Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels:
      None
    • Target Version/s:

      Description

      This issue proposes S3Guard, a new feature of S3A, to provide an option for a stronger consistency model than what is currently offered. The solution coordinates with a strongly consistent external store to resolve inconsistencies caused by the S3 eventual consistency model.

      1. HADOOP-13345.prototype1.patch
        76 kB
        Chris Nauroth
      2. s3c.001.patch
        61 kB
        Lei (Eddy) Xu
      3. S3C-ConsistentListingonS3-Design.pdf
        245 kB
        Lei (Eddy) Xu
      4. S3GuardImprovedConsistencyforS3A.pdf
        431 kB
        Chris Nauroth
      5. S3GuardImprovedConsistencyforS3AV2.pdf
        328 kB
        Chris Nauroth

        Issue Links

        1.
        Support running isolated unit tests separate from AWS integration tests. Sub-task Resolved Chris Nauroth
         
        2.
        Refactor S3AFileSystem to support introduction of separate metadata repository and tests. Sub-task Resolved Chris Nauroth
         
        3.
        S3Guard: Define MetadataStore interface. Sub-task Resolved Chris Nauroth
         
        4.
        S3Guard: Implement DynamoDBMetadataStore. Sub-task Resolved Mingliang Liu
         
        5.
        S3Guard: Implement access policy providing strong consistency with S3 as source of truth. Sub-task Closed Unassigned
         
        6.
        S3Guard: Implement access policy using metadata store as source of truth. Sub-task Closed Unassigned
         
        7.
        S3Guard: Implement access policy for intra-client consistency with in-memory metadata store. Sub-task Resolved Aaron Fabbri
         
        8.
        S3Guard: Instrument new functionality with Hadoop metrics. Sub-task Resolved Ai Deng
         
        9. S3Guard: Provide custom FileSystem Statistics. Sub-task Open Unassigned
         
        10.
        S3Guard: Write end user docs, change table autocreate default. Sub-task Resolved Aaron Fabbri
         
        11.
        S3Guard: create basic contract tests for MetadataStore implementations Sub-task Resolved Aaron Fabbri
         
        12.
        S3Guard: implement move() for LocalMetadataStore, add unit tests Sub-task Resolved Aaron Fabbri
         
        13.
        S3Guard: Allow execution of all S3A integration tests with S3Guard enabled. Sub-task Resolved Steve Loughran
         
        14. s3guard: implement time-based (TTL) expiry for LocalMetadataStore Sub-task Open Aaron Fabbri
         
        15.
        S3Guard: S3AFileSystem Integration with MetadataStore Sub-task Resolved Aaron Fabbri
         
        16.
        S3Guard: Provide command line tools to manipulate metadata store. Sub-task Resolved Lei (Eddy) Xu
         
        17.
        Change PathMetadata to hold S3AFileStatus instead of FileStatus. Sub-task Resolved Lei (Eddy) Xu
         
        18. LocalMetadataStore#put(DirListingMetadata) should also put file metadata into fileHash. Sub-task Open Aaron Fabbri
         
        19.
        S3Guard: better support for multi-bucket access Sub-task Resolved Aaron Fabbri
         
        20.
        S3Guard: add delete tracking Sub-task Resolved Aaron Fabbri
         
        21. S3Guard: implement retries Sub-task Open Unassigned
         
        22.
        s3guard: add inconsistency injection, integration tests Sub-task Resolved Aaron Fabbri
         
        23.
        s3guard to log choice of metadata store at debug Sub-task Resolved Mingliang Liu
         
        24. S3Guard, MetadataStore to support atomic create(path, overwrite=false) Sub-task Open Unassigned
         
        25.
        S3Guard: fix TestDynamoDBMetadataStore when fs.s3a.s3guard.ddb.table is set Sub-task Resolved Aaron Fabbri
         
        26.
        s3guard: ITestS3AFileOperationCost.testFakeDirectoryDeletion failure Sub-task Resolved Mingliang Liu
         
        27.
        dynamodb dependency -> compile Sub-task Resolved Mingliang Liu
         
        28.
        DynamoDBMetadataStore to handle DDB throttling failures through retry policy Sub-task Resolved Aaron Fabbri
         
        29.
        tune dynamodb client & tests Sub-task Resolved Steve Loughran
         
        30.
        s3guard: improve S3AFileStatus#isEmptyDirectory handling Sub-task Resolved Aaron Fabbri
         
        31.
        S3Guard: Existing tables may not be initialized correctly in DynamoDBMetadataStore Sub-task Resolved Mingliang Liu
         
        32.
        S3Guard: NPE when table is already populated in dynamodb and user specifies "fs.s3a.s3guard.ddb.table.create=false" Sub-task Closed Mingliang Liu
         
        33.
        S3AGuard: Use BatchWriteItem in DynamoDBMetadataStore#put() Sub-task Resolved Mingliang Liu
         
        34.
        S3Guard: S3AFileSystem::listLocatedStatus() to employ MetadataStore Sub-task Resolved Mingliang Liu
         
        35.
        S3Guard: DynamoDBMetadataStore#move() could be throwing exception due to BatchWriteItem limits Sub-task Resolved Mingliang Liu
         
        36. S3Guard: DynamoDB can go out of sync with S3AFileSystem::delete operation Sub-task Open Unassigned
         
        37.
        Mock bucket locations in MockS3ClientFactory Sub-task Resolved Mingliang Liu
         
        38.
        S3guard: replace dynamo.describe() call in init with more efficient query Sub-task Closed Mingliang Liu
         
        39.
        Initialize DynamoDBMetadataStore without associated S3AFileSystem Sub-task Resolved Mingliang Liu
         
        40.
        Add ability to start DDB local server in every test Sub-task Resolved Mingliang Liu
         
        41. S3Guard CLI: Add fsck check command Sub-task Open Aaron Fabbri
         
        42.
        s3guard: add a version marker to every table Sub-task Resolved Steve Loughran
         
        43.
        S3Guard CLI: Add documentation Sub-task Resolved Aaron Fabbri
         
        44.
        s3guard cli: make tests easier to run and address failure Sub-task Resolved Sean Mackrory
         
        45. initial s3guard preview Sub-task Open Unassigned
         
        46. s3guard metadata stores to support millions of children Sub-task Open Unassigned
         
        47. cli to list info about a bucket (S3guard or not) Sub-task Open Unassigned
         
        48. Handle dynamo exceptions in translateException Sub-task Open Unassigned
         
        49.
        S3Guard: fix multi-bucket integration tests Sub-task Resolved Aaron Fabbri
         
        50.
        Optimize dirListingUnion Sub-task Resolved Sean Mackrory
         
        51.
        S3Guard: DynamoDBMetadataStore logs nonsense region Sub-task Resolved Sean Mackrory
         
        52.
        Implicitly creating DynamoDB table ignores endpoint config Sub-task Resolved Sean Mackrory
         
        53.
        S3Guard: intermittent duplicate item keys failure Sub-task Resolved Mingliang Liu
         
        54.
        CLI command to prune old metadata Sub-task Resolved Sean Mackrory
         
        55.
        Metastore destruction test creates table without version marker Sub-task Resolved Sean Mackrory
         
        56.
        S3Guard: link docs from index, fix typos Sub-task Resolved Aaron Fabbri
         
        57. Add integration test version of TestMetadataStore for DynamoDB Sub-task Open Sean Mackrory
         
        58.
        Fix breaking link in s3guard.md Sub-task Resolved Mingliang Liu
         
        59.
        Drop unnecessary type assertion and cast Sub-task Resolved Sean Mackrory
         
        60.
        Allow users to specify region for DynamoDB table instead of endpoint Sub-task Resolved Sean Mackrory
         
        61.
        Rethink S3GuardTool options Sub-task Resolved Sean Mackrory
         
        62.
        s3guard: regression in dirListingUnion Sub-task Resolved Aaron Fabbri
         
        63. improvements to S3GuardTool destroy command Sub-task Open Unassigned
         
        64.
        ITestS3GuardListConsistency fails intermittently Sub-task Resolved Mingliang Liu
         
        65.
        In S3AFileSystem, make getAmazonClient() package private; export getBucketLocation() Sub-task Resolved Steve Loughran
         
        66.
        s3guard tool tests aren't isolated; can't run in parallel Sub-task Resolved Sean Mackrory
         
        67.
        ITestS3ACredentialsInURL sometimes fails Sub-task Resolved Sean Mackrory
         
        68.
        Simplify DynamoDBClientFactory for creating Amazon DynamoDB clients Sub-task Resolved Mingliang Liu
         
        69.
        s3guard: CLI diff non-empty after import on new table Sub-task Resolved Sean Mackrory
         
        70.
        Ensure GenericOptionParser is used for S3Guard CLI Sub-task Resolved Sean Mackrory
         
        71. Set isAuthoritative flag when creating DirListingMetadata in DynamoDBMetaStore Sub-task Open Unassigned
         
        72. Possible for modified configuration to leak into metadatastore in S3GuardTool Sub-task Open Unassigned
         
        73.
        Add S3Guard.dirListingUnion in S3AFileSystem#listFiles, listLocatedStatus Sub-task Closed Unassigned
         
        74.
        S3GuardTool tests should not run if S3Guard is not set up Sub-task Resolved Sean Mackrory
         
        75.
        S3Guard: import does not import empty directory Sub-task Resolved Sean Mackrory
         
        76.
        Add validation of DynamoDB region Sub-task Resolved Sean Mackrory
         
        77.
        DynamoDB client should waitForActive on existing tables Sub-task Resolved Sean Mackrory
         
        78. Add S3GuardTool diagnostics command Sub-task Open Unassigned
         
        79. Add s3guardtool dump command Sub-task Open Unassigned
         
        80.
        S3Guard: DynamoDBMetadataStore::move() should populate ancestor directories of destination paths Sub-task Resolved Mingliang Liu
         
        81.
        S3Guard: S3AFileSystem::rename() should move non-listed sub-directory entries in metadata store Sub-task Resolved Mingliang Liu
         
        82.
        S3Guard: ITestS3AConcurrentOps is not cleaning up test data Sub-task Resolved Mingliang Liu
         
        83.
        TestS3GuardTool hangs/fails when offline: it's an IT test Sub-task Resolved Mingliang Liu
         
        84.
        S3Guard: S3AFileSystem::listFiles() to employ MetadataStore Sub-task Resolved Mingliang Liu
         
        85.
        S3Guard: DynamoDBMetadata::prune() should self interrupt correctly Sub-task Resolved Mingliang Liu
         
        86.
        TestDynamoDBMetadataStore is broken unless we can fail faster without a table version Sub-task Resolved Sean Mackrory
         
        87.
        ITestS3GuardListConsistency failure w/ Local, authoritative metadata store Sub-task Resolved Aaron Fabbri
         
        88. Improve DynamoDB schema update story Sub-task Open Sean Mackrory
         
        89. S3Guard: S3GuardTool to support provisioning existing metadata store Sub-task Open Unassigned
         
        90. s3guard will set file length to -1 on a putObjectDirect(stream, -1) call Sub-task Patch Available Steve Loughran
         
        91. Add more s3guard metrics Sub-task Open Unassigned
         
        92.
        ITestS3GuardConcurrentOps.testConcurrentTableCreations fails without table name configured Sub-task Resolved Sean Mackrory
         
        93.
        Play nice with ITestS3AEncryptionSSEC Sub-task Resolved Sean Mackrory
         
        94. create() does not notify metadataStore of parent directories or ensure they're not existing files Sub-task Patch Available Sean Mackrory
         
        95. S3Guard: Improve FNFE message when opening a stream Sub-task Open Aaron Fabbri
         
        96. S3Guard: make short-circuit getFileStatus() configurable Sub-task Open Aaron Fabbri
         
        97.
        make InconsistentAmazonS3Client usable in downstream tests Sub-task Resolved Aaron Fabbri
         
        98.
        Ensure deleted parent directory tombstones are overwritten when implicitly recreated Sub-task Resolved Sean Mackrory
         
        99.
        DirListingMetadata precondition failure messages to include path at fault Sub-task Resolved Steve Loughran
         
        100.
        s3guard w/ failure injection: listStatus fails after renaming file into directory Sub-task Resolved Sean Mackrory
         
        101.
        ITestS3GuardConcurrentOps requires explicit DynamoDB table name to be configured Sub-task Resolved Sean Mackrory
         
        102. Findbugs warning in LocalMetadataStore Sub-task Patch Available Sean Mackrory
         
        103.
        ProvidedFileStatusIterator#next() may throw IndexOutOfBoundsException Sub-task Resolved Aaron Fabbri
         
        104. simplify mkdirs() after S3Guard delete tracking change Sub-task Open Unassigned
         
        105.
        InconsistentAmazonS3Client adds extra paths to listStatus() after delete. Sub-task Resolved Sean Mackrory
         
        106.
        ITestS3GuardListConsistency is too slow Sub-task Resolved Aaron Fabbri
         
        107. S3Guard: issues running parallel tests w/ S3N Sub-task Open Aaron Fabbri
         
        108. DynamoDB tables may leave ACTIVE state after initial connection Sub-task Open Unassigned
         
        109. ITestS3AInconsistency.testGetFileStatus failing Sub-task Open Unassigned
         
        110. Ensure controls in-place to prevent clients with significant clock skews pruning aggressively Sub-task Open Unassigned
         
        111.
        LocalDynamoDB missing from latest AWS SDK releases Sub-task Resolved Steve Loughran
         
        112. ITestS3ATemporaryCredentials to cover all ddb metastore ops with session credentials Sub-task Open Unassigned
         

          Activity

          cnauroth Chris Nauroth added a comment -

          I'm attaching a design document and a prototype patch that I've been testing. The design document describes the proposed solution in more detail. To summarize the prototype patch:

          • There is a new module: hadoop-tools/hadoop-s3guard. It was convenient to structure the prototype this way, isolated in a separate module, to avoid conflicts with the ongoing hadoop-aws work. I'm not certain this is the best permanent structure. It might make more sense to embed it all within hadoop-aws. There are pros and cons either way.
          • The hadoop-s3guard module defines a ConsistentS3AFileSystem class, which is a subclass of S3AFileSystem. This is the main entry point for this work. It overrides several methods to coordinate with the consistent external store.
          • ConsistentStore defines the interface expected from the strongly consistent store. There is currently one implementation using DynamoDB as the back-end: DynamoDBConsistentStore.
          • Even though it's a prototype, I wrote JavaDocs to explain what is happening at the code level. I think the JavaDocs serve as a good companion piece to the design doc.
          • For testing, I have used some Maven trickery to reuse all S3A test suites from hadoop-aws within hadoop-s3guard. src/test/resources/core-site.xml rewires the s3a: scheme so that all tests are executing against ConsistentS3AFileSystem. All of these tests are passing currently. This provides basic coverage of the hadoop-s3guard code paths and confirmation that the work hasn't yet introduced regressions.
          • Currently missing is any form of dedicated testing that specifically tries to trigger eventually consistent behavior and confirm that S3Guard handles it successfully. This means we don't yet have automated testing that really confirms S3Guard does what it needs to do. Since eventually consistent behavior is non-deterministic, I think we'll need to explore mock-based approaches that simulate eventual consistency by returning incomplete/outdated results on the first try, and then complete results on subsequent retries. We'll need to combine this with tests that really integrate with S3.
          • Also currently missing is any form of end user documentation.
          • I have marked various TODOs in the code to indicate important work items that remain to be done.
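A mock-based approach along the lines described above could look something like the following (a hypothetical sketch in Python for brevity; the class and method names are illustrative and not part of the patch): a fake listing client withholds newly written keys for the first few list calls, so a test can assert that retry logic eventually observes a complete listing.

```python
# Hypothetical sketch of injecting eventual-consistency behavior into tests.
# All names here are illustrative, not from the prototype patch.

class InconsistentListingClient:
    """Returns incomplete listings for the first `delay_calls` list requests
    after a key is written, simulating S3's eventually consistent listings."""

    def __init__(self, delay_calls=2):
        self.delay_calls = delay_calls
        self.keys = {}          # key -> number of list calls since the put

    def put(self, key):
        self.keys[key] = 0      # key is written but not yet "visible"

    def list_keys(self):
        visible = []
        for key, age in list(self.keys.items()):
            if age >= self.delay_calls:
                visible.append(key)
            self.keys[key] = age + 1
        return sorted(visible)


client = InconsistentListingClient(delay_calls=2)
client.put("dir/file1")
first = client.list_keys()    # stale: the new key is not yet visible
second = client.list_keys()   # still stale
third = client.list_keys()    # consistent: the key now appears
```

A consistency layer under test would be expected to keep retrying (with back-off) until the listing matches the metadata it recorded at write time.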
          cnauroth Chris Nauroth added a comment -

          Andrew Wang, John Zhuge and Lei (Eddy) Xu, please have a look. Feedback appreciated. I will be offline for a while, returning 7/17, so I won't be able to respond to comments until then.

          stevel@apache.org Steve Loughran added a comment -

          Obviously I'm supportive of this. I think we should be doing this as a feature branch though, because it'll take >1 patch for the code to be functional. The other s3a work has generally been incremental.

          liuml07 Mingliang Liu added a comment -

          Thanks, Chris Nauroth for the design doc and prototype patch. I like the proposal.

          Design doc:
          1. Besides the metrics2, do you plan to support statistics (probably as a subclass of StorageStatistics)?
          2. As there is no limit to the number of objects that can be stored in an S3 bucket, the S3 bucket may be very large. In this case, the consistency check requests may all go to a single DynamoDB table, and S3Guard may suffer from low provisioned capacity units (read and write). To avoid this, customers need to monitor and provision the table throughput. I suggest we consider this as the third "potential drawback" of using DynamoDB as a consistency store. See page 11 of the design doc. I think using the namenode should work just fine regarding the operation overhead.
          3. The default value of fs.s3a.s3guard.fail.on.error is false, but it should be true as indicated by the config key description. I believe this is an omission.
          4. As to the exponential back-off strategy for recheck, will jitter be helpful? I referred to https://www.awsarchitectureblog.com/2015/03/backoff.html.
          5. I think the design doc could also discuss the ConsistentStore methods that a consistent store should implement, plus the DynamoDB table schema/index design. I saw in the code there is a discussion of alternative schema ideas, which is helpful.
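The "full jitter" strategy from the linked AWS architecture blog post can be sketched as follows (an illustration in Python; the parameter names and defaults are assumptions, not from the patch):

```python
# Sketch of "full jitter" exponential back-off: sleep a random duration
# in [0, min(cap, base * 2^attempt)] before each retry.
import random

def backoff_with_jitter(attempt, base=0.1, cap=5.0):
    """Return a randomized sleep time (seconds) for the given retry attempt."""
    exp = min(cap, base * (2 ** attempt))
    return random.uniform(0, exp)   # full jitter: anywhere in [0, exp]

delays = [backoff_with_jitter(n) for n in range(5)]
```

The randomization spreads retries from many clients across time, which avoids the synchronized retry storms that plain exponential back-off can produce against a throttled DynamoDB table.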
          The patch:
          1. DescendantsIterator.java claims to implement a pre-order depth-first traversal (DFS) of a path and all of its descendants. The example given is actually a breadth-first traversal (BFS). I checked the code and think it does implement a BFS, which conforms with the example. I think this is an omission in the javadoc.
          2. In DynamoDBConsistentStore#initTable(), perhaps we can call dynamodb.getTable(tableName).waitForActiveOrDelete() instead of sleeping and polling manually.
          3. I think we can use the DynamoDB document API (refer to here). For example, if we use the @ThreadSafe Table class, we can avoid directly operating the AmazonDynamoDBClient object and setting the table name for each request.
          4. For DynamoDBConsistentStore#get(), do we need to set ConsistentRead to true for the getItem() request?
          5. In DynamoDBConsistentStore#listChildren(), we can use key condition expression instead of key conditions in pathToParentEq for the query request.
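For reference on point 1, the breadth-first order that DescendantsIterator appears to produce can be sketched like this (a simplified illustration over a plain dict tree, not the patch's Java code):

```python
# Breadth-first traversal of a path and its descendants, level by level.
from collections import deque

def descendants_bfs(tree, root):
    """Yield root and all descendants in breadth-first (level) order.
    `tree` maps each path to a list of its child paths."""
    queue = deque([root])
    while queue:
        path = queue.popleft()
        yield path
        queue.extend(tree.get(path, []))

tree = {
    "/": ["/a", "/b"],
    "/a": ["/a/x"],
}
order = list(descendants_bfs(tree, "/"))
```

Note that "/a/x" is visited after "/b": siblings before grandchildren, which is BFS. A pre-order DFS would instead yield "/", "/a", "/a/x", "/b".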
          Nits:
          1. I understand that fs.s3a.s3guard.store.table.name.prefix is not yet used in the patch.
          2. Could S3AFileSystem#awsConf be final?
          3. In DescendantsIterator#hasNext(), the statement !(stack.isEmpty() && !children.hasNext()) can be simplified to !stack.isEmpty() || children.hasNext(). It reads more clearly to me.
          4. I may need to read DescendantsIterator#move() and the related helper methods carefully, but it would be helpful to add some javadoc stating that in DynamoDB we cannot update key schema attributes; we need to delete the old item and put a new one for key changes.
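The simplification in nit 3 is De Morgan's law, !(A && !B) == (!A || B), which can be checked exhaustively (a quick illustration in Python):

```python
# Exhaustive truth-table check of the suggested boolean simplification:
# !(isEmpty && !hasNext)  is equivalent to  !isEmpty || hasNext
from itertools import product

def original(is_empty, has_next):
    return not (is_empty and not has_next)

def simplified(is_empty, has_next):
    return (not is_empty) or has_next

matches = all(
    original(e, n) == simplified(e, n)
    for e, n in product([False, True], repeat=2)
)
```

Both forms say: there is a next element unless the stack is empty and the current child iterator is exhausted.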

          Considering the dozen TODOs in the current patch, plus user docs and tests, I agree with Steve Loughran that the work can be done in a feature branch, with this JIRA as an umbrella for subtasks, so that people who are interested (like me) can pick up small tasks and contribute.

          fabbri Aaron Fabbri added a comment -

          Thanks for posting Chris Nauroth. Very cool stuff. I'm interested in helping.

          Lei (Eddy) Xu and I have also been working on a similar effort called S3 Consistency (S3C) and want to share it here so we can compare designs. I will attach a design doc and patch of what we did.

          Some interesting differences, from first glance:

          • You are a little further along. We only started this a month or so ago, part time. Our DynamoDB back-end is not finished. We wrote the design doc then started prototyping. Prototyping was valuable experience. Devil is in the details, as they say.
          • Our prototype (S3C) treats performance as a primary goal. There are some interesting tradeoffs. We demonstrated a significant performance improvement with our "fully cached directory" concept (see the design doc), but having a non-S3 source of truth makes failure handling trickier. Ultimately, I'd love the policy to be configurable.
          • Really interesting to compare your ConsistentStore with our MetadataStore interface. Look forward to discussions on this, and the "caching policy" as I call it (i.e. source of truth policy).
          • Our code separation at this point needs work. We intended to refactor; S3AFileSystem could use that in general. We wanted to keep MetadataStore separate from s3a so it wouldn't be too hard to pull it out of s3a and use it for other storage connectors. We debated doing a separate FileSystem wrapper (discussed in the design doc). We also thought about subclassing S3AFileSystem as you did (might be brittle to future change, but your separation is much better). At this point, though, the code served to make us aware of the details we need to conquer in the MetadataStore interface, as well as the feasibility of using it as a source of truth, adaptively (Policy C in the design doc).
          • Having a local implementation of the MetadataStore has been fun. The TestLocalMetadataStore unit tests would probably become a general contract test of the interface. The intention is local unit testing, but it happens to speed up something like hadoop fs -copyFromLocal /my/dir/tree s3a://bucket over 2x in my quick tests. In addition to the warning in the class comment, it needs a log.warn "this is not for production use" or something. I'm concerned about people turning it on for performance without realizing that it has zero cross-node coherency.
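The essence of such a local, in-memory metadata store can be sketched as a simple path map (a hypothetical Python illustration; the real implementation is Java and tracks far more state, such as directory listing completeness and isAuthoritative flags):

```python
# Hypothetical sketch of an in-memory metadata store keyed by path.
# WARNING (per the comment above): no cross-node coherency; testing only.

class LocalMetadataStore:
    def __init__(self):
        self.files = {}      # path -> metadata dict

    def put(self, path, metadata):
        self.files[path] = metadata

    def get(self, path):
        return self.files.get(path)

    def list_children(self, dir_path):
        """Return direct children of dir_path recorded in the store."""
        prefix = dir_path.rstrip("/") + "/"
        return sorted(
            p for p in self.files
            # direct children only: no further '/' after the prefix
            if p.startswith(prefix) and "/" not in p[len(prefix):]
        )

store = LocalMetadataStore()
store.put("/data/a.txt", {"len": 3})
store.put("/data/sub/b.txt", {"len": 5})
children = store.list_children("/data")
```

Because listings are answered from local memory rather than S3, repeated metadata lookups during a bulk copy become cheap, which is consistent with the speedup described above.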

          I'm on vacation next week but wanted to send out what we have. I'm hoping we can help you with this and combine the best of both efforts.

          cnauroth Chris Nauroth added a comment -

          Everyone, thank you for reviewing. There is a lot of helpful feedback here. I'm going to respond to everyone's points over the next several days and then fold that into another revision of the design document.

          cnauroth Chris Nauroth added a comment -

          Mingliang Liu, thank you for your feedback.

          Besides the metrics2, do you plan to support statistics (as subclass of StorageStatistics probably)?

          I hadn't considered it, but yes, I think we can investigate adding statistics specific to S3Guard's implementation.

          As there is no limit to the number of objects that can be stored in an S3 bucket, the bucket may be very large. In that case, the consistency check requests may all go to a single DynamoDB table, and S3Guard may suffer from low read/write capacity units. To avoid this, customers need to monitor and provision the table throughput. I suggest we consider this as the third "potential drawback" of using DynamoDB as a consistency store; see page 11 of the design doc. I think using the namenode should work just fine regarding the operation overhead.

          Management of provisioned throughput is an additional source of operational complexity, but I failed to call that out specifically in the first revision of the document. I'll add it in the next revision.

          For fs.s3a.s3guard.fail.on.error, the default value is false, though it should be true as indicated by the config key description. I believe this is an omission.

          Yes, thank you for catching it.

          As to the exponential back-off strategy for recheck, would jitter be helpful?

          Yes, FWIW, I consider jitter important enough that it should be a part of any exponential back-off implementation. I tend to think of it as implicit whenever anyone uses the phrase "exponential back-off", but that's not necessarily true, so I'll state it explicitly in the next revision.
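          The point about jitter can be made concrete with a small sketch. Nothing below corresponds to actual S3Guard retry code or configuration; the class name, base delay, and cap are illustrative, and "full jitter" (drawing uniformly from zero up to the exponential bound) is just one common way to randomize the delay.

```java
import java.util.Random;

/** Sketch of exponential back-off with "full jitter". The class name, base
 *  delay, and cap are illustrative assumptions; this is not the S3Guard
 *  recheck implementation under discussion. */
public class BackoffSketch {

  /** Randomized delay in milliseconds for the given retry attempt. */
  public static long delayWithJitter(int attempt, long baseMillis,
                                     long capMillis, Random rng) {
    // Exponential growth: base * 2^attempt, clamped to the cap
    // (the shift is bounded so it cannot overflow).
    long exp = Math.min(capMillis, baseMillis << Math.min(attempt, 30));
    // Full jitter: draw uniformly from [0, exp) so concurrent clients
    // rechecking the same key do not retry in lock step.
    return (long) (rng.nextDouble() * exp);
  }
}
```

          Without the jitter step, every client that observed the same inconsistency would recheck at the same deterministic instants, which is exactly the thundering-herd pattern the comment is warning about.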

          I think we can also discuss on the ConsistentStore methods that a consistent store should implement in the design doc. Plus the DynamoDB Table scheme/index design. I saw in the code there is discussion about alternative schema ideas which is helpful.

          Yes, I can fold this information into the design document. There is a balance to strike as I expect some of these aspects to evolve during implementation, which risks invalidating an overly prescriptive upfront design document. We'll figure out that balance as we go.

          DescendantsIterator.java claims to implement a preordering depth-first traversal (DFS) of a path and all of its descendants. The example given was actually a breadth-first traversal (BFS). I checked the code and think that it did implement a BFS, which conforms with the example. I think this is an error in the javadoc.

          I'll need to revisit this, because I think I actually have a bug in here right now. My intent was to match the iteration order as would be seen through the S3 object listings performed inside S3AFileSystem during recursive deletes and renames. I believed matching the iteration order would make it easier to reason about failure modes. However, I now realize that it's almost never going to match up exactly anyway, because S3 won't have a key for every intermediate directory, but I expect DynamoDB will.
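          For reference, the breadth-first order being debated (all children of a path emitted before any grandchild) can be sketched as follows. The Map-based tree and class name are hypothetical; this is not the DescendantsIterator code itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

/** Minimal sketch of a breadth-first path traversal: a FIFO queue yields
 *  level-by-level order. Swapping the queue for a stack (LIFO) would give
 *  the depth-first preorder the javadoc claims. */
public class BfsSketch {

  public static List<String> bfsOrder(String root,
                                      Map<String, List<String>> children) {
    List<String> order = new ArrayList<>();
    Deque<String> queue = new ArrayDeque<>();
    queue.addLast(root);
    while (!queue.isEmpty()) {
      String path = queue.removeFirst();  // FIFO => siblings before grandchildren
      order.add(path);
      for (String child :
           children.getOrDefault(path, Collections.<String>emptyList())) {
        queue.addLast(child);
      }
    }
    return order;
  }
}
```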

          In DynamoDBConsistentStore#listChildren(), we can use key condition expression instead of key conditions in pathToParentEq for the query request.

          Are you recommending this based on the fact that the AWS SDK JavaDocs for withKeyConditions describe it as a "legacy parameter", or is there something more to it? This is my first time working with DynamoDB, so I'm learning as I go.

          All of the other code suggestions look great to me. Thanks!

          liuml07 Mingliang Liu added a comment -

          Thanks for addressing the comments, Chris Nauroth. I like the design and look forward to it addressing the important S3A consistency problem in Hadoop.

          Are you recommending this based on the fact that the AWS SDK JavaDocs for withKeyConditions describe it as a "legacy parameter", or is there something more to it? This is my first time working with DynamoDB, so I'm learning as I go.

          Sorry, I don't have solid evidence for that; it stems from my personal experience when I worked for DynamoDB. Key condition expressions are a newer, easier approach that uses expression-style syntax for specifying the key conditions. I searched and found the official AWS blog here. I then realized that the conditions of DDB requests are not that complex in the current design, so I now believe this is minor and can be improved later.

          cnauroth Chris Nauroth added a comment -

          Aaron Fabbri and Lei (Eddy) Xu, thank you for sharing your work. I see a lot of commonality between the two efforts so far.

          I also explored something like the Layered FileSystem vs. Pluggable Metadata trade-off that you described. Specifically, I had an earlier prototype (not attached) that was a FilterFileSystem, with the intent that you could layer it over any other FileSystem implementation. I abandoned this idea when I got into implementation and found that I was going to need to coordinate more directly with the S3A logic in a way that wasn't amenable to overriding FileSystem methods. For example, I needed special case logic around createFakeDirectoryIfNecessary. It looks like you came to the same conclusion in your patch.

          The main difference I see is that my work focused more on consistency, with the S3 bucket still treated as source of truth, and your work focused more on performance. I hadn't tried anything with the DynamoDB lookup completely short-circuiting the S3 lookup. I think we can reconcile this though. Like you said, we can support configurable policies for different use cases. For example, if a user is willing to commit to performing all access through S3A and no external tools, then I expect it's safe for them to turn on a more aggressive caching policy that satisfies all metadata lookups from DynamoDB. Alternatively, there can be a fix-up tool like you described. This might fold into HADOOP-13311, where I proposed a new shell entry point for S3A-specific administration commands.

          Another interesting example in this area is GCS, which has something like the policies we are describing in terms of their DirectoryListCache. This includes an implementation like the in-memory one included in your patch.

          https://github.com/GoogleCloudPlatform/bigdata-interop/blob/1447da82f2bded2ac8493b07797a5c2483b70497/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/InMemoryDirectoryListCache.java

          The JavaDocs advertise it as providing consistency within a single process. Like you said, there is no cache coherence across processes.

          HADOOP-12876 is slightly related. Azure Data Lake has implemented an in-memory FileStatus cache (patch not yet available). When this idea was suggested, I raised the concern about cache coherence, but system testing with that caching enabled has gone well. That's a good sign that the cache coherence problem might not cause much harm to applications in practice. I had been thinking the HADOOP-12876 work could eventually be refactored to hadoop-common for any FileSystem to use, effectively becoming something like the "dentry cache" of Hadoop. I had been thinking this could happen independent of S3Guard. We can explore further if that makes sense, or if it's really beneficial to push the caching lower into S3A itself. (Some of the internal S3 listing calls don't map exactly to FileSystem method calls.)

          To summarize though, I see more commonality than difference, so I'd like to proceed with collaborating on this. I'd start by creating a feature branch and folding all of the information into a shared design doc. Please let me know your thoughts.

          fabbri Aaron Fabbri added a comment -

          Thanks Chris Nauroth. Really cool to compare two independently designed solutions. Thanks for the GCS link, I'll check that out.

          I agree we should proceed and collaborate on this. Feature branch sounds good.

          The main difference I see is that my work focused more on consistency, with the S3 bucket still treated as source of truth, and your work focused more on performance. I hadn't tried anything with the DynamoDB lookup completely short-circuiting the S3 lookup. I think we can reconcile this though.

          We tried to make the MetadataStore interface expressive enough to allow implementations (both the MetadataStore impl. and the client code that uses it) to decide on whether or not the MetadataStore can be source of truth on directory listings:

          • Our MetadataStore#listStatus(Path) returns a CachedDirectory which contains a flag isFullyCached. Implementations may always set that flag to false, indicating that the client needs to consult the backing storage as well.
          • If a client connector wishes to take advantage of the performance benefits, it can publish full directory listings to the MetadataStore via putListStatus() with isFullyCached=true, and also note the isFullyCached flags on the return values from listStatus(). If a client connector does not want to deal with two possible sources of truth (e.g. to simplify failure cases), it can choose not to publish full listings to the MetadataStore, and to ignore any isFullyCached flags that are set on return from MetadataStore#listStatus().
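          The interface shape described in these bullets might look roughly like the following. The names CachedDirectory, listStatus, and putListStatus come from the comment, but the exact signatures (String paths, List entries) are assumptions rather than the attached patch.

```java
import java.util.List;

/** Sketch of the MetadataStore shape described above; signatures are
 *  assumed, not taken from the attached patch. */
public class MetadataStoreSketch {

  /** A listing plus the flag telling the client whether it may skip S3. */
  public static class CachedDirectory {
    public final List<String> entries;
    public final boolean isFullyCached;  // false => also consult backing store
    public CachedDirectory(List<String> entries, boolean isFullyCached) {
      this.entries = entries;
      this.isFullyCached = isFullyCached;
    }
  }

  /** The store itself; implementations that never want to be the source of
   *  truth can simply always return isFullyCached=false. */
  public interface Store {
    CachedDirectory listStatus(String path);
    /** Publish a listing; isFullyCached=true claims the listing is complete. */
    void putListStatus(String path, List<String> entries, boolean isFullyCached);
  }
}
```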
          cnauroth Chris Nauroth added a comment -

          I have created the HADOOP-13345 feature branch and JIRA fix version.

          eddyxu Lei (Eddy) Xu added a comment -

          Thanks a lot, Chris Nauroth!

          cnauroth Chris Nauroth added a comment -

          I am attaching a V2 design document that brings together content from both of the documents that were written independently and attached previously. Once again, feedback is welcome. If it's helpful, I also can set up a shared Google Doc for us to collaborate more directly on later revisions.

          liuml07 Mingliang Liu added a comment - - edited

          Thanks Chris Nauroth for updating the design doc and creating the sub-tasks. I suggest we raise the sub-tasks' priority to "Major", as they are.

          fabbri Aaron Fabbri added a comment -

          Thanks Chris Nauroth for updating the doc. I was out last week. There are some minor sections that need reconciliation (text from your doc says always use s3 as source of truth, text from our doc says make it configurable but shoot for support for directory caching), but overall it is very good.

          I'll comment on HADOOP-13448 with some questions I have related to MetadataStore interface design in your patch here.

          fabbri Aaron Fabbri added a comment -

          Having the MetadataStore interface is an important first step for us to parallelize our effort here. Thanks again Chris for getting that first patch out.

          I still have questions about the subtasks though. There is still some fuzziness with respect to the policy part. (We may want to have a conf. call to discuss--and I'm open tomorrow.)

          I've been thinking about policy a little and I believe:

          • Allowing MetadataStore implementations to opt in/out of being source of truth is important. Implementations may wish to opt out based on implementation complexity, or lack of transactions for underlying store, or policy (LRU discard).
          • Allowing the client to opt out of relying on MetadataStore as source of truth is also desirable. Workloads that add files outside of hadoop, for example. And opting out is less risky while we stabilize the codebase.

          This implies some configuration parameters (ignoring the naming for now--I assume a future where this is factored out of s3a for any FS client to utilize)

          fs.<client>.metadatastore.allow.authoritative

          • If true, allow configured metadata store (if any) to be source of truth on cached file metadata and directory listings.
          • If true, but configured metadata store does not support being authoritative, this setting will have no effect,
            as the MetadataStore will always return results marked as non-authoritative.

          fs.<client>.metadatastore.class

          • Configure which MetadataStore implementation to use, if any.
          • This may replace fs.s3a.s3guard.enabled proposed in doc?

          fs.metadatastore.<impl-name>.fullycache.directories

          • If the metadata store implementation supports being authoritative on directory listings, this will cause it
            to return DirectoryListMetadata (name tbd) results with fullyCached=true when it has complete directory
            listing.
          • If metadata store implementation does not support this, it should log an error. Client will work correctly
            as implementation will never claim to fully cache listings / PathMetadata.

          We could name this ...<impl-name>.authoritative.directories instead. We could also add an analogue for files, ...<impl-name>.authoritative.files, as well. In my prototype I assumed get() on a single Path could always be authoritative. I could go either way.
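          Put together, the proposed keys might appear in a site configuration fragment like the one below. This is illustrative only: substituting s3a for <client> and dynamodb for <impl-name> is an assumption, as is the fully qualified class name (only the simple name DynamoDBMetadataStore appears in the sub-task list), and the key names themselves are still under discussion.

```xml
<!-- Illustrative sketch only: key names follow the proposal above and may
     change; values are examples, not defaults. -->
<property>
  <name>fs.s3a.metadatastore.class</name>
  <value>org.apache.hadoop.fs.s3a.DynamoDBMetadataStore</value>
</property>
<property>
  <name>fs.s3a.metadatastore.allow.authoritative</name>
  <value>true</value>
</property>
<property>
  <name>fs.metadatastore.dynamodb.fullycache.directories</name>
  <value>true</value>
</property>
```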

          Thoughts?

          eddyxu Lei (Eddy) Xu added a comment - - edited

          Hi, Aaron Fabbri thanks for these great suggestions here.

          One question here is:

          • Can we consider fullycache.directories == true iff metadatastore.allow.authoritative == true? If we combine them, case 2 of fullycache.directories should not happen.

          as the MetadataStore will always return results marked as non-authoritative.

          If we have this flag, we might not need to mark results as well.

          So I think code like the following could make things simpler:

          List<PathMetadata> subFiles = metadataStore.get(path);
          if (!metadataStore.isAuthoritative()) {
              List<PathMetadata> s3Files = s3.listDir(path);
              // merge subFiles and s3Files...
          }
          

          What do you think?

          fabbri Aaron Fabbri added a comment -

          Thank you for the feedback Lei (Eddy) Xu.

          Let me give an example why I think the fullyCached or isAuthoritative flag is required for return value from MetadataStore.listChildren().

          Assume we have an existing s3 bucket that contains these files:

          /a/b/file0
          /a/b/file1

          Now, assume we start up a Hadoop cluster, with s3guard configured for the MetadataStore to be authoritative, and do the following operations:

          create(/a/b/file2)
          listStatus(/a/b)

          In this case we have to query both the MetadataStore, and the s3 backend, as /a/b/file2 visibility may be subject to eventual consistency. Also the MetadataStore only knows about /a/b/file2, so the client has to consult s3 to learn about file0 and file1. In the listStatus() above, MetadataStore.listChildren(/a/b) will return (("/a/b/file2"), isAuthoritative=false), since the MetadataStore did not get a put() with isAuthoritative=true, nor did it see a mkdir(/a/b) happen.

          Two examples where MetadataStore.listChildren() would return a result with isAuthoritative=true:

          1.
          mkdir(/a/b/c)
          create(/a/b/c/fileA)
          listStatus(/a/b/c)

          Here, since the metadata store saw the creation of /a/b/c, it knows that it has observed all creations and deletions inside the /a/b/c directory.

          2. Extending the original example:

          Existed before cluster startup:
          /a/b/file0
          /a/b/file1

          Then with cluster, we see:
          create(/a/b/file2)
          listStatus(/a/b)
          listStatus(/a/b)

          The first call to listStatus(/a/b) will have to fetch the full directory contents from s3, since MetadataStore.listChildren(/a/b) will return isAuthoritative=false. Once the client gets the full listing from s3, it can call MetadataStore.put(('/a/b/file0', '/a/b/file1', '/a/b/file2'), isAuthoritative=true). Now the MetadataStore has been told it has the full contents of /a/b, and the second call to listStatus(/a/b) above will see the MetadataStore return (('/a/b/file0', '/a/b/file1', '/a/b/file2'), isAuthoritative=true).
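          The two-call flow in this example can be sketched end to end. StoreStub and S3Stub below are hypothetical stand-ins for the MetadataStore and the S3 client, and the merge-then-publish logic follows the comment rather than any actual patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of the client-side flow above: consult the MetadataStore first,
 *  fall back to S3 when the cached listing is not authoritative, then
 *  publish the merged listing back as authoritative. All names here are
 *  hypothetical stand-ins, not code from the attached patches. */
public class ListingSketch {

  /** Minimal stand-in for the S3 client. */
  public interface S3Stub {
    Set<String> listDir(String path);
  }

  /** Minimal stand-in for the MetadataStore. */
  public static class StoreStub {
    public static class Listing {
      public final Set<String> paths;
      public final boolean isAuthoritative;
      Listing(Set<String> paths, boolean isAuthoritative) {
        this.paths = paths;
        this.isAuthoritative = isAuthoritative;
      }
    }
    private final Map<String, Listing> dirs = new HashMap<>();
    public Listing listChildren(String path) {
      Listing l = dirs.get(path);
      return l != null ? l : new Listing(new TreeSet<String>(), false);
    }
    public void put(String path, Set<String> children, boolean authoritative) {
      dirs.put(path, new Listing(new TreeSet<>(children), authoritative));
    }
  }

  public static Set<String> listStatus(String path, StoreStub store, S3Stub s3) {
    StoreStub.Listing cached = store.listChildren(path);
    if (cached.isAuthoritative) {
      return cached.paths;  // store has observed everything under this path
    }
    // Merge S3's view with store entries S3 may not show yet
    // (e.g. /a/b/file2, subject to eventual consistency).
    Set<String> merged = new TreeSet<>(s3.listDir(path));
    merged.addAll(cached.paths);
    // Tell the store it now holds the full contents of this directory,
    // so the next listStatus(path) can skip S3 entirely.
    store.put(path, merged, true);
    return merged;
  }
}
```

          After the first call, the store's listing for /a/b is marked authoritative, so the second listStatus(/a/b) returns directly from the store, which is exactly the behavior the example describes.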

          cnauroth Chris Nauroth added a comment -

          Yes, on further reflection, I agree that many aspects of the policy I laid out earlier (and their corresponding JIRA sub-tasks) can be simplified. Aaron's last 2 comments look like a step in the right direction. I suggest that we proceed toward implementing what the design doc describes in "Policy C: On-Demand Source of Truth". A small number of properties like Aaron described might influence the exact behavior. Beyond that, a more complex notion of policy might turn out to be overkill. Some of those JIRA sub-tasks might just get closed as invalid.

          My immediate next step will be to review the work Aaron did on HADOOP-13448 in my absence.

          cnauroth Chris Nauroth added a comment -

          Based on the most recent discussion here about simplifying the notion of policy, I have resolved as invalid sub-tasks HADOOP-13450 and HADOOP-13451. I expect the previous intended scope of these issues will be small enough that we can combine it into the metadata store implementation sub-tasks and possibly a few other sub-tasks to track integration back into S3AFileSystem.

          fabbri Aaron Fabbri added a comment -

          FYI Lei (Eddy) Xu, Mingliang Liu, Steve Loughran, I'd like to rebase the feature branch on latest trunk tomorrow morning. Let me know if this works for you.

          liuml07 Mingliang Liu added a comment -

          That sounds perfect. I think Steve also has some plans around this? Perhaps you can sync with him before rebasing. Thanks,

          eddyxu Lei (Eddy) Xu added a comment -

          Thanks, Aaron Fabbri.

          It works for me. Looking forward to it.

          stevel@apache.org Steve Loughran added a comment - (edited)

          +1 for merging; I know the S3AFS class is a moving target, but regular resyncs will stop things diverging/breaking each other.

          fabbri Aaron Fabbri added a comment -

          Mingliang Liu, Lei (Eddy) Xu, Steve Loughran I just rebased on trunk and force-pushed to the HADOOP-13345 feature branch. You'll need to do the usual branch/reset/rebase dance for your outstanding commits. Shout if you have any questions/concerns.

          liuml07 Mingliang Liu added a comment -

          Thanks!

          Will post the DDBMetadataStore patch on the rebased branch soon.

          stevel@apache.org Steve Loughran added a comment -

          thanks: sorry for breaking things; it's the price of working against a changing part of the codebase. Now that I'm involved in this branch as well, (a) I'll have less time to break things on branch-2 and (b) I'll be more aware of what I've just broken.

          FWIW I don't think you need to rebase here; merging is better for collaboration, and avoids the hell of having to fix up a patch conflict over a class you know gets deleted later. When this gets pulled into trunk/branch-2, it'll be done as a squashed merge, so there's no harm in doing merges here rather than rebasing.

          stevel@apache.org Steve Loughran added a comment -

          HADOOP-13852 allows Hadoop to declare a different version in its version-info than in the POM. This is currently needed to allow Spark to act as a regression test for the s3guard work.

          I've already got it in the HADOOP-13786 branch, but putting it into the base s3guard branch lets me test DynamoDB and other patches while the committer code is still a WIP.

          fabbri Aaron Fabbri added a comment -

          +1 on adding patch v2 from HADOOP-13852 to the s3guard feature branch (HADOOP-13345).

          stevel@apache.org Steve Loughran added a comment -

          thanks. I've actually got permission to add to trunk... how about I do that and then do another merge of trunk -> HADOOP-13345?

          fabbri Aaron Fabbri added a comment -

          Sounds good.

          stevel@apache.org Steve Loughran added a comment -

          done

          fabbri Aaron Fabbri added a comment -

          Looks like you did the commit and you also did a merge to update HADOOP-13345 w/ trunk. Thanks!

          stevel@apache.org Steve Loughran added a comment -

          yeah, but the patch has been reverted from trunk as it broke a small bit of YARN. Once the final patch is in I'll merge up again.

          stevel@apache.org Steve Loughran added a comment -

          the CLI needs to be able to rebuild its state from the bucket, including detecting & logging any inconsistencies.

          fabbri Aaron Fabbri added a comment -

          Happy new year Steve Loughran and Mingliang Liu. I'd like to figure out the set of JIRAs we want to resolve before merging the feature branch. We've discussed before targeting basic list consistency for the initial feature set. It seems like the main features we need are:

          Needed:

          • Any DynamoDB bugs (initialization stuff, etc).
          • Basic CLI functionality. HADOOP-13650
          • Multi-bucket improvements, including read-only buckets. HADOOP-13876. I think the solution will be to tackle the per-bucket config JIRA HADOOP-13336

          Maybe:

          • Make sure testing is sorted. Currently tests are good but a bit manual. Do we want to do HADOOP-13589 pre merge?

          Post merge:

          • The rest of the JIRAs.

          Let me know what you think. Should we create a new "s3guard phase 2" umbrella JIRA so we can start moving stuff over once we have consensus?

          liuml07 Mingliang Liu added a comment -

          Hi Aaron Fabbri, happy new year!

          The list makes sense to me. I also think retry policies will be needed. We can 1) make them pluggable and 2) provide a simple one; other efforts to make the retry policy work better would be appreciated. My current work is focused on fixing DDB-related bugs/improvements.

          Thanks,

          fabbri Aaron Fabbri added a comment -

          Thanks for your input Mingliang Liu. The reason I propose doing the retry logic after merge is:

          1. It will change S3AFileSystem code significantly. At least in terms of dealing with merge conflicts, things like putting a function inside a retry loop (increasing the indent of all the code) might mean a good amount of code churn.

          2. I'm not sure retries are needed for basic list consistency. The failure injection test I added demonstrates that basic listing consistency works. Since most S3 GETs are consistent, I would expect use cases like PUT, list, GET to work as is.

          Let me know if I'm missing something. Looking forward to others' opinions too. Cheers!
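          The code-churn concern in point 1 can be reduced by wrapping each operation in a lambda-based helper rather than re-indenting whole method bodies inside a retry loop. A generic sketch of that approach (not the eventual S3A retry API; all names here are illustrative):

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetrySketch {
    /**
     * Run op, retrying up to the given number of attempts with a fixed
     * backoff between failures. Callers wrap a single expression instead
     * of re-indenting the whole method body, which keeps diffs small.
     * Rethrows the last failure once the attempts are exhausted.
     */
    static <T> T retry(int attempts, long backoffMillis, Callable<T> op)
            throws Exception {
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                Thread.sleep(backoffMillis);  // simple fixed backoff
            }
        }
        throw last;  // assumes attempts > 0
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated transient failure: fails twice, succeeds on the third try.
        String result = retry(5, 1, () -> {
            if (++calls[0] < 3) {
                throw new IOException("transient");
            }
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A real retry policy would add exponential backoff, jitter, and a distinction between retriable and non-retriable exceptions, but the call-site shape stays the same.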

          eddyxu Lei (Eddy) Xu added a comment -

          Hi, Aaron Fabbri, Mingliang Liu, Steve Loughran. Happy new year.

          Mingliang Liu, regarding HADOOP-13650 (CLI), for DynamoDBMetadataStore#initialize(Configuration), does it still require the S3 filesystem name to be defined in the configuration? I think users may not be used to setting fs.defaultFS to an S3 bucket. It seems the logic for creating AWS credentials from an S3 URL sits deep in the code. The consequence is that even though hadoop s3a [init|destroy] can use a DynamoDB URL directly to create or clear / destroy the metadata store, it still requires the user to specify an s3a URL (s3a://bucket) to create the FS instance. It is also undesirable for multi-bucket cases, i.e., sharing one DynamoDB table among multiple buckets.

          Should we get the current form of the CLI committed first, then change the CLI parameters after the feature branch is merged into trunk?

          Thanks

          stevel@apache.org Steve Loughran added a comment -

          I'd like to see manageability, correctness and maintainability, especially to the extent that we are confident that people can start to play with it and not get into a mess.

          This means: the CLI is usable, documented, tested. I think we need some fsck option to verify that everything matches between DB and FS, and fsck --fix to do the corrections.

          Other management? Some metrics, consistent with the rest of the S3A metrics in terms of what they count, and appearing in S3AFileSystem.toString for some diagnostics downstream.

          maintainability: let's look at the code and see if its integration with S3A is as clean as we can have it. Once you have things merged into trunk and two branches forking again, it becomes much harder to clean up.

          cnauroth Chris Nauroth added a comment -

          I would prefer to see HADOOP-13589 completed before merge. Being able to run the existing S3A test suite with S3Guard enabled would help ensure that we're maintaining the existing semantics as much as possible as we iterate further on the code.

          fabbri Aaron Fabbri added a comment -

          Hi Chris Nauroth. Just to clarify, you can run all the S3A integration tests with S3Guard enabled, and the steps are documented in s3guard.md. I agree having better integration / automation is important. Will discuss on that JIRA.

          fabbri Aaron Fabbri added a comment -

          Thanks Steve Loughran, this is good stuff. Sounds like you are adding (1) CLI polish / docs / fsck and (2) FS metrics to my list.

          In terms of clean integration with S3A code: I spent a lot of time streamlining the S3AFileSystem integration. One of the main warts IMO is the isEmptyDirectory thing, but refactoring that post-merge seems much easier to me (and also isn't S3Guard-specific per se). Anything else actionable we can do to address the maintainability point? Maybe we do another feature branch code review with this in mind after we get the above features in?

          liuml07 Mingliang Liu added a comment -

          Hi Lei (Eddy) Xu,

          You're right that in the current code users have to specify the defaultFS (via configuration file or -fs option from command line) for operating DDB metadata store directly. The s3 URI is used to create AmazonS3 object along with credential object (for S3 and DDB).

          1. The AmazonS3 client, which is used for detecting the bucket region, can operate on any bucket, and creating such an object is not bound to any specific bucket.
          2. As to the credentials in the URI (e.g. s3://user:pass@bucket/), they're optional and deprecated. This pattern is not supported for DDB. However, the DDBClientFactory itself uses the same createAWSCredentialProviderSet as S3ClientFactory does, so it honors the creds in the URI name. The reason it's not yet supported is that after FS#initialization, S3AFS has stripped the creds and returns the scheme://host-only URI for creating a MetadataStore. One possible fix is to pass the name URI, which contains the creds, to S3Guard#getMetadataStore():
            S3AFileSystem#initialize()
            -      metadataStore = S3Guard.getMetadataStore(this);
            +      metadataStore = S3Guard.getMetadataStore(this, name);
            

          For command line operations, I think fs.defaultFS is a basic config for users and specifying s3://bucket seems not heavy. But still, we can remove this constraint.

          1. Option 1: The DDB table name has to be specified via configuration, and we assume the bucket name is the DDB table name if the defaultFS is not provided (or it's not S3). To determine the region of the bucket, we still assume the S3 bucket (whose name is the same as the DDB table name) exists, and AmazonS3.getBucketLocation will provide the value.
          2. Option 2: The DDB table name and endpoint have to be specified via configuration. We can determine the DDB region by the DDB endpoint. This way, we don't have to know the related S3 bucket for the DDB metadata store to operate.

          I prefer the 2nd approach. I'm not sure both options work, but I can work on a WIP patch soon; or, as you suggested, we can support this later.
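          Option 2 can be sketched as follows, with java.util.Properties standing in for Hadoop's Configuration. The property key names and the endpoint parsing are illustrative assumptions, not the final S3Guard configuration surface:

```java
import java.util.Properties;

public class DdbConfigSketch {
    // Illustrative key names only, not the final S3Guard property names.
    static final String TABLE_KEY = "fs.s3a.s3guard.ddb.table";
    static final String ENDPOINT_KEY = "fs.s3a.s3guard.ddb.endpoint";

    /**
     * Derive the DDB region from the endpoint host, e.g.
     * "dynamodb.us-west-2.amazonaws.com" -> "us-west-2", so no S3 bucket
     * is needed to initialize the metadata store.
     */
    static String regionFromEndpoint(String endpoint) {
        String[] parts = endpoint.split("\\.");
        if (parts.length < 2 || !parts[0].equals("dynamodb")) {
            throw new IllegalArgumentException(
                "unrecognized endpoint: " + endpoint);
        }
        return parts[1];
    }

    /** Option 2: both table name and endpoint must be configured. */
    static String[] resolve(Properties conf) {
        String table = conf.getProperty(TABLE_KEY);
        String endpoint = conf.getProperty(ENDPOINT_KEY);
        if (table == null || endpoint == null) {
            throw new IllegalStateException("table and endpoint are required");
        }
        return new String[] { table, regionFromEndpoint(endpoint) };
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(TABLE_KEY, "my-metadata-table");
        conf.setProperty(ENDPOINT_KEY, "dynamodb.us-west-2.amazonaws.com");
        String[] r = resolve(conf);
        System.out.println("table=" + r[0] + " region=" + r[1]);
    }
}
```

The appeal of this shape is exactly what the comment describes: the metadata store can be created and destroyed without instantiating an S3A filesystem at all.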

          liuml07 Mingliang Liu added a comment -

          Yeah, the two points both make sense to me. I'll get my hands dirty by working on that early; the code might not go to the feature branch before the merge to trunk. For the 2nd point, I was thinking of rename, which uses the copy operation, but I was not sure GET after copy is consistent. Will check that as well.

          stevel@apache.org Steve Loughran added a comment -

          -1 to anything working with user:pass in URIs. It's wrong because it contaminates the logs with secrets. With HADOOP-13336 you can define per-bucket secrets in core-site.xml, or on the command line with -D fs.s3a.bucket.stevel-new.aws.secret.id=mykey; they'll get picked up in the FS instance.

          This means that the bucket URI will be needed for setup, but it doesn't have to involve instantiating the FS, just:

          bucket = new URI(fsName).getHost()
          conf = propagateBucketOptions(origConf, bucket)
          s3ClientFactoryClass = conf.getClass(
                    S3_CLIENT_FACTORY_IMPL, DEFAULT_S3_CLIENT_FACTORY_IMPL,
                    S3ClientFactory.class);
          s3Client = ReflectionUtils.newInstance(s3ClientFactoryClass, conf)
                    .createS3Client(name, uri);
          

          That's all. But we will need that bucket name if there's any need to go near bucket-related options.
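          The per-bucket override idea from HADOOP-13336 can be sketched with a plain Map standing in for Hadoop's Configuration (this is a toy model of the propagation step, not the actual propagateBucketOptions implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class BucketOptionsSketch {
    /**
     * Any key of the form fs.s3a.bucket.BUCKET.suffix overrides the
     * generic fs.s3a.suffix option for that bucket. Returns a new map
     * where the per-bucket values have been copied over the generic ones.
     */
    static Map<String, String> propagateBucketOptions(
            Map<String, String> conf, String bucket) {
        String prefix = "fs.s3a.bucket." + bucket + ".";
        Map<String, String> out = new HashMap<>(conf);
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                String generic =
                    "fs.s3a." + e.getKey().substring(prefix.length());
                out.put(generic, e.getValue());  // per-bucket value wins
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3a.aws.secret.id", "default-key");
        conf.put("fs.s3a.bucket.stevel-new.aws.secret.id", "mykey");
        Map<String, String> resolved =
            propagateBucketOptions(conf, "stevel-new");
        // The per-bucket secret overrides the generic one.
        System.out.println(resolved.get("fs.s3a.aws.secret.id"));
    }
}
```

With this in place, the snippet above only needs the bucket name from the URI host; no filesystem instantiation, and no secrets in the URI.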

          stevel@apache.org Steve Loughran added a comment -

          I'd like to merge trunk into this branch (i.e. not a rebase, just a merge). Why? I think I need HADOOP-13922 for building/testing spark

          fabbri Aaron Fabbri added a comment -

          +1 on merging in trunk. Works for me, Steve Loughran.

          liuml07 Mingliang Liu added a comment -

          I re-visited the retry logic and think it's not a needed item for basic list consistency. I can think of a few more JIRAs, like better handling of request throttling in DynamoDB, failure detection and recovery between writing to S3 and S3Guard, etc.; I think they can be post-merge subtasks. Thanks,

          stevel@apache.org Steve Loughran added a comment -

          I've created a JIRA to explicitly track the things people think is needed for the first preview/merge: HADOOP-13998

          I want the version marker in, as we mustn't let this get into the field without one: otherwise the whole version marker logic is crippled from the outset.

          stevel@apache.org Steve Loughran added a comment -

          FYI, I've just cherry-picked in HADOOP-13999, "Add -DskipShade maven profile to disable jar shading to reduce compile time"

          I/we will probably need to roll this back before the next trunk merge; until then it lets us build this branch without waiting 6+ minutes for the shade

          stevel@apache.org Steve Loughran added a comment -

          One thing everyone needs to keep an eye on is HADOOP-14000: support for millions of files. This'll require big changes in the current DirListingMetadata class, as well as hooking up all the S3AFS list* operations.

          fabbri Aaron Fabbri added a comment -

          Heads up Steve Loughran; I plan on merging the latest Trunk into the HADOOP-13345 branch in an hour or so.

          fabbri Aaron Fabbri added a comment -

          Done: Just pushed merge from latest trunk.

          sameer.chouhdary Sameer Choudhary added a comment -

          Hi,

          Today, I attended the talk on the project at Spark Summit 2017. Thanks for putting in all the effort!

          I have a question regarding the pricing of DynamoDB. It charges by read/write request rate, so users might pay a high price for the consistency guarantees. This would especially affect large Spark jobs with many tasks executing in parallel that read/write to DynamoDB. Throttling would affect job performance. Some benchmarks here would be great.

          A solution could be for S3Guard to additionally support a custom key-value store such as Apache HBase that supports strictly consistent reads/writes. A user can create a separate cluster or use the same Spark cluster to set up the store. The benefit of this approach is that users can achieve high throughput even on large Spark jobs while paying just a fraction of the cost.

          fabbri Aaron Fabbri added a comment -

          S3Guard is specifically designed to allow multiple backend implementations for MetadataStore (the interface that stores metadata). So far we have an in-memory "reference" or testing implementation, and one for DynamoDB. We expect more back ends to be implemented in the future.

          Note there is also a test suite that ensures that different implementations provide correct semantics.
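To make the pluggable-backend idea concrete, here is a minimal, hypothetical sketch. The interface and class names below (`SimpleMetadataStore`, `InMemoryStore`) are illustrative only and are NOT the real Hadoop S3Guard API; they just show how a strongly consistent store can sit behind a small interface with swappable implementations.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified illustration of the pluggable-backend idea.
// These names are NOT the actual Hadoop S3Guard interfaces.
interface SimpleMetadataStore {
    void put(String path, String metadata);   // record metadata for a path
    String get(String path);                  // strongly consistent read
    void delete(String path);
}

// In-memory backend, analogous to the "reference"/testing implementation;
// a DynamoDB- or HBase-backed class would implement the same interface.
class InMemoryStore implements SimpleMetadataStore {
    private final Map<String, String> entries = new HashMap<>();
    public void put(String path, String metadata) { entries.put(path, metadata); }
    public String get(String path) { return entries.get(path); }
    public void delete(String path) { entries.remove(path); }
}

public class MetadataStoreDemo {
    public static void main(String[] args) {
        SimpleMetadataStore store = new InMemoryStore();
        store.put("/bucket/dir/file", "len=42");
        // Unlike raw S3 listings, the store gives read-after-write consistency.
        if (!"len=42".equals(store.get("/bucket/dir/file"))) {
            throw new AssertionError("read-after-write should be consistent");
        }
        store.delete("/bucket/dir/file");
        if (store.get("/bucket/dir/file") != null) {
            throw new AssertionError("deleted entry should be gone");
        }
        System.out.println("ok");
    }
}
```

A shared test suite run against each implementation of such an interface is what lets the project verify that every backend provides the same semantics.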

          stevel@apache.org Steve Loughran added a comment -

I should add to Aaron's comment:

          1. the in-memory one really, really is for testing only.
          2. it won't be throttling per se; rather, when API calls get rejected, the client will back off. See HADOOP-13904.

          I like your thoughts about HBase; there's no obvious reason why it won't work (though you'd need to persist it somehow). For now, though, DynamoDB is what we target, so we can use something that AWS keeps running. It also helps for dev & test, as we don't need to bring up mini-HBase clusters.

          sameer.chouhdary Sameer Choudhary added a comment -

Makes sense. Thanks! For persistence, the user would have to implement frequent, lossless snapshotting of their metadata store to S3. However, I agree that for most users the DynamoDB-based solution should be sufficient.

          liuml07 Mingliang Liu added a comment -

I propose we merge from trunk again. I have fixed the conflicts, so if you vote it up, I'll simply push.

          commit a434d50fe4547f32de7b1fafb3c370a7123cda2d
          Merge: 8b37b6a96c 02c549484a
          Author: Mingliang Liu <liuml07@apache.org>
          Date:   Thu Feb 16 22:38:55 2017 -0800
          
              Merge branch 'trunk' into HADOOP-13345
          
              After HADOOP-14040, we use shaded aws-sdk uber-JAR so don't have to
              bring DynamoDB dependency explicitly. However, for tests we do need the
              DynamoDBLocal dependency from its Maven repository.
          

I ran the integration tests against us-west-1. Please confirm, as this merge is major. Thanks,

          Failed tests:
            ITestS3AEncryptionSSEC.testCreateFileAndReadWithDifferentEncryptionKey:60 Expected to find 'Forbidden (Service: Amazon S3; Status Code: 403;' but got unexpected exception:java.nio.file.AccessDeniedException: s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: Reopen at position 0 on s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 4E0A3A7A0B2D8005), S3 Extended Request ID: ZKm3w28W57skopifj0wH5p+c8KF1NVzL7ItNG067aK6FNK9dk1kmGrykda/NI4EhtFmN1/bv60c=
          	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:165)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:291)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:374)
          
          Tests in error:
            ITestS3ACredentialsInURL.testInstantiateFromURL:86 » InterruptedIO initTable: ...
            ITestS3AFileSystemContract>FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed:669->FileSystemContractBaseTest.rename:525 » AWSServiceIO
          

For the failing test ITestS3AEncryptionSSEC, I'm not sure it's caused by the merge. ITestS3ACredentialsInURL is known to be unsupported, as credentials in URLs are very unsafe. ITestS3AFileSystemContract#testRenameToDirWithSamePrefixAllowed passed on a second run.

          stevel@apache.org Steve Loughran added a comment -

          ITestS3AEncryptionSSEC is a new test from HADOOP-13075; have a look to see if it is failing for you on trunk & if it does, open a JIRA.

          Maybe we're just translating the exception more strictly.

          I don't think s3guard and credentials in URLs should work together at all, in fact, explicitly refusing to work with them could be extra incentive to stop using it.

          fabbri Aaron Fabbri added a comment -

Once we figure out the SSE test failure, I'm +1 on the merge. It looks like the exception behavior is different in trunk? Or is the trunk test also broken?

          Also, if you still have it handy, could you paste a summary of the test run (CLI command used, number of tests run/errored, etc.)? That would be awesome. I'm trying to make sure this branch is stable.

          liuml07 Mingliang Liu added a comment -

          Thanks for your comments.

          The summary of test report is:

          $ mvn -Dit.test='ITestS3A*' -Dscale -Dtest=none -Ds3guard -Ddynamo -q clean verify
          
          Results :
          
          Failed tests:
            ITestS3AEncryptionSSEC.testCreateFileAndReadWithDifferentEncryptionKey:60 Expected to find 'Forbidden (Service: Amazon S3; Status Code: 403;' but got unexpected exception:java.nio.file.AccessDeniedException: s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: Reopen at position 0 on s3a://mliu-s3guard/test/testCreateFileAndReadWithDifferentEncryptionKey-0800: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 8A23739237751886), S3 Extended Request ID: BEDP2iHUuZXjZTnU/s1f/8+kHM7F+czV2CAGJm3FEpzxxxo37nb+OqbswYsM7vUpWd682RP+4iY=
          	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:165)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:291)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:374)
          	at java.io.DataInputStream.read(DataInputStream.java:149)
          	at org.apache.hadoop.fs.contract.ContractTestUtils.readDataset(ContractTestUtils.java:180)
          	at org.apache.hadoop.fs.contract.ContractTestUtils.verifyFileContents(ContractTestUtils.java:204)
          	at org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEC.lambda$testCreateFileAndReadWithDifferentEncryptionKey$4(ITestS3AEncryptionSSEC.java:80)
          	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:346)
          	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:418)
          	at org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEC.testCreateFileAndReadWithDifferentEncryptionKey(ITestS3AEncryptionSSEC.java:60)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:497)
          	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
          	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
          	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
          	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
          	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
          	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
          	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
          	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
          Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 8A23739237751886), S3 Extended Request ID: BEDP2iHUuZXjZTnU/s1f/8+kHM7F+czV2CAGJm3FEpzxxxo37nb+OqbswYsM7vUpWd682RP+4iY=
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1586)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1254)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:747)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:721)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:704)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:672)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:654)
          	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:518)
          	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4185)
          	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4132)
          	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1373)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:158)
          	... 21 more
          
          
          Tests in error:
            ITestS3ACredentialsInURL.testInstantiateFromURL:86 » InterruptedIO initTable: ...
            ITestS3AFileSystemContract>FileSystemContractBaseTest.testRenameToDirWithSamePrefixAllowed:669->FileSystemContractBaseTest.rename:525 » AWSServiceIO
          
          Tests run: 340, Failures: 1, Errors: 2, Skipped: 13
          

          As we discussed above,

          1. the intermittent failure testRenameToDirWithSamePrefixAllowed is tracked by HADOOP-14036
          2. ITestS3ACredentialsInURL#testInstantiateFromURL is not supported. Should we simply skip this test if the metadata store is enabled (in a separate JIRA)?
          3. ITestS3AEncryptionSSEC started failing after the merge because of the strict exception message assertion; it passes in trunk. The only change was to remove the word "Forbidden", as the message is sometimes "Access Denied" along with the same exception class java.nio.file.AccessDeniedException and message Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;. I made this change as part of the merge.
          fabbri Aaron Fabbri added a comment -

          Thanks Mingliang Liu!

          $ mvn -Dit.test='ITestS3A*' -Dscale -Dtest=none -Ds3guard -Ddynamo -q clean verify

That's ok. It does miss ITestS3Guard{ListConsistency, ToolDynamoDB}, FYI, but you got most of the tests.

          2. ITestS3ACredentialsInURL#testInstantiateFromURL is not supported. Should we simply skip this test if the metadata store is enabled (in a separate JIRA)?

          Yes. Nothing new here and we do need to fix it.

          3. ITestS3AEncryptionSSEC started failing after merge because of the strict exception message assertion; it is fine in trunk. The only change is to remove "Forbidden" word as it would be "Access Denied" sometimes along with the same exception class java.nio.file.AccessDeniedException and message Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;. For this I made the change when merging.

          Sounds ok. I'm curious: what difference in HADOOP-13345 changes this? Is our feature branch's exception behavior different?

          liuml07 Mingliang Liu added a comment -

Thanks, Aaron Fabbri, for promptly reviewing the test report!

          Thats ok.. It does miss ITestS3Guard{ListConsistency, ToolDynamoDB}

That's a good catch. I just learned about these two tests. However, org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency fails both before and after the merge. Do I need to configure something special?

          mvn -Dit.test='ITestS3Guard*' -Dtest=none -Dscale -Ds3guard -Ddynamo -q clean verify
          
          -------------------------------------------------------
           T E S T S
          -------------------------------------------------------
          Running org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
          Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 14.992 sec <<< FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
          testListStatusWriteBack(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  Time elapsed: 13.552 sec  <<< FAILURE!
          java.lang.AssertionError: Unexpected number of results from metastore. Metastore should only know about /XYZ: DirListingMetadata{path=s3a://mliu-s3guard/test/ListStatusWriteBack, listMap={s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ; isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}, s3a://mliu-s3guard/test/ListStatusWriteBack/123=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/123; isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}}, isAuthoritative=false}
          	at org.junit.Assert.fail(Assert.java:88)
          	at org.junit.Assert.assertTrue(Assert.java:41)
          	at org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testListStatusWriteBack(ITestS3GuardListConsistency.java:127)
          

          Curious, what is our difference in HADOOP-13345 that changes this? Is our feature branch exception behavior different?

We don't really change anything in that part. I guess the reason is that, when S3Guard is enabled, the failing code path in S3AFileSystem changes for that test somehow. For example (to be confirmed): without S3Guard, the request failed in getFileStatus() with an access-denied exception containing the "Forbidden" keyword; with S3Guard, getFileStatus() succeeds and the failure happens later, in the read operations, with an access-denied exception containing the "Access Denied" keyword. So I think relaxing the exception message assertion in the test should work just fine.

          fabbri Aaron Fabbri added a comment -

Sorry for the delayed response.

          org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency fails before/after merge. Do I need to configure something special?

          You should not have to.

          Strange, it has been working for me. Could be a difference in the tables we use. I will run that test on the latest code now and see what happens. Could be related to HADOOP-14096, which just got fixed.

          We don't really change anything in that part. I guess the reason is that, when enabling S3Guard, the code path that fails in S3AFileSystem changes for that test somehow.

          Ok, I'm wondering whether that fix should be a separate commit instead of modifying the merge commit. Maybe ping Steve Loughran for his opinion.

          liuml07 Mingliang Liu added a comment - - edited

Hi Aaron, I updated from the feature branch but still get the following error. It is the same with and without the merge from trunk (not committed yet), so I assume it's not related to the merge. I'll have a look at the test. Thanks,

          mvn -Dit.test='ITestS3GuardListConsistency' -Dtest=none -Dscale -Ds3guard -Ddynamo -q clean verify
          
          -------------------------------------------------------
           T E S T S
          -------------------------------------------------------
          Running org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
          Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.544 sec <<< FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
          testListStatusWriteBack(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  Time elapsed: 3.147 sec  <<< FAILURE!
          java.lang.AssertionError: Unexpected number of results from metastore. Metastore should only know about /XYZ: DirListingMetadata{path=s3a://mliu-s3guard/test/ListStatusWriteBack, listMap={s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ; isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}, s3a://mliu-s3guard/test/ListStatusWriteBack/123=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/123; isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}}, isAuthoritative=false}
          	at org.junit.Assert.fail(Assert.java:88)
          	at org.junit.Assert.assertTrue(Assert.java:41)
          	at org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testListStatusWriteBack(ITestS3GuardListConsistency.java:127)
          

          I'm wondering if that fix should be a separate commit instead of modifying the merge commit?

That makes sense. I'll file a separate JIRA to track this and submit a patch to fix it (unless Steve objects). The merge commit should be simple, small, and clear if possible.
          UPDATE: filed HADOOP-14102. Thanks,

          stevel@apache.org Steve Loughran added a comment -

          I'm trying to run the tests against S3 Frankfurt, and failing because I have an older table there which I'll have to delete. Here's the fun part: because the version check is in DynamoDBMetadataStore.initialize(), and that gets called by the CLI destroy command before metastore.destroy() can be called, there's currently no way to destroy a table with an incompatible s3guard version from the CLI.

          How about making that version check a (secret) internal config option, and having the destroy operation set that option so that the init code skips the version check?
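
          The suggested escape hatch might look roughly like the following sketch. This is illustrative only: the flag name, class, and method signatures here are hypothetical stand-ins, not the actual S3A/S3Guard API.

          ```python
          # Sketch of the proposed workaround: an internal config flag that lets
          # the CLI "destroy" command bypass the version check in initialize(),
          # so tables with an incompatible version can still be deleted.

          SKIP_VERSION_CHECK = "fs.s3a.s3guard.ddb.internal.skip.version.check"  # hypothetical key

          class VersionMismatchError(Exception):
              pass

          class DynamoDBMetadataStoreSketch:
              EXPECTED_VERSION = 100

              def __init__(self, conf, table_version):
                  self.conf = conf
                  self.table_version = table_version

              def initialize(self):
                  # The version check runs here, unless the internal flag is set.
                  if not self.conf.get(SKIP_VERSION_CHECK, False):
                      if self.table_version != self.EXPECTED_VERSION:
                          raise VersionMismatchError(
                              f"table version {self.table_version}, "
                              f"expected {self.EXPECTED_VERSION}")

              def destroy(self):
                  return "table deleted"

          def cli_destroy(conf, table_version):
              # The destroy command sets the secret flag before init, so that
              # initialize() no longer refuses incompatible tables.
              conf[SKIP_VERSION_CHECK] = True
              store = DynamoDBMetadataStoreSketch(conf, table_version)
              store.initialize()
              return store.destroy()
          ```

          With this shape, a plain initialize() against an old table still fails fast, while cli_destroy() on the same table succeeds.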

          stevel@apache.org Steve Loughran added a comment -

          Also, as tables are global & shared with others, how about we add to the version marker the username and timestamp of creation, along with an optional message.

          liuml07 Mingliang Liu added a comment -

          Thanks Steve Loughran for testing and commenting here. I plan to commit the merge-from-trunk and file new JIRA for making ITestS3GuardListConsistency stable. The improvements to version marker/check can be a separate JIRA as well. If Aaron Fabbri thinks differently, we can also address these before merging.

          fabbri Aaron Fabbri added a comment -

          Works for me, thanks Mingliang Liu

          stevel@apache.org Steve Loughran added a comment -

          merged trunk in again; test with -Ds3guard all tests worked except that intermittent root dir test (which is really annoying me; I think I may just skip unless I can fix it)

          stevel@apache.org Steve Loughran added a comment -

          I've been thinking a bit about DDB table names in a large organisation. What we've done today (table name == bucket name) works for us developers, but I'm not so sure it will work in large orgs. Even in house I can see a trend for developers like rajesh to use their names in tables to help assign ownership.

          What about the prefix "s3guard+" on all buckets unless otherwise stated? That way, people looking at costs of AWS accounts can see which costs are due to s3guard, without having to look into the tables, or search for matching buckets of the same name...

          liuml07 Mingliang Liu added a comment -

          Easy and useful change. According to:

          DDB - TableName
          The name of the table to create.

          Type: String

          Length Constraints: Minimum length of 3. Maximum length of 255.

          Pattern:

          [a-zA-Z0-9_.-]+

          Required: Yes

          And

          S3: Rules for Bucket Naming:
          Bucket names must be at least 3 and no more than 63 characters long.
          Bucket names must be a series of one or more labels. Adjacent labels are separated by a single period (.). Bucket names can contain lowercase letters, numbers, and hyphens. Each label must start and end with a lowercase letter or a number.
          Bucket names must not be formatted as an IP address (e.g., 192.168.5.4).
          When using virtual hosted–style buckets with SSL, the SSL wildcard certificate only matches buckets that do not contain periods. To work around this, use HTTP or write your own certificate verification logic. We recommend that you do not use periods (".") in bucket names.

          S3Guard_ or S3Guard- works well as a prefix.

          Alternatively, we can add tags for DDB tables. See http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_TagResource.html
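
          As an aside, the quoted TableName pattern is what rules out the earlier "s3guard+" suggestion: '+' is not in [a-zA-Z0-9_.-]. A quick illustrative check of deriving a prefixed table name from a bucket name (the helper name is hypothetical, not part of S3Guard):

          ```python
          import re

          # DynamoDB TableName rules quoted above: 3-255 characters,
          # pattern [a-zA-Z0-9_.-]+.
          DDB_TABLE_NAME = re.compile(r"^[a-zA-Z0-9_.-]{3,255}$")

          def s3guard_table_name(bucket):
              # Valid S3 bucket names (3-63 chars of lowercase letters, digits,
              # hyphens, periods) are a subset of the allowed table characters,
              # so an 8-char "s3guard-" prefix always yields a legal table name.
              name = "s3guard-" + bucket
              if not DDB_TABLE_NAME.match(name):
                  raise ValueError(f"invalid DynamoDB table name: {name!r}")
              return name
          ```

          For example, s3guard_table_name("mliu-s3guard") yields "s3guard-mliu-s3guard", while any name containing '+' is rejected by the pattern.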

          fabbri Aaron Fabbri added a comment -

          What about the prefix "s3guard+" on all buckets unless otherwise stated?

          We generally try to discourage the per-bucket DDB table naming, as customers with jobs that access multiple buckets can end up with a slew of DDB tables, and it is a waste to provision them each for peak load.

          We generally try to do one DDB table per cluster, with optional sharing between clusters. I'm leaning towards a default name "s3guard-metadata" which is populated above Hadoop (i.e. by CM or Ambari).

          Personally I'd suggest not bothering too much with the per-bucket case.

          liuml07 Mingliang Liu added a comment -

          Hi all, I'll merge from trunk again in 24 hours for the latest conflicting changes in FileSystemContractBaseTest. I will commit if no tests fail. Thanks,

          liuml07 Mingliang Liu added a comment -
          $ mvn -Dit.test='ITestS3A*, ITestS3Guard*' -Dtest=none -Dscale -Ds3guard -Ddynamo -q clean verify
          
          Results :
          
          Tests run: 348, Failures: 0, Errors: 0, Skipped: 16
          

          Merge happened.

          liuml07 Mingliang Liu added a comment -

          I'm expecting some conflicts because of HADOOP-14135 and HADOOP-14248. Also there are some improvements meaningful to S3Guard like HADOOP-14255 and HADOOP-14247. Let's issue another merge from trunk after these are resolved.

          liuml07 Mingliang Liu added a comment -

          As all the four recent changes in trunk are committed, I'll merge from trunk again. I'll run the integration tests again before that.

          liuml07 Mingliang Liu added a comment - - edited
          $ mvn -Dit.test='ITestS3A*,ITestS3Guard*,ITestDynamo*' -Dtest=none -Dscale -Ds3guard -Ddynamo -q clean verify
          
          Results :
          
          Tests run: 364, Failures: 0, Errors: 0, Skipped: 16
          

          Merge happened.

          stevel@apache.org Steve Loughran added a comment -

          Broke my test runs. HADOOP-14216 is the cause. Workaround detailed on that now-reopened JIRA.

          stevel@apache.org Steve Loughran added a comment - - edited

          HADOOP-14432 adds tests and robustness for copyFromLocalFile(false, true, dst, dst). The tests will verify that s3guard doesn't cause regressions

          (it does, BTW, but the tests will ensure it stays that way, as it adds the coverage)

          stevel@apache.org Steve Loughran added a comment -

          Think I may have hit my first inconsistency BTW, in a localdb s3guard test. open() worked, but the first attempt to read the file triggered the FNFE.

          As discussed on HADOOP-14303; we should consider what our retry policy is. Here I think 404 -> fail fast.

          + S3AInputStream retries on some non-recoverable events, as it does one extra attempt on any exception. This can lead to 404s triggering a retry rather than failing fast.

          testSequentialRead(org.apache.hadoop.fs.contract.s3a.ITestS3AContractOpen)  Time elapsed: 1.221 sec  <<< ERROR!
          java.io.FileNotFoundException: Reopen at position 0 on s3a://hwdev-steve-ireland-new/fork-0007/test/testsequentialread.txt: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 8D81F218D02DE21E), S3 Extended Request ID: aXUWP6yYGSsP9ofVawyIteGZWBmkNTFjmRCvwAR1KyJmtR0A6H6UOggE4OlYB2ZOJ99F3MV74fU=
          	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:166)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:165)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.onReadFailure(S3AInputStream.java:348)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:321)
          	at java.io.FilterInputStream.read(FilterInputStream.java:83)
          	at org.apache.hadoop.fs.contract.AbstractContractOpenTest.testSequentialRead(AbstractContractOpenTest.java:156)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
          	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
          	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
          	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
          	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
          	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
          	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
          	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
          Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 8D81F218D02DE21E)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1586)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1254)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:747)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:721)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:704)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:672)
          	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:654)
          	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:518)
          	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4185)
          	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4132)
          	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1373)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:158)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.onReadFailure(S3AInputStream.java:348)
          	at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:321)
          	at java.io.FilterInputStream.read(FilterInputStream.java:83)
          	at org.apache.hadoop.fs.contract.AbstractContractOpenTest.testSequentialRead(AbstractContractOpenTest.java:156)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          	at java.lang.reflect.Method.invoke(Method.java:498)
          	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
          	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
          	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
          	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
          	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
          	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
          	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
          	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
          
          fabbri Aaron Fabbri added a comment -

          Very interesting Steve Loughran, thanks for sharing. I've heard that S3 GET is supposed to be consistent, except maybe after a previous negative GET. So, I'm trying to understand if that is the case. I suppose we naturally have a negative GET preceding the S3 object creation, where S3AFileSystem#create() does a getFileStatus() to see if the file already exists... So we have

          • Create test file:
            GET -> 404 (existence check)
            PUT ...
            S3Guard: Record (path, metadata)
          • Read test file:
            S3Guard -> Yes, file exists (short-circuit getFileStatus())
            GET -> 404 (eventual consistency)

          The simple solution would be to add a bit of plumbing into the InputStream so it knows that "the file should exist" and thus 404 should be subject to a retry policy. That bit would be set when we get a hit from the MetadataStore's get(). I'm not sure we'd ever want to retry in other cases, as it slows down applications that may just be trying to confirm a file does not exist.
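
          The plumbing described above can be sketched as follows. This is a minimal illustration of the idea only, with hypothetical names; the real S3AInputStream and MetadataStore interfaces differ.

          ```python
          import time

          class NotFound(Exception):
              """Stand-in for a 404 / FileNotFoundException from S3."""
              pass

          def read_object(s3_get, known_to_exist, retries=3, delay=0.0):
              # Retry a 404 only when the MetadataStore said the file exists
              # (known_to_exist=True, i.e. a hit from the store's get()),
              # otherwise fail fast so existence probes stay cheap.
              attempts = retries + 1 if known_to_exist else 1
              for attempt in range(attempts):
                  try:
                      return s3_get()
                  except NotFound:
                      if attempt == attempts - 1:
                          raise
                      # Back off and retry through the eventual-consistency window.
                      time.sleep(delay)
          ```

          With a GET that 404s once before succeeding, known_to_exist=True rides out the inconsistency, while known_to_exist=False surfaces the NotFound on the first attempt.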

          stevel@apache.org Steve Loughran added a comment -

          This is a read pipeline. What I think has happened is the client did open(), and s3guard skipped the existence check as DDB said it was there (and how long it was). The HTTP stream isn't set up in open(); it relies on the HEAD to have done the check first (a getFileStatus() is called to verify the path isn't a dir; if the path isn't there it fails). (Note we could do a simpler check without the LIST call in the dir scan.)

          Because with s3Guard the HEAD request is skipped, it's only on the first seek that an attempt is made to GET the file contents. No file, error. There's nothing wrong with that per-se, it just means that if s3guard is inconsistent with the store, things show up later.

          1. Could this be reported? E.g. when an FNFE is raised when opening a stream on an s3guarded bucket, warn the user this may be an inconsistency.
          2. S3AInputStream relies on the file length being normative (see calculateRequestLimit). If DDB thinks there is less data than there is, the extra data isn't picked up. You won't be able to seek past the amount of data that s3guard thinks is in the file, even if there is now more.

          We may want to have s3guard in non-auth mode do the HEAD on the final entry for that fail-fast and to get the length. (Side topic: if we do that, and note the length is different, what to do in s3guard itself?) This could be done in the S3A input stream: if fadvise=normal it could start with a full GET of the file and pick up the content-length there. It's for the seek-optimised random IO that we'd want to postpone the GET until the first readFully(), and limit its length to something shorter.

          fabbri Aaron Fabbri added a comment -

          This is a read pipeline.

          Ah, I read that wrong, sorry.

          1. Could this be reported? E.g. when an FNFE is raised when opening a stream on an s3guarded bucket, warn the user this may be an inconsistency.

          For now, this sounds reasonable.

          2. S3AInputStream relies on the file length being normative (see calculateRequestLimit). If DDB thinks there is less data than there is, the extra data isn't picked up. You won't be able to seek past the amount of data that s3guard thinks is in the file, even if there is now more.

          I can't think of any normal cases off the top of my head where the MetadataStore length would be wrong (can you?). Still, this is a good point on side-effects of skipping S3 for the getObjectMetadata().

          We may want to have s3guard in non-auth mode do the HEAD on the final entry for that failfast and to get the length.

          Yes. I also think we should add a new config flag for this behavior: Leave fs.s3a.metadatastore.authoritative to be for listings, add a new fs.s3a.metadatastore.getfilestatus.authoritative for this case. That way you can still get the same behavior we have today (which is useful IMO).

          (side topic: if we do that, and note the length is different, what to do in s3guard itself?).

          "Correct" thing to do is go into a retry policy until there is consensus. And we should really be doing the dynamo and s3 requests async (in parallel) so the round trips can overlap.

          fabbri Aaron Fabbri added a comment -

          FYI Steve Loughran, I created two JIRAs for your suggestions here: HADOOP-14467, and HADOOP-14468.

          liuml07 Mingliang Liu added a comment -

          I'll do a new merge from trunk today. I merged on my local machine and have almost finished all the integration tests. Unless there are any objections or concerns, I'll push the merge after I post a clean test report by the end of the day.

          liuml07 Mingliang Liu added a comment -

          Mostly clean integration tests.

          1. Running w/o S3Guard: mvn -Dit.test='ITestS3A*' -Dtest=none -Dscale -q clean verify. All test cases pass.
          2. Running against the DynamoDB web service (us-west-1 region): mvn -Dit.test='ITestS3A*,ITestS3Guard*,ITestDynamo*' -Dtest=none -Ds3guard -Ddynamo -q verify. Only one test failure, ITestS3AEncryptionSSEC, which has already been identified and reported in HADOOP-14448.
          3. Running with DynamoDB Local (the in-memory DDB simulator for tests): mvn -Dit.test='ITestS3A*,ITestS3Guard*,ITestDynamo*' -Dtest=none -Ds3guard -Ddynamodblocal -q verify. As above, only the one ITestS3AEncryptionSSEC failure (HADOOP-14448).
          4. Running with Local mode (in-memory metadata store),
            $ mvn -Dit.test='ITestS3A*,ITestS3Guard*,ITestDynamo*' -Dtest=none -Ds3guard -Dlocal -q verify
            Results :
            
            Tests run: 390, Failures: 0, Errors: 0, Skipped: 55
            
          fabbri Aaron Fabbri added a comment -

          Sounds good to me Mingliang Liu. Thank you for doing the merge.

          stevel@apache.org Steve Loughran added a comment -

          I'm getting ITestS3GuardListConsistency hanging when I run with dynamo or localdynamo. Anyone else?

          fabbri Aaron Fabbri added a comment -

          I'll test it in a moment. How long did you wait? I thought someone increased the visibility delay for the inconsistent S3 client, and IIRC the test waits 2x that long in some cases.

          fabbri Aaron Fabbri added a comment -

          Not hanging for me, but it took about 8 1/2 minutes to complete.

          stevel@apache.org Steve Loughran added a comment -

          OK, it took so long that I assumed it was a failure. That is too long for a test run; it is hurting overall test execution time.


            People

            • Assignee: Chris Nauroth
            • Reporter: Chris Nauroth
            • Votes: 6
            • Watchers: 70