Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: HADOOP-13345
    • Component/s: fs/s3
    • Labels: None

    Description

      Provide Hadoop metrics showing operational details of the S3Guard implementation.

      The following metrics will be implemented in this ticket:

      • S3GuardRechecksNthPercentileLatency (MutableQuantiles) - Percentile time spent
      in rechecks attempting to achieve consistency. Repeated for multiple percentile values
      of N. This metric is an indicator of the additional latency cost of running S3A with
      S3Guard.
      • S3GuardRechecksNumOps (MutableQuantiles) - Number of times a consistency
      recheck was required while attempting to achieve consistency.
      • S3GuardStoreNthPercentileLatency (MutableQuantiles) - Percentile time spent in
      operations against the consistent store, including both write operations during file system
      mutations and read operations during file system consistency checks. Repeated for
      multiple percentile values of N. This metric is an indicator of latency to the consistent
      store implementation.
      • S3GuardConsistencyStoreNumOps (MutableQuantiles) - Number of operations
      against the consistent store, including both write operations during file system mutations
      and read operations during file system consistency checks.
      • S3GuardConsistencyStoreFailures (MutableCounterLong) - Number of failures
      during operations against the consistent store implementation.
      • S3GuardConsistencyStoreTimeouts (MutableCounterLong) - Number of timeouts
      during operations against the consistent store implementation.
      • S3GuardInconsistencies (MutableCounterLong) - Count of times S3Guard failed to
      achieve consistency, even after exhausting all rechecks. A high count may indicate
      unexpected out-of-band modification of the S3 bucket contents, such as by an external
      tool that does not make corresponding updates to the consistent store.
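
      For orientation, here is a minimal sketch of how metrics of these two types are
      typically created with Hadoop's metrics2 library; the registry name, metric names,
      and rollover interval are illustrative, not the final implementation.

        import org.apache.hadoop.metrics2.lib.MetricsRegistry;
        import org.apache.hadoop.metrics2.lib.MutableCounterLong;
        import org.apache.hadoop.metrics2.lib.MutableQuantiles;

        public class S3GuardMetricsSketch {
          private final MetricsRegistry registry = new MetricsRegistry("S3Guard");

          // MutableQuantiles publishes NumOps plus N-th percentile values
          // over a rolling window (here 60 seconds).
          private final MutableQuantiles recheckLatency = registry.newQuantiles(
              "s3GuardRechecks", "time spent in consistency rechecks",
              "ops", "latency", 60);

          // MutableCounterLong is a simple monotonically increasing counter.
          private final MutableCounterLong inconsistencies = registry.newCounter(
              "s3GuardInconsistencies",
              "rechecks exhausted without achieving consistency", 0L);

          void recordRecheck(long latencyMillis) {
            recheckLatency.add(latencyMillis); // updates NumOps and percentiles
          }

          void recordInconsistency() {
            inconsistencies.incr();
          }
        }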

      Attachments

        1. HADOOP-13453-HADOOP-13345-001.patch
          6 kB
          Ai Deng
        2. HADOOP-13453-HADOOP-13345-002.patch
          13 kB
          Ai Deng
        3. HADOOP-13453-HADOOP-13345-003.patch
          18 kB
          Ai Deng
        4. HADOOP-13453-HADOOP-13345-004.patch
          11 kB
          Steve Loughran
        5. HADOOP-13453-HADOOP-13345-005.patch
          12 kB
          Steve Loughran

        Activity

          stevel@apache.org Steve Loughran added a comment -
          1. These should all go into org.apache.hadoop.fs.s3a.Statistic & S3AInstrumentation.
          2. Tracking total & ongoing DynamoDB request rates could be useful, as it will help identify when you've over/under-provisioned your DDB.
          3. Include stats on detected inconsistencies.
          4. Include them in S3AFileSystem.toString().
          aid129 Ai Deng added a comment -

          Hello stevel@apache.org. Actually, since all the metrics in this story are sent by the new S3Guard implementation, I think it may be better to keep the new metrics code separate from S3AInstrumentation. The current S3AInstrumentation is already 800 lines, and separating the two would make it easier to disable the S3Guard metrics. But I'm not sure how much of the code in S3AInstrumentation is reusable for the new metrics.

          Also, I can't find any tests for S3AInstrumentation; how do we test the metrics system in Hadoop?

          Sorry for the basic question, I'm really new to working on the Hadoop code base.

          stevel@apache.org Steve Loughran added a comment -

          They're going to have to go into that file, because those are the metrics published by the S3A filesystem when deployed, returned by S3AStorageStatistics in a call to S3AFileSystem.getStorageStatistics(), and printed in S3AFileSystem.toString(). We could choose whether to add the specific metrics to every S3A FS instance; that's something to consider. Listing the values but returning 0 for all gauges and counters is the most consistent.

          Don't worry about the class length: if you look at it in detail, there are two nested classes plus support methods explicitly for output/input streams... you don't need to go there. The rest of the code is fairly simple:

          1. Add new values to org.apache.hadoop.fs.s3a.Statistic, with the prefix s3guard_.
          2. In S3AInstrumentation, add counters to the array COUNTERS_TO_CREATE and gauges to GAUGES_TO_CREATE.
          3. Pass an instance of the instrumentation down to S3Guard.
          4. Have the code call incrementCounter and increment/decrementGauge as appropriate.
          5. I'd like a simple counter of s3guard_enabled and s3guard_authoritative, which will be 0 when there's no S3Guard running and 1 when the respective booleans are set. Why? Remote visibility.
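
          A self-contained sketch of the pattern behind steps 1-4; the enum value and the plumbing here are illustrative, not the committed code:

            import java.util.EnumMap;
            import java.util.Map;

            class StatisticSketch {

              // Step 1: a new value in the Statistic enum, with the s3guard_ prefix.
              enum Statistic {
                S3GUARD_METADATASTORE_PUT_PATH_REQUEST(
                    "s3guard_metadatastore_put_path_request",
                    "S3Guard metadata store put path request");

                private final String symbol;
                private final String description;

                Statistic(String symbol, String description) {
                  this.symbol = symbol;
                  this.description = description;
                }
              }

              // Step 2: the instrumentation creates one counter per listed Statistic.
              static class Instrumentation {
                private final Map<Statistic, Long> counters =
                    new EnumMap<>(Statistic.class);

                // Step 4: S3Guard, handed this instance (step 3), bumps a counter.
                void incrementCounter(Statistic stat, long count) {
                  counters.merge(stat, count, Long::sum);
                }
              }
            }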

          You make a good point: "where are the tests?". The answer is that the metrics can be used to test the internal state of the S3A classes, and therefore become implicitly tested there.

          Take a look at ITestS3ADirectoryPerformance for a key example of this: our test cases use the counters of the various HTTP operations as the means to verify that API calls work as expected. (Note that S3Guard, by reducing these, has complicated the tests.)

          That is, you verify the counters work by asserting that they change as you run operations against the filesystem. See http://steveloughran.blogspot.co.uk/2016/04/distributed-testing-making-use-of.html for more of my thinking here.
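
          A hedged sketch of that pattern, assuming JUnit 4 and initialized fs/testPath fixtures; the statistic key is illustrative:

            // Snapshot a counter, run the operation, assert the expected delta.
            @Test
            public void testListStatusIncrementsListCounter() throws Exception {
              long before = fs.getStorageStatistics()
                  .getLong("object_list_requests");   // key name is illustrative
              fs.listStatus(testPath);
              long after = fs.getStorageStatistics()
                  .getLong("object_list_requests");
              Assert.assertEquals("list requests", before + 1, after);
            }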

          "Sorry for the basic question, I'm really new to working on the Hadoop code base."

          Happy to explain my reasoning. We've all started off staring at a vast amount of code that we don't understand; there are still big bits of Hadoop that I don't go near.

          aid129 Ai Deng added a comment - edited

          steve_l Thank you very much for the explanation; that's very helpful.

          I have two questions for the moment; I'm sure there are more to come.

          1. I see two patterns for changing a counter value in S3AInstrumentation: call a dedicated method like fileCreated(), or pass a Statistic to the generic method incrementCounter(). Is there a reason we keep both? It looks like you're suggesting the second approach.
          2. I can't find any usage of S3AFileSystem.getStorageStatistics() in the project; what is the main purpose of these statistics? Are they for use outside of Hadoop? Do I need to pass an instance of storageStatistics to S3Guard? In S3AFileSystem, we always increment both:
              protected void incrementStatistic(Statistic statistic, long count) {
                instrumentation.incrementCounter(statistic, count);
                storageStatistics.incrementCounter(statistic, count);
              }
            
          stevel@apache.org Steve Loughran added a comment -

          Hi, don't worry about asking questions; we'll do our best to get you contributing code. It benefits all of us if you are adding code to Hadoop.

          On the split between the low-level "increment a named counter" calls and the more elegant events with internal counters: the event methods are cleaner, as they stop the rest of the code having to know exactly which counters/gauges to use. Consider the event methods the best approach, and the direct invocations us being lazy.

          The S3AInstrumentation class also has a set of explicitly named counters ("filesDeleted") as well as lots of counters that are only listed in the arrays GAUGES_TO_CREATE and COUNTERS_TO_CREATE. That's evolution over time; I got bored of having to name and register lots of fields, and realised I could do it from the arrays, at the cost of a hash lookup on every increment.

          Outside the S3A class itself, I've tried to have external inner classes do the counting, with the results merged in at the end (example: the input and output streams), and with the inner classes using simple long values rather than atomics. Why? It eliminates any delays during increments, and lets us override the toString() values for input/output streams with dumps of the values (go on, try it!). We can have many input/output streams per FS instance, so the risk of contention on atomic int/long values is potentially quite high.

          I think for S3Guard we could add a new inner class passed in to each S3Guard instance; it would export the various methods for events that S3Guard could raise, such as tableCreated() and tableDeleted(). These can directly increment the atomic counters in the instrumentation, as we'd only have a 1:1 mapping of S3A FS instance to S3Guard store instance.
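
          A rough sketch of that inner-class idea; the class and method names follow this comment and are hypothetical:

            import java.util.concurrent.atomic.AtomicLong;

            public class S3AInstrumentationSketch {
              private final AtomicLong tablesCreated = new AtomicLong();
              private final AtomicLong tablesDeleted = new AtomicLong();

              /** One instance handed to the single S3Guard store of each FS instance. */
              public class S3GuardInstrumentation {
                public void tableCreated() { tablesCreated.incrementAndGet(); }
                public void tableDeleted() { tablesDeleted.incrementAndGet(); }
              }

              public S3GuardInstrumentation createS3GuardInstrumentation() {
                return new S3GuardInstrumentation();
              }
            }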

          Regarding access to the statistics: that's hooked up to FileSystem.getStorageStatistics(), which is intended to provide the storage stats for any FS; S3A and HDFS share common statistic names for the common statistics. The latest versions of Tez do collect the statistics of jobs, and so give you the aggregate statistics across your entire query. Until now, only FileSystem.getStatistics() has been used, which returns a fixed set of values (bytes read/written, etc.). Spark still only collects those; it'd take some migration to Hadoop 2.8+ to pick up the new data. Until then, it's something we can use in tests.

          aid129 Ai Deng added a comment -

          Steve, thank you for sharing this knowledge and your thoughts. Having an inner class for the S3Guard metrics is a good idea.

          I have made a start with all your help, but I will be on holiday for the next two weeks (back to China for the new year). I really hope I can resolve this ticket (I can work on it more quickly after the holiday), but if the timing doesn't match the plan for HADOOP-13345, please assign this ticket to someone else so it can be finished in time.

          I will try to catch up with you while in China.

          stevel@apache.org Steve Loughran added a comment -

          Don't worry about being on holiday for the next few weeks; a fair few people have gone off to enjoy themselves. Take a break from your emails and enjoy yourself!

          fabbri Aaron Fabbri added a comment -

          I will comment as I see new places in the code that could use metrics:

          • HADOOP-13904 (if it gets committed) in DynamoDBMetadataStore#retryBackoff()
          aid129 Ai Deng added a comment -

          fabbri Thanks.
          stevel@apache.org I just made a simple change; could you please check it? (WIP patch.) I just want to make sure I'm going about things the right way. Thanks.

          stevel@apache.org Steve Loughran added a comment -

          Thanks. I've hit the submit-patch button, but Jenkins will fail, as the patch isn't going to apply.

          The way Hadoop builds work, you need to include the branch name in the patch filename if it's not trunk, here:

          HADOOP-13453-HADOOP-13345-001.patch
          

          Try that with the existing code and you should have the machines review the patch (which we rely on to do the basic checks).

          For object store tests we also require the submitter to declare which S3 infrastructure they tested against, because Jenkins doesn't run those tests. Here are the details: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md

          Patch-wise, the base design looks good: you've added a new statistic and you're passing the instrumentation down to update it.

          We just need to think of all the statistics to collect.

          Thanks for doing this.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          -1 patch 0m 7s HADOOP-13453 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



          Subsystem Report/Notes
          JIRA Issue HADOOP-13453
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12853617/HADOOP-13453.wip-01.patch
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/11689/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 13m 5s HADOOP-13345 passed
          +1 compile 0m 24s HADOOP-13345 passed
          +1 checkstyle 0m 16s HADOOP-13345 passed
          +1 mvnsite 0m 27s HADOOP-13345 passed
          +1 mvneclipse 0m 22s HADOOP-13345 passed
          +1 findbugs 0m 33s HADOOP-13345 passed
          +1 javadoc 0m 16s HADOOP-13345 passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 19s the patch passed
          +1 javac 0m 19s the patch passed
          +1 checkstyle 0m 12s the patch passed
          +1 mvnsite 0m 22s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 36s the patch passed
          +1 javadoc 0m 12s the patch passed
          +1 unit 0m 36s hadoop-aws in the patch passed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          20m 9s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HADOOP-13453
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12854675/HADOOP-13453-HADOOP-13345-001.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 929c7a818845 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision HADOOP-13345 / 95e0143
          Default Java 1.8.0_121
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/11717/testReport/
          modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/11717/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 14m 1s HADOOP-13345 passed
          +1 compile 0m 19s HADOOP-13345 passed
          +1 checkstyle 0m 15s HADOOP-13345 passed
          +1 mvnsite 0m 21s HADOOP-13345 passed
          +1 mvneclipse 0m 22s HADOOP-13345 passed
          +1 findbugs 0m 28s HADOOP-13345 passed
          +1 javadoc 0m 14s HADOOP-13345 passed
          +1 mvninstall 0m 17s the patch passed
          +1 compile 0m 17s the patch passed
          +1 javac 0m 17s the patch passed
          +1 checkstyle 0m 11s the patch passed
          +1 mvnsite 0m 18s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 32s the patch passed
          +1 javadoc 0m 11s the patch passed
          +1 unit 0m 35s hadoop-aws in the patch passed.
          +1 asflicense 0m 17s The patch does not generate ASF License warnings.
          20m 25s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HADOOP-13453
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12854676/HADOOP-13453-HADOOP-13345-002.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 72388b8af098 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision HADOOP-13345 / 95e0143
          Default Java 1.8.0_121
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/11718/testReport/
          modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/11718/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          aid129 Ai Deng added a comment -

          Hi stevel@apache.org, thank you for the information. I will keep adding more metrics (still at the S3Guard class level).

          Regarding the metrics mentioned in the document (I have copied them into this JIRA ticket): as we consider the metadata in the store to be "fresher" and use it first, we don't do any rechecks (for inconsistencies between S3 and the metadata store), right? So the metrics "S3GuardRechecksNthPercentileLatency", "S3GuardRechecksNumOps" and "S3GuardInconsistencies" are no longer needed.

          For the Jenkins build, I can't find the "Submit" button in JIRA; is that because of my user permissions, or am I missing something?

          aid129 Ai Deng added a comment -

          Looks like Jenkins runs automatically for the patch. I will modify the existing S3Guard test scenarios to test the metrics I've added.

          stevel@apache.org Steve Loughran added a comment -

          I'm afraid HADOOP-13914 has just broken the patch, which means, sadly, you get to do the merge. Let's get this in before anything else traumatic comes in, so other patches get to suffer next time.

          I like what you've done measuring latency as well as counts; I think we could actually do this more broadly. The timing should be recorded in a finally clause though, so timings for failures get included too. (Side issue: should we count successes and failures separately, with different timings?)
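
          A sketch of that finally-clause timing; the instrumentation method and the statistic name here are assumptions for illustration:

            void putWithTiming(MetadataStore store, PathMetadata meta) throws IOException {
              long startNanos = System.nanoTime();
              try {
                store.put(meta);
              } finally {
                // Recorded in finally so failed calls contribute to the latency too.
                long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000;
                instrumentation.addValueToQuantiles(
                    Statistic.S3GUARD_METADATASTORE_PUT_PATH_LATENCY, elapsedMillis);
              }
            }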

          I would like to think about how we could avoid having to pass the instrumentation around all the time. Ideally, we could just pass it in as a constructor argument to the metadata store. Alternatively, that store could collect metrics and we could wire it up, but I don't see an easy way to do that in Hadoop metrics (compared to Coda Hale's). The easiest would be just to pass the S3AInstrumentation (or an inner class) down, but currently the metastore interface is not specific to S3A.

          If we add an interface for metadata store instrumentation, then S3AInstrumentation can implement it in an inner class, and the S3A FS can pass it down during initialization. This would let the metastore do all it wants, with well-defined strings, of course.
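
          One possible shape for such an interface; the names are hypothetical, and the real design would be settled here:

            /** Narrow callback interface seen by the metadata store; no S3A types leak in. */
            public interface MetadataStoreInstrumentation {
              void initialized();
              void storeClosed();
              void putLatency(long millis);
              void putFailed();
            }

          S3AInstrumentation would implement this in an inner class mapping each callback onto the corresponding counter or quantile, and S3AFileSystem would hand that instance to the store during initialization.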

          What do people think?

          aid129 Ai Deng added a comment -

          Hi stevel@apache.org, I have added a new patch following your suggestion. If it looks OK, can we discuss the metrics we want to add?

          I came up with this list of operation and latency metrics for this ticket; can you check whether I missed anything? Thank you.

          Status:
          S3GUARD_METADATASTORE_ENABLED
          S3GUARD_METADATASTORE_IS_AUTHORITATIVE
          Operations:
          S3GUARD_METADATASTORE_INITIALIZATION
          S3GUARD_METADATASTORE_DELETE_PATH
          S3GUARD_METADATASTORE_DELETE_PATH_LATENCY
          S3GUARD_METADATASTORE_DELETE_SUBTREE_PATH
          S3GUARD_METADATASTORE_GET_PATH
          S3GUARD_METADATASTORE_GET_PATH_LATENCY
          S3GUARD_METADATASTORE_GET_CHILDREN_PATH
          S3GUARD_METADATASTORE_GET_CHILDREN_PATH_LATENCY
          S3GUARD_METADATASTORE_MOVE_PATH
          S3GUARD_METADATASTORE_PUT_PATH
          S3GUARD_METADATASTORE_PUT_PATH_LATENCY
          S3GUARD_METADATASTORE_CLOSE
          S3GUARD_METADATASTORE_DESTROY
          From S3Guard:
          S3GUARD_METADATASTORE_MERGE_DIRECTORY
          For the failures:
          S3GUARD_METADATASTORE_DELETE_FAILURE
          S3GUARD_METADATASTORE_GET_FAILURE
          S3GUARD_METADATASTORE_PUT_FAILURE
          Etc:
          S3GUARD_METADATASTORE_PUT_RETRY_TIMES

          aid129 Ai Deng added a comment -

          I think measuring the number of paths that have been operated on (put, get, ...) in the MetaStore could be interesting. The end user could see how much of their S3 file system is being managed by S3Guard.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 12m 34s HADOOP-13345 passed
          +1 compile 0m 20s HADOOP-13345 passed
          +1 checkstyle 0m 15s HADOOP-13345 passed
          +1 mvnsite 0m 22s HADOOP-13345 passed
          +1 mvneclipse 0m 14s HADOOP-13345 passed
          +1 findbugs 0m 27s HADOOP-13345 passed
          +1 javadoc 0m 15s HADOOP-13345 passed
          +1 mvninstall 0m 18s the patch passed
          +1 compile 0m 18s the patch passed
          +1 javac 0m 18s the patch passed
          -0 checkstyle 0m 12s hadoop-tools/hadoop-aws: The patch generated 19 new + 25 unchanged - 0 fixed = 44 total (was 25)
          +1 mvnsite 0m 20s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 32s the patch passed
          +1 javadoc 0m 11s the patch passed
          +1 unit 0m 36s hadoop-aws in the patch passed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          18m 59s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HADOOP-13453
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12859449/HADOOP-13453-HADOOP-13345-003.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux bd705d5c15d7 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision HADOOP-13345 / b54e1b2
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/11851/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/11851/testReport/
          modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/11851/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          stevel@apache.org Steve Loughran added a comment -

          This is on my review todo list, don't think I've forgotten!

          stevel@apache.org Steve Loughran added a comment -

          I see your point about the quantile config: I don't know what to do there.

          • Be careful with those imports: S3AInstrumentation had all its imports expanded and moved around. Import sections are one of the main merge-conflict areas, so it's critical to keep changes there to a minimum. We normally turn off any IDE automatic features to avoid this.
          • I moved off the split of a separate interface and implementation for the instrumentation; everything is closely coupled enough that we don't need to abstract things away.

          I'm doing a patch with my changes; if everything is happy and I can run a full integration test suite (HADOOP-14216 has broken this), then I'll +1 it. Once it is in, we can expand the metrics.

          stevel@apache.org Steve Loughran added a comment -

          Patch 003 with some tuning.

          Something is wrong with my IDE (IntelliJ IDEA 2017.1): it is reorganising imports without warning. I think I've fixed it here.

          Tested: none. Someone has gone and broken XInclude

          stevel@apache.org Steve Loughran added a comment -

          Patch 005; I accidentally lost an import while cleaning up the IDE's import games.

          Tested: S3A Ireland with the options -Dparallel-tests -DtestsThreadCount=8 -Ddynamo. All well; best test run for ages. Hopefully that means DDB is fixing that intermittent root contract test failure.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 16m 0s HADOOP-13345 passed
          +1 compile 0m 24s HADOOP-13345 passed
          +1 checkstyle 0m 19s HADOOP-13345 passed
          +1 mvnsite 0m 27s HADOOP-13345 passed
          +1 mvneclipse 0m 30s HADOOP-13345 passed
          +1 findbugs 0m 36s HADOOP-13345 passed
          +1 javadoc 0m 16s HADOOP-13345 passed
          +1 mvninstall 0m 24s the patch passed
          +1 compile 0m 22s the patch passed
          +1 javac 0m 22s the patch passed
          -0 checkstyle 0m 13s hadoop-tools/hadoop-aws: The patch generated 1 new + 38 unchanged - 0 fixed = 39 total (was 38)
          +1 mvnsite 0m 23s the patch passed
          +1 mvneclipse 0m 14s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 40s the patch passed
          +1 javadoc 0m 13s the patch passed
          +1 unit 0m 40s hadoop-aws in the patch passed.
          +1 asflicense 0m 30s The patch does not generate ASF License warnings.
          23m 57s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:612578f
          JIRA Issue HADOOP-13453
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12863312/HADOOP-13453-HADOOP-13345-005.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 40fb9ba1fa78 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision HADOOP-13345 / af8250a
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/12097/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-aws.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/12097/testReport/
          modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/12097/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          aid129 Ai Deng added a comment -

          OK, I will test patch 005 when I'm back from holiday. Happy Easter everyone!

          aid129 Ai Deng added a comment -

          Hi steve_l, I'm happy with patch 005; should we push it to the branch and start adding more metrics?

          stevel@apache.org Steve Loughran added a comment -

          +1
          Yeah, I'll do it now; one last test run to see all is well.

          stevel@apache.org Steve Loughran added a comment -

          OK, it's in: thanks for this.

          Now we have to think about what extra things to measure....

          aid129 Ai Deng added a comment -

          Cool. I listed my suggestions for the metrics in a previous comment; what are your thoughts? Let's decide on the list first.

          stevel@apache.org Steve Loughran added a comment -

          Why not take that list, create a new JIRA off HADOOP-13345 ("add more s3guard metrics"), and suggest those as the start?

          One interesting thing to see if we could detect would be mismatches between S3Guard and the underlying object store: if we can observe inconsistencies (how?), then that should be measured. The S3mper blog post looks at how Netflix detected consistency issues in S3 that way.


          People

            Assignee: aid129 Ai Deng
            Reporter: cnauroth Chris Nauroth
            Votes: 0
            Watchers: 7

            Dates

              Created:
              Updated:
              Resolved: