Hadoop Common / HADOOP-12346

Increase some default timeouts / retries for S3a connector

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: fs/s3
    • Labels:
      None
    • Target Version/s:

      Description

      I've been seeing some flakiness in jobs running against S3a, both first-hand and with other accounts, for which increasing fs.s3a.connection.timeout and fs.s3a.attempts.maximum has been a reliable solution. I propose we increase the defaults.
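
      For anyone who wants to try larger values on their own cluster, the two overrides can be sketched as a core-site.xml fragment. The property names are the ones named above; the values are illustrative assumptions, not the defaults proposed in the patch. fs.s3a.connection.timeout is expressed in milliseconds.

      ```xml
      <configuration>
        <!-- Socket timeout for S3a connections, in milliseconds (illustrative value) -->
        <property>
          <name>fs.s3a.connection.timeout</name>
          <value>200000</value>
        </property>
        <!-- Number of attempts on transient errors (illustrative value) -->
        <property>
          <name>fs.s3a.attempts.maximum</name>
          <value>20</value>
        </property>
      </configuration>
      ```

      Settings in core-site.xml take precedence over core-default.xml, so a fragment like this only affects deployments that opt in.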

          Activity

          mackrorysd Sean Mackrory added a comment -

          Attaching a patch that increases the defaults in the configuration, code and documentation files.

          stevel@apache.org Steve Loughran added a comment -

          -1 to any changes to s3n; it's used enough that changing things would only cause surprises. Things installed via management tooling can pick up the defaults from those, and the people who do the tools can therefore control what those defaults are. More succinctly: "we don't like changing defaults, even when they aren't always the best."

          As s3a is newer, that's probably more amenable to change, under the "getting it working completely" category

          stevel@apache.org Steve Loughran added a comment -

          Sorry, I misread: you aren't changing s3n, are you?

          I'll let others look at this patch, but I don't currently have any reason to -1 it.

          mackrorysd Sean Mackrory added a comment -

          Correct, my patch applies to s3a only.

          hadoopqa Hadoop QA added a comment -



          -1 overall



          || Vote || Subsystem || Runtime || Comment ||
          | 0 | pre-patch | 24m 50s | Pre-patch trunk compilation is healthy. |
          | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
          | -1 | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
          | +1 | javac | 9m 6s | There were no new javac warning messages. |
          | +1 | javadoc | 10m 35s | There were no new javadoc warning messages. |
          | +1 | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
          | +1 | site | 3m 0s | Site still builds. |
          | +1 | checkstyle | 1m 31s | There were no new checkstyle issues. |
          | +1 | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
          | +1 | install | 1m 26s | mvn install still works. |
          | +1 | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
          | +1 | findbugs | 2m 39s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
          | +1 | common tests | 22m 43s | Tests passed in hadoop-common. |
          | +1 | tools/hadoop tests | 0m 13s | Tests passed in hadoop-aws. |
          |  |  | 77m 5s |  |



          || Subsystem || Report/Notes ||
          | Patch URL | http://issues.apache.org/jira/secure/attachment/12751721/0001-HADOOP-12346.-Increase-some-default-timeouts-retries.patch |
          | Optional Tests | javadoc javac unit findbugs checkstyle site |
          | git revision | trunk / 22de7c1 |
          | hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7517/artifact/patchprocess/testrun_hadoop-common.txt |
          | hadoop-aws test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/7517/artifact/patchprocess/testrun_hadoop-aws.txt |
          | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7517/testReport/ |
          | Java | 1.7.0_55 |
          | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
          | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7517/console |

          This message was automatically generated.

          mackrorysd Sean Mackrory added a comment -

          I tested this manually by running a number of distcp and teragen / terasort jobs against S3 with datasets of various sizes at different times of day. I haven't had any failures with the new defaults applied; it's noticeably more reliable. I haven't added tests because no functionality or code is changing, just default configuration values. An automated test seems impractical since this is intended to address occasional flakiness.

          andrew.wang Andrew Wang added a comment -

          This looks like a straightforward change to some defaults; I'm +1. Will trust in what sounds like extensive testing by Sean.

          Steve Loughran, any thoughts? Otherwise I'll commit later.

          eddyxu Lei (Eddy) Xu added a comment -

          LGTM +1.

          stevel@apache.org Steve Loughran added a comment -

          +1

          eddyxu Lei (Eddy) Xu added a comment -

          Committed.

          Thanks much for the reviews, Andrew Wang and Steve Loughran.
          Also, thanks for the effort of working on this, Sean Mackrory.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8369 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8369/)
          HADOOP-12346. Increase some default timeouts / retries for S3a connector. (Sean Mackrory via Lei (Eddy) Xu) (lei: rev 6ab2d19f5c010ab1d318214916ba95daa91a4dbf)

          • hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
          • hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
          • hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #326 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/326/)
          HADOOP-12346. Increase some default timeouts / retries for S3a connector. (Sean Mackrory via Lei (Eddy) Xu) (lei: rev 6ab2d19f5c010ab1d318214916ba95daa91a4dbf)

          • hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
          • hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
          • hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk #1053 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1053/)
          HADOOP-12346. Increase some default timeouts / retries for S3a connector. (Sean Mackrory via Lei (Eddy) Xu) (lei: rev 6ab2d19f5c010ab1d318214916ba95daa91a4dbf)

          • hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
          • hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
          • hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #320 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/320/)
          HADOOP-12346. Increase some default timeouts / retries for S3a connector. (Sean Mackrory via Lei (Eddy) Xu) (lei: rev 6ab2d19f5c010ab1d318214916ba95daa91a4dbf)

          • hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
          • hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
          • hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2269 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2269/)
          HADOOP-12346. Increase some default timeouts / retries for S3a connector. (Sean Mackrory via Lei (Eddy) Xu) (lei: rev 6ab2d19f5c010ab1d318214916ba95daa91a4dbf)

          • hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
          • hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
          • hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #311 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/311/)
          HADOOP-12346. Increase some default timeouts / retries for S3a connector. (Sean Mackrory via Lei (Eddy) Xu) (lei: rev 6ab2d19f5c010ab1d318214916ba95daa91a4dbf)

          • hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
          • hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
          • hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2250 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2250/)
          HADOOP-12346. Increase some default timeouts / retries for S3a connector. (Sean Mackrory via Lei (Eddy) Xu) (lei: rev 6ab2d19f5c010ab1d318214916ba95daa91a4dbf)

          • hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
          • hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
          • hadoop-common-project/hadoop-common/src/main/resources/core-default.xml

            People

            • Assignee:
              mackrorysd Sean Mackrory
              Reporter:
              mackrorysd Sean Mackrory
            • Votes:
              0
              Watchers:
              7

