Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: fs/s3
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      S3A has added support for configurable input policies. Similar to fadvise, this configuration provides applications with a way to specify their expected access pattern (sequential or random) while reading a file. S3A then performs optimizations tailored to that access pattern. See site documentation of the fs.s3a.experimental.input.fadvise configuration property for more details. Please be advised that this feature is experimental and subject to backward-incompatible changes in future releases.

      Description

      Currently, the file's "contentLength" is set as the "requestedStreamLen" when invoking S3AInputStream::reopen(). As part of lazySeek(), the stream sometimes has to be closed and reopened. In many of these cases the stream is closed with abort(), which leaves the underlying HTTP connection unusable. This incurs a high connection establishment cost in some jobs. It would be good to set the correct value for the stream length to avoid connection aborts.
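
      To illustrate why the requested stream length matters, here is a minimal sketch, not the patch itself. The class, enum, method names, and the drain threshold are hypothetical and are not part of the S3A API; the sketch only assumes that a stream with many unread bytes left in its request gets aborted, while one with few bytes left can be drained and its connection reused.

```java
// Hypothetical sketch; names are illustrative and NOT the S3A API.
final class StreamRangeSketch {

    /** Expected access pattern, mirroring the "sequential or random" choice. */
    enum AccessPattern { SEQUENTIAL, RANDOM }

    /**
     * Exclusive end offset to request when (re)opening the stream.
     * SEQUENTIAL: request through the end of the file.
     * RANDOM: request only what the read needs, with a readahead floor.
     */
    static long requestLimit(AccessPattern pattern, long targetPos,
                             long readLength, long contentLength, long readahead) {
        long limit;
        if (pattern == AccessPattern.RANDOM && readLength >= 0) {
            limit = targetPos + Math.max(readahead, readLength);
        } else {
            limit = contentLength;
        }
        // Never request past the end of the object.
        return Math.min(contentLength, limit);
    }

    /**
     * Assumed close policy: abort when more than drainThreshold bytes
     * of the current request are still unread, otherwise drain and close.
     */
    static boolean shouldAbort(long currentPos, long requestEnd, long drainThreshold) {
        long remaining = requestEnd - currentPos;
        return remaining > drainThreshold;
    }

    public static void main(String[] args) {
        long fileLen = 1L << 30;      // 1 GiB object
        long readahead = 64 * 1024;   // assumed readahead and drain threshold
        long pos = 4096;
        long readLen = 8192;

        long seqEnd = requestLimit(AccessPattern.SEQUENTIAL, pos, readLen, fileLen, readahead);
        long rndEnd = requestLimit(AccessPattern.RANDOM, pos, readLen, fileLen, readahead);

        // After the read completes, a seek elsewhere forces the stream to close.
        long after = pos + readLen;
        System.out.println("whole-file request aborts: " + shouldAbort(after, seqEnd, readahead));
        System.out.println("bounded request aborts:    " + shouldAbort(after, rndEnd, readahead));
    }
}
```

      With the whole file requested, almost the entire object is still unread at the seek, so the sketch's policy aborts. With a bounded request, the few remaining bytes can be drained and the connection kept.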

      I will post the patch once the AWS tests pass on my machine.

        Attachments

        1. HADOOP-13203-branch-2-010.patch
          57 kB
          Steve Loughran
        2. HADOOP-13203-branch-2-009.patch
          57 kB
          Steve Loughran
        3. HADOOP-13203-branch-2-008.patch
          53 kB
          Steve Loughran
        4. HADOOP-13203-branch-2-007.patch
          42 kB
          Steve Loughran
        5. HADOOP-13203-branch-2-006.patch
          43 kB
          Steve Loughran
        6. HADOOP-13203-branch-2-005.patch
          28 kB
          Steve Loughran
        7. stream_stats.tar.gz
          716 kB
          Rajesh Balamohan
        8. HADOOP-13203-branch-2-004.patch
          6 kB
          Rajesh Balamohan
        9. HADOOP-13203-branch-2-003.patch
          6 kB
          Rajesh Balamohan
        10. HADOOP-13203-branch-2-002.patch
          6 kB
          Rajesh Balamohan
        11. HADOOP-13203-branch-2-001.patch
          3 kB
          Rajesh Balamohan

              People

              • Assignee:
                Rajesh Balamohan
              • Reporter:
                Rajesh Balamohan
              • Votes:
                0
              • Watchers:
                8

                Dates

                • Created:
                  Updated:
                  Resolved: