Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-beta1
    • Fix Version/s: 2.9.0, 3.0.0-beta1
    • Component/s: build, fs/s3
    • Labels: None

    Description

      The AWS SDK in Hadoop 3.0.0-beta1 prints a warning whenever you call abort() on a stream, which is what we need to do when performing long-distance seeks in a large file opened with fadvise=normal.

      2017-09-20 17:51:50,459 [ScalaTest-main-running-S3ASeekReadSuite] INFO  s3.S3ASeekReadSuite (Logging.scala:logInfo(54)) - 
      2017-09-20 17:51:50,460 [ScalaTest-main-running-S3ASeekReadSuite] INFO  s3.S3ASeekReadSuite (Logging.scala:logInfo(54)) - Starting read() [pos = 45603305]
      2017-09-20 17:51:50,461 [ScalaTest-main-running-S3ASeekReadSuite] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
      2017-09-20 17:51:51,263 [ScalaTest-main-running-S3ASeekReadSuite] INFO  s3.S3ASeekReadSuite (Logging.scala:logInfo(54)) - Duration of read() [pos = 45603305] = 803,650,637 nS
      

      The warning goes away if we upgrade to the latest SDK, at least for the non-localdynamo bits.
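
For context, the choice that triggers this warning can be sketched as follows. This is a minimal illustration, not the actual S3A or AWS SDK code: the class name, the `Abortable` interface, the `release` helper, and the 64 KiB threshold are all hypothetical, chosen only to show why a long-distance seek ends in abort() rather than a drain. The stream is a plain `InputStream` so the sketch runs without the SDK on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch of the drain-or-abort decision made when a stream is released. */
class StreamCloseSketch {

    /** Hypothetical stand-in for a stream whose HTTP connection can be dropped. */
    interface Abortable {
        void abort();
    }

    /** Abort when more bytes remain than we are willing to read and discard. */
    static boolean shouldAbort(long remaining, long drainThreshold) {
        return remaining > drainThreshold;
    }

    /** Read and discard everything left in the stream; returns the byte count drained. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    /** Release the stream: abort the connection if far from the end, otherwise drain and close. */
    static String release(InputStream in, Abortable connection,
                          long remaining, long drainThreshold) throws IOException {
        if (shouldAbort(remaining, drainThreshold)) {
            // Far from the end: dropping the connection is cheaper than reading the rest.
            connection.abort();
            return "aborted";
        }
        try {
            // Close to the end: drain so the HTTP connection can be reused.
            drain(in);
        } finally {
            in.close();
        }
        return "drained";
    }

    public static void main(String[] args) throws IOException {
        long threshold = 64 * 1024; // assumed value, for illustration only
        byte[] data = new byte[1024];
        // 1 KiB left: below the threshold, so the stream is drained.
        System.out.println(release(new ByteArrayInputStream(data), () -> { }, data.length, threshold));
        // Pretend tens of megabytes are left, as in the seek logged above.
        System.out.println(release(new ByteArrayInputStream(data), () -> { }, 45_000_000L, threshold));
    }
}
```

Under these assumptions, the abort branch is the deliberate, cheaper path for a long seek, which is why a warning describing it as a likely error is misleading here.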

      Attachments

        1. HADOOP-14890-branch-2-002.patch
          1 kB
          Steve Loughran
        2. HADOOP-14890-001.patch
          0.7 kB
          Steve Loughran

            People

              Assignee: Steve Loughran (stevel@apache.org)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 4
