Hadoop HDFS
HDFS-1441

Infinite WAITING on dead s3n connection.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.20.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Debian 64-bit, Cloudera Hadoop distribution

      Description

      I am trying to read an image from an s3n URL as follows:

      final FileSystem fileSystem;
      fileSystem = FileSystem.get(new URI(filePath), config);
      final FSDataInputStream fileData = fileSystem.open(new Path(filePath));
      baseImage = ImageIO.read(fileData);
      fileData.close();

      When ImageIO.read fails, I catch the exception and retry. It works flawlessly on the first four retries, but on the fifth retry the fileSystem.open call blocks indefinitely.
      Here's the stack trace:

      "Thread-83" prio=10 tid=0x00007f9740182000 nid=0x7cbc in Object.wait() [0x0000000040bbf000]
      java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)

      - waiting on <0x00007f974a822438> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
      at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:509)
      - locked <0x00007f974a822438> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
      at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:394)
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:152)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
        at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:357)
        at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:652)
        at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1556)
        at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1492)
        at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1793)
        at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1225)
        at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:111)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.fs.s3native.$Proxy1.retrieveMetadata(Unknown Source)
        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:355)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:690)
        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:476)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:398)
        at com.spratpix.mapreduce.MapUrl2ImageData.loadImageAndStoreTo(MapUrl2ImageData.java:226)

      I suppose this is because S3 is throttling connections to the same file, but the s3n client cannot handle that correctly.

        Activity

        Subroto Sanyal added a comment -

        This issue occurs when the user keeps opening connections and never closes them before retrying.
        Ideally, when the ImageIO.read() operation fails, the user should close the InputStream and then retry. The client code ends up waiting forever after four connections because the JetS3t library's RestUtils sets the property http.connection-manager.max-per-host to 4.

        Close the connections before proceeding with a retry. The issue looks to be invalid.
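
        As an illustrative sketch (not part of the original report), the retry loop could close the stream in a finally block so the pooled connection is always released; the class name, helper name, and maxRetries parameter below are assumptions for the example:

        import java.awt.image.BufferedImage;
        import java.io.IOException;
        import java.net.URI;
        import javax.imageio.ImageIO;
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataInputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class S3nRetryExample {
            // Hypothetical helper illustrating the suggested pattern; not from the original report.
            public static BufferedImage readImageWithRetry(String filePath, Configuration config, int maxRetries) throws Exception {
                final FileSystem fileSystem = FileSystem.get(new URI(filePath), config);
                for (int attempt = 0; attempt < maxRetries; attempt++) {
                    FSDataInputStream fileData = null;
                    try {
                        fileData = fileSystem.open(new Path(filePath));
                        BufferedImage image = ImageIO.read(fileData);
                        if (image != null) {
                            return image;
                        }
                    } catch (IOException e) {
                        // Read failed; fall through and retry.
                    } finally {
                        // Always close the stream so the HTTP connection is returned to
                        // JetS3t's pool; otherwise the leaked connections exhaust
                        // http.connection-manager.max-per-host and the next open() blocks.
                        if (fileData != null) {
                            fileData.close();
                        }
                    }
                }
                return null;
            }
        }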

        Subroto Sanyal added a comment -

        Users can configure the following properties to override the default values:

        httpclient.max-connections: The maximum number of simultaneous connections to allow globally
        httpclient.max-connections-per-host: The maximum number of simultaneous connections to allow for a single host

        For more details, please refer to the JetS3t configuration documentation:
        http://jets3t.s3.amazonaws.com/toolkit/configuration.html#jets3t
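
        For example, assuming the defaults are overridden through a jets3t.properties file on the classpath (as described in the linked documentation), entries like the following would raise the limits; the values here are purely illustrative, not recommendations:

        httpclient.max-connections=50
        httpclient.max-connections-per-host=20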

        Andrei Savu added a comment -

        +1 for closing it as invalid. This is the expected behaviour when connections are not returned to the pool.


          People

          • Assignee: Unassigned
          • Reporter: Hajo Nils Krabbenhöft
          • Votes: 0
          • Watchers: 2
