jclouds / JCLOUDS-847

S3 poor upload performance


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.1
    • Fix Version/s: 2.2.0
    • Component/s: jclouds-blobstore
    • Environment: JDK 1.7.0_55, 64-bit, Windows 7; EU bucket, https

    Description

      Hi.

      I'm using jclouds 1.8.1 together with the Apache HttpClient module to upload files to S3. During tests, I found that upload performance is quite poor compared to jets3t or Windows tools like Cloudberry S3 Explorer.

      Sending a 10 MB binary file over a cable connection (100 Mbit/s down, 5 Mbit/s up) to an EU bucket (https, default endpoints) from a Windows 7 machine (JDK 1.7.0_55, 64-bit) gives the following results:

      jclouds: ~55 secs
      Amazon Java SDK: ~55 secs
      jets3t: ~18 secs
      S3 Explorer: ~18 secs

      On a faster connection, upload time increased to as much as 200 seconds with jclouds and the Amazon SDK. The others stayed at around 18 secs.

      So I wondered where this difference comes from. I started digging into the source code of jclouds, jets3t and HttpClient, and took a look at the network packets being sent.

      Long story short: too small buffer sizes!

      For the payload, jclouds uses the "default" HttpEntities that HttpClient provides, such as FileEntity and InputStreamEntity. These use a hardcoded output buffer size of 4096 bytes.
      This seems to be no problem when the available upload bandwidth is small, but it appears to slow down the connection when more bandwidth is available.
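
To illustrate why the buffer size matters, here is a minimal sketch in plain java.io (it is not the HttpClient or jclouds code; the class and method names are made up for illustration). It copies a payload with a temporary buffer of a given size and counts how many write calls reach the sink. Each of those calls would be a separate hand-off to the socket layer in a real upload.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of jclouds or HttpClient.
class WriteCountDemo {
    /** Sink that discards all data but records how many write calls it received. */
    static final class CountingSink extends OutputStream {
        long writes;

        @Override
        public void write(int b) {
            writes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            writes++;
        }
    }

    /** Copies 'in' to 'out' through a temporary buffer of the given size. */
    static void copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    /** Returns the number of write calls needed to send payloadSize bytes. */
    static long writesFor(int payloadSize, int bufferSize) {
        CountingSink sink = new CountingSink();
        try {
            copy(new ByteArrayInputStream(new byte[payloadSize]), sink, bufferSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writes;
    }

    public static void main(String[] args) {
        int tenMb = 10 * 1024 * 1024;
        System.out.println("4 KB buffer:   " + writesFor(tenMb, 4096) + " writes");
        System.out.println("128 KB buffer: " + writesFor(tenMb, 128 * 1024) + " writes");
    }
}
```

For a 10 MB payload this comes to 2560 writes with a 4 KB buffer versus 80 with a 128 KB buffer. The sketch only counts calls; how that translates into wall-clock time depends on the TLS and TCP layers underneath.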

      For testing, I simply created my own HttpClient module, based on the shipped one, and made a simple change that adds a 128k buffer to the entity to be sent. The result is that upload performance is now on par with jets3t and S3 Explorer.

      Please find attached a small Maven project that can be used to demonstrate the difference.
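
The actual change is in the attached project; the following is only a rough stand-in for the idea, using plain java.io instead of the HttpClient entity classes (all names here are hypothetical). It assumes the entity keeps writing 4 KB chunks and that a 128k java.io.BufferedOutputStream is placed in front of the connection's stream, so the small chunks are coalesced before they reach it.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not the module from the attachment.
class CoalescingDemo {
    /** Stand-in for the connection's output stream; counts the writes it receives. */
    static final class CountingSink extends OutputStream {
        int writes;

        @Override
        public void write(int b) {
            writes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            writes++;
        }
    }

    /**
     * Sends payloadSize bytes in 4 KB chunks through a BufferedOutputStream of
     * outputBufferSize bytes and returns how many writes reach the sink.
     */
    static int sinkWrites(int payloadSize, int outputBufferSize) {
        CountingSink sink = new CountingSink();
        byte[] chunk = new byte[4096]; // what a 4 KB entity copy loop would emit
        try (OutputStream out = new BufferedOutputStream(sink, outputBufferSize)) {
            for (int sent = 0; sent < payloadSize; sent += chunk.length) {
                out.write(chunk, 0, Math.min(chunk.length, payloadSize - sent));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // try-with-resources has closed 'out', which flushed the last partial buffer
        return sink.writes;
    }

    public static void main(String[] args) {
        int tenMb = 10 * 1024 * 1024;
        System.out.println("no extra buffering: " + sinkWrites(tenMb, 4096) + " writes");
        System.out.println("128k buffer:        " + sinkWrites(tenMb, 128 * 1024) + " writes");
    }
}
```

With a buffer no larger than the chunk, BufferedOutputStream passes every chunk straight through, so the first line shows the unbuffered baseline.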

      To be honest, I'm not deep enough into the jclouds code to provide a proper patch, but my suggestion would be to provide jclouds' own implementations of FileEntity and InputStreamEntity that use proper output buffer sizes, maybe with an option to override them by configuration.

      I also tried the HttpClient "http.socket.buffer-size" parameter, but that didn't have any effect. The 2.0.0-SNAPSHOT version shows no difference either.
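
The "override by configuration" part could look roughly like the sketch below. The property name, the 128k default and the 4 KB floor are assumptions made up for this example; none of them is an existing jclouds or HttpClient setting.

```java
import java.util.Properties;

// Hypothetical helper; the property name below is NOT an existing jclouds property.
class UploadBufferConfig {
    static final String PROPERTY = "example.upload.output-buffer-size"; // assumed name
    static final int DEFAULT_SIZE = 128 * 1024; // 128k, the size used in the test above
    static final int MIN_SIZE = 4096;           // never go below the current 4 KB

    /** Returns the configured buffer size, or the default if unset or unparsable. */
    static int resolve(Properties overrides) {
        String raw = overrides.getProperty(PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_SIZE;
        }
        try {
            return Math.max(Integer.parseInt(raw.trim()), MIN_SIZE);
        } catch (NumberFormatException e) {
            return DEFAULT_SIZE; // ignore malformed values rather than fail the upload
        }
    }
}
```

An entity implementation would then call something like UploadBufferConfig.resolve(props) once and size its output buffer accordingly.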

      What do you guys think? Is this a known problem? Or are there other settings to increase the upload performance?

      BTW: The same problem exists with the default JavaUrlHttpCommandExecutorServiceModule, which also uses a 4k buffer. I also tried the OkHttp driver, with the same results (1.8.1; 2.0.0-SNAPSHOT failed with an exception).

      Thanks!
      Veit

      Attachments

        1. s3-upload-test.zip
          9 kB
          Veit Guna

            People

              gaul Andrew Gaul
              vguna Veit Guna

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 10m