Hadoop Common / HADOOP-10681

Remove synchronized blocks from SnappyCodec and ZlibCodec buffering inner loop

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0, 2.4.0, 2.5.0
    • Fix Version/s: 2.6.0
    • performance
    • Remove unnecessary synchronized blocks from Snappy/Zlib codecs.

    Description

      The current implementation of SnappyCompressor spends more time in the Java loop that copies from the user buffer into the direct buffer allocated to the compressor implementation than it spends compressing the buffers.

      The bottleneck was found to be the Java monitor (lock) code inside SnappyCompressor.

      The methods are neatly inlined by the JIT into the parent caller (BlockCompressorStream::write), but unfortunately the inlining does not flatten out the synchronized blocks.

      The loop writes small byte[] buffers (one per IFile key+value pair).

      I counted approximately 6 monitor enter/exit blocks per k-v pair written.
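
To make the pattern concrete, here is a minimal sketch of a buffering inner loop that copies small records into a direct buffer, once through synchronized methods and once without locking. It is not the Hadoop code: `MonitorCopySketch`, `Sink`, `SynchronizedSink`, `UnsynchronizedSink`, and `write` are hypothetical names, and the buffer and record sizes are arbitrary. The sketch takes two monitor enter/exit pairs per record rather than the roughly six counted above, and the unsynchronized variant assumes a single thread owns the sink.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of per-record monitor overhead; not the Hadoop codec code. */
public class MonitorCopySketch {

    /** Minimal stand-in for the input side of a compressor (assumed interface). */
    interface Sink {
        int put(byte[] src, int off, int len); // returns the number of bytes accepted
        boolean isFull();
        int drain();                           // returns bytes drained and resets the buffer
    }

    /** Every call enters and exits this object's monitor. */
    static final class SynchronizedSink implements Sink {
        private final ByteBuffer direct;
        SynchronizedSink(int capacity) { direct = ByteBuffer.allocateDirect(capacity); }

        public synchronized int put(byte[] src, int off, int len) {
            int n = Math.min(len, direct.remaining());
            direct.put(src, off, n);
            return n;
        }
        public synchronized boolean isFull() { return !direct.hasRemaining(); }
        public synchronized int drain() {
            int n = direct.position();
            direct.clear();
            return n;
        }
    }

    /** Same logic with no locking; only safe when one thread owns the sink. */
    static final class UnsynchronizedSink implements Sink {
        private final ByteBuffer direct;
        UnsynchronizedSink(int capacity) { direct = ByteBuffer.allocateDirect(capacity); }

        public int put(byte[] src, int off, int len) {
            int n = Math.min(len, direct.remaining());
            direct.put(src, off, n);
            return n;
        }
        public boolean isFull() { return !direct.hasRemaining(); }
        public int drain() {
            int n = direct.position();
            direct.clear();
            return n;
        }
    }

    /** Inner loop for one small record; drains whenever the buffer fills (capacity must be > 0). */
    static long write(Sink sink, byte[] record) {
        long drained = 0;
        int off = 0;
        while (off < record.length) {
            off += sink.put(record, off, record.length - off);
            if (sink.isFull()) {
                drained += sink.drain();
            }
        }
        return drained;
    }

    public static void main(String[] args) {
        byte[] record = new byte[24]; // arbitrary small key+value sized record
        Sink[] sinks = { new SynchronizedSink(64 * 1024), new UnsynchronizedSink(64 * 1024) };
        for (Sink sink : sinks) {
            long drained = 0;
            long start = System.nanoTime();
            for (int i = 0; i < 5_000_000; i++) {
                drained += write(sink, record);
            }
            drained += sink.drain();
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(sink.getClass().getSimpleName()
                    + ": " + drained + " bytes in " + micros + " us");
        }
    }
}
```

Both variants move the same bytes; only the locking differs. A single-threaded run like this may not reproduce the gap described above, since the JIT can optimize uncontended locks differently depending on JVM version and flags, so treat any timings as indicative only.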

      Attachments

        1. compress-cmpxchg-small.png
          164 kB
          Gopal Vijayaraghavan
        2. HADOOP-10681.1.patch
          16 kB
          Gopal Vijayaraghavan
        3. HADOOP-10681.2.patch
          13 kB
          Gopal Vijayaraghavan
        4. HADOOP-10681.3.patch
          18 kB
          Gopal Vijayaraghavan
        5. HADOOP-10681.4.patch
          20 kB
          Gopal Vijayaraghavan
        6. perf-top-spill-merge.png
          126 kB
          Gopal Vijayaraghavan
        7. snappy-perf-unsync.png
          95 kB
          Gopal Vijayaraghavan

            People

              Assignee: Gopal Vijayaraghavan (gopalv)
              Reporter: Gopal Vijayaraghavan (gopalv)
              Votes: 0
              Watchers: 17
