  1. Hadoop Common
  2. HADOOP-15499

Severe performance drop when running RawErasureCoderBenchmark with NativeRSRawErasureCoder

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.0.1, 3.0.2, 3.1.1
    • Fix Version/s: 3.2.0, 3.1.1, 3.0.4
    • Component/s: None
    • Labels:
      None

      Description

      RawErasureCoderBenchmark is a micro-benchmark that tests EC codec encoding/decoding performance.

      When run with 50 concurrent threads, the native ISA-L coder delivers lower throughput than when run with a single thread. This is abnormal.

       

      bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3 1 1024 1024
      Using 126MB buffer.
      ISA-L coder encode 1008MB data, with chunk size 1024KB
      Total time: 0.19 s.
      Total throughput: 5390.37 MB/s
      Threads statistics:
      1 threads in total.
      Min: 0.18 s, Max: 0.18 s, Avg: 0.18 s, 90th Percentile: 0.18 s.

       

      bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3 50 1024 10240
      Using 120MB buffer.
      ISA-L coder encode 54000MB data, with chunk size 10240KB
      Total time: 11.58 s.
      Total throughput: 4662 MB/s
      Threads statistics:
      50 threads in total.
      Min: 0.55 s, Max: 11.5 s, Avg: 6.32 s, 90th Percentile: 10.45 s.

       

      RawErasureCoderBenchmark shares a single coder among all concurrent threads, while NativeRSRawEncoder and NativeRSRawDecoder use the synchronized keyword on their doEncode and doDecode functions. As a result, the 50 concurrent threads are forced to call the shared coder's encode/decode function one at a time.
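      The serialization described above can be reproduced with a small standalone sketch. SynchronizedCoderSketch and its doEncode method are hypothetical stand-ins, not the Hadoop classes; the sleep is an assumed placeholder for the native encode call. The sketch records how many threads are ever inside the synchronized method at once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a coder whose encode method is synchronized,
// as the description reports for the native RS coders. Not Hadoop code.
class SynchronizedCoderSketch {
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    // Only one thread at a time can hold this instance's monitor.
    synchronized void doEncode() {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(1); // assumed placeholder for the native encode work
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        inside.decrementAndGet();
    }

    // Runs `threads` workers against ONE shared coder, as the benchmark does,
    // and returns the largest number of threads seen inside doEncode at once.
    static int run(int threads, int callsPerThread) {
        SynchronizedCoderSketch shared = new SynchronizedCoderSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int c = 0; c < callsPerThread; c++) {
                    shared.doEncode();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.maxInside.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent callers: " + run(8, 5));
    }
}
```

      However many worker threads are started, the reported maximum stays at 1, which is why adding threads cannot raise the throughput of a single shared coder.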

       

      There are two approaches to resolving the issue.

      1. Refactor RawErasureCoderBenchmark to use a dedicated coder for each concurrent thread.
      2. Refactor NativeRSRawEncoder and NativeRSRawDecoder for better concurrency. The synchronized keyword is there to protect the private variable nativeCoder, which is checked in doEncode/doDecode and modified in release. We can use a ReentrantReadWriteLock to increase concurrency, since doEncode/doDecode can be called many times without changing the nativeCoder state.
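      Approach 1 could look roughly like the following sketch. PerThreadCoderSketch and its nested Coder class are hypothetical and do not reflect the real benchmark code; the XOR loop is an assumed placeholder for the native call. Each worker creates its own coder, so the synchronized method is never contended.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of approach 1: one coder per worker thread.
class PerThreadCoderSketch {
    // Stand-in coder; still synchronized, but never shared between threads.
    static final class Coder {
        synchronized int encode(byte[] chunk) {
            int parity = 0;
            for (byte b : chunk) {
                parity ^= b; // assumed placeholder for the native encode call
            }
            return parity;
        }
    }

    // Starts `threads` workers and returns how many distinct coders were used.
    static int run(int threads) {
        Set<Coder> used = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Coder own = new Coder(); // dedicated coder for this worker
                used.add(own);
                own.encode(new byte[] {1, 2, 4});
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct coders used: " + run(4));
    }
}
```

      This removes the contention in the benchmark, but any other caller that shares one native coder across threads would still be serialized.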

       I prefer approach 2 and will upload a patch later. 
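      A minimal sketch of approach 2, under stated assumptions: ReadWriteLockCoderSketch is a hypothetical class, the long field stands in for the native handle, the XOR loop replaces the native call, and the insideLock parameter is a demonstration hook only. It is not the attached patch. Encoding takes the read lock, so several threads can be inside doEncode at once, while release takes the write lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of approach 2; not the actual Hadoop patch.
class ReadWriteLockCoderSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long nativeCoder = 1L; // stand-in for the native handle; 0 = released

    // Encoding only reads nativeCoder, so the shared read lock is enough.
    // `insideLock` is a demonstration hook that runs while the lock is held.
    int doEncode(byte[] input, Runnable insideLock) {
        lock.readLock().lock();
        try {
            if (nativeCoder == 0) {
                throw new IllegalStateException("coder already released");
            }
            insideLock.run();
            int parity = 0;
            for (byte b : input) {
                parity ^= b; // assumed placeholder for the native encode call
            }
            return parity;
        } finally {
            lock.readLock().unlock();
        }
    }

    // release modifies nativeCoder, so it needs the exclusive write lock.
    void release() {
        lock.writeLock().lock();
        try {
            nativeCoder = 0;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Returns true if two threads were inside doEncode at the same time.
    static boolean twoEncodersOverlap() {
        ReadWriteLockCoderSketch shared = new ReadWriteLockCoderSketch();
        CountDownLatch bothInside = new CountDownLatch(2);
        AtomicBoolean overlapped = new AtomicBoolean(true);
        Runnable rendezvous = () -> {
            bothInside.countDown();
            try {
                // Succeeds only if the other thread also holds the read lock.
                if (!bothInside.await(5, TimeUnit.SECONDS)) {
                    overlapped.set(false);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                overlapped.set(false);
            }
        };
        Thread a = new Thread(() -> shared.doEncode(new byte[] {1, 2, 4}, rendezvous));
        Thread b = new Thread(() -> shared.doEncode(new byte[] {1, 2, 4}, rendezvous));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return overlapped.get();
    }

    public static void main(String[] args) {
        System.out.println("encoders overlapped: " + twoEncodersOverlap());
    }
}
```

      With a synchronized doEncode, the two threads in twoEncodersOverlap could never meet inside the method and the rendezvous would time out. With the read lock they can, and a call after release still fails cleanly.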

       

       

       

       

        Attachments

        1. HADOOP-15499.002.patch
          13 kB
          Sammi Chen
        2. HADOOP-15499.001.patch
          12 kB
          Sammi Chen

            People

            • Assignee:
              Sammi Chen
              Reporter:
              Sammi Chen