  1. Apache Avro
  2. AVRO-1241

Improve Trevni performance on string deserialization

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.3
    • Fix Version/s: 1.7.4
    • Component/s: java
    • Labels: None
    • Optimized string deserialization

    Description

      I have been trying to implement a storage function for Apache Pig that writes data in Trevni format. I found that the storage function was very slow when reading whole records.

      I did some profiling (with YourKit) and found that most of the CPU time was being spent in org.apache.trevni.InputBuffer.readString(), specifically in the String constructor it calls. I switched to java.nio.charset.CharsetDecoder.decode for deserialization and saw a big improvement. The changes are included in the attached patch.
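
      The attached patch is not reproduced on this page. As a rough illustration of the general idea only, here is a minimal sketch that decodes a UTF-8 byte range with a reused java.nio.charset.CharsetDecoder instead of calling the String constructor each time. The class name Utf8DecodeSketch, the decode(buf, pos, length) signature, and the REPLACE error policy are assumptions made for this example; they are not taken from org.apache.trevni.InputBuffer or from the patch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not the actual Trevni InputBuffer code. */
final class Utf8DecodeSketch {
  // One decoder per instance, reused across calls.
  // CharsetDecoder is not thread-safe, so do not share an instance between threads.
  private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
      .onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE);

  /** Decodes length bytes of buf, starting at pos, as UTF-8. */
  String decode(byte[] buf, int pos, int length) {
    try {
      // ByteBuffer.wrap creates a view over the array; the bytes are not copied.
      // decode(ByteBuffer) resets the decoder before it starts.
      return decoder.decode(ByteBuffer.wrap(buf, pos, length)).toString();
    } catch (CharacterCodingException e) {
      // Not expected with the REPLACE policy configured above.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] data = "h\u00e9llo trevni".getBytes(StandardCharsets.UTF_8);
    String viaDecoder = new Utf8DecodeSketch().decode(data, 0, data.length);
    String viaConstructor = new String(data, 0, data.length, StandardCharsets.UTF_8);
    System.out.println(viaDecoder.equals(viaConstructor)); // prints "true"
  }
}
```

      Whether this approach is faster than the String constructor depends on the JVM version and the data, so it is worth profiling both paths on a representative workload before adopting either.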

      Attachments

        1. AVRO-1241
          2 kB
          Joseph Adler

            People

              Assignee: Joseph Adler (jadler)
              Reporter: Joseph Adler (jadler)
              Votes: 1
              Watchers: 6
