Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-5288

Improve performance of PigTextRawBytesComparator

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 0.18.0
    • None
    • None
    • Reviewed

    Description

      Came across this stacktrace for a group by when investigating a different performance issue.

      "TezChild" #22 daemon prio=5 os_prio=0 tid=0x00007fa935495000 nid=0x7c3e runnable [0x00007fa91d354000]
         java.lang.Thread.State: RUNNABLE
              at sun.nio.cs.UTF_8$Decoder.decodeLoop(UTF_8.java:412)
              at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:579)
              at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:802)
              at org.apache.hadoop.io.Text.decode(Text.java:412)
              at org.apache.hadoop.io.Text.decode(Text.java:389)
              at org.apache.hadoop.io.Text.toString(Text.java:280)
              at org.apache.pig.impl.io.NullableText.getValueAsPigType(NullableText.java:46)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextRawComparator.compare(PigTextRawComparator.java:95)
              at org.apache.tez.runtime.library.common.ValuesIterator.readNextKey(ValuesIterator.java:188)
              at org.apache.tez.runtime.library.common.ValuesIterator.access$300(ValuesIterator.java:47)
              at org.apache.tez.runtime.library.common.ValuesIterator$1$1.next(ValuesIterator.java:143)
              at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POShuffleTezLoad.getNextTuple(POShuffleTezLoad.java:218)
      

      Conversion to String and comparing is a wastage (result of extending from PigTextRawBytesComparator which is used in sorting). PigCharArrayWritableComparator which is the equivalent used in mapreduce does not. It directly compares it as a Text.

      Attachments

        1. PIG-5288-1.patch
          2 kB
          Rohini Palaniswamy

        Activity

          People

            rohini Rohini Palaniswamy
            rohini Rohini Palaniswamy
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: