Pig / PIG-4298

Descending ORDER BY is broken in some cases when the key is a bytearray

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      Here is a repro script (using PigPen):

      REGISTER pigpen.jar;
      
      load4254 = LOAD 'input.clj'
          USING PigStorage('\n')
          AS (value:chararray);
      
      DEFINE udf4265 pigpen.PigPenFnDataBag('(clojure.core/require (quote [pigpen.runtime]) (quote [clojure.edn]))','(pigpen.runtime/exec [(pigpen.runtime/process->bind (pigpen.runtime/pre-process :pig :native)) (pigpen.runtime/map->bind clojure.edn/read-string) (pigpen.runtime/key-selector->bind clojure.core/identity) (pigpen.runtime/process->bind (pigpen.runtime/post-process :pig :native-key-frozen-val))])');
      
      generate4263 = FOREACH load4254 GENERATE
          FLATTEN(udf4265(value));
      generate4257 = FOREACH generate4263 GENERATE
          $0 AS key,
          $1 AS value;
      
      order4258 = ORDER generate4257 BY key DESC; -- BUG: DESC does not change the sort order
      dump order4258;
      

      This script returns the same result for both ASC and DESC orders.

      The problem is as follows:

      1. PigBytesRawComparator calls BinInterSedesTupleRawComparator.compare().
      2. BinInterSedesTupleRawComparator applies descending order.
      3. PigBytesRawComparator applies descending order again to what BinInterSedesTupleRawComparator returns.

      Therefore, the two inversions cancel each other out, and descending order is never applied.
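
      The double inversion can be modeled with a minimal Java sketch. The class and method names below are hypothetical stand-ins, not the real Pig comparators, and plain integers replace the serialized bytearray keys. The "fixed" variant shows only one possible remedy (letting the inner comparator alone handle DESC); it is an assumption and not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the two comparator layers described above.
// These are NOT the real Pig classes; integers stand in for bytearray keys.
public class DoubleDescendingSketch {

    // Models the inner comparator (step 2): it already honours the DESC flag.
    static int innerCompare(int a, int b, boolean desc) {
        int rc = Integer.compare(a, b); // always -1, 0, or 1, so negation is safe
        return desc ? -rc : rc;
    }

    // Models the buggy outer comparator (step 3): it inverts the result again,
    // which cancels the inversion already applied by the inner comparator.
    static int buggyOuterCompare(int a, int b, boolean desc) {
        int rc = innerCompare(a, b, desc);
        return desc ? -rc : rc;
    }

    // Assumed fix for illustration: the outer layer passes the result through.
    static int fixedOuterCompare(int a, int b, boolean desc) {
        return innerCompare(a, b, desc);
    }

    // Sorts a copy of the keys with either the buggy or the fixed outer comparator.
    static List<Integer> sortWith(boolean buggy, boolean desc, Integer... keys) {
        List<Integer> out = new ArrayList<>(Arrays.asList(keys));
        out.sort((a, b) -> buggy
                ? buggyOuterCompare(a, b, desc)
                : fixedOuterCompare(a, b, desc));
        return out;
    }

    public static void main(String[] args) {
        System.out.println("buggy ASC:  " + sortWith(true, false, 3, 1, 2));
        System.out.println("buggy DESC: " + sortWith(true, true, 3, 1, 2));
        System.out.println("fixed DESC: " + sortWith(false, true, 3, 1, 2));
    }
}
```

      In this model the buggy comparator yields the same ascending order for both ASC and DESC, matching the behavior of the script above, while the pass-through variant yields the reversed order for DESC.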

      Attachments

        1. repo.tar.gz
          6.72 MB
          Cheolsoo Park
        2. PIG-4298-2.patch
          3 kB
          Daniel Dai
        3. PIG-4298-1.patch
          1 kB
          Cheolsoo Park

          People

            Assignee: Cheolsoo Park
            Reporter: Cheolsoo Park
            Votes: 0
            Watchers: 3
