  1. Pig
  2. PIG-3907

Built-in function COR does not work with any numeric type other than double.


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.1
    • Fix Version/s: None
    • Component/s: build, piggybank
    • Labels:
      None

      Description

      Apache Pig provides the built-in function 'COR' (correlation). COR is used to calculate the correlation between variables.
      The COR function does not work if we provide any variable of datatype int or long. We have to explicitly cast the variables to double in the Pig script, which is never a good idea on the UI end.

      I tried to unit test the correlation function by supplying int values, and it fails to iterate the bag. The same happens when supplying a mix of int, long and double variables as input parameters to the COR function. However, my unit test for doubles gives the correct output.
      I also ran the script on a Hadoop cluster; it fails if any variable is of a type other than double.
      It shows the following error on the Hadoop cluster:
      ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2999: Unexpected internal error. null
      or sometimes ERROR 1066: Unable to open iterator for alias aliasName. Backend error : null

      In the Java code of the COR function, everything is cast to double, which is correct. But in the computeAll(-,-) function, the casts applied to the iterator results to obtain x and y cause the problem.

      Exact code:
      double x = (Double) iterator_x.next().get(0); // error when int or long
      double y = (Double) iterator_y.next().get(0); // error when int or long
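
      The failing cast can be reproduced outside Pig, since a boxed Integer, Long or Float cannot be cast to Double in Java. The sketch below is only an illustration: the class CorCastDemo and its two helper methods are hypothetical, and plain Object values stand in for the fields returned by the tuple's get(0). It also shows conversion through Number.doubleValue() as one possible way to accept any numeric type; this is an assumption, not the fix adopted by the project.

```java
import java.util.Arrays;
import java.util.List;

public class CorCastDemo {
    // Mirrors the cast quoted in the report: succeeds only for boxed Double values.
    public static double castToDouble(Object field) {
        return (Double) field;
    }

    // Hypothetical alternative: widen any boxed numeric type to double.
    public static double toDouble(Object field) {
        return ((Number) field).doubleValue();
    }

    public static void main(String[] args) {
        // Stand-ins for tuple fields of type int, long, float and double.
        List<Object> fields = Arrays.<Object>asList(1, 2L, 3.5f, 4.0d);
        for (Object field : fields) {
            String castResult;
            try {
                castResult = String.valueOf(castToDouble(field));
            } catch (ClassCastException e) {
                castResult = "ClassCastException";
            }
            System.out.println(field.getClass().getSimpleName()
                + ": direct cast -> " + castResult
                + ", Number.doubleValue() -> " + toDouble(field));
        }
    }
}
```

      In this sketch only the Double field survives the direct cast, while the Number-based conversion handles all four types.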

      Solution: one option could be to override the method getArgToFuncMapping() and define separate classes such as IntCOR, LongCOR and FloatCOR, as is done for some other UDFs like VAR.

      Please fix the issue in piggybank as well as in the built-in library of Pig.
      I am using Apache Pig 0.11.0.


            People

            • Assignee: Unassigned
            • Reporter: Rishi Pandey (rishi.pandey)
            • Votes: 0
            • Watchers: 1
