Apache MADlib / MADLIB-1231

Exception in correlation/covariance when mean of a column is null

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

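      While working on https://issues.apache.org/jira/browse/MADLIB-1128, we found a bug in the correlation/covariance module: madlib.correlation raises an exception when the mean of a column is NULL, for example when the column contains only NULL values.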

      Repro steps:

      madlib=# create table foo(i int, j int);
      madlib=# insert into foo values (1, NULL);
      madlib=# select madlib.correlation('foo', 'foo_out');
      ERROR:  plpy.SPIError: Function "madlib.correlation_transition(double precision[],double precision[],double precision[])": Correlation: Mean vector contains NULL. (UDF_impl.hpp:210)  (entry db 127.0.0.1:15432 pid=46000)
      CONTEXT:  Traceback (most recent call last):
        PL/Python function "correlation", line 23, in <module>
          return correlation.correlation(**globals())
        PL/Python function "correlation", line 71, in correlation
        PL/Python function "correlation", line 207, in _populate_output_table
      PL/Python function "correlation"
      madlib=#
      

      This regression was introduced in https://issues.apache.org/jira/browse/MADLIB-1166, which added support for NULL values.
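
The NULL mean in the repro can be seen without MADlib. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the database (an assumption; the report itself uses a MADlib-enabled database), and the helper names `column_means` and `null_mean_columns` are hypothetical, not part of MADlib. It builds the same table as the repro steps and shows that the mean of column j, which holds no non-NULL values, comes back as NULL.

```python
import sqlite3


def column_means():
    """Recreate the repro table and return (avg(i), avg(j)).

    SQLite is used here only as a stand-in for the database in the
    report; avg() skips NULLs and returns NULL when no non-NULL
    values remain.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE foo (i INT, j INT)")
        conn.execute("INSERT INTO foo VALUES (1, NULL)")
        return conn.execute("SELECT avg(i), avg(j) FROM foo").fetchone()
    finally:
        conn.close()


def null_mean_columns(means, names):
    """Hypothetical helper: list the columns whose mean is NULL (None)."""
    return [name for name, mean in zip(names, means) if mean is None]


means = column_means()
print(means)                                  # (1.0, None)
print(null_mean_columns(means, ["i", "j"]))   # ['j']
```

A check along these lines, run before calling madlib.correlation, would flag the columns that trigger the "Mean vector contains NULL" error. How the module itself should treat such columns is left open by this report.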

          People

            Assignee: Unassigned
            Reporter: Nikhil (nikhilkak)