  1. Apache MADlib
  2. MADLIB-853

Issue with "summary" function

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Calling the "summary" function with a grouping column results in an error.
      Context: This is an active data science lab engagement with a customer.
      Table details:
      #rows = 625888334
      #distinct values in grouping column = 235
      #columns = 20
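
      The figures above could be re-derived with queries along these lines. This is only a sketch: the table and column names (wip.hpwell, wname) are taken from the call in this report, and the queries actually used to obtain the numbers are not part of the report. The per-group count is an extra check, added here as an assumption that skew across groups may be worth ruling out.

      ```sql
      -- Row count of the source table
      SELECT count(*) AS n_rows FROM wip.hpwell;

      -- Number of distinct values in the grouping column
      SELECT count(DISTINCT wname) AS n_groups FROM wip.hpwell;

      -- Rows per group, largest first, to check for skew across groups
      SELECT wname, count(*) AS n_rows
      FROM wip.hpwell
      GROUP BY wname
      ORDER BY n_rows DESC
      LIMIT 10;

      -- Column count, read from the catalog
      SELECT count(*) AS n_columns
      FROM information_schema.columns
      WHERE table_schema = 'wip' AND table_name = 'hpwell';
      ```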

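      select madlib.summary (
      	'wip.hpwell',                   -- source table
      	'wip.hpwell_groupbyw_summary',  -- output table
      	'rig_a,rig_name,filename_var,date_time,depth,bitp,bitw,blockh,diffp,flowin,hookl,mudw,pumpp,pumps1,pumps2,rop,topdr,topdt',  -- target columns
      	'wname',                        -- grouping column
      	TRUE,                           -- get_distinct
      	TRUE,                           -- get_quartiles
      	NULL,                           -- ntile_array
      	10,                             -- how_many_mfv
      	TRUE );                         -- get_estimates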
      
      NOTICE:  table "hnp_pi_well_groupbywell_summary" does not exist, skipping
      CONTEXT:  SQL statement "DROP TABLE IF EXISTS wip.hnp_pi_well_groupbywell_summary"
      PL/Python function "summary"
      ERROR:  plpy.SPIError:  (execHHashagg.c:912)  (seg51 slice1 sdw7.gphd.local:1028 pid=249989) (plpython.c:4648)
      CONTEXT:  Traceback (most recent call last):
        PL/Python function "summary", line 25, in <module>
          get_distinct, get_quartiles, ntile_array, how_many_mfv, get_estimates)
        PL/Python function "summary", line 72, in summary
        PL/Python function "summary", line 389, in run
      PL/Python function "summary"
      SQL state: XX000

      Note that running "summary" on the full table without a grouping column works fine.
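
      For comparison, the ungrouped call presumably differs only in the grouping argument. The statement below is an assumed reconstruction, not the statement that was actually run: it passes NULL for the grouping column and writes to a hypothetical output table, wip.hpwell_summary. The argument names in the comments follow the ones shown in the traceback.

      ```sql
      -- Assumed reconstruction of the ungrouped call (not copied from the report).
      SELECT madlib.summary(
          'wip.hpwell',          -- source table
          'wip.hpwell_summary',  -- output table (hypothetical name)
          'rig_a,rig_name,filename_var,date_time,depth,bitp,bitw,blockh,diffp,flowin,hookl,mudw,pumpp,pumps1,pumps2,rop,topdr,topdt',  -- target columns
          NULL,                  -- grouping column: NULL summarizes the whole table
          TRUE,                  -- get_distinct
          TRUE,                  -- get_quartiles
          NULL,                  -- ntile_array
          10,                    -- how_many_mfv
          TRUE                   -- get_estimates
      );
      ```

      If this form succeeds while the grouped form fails, the difference is isolated to the grouping argument.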

            People

            • Assignee: Rahul Iyer (riyer)
            • Reporter: Rashmi Raghu (rraghu)
