  1. Apache MADlib
  2. MADLIB-1136

Getting "ERROR: plpy.SPIError: Function" when calling linregr_train function with big data

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Hi MADlib developers,

      We have been trying to use MADlib on Greenplum to run an in-database linear regression on a large data set (789,626,243 rows, split into ~475,000 groups). However, after the following SQL statement has run for a little more than ten minutes, it fails with the error message shown further down.
      SQL statement:
      SELECT madlib.linregr_train(
      'xinos_plus_case_dlinterference_v2.temp_neighbor_pair_cqi_prb_nonull',
      'xinos_plus_case_dlinterference_v2.taipei_lm_result_temp',
      'average_cqi', 'array[1, prb_utilization]',
      'main_lnbts_id,main_lncel_id,lnbts_id,lncel_id');

      Error message:
      ERROR: plpy.SPIError: Function "madlib.linregr_merge_states(madlib.bytea8,madlib.bytea8)": ByteString improperly aligned for alignment request in seek(). (UDF_impl.hpp:210) (seg2 59-120-199-107.HINET-IP.hinet.net:50002 pid=9137) (plpython.c:4648)

      If we reduce the input data to 269,837,688 rows, the same SQL statement completes successfully.
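
      Since the smaller input succeeds, one way to narrow this down might be to train on batches of groups rather than on the whole table. The sketch below only generates the SQL statements for such a run. It rests on several assumptions that are not confirmed anywhere in this report: that main_lnbts_id is a non-negative integer column, that four batches are small enough, and that the "_batch_N" table names are free to use (they are hypothetical). Because the bucketing column is one of the grouping columns, every group lands entirely in one batch, so the per-group fits should not change.

```python
def batched_linregr_statements(source, output_prefix, dep_var, indep_var,
                               grouping_cols, bucket_col, n_buckets):
    """Build (create_sql, train_sql) pairs, one pair per batch.

    bucket_col is assumed to be a non-negative integer column that is also
    one of the grouping columns, so no group is split across batches.
    The "_batch_N" table names are hypothetical.
    """
    statements = []
    for b in range(n_buckets):
        batch_table = f"{source}_batch_{b}"
        out_table = f"{output_prefix}_batch_{b}"
        # Copy the rows whose bucket column falls into bucket b.
        create_sql = (
            f"CREATE TABLE {batch_table} AS "
            f"SELECT * FROM {source} WHERE {bucket_col} % {n_buckets} = {b};"
        )
        # Same linregr_train call as in the report, pointed at the batch table.
        train_sql = (
            f"SELECT madlib.linregr_train('{batch_table}', '{out_table}', "
            f"'{dep_var}', '{indep_var}', '{grouping_cols}');"
        )
        statements.append((create_sql, train_sql))
    return statements


if __name__ == "__main__":
    pairs = batched_linregr_statements(
        source="xinos_plus_case_dlinterference_v2.temp_neighbor_pair_cqi_prb_nonull",
        output_prefix="xinos_plus_case_dlinterference_v2.taipei_lm_result_temp",
        dep_var="average_cqi",
        indep_var="array[1, prb_utilization]",
        grouping_cols="main_lnbts_id,main_lncel_id,lnbts_id,lncel_id",
        bucket_col="main_lnbts_id",  # assumed integer, see lead-in
        n_buckets=4,                 # arbitrary choice for illustration
    )
    for create_sql, train_sql in pairs:
        print(create_sql)
        print(train_sql)
```

      If every batch succeeds, that would suggest the failure depends on the total input size rather than on specific rows; if one batch fails, it could be split further to isolate the groups involved.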

      We are not sure whether this is a bug or a problem with how we are using the MADlib linear regression function, and we would greatly appreciate any pointers.

      We are happy to provide more information about the input data (e.g. the data schema) for further investigation if needed.

      Thank you very much for looking into this issue.

      David

          People

            Assignee: Unassigned
            Reporter: David Chen (david.chen@wavein.com.tw)