Apache MADlib / MADLIB-1274

Association rules error on output schema

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Bug
    • None
    • v1.15.1
    • None

    Description

      Error observed on:

      • Postgres 9.6
      • Greenplum Database 5.9.0
        This is a small single-node Greenplum instance on AWS: 4 segments on a machine with 8 vCPUs and plenty of available memory.
        [gpadmin@ip-172-21-0-246 RetailDemo]$ cat /proc/meminfo
        MemTotal: 62711428 kB
        MemFree: 59786076 kB
        MemAvailable: 60281836 kB

      Load data

      DROP TABLE IF EXISTS order_items;
      CREATE TABLE order_items(  itemid INTEGER,
                                 orderid INTEGER,
                                 productid INTEGER,
                                 quantity INTEGER,
                                 productname TEXT);                        
      INSERT INTO order_items VALUES
      (      5 ,    1044 ,         9 ,        3 , 'Kirby cukes'),
      (     11 ,      37 ,         2 ,        3 , 'Ooopsi Cola'),
      (     12 ,      37 ,        21 ,        3 , 'black radish'),
      (     15 ,      37 ,        49 ,        3 , 'Leg of lamb'),
      (     18 ,      37 ,        37 ,        3 , 'Uggo Waffles'),
      (     20 ,      37 ,        76 ,        3 , 'Happy Valley White Peaches'),
      (     21 ,      37 ,        29 ,        3 , 'Breakstone Whole Milk Cottage Cheese'),
      (     22 ,      37 ,        25 ,        3 , 'ugli fruit'),
      (      4 ,    1044 ,        44 ,        3 , 'ground beef'),
      (      6 ,    1044 ,        17 ,        3 , 'napa'),
      (      9 ,    1044 ,        10 ,        3 , 'dill'),
      (     13 ,      37 ,        21 ,        3 , 'black radish'),
      (     24 ,      37 ,        47 ,        3 , 'Ball Park Franks'),
      (     25 ,      37 ,        69 ,        3 , 'Ball Park Mustard'),
      (     26 ,      37 ,        64 ,        3 , 'Ballpark Hot Dog Rolls'),
      (     27 ,    1044 ,        47 ,        3 , 'Ball Park Franks'),
      (     28 ,    1044 ,        69 ,        3 , 'Ball Park Mustard'),
      (     29 ,    1044 ,        64 ,        3 , 'Ballpark Hot Dog Rolls'),
      (     30 ,    1044 ,        70 ,        3 , 'Homer''s Strawberry Jam'),
      (     31 ,    1044 ,        71 ,        3 , 'Mr Peanut Peanut Butter'),
      (     32 ,      37 ,        71 ,        3 , 'Mr Peanut Peanut Butter'),
      (     33 ,      37 ,        70 ,        3 , 'Homer''s Strawberry Jam'),
      (      1 ,    1044 ,         1 ,        3 , 'Pivotal Apple Juice'),
      (      3 ,    1044 ,        77 ,        3 , 'Pivotal Baked Beans'),
      (     14 ,      37 ,        53 ,        3 , 'Old Zurich Swiss Cheese'),
      (     17 ,      37 ,        49 ,        3 , 'Leg of lamb'),
      (     19 ,      37 ,        18 ,        3 , 'california navels'),
      (      2 ,    1044 ,        41 ,        3 , '12" Dinner Plates'),
      (      7 ,    1044 ,        32 ,        3 , 'Vermot Extra Sharp Cheddar'),
      (      8 ,    1044 ,        71 ,        3 , 'Mr Peanut Peanut Butter'),
      (     10 ,    1044 ,        39 ,        3 , 'Pivotal Soft and Smooth 24 pack'),
      (     16 ,      37 ,        22 ,        3 , 'triple wahsed spinach'),
      (     23 ,      37 ,        61 ,        3 , 'Brooklyn Bagel 6 pack');
      

      (1)
      XXX
      This one is not an error. The call just runs for a long time because a very large number of rules is generated when the itemset size is not capped by the `max_itemset_size` parameter. See the later comment of 9/17/18.
      XXX
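
      As a rough illustration of why the uncapped run takes so long, the query below counts the distinct products per order and the number of non-empty itemsets each order can contribute (2^n - 1 for n distinct items). It is only a sketch in plain SQL against the `order_items` table above, not part of MADlib. With just two orders in the data, every itemset contained in one order has support of at least 0.5, so all of them clear the 0.25 threshold used in the call below.

      ```sql
      -- Upper bound on candidate itemsets per order: 2^n - 1 for n distinct items.
      -- The alias max_itemsets is illustrative only.
      SELECT orderid,
             count(DISTINCT productid) AS n_items,
             power(2::numeric, count(DISTINCT productid)::numeric) - 1 AS max_itemsets
      FROM order_items
      GROUP BY orderid
      ORDER BY orderid;
      ```

      For the data loaded above, orders 37 and 1044 contain 16 and 14 distinct products, which gives bounds of 65535 and 16383 itemsets before any rules are derived from them.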

      Run assoc rules:

      SELECT * FROM madlib.assoc_rules( .25,
                                        .5,
                                        'orderid',
                                        'productid',
                                        'order_items',
                                        NULL,
                                        TRUE
                                      );
      

      This call does not return.
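
      A capped variant of the same call might look like the sketch below. It assumes `max_itemset_size` is accepted as the eighth positional argument, as described in the MADlib documentation for this release. The cap of 2 is an arbitrary illustrative value, not a recommendation.

      ```sql
      -- Same call as above, with the itemset size capped.
      -- Assumption: the 8th positional argument is max_itemset_size.
      SELECT * FROM madlib.assoc_rules( .25,          -- Support
                                        .5,           -- Confidence
                                        'orderid',    -- Transaction id col
                                        'productid',  -- Product col
                                        'order_items',-- Input data
                                        NULL,         -- Output schema (NULL = current schema)
                                        TRUE,         -- Verbose
                                        2             -- max_itemset_size (illustrative value)
                                      );
      ```

      With the cap in place, only itemsets of at most two products are considered, so the number of candidate rules stays small for this data set.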

      (2)
      Run assoc rules with an output table specified:

      SELECT * FROM madlib.assoc_rules(.10,                  -- Support
                                       .10,                  -- Confidence
                                       'orderid',            -- Transaction id col
                                       'productname',        -- Product col
                                       'order_items',        -- Input data
                                       'pivotalmarkets',     -- Output data
                                       TRUE);                -- Verbose
      
      

      results in error:

      InternalError: (psycopg2.InternalError) plpy.Error: the output schema does not exist
      CONTEXT:  Traceback (most recent call last):
        PL/Python function "assoc_rules", line 31, in <module>
          'NULL'
        PL/Python function "assoc_rules", line 107, in assoc_rules
        PL/Python function "assoc_rules", line 21, in __assert
      PL/Python function "assoc_rules"
       [SQL: "SELECT * FROM madlib.assoc_rules(.10,                  -- Support\n                                 .10,                  -- Confidence\n                                 'orderid',            -- Transaction id col\n                                 'productname',        -- Product col\n                                 'order_items',        -- Input data\n                                 'pivotalmarkets',     -- Output data\n                                 TRUE);                -- Verbose"]
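
      The message "the output schema does not exist" suggests that the sixth argument is interpreted as a schema name rather than a table name, which would be consistent with the "Not A Bug" resolution. The sketch below assumes this, and also assumes that the results are written to a table named `assoc_rules` inside that schema, as the MADlib documentation describes. The `max_itemset_size` value of 2 is illustrative and is included only so the call finishes quickly on this data.

      ```sql
      -- Assumption: the 6th argument names a schema, which must exist beforehand.
      CREATE SCHEMA pivotalmarkets;

      SELECT * FROM madlib.assoc_rules(.10,                  -- Support
                                       .10,                  -- Confidence
                                       'orderid',            -- Transaction id col
                                       'productname',        -- Product col
                                       'order_items',        -- Input data
                                       'pivotalmarkets',     -- Output schema
                                       TRUE,                 -- Verbose
                                       2);                   -- max_itemset_size (illustrative)

      -- Assumption: the rules land in <output schema>.assoc_rules.
      SELECT * FROM pivotalmarkets.assoc_rules
      ORDER BY support DESC, confidence DESC;
      ```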
      

            People

              Assignee: Unassigned
              Reporter: Frank McQuillan (fmcquillan)