Uploaded image for project: 'Apache AsterixDB'
  1. Apache AsterixDB
  2. ASTERIXDB-2481

Out of Memory error doing aggregation - need a bound

    XMLWordPrintableJSON

Details

    Description

      This is the schema:

      CREATE TYPE Test AS open { unique2: int64 };
      
      CREATE DATASET wisconsin_5gb(Test) PRIMARY KEY unique2;
      

      This is the query:

      SELECT
          min(t.oddOnePercent) as min, 
          max(t.oddOnePercent) as max, 
          count(distinct t.oddOnePercent) as cnt
      FROM wisconsin_5gb t;
      

      The plan for this query:

      distribute result [$$46]
      -- DISTRIBUTE_RESULT  |UNPARTITIONED|
        exchange
        -- ONE_TO_ONE_EXCHANGE  |UNPARTITIONED|
          project ([$$46])
          -- STREAM_PROJECT  |UNPARTITIONED|
            assign [$$46] <- [{"min": $$48, "max": $$49, "cnt": $$50}]
            -- ASSIGN  |UNPARTITIONED|
              project ([$$48, $$49, $$50])
              -- STREAM_PROJECT  |UNPARTITIONED|
                subplan {
                          aggregate [$$50] <- [agg-sql-sum($$53)]
                          -- AGGREGATE  |LOCAL|
                            aggregate [$$53] <- [agg-sql-count($$43)]
                            -- AGGREGATE  |LOCAL|
                              distinct ([$$43])
                              -- MICRO_PRE_SORTED_DISTINCT_BY  |LOCAL|
                                order (ASC, $$43) 
                                -- IN_MEMORY_STABLE_SORT [$$43(ASC)]  |LOCAL|
                                  assign [$$43] <- [$$52.getField("oddOnePercent")]
                                  -- ASSIGN  |UNPARTITIONED|
                                    assign [$$52] <- [$#4.getField(0)]
                                    -- ASSIGN  |UNPARTITIONED|
                                      unnest $#4 <- scan-collection($$28)
                                      -- UNNEST  |UNPARTITIONED|
                                        nested tuple source
                                        -- NESTED_TUPLE_SOURCE  |UNPARTITIONED|
                       }
                -- SUBPLAN  |UNPARTITIONED|
                  aggregate [$$28, $$48, $$49] <- [listify($$27), agg-sql-min($$33), agg-sql-max($$33)]
                  -- AGGREGATE  |UNPARTITIONED|
                    exchange
                    -- RANDOM_MERGE_EXCHANGE  |PARTITIONED|
                      project ([$$27, $$33])
                      -- STREAM_PROJECT  |PARTITIONED|
                        assign [$$33, $$27] <- [$$t.getField("oddOnePercent"), {"t": $$t}]
                        -- ASSIGN  |PARTITIONED|
                          project ([$$t])
                          -- STREAM_PROJECT  |PARTITIONED|
                            exchange
                            -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                              data-scan []<-[$$47, $$t] <- Default.wisconsin_5gb
                              -- DATASOURCE_SCAN  |PARTITIONED|
                                exchange
                                -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                                  empty-tuple-source
                                  -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
      

      Attachments

        1. nc-1.log
          26 kB
          Gift Sinthong
        2. cc.log
          36 kB
          Gift Sinthong
        3. Screen Shot 2018-11-14 at 3.12.31 PM.png
          133 kB
          Gift Sinthong

        Issue Links

          Activity

            People

              alsuliman Ali Alsuliman
              psinthon Gift Sinthong
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: