Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-6203

Avg operator test in SparkAdapterTest

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 1.36.0
    • 1.38.0
    • spark
    • None

    Description

       

      I test the avg operator in SparkAdapterTest

      @Test void testGroupBy() {
        final String sql = "select sum(x) as SUM_X, min(y) as MIN_Y, max(y) as MAX_Y, avg(x) as AVG_X,"
            + "count(*) as CNT_Y, count(distinct y) as CNT_DIST_Y\n"
            + "from " + VALUES2 + "\n"
            + "group by x";
      
        final String plan = "PLAN="
            + "EnumerableCalc(expr#0..5=[{inputs}], expr#6=[CAST($t1):INTEGER NOT NULL], expr#7=[CAST($t2):CHAR(1) NOT NULL], expr#8=[CAST($t3):CHAR(1) NOT NULL], expr#9=[CAST($t4):BIGINT NOT NULL], SUM_X=[$t6], MIN_Y=[$t7], MAX_Y=[$t8], AVG_X=[$t0], CNT_Y=[$t9], CNT_DIST_Y=[$t5])\n"
            + "  EnumerableAggregate(group=[{0}], SUM_X=[MIN($2) FILTER $7], MIN_Y=[MIN($3) FILTER $7], MAX_Y=[MIN($4) FILTER $7], CNT_Y=[MIN($5) FILTER $7], CNT_DIST_Y=[COUNT($1) FILTER $6])\n"
            + "    EnumerableCalc(expr#0..6=[{inputs}], expr#7=[0], expr#8=[=($t6, $t7)], expr#9=[1], expr#10=[=($t6, $t9)], proj#0..5=[{exprs}], $g_0=[$t8], $g_1=[$t10])\n"
            + "      EnumerableAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}]], SUM_X=[$SUM0($0)], MIN_Y=[MIN($1)], MAX_Y=[MAX($1)], CNT_Y=[COUNT()], $g=[GROUPING($0, $1)])\n"
            + "        EnumerableValues(tuples=[[{ 1, 'a' }, { 2, 'b' }, { 1, 'b' }, { 2, 'c' }, { 2, 'c' }]])\n";
      
        final String[] expectedResult = {
            "SUM_X=2; MIN_Y=a; MAX_Y=b; AVG_X=1; CNT_Y=2; CNT_DIST_Y=2",
            "SUM_X=6; MIN_Y=b; MAX_Y=c; AVG_X=2; CNT_Y=3; CNT_DIST_Y=2"
        };
      
        sql(sql).returnsUnordered(expectedResult)
            .explainContains(plan);
      } 

       

      It can be found that there is no corresponding avg operator in EnumerableAggregate and EnumerableCalc. Why is this? Spark sql will split the avg operator into sum/count form. I don’t know if this is a bug.

       

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            caicancai Caican Cai
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: