  1. Apache Drill
  2. DRILL-1362

Count(nullable-column) is incorrectly pushed into group scan operator

Details

    Description

      The following query on the TPC-DS table web_returns produces a wrong result: the aggregate count(wr_return_quantity) gets pushed into the Parquet group scan operator even though wr_return_quantity is nullable, and the Parquet metadata apparently does not have stats for the nullable column. The count returned is therefore the total row count rather than the number of non-null values.

      0: jdbc:drill:zk=local> select count(wr_return_quantity) from web_returns;
      ------------
      EXPR$0
      ------------
      71763
      ------------

      0: jdbc:drill:zk=local> explain plan for select count(wr_return_quantity) from web_returns;

      +------------+------------+
      |    text    |    json    |
      +------------+------------+
      | 00-00    Screen
      00-01      Project(EXPR$0=[$0])
      00-02        Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@e4acaad])
      

      For reference, here are the correct results:
      tpcds=# select count(wr_return_quantity) from web_returns;
      count
      -------
      68616
      (1 row)

      tpcds=# select count(*) from web_returns;
      count
      -------
      71763
      (1 row)
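
The two reference counts above imply 71763 - 68616 = 3147 NULL values in wr_return_quantity, which is exactly the amount by which the pushed-down count is too high. The guard that seems to be missing can be sketched as follows. This is a minimal illustration under stated assumptions, not Drill's planner code: the class `CountPushdownSketch`, the method `countFromMetadata`, and its parameters are hypothetical names invented for this example.

```java
import java.util.OptionalLong;

public class CountPushdownSketch {

    /**
     * Hypothetical helper: returns COUNT(column) when it can be derived from
     * file metadata alone, or empty when the column data must be scanned.
     *
     * @param rowCount  total number of rows recorded in the metadata
     * @param nullable  whether the column may contain NULLs
     * @param nullCount number of NULLs, empty if the metadata has no such stat
     */
    static OptionalLong countFromMetadata(long rowCount, boolean nullable,
                                          OptionalLong nullCount) {
        if (!nullable) {
            // No NULLs are possible, so COUNT(column) equals COUNT(*).
            return OptionalLong.of(rowCount);
        }
        if (nullCount.isPresent()) {
            // COUNT(column) ignores NULLs, so subtract them from the row count.
            return OptionalLong.of(rowCount - nullCount.getAsLong());
        }
        // Nullable column without null statistics: the count cannot be
        // answered from metadata, so the aggregate must not be pushed down.
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        long rows = 71763L;          // COUNT(*) from the reference results
        long nulls = rows - 68616L;  // NULLs implied by the reference results

        // No null statistics available: fall back to scanning the column.
        System.out.println(countFromMetadata(rows, true, OptionalLong.empty()));

        // Null statistics available: the count can be computed from metadata.
        System.out.println(countFromMetadata(rows, true, OptionalLong.of(nulls)));
    }
}
```

With the numbers from this issue, the first call yields an empty result (the count has to come from a real scan) and the second yields 68616, matching the reference output.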

            People

              Assignee: Unassigned
              Reporter: Aman Sinha (amansinha100)