IMPALA-3014

Hdfs scan node should wait for all filters, not just partition filters

Details

    Description

      HdfsScanNode::WaitForPartitionFilters() should not wait only for partition filters, because row-based filtering is often very valuable as well.

      However, we should also avoid waiting for filters that we can't apply (e.g. if row filtering is turned off, or if the file format is not Parquet). The best approach is probably to wait separately, in the Parquet column reader, since that gives row filters the maximum time to arrive and doesn't affect waiting for partition filters earlier in the scanner's pipeline.
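
      The two-stage wait proposed above could look roughly like the following. This is only an illustrative sketch, not Impala's actual code: RuntimeFilter, FileFormat, CanApplyRowFilter(), WaitForRowFilters() and the free-function form of WaitForPartitionFilters() are all hypothetical stand-ins, and the timeout handling is an assumption.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <vector>

// Hypothetical stand-ins for illustration; not Impala's real classes.
enum class FileFormat { kParquet, kText, kAvro };

class RuntimeFilter {
 public:
  explicit RuntimeFilter(bool is_partition_filter)
      : is_partition_filter_(is_partition_filter) {}

  bool is_partition_filter() const { return is_partition_filter_; }

  // Called by whichever thread delivers the filter.
  void MarkArrived() {
    {
      std::lock_guard<std::mutex> l(lock_);
      arrived_ = true;
    }
    arrival_cv_.notify_all();
  }

  // Blocks until the filter arrives or the deadline passes.
  // Returns true if the filter has arrived.
  bool WaitUntil(std::chrono::steady_clock::time_point deadline) {
    std::unique_lock<std::mutex> l(lock_);
    return arrival_cv_.wait_until(l, deadline, [this] { return arrived_; });
  }

 private:
  const bool is_partition_filter_;
  std::mutex lock_;
  std::condition_variable arrival_cv_;
  bool arrived_ = false;
};

// A row filter is only worth waiting for if the scanner can apply it.
bool CanApplyRowFilter(bool row_filtering_enabled, FileFormat format) {
  return row_filtering_enabled && format == FileFormat::kParquet;
}

// Waits for filters of one kind against a single shared deadline.
// Returns how many of them arrived in time.
int WaitForFilters(const std::vector<RuntimeFilter*>& filters,
                   bool partition_filters, std::chrono::milliseconds timeout) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  int arrived = 0;
  for (RuntimeFilter* f : filters) {
    if (f->is_partition_filter() != partition_filters) continue;
    if (f->WaitUntil(deadline)) ++arrived;
  }
  return arrived;
}

// Stage 1: early in the scan node, wait for partition filters only.
int WaitForPartitionFilters(const std::vector<RuntimeFilter*>& filters,
                            std::chrono::milliseconds timeout) {
  return WaitForFilters(filters, /*partition_filters=*/true, timeout);
}

// Stage 2: later, in the column reader, wait for row filters, but skip
// the wait entirely when the filters could not be applied anyway.
int WaitForRowFilters(const std::vector<RuntimeFilter*>& filters,
                      bool row_filtering_enabled, FileFormat format,
                      std::chrono::milliseconds timeout) {
  if (!CanApplyRowFilter(row_filtering_enabled, format)) return 0;
  return WaitForFilters(filters, /*partition_filters=*/false, timeout);
}
```

      Keeping the second wait next to the reader means a non-Parquet scan or a scan with row filtering disabled never blocks on row filters, while partition pruning still happens before any files are opened.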

          People

            henryr Henry Robinson