IMPALA / IMPALA-6625

Skip dictionary and collection conjunct assignment for non-Parquet scans.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
    • Fix Version/s: Impala 2.13.0, Impala 3.1.0
    • Component/s: Frontend

      Description

      In HdfsScanNode.init() we try to assign dictionary and collection conjuncts even for non-Parquet scans. Such predicates only make sense for Parquet scans, so there is no point in collecting them for other scans.

      The current behavior is undesirable because:

      • init() can be substantially slower, because assigning dictionary filters may involve evaluating exprs in the backend (BE), which can be expensive
      • the explain plan of non-Parquet scans may have a section "parquet dictionary predicates" which is confusing/misleading

      Relevant code snippet from HdfsScanNode:

      @Override
      public void init(Analyzer analyzer) throws ImpalaException {
        conjuncts_ = orderConjunctsByCost(conjuncts_);
        checkForSupportedFileFormats();

        assignCollectionConjuncts(analyzer);
        computeDictionaryFilterConjuncts(analyzer);

        // compute scan range locations with optional sampling
        Set<HdfsFileFormat> fileFormats = computeScanRangeLocations(analyzer);
        ...
        if (fileFormats.contains(HdfsFileFormat.PARQUET)) {
          // <--- assignment should go in here
          computeMinMaxTupleAndConjuncts(analyzer);
        }
        ...
      }
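
      The reordering suggested above can be illustrated with a small, self-contained sketch. It is not the actual Impala patch: ScanInitSketch, FileFormat and stepsRun are hypothetical stand-ins, and the planner steps are only recorded by name rather than executed. The sketch assumes the set of file formats is known once the scan range locations have been computed, as in the snippet above, and that all three Parquet-specific steps can move behind that check.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class ScanInitSketch {
  // Hypothetical stand-in for Impala's HdfsFileFormat enum.
  enum FileFormat { PARQUET, TEXT, AVRO }

  // Names of the planning steps that ran, in order. A real scan node
  // would do the work instead of recording a name.
  final List<String> stepsRun = new ArrayList<>();

  // Sketch of the proposed init() shape: the Parquet-only steps are
  // skipped entirely unless at least one scanned file is Parquet.
  void init(Set<FileFormat> fileFormats) {
    stepsRun.add("computeScanRangeLocations");
    if (fileFormats.contains(FileFormat.PARQUET)) {
      stepsRun.add("assignCollectionConjuncts");
      stepsRun.add("computeDictionaryFilterConjuncts");
      stepsRun.add("computeMinMaxTupleAndConjuncts");
    }
  }

  public static void main(String[] args) {
    ScanInitSketch textScan = new ScanInitSketch();
    textScan.init(EnumSet.of(FileFormat.TEXT));
    System.out.println(textScan.stepsRun);
    // prints [computeScanRangeLocations]

    ScanInitSketch mixedScan = new ScanInitSketch();
    mixedScan.init(EnumSet.of(FileFormat.TEXT, FileFormat.PARQUET));
    System.out.println(mixedScan.stepsRun.size());
    // prints 4
  }
}
```

      With this shape a text-only scan never pays for dictionary filter evaluation, and it would have no dictionary predicates to show in its explain plan.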
      

              People

              • Assignee: Pooja Nilangekar (poojanilangekar)
              • Reporter: Alexander Behm (alex.behm)
              • Votes: 0
              • Watchers: 4
