  1. Spark
  2. SPARK-10978

Allow PrunedFilterScan to eliminate predicates from further evaluation

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3.0, 1.4.0, 1.5.0
    • Fix Version/s: 1.6.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Currently, PrunedFilteredScan allows implementors to push down predicates to an underlying data source. This is done solely as an optimization, since the predicate is reapplied on the Spark side as well. This allows for Bloom-filter-like operations, but it forces a redundant second evaluation of the predicate for sources that can perform accurate pushdowns.

      In addition, it makes it difficult for underlying sources to accept queries that reference non-existent columns in order to provide ancillary functionality. In our case, we allow a Solr query to be passed in via a non-existent solr_query column. Since this column is not returned by the source, nothing passes when Spark reapplies the filter on "solr_query".
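
The failure described above can be reproduced in a small model. This is a minimal sketch, not Spark code: `Row`, `EqualTo`, `scan`, and `reapply` are simplified stand-ins invented for illustration, and the toy "search" just matches a substring of a `title` field.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
object ReapplySketch {
  type Row = Map[String, Any]

  // Stand-in for an equality filter pushed down to the source.
  final case class EqualTo(attribute: String, value: Any)

  // Hypothetical source: a filter on the virtual "solr_query" column is
  // treated as a search request. Returned rows do NOT carry that column.
  def scan(filters: Seq[EqualTo]): Seq[Row] = {
    val all: Seq[Row] = Seq(
      Map("id" -> 1, "title" -> "spark sql"),
      Map("id" -> 2, "title" -> "solr search")
    )
    filters.foldLeft(all) {
      case (rows, EqualTo("solr_query", q: String)) =>
        // Toy search: keep rows whose title contains the query string.
        rows.filter(_.get("title").exists(_.toString.contains(q)))
      case (rows, EqualTo(attr, v)) =>
        rows.filter(_.get(attr).contains(v))
    }
  }

  // Models the engine re-evaluating every pushed-down filter on the result.
  def reapply(rows: Seq[Row], filters: Seq[EqualTo]): Seq[Row] =
    rows.filter(r => filters.forall(f => r.get(f.attribute).contains(f.value)))

  val filters = Seq(EqualTo("solr_query", "solr"))
  val fromSource = scan(filters)                // the source finds the row with id 2
  val afterSpark = reapply(fromSource, filters) // empty: no row has a solr_query value
}
```

In this model the source answers the query correctly, but the second evaluation discards the result because the returned rows never contain a `solr_query` value.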

      Suggestion on the mailing list from Michael Armbrust:

      We have to try and maintain binary compatibility here, so probably the easiest thing to do would be to add a method to the class. Perhaps something like:

      def unhandledFilters(filters: Array[Filter]): Array[Filter] = filters

      By default, this could return all filters so behavior would remain the same, but specific implementations could override it. There is still a chance that this would conflict with existing methods, but hopefully that would not be a problem in practice.
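
To make the default-versus-override behavior of the suggested method concrete, here is a minimal sketch. Only the `unhandledFilters` signature comes from the suggestion above; `Relation`, `LegacyRelation`, `ExactEqualityRelation`, and the `Filter` case classes are hypothetical stand-ins, not Spark's own types.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
object UnhandledFiltersSketch {
  sealed trait Filter { def attribute: String }
  final case class EqualTo(attribute: String, value: Any) extends Filter
  final case class GreaterThan(attribute: String, value: Int) extends Filter

  // Stand-in for the relation base class carrying the proposed method.
  abstract class Relation {
    // Default: report every filter as unhandled, so all are reapplied
    // and existing implementations keep their current behavior.
    def unhandledFilters(filters: Array[Filter]): Array[Filter] = filters
  }

  // An existing implementation that does not override anything.
  class LegacyRelation extends Relation

  // Hypothetical source that evaluates equality filters exactly, so only
  // the remaining filters need to be re-evaluated by the engine.
  class ExactEqualityRelation extends Relation {
    override def unhandledFilters(filters: Array[Filter]): Array[Filter] =
      filters.filterNot(_.isInstanceOf[EqualTo])
  }

  val pushed: Array[Filter] =
    Array(EqualTo("solr_query", "solr"), GreaterThan("id", 1))

  val legacyRemaining = new LegacyRelation().unhandledFilters(pushed).toList
  val exactRemaining = new ExactEqualityRelation().unhandledFilters(pushed).toList
}
```

Under these assumptions, `legacyRemaining` still contains both filters, while `exactRemaining` contains only the `GreaterThan` filter, so an engine honoring the method would skip re-evaluating the `solr_query` predicate.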

              People

              • Assignee: Cheng Lian (lian cheng)
              • Reporter: Russell Spitzer (rspitzer)
              • Votes: 1
              • Watchers: 10
