Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-38852

Better Data Source V2 operator pushdown framework

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 3.3.0
    • None
    • SQL
    • None

    Description

      Currently, Spark supports push down Filters and Aggregates to data source.
      However, the Data Source V2 operator pushdown framework has the following shortcomings:

      1. Only simple filter and aggregate are supported, which makes it impossible to apply in most scenarios
      2. The incompatibility of SQL syntax makes it impossible to apply in most scenarios
      3. Aggregate push down does not support multiple partitions of data sources
      4. Spark's additional aggregate will cause some overhead
      5. Limit push down is not supported
      6. Top n push down is not supported
      7. Aggregate push down does not support group by expressions
      8. Aggregate push down does not support not use aggregate functions
      9. Offset push down is not supported
      10. Paging push down is not supported
      11. UDF/UDAF push down is not supported

      Attachments

        Issue Links

          1.
          pushDownPredicate=false should prevent push down filters to JDBC data source Sub-task Resolved Jiaan Geng
          2.
          Improve the implement of aggregate pushdown. Sub-task Resolved Jiaan Geng
          3.
          Move compileAggregates from JDBCRDD to JdbcDialect Sub-task Resolved Jiaan Geng
          4.
          Support push down top N to JDBC data source V2 Sub-task Resolved Jiaan Geng
          5.
          Support datasource v2 complete aggregate pushdown Sub-task Resolved Jiaan Geng
          6.
          Improve the implement of JDBCV2Suite Sub-task Resolved Jiaan Geng
          7.
          Translate more standard aggregate functions for pushdown Sub-task Resolved Jiaan Geng
          8.
          Upgrade h2 from 1.4.195 to 2.0.202 Sub-task Resolved Jiaan Geng
          9.
          DS V2 supports partial aggregate push-down AVG Sub-task Resolved Jiaan Geng
          10.
          Compile aggregate functions of build-in JDBC dialect Sub-task Resolved Jiaan Geng
          11.
          A new framework to represent catalyst expressions in DS v2 APIs Sub-task Resolved Jiaan Geng
          12.
          Reactor framework so as JDBC dialect could compile expression by self way Sub-task Resolved Jiaan Geng
          13.
          Add factory method getConnection into JDBCDialect. Sub-task Resolved Jiaan Geng
          14.
          If `Sum`, `Count`, `Any` accompany distinct, cannot do partial agg push down. Sub-task Resolved Jiaan Geng
          15.
          Refactor framework so as JDBC dialect could compile filter by self way Sub-task Resolved Jiaan Geng
          16.
          DS V2 aggregate push-down supports project with alias Sub-task Resolved Jiaan Geng
          17.
          DS V2 topN push-down supports project with alias Sub-task Resolved Jiaan Geng
          18.
          Datasource v2 supports partial topN push-down Sub-task Resolved Jiaan Geng
          19.
          Support push down Cast to JDBC data source V2 Sub-task Resolved Jiaan Geng
          20.
          If limit could pushed down and Data source only have one partition, DS V2 should not do limit again Sub-task Resolved Apache Spark
          21.
          DS V2 supports push down misc non-aggregate functions Sub-task Resolved Jiaan Geng
          22.
          DS V2 supports push down math functions Sub-task Resolved Jiaan Geng
          23.
          Update document of JDBC options for pushDownAggregate and pushDownLimit Sub-task Resolved Jiaan Geng
          24.
          DS V2 supports push down string functions Sub-task Resolved Zhixiong Chen
          25.
          DS V2 supports push down collection functions Sub-task Open Unassigned
          26.
          DS V2 supports push down misc functions Sub-task Resolved Zhixiong Chen
          27.
          DS V2 supports push down OFFSET operator Sub-task Resolved Jiaan Geng
          28.
          DS V2 aggregate push-down supports group by expressions Sub-task Resolved Jiaan Geng
          29.
          DS V2 supports push down datetime functions Sub-task Resolved Jiaan Geng
          30.
          DS V2 Top N push-down supports order by expressions Sub-task Resolved Jiaan Geng
          31.
          DS V2 Limit push-down should avoid out of memory Sub-task Resolved Unassigned
          32.
          DS V2 aggregate partial push-down should supports group by without aggregate functions Sub-task Resolved Jiaan Geng
          33.
          DS V2 supports push down DS V2 UDF Sub-task Resolved Jiaan Geng
          34.
          DS V2 aggregate push down can work with OFFSET or LIMIT Sub-task Resolved Jiaan Geng
          35.
          H2Dialect should override getJDBCType so as make the data type is correct Sub-task Resolved Jiaan Geng
          36.
          Jdbc dialect should decide which function could be pushed down. Sub-task Resolved Jiaan Geng
          37.
          JDBC dialect supports registering dialect specific functions Sub-task Resolved Jiaan Geng
          38.
          Compile build-in linear regression aggregate functions for JDBC dialect Sub-task Resolved Jiaan Geng
          39.
          Translate linear regression aggregate functions for pushdown Sub-task Resolved Jiaan Geng
          40.
          Capitalize sql keywords in JDBCV2Suite Sub-task Resolved Jiaan Geng
          41.
          DS V2 supports push down misc non-aggregate functions(non ANSI) Sub-task Resolved Jiaan Geng
          42.
          DS V2 supports push down math functions(non ANSI) Sub-task Resolved Jiaan Geng
          43.
          DS V2 pushdown should unify the compile API Sub-task Resolved Jiaan Geng
          44.
          Update document of JDBC options for pushDownOffset Sub-task Resolved Jiaan Geng
          45.
          DS V2 aggregate push down can work with Top N or Paging (Sort with group expressions) Sub-task Resolved Jiaan Geng
          46.
          Simplify V2ExpressionBuilder by extract common method. Sub-task Resolved Jiaan Geng
          47.
          Translate common used aggregate functions for pushdown (non ANSI) Sub-task Open Unassigned
          48.
          Organize the check of push down information for JDBCV2Suite Sub-task Resolved miracle
          49.
          DS V2 supports push down string functions(non ANSI) Sub-task Resolved Unassigned
          50.
          DS V2 push-down translate Cast if the cast is safe Sub-task Resolved Jiaan Geng
          51.
          DS V2 pushdown should unify the translate path Sub-task Resolved Jiaan Geng
          52.
          DS V2 expressions should have the default toString Sub-task Resolved Apache Spark
          53.
          Extract the function that construct the select statement for JDBC dialect. Sub-task Resolved Jiaan Geng
          54.
          DS V2 pushdown supports supports JDBC dialects compile `SortOrder` by themselves Sub-task Resolved Jiaan Geng
          55.
          DS V2 pushdown could let JDBC dialect decide to push down offset and limit Sub-task Resolved Jiaan Geng
          56.
          The built-in dialects support OFFSET and paging query. Sub-task Resolved Unassigned
          57.
          Fix the bug that pushdown offset or paging is invalid for some built-in dialect Sub-task Resolved Unassigned
          58.
          Change the default value of JDBC options about push down to true Sub-task Resolved Jiaan Geng
          59.
          Abstract the excluded method for better test for JDBC docker tests. Sub-task Resolved Jiaan Geng
          60.
          Improve the hashCode for Some DS V2 Expression Sub-task Resolved Jiaan Geng
          61.
          DS V2 supports push down Mode Sub-task Resolved Jiaan Geng
          62.
          Escape the single quote, _ and % for DS V2 pushdown Sub-task Resolved Jiaan Geng
          63.
          DS V2 supports push down PERCENTILE_CONT and PERCENTILE_DISC Sub-task Resolved Jiaan Geng

          Activity

            People

              Unassigned Unassigned
              beliefer Jiaan Geng
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated: