Spark / SPARK-35042

Support traversal pruning in transform/resolve functions and their call sites

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: Optimizer
    • Labels: None

    Description

      Transform/resolve functions are called ~280k times per TPC-DS query on average, which is far more than necessary. We can reduce those calls with early-exit information and conditions. This doc has some evaluation numbers from a prototype.
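
The early-exit idea can be illustrated with a small, self-contained sketch. It is not Catalyst code: the `Expr` hierarchy, the `Patterns` ids, the `Stats` counter, and this toy `transformWithPruning` are all hypothetical stand-ins, assuming each node caches a bit set of the patterns found in its subtree and a rule supplies a condition over those bits.

```scala
import scala.collection.immutable.BitSet

// Hypothetical pattern ids for this sketch only.
object Patterns {
  val LITERAL = 0
  val ATTRIBUTE = 1
  val ADD = 2
}

// Counts how many nodes a traversal actually visits.
object Stats { var visits = 0 }

sealed trait Expr {
  def children: Seq[Expr]
  def nodePattern: Int
  def withChildren(newChildren: Seq[Expr]): Expr

  // Patterns present anywhere in this subtree, computed once and cached.
  lazy val patternBits: BitSet =
    children.foldLeft(BitSet(nodePattern))(_ | _.patternBits)

  // Top-down transform that returns early when `cond` rejects the subtree.
  def transformWithPruning(cond: BitSet => Boolean)(
      rule: PartialFunction[Expr, Expr]): Expr =
    if (!cond(patternBits)) this
    else {
      Stats.visits += 1
      val afterRule = rule.applyOrElse(this, (e: Expr) => e)
      afterRule.withChildren(
        afterRule.children.map(_.transformWithPruning(cond)(rule)))
    }
}

case class Lit(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def nodePattern: Int = Patterns.LITERAL
  def withChildren(newChildren: Seq[Expr]): Expr = this
}

case class Attr(name: String) extends Expr {
  def children: Seq[Expr] = Nil
  def nodePattern: Int = Patterns.ATTRIBUTE
  def withChildren(newChildren: Seq[Expr]): Expr = this
}

case class Add(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
  def nodePattern: Int = Patterns.ADD
  def withChildren(newChildren: Seq[Expr]): Expr =
    Add(newChildren(0), newChildren(1))
}

object PruningDemo {
  // Rule: x + 0 => x. It can only fire in subtrees that contain a literal.
  val removeAddZero: PartialFunction[Expr, Expr] = {
    case Add(e, Lit(0)) => e
  }

  val tree: Expr = Add(Add(Attr("a"), Attr("b")), Add(Attr("c"), Lit(0)))

  // Returns the rewritten tree and the number of nodes visited.
  def run(prune: Boolean): (Expr, Int) = {
    Stats.visits = 0
    val cond: BitSet => Boolean =
      if (prune) (bits: BitSet) => bits.contains(Patterns.LITERAL)
      else (_: BitSet) => true
    (tree.transformWithPruning(cond)(removeAddZero), Stats.visits)
  }

  def main(args: Array[String]): Unit = {
    println(run(prune = false))
    println(run(prune = true))
  }
}
```

In this toy tree both runs produce the same rewritten expression, but the unpruned traversal visits 5 nodes while the pruned one visits 2, because the literal-free subtree `a + b` is skipped without being entered.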

      Sub-Tasks

        1. Support traversal pruning in the transform function family [Sub-task, Resolved, Yingyi Bu]
        2. Support traversal pruning in resolve functions in AnalysisHelper [Sub-task, Resolved, Yingyi Bu]
        3. Use static treePatternBitSet for Leaf expressions like AttributeReference and Literal [Sub-task, Resolved, Yingyi Bu]
        4. Fix BitSet.union [Sub-task, Resolved, Unassigned]
        5. Migrate to transformWithPruning or resolveWithPruning for subquery related rules [Sub-task, Resolved, Yingyi Bu]
        6. Migrate to transformWithPruning for leftover optimizer rules [Sub-task, Resolved, Yingyi Bu]
        7. Migrate to transformWithPruning or resolveWithPruning for expression rules [Sub-task, Resolved, Yingyi Bu]
        8. Migrate to transformWithPruning or resolveWithPruning for object rules [Sub-task, Resolved, Yingyi Bu]
        9. Migrate to transformWithPruning or resolveWithPruning for rules in finishAnalysis [Sub-task, Resolved, Yingyi Bu]
        10. Migrate to resolveWithPruning for two command rules [Sub-task, Resolved, Unassigned]
        11. Support traversal pruning in transformUpWithNewOutput [Sub-task, Open, Unassigned]
        12. Add rule id to all Analyzer rules in fixed point batches [Sub-task, Resolved, Yingyi Bu]
        13. Migrate to transformWithPruning for top-level rules under catalyst/optimizer [Sub-task, Resolved, Yingyi Bu]
        14. Migrate to transformWithPruning for rules in optimizer/Optimizer.scala [Sub-task, Resolved, Yingyi Bu]
        15. Migrate transformAllExpressions callsites to transformAllExpressionsWithPruning [Sub-task, Resolved, Apache Spark]
        16. Add tree pattern pruning into Analyzer rules [Sub-task, Resolved, Yingyi Bu]
        17. Add rule id pruning to the TypeCoercion rule [Sub-task, Resolved, Yingyi Bu]
        18. Support traversal pruning in extendedResolutionRules and postHocResolutionRules [Sub-task, In Progress, Unassigned]
        19. Add tree pattern pruning to CTESubstitution rule [Sub-task, Resolved, Josh Rosen]
        20. Identify aggregation expression in the nodePatterns of PythonUDF [Sub-task, Resolved, Gengliang Wang]
        21. Add a linter rule to enforce transforming with pruning [Sub-task, Open, Unassigned]
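
Several sub-tasks above refer to rule ids in fixed point batches. The sketch below shows one plausible reading of that idea, under the assumption that a node can remember which rules left its subtree unchanged, so that later passes of the same rule skip it. The names `Node`, `RuleStats`, `transformWithRuleId`, and the rule id `7` are hypothetical and are not taken from Spark.

```scala
import scala.collection.mutable

// Counts how many nodes a traversal actually visits.
object RuleStats { var visits = 0 }

sealed trait Node {
  def children: Seq[Node]
  def withChildren(newChildren: Seq[Node]): Node

  // Ids of rules already known to leave this subtree unchanged.
  private val ineffectiveRules: mutable.BitSet = mutable.BitSet.empty

  def transformWithRuleId(ruleId: Int)(
      rule: PartialFunction[Node, Node]): Node =
    if (ineffectiveRules.contains(ruleId)) this // early exit
    else {
      RuleStats.visits += 1
      val afterRule = rule.applyOrElse(this, (n: Node) => n)
      val newChildren =
        afterRule.children.map(_.transformWithRuleId(ruleId)(rule))
      val unchanged =
        newChildren.zip(afterRule.children).forall { case (a, b) => a eq b }
      val result =
        if (unchanged) afterRule else afterRule.withChildren(newChildren)
      // Remember that this rule had no effect on this subtree.
      if (result eq this) ineffectiveRules += ruleId
      result
    }
}

case class Num(value: Int) extends Node {
  def children: Seq[Node] = Nil
  def withChildren(newChildren: Seq[Node]): Node = this
}

case class Ref(name: String) extends Node {
  def children: Seq[Node] = Nil
  def withChildren(newChildren: Seq[Node]): Node = this
}

case class Plus(left: Node, right: Node) extends Node {
  def children: Seq[Node] = Seq(left, right)
  def withChildren(newChildren: Seq[Node]): Node =
    Plus(newChildren(0), newChildren(1))
}

object RuleIdDemo {
  val RemovePlusZero = 7 // hypothetical rule id

  val rule: PartialFunction[Node, Node] = { case Plus(n, Num(0)) => n }

  // Applies the rule `passes` times, as a fixed point batch would,
  // and records the number of nodes visited in each pass.
  def visitsPerPass(passes: Int): Seq[Int] = {
    var plan: Node = Plus(Plus(Ref("a"), Ref("b")), Plus(Ref("c"), Num(0)))
    (1 to passes).map { _ =>
      RuleStats.visits = 0
      plan = plan.transformWithRuleId(RemovePlusZero)(rule)
      RuleStats.visits
    }
  }
}
```

With three passes over this toy plan the visit counts are 5, 2, and 0: once the plan stops changing, the rule is skipped at the root.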

          People

            Unassigned
            Yingyi Bu (buyingyi)
            Gengliang Wang
            Votes: 0
            Watchers: 7

            Dates

              Created:
              Updated: