Spark / SPARK-44286

Define the computing logic through PartitionEvaluator API and use it in SQL operators

Details

    • Type: Umbrella
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Define the computing logic through the PartitionEvaluator API and use it in SQL operators. This avoids lambda-based distributed execution.
      Ref: SPARK-43061.

      Note: This is an umbrella Jira for applying the PartitionEvaluator-based approach to SQL operators.
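
      To illustrate the pattern the sub-tasks apply, here is a minimal, self-contained sketch. It does not use Spark itself: the two traits are local stand-ins modeled on the shape of Spark's PartitionEvaluator and PartitionEvaluatorFactory, and FilterEvaluatorFactory, EvaluatorSketch and runOnPartitions are hypothetical names invented for this example. The point is only that per-partition logic lives in a named, serializable factory class rather than in a closure passed to a mapPartitions-style call.

```scala
// Local stand-ins modeled on Spark's PartitionEvaluator API.
// Assumption: simplified shapes for illustration, not the real org.apache.spark traits.
trait PartitionEvaluator[T, U] {
  // Computes the output of one partition from one or more input iterators.
  def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[U]
}

trait PartitionEvaluatorFactory[T, U] extends Serializable {
  // Called once per partition to obtain a fresh evaluator.
  def createEvaluator(): PartitionEvaluator[T, U]
}

// Hypothetical operator logic: keep values above a threshold and tag each
// surviving value with the index of the partition it came from.
class FilterEvaluatorFactory(threshold: Int)
    extends PartitionEvaluatorFactory[Int, (Int, Int)] {

  override def createEvaluator(): PartitionEvaluator[Int, (Int, Int)] =
    new FilterEvaluator

  private class FilterEvaluator extends PartitionEvaluator[Int, (Int, Int)] {
    override def eval(partitionIndex: Int, inputs: Iterator[Int]*): Iterator[(Int, Int)] = {
      require(inputs.length == 1, "expected exactly one input iterator")
      inputs.head.filter(_ > threshold).map(v => (partitionIndex, v))
    }
  }
}

object EvaluatorSketch {
  // Hypothetical driver standing in for an RDD: one evaluator per partition.
  def runOnPartitions[T, U](
      partitions: Seq[Seq[T]],
      factory: PartitionEvaluatorFactory[T, U]): Seq[U] =
    partitions.zipWithIndex.flatMap { case (rows, idx) =>
      factory.createEvaluator().eval(idx, rows.iterator).toSeq
    }

  def main(args: Array[String]): Unit = {
    val out = runOnPartitions(Seq(Seq(1, 5, 9), Seq(2, 8)), new FilterEvaluatorFactory(4))
    println(out)
  }
}
```

      In an actual operator, the factory would presumably be handed to the RDD (Spark 3.5 adds mapPartitionsWithEvaluator and zipPartitionsWithEvaluator for this, as far as I know) instead of the hand-written driver above; the exact wiring for each operator is what the sub-tasks track.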

      Sub-Tasks

        1. Define the computing logic through PartitionEvaluator API and use it in RowToColumnarExec & ColumnarToRowExec SQL operators. (Sub-task, Resolved, Vinod KC)
        2. Define the computing logic through PartitionEvaluator API and use it in ShuffledHashJoinExec (Sub-task, Open, Unassigned)
        3. Define the computing logic through PartitionEvaluator API and use it in SortMergeJoinExec (Sub-task, Resolved, Vinod KC)
        4. Define the computing logic through PartitionEvaluator API and use it in BroadcastNestedLoopJoinExec & BroadcastHashJoinExec (Sub-task, Open, Unassigned)
        5. Define the computing logic through PartitionEvaluator API and use it in WindowGroupLimitExec (Sub-task, Resolved, Jiaan Geng)
        6. Define the computing logic through PartitionEvaluator API and use it in WindowExec and WindowInPandasExec (Sub-task, Resolved, Jiaan Geng)
        7. Define the computing logic through PartitionEvaluator API and use it in BaseScriptTransformationExec, InMemoryTableScanExec, ReferenceSort, HiveTableScanExec, SortExec (Sub-task, Open, Unassigned)
        8. Use PartitionEvaluator API in MapInBatchExec (Sub-task, Resolved, Vinod KC)
        9. Use PartitionEvaluator API in AggregateInPandasExec, AttachDistributedSequenceExec (Sub-task, Open, Unassigned)
        10. Use PartitionEvaluator API in FileSourceScanExec, RowDataSourceScanExec, MergeRowsExec (Sub-task, Open, Unassigned)
        11. Use PartitionEvaluator API in CollectMetricsExec, GenerateExec, ExpandExec (Sub-task, Open, Unassigned)
        12. Define the computing logic through PartitionEvaluator API and use it in CollectLimitExec, CollectTailExec, LocalLimitExec and GlobalLimitExec (Sub-task, Resolved, Unassigned)
        13. Use PartitionEvaluator API in DebugExec (Sub-task, Resolved, Jia Fan)
        14. Use PartitionEvaluator API in MergingSessionsExec & UpdatingSessionsExec (Sub-task, Open, Unassigned)
        15. Use PartitionEvaluator API in HashAggregateExec, ObjectHashAggregateExec, SortAggregateExec (Sub-task, Open, Unassigned)
        16. Use PartitionEvaluator API in ArrowEvalPythonExec, BatchEvalPythonExec (Sub-task, Resolved, Vinod KC)
        17. Use PartitionEvaluator API in ArrowEvalPythonUDTFExec & BatchEvalPythonUDTFExec (Sub-task, Open, Unassigned)
        18. Use PartitionEvaluator API in MapElementsExec, MapGroupsExec, MapPartitionsExec (Sub-task, Open, Unassigned)
        19. Use PartitionEvaluator API in CoGroupExec, DeserializeToObjectExec, ExternalRDDScanExec (Sub-task, Open, Unassigned)
        20. Use PartitionEvaluator API in FlatMapGroupsInPandasExec, FlatMapCoGroupsInPandasExec (Sub-task, Open, Unassigned)

          People

            Assignee: Unassigned
            Reporter: Vinod KC (vinodkc)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: