Spark / SPARK-45889

Implement push-down filter with partition ID and grouping key (if possible) for state data source reader

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels: None

    Description

      If a query filters the state data by partition ID, the state data source has a good opportunity to avoid spinning up every state store instance and wasting resources. It can spin up state store instances only for the partitions that are actually needed.
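
      The pruning idea above can be sketched in plain Python. This is only an illustration under stated assumptions, not the state data source's implementation: the predicate classes `EqualTo`, `In` and `Unsupported`, the function `partitions_to_open`, and the column name `partition_id` are all hypothetical stand-ins, and the pushed-down filters are assumed to be ANDed together.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Union


@dataclass(frozen=True)
class EqualTo:
    """Hypothetical pushed-down predicate: column = value."""
    column: str
    value: int


@dataclass(frozen=True)
class In:
    """Hypothetical pushed-down predicate: column IN (values)."""
    column: str
    values: FrozenSet[int]


@dataclass(frozen=True)
class Unsupported:
    """Any predicate this sketch cannot use for pruning."""
    description: str


Filter = Union[EqualTo, In, Unsupported]


def partitions_to_open(filters: Sequence[Filter], num_partitions: int,
                       partition_column: str = "partition_id") -> List[int]:
    # Start from every partition and narrow down; filters are treated as ANDed.
    candidates = set(range(num_partitions))
    for f in filters:
        if isinstance(f, EqualTo) and f.column == partition_column:
            candidates &= {f.value}
        elif isinstance(f, In) and f.column == partition_column:
            candidates &= set(f.values)
        # Anything else is ignored, which is safe: it can only mean opening
        # more partitions than strictly necessary, never fewer.
    return sorted(candidates)
```

      Only the partitions returned would need a state store instance. A filter the reader cannot interpret leaves the candidate set unchanged, so the remaining filtering still has to happen after the scan.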

      The same applies to grouping keys, although the distribution criteria are bound to the operator rather than to the key in the state store. Pruning by grouping key could therefore be very tricky unless the reader can follow the same distribution criteria as the operator.
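
      A minimal sketch of why the grouping-key case depends on the operator: the reader can narrow the scan to one partition only if it can reproduce the operator's key-to-partition mapping. The names `partitions_for_key`, `operator_partitioner` and `toy_partitioner` are hypothetical, and the toy mapping is deliberately not the hashing any real operator uses.

```python
from typing import Callable, Hashable, List, Optional

# Hypothetical signature: (grouping key, number of partitions) -> partition ID.
Partitioner = Callable[[Hashable, int], int]


def partitions_for_key(key: Hashable, num_partitions: int,
                       operator_partitioner: Optional[Partitioner] = None) -> List[int]:
    if operator_partitioner is None:
        # Distribution unknown: any partition may hold the key, so no pruning.
        return list(range(num_partitions))
    # Distribution known: exactly one partition can hold the key.
    return [operator_partitioner(key, num_partitions)]


def toy_partitioner(key: Hashable, num_partitions: int) -> int:
    # Stand-in mapping for illustration only; a real reader would have to
    # match the stateful operator's own distribution exactly.
    return len(str(key)) % num_partitions
```

      If the mapping used by the reader differed from the operator's in any way, the reader would open the wrong partition and silently return no rows, which is why falling back to all partitions is the safe default.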

            People

              Assignee: Unassigned
              Reporter: Jungtaek Lim (kabhwan)
              Votes: 1
              Watchers: 2
