Apache Arrow: ARROW-12255

[Rust] [Ballista] Integrate scheduler with DataFusion

Details

    Description

      The Ballista scheduler breaks a query down into stages based on changes in partitioning in the plan. Each stage is then broken down into tasks that can be executed concurrently.
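
      As a rough illustration of this stage split, the following Rust sketch cuts a linear, bottom-up list of operators into stages wherever the partitioning changes, and creates one task per partition. The `Operator`, `Stage`, and `split_into_stages` names are hypothetical and are not DataFusion or Ballista APIs. A real plan is a tree, and the one-task-per-partition rule is an assumption made for the example.

```rust
/// Hypothetical, simplified plan operator (not a DataFusion or Ballista type).
#[derive(Debug, Clone, PartialEq)]
enum Operator {
    Scan { partitions: usize },
    Filter,
    Projection,
    /// Changes the partitioning of its input to `partitions` output partitions.
    Repartition { partitions: usize },
    Aggregate,
}

/// A group of operators that share the same partitioning.
#[derive(Debug, PartialEq)]
struct Stage {
    id: usize,
    operators: Vec<Operator>,
    /// Assumption for this sketch: one task per partition.
    task_count: usize,
}

/// Splits a bottom-up operator list into stages at every `Repartition`.
/// The `Repartition` operator itself only marks the boundary here and is
/// not stored in either stage.
fn split_into_stages(plan: &[Operator]) -> Vec<Stage> {
    let mut stages: Vec<Stage> = Vec::new();
    let mut current: Vec<Operator> = Vec::new();
    let mut current_partitions = 1;

    for op in plan {
        match op {
            Operator::Scan { partitions } => {
                current_partitions = *partitions;
                current.push(op.clone());
            }
            Operator::Repartition { partitions } => {
                // Partitioning changes: close the stage built so far.
                if !current.is_empty() {
                    let id = stages.len();
                    stages.push(Stage {
                        id,
                        operators: std::mem::take(&mut current),
                        task_count: current_partitions,
                    });
                }
                current_partitions = *partitions;
            }
            _ => current.push(op.clone()),
        }
    }

    // Close the final stage, if any operators remain.
    if !current.is_empty() {
        let id = stages.len();
        stages.push(Stage {
            id,
            operators: current,
            task_count: current_partitions,
        });
    }
    stages
}
```

      With this sketch, a plan of a 4-partition scan, a filter, a repartition to 2 partitions, an aggregate, and a projection yields two stages: the first with 4 tasks and the second with 2.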

      Rather than trying to run all the partitions at once, Ballista executors process n concurrent tasks at a time and then request new tasks from the scheduler.
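
      The pull model described above can be sketched with plain threads: an executor with a fixed number of slots keeps at most that many tasks in flight, and each slot asks the scheduler for the next task as soon as it finishes one. This is a minimal sketch under stated assumptions, not Ballista's implementation. `Task`, `Scheduler`, `next_task`, and `run_executor` are hypothetical names, and the scheduler is an in-process queue rather than a remote service.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical unit of work: one partition of one stage.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Task {
    stage_id: usize,
    partition: usize,
}

/// Hypothetical in-process scheduler holding the pending tasks.
struct Scheduler {
    pending: Mutex<VecDeque<Task>>,
}

impl Scheduler {
    fn new(tasks: Vec<Task>) -> Self {
        Scheduler { pending: Mutex::new(VecDeque::from(tasks)) }
    }

    /// Hands out the next pending task, or `None` when the queue is empty.
    fn next_task(&self) -> Option<Task> {
        self.pending.lock().unwrap().pop_front()
    }
}

/// Runs an executor with `slots` concurrent task slots until the scheduler
/// has no more work. Returns the completed tasks and the peak number of
/// tasks that were in flight at once.
fn run_executor(scheduler: Arc<Scheduler>, slots: usize) -> (Vec<Task>, usize) {
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let completed = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..slots)
        .map(|_| {
            let scheduler = Arc::clone(&scheduler);
            let running = Arc::clone(&running);
            let peak = Arc::clone(&peak);
            let completed = Arc::clone(&completed);
            thread::spawn(move || {
                // Each slot requests a new task only after finishing the last.
                while let Some(task) = scheduler.next_task() {
                    let in_flight = running.fetch_add(1, Ordering::SeqCst) + 1;
                    peak.fetch_max(in_flight, Ordering::SeqCst);
                    thread::yield_now(); // placeholder for executing the task
                    running.fetch_sub(1, Ordering::SeqCst);
                    completed.lock().unwrap().push(task);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let done = completed.lock().unwrap().clone();
    (done, peak.load(Ordering::SeqCst))
}
```

      Because every slot pulls its own work, the number of in-flight tasks never exceeds `slots`, regardless of how many partitions the query has.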

      This approach would help DataFusion scale better, and ideally the same scheduler would be used to scale across cores in DataFusion and across nodes in Ballista.

          People

            andygrove Andy Grove