SPARK-41224

Optimize Arrow collect to stream the result from server to client

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Connect
    • Labels: None

    Description

      https://github.com/apache/spark/pull/38468 implemented Arrow-based collect, but it cannot stream the result from the server to the client. The result can be streamed if the first partition is collected first.
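
The idea of collecting the first partition first can be illustrated with a small, Spark-independent sketch. This is not the code from the pull request or from Spark Connect. The function name `stream_in_partition_order` and the `(partition_id, batch)` tuples are hypothetical, chosen only to show how a server might forward each partition's serialized batch as soon as every earlier partition has been sent, rather than waiting for the whole job to finish.

```python
from typing import Dict, Iterable, Iterator, Tuple


def stream_in_partition_order(
    completed: Iterable[Tuple[int, bytes]],
    num_partitions: int,
) -> Iterator[Tuple[int, bytes]]:
    """Yield (partition_id, batch) pairs in partition order, as early as possible.

    `completed` is a hypothetical stream of partitions in the order they
    finish, which may differ from partition order.
    """
    pending: Dict[int, bytes] = {}  # finished partitions not yet sent
    next_id = 0                     # lowest partition id not yet sent

    for partition_id, batch in completed:
        pending[partition_id] = batch
        # Flush every consecutive partition that is now ready to go out.
        while next_id in pending:
            yield next_id, pending.pop(next_id)
            next_id += 1

    if next_id != num_partitions:
        raise RuntimeError(
            f"only {next_id} of {num_partitions} partitions were streamed"
        )


# Partitions finish out of order: 1, 0, 3, 2.
arrivals = [(1, b"p1"), (0, b"p0"), (3, b"p3"), (2, b"p2")]
sent = list(stream_in_partition_order(arrivals, num_partitions=4))
print(sent)
```

In this sketch nothing can be sent until partition 0 arrives, which is why finishing the first partition early matters: once it is available, it and any already-buffered successors can be sent to the client immediately.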

          People

            Assignee: Hyukjin Kwon (gurwls223)
            Reporter: Hyukjin Kwon (gurwls223)