SPARK-21583

Create a ColumnarBatch with ArrowColumnVectors for row based iteration

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      The existing ArrowColumnVector creates a read-only vector over Arrow data. It would be useful to be able to create a ColumnarBatch that allows row-based iteration over multiple ArrowColumnVectors. This would avoid the extra copying needed to translate column elements into rows, which uses memory more efficiently and improves performance.
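
      To illustrate the idea, here is a minimal, self-contained sketch of row-based iteration over read-only column vectors without copying elements into rows. It does not use Spark's or Arrow's classes: ReadOnlyColumn, IntArrayColumn, RowView, SketchBatch and RowIterationSketch are hypothetical names invented for this example, and plain int arrays stand in for Arrow buffers. The actual ColumnarBatch and ArrowColumnVector APIs may differ.

      ```java
      import java.util.Iterator;
      import java.util.NoSuchElementException;

      // Hypothetical stand-in for a read-only column vector (not Spark's ColumnVector).
      interface ReadOnlyColumn {
          int getInt(int rowId);
      }

      // Wraps an existing int[] by reference; no element is copied.
      final class IntArrayColumn implements ReadOnlyColumn {
          private final int[] values;

          IntArrayColumn(int[] values) {
              this.values = values;
          }

          @Override
          public int getInt(int rowId) {
              return values[rowId];
          }
      }

      // A row "view": it holds only a row index and reads through to the columns.
      final class RowView {
          private final ReadOnlyColumn[] columns;
          int rowId;

          RowView(ReadOnlyColumn[] columns) {
              this.columns = columns;
          }

          int getInt(int ordinal) {
              return columns[ordinal].getInt(rowId);
          }
      }

      // Hypothetical batch: groups equal-length columns and exposes them row by row.
      final class SketchBatch implements Iterable<RowView> {
          private final ReadOnlyColumn[] columns;
          private final int numRows;

          SketchBatch(ReadOnlyColumn[] columns, int numRows) {
              this.columns = columns;
              this.numRows = numRows;
          }

          @Override
          public Iterator<RowView> iterator() {
              // One RowView is reused for every row, so iteration allocates no per-row objects.
              final RowView row = new RowView(columns);
              return new Iterator<RowView>() {
                  private int nextRow = 0;

                  @Override
                  public boolean hasNext() {
                      return nextRow < numRows;
                  }

                  @Override
                  public RowView next() {
                      if (nextRow >= numRows) {
                          throw new NoSuchElementException();
                      }
                      row.rowId = nextRow++;
                      return row;
                  }
              };
          }
      }

      final class RowIterationSketch {
          // Consumes the batch row by row, reading two int columns per row.
          static long sumOfProducts(SketchBatch batch) {
              long total = 0;
              for (RowView row : batch) {
                  total += (long) row.getInt(0) * row.getInt(1);
              }
              return total;
          }

          public static void main(String[] args) {
              int[] ids = {1, 2, 3};
              int[] quantities = {10, 20, 30};
              SketchBatch batch = new SketchBatch(
                  new ReadOnlyColumn[] {new IntArrayColumn(ids), new IntArrayColumn(quantities)}, 3);
              System.out.println(sumOfProducts(batch)); // prints 140
          }
      }
      ```

      Because the same RowView instance is reused on every step, a caller that needs to keep a row beyond the current iteration would have to copy its values out first. That is the trade-off this sketch makes to avoid per-row allocation.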

              People

              • Assignee: Bryan Cutler (bryanc)
              • Reporter: Bryan Cutler (bryanc)
              • Votes: 0
              • Watchers: 6
