Spark / SPARK-48478

Allow passing iterator of PyArrow RecordBatches to createDataFrame()

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0, 3.5.1
    • Fix Version/s: None
    • Component/s: Connect, Input/Output, PySpark, SQL
    • Labels: None

    Description

      As a follow-up to SPARK-48220:

      For larger data, it would be nice to be able to pass an iterator of PyArrow RecordBatches to createDataFrame().

            People

              Assignee: Unassigned
              Reporter: Ian Cook (icook)
              Votes: 0
              Watchers: 1