Spark / SPARK-37697

Make it easier to convert NumPy arrays to Spark DataFrames

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.2
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      Make it easier to convert NumPy arrays to Spark DataFrames.

      Passing a NumPy array directly to createDataFrame often fails with a schema inference error:

      df = spark.createDataFrame(numpy.arange(10))
      Can not infer schema for type: <class 'numpy.int64'>

      OR

      df = spark.createDataFrame(numpy.arange(10.))
      Can not infer schema for type: <class 'numpy.float64'>

      Today (Spark 3.x) we have to wrap the array in a pandas DataFrame first:

      spark.createDataFrame(pd.DataFrame(numpy.arange(10.))) 
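
The errors above appear to come from schema inference seeing NumPy scalar types (numpy.int64, numpy.float64) instead of rows of built-in Python values. As a rough sketch of a way around this without pandas, a 1-D array can be turned into one-element tuples of Python scalars before calling createDataFrame. The helper name `to_rows` and the column name "value" are hypothetical and used only for illustration; `spark` is assumed to be an existing SparkSession.

```python
import numpy as np


def to_rows(arr):
    """Convert a 1-D NumPy array into a list of one-element tuples
    holding built-in Python scalars (hypothetical helper, not PySpark API)."""
    arr = np.asarray(arr)
    if arr.ndim != 1:
        raise ValueError("expected a 1-D array")
    # ndarray.tolist() converts numpy.int64 / numpy.float64 to int / float.
    return [(x,) for x in arr.tolist()]


rows = to_rows(np.arange(3.0))

# With an active SparkSession named `spark` (assumed, not created here):
#   df = spark.createDataFrame(rows, ["value"])
```

Each tuple is treated as one row, so the resulting DataFrame would have a single column. This avoids the pandas dependency but materializes the whole array as Python objects on the driver.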

      Make this easier with a direct conversion from NumPy arrays to Spark DataFrames.
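
To make the request concrete, here is one possible shape for such a direct conversion, written as a user-side helper. This is only a sketch under stated assumptions: `numpy_to_rows` and `numpy_to_spark_df` are hypothetical names, not part of PySpark, and only 1-D and 2-D arrays are handled. Default column names follow the `_1`, `_2` pattern Spark uses for unnamed tuple fields.

```python
import numpy as np


def numpy_to_rows(arr, columns=None):
    """Return (rows, column_names) for a 1-D or 2-D NumPy array.

    Hypothetical helper for illustration; not part of PySpark.
    """
    arr = np.asarray(arr)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)  # treat a vector as a single column
    if arr.ndim != 2:
        raise ValueError("only 1-D and 2-D arrays are handled in this sketch")
    if columns is None:
        columns = [f"_{i + 1}" for i in range(arr.shape[1])]
    if len(columns) != arr.shape[1]:
        raise ValueError("one column name is required per array column")
    # tolist() yields nested lists of built-in Python scalars.
    rows = [tuple(r) for r in arr.tolist()]
    return rows, list(columns)


def numpy_to_spark_df(spark, arr, columns=None):
    """Build a Spark DataFrame from a NumPy array (hypothetical wrapper).

    `spark` is assumed to be an existing SparkSession.
    """
    rows, names = numpy_to_rows(arr, columns)
    return spark.createDataFrame(rows, names)
```

A call such as `numpy_to_spark_df(spark, numpy.arange(10.))` would then replace the pandas round trip. An empty array would likely still need an explicit schema, since there are no rows to infer types from.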

      Attachments

        1. image-2022-10-31-22-49-37-356.png
          112 kB
          Douglas Moore

          People

            Assignee: Unassigned
            Reporter: Douglas Moore (douglas.moore@databricks.com)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: