SPARK-39130

How do I read a Parquet file that contains Python objects?

Details

    • Type: Question
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Won't Do
    • Affects Version/s: 2.4.5
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None
    • Environment: pyspark2.4.5

    Description

      Writing the file with pandas (Python):

          import pandas as pd

          # Column 'b' holds a Python list in each row
          a = pd.DataFrame([[1, [2.3, 1.2]]], columns=['a', 'b'])
          a.to_parquet('a.parquet')

      Reading it back in PySpark:

          d2 = spark.read.parquet('a.parquet')
          d2.show()  # the showString call in the error below comes from show()

      fails with this error:

      An error was encountered: An error occurred while calling o277.showString. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 14 in stage 9.0 failed 4 times, most recent failure: Lost task 14.2 in stage 9.0 (TID 63, 10.169.0.196, executor 15): java.lang.IllegalArgumentException: Illegal Capacity: -221

      How can I fix it?

      Thanks.
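
      The exception does not say why Spark rejects the file. One cheap first check, before looking into pandas/pyarrow and Spark version differences, is whether the copy of a.parquet that the executors read is a complete Parquet file at all. The sketch below uses only the standard library and relies on the Parquet file layout (a 4-byte "PAR1" marker at both ends, with a 4-byte little-endian footer length just before the trailing marker). `inspect_parquet_footer` is a hypothetical helper name, not part of pandas or PySpark, and passing this check does not prove that Spark 2.4.5 can decode the list column.

```python
import os
import struct

# Marker found at both ends of an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def inspect_parquet_footer(path):
    """Return the footer (file metadata) length in bytes.

    Raises ValueError if the file does not look like an intact Parquet file.
    Hypothetical helper for illustration only.
    """
    size = os.path.getsize(path)
    # Smallest possible layout: leading magic + footer length + trailing magic.
    if size < 12:
        raise ValueError(f"{path}: too small to be a Parquet file ({size} bytes)")

    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        footer_len_bytes = f.read(4)
        tail = f.read(4)

    if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
        raise ValueError(f"{path}: missing PAR1 marker, file may be truncated")

    # The footer length is stored as a 4-byte little-endian unsigned integer.
    (footer_len,) = struct.unpack("<I", footer_len_bytes)
    if footer_len > size - 12:
        raise ValueError(f"{path}: footer length {footer_len} exceeds file size")
    return footer_len
```

      Running `inspect_parquet_footer('a.parquet')` on the machine that wrote the file and again on the copy visible to the cluster should give the same number. If it does, the file is probably intact and the problem is more likely in how the list column was written or read.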

People

    • Assignee: Unassigned
    • Reporter: Ben Wan (BenDataLab)

Time Tracking

    • Original Estimate: 0.5h
    • Remaining Estimate: 0.5h
    • Time Spent: Not Specified