SPARK-34463

toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: PySpark
    • Labels: None

    Description

      Environment:

      apache/spark master
      pandas version > 1.0.5

      Code to reproduce:

      spark.conf.set('spark.sql.execution.arrow.pyspark.enabled', True)
      spark.conf.set('spark.sql.execution.arrow.pyspark.selfDestruct.enabled', True)
      spark.createDataFrame(sc.parallelize([(i,) for i in range(13)], 1), 'id long').selectExpr('IF(id % 3==0, id+1, NULL) AS f1', '(id+1) % 2 AS label').toPandas()['label'].value_counts()
      

      Calling value_counts() on the resulting column fails with an error like:

      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/base.py", line 1033, in value_counts
          dropna=dropna,
        File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/algorithms.py", line 820, in value_counts
          keys, counts = value_counts_arraylike(values, dropna)
        File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/algorithms.py", line 865, in value_counts_arraylike
          keys, counts = f(values, dropna)
        File "pandas/_libs/hashtable_func_helper.pxi", line 1098, in pandas._libs.hashtable.value_count_int64
        File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
        File "stringsource", line 349, in View.MemoryView.memoryview._cinit_
      ValueError: buffer source array is read-only
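
      The ValueError above comes from code that needs a writable buffer being handed a read-only one. The sketch below illustrates that general condition with plain NumPy only; it does not use Spark or Arrow, and it is not the fix applied for this issue. The helper name ensure_writable is hypothetical, and the read-only flag is set by hand as an assumed stand-in for a column backed by an immutable buffer.

```python
import numpy as np


def ensure_writable(values: np.ndarray) -> np.ndarray:
    """Return `values` unchanged if it is writable, otherwise a writable copy.

    Hypothetical helper for illustration only; not part of pandas or PySpark.
    """
    if values.flags.writeable:
        return values
    return values.copy()  # ndarray.copy() always yields a writable array


# Stand-in for the 'label' column from the report: (id + 1) % 2 for id in 0..12.
label = (np.arange(13, dtype=np.int64) + 1) % 2

# Assumption: mark the array read-only by hand to mimic an immutable backing buffer.
label.setflags(write=False)

# Writing into the read-only array raises ValueError.
try:
    label[0] = 0
    write_failed = False
except ValueError:
    write_failed = True

# Copying gives a writable array that code expecting a mutable buffer can use.
fixed = ensure_writable(label)
fixed[0] = 0
```

      Copying costs extra memory, which works against the purpose of self-destruct, so this is only a way to see the failure mode and not a recommended workaround.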

            People

              Assignee: David Li (lidavidm)
              Reporter: Weichen Xu (weichenxu123)