Spark / SPARK-44111 Prepare Apache Spark 4.0.0 / SPARK-49882

Handle or document `NumPy 2.1` difference in Python 3.13


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      Although SPARK-48710 fixed PySpark to use NumPy 2.0-compatible types, Python 3.13 requires NumPy 2.1 (SPARK-49869), which seems to reveal further instances of differences.

      The new `NumPy 2.1` seems to have a different output style in Python 3.13. The values themselves are still correct.

      • https://github.com/apache/spark/actions/runs/11186188886/job/31100777649
        **********************************************************************
        File "/__w/spark/spark/python/pyspark/core/rdd.py", line 2463, in __main__.RDD.sampleStdev
        Failed example:
            sc.parallelize([1, 2, 3]).sampleStdev()
        Expected:
            1.0
        Got:
            np.float64(1.0)
        **********************************************************************
        File "/__w/spark/spark/python/pyspark/core/rdd.py", line 2436, in __main__.RDD.stdev
        Failed example:
            sc.parallelize([1, 2, 3]).stdev()
        Expected:
            0.816...
        Got:
            np.float64(0.816496580927726)
        **********************************************************************
        
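      The failures above come from NumPy 2's changed scalar `repr`: NumPy >= 2.0 prints `np.float64(1.0)` where NumPy 1.x printed plain `1.0`, so doctests that compare against the old output break. A minimal sketch of the behavior and one way a doctest could be made version-independent (converting the NumPy scalar to a builtin `float` before printing; the variable names here are illustrative, not from the Spark source):

      ```python
      import numpy as np

      values = [1, 2, 3]

      # Sample standard deviation (ddof=1), mirroring RDD.sampleStdev.
      # np.std returns a np.float64 scalar; under NumPy >= 2.0 its repr
      # is "np.float64(1.0)" instead of the old "1.0".
      sample_stdev = np.std(values, ddof=1)

      # Converting to a builtin float restores the plain doctest output
      # regardless of the installed NumPy version.
      print(float(sample_stdev))  # 1.0
      ```

      An alternative, for suites with many affected doctests, is NumPy's legacy print option (`np.set_printoptions(legacy="1.25")`), which restores the pre-2.0 scalar representation globally.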

      People

              Assignee: Unassigned
              Reporter: Dongjoon Hyun
              Votes: 0
              Watchers: 2
