Spark / SPARK-21327

ArrayConstructor should handle an array of typecode 'l' as long rather than int in Python 2.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: PySpark, SQL
    • Labels: None

    Description

      Currently, ArrayConstructor handles an array of typecode 'l' as int when converting a Python object into a Java object in Python 2, so any value larger than Integer.MAX_VALUE or smaller than Integer.MIN_VALUE overflows.

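      import array
      from pyspark.sql import Row

      # Assumes an active SparkSession named `spark` (e.g. the pyspark shell) on Python 2.
      data = [Row(l=array.array('l', [-9223372036854775808, 0, 9223372036854775807]))]
      df = spark.createDataFrame(data)
      df.show(truncate=False)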
      
      +----------+
      |l         |
      +----------+
      |[0, 0, -1]|
      +----------+
      

      This should be:

      +----------------------------------------------+
      |l                                             |
      +----------------------------------------------+
      |[-9223372036854775808, 0, 9223372036854775807]|
      +----------------------------------------------+
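
The observed [0, 0, -1] is consistent with each 64-bit value being narrowed to a 32-bit Java int, which keeps only the low 32 bits. The following is a minimal pure-Python sketch of that arithmetic. It is an illustration only: `to_int32` is a hypothetical helper, not part of Spark or of ArrayConstructor, and the sketch does not reproduce the actual conversion code.

```python
def to_int32(value):
    """Keep the low 32 bits of value and reinterpret them as a signed int."""
    low = value & 0xFFFFFFFF  # discard everything above bit 31
    # Bit 31 set means the 32-bit pattern is negative in two's complement.
    return low - (1 << 32) if low >= (1 << 31) else low


# The same values used in the reproduction above.
values = [-9223372036854775808, 0, 9223372036854775807]
narrowed = [to_int32(v) for v in values]
print(narrowed)  # [0, 0, -1]
```

The minimum long has all of its low 32 bits clear, so it narrows to 0, while the maximum long has all of them set, so it narrows to -1.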
      

          People

            Assignee: Takuya Ueshin (ueshin)
            Reporter: Takuya Ueshin (ueshin)