SPARK-32640

Spark 3.1 log(NaN) returns null instead of NaN


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: SQL

    Description

      I was testing Spark 3.1.0 and noticed that if you take log(NaN) it now returns null, whereas in Spark 3.0 it returned NaN. I'm not an expert in this, but I thought NaN was correct.

      Spark 3.1.0 example:

      >>> df.selectExpr(["value", "log1p(value)"]).show()
      +-------------+------------------+
      |        value|      LOG1P(value)|
      +-------------+------------------+
      |-3.4028235E38|              null|
      | 3.4028235E38| 88.72283906194683|
      |          0.0|               0.0|
      |         -0.0|              -0.0|
      |          1.0|0.6931471805599453|
      |         -1.0|              null|
      |          NaN|              null|
      +-------------+------------------+

       

      Spark 3.0.0 example:

      +-------------+------------------+
      |        value|      LOG1P(value)|
      +-------------+------------------+
      |-3.4028235E38|              null|
      | 3.4028235E38| 88.72283906194683|
      |          0.0|               0.0|
      |         -0.0|              -0.0|
      |          1.0|0.6931471805599453|
      |         -1.0|              null|
      |          NaN|               NaN|
      +-------------+------------------+

       

      Note that it also does the same for log1p, log2, and log10.
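
      For reference, a minimal PySpark sketch to reproduce the comparison above (this assumes df is a single FloatType column named "value"; the report does not show how it was built):

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.getOrCreate()

      # Same values as in the tables above, stored in a float column so the
      # last entry is a real NaN.
      values = [-3.4028235e38, 3.4028235e38, 0.0, -0.0, 1.0, -1.0, float("nan")]
      df = spark.createDataFrame([(v,) for v in values], "value float")

      # On Spark 3.0.0 the NaN row yields NaN for these expressions;
      # on Spark 3.1.0 it yields null.
      df.selectExpr("value", "log(value)", "log1p(value)",
                    "log2(value)", "log10(value)").show()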

People

    Assignee: Wenchen Fan (cloud_fan)
    Reporter: Thomas Graves (tgraves)
    Votes: 0
    Watchers: 5
