  Spark / SPARK-30633

Codegen fails when xxHash seed is not an integer

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.4
    • Fix Version/s: 2.4.5, 3.0.0
    • Component/s: SQL
    • Labels: None

      Description

      If the seed for XxHash64 is a Long value that does not fit in an Int, the generated code does not compile.

      Steps to reproduce:

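      import org.apache.spark.sql.Column
      import org.apache.spark.sql.catalyst.expressions.XxHash64
      import org.apache.spark.sql.functions.col

      val file = "..."
      val column = col("...")

      val df = spark.read.csv(file)

      // Wrap the Catalyst XxHash64 expression so that a custom Long seed can be passed.
      def xxHash(seed: Long, cols: Column*): Column = new Column(
        XxHash64(cols.map(_.expr), seed)
      )

      // 2^32 + 1 does not fit in an Int, so the seed has to be handled as a Long.
      val seed = (Math.pow(2, 32) + 1).toLong
      df.select(xxHash(seed, column)).show()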

      Appending an L suffix to the seed literal in the generated code when the data type is Long fixes the issue.
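
The effect of the missing suffix can be sketched in isolation. The snippet below is a minimal illustration, not Spark's actual codegen: `SeedLiteral` and `javaLiteral` are hypothetical names introduced here, and the emitted Java statements are simplified stand-ins for what a code generator might produce.

```scala
// Hypothetical sketch, not Spark's real code generator.
object SeedLiteral {
  // Render a Long as a Java long literal by appending the L suffix.
  def javaLiteral(seed: Long): String = s"${seed}L"

  def main(args: Array[String]): Unit = {
    val seed = (1L << 32) + 1 // 4294967297, outside the Int range

    // Interpolating the value directly yields a Java int literal that is
    // too large, so a Java compiler rejects the generated statement.
    println(s"long result = $seed;")

    // With the suffix the literal is typed as long and the statement compiles.
    println(s"long result = ${javaLiteral(seed)};")
  }
}
```

Seeds that fit in an Int would compile either way, which is presumably why the problem only shows up with large seeds.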

              People

              • Assignee: Patrick Cording
              • Reporter: Patrick Cording
              • Votes: 0
              • Watchers: 3
