  1. Spark
  2. SPARK-37348

PySpark pmod function


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.4.0
    • Component/s: PySpark
    • Labels: None

    Description

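      Because Spark runs on the JVM, the % operator in PySpark follows Java's remainder semantics, where the result takes the sign of the dividend: F.lit(-1) % F.lit(2) returns -1. However, the non-negative modulus (1 in this case) is often what is wanted instead of the remainder.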
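      The difference between the two operations can be sketched in plain Python, without a Spark session. The helper names jvm_rem and pmod below are hypothetical and exist only for this illustration; jvm_rem emulates the JVM's truncated integer remainder, and pmod assumes the common "add the divisor when the remainder is negative" definition of a positive modulus.

```python
def jvm_rem(a: int, n: int) -> int:
    """Emulate the JVM's % on integers: the result takes the sign of the dividend."""
    r = abs(a) % abs(n)
    return -r if a < 0 else r


def pmod(a: int, n: int) -> int:
    """Positive modulus built on the truncated remainder (assumed definition)."""
    r = jvm_rem(a, n)
    # Only shift when the remainder is negative, so exact multiples stay at 0.
    return jvm_rem(r + n, n) if r < 0 else r


if __name__ == "__main__":
    for a, n in [(-1, 2), (-4, 2), (-7, 3), (7, 3)]:
        print(a, n, jvm_rem(a, n), pmod(a, n))
```

      Note that Python's own % already returns a non-negative result for a positive divisor, which is why the JVM behaviour has to be emulated explicitly here.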

      There is a PMOD() function in Spark SQL, but not in PySpark. So at the moment, the two options for getting the modulus are to use F.expr("pmod(A, B)") or to create a helper function such as:

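      from pyspark.sql import functions as F
      def pmod(dividend, divisor):
          remainder = dividend % divisor  # JVM remainder; negative only when dividend < 0 (assumes divisor > 0)
          return F.when(remainder < 0, remainder + divisor).otherwise(remainder)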

      Neither is optimal: pmod should be native to PySpark, as it is in Spark SQL.


          People

            Assignee: Ruifeng Zheng (podongfeng)
            Reporter: Tim Schwab (timschwab)
            Votes: 0
            Watchers: 4
