SPARK-26638

Pyspark vector classes always return error for unary negation

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.2, 2.4.0
    • Fix Version/s: 2.3.3, 2.4.1, 3.0.0
    • Component/s: ML, PySpark
    • Labels:
      None

      Description

      It looks like the implementation of `__neg__` for PySpark vector classes is wrong:

          def _delegate(op):
              def func(self, other):
                  if isinstance(other, DenseVector):
                      other = other.array
                  return DenseVector(getattr(self.array, op)(other))
              return func
      
          __neg__ = _delegate("__neg__")
      

      This delegation works for binary operators, but not for unary ones; indeed, negation doesn't work at all:

      from pyspark.ml.linalg import DenseVector
      v = DenseVector([1,2,3])
      -v
      ...
      TypeError: func() missing 1 required positional argument: 'other'
      

      This was spotted by static analysis on lgtm.com:
      https://lgtm.com/projects/g/apache/spark/alerts/?mode=tree&lang=python&ruleFocus=7850093

      Easy to fix and add a test for, as I presume we want this to be implemented.
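
      A sketch of one possible fix, using a simplified stand-in for `pyspark.ml.linalg.DenseVector` (this is an illustration of the approach, not necessarily the patch that was merged): delegate unary operators through a separate helper whose wrapper takes only `self`, so Python's unary-operator protocol matches the function signature.

      ```python
      import numpy as np

      class DenseVector:
          """Minimal stand-in for pyspark.ml.linalg.DenseVector,
          showing one way to delegate unary operators correctly."""

          def __init__(self, ar):
              self.array = np.array(ar, dtype=np.float64)

          def _delegate(op):
              # Binary delegation: the wrapper needs a right-hand operand.
              def func(self, other):
                  if isinstance(other, DenseVector):
                      other = other.array
                  return DenseVector(getattr(self.array, op)(other))
              return func

          def _unary_delegate(op):
              # Unary delegation: only `self` is involved, so the wrapper
              # takes no `other` argument.
              def func(self):
                  return DenseVector(getattr(self.array, op)())
              return func

          __add__ = _delegate("__add__")
          __neg__ = _unary_delegate("__neg__")

          def __repr__(self):
              return "DenseVector(%s)" % self.array.tolist()

      v = DenseVector([1, 2, 3])
      print(-v)       # unary negation now resolves without a TypeError
      print(v + v)    # binary delegation still works as before
      ```

      The same `_unary_delegate` helper would cover other unary dunders (e.g. `__abs__`) if needed.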


              People

              • Assignee: srowen Sean Owen
              • Reporter: srowen Sean Owen
              • Votes: 0
              • Watchers: 0
