Spark / SPARK-26638

Pyspark vector classes always return error for unary negation


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.2, 2.4.0
    • Fix Version/s: 2.3.3, 2.4.1, 3.0.0
    • Component/s: ML, PySpark
    • Labels: None

    Description

      It looks like the implementation of __neg__ for the PySpark vector classes is wrong:

          def _delegate(op):
              def func(self, other):
                  if isinstance(other, DenseVector):
                      other = other.array
                  return DenseVector(getattr(self.array, op)(other))
              return func
      
          __neg__ = _delegate("__neg__")
      

      This delegation works for binary operators but not for unary ones, and indeed it doesn't work at all:

      from pyspark.ml.linalg import DenseVector
      v = DenseVector([1,2,3])
      -v
      ...
      TypeError: func() missing 1 required positional argument: 'other'
      

      This was spotted by static analysis on lgtm.com:
      https://lgtm.com/projects/g/apache/spark/alerts/?mode=tree&lang=python&ruleFocus=7850093

      Easy to fix and add a test for, as I presume we want this to be implemented.
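
      A minimal sketch of one possible fix (not necessarily the committed patch): keep the existing binary delegation helper, and add a separate helper for unary operators whose wrapper takes only self. The DenseVector class here is a stripped-down stand-in for the real pyspark.ml.linalg.DenseVector, just to show the delegation mechanics.

      ```python
      import numpy as np

      class DenseVector:
          """Simplified stand-in for pyspark.ml.linalg.DenseVector."""
          def __init__(self, values):
              self.array = np.array(values, dtype=np.float64)

          def __repr__(self):
              return "DenseVector(%r)" % (list(self.array),)

      def _delegate(op):
          # Binary delegation: forwards `other` to the underlying ndarray.
          def func(self, other):
              if isinstance(other, DenseVector):
                  other = other.array
              return DenseVector(getattr(self.array, op)(other))
          return func

      def _delegate_unary(op):
          # Unary delegation: no `other` argument, so -v works.
          def func(self):
              return DenseVector(getattr(self.array, op)())
          return func

      DenseVector.__add__ = _delegate("__add__")
      DenseVector.__neg__ = _delegate_unary("__neg__")

      v = DenseVector([1, 2, 3])
      print(-v)      # negation no longer raises TypeError
      print(v + v)
      ```

      A regression test would then simply assert that -DenseVector([1, 2, 3]) equals DenseVector([-1, -2, -3]).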

      Attachments

        Activity


          People

            Assignee: Sean R. Owen (srowen)
            Reporter: Sean R. Owen (srowen)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:
