  1. Spark
  2. SPARK-14739

Vectors.parse doesn't handle dense vectors of size 0 and sparse vectors with no indices

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0, 2.0.0
    • Fix Version/s: 1.6.2, 2.0.0
    • Component/s: MLlib, PySpark
    • Labels: None

    Description

      DenseVector:

      from pyspark.mllib.linalg import Vectors

      Vectors.parse(str(Vectors.dense([])))
      ## ValueError                                Traceback (most recent call last)
      ## .. 
      ## ValueError: Unable to parse values from
      

      SparseVector:

      Vectors.parse(str(Vectors.sparse(5, [], [])))
      ## ValueError                                Traceback (most recent call last)
      ##  ... 
      ## ValueError: Unable to parse indices from .
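
      The two failures above come down to empty bracket bodies: "[]" for the dense vector and "(5,[],[])" for the sparse one. Below is a minimal sketch of parsing that tolerates both cases. It is not the code from the merged pull request; parse_dense and parse_sparse are hypothetical helpers written only for illustration, and they return plain Python lists and tuples rather than Spark vector objects.

```python
def parse_dense(s):
    """Parse a string such as '[1.0,2.0]' into a list of floats.

    Hypothetical helper: an empty body ('[]') yields an empty list
    instead of raising.
    """
    s = s.strip()
    if not (s.startswith("[") and s.endswith("]")):
        raise ValueError("Dense vector string must be wrapped in '[]': %r" % s)
    body = s[1:-1].strip()
    if not body:
        # Size-0 vector: there are no values to convert.
        return []
    try:
        return [float(tok) for tok in body.split(",")]
    except ValueError:
        raise ValueError("Unable to parse values from %r" % s)


def parse_sparse(s):
    """Parse '(size,[i0,i1],[v0,v1])' into a (size, indices, values) tuple.

    Hypothetical helper: empty index and value lists are accepted.
    """
    s = s.strip()
    if not (s.startswith("(") and s.endswith(")")):
        raise ValueError("Sparse vector string must be wrapped in '()': %r" % s)
    size_part, sep, rest = s[1:-1].partition(",")
    if not sep:
        raise ValueError("Unable to parse size from %r" % s)
    size = int(size_part)
    rest = rest.strip()
    close = rest.find("]")
    if not rest.startswith("[") or close == -1:
        raise ValueError("Unable to parse indices from %r" % s)
    ind_body = rest[1:close].strip()
    val_part = rest[close + 1:].lstrip()
    if not val_part.startswith(","):
        raise ValueError("Unable to parse values from %r" % s)
    values = parse_dense(val_part[1:])
    # An empty index body means the vector has no stored entries.
    indices = [int(tok) for tok in ind_body.split(",")] if ind_body else []
    if len(indices) != len(values):
        raise ValueError("Indices and values must have the same length: %r" % s)
    return size, indices, values


print(parse_dense("[]"))
print(parse_sparse("(5,[],[])"))
```

      The key design choice is to check for an empty body before splitting on commas, since "".split(",") returns [""] and converting that empty token to a number is what raises.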
      

          Activity

            apachespark Apache Spark added a comment -

            User 'arashpa' has created a pull request for this issue:
            https://github.com/apache/spark/pull/12510

            arashpa Arash Parsa added a comment -

            Thanks for posting the ticket on Jira. I created the PR here:
            https://github.com/apache/spark/pull/12510


            zero323 Maciej Szymkiewicz added a comment -

            This solves only a small part of the problem. Right now both sparse and dense vector parsing are broken, not to mention that the corresponding tests are dead code.

            apachespark Apache Spark added a comment -

            User 'zero323' has created a pull request for this issue:
            https://github.com/apache/spark/pull/12511

            apachespark Apache Spark added a comment -

            User 'vishnu667' has created a pull request for this issue:
            https://github.com/apache/spark/pull/12512

            arashpa Arash Parsa added a comment -

            zero323, sure, I can adjust my PR (move the tests), but since I found the bug, do you think I should be getting the fix in?

            apachespark Apache Spark added a comment -

            User 'vishnu667' has created a pull request for this issue:
            https://github.com/apache/spark/pull/12513

            zero323 Maciej Szymkiewicz added a comment - - edited

            Sure, but your latest PR still doesn't resolve the problem with the dead tests. Instead of copying, you could pull the changes from my repo.


            zero323 Maciej Szymkiewicz added a comment -

            I extracted the relevant test fixes and made a PR against your branch.

            apachespark Apache Spark added a comment -

            User 'arashpa' has created a pull request for this issue:
            https://github.com/apache/spark/pull/12515

            arashpa Arash Parsa added a comment -

            Sorry, I wasn't able to pull from your branch. I submitted a new PR with the proper updates. Please let me know how it looks.

            vishnu667 Vishnu Prasad added a comment -

            I've merged your PR with your test fixes. Thank you.

            apachespark Apache Spark added a comment -

            User 'arashpa' has created a pull request for this issue:
            https://github.com/apache/spark/pull/12516


            josephkb Joseph K. Bradley added a comment -

            Note that this may be lower priority as we move linear algebra to mllib-local in SPARK-13944, but it would be good to fix.

            srowen Sean R. Owen added a comment -

            Issue resolved by pull request 12516
            https://github.com/apache/spark/pull/12516


            People

              Assignee: Arash Parsa (arashpa@gmail.com)
              Reporter: Maciej Szymkiewicz (zero323)
              Votes: 2
              Watchers: 5
