Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: MLlib
    • Labels:
      None

      Description

      RowMatrix has a columnSimilarities method to find cosine similarities between columns.

      A rowSimilarities method would be useful to find similarities between rows.

      This JIRA is to investigate which algorithms are suitable for such a method and scale better than brute force. Note that when there are many rows (> 10^6), brute force is unlikely to be feasible, since the output will be of order 10^12 entries.
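
To make the scaling concern concrete, here is a minimal brute-force sketch in plain Python (not Spark code, and not the API proposed in this issue). The helper names `row_similarities` and `num_pairs` are hypothetical and chosen only for illustration. It computes the cosine similarity for every pair of rows and counts how many pairs a matrix with n rows produces.

```python
import math
from itertools import combinations


def row_similarities(rows):
    """Brute-force cosine similarity between every pair of rows.

    Returns a dict mapping (i, j) with i < j to the cosine similarity
    of rows i and j. Pairs involving an all-zero row are skipped.
    """
    norms = [math.sqrt(sum(x * x for x in row)) for row in rows]
    sims = {}
    for i, j in combinations(range(len(rows)), 2):
        if norms[i] == 0.0 or norms[j] == 0.0:
            continue  # cosine similarity is undefined for a zero vector
        dot = sum(a * b for a, b in zip(rows[i], rows[j]))
        sims[(i, j)] = dot / (norms[i] * norms[j])
    return sims


def num_pairs(n):
    """Number of unordered row pairs that brute force must evaluate."""
    return n * (n - 1) // 2


# Tiny illustrative input: rows 0 and 2 are parallel, row 1 is orthogonal to both.
rows = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
sims = row_similarities(rows)
pairs_for_million_rows = num_pairs(10**6)
```

For n = 10^6 rows the pair count is about 5 x 10^11, and a full n x n similarity matrix has 10^12 entries, which is where the order-of-10^12 estimate above comes from.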

        Attachments

        1. MovieLensSimilarity Comparisons.pdf
          93 kB
          Debasish Das
        2. SparkMeetup2015-Experiments1.pdf
          64 kB
          Debasish Das
        3. SparkMeetup2015-Experiments2.pdf
          56 kB
          Debasish Das

              People

              • Assignee:
                Unassigned
                Reporter:
                rezazadeh Reza Zadeh
              • Votes:
                4
                Watchers:
                16
