Mahout
  1. Mahout
  2. MAHOUT-979

RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.6, 0.7
    • Fix Version/s: 0.7
    • Component/s: Math
    • Labels:

      Description

      Presently RowSimilarityJob expects to be provided the 'numberOfColumns' (-r) by the user and this is not an optional argument. If this is not specified explicitly RowSimilarityJob should be able to get the number of columns by the vector size.

      1. Mahout-979.patch
        5 kB
        Suneel Marthi
      2. Mahout-979.patch
        5 kB
        Suneel Marthi

        Activity

        Sean Owen made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Hide
        Hudson added a comment -

        Integrated in Mahout-Quality #1467 (See https://builds.apache.org/job/Mahout-Quality/1467/)
        MAHOUT-979 cleanup: removing unused imports (Revision 1335742)
        MAHOUT-979 RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified (Revision 1335732)

        Result = SUCCESS
        ssc : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335742
        Files :

        • /mahout/trunk/core/src/main/java/org/apache/mahout/common/AbstractJob.java
        • /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJob.java

        ssc : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335732
        Files :

        • /mahout/trunk/core/src/main/java/org/apache/mahout/common/AbstractJob.java
        • /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJob.java
        • /mahout/trunk/core/src/test/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJobTest.java
        Show
        Hudson added a comment - Integrated in Mahout-Quality #1467 (See https://builds.apache.org/job/Mahout-Quality/1467/ ) MAHOUT-979 cleanup: removing unused imports (Revision 1335742) MAHOUT-979 RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified (Revision 1335732) Result = SUCCESS ssc : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335742 Files : /mahout/trunk/core/src/main/java/org/apache/mahout/common/AbstractJob.java /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJob.java ssc : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335732 Files : /mahout/trunk/core/src/main/java/org/apache/mahout/common/AbstractJob.java /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJob.java /mahout/trunk/core/src/test/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJobTest.java
        Hide
        Suneel Marthi added a comment - - edited

        Thanks for this Sebastian, see that you had modified the patch and cleaned up my code.

        There are 2 unused imports in RowSimilarityJob.java that I had introduced and should have been taken out, could you take care of that.

        Show
        Suneel Marthi added a comment - - edited Thanks for this Sebastian, see that you had modified the patch and cleaned up my code. There are 2 unused imports in RowSimilarityJob.java that I had introduced and should have been taken out, could you take care of that.
        Hide
        Sebastian Schelter added a comment -

        Patch committed. Thanks for the great work, Suneel!

        Show
        Sebastian Schelter added a comment - Patch committed. Thanks for the great work, Suneel!
        Sebastian Schelter made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Sebastian Schelter added a comment -

        I already changed that

        Show
        Sebastian Schelter added a comment - I already changed that
        Hide
        Suneel Marthi added a comment -

        Modified getDimensions() to take the inputPath as a parameter.

        Show
        Suneel Marthi added a comment - Modified getDimensions() to take the inputPath as a parameter.
        Hide
        Sebastian Schelter added a comment -

        What did you change in the latest patch?

        Show
        Sebastian Schelter added a comment - What did you change in the latest patch?
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12526034 ]
        Suneel Marthi made changes -
        Status Reopened [ 4 ] Patch Available [ 10002 ]
        Hide
        Sebastian Schelter added a comment -

        I'm currently working in your patch. Let me run the tests myself.

        Your code looks very good, but you should only post patches that already passed the unit tests.

        Show
        Sebastian Schelter added a comment - I'm currently working in your patch. Let me run the tests myself. Your code looks very good, but you should only post patches that already passed the unit tests.
        Hide
        Suneel Marthi added a comment -

        Reopening the issue as the submitted patch is failing unit tests.

        Show
        Suneel Marthi added a comment - Reopening the issue as the submitted patch is failing unit tests.
        Suneel Marthi made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12525937 ]
        Hide
        Suneel Marthi added a comment -

        Uploaded patch with fixes based on Sebastian's feedback.

        Show
        Suneel Marthi added a comment - Uploaded patch with fixes based on Sebastian's feedback.
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12525985 ]
        Hide
        Sebastian Schelter added a comment -

        Patch looks very good. I have one change request: could you open the reader for the sequence file in getDimensions() and close it in a finally block?

        Show
        Sebastian Schelter added a comment - Patch looks very good. I have one change request: could you open the reader for the sequence file in getDimensions() and close it in a finally block?
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12525927 ]
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12525937 ]
        Suneel Marthi made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12525925 ]
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12525927 ]
        Suneel Marthi made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Suneel Marthi made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Suneel Marthi made changes -
        Attachment Mahout-979.patch [ 12525925 ]
        Suneel Marthi made changes -
        Status In Progress [ 3 ] Patch Available [ 10002 ]
        Affects Version/s 0.7 [ 12319261 ]
        Fix Version/s 0.7 [ 12319261 ]
        Hide
        Suneel Marthi added a comment - - edited

        Added method getDimensions() in AbstractJob.java, this would be invoked if the 'numberOfColumns' is not specified as a CLI argument while executing the RowSimilarityJob.

        Show
        Suneel Marthi added a comment - - edited Added method getDimensions() in AbstractJob.java, this would be invoked if the 'numberOfColumns' is not specified as a CLI argument while executing the RowSimilarityJob.
        Suneel Marthi made changes -
        Field Original Value New Value
        Status Open [ 1 ] In Progress [ 3 ]
        Suneel Marthi created issue -

          People

          • Assignee:
            Suneel Marthi
            Reporter:
            Suneel Marthi
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development