  1. Mahout
  2. MAHOUT-1940

Provide a Java API to SimilarityAnalysis and any other needed APIs

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Algorithms, cooccurrence
    • Labels: None

      Description

      We want to port the functionality of org.apache.mahout.math.cf.SimilarityAnalysis.scala to Java, for easy integration with a Java project we will be creating that derives a similarity measure from the co-occurrence and cross-occurrence matrices.

        Activity

        pferrel Pat Ferrel added a comment -

        This would be awesome! Let me know if you need help. Some things in there are no longer required: I duplicated some methods only to maintain backward compatibility while adding new features.

        I also implemented some new helper object `apply` functions, which are alternative constructors, outside of Mahout in the PredictionIO Universal Recommender Template. When 0.5.1 of the Template is released, concurrent with PIO 0.11.0 and Mahout 0.13.0, the ones in the Template code are all you will need for porting the Template to Java.

        To make SimilarityAnalysis complete and get it accepted into Mahout, you'd probably need to port all of the SimilarityAnalysis class and IndexedDatasetSpark.

        james_mackey James Mackey added a comment -

        Hi Pat! Thanks for the offer - we would really appreciate some guidance from you. Would you mind meeting with us virtually over the next couple of days to go over the file structure and what exactly we have to implement to make this happen?

        dlyubimov Dmitriy Lyubimov added a comment -

        Normally, someone writing in Java does not really have to port anything from Scala.
        For example, Spark's Java APIs are in fact implemented in Scala.

        There are normally two ways of going about this:
        (1) write the APIs in Java and implement them in Scala (the way Spark does), or
        (2) write Java-compatible traits in Scala and then implement them in Scala as well (which is what I do, as it reduces complexity a bit).

        For approach (2), the APIs should use only Java-compatible types: no Scala libraries (such as collections) and no incompatible language constructs (such as implicits, curried functions, generic context bounds, etc.). Implementing the API interfaces in Java verifies this a bit better and allows avoiding a mixed build (which may sometimes be a problem due to circular dependencies between Java and Scala code).
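
        As a rough illustration of the "Java-compatible types only" rule, here is a minimal sketch of what a Java-facing interface could look like. The names `JavaSimilarityApi` and `NaiveCooccurrence` are hypothetical and are not part of Mahout. The implementation is a toy raw co-occurrence counter written in Java only so that the sketch is self-contained; it is not the LLR-based logic in SimilarityAnalysis. In the setup described above, the implementing class would instead be written in Scala and delegate to the existing code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical Java-facing API (not a Mahout class). The signature uses only
// java.util types: no Scala collections, implicits, or curried parameters.
interface JavaSimilarityApi {
    // userItems: user ID -> item IDs that user interacted with.
    // Returns: item ID -> (other item ID -> number of users who have both).
    Map<String, Map<String, Integer>> cooccurrenceCounts(Map<String, List<String>> userItems);
}

// Toy stand-in implementation (assumption: plain counting, no LLR filtering).
// A real implementation would live in Scala and call SimilarityAnalysis.
class NaiveCooccurrence implements JavaSimilarityApi {
    @Override
    public Map<String, Map<String, Integer>> cooccurrenceCounts(Map<String, List<String>> userItems) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (List<String> items : userItems.values()) {
            // De-duplicate so a repeated item counts once per user.
            Set<String> distinct = new HashSet<>(items);
            for (String a : distinct) {
                for (String b : distinct) {
                    if (!a.equals(b)) {
                        counts.computeIfAbsent(a, k -> new HashMap<>())
                              .merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }
}
```

        A Java caller then depends only on the interface, so the Scala side can change freely as long as the signature stays within Java-compatible types.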


          People

          • Assignee: Unassigned
          • Reporter: James Mackey (james_mackey)
          • Votes: 4
          • Watchers: 7
