Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user holdenk opened a pull request:

        https://github.com/apache/mahout/pull/340

        MAHOUT-2015 [WIP]: Expose Mahout's OLS algorithm in the Spark ML API

            1. Purpose of PR:
              Expose Mahout's OLS algorithm in the Spark ML API
            2. Important ToDos
              Please mark each with an "x"
        • [ X ] A JIRA ticket exists (if not, please create this first) https://issues.apache.org/jira/browse/ZEPPELIN/
        • [ X ] Title of PR is "MAHOUT-XXXX Brief Description of Changes" where XXXX is the JIRA number.
        • [ X ] Created unit tests where appropriate
        • [ X ] Added licenses correctly on newly added files
        • [ X ] Assigned JIRA to self
        • [ ] Added documentation in scala docs/java docs, and to website
        • [ ] Successfully built and ran all unit tests, verified that all tests pass locally.

        If all of these things aren't complete, but you still feel it is
        appropriate to open a PR, please add [WIP] after MAHOUT-XXXX before the
        descriptions- e.g. "MAHOUT-XXXX [WIP] Description of Change"

        Does this change break earlier versions?

        No

        Is this the beginning of a larger project for which a feature branch should be made?

        No

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/holdenk/mahout add-pipelinesupport-magic

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/mahout/pull/340.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #340


        commit 0e3a9f935cf35bd2fe3b913dc18c374ac409baaa
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-17T02:51:20Z

        Start working on porting the first algorithm by hand

        commit e52f11b991d420d5f618d755d1d1a1dbccc9a7d9
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-26T00:32:41Z

        Start thinking about how to do Spark pipelines for only 2+

        commit 29cd357f97ea85a59a6b228219d0de8407be7bf1
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-26T00:33:08Z

        Allow us to construct instances of OrderedIntDoubleMapping for making sparse vectors

        commit bb57010809e9cf28e50dc2a3c06e2453ec2ddc37
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-26T00:33:40Z

        Work on the Spark Estimator

        commit 4f03eb664a42b5df9c7c1c93df395e26ff068d23
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-26T00:34:02Z

        Continue working on base classes and converters

        commit e7f428350d226e15434f0b6d7b6529a1a19f071c
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-26T00:41:27Z

        Change the type params so we can extend the predictor class correctly

        commit 82a6d144df42e252855005d9a250b2031274bc48
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-30T23:01:40Z

        Ok build successful now lets make it reasonable[ish]:

        commit c17fdc1c95f5805b111c59bf527b1dc08af6258f
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-30T23:14:55Z

        Ok don't specify the return model type in the type params since it seems to make the compiler confused.

        commit 20a58863ff2eba2597253cd202f8ad0fc66e7cb5
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-30T23:16:47Z

        We already have good enough testing basics

        commit 788de480f4b72afb1d1b2b453dabf1eb18686ba6
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-07-30T23:17:22Z

        Testing with local[1] is going to hide a whole class of bugs

        commit 73b3669346c17e55945910d07ade90aa72006e3f
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-09-20T20:49:30Z

        Add pipeline tests

        commit 632ffd80c956d2faa640dde8bfec506fcebea621
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-09-20T20:54:56Z

        Remove unnecessary SuperVisedSparkEstimator in sparkbindings

        commit 699bd68d66cc97aed8fc82aa20aa6381262e656e
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-09-20T22:15:00Z

        Remove serializable and switch to Kryo

        commit 9ee28233e41a62aa4c89ece4a803aff2c8f76d50
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-09-20T22:19:45Z

        Remove unused/broken import

        commit 4f2c1fde8cf8959cb091577640c738b9a6287f5e
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-09-21T16:08:07Z

        Use the built-in test framework for now (eventually want to test without mahout ctx for bootstrapping but this is a good first step) – fix reflection used to convert the DRM back to pipeline stage

        commit 7b4fd3dc79e099610e37d7434afefd8918f134ad
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-09-21T20:05:53Z

        Switch back to 1.6.3

        commit 7f45935cd567ae0c0d0e40c228c91417ec5b94b4
        Author: Holden Karau <holden@us.ibm.com>
        Date: 2017-09-21T20:07:26Z

        Remove repo since not using different test lib


        githubbot ASF GitHub Bot added a comment -

        Github user rawkintrevo commented on the issue:

        https://github.com/apache/mahout/pull/340

        @holdenk Status on this? Would like to call code freeze in the next week or so, willing to hold the door open a bit longer to get this in if you think it can be done.

        githubbot ASF GitHub Bot added a comment -

        Github user rawkintrevo commented on the issue:

        https://github.com/apache/mahout/pull/340

        Bump


          People

          • Assignee:
            holdenk
            Reporter:
            holdenk
          • Votes:
            0
            Watchers:
            2
