Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v2.0
    • Labels:
      None

      Description

      Follow-on from https://issues.apache.org/jira/browse/MADLIB-413 and https://issues.apache.org/jira/browse/MADLIB-1134

      Story

      As a data scientist,
      I want to use grouping in MLP,
      so that I can run multiple models at the same time.


          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user njayaram2 opened a pull request:

          https://github.com/apache/madlib/pull/179

          MLP: Add grouping support to neural nets

          JIRA: MADLIB-1149

          Changes to support grouping with neural nets. Changes include:

          • Standardize independent features by group.
          • Use GroupIterationController to iterate.
          • Create a temp scaled input table to be used by the
            GroupIterationController
          • Create a new UDF that converts a standard deviation value of
            0.0 to 1.0 when standardizing the independent variable.
          • Update docs, add more examples.
          • Refactor some utility functions code that was used by other
            modules such as svm and elastic_net.
          • Update install check test cases scenario for MLP.
          • Create a new standardization output table that stores the
            x_mean and x_std per group.
          • Add a new method in in_mem_group_control.py_in that returns
            back a specific index's value from the state variable.

          Closes #178
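The per-group standardization in the change list above can be illustrated with a small, self-contained Python sketch. It is not the MADlib implementation: the function name `standardize_by_group`, the in-memory row layout, and the use of the population standard deviation are all assumptions made for illustration. It shows only the two ideas named in the list: features are scaled with the mean and standard deviation of their own group, and a standard deviation of 0.0 is replaced by 1.0 so that constant columns do not cause a division by zero.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def standardize_by_group(rows):
    """Scale feature vectors using the mean/std of their own group.

    rows: list of (group_key, feature_vector) pairs (hypothetical layout).
    Returns (scaled_rows, stats), where stats maps each group key to
    (x_mean, x_std), mirroring the per-group x_mean and x_std values
    mentioned in the change list.
    """
    by_group = defaultdict(list)
    for key, x in rows:
        by_group[key].append(x)

    stats = {}
    for key, vectors in by_group.items():
        columns = list(zip(*vectors))
        x_mean = [fmean(col) for col in columns]
        # Assumption: population standard deviation. A constant column
        # yields 0.0, which is swapped for 1.0 to avoid dividing by zero.
        x_std = [pstdev(col) or 1.0 for col in columns]
        stats[key] = (x_mean, x_std)

    scaled = [
        (key, [(v - m) / s for v, m, s in zip(x, *stats[key])])
        for key, x in rows
    ]
    return scaled, stats


rows = [
    ("Alaska", [1.0, 5.0]),
    ("Alaska", [3.0, 5.0]),
    ("Tennessee", [10.0, 2.0]),
]
scaled, stats = standardize_by_group(rows)
```

Here the second feature is constant within the Alaska group, so its standard deviation is treated as 1.0 and the scaled value is simply 0.0.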

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/njayaram2/madlib features/mlp_grouping

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/madlib/pull/179.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #179


          commit d9d7a3efca47c23958d77de61434efad1d05f706
          Author: Nandish Jayaram <njayaram@apache.org>
          Date: 2017-08-24T19:39:27Z

          MLP: Add grouping support to neural nets


          fmcquillan Frank McQuillan added a comment -

          Suggestions for neural nets user docs with grouping:

          1) Use

          INSERT INTO iris_data VALUES
          (1,ARRAY[5.1,3.5,1.4,0.2],'Iris-setosa',1),
          …

          instead of

          COPY iris_data (attributes, class_text, class, state) FROM STDIN NULL '?' DELIMITER '|';
          {5.0,3.2,1.2,0.2}|Iris_setosa|1|Alaska
          …

          since it makes it more Jupyter notebook friendly.

          2) In example 15, did you mean to do n_iterations=500 (not 50) to show a lower loss than examples 13-14?
          Also example 17 uses n_iterations=500 which makes me think example 15 also should use 500.

          Same question about example 21.

          3) In the prediction section, can you include at least one query that counts misclassifications (for classification) and one that computes RMS error (for regression), like in the 1.12 user docs?

          4) In example 1 at the top, drop the table if it exists before creating the new one:
          DROP TABLE IF EXISTS iris_data;

          5) In example 12 drop table if exists lin_housing before creating it.

          6) May want to add ORDER BYs to your results output in prediction sections.

          Thanks,
          Frank
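The two metrics requested in suggestion 3 can be made concrete with a short sketch. The user docs would express these as SQL queries over the prediction output table; the Python below only illustrates the arithmetic, and the function names `count_misclassifications` and `rms_error` are hypothetical, not part of MADlib.

```python
import math


def count_misclassifications(actual, predicted):
    """Number of rows whose predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(1 for a, p in zip(actual, predicted) if a != p)


def rms_error(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Classification: one of three rows is predicted incorrectly.
n_wrong = count_misclassifications(
    ["Iris_setosa", "Iris_versicolor", "Iris_setosa"],
    ["Iris_setosa", "Iris_setosa", "Iris_setosa"],
)

# Regression: errors of 3, 4, 0 and 0 give sqrt(25 / 4) = 2.5.
rmse = rms_error([3.0, 4.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])
```

With grouping enabled, the same metrics would normally be reported per group so that a poorly fitting model is not hidden by an aggregate figure.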

          njayaram Nandish Jayaram added a comment -

          Thank you for the comments, Frank McQuillan. I will make the changes.

          githubbot ASF GitHub Bot added a comment -

          GitHub user asfgit closed the pull request at:

          https://github.com/apache/madlib/pull/179


            People

            • Assignee:
              njayaram Nandish Jayaram
              Reporter:
              fmcquillan Frank McQuillan
            • Votes:
              0
              Watchers:
              3
