Hivemall / HIVEMALL-243

Fix nominal variable handling in DecisionTree and RegressionTree


    Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.5.2
    • Fix Version/s: 0.6.0
    • Labels: None

      Description

      For a NOMINAL variable, the maximum attribute index 'm' is used when computing splits.

      This causes performance issues for sparse nominal variables, so the handling should be revised for better performance.

      https://github.com/apache/incubator-hivemall/blob/master/core/src/main/java/hivemall/smile/classification/DecisionTree.java#L703
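
To illustrate the kind of cost described above, here is a minimal sketch. It is not the Hivemall or Smile implementation. It assumes equality splits of the form "x == v" scored by Gini impurity reduction, and all names in it (`NominalSplitSketch`, `bestSplitDense`, `bestSplitSparse`) are hypothetical. The dense variant sizes its count table by 'm' and scans every index below it; the sparse variant only tracks values that actually occur in the samples.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from DecisionTree.java.
final class NominalSplitSketch {

    // Gini impurity of a class-count vector holding n samples in total.
    static double gini(int[] counts, int n) {
        if (n == 0) return 0.0;
        double sumSq = 0.0;
        for (int c : counts) {
            double p = (double) c / n;
            sumSq += p * p;
        }
        return 1.0 - sumSq;
    }

    // Impurity reduction of the split "x == v", given class counts on the true branch.
    static double gain(int[] total, int[] trueCount, int n) {
        int k = total.length;
        int[] falseCount = new int[k];
        int tn = 0;
        for (int c = 0; c < k; c++) {
            tn += trueCount[c];
            falseCount[c] = total[c] - trueCount[c];
        }
        int fn = n - tn;
        if (tn == 0 || fn == 0) return 0.0; // split separates nothing
        return gini(total, n)
                - ((double) tn / n) * gini(trueCount, tn)
                - ((double) fn / n) * gini(falseCount, fn);
    }

    static int[] classTotals(int[] y, int k) {
        int[] total = new int[k];
        for (int label : y) total[label]++;
        return total;
    }

    // Dense variant: allocates an m-by-k table and evaluates all m candidate values,
    // even when only a handful of them occur in x.
    static int bestSplitDense(int[] x, int[] y, int k, int m) {
        int[][] trueCount = new int[m][k];
        for (int i = 0; i < x.length; i++) trueCount[x[i]][y[i]]++;
        int[] total = classTotals(y, k);
        int best = -1;
        double bestGain = 0.0;
        for (int v = 0; v < m; v++) {
            double g = gain(total, trueCount[v], x.length);
            if (g > bestGain) { bestGain = g; best = v; }
        }
        return best;
    }

    // Sparse variant: keeps counts only for observed values, so the work scales
    // with the number of distinct values in x rather than with m.
    static int bestSplitSparse(int[] x, int[] y, int k) {
        Map<Integer, int[]> trueCount = new TreeMap<>(); // sorted, so ties resolve as in the dense scan
        for (int i = 0; i < x.length; i++) {
            trueCount.computeIfAbsent(x[i], v -> new int[k])[y[i]]++;
        }
        int[] total = classTotals(y, k);
        int best = -1;
        double bestGain = 0.0;
        for (Map.Entry<Integer, int[]> e : trueCount.entrySet()) {
            double g = gain(total, e.getValue(), x.length);
            if (g > bestGain) { bestGain = g; best = e.getKey(); }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] x = {5, 5, 900, 900, 17}; // three distinct values, largest index 900
        int[] y = {0, 0, 1, 1, 1};
        System.out.println(bestSplitDense(x, y, 2, 1000)); // scans 1000 candidates
        System.out.println(bestSplitSparse(x, y, 2));      // scans 3 candidates
    }
}
```

Both variants select the same split value; the difference is only in how many candidates are allocated and scanned, which is what grows when a nominal variable is sparse and 'm' is large.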

            People

            • Assignee: Makoto Yui (myui)
            • Reporter: Makoto Yui (myui)
