Apache MADlib / MADLIB-777

SVM Regression produces different predictions on multiple runs of the same training and test sets.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Abandoned
    • Affects Version/s: None
    • Fix Version/s: v1.9
    • Component/s: None
    • Labels: None

    Description

      I tested this on MADlib 0.7, but I am not sure whether it is version-specific.

      I am attaching the training and test tables (combo_svm_train.sql and combo_svm_dev.sql), along with the SQL file that trains MADlib's SVM Regression and then predicts on the dev set using the trained model.

      Each time you run the training and prediction, you get wildly different prediction results (the R^2 varied between -0.50 and 0.50 across the several runs I tried).
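
      For reference, a minimal sketch of how R^2 can be computed for one run, assuming the dev-set targets and the model's predictions have been exported as plain lists. The function name `r_squared` and the sample values are illustrative only and are not taken from the attached files. A negative value means the predictions fit worse than simply predicting the mean of the targets.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not from the attached tables).
actual = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(actual, [1.0, 2.0, 3.0, 4.0])    # exact predictions
baseline = r_squared(actual, [2.5, 2.5, 2.5, 2.5])   # always predict the mean
reversed_fit = r_squared(actual, [4.0, 3.0, 2.0, 1.0])  # worse than the mean
print(perfect, baseline, reversed_fit)
```

      Computing this value once per training run makes the run-to-run spread easy to compare.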

      I am not sure whether this is expected behavior or whether there is an error I have overlooked. If it is expected, the models are unusable unless I train multiple models in parallel and use some sort of voting to reduce the variation. Otherwise, this seems like a serious bug.
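
      A rough sketch of the voting workaround mentioned above, assuming each training run's dev-set predictions have been exported as a list in the same row order. For regression, "voting" is taken here to mean a simple per-row average; the helper name `average_predictions` and the numbers are hypothetical. Averaging can damp run-to-run variation, but it does not address whatever makes the individual runs differ.

```python
def average_predictions(runs):
    """Average per-row predictions across several independently trained models."""
    if not runs:
        raise ValueError("at least one run is required")
    n_rows = len(runs[0])
    if any(len(run) != n_rows for run in runs):
        raise ValueError("every run must predict the same number of rows")
    # zip(*runs) groups the predictions for each row across all runs.
    return [sum(row) / len(runs) for row in zip(*runs)]


# Hypothetical predictions for three dev-set rows from three training runs.
runs = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0],
]
ensemble = average_predictions(runs)
print(ensemble)
```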


          People

            Assignee: Rahul Iyer (riyer)
            Reporter: Srivatsan Ramanujam (vatsan)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:
