Apache MADlib / MADLIB-1352

Add warm start to LDA


Details

    Description

      In LDA
      http://madlib.apache.org/docs/latest/group__grp__lda.html
      implement warm start so that training can pick up from where the last run left off.

      I would suggest we model this on the warm start implemented in MLP
      http://madlib.apache.org/docs/latest/group__grp__nn.html
      since the general idea will be the same for LDA.

      The LDA interface will be:

      lda_train( data_table,
                 model_table,
                 output_data_table,
                 voc_size,
                 topic_num,
                 iter_num,
                 alpha,
                 beta,
                 evaluate_every,
                 perplexity_tol,
                 warm_start               -- new param
               )
      
      warm_start (optional)
      BOOLEAN, default: FALSE. If set to TRUE, the model will be initialized from the model_table generated by the previous run of the training function, instead of starting from scratch. Note that parameters voc_size and topic_num must remain constant between calls when warm_start is used. Other parameters can be changed for the warm start run.
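
      As a usage sketch for the proposed interface (parameter values are illustrative, not from the source), a cold run followed by a warm-start continuation might look like:

      ```sql
      -- First (cold) run: 20 iterations from scratch.
      SELECT madlib.lda_train( 'documents',   -- data_table
                               'lda_model',   -- model_table
                               'lda_output',  -- output_data_table
                               104,           -- voc_size
                               5,             -- topic_num
                               20,            -- iter_num
                               5,             -- alpha
                               0.01,          -- beta
                               NULL,          -- evaluate_every
                               NULL,          -- perplexity_tol
                               FALSE          -- warm_start
                             );

      -- Warm-start run: 10 more iterations continuing from lda_model.
      -- voc_size and topic_num must match the previous call.
      SELECT madlib.lda_train( 'documents',
                               'lda_model',
                               'lda_output',
                               104,
                               5,
                               10,            -- iter_num may change
                               5,
                               0.01,
                               NULL,
                               NULL,
                               TRUE           -- warm_start
                             );
      ```

      Mirroring the MLP warm start, the second call reads its initial state from model_table rather than re-initializing it.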
      

      Open questions

      1) Validate this statement:

      Note that parameters voc_size and topic_num must remain constant between calls when warm_start is used. Other parameters can be changed for the warm start run.
      

      Notes

      1) Depending on the answer to open question #1 above, add validation checks on user input to ensure that the user does not change any parameter that they are not allowed to change from the previous run.
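
      Since MADlib's driver functions are written in Python, the check could live there. A minimal sketch, assuming voc_size and topic_num are the frozen parameters (function and parameter names here are hypothetical, not MADlib's actual API):

      ```python
      # Hypothetical warm-start validation sketch; names are illustrative,
      # not MADlib's actual internal API.

      def validate_warm_start_params(prev_params, new_params,
                                     frozen=("voc_size", "topic_num")):
          """Raise if a parameter that must stay constant differs between
          the previous run and the warm-start run."""
          for name in frozen:
              if prev_params[name] != new_params[name]:
                  raise ValueError(
                      "Parameter '{0}' cannot change between warm-start "
                      "calls: was {1}, got {2}".format(
                          name, prev_params[name], new_params[name]))

      # Example: changing iter_num is allowed, changing topic_num is not.
      prev = {"voc_size": 104, "topic_num": 5, "iter_num": 20}
      ok   = {"voc_size": 104, "topic_num": 5, "iter_num": 10}
      validate_warm_start_params(prev, ok)  # passes silently
      ```

      The frozen set stays a single tuple so the answer to open question #1 only changes one line.
      
      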

          People

            hpandey Himanshu Pandey
            fmcquillan Frank McQuillan
