  1. Apache MADlib
  2. MADLIB-1351

Add stopping criteria on perplexity to LDA

Details

    Description

      In LDA
      http://madlib.apache.org/docs/latest/group__grp__lda.html
      add a stopping criterion based on perplexity, rather than relying only on the number of iterations.

      The suggested approach is to follow what scikit-learn does, using these two parameters:
      https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html

      evaluate_every : int, optional (default=0)
      How often to evaluate perplexity. Set it to 0 or negative number to not evaluate perplexity in training at all. Evaluating perplexity can help you check convergence in training process, but it will also increase total training time. Evaluating perplexity in every iteration might increase training time up to two-fold.

      perplexity_tol : float, optional (default=1e-1)
      Perplexity tolerance to stop iterating. Only used when evaluate_every is greater than 0.
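
      A minimal sketch of how the two parameters above could drive a training loop. This is not MADlib's implementation or scikit-learn's source; the function name and the `step` and `perplexity` callables are hypothetical placeholders for one training iteration and one perplexity evaluation. The sketch assumes perplexity is checked every `evaluate_every` iterations and training stops once the change between two consecutive checks falls below `perplexity_tol`.

      ```python
      from typing import Callable, Optional, Tuple


      def run_until_perplexity_converges(
          step: Callable[[], None],          # hypothetical: runs one training iteration
          perplexity: Callable[[], float],   # hypothetical: evaluates current perplexity
          max_iter: int,
          evaluate_every: int = 0,
          perplexity_tol: float = 1e-1,
      ) -> Tuple[int, Optional[float]]:
          """Return (iterations run, last perplexity evaluated or None)."""
          last: Optional[float] = None
          for i in range(1, max_iter + 1):
              step()
              # evaluate_every <= 0 disables perplexity checks entirely,
              # so only the iteration cap stops training.
              if evaluate_every > 0 and i % evaluate_every == 0:
                  current = perplexity()
                  # Stop when two consecutive evaluations differ by less
                  # than the tolerance.
                  if last is not None and abs(last - current) < perplexity_tol:
                      return i, current
                  last = current
          return max_iter, last
      ```

      With `evaluate_every=0` the loop behaves like the current iteration-count-only behavior, so the extra evaluation cost is paid only when the caller opts in.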

            People

              hpandey Himanshu Pandey
              fmcquillan Frank McQuillan

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 14h 50m