Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
In LDA
http://madlib.apache.org/docs/latest/group__grp__lda.html
implement warm start so that training can pick up from where the last run left off.
I would suggest we model this on the warm start implemented in MLP
http://madlib.apache.org/docs/latest/group__grp__nn.html
since it will be the same general idea for LDA.
The LDA interface will be:

lda_train( data_table,
           model_table,
           output_data_table,
           voc_size,
           topic_num,
           iter_num,
           alpha,
           beta,
           evaluate_every,
           perplexity_tol,
           warm_start   -- new param
         )

warm_start (optional)
BOOLEAN, default: FALSE. Initialize the model with the state from the last call of the training function. If set to TRUE, the model will be initialized from the model_table generated by the previous run. Note that the parameters voc_size and topic_num must remain constant between calls when warm_start is used; other parameters can be changed for the warm-start run.
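A sketch of how a warm-start call sequence might look with the proposed parameter. This is illustrative only: the table names and parameter values are made up, and the warm_start argument does not exist in MADlib yet.

```sql
-- First pass: 20 iterations from a fresh (random) initialization.
-- voc_size = 5000 and topic_num = 10 must not change in later warm starts.
SELECT madlib.lda_train('my_docs', 'my_model', 'my_outdata',
                        5000, 10, 20, 5, 0.01);

-- Resume: 10 more iterations, initialized from the state in my_model.
-- Other parameters (e.g. iter_num) may differ from the first run.
SELECT madlib.lda_train('my_docs', 'my_model', 'my_outdata',
                        5000, 10, 10, 5, 0.01,
                        NULL, NULL,   -- evaluate_every, perplexity_tol
                        TRUE);        -- warm_start (proposed new param)
```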
Open questions
1) Validate this statement:
Note that parameters voc_size and topic_num must remain constant between calls when warm_start is used. Other parameters can be changed for the warm start run.
Notes
1) Depending on open question #1 above, add validation checks on user input to ensure that the user does not change any parameter that they are not allowed to change from the previous run.
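The validation check in note #1 could be sketched as below. This is a hypothetical helper, not MADlib's actual validation code: the function name, the dict-based parameter passing, and the choice of frozen parameters (pending open question #1) are all assumptions for illustration.

```python
def validate_warm_start(prev_params, new_params):
    """Reject a warm-start run whose model-shape parameters differ from the
    previous run's.  `prev_params` holds the parameters recorded with the
    existing model_table; `new_params` holds those supplied to the new call.
    (Hypothetical helper; MADlib's real validation would live in its
    Python driver layer.)"""
    frozen = ("voc_size", "topic_num")  # must match between runs
    for name in frozen:
        if prev_params[name] != new_params[name]:
            raise ValueError(
                "warm_start=TRUE requires %s to match the previous run "
                "(%s != %s)" % (name, prev_params[name], new_params[name]))
```

Parameters outside the frozen set (e.g. iter_num, alpha, beta) are deliberately not checked, matching the proposed semantics that only voc_size and topic_num must stay constant.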