Details
- Improvement
- Status: Resolved
- Major
- Resolution: Incomplete
- None
- None
Description
SPARK-8536 generalizes LDA to asymmetric document-topic priors, which Wallach et al. argue offer greater utility than symmetric ones.
However, Stanford NLP also permits asymmetric topic-word priors. We should not support manually specifying the entire matrix, which has numTopics * vocabSize entries. Instead, we should follow Stanford NLP and take a single vector of length vocabSize as a prior over words, and assume that all topics share this prior (i.e. replicate it numTopics times to get the topic-word prior matrix).
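To illustrate the replication step, here is a minimal sketch in plain Python. It is not Spark or Stanford NLP code: the function name `replicate_topic_word_prior`, its parameters, and the sample values are hypothetical, and the positivity check simply reflects that Dirichlet concentration parameters must be greater than zero.

```python
def replicate_topic_word_prior(word_prior, num_topics):
    """Build a numTopics x vocabSize prior matrix from one per-word prior vector.

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    if num_topics <= 0:
        raise ValueError("num_topics must be positive")
    if not word_prior:
        raise ValueError("word_prior must have vocabSize > 0 entries")
    if any(value <= 0 for value in word_prior):
        raise ValueError("Dirichlet concentration parameters must be positive")
    # Copy the vector once per topic so the rows do not alias each other.
    return [list(word_prior) for _ in range(num_topics)]


# Hypothetical asymmetric prior over a vocabulary of 4 words.
word_prior = [0.1, 0.5, 0.1, 2.0]
topic_word_prior = replicate_topic_word_prior(word_prior, num_topics=3)
```

The user supplies only vocabSize values, and every row of the resulting numTopics x vocabSize matrix is the same vector, so the prior is asymmetric across words but identical across topics.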
We are leaving this as a TODO; users who need this feature should discuss it on this JIRA.
Issue Links
- is part of: SPARK-5572 LDA improvement listing (Resolved)
- is related to: SPARK-8536 Generalize LDA to asymmetric doc-topic priors (Resolved)