[SPARK-9134] LDA Asymmetric topic-word prior - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: None
Fix Version/s: None
Component/s: MLlib
Labels:
- bulk-closed

Description

~~SPARK-8536~~ generalizes LDA to asymmetric document-topic priors, which Wallach et al. argue offer greater utility than symmetric priors.

However, Stanford NLP also permits an asymmetric topic-word prior. We should not support manually specifying the entire matrix (which has numTopics * vocabSize entries); rather, we should follow Stanford NLP and take a single vector of length vocabSize as a prior over words, and assume that all topics share this prior (i.e. replicate it numTopics times to get the topic-word prior matrix).
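The replication step described above can be sketched as follows. This is only an illustration in plain Python, not Spark or Stanford NLP code: the function name `expand_topic_word_prior`, its parameters, and the sample values are all hypothetical, and the positivity check assumes the vector holds Dirichlet concentration parameters.

```python
from typing import List, Sequence


def expand_topic_word_prior(word_prior: Sequence[float],
                            num_topics: int) -> List[List[float]]:
    """Build a numTopics x vocabSize prior matrix from one shared word prior.

    Hypothetical helper for illustration; not part of any MLlib API.
    """
    if num_topics <= 0:
        raise ValueError("num_topics must be positive")
    if len(word_prior) == 0:
        raise ValueError("word_prior needs one entry per vocabulary word")
    if any(w <= 0 for w in word_prior):
        # Assumption: entries are Dirichlet concentrations, so must be > 0.
        raise ValueError("all prior entries must be positive")
    # Every topic gets its own copy of the same length-vocabSize row.
    return [list(word_prior) for _ in range(num_topics)]


# Hypothetical example: vocabSize = 4, numTopics = 3.
word_prior = [0.1, 0.5, 0.2, 0.2]
topic_word_prior = expand_topic_word_prior(word_prior, num_topics=3)
```

The user supplies only vocabSize values, while the model still sees a full numTopics * vocabSize matrix in which every row is identical.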

We are leaving this as a TODO; users who need this feature should discuss it on this JIRA.

Issue Links

is part of

SPARK-5572 LDA improvement listing

Resolved

is related to

SPARK-8536 Generalize LDA to asymmetric doc-topic priors

Resolved

People

Assignee: Unassigned

Reporter: Feynman Liang

Votes: 1

Watchers: 5

Dates

Created: 17/Jul/15 08:23

Updated: 21/May/19 04:33

Resolved: 21/May/19 04:33