Description
For each (document, term) pair, return the most likely topic. Instances of a (doc, term) pair within a document (a.k.a. "tokens") are exchangeable, so we should provide one estimate per document-term pair rather than one per token.
Synopsis for DistributedLDAModel:
/** @return RDD of (doc ID, vector of top topic index for each term) */
def topTopicAssignments: RDD[(Long, Vector)]
Note that returning a Vector allows a sparse encoding (only terms that occur in the document need an entry) and keeps the API Java-friendly.
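To make the per-(doc, term) estimate concrete, here is a minimal sketch in plain Scala. It is not the DistributedLDAModel implementation. It assumes the top topic for term w in document d is the k that maximizes theta_d(k) * beta_k(w), the standard LDA posterior up to normalization. Plain arrays and maps stand in for RDDs and MLlib vectors, and the names TopTopicSketch, topTopicsForDoc, argmax, docTopics, topicTerms, and termCounts are hypothetical.

```scala
// Sketch only: hypothetical names, plain collections instead of RDD / Vector.
object TopTopicSketch {

  /** Index of the largest value in xs. */
  def argmax(xs: Array[Double]): Int = xs.indices.maxBy(i => xs(i))

  /**
   * @param docTopics  theta_d: topic weights for one document, length k
   * @param topicTerms beta: topicTerms(k)(w) is the weight of term w in topic k
   * @param termCounts sparse term counts for the document (term index -> count)
   * @return sparse map from term index to its top topic index
   */
  def topTopicsForDoc(
      docTopics: Array[Double],
      topicTerms: Array[Array[Double]],
      termCounts: Map[Int, Double]): Map[Int, Int] =
    termCounts.collect {
      // One estimate per distinct term: the count only decides whether the
      // term is present, since tokens of the same term are exchangeable.
      case (term, count) if count > 0 =>
        val scores = docTopics.indices
          .map(k => docTopics(k) * topicTerms(k)(term))
          .toArray
        term -> argmax(scores)
    }
}
```

For example, with docTopics = Array(0.7, 0.3), topicTerms = Array(Array(0.1, 0.6, 0.3), Array(0.6, 0.1, 0.3)) and termCounts = Map(0 -> 2.0, 2 -> 1.0), term 0 scores 0.07 against 0.18 and term 2 scores 0.21 against 0.09, so the result is Map(0 -> 1, 2 -> 0). Term 1 does not occur in the document and gets no entry, which is the sparsity the Vector return type is meant to preserve.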
Attachments
Issue Links
- is required by: SPARK-5572 LDA improvement listing (Resolved)
- links to