Apache MADlib / MADLIB-934

MADlib LDA should allow users to supply names of input columns

Details

    Description

      I attempted to create my own input table for LDA. The table had four columns: "docid", "wordid", "count", and a fourth column "word" (the raw token). The "count" column was of type bigint rather than int. The lda_train function threw an error saying that the input table did not contain docid, wordid and count columns. I did not check whether the error was caused by the data type mismatch on the "count" column or by the additional column. Can you confirm which one it is?

      If it is only a bigint vs. int issue, can we allow users to supply the names of the docid, wordid and count columns (instead of hard-coding them)?
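
      Until the cause is confirmed, one possible workaround is to expose the data through a view that has exactly the three expected column names, each cast to int. The sketch below only builds the SQL text for such a view. The helper names `quote_ident` and `build_lda_input_view`, the table and column names in the usage line, and the assumption that all three columns should be int are illustrative and not taken from MADlib. Whether lda_train accepts such a view has not been verified here.

```python
def quote_ident(name: str) -> str:
    """Double-quote a SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_lda_input_view(source_table: str, view_name: str,
                         docid_col: str, wordid_col: str, count_col: str) -> str:
    """Return a CREATE VIEW statement exposing only docid, wordid and count.

    Hypothetical helper: it assumes unqualified table names (no schema prefix)
    and that the values fit in int, since a bigint -> int cast fails on overflow.
    """
    # Map each user-chosen source column to the name the issue says is expected.
    columns = [
        (docid_col, "docid"),
        (wordid_col, "wordid"),
        (count_col, "count"),
    ]
    select_list = ",\n    ".join(
        f"{quote_ident(src)}::int AS {quote_ident(dst)}" for src, dst in columns
    )
    return (
        f"CREATE VIEW {quote_ident(view_name)} AS\n"
        f"SELECT\n    {select_list}\n"
        f"FROM {quote_ident(source_table)};"
    )


if __name__ == "__main__":
    # Example with made-up names: a table whose columns are doc, token_id, n.
    print(build_lda_input_view("my_corpus", "my_corpus_lda", "doc", "token_id", "n"))
```

      Dropping the extra "word" column and casting "count" in the same view addresses both suspected causes at once, so it does not by itself tell which one triggered the error.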

          People

            riyer Rahul Iyer
            vatsan Srivatsan Ramanujam
