Uploaded image for project: 'Mahout'
  1. Mahout
  2. MAHOUT-501

Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Trivial
    • Resolution: Fixed
    • 0.3
    • 0.4
    • None
    • None
    • Ubuntu 10.04.1 LTS

    Description

      Running 'mahout lucene.vector' results in the following stacktrace:

      frank@frankthetank:~$ mahout lucene.vector
      Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
      HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
      10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
      10/09/13 22:44:43 ERROR lucene.Driver: Exception
      org.apache.commons.cli2.OptionException: Missing required option --dir
      at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
      at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
      at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
      at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
      at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
      at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
      Usage:
      [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>
      --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>
      --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>
      --minDF <minDF>]
      Options
      --dir (-d) dir The Lucene directory
      --idField (-i) idField The field in the index containing the
      index. If null, then the Lucene internal
      doc id is used which is prone to error if
      the underlying index changes
      --output (-o) output The output file
      --delimiter (-l) delimiter The delimiter for outputing the
      dictionary
      --help (-h) Print out help
      --field (-f) field The field in the index
      --max (-m) max The maximum number of vectors to output.
      If not specified, then it will loop over
      all docs
      --dictOut (-t) dictOut The output of the dictionary
      --norm (-n) norm The norm to use, expressed as either a
      double or "INF" if you want to use the
      Infinite norm. Must be greater or equal
      to 0. The default is not to normalize
      --outputWriter (-e) outputWriter The VectorWriter to use, either seq
      (SequenceFileVectorWriter - default) or
      file (Writes to a File using JSON format)
      --maxDFPercent (-x) maxDFPercent The max percentage of docs for the DF.
      Can be used to remove really high
      frequency terms. Expressed as an integer
      between 0 and 100. Default is 99.
      --weight (-w) weight The kind of weight to use. Currently TF
      or TFIDF
      --minDF (-md) minDF The minimum document frequency. Default
      is 1
      10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms

      This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.

      Fix: change lucenevector.props filename to lucene.vector.props.

      Attachments

        1. MAHOUT-501.patch
          0.4 kB
          Frank Scholten

        Activity

          People

            drew.farris Drew Farris
            frankscholten Frank Scholten
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - 5m
                5m
                Remaining:
                Remaining Estimate - 5m
                5m
                Logged:
                Time Spent - Not Specified
                Not Specified