Mahout / MAHOUT-1327

org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles uses a new instance of the Configuration object to read the file from the Path instead of using the Configuration object passed to the method

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 0.7, 0.8
    • Fix Version/s: None
    • Component/s: Clustering
    • Labels: None

      Description

      When you call KMeansDriver.run with a Configuration object that points to HDFS:

      Configuration conf = new Configuration();
      conf.addResource(new Path("C:\\hdp-win\\hadoop\\hadoop-1.1.0-SNAPSHOT\\conf\\core-site.xml"));
      conf.addResource(new Path("C:\\hdp-win\\hadoop\\hadoop-1.1.0-SNAPSHOT\\conf\\hdfs-site.xml"));

      At some point it calls org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles, which throws the following exception (there is no problem when the conf object points to the local file system):

      java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
      at java.util.ArrayList.RangeCheck(ArrayList.java:547)
      at java.util.ArrayList.get(ArrayList.java:322)
      at org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:215)

      I think this happens because the method reads the file from the Path with a new instance of the Configuration object instead of the Configuration object passed to the method.
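
      The suspicion above can be illustrated with a small sketch. It does not use Hadoop or Mahout: Conf, STORES, readIgnoringConf, readUsingConf and the hdfs://namenode:8020 address are hypothetical stand-ins invented for this illustration, and the key name only mirrors Hadoop 1.x's fs.default.name. It shows how a reader that builds a fresh configuration would fall back to the local file system and find no clusters, even though the caller's configuration points somewhere else. Whether readFromSeqFiles really behaves this way is the reporter's hypothesis, not something this sketch confirms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConfigSketch {

    /** Hypothetical stand-in for a Hadoop-style configuration (plain key/value settings). */
    public static final class Conf {
        private final Map<String, String> props = new HashMap<>();

        public Conf set(String key, String value) {
            props.put(key, value);
            return this;
        }

        public String get(String key, String defaultValue) {
            return props.getOrDefault(key, defaultValue);
        }
    }

    /** Hypothetical storage: file system URI -> path -> cluster model names. */
    public static final Map<String, Map<String, List<String>>> STORES = new HashMap<>();

    static {
        // Only the (made-up) HDFS store holds cluster models; the local one is empty.
        STORES.put("hdfs://namenode:8020",
                Collections.singletonMap("/clusters", Arrays.asList("cluster-0", "cluster-1")));
    }

    /** Resolves the file system from the configuration, defaulting to the local one. */
    public static List<String> listModels(Conf conf, String path) {
        String fs = conf.get("fs.default.name", "file:///");
        return STORES.getOrDefault(fs, Collections.emptyMap())
                     .getOrDefault(path, Collections.emptyList());
    }

    /** Suspected pattern: the caller's configuration is ignored in favour of a fresh one. */
    public static List<String> readIgnoringConf(Conf passed, String path) {
        Conf fresh = new Conf(); // knows nothing about the caller's HDFS settings
        return listModels(fresh, path);
    }

    /** Alternative pattern: the caller's configuration is used for the read. */
    public static List<String> readUsingConf(Conf passed, String path) {
        return listModels(passed, path);
    }

    public static void main(String[] args) {
        Conf hdfsConf = new Conf().set("fs.default.name", "hdfs://namenode:8020");
        System.out.println("ignoring conf: " + readIgnoringConf(hdfsConf, "/clusters").size());
        System.out.println("using conf:    " + readUsingConf(hdfsConf, "/clusters").size());
    }
}
```

      In this model the first read returns an empty list, so a later get(0) on the result would fail with an IndexOutOfBoundsException, which is consistent with the stack trace above but does not prove the cause.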

            People

            • Assignee: Suneel Marthi (smarthi)
            • Reporter: alan krumholz (alankrum)