MAHOUT-651

Pass hadoop configuration to methods that use FileSystem operations, even if they don't invoke map/reduce jobs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4
    • Fix Version/s: 0.5
    • Component/s: Clustering
    • Labels:
      None

      Description

      Some classes in the Clustering component use Hadoop's FileSystem class internally but instantiate the Hadoop configuration locally, inside the method, with new Configuration(). This limits the ability to integrate these tools into applications that manage and enrich their own configuration rather than relying on the default Hadoop resources that are loaded by new Configuration().
      The fix is simply to make these methods take a Configuration parameter rather than creating a new instance when needed. An example of a method that creates a new Configuration instance is: org.apache.mahout.clustering.kmeans.KMeansUtil.configureWithClusterInfo(Path, List<Cluster>)
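
The change described above might look roughly like the following sketch. It is not taken from the attached patch. So that it compiles without Hadoop on the classpath, StubConfiguration is a hypothetical stand-in for org.apache.hadoop.conf.Configuration, and the method names, the namenode address and the use of the fs.default.name key are assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigurationPassingSketch {

  /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration (not the real class). */
  static final class StubConfiguration {
    private final Map<String, String> values = new HashMap<>();

    StubConfiguration() {
      // Plays the role of the default resources loaded by new Configuration().
      values.put("fs.default.name", "file:///");
    }

    void set(String key, String value) {
      values.put(key, value);
    }

    String get(String key) {
      return values.get(key);
    }
  }

  /** Before: the method builds its own configuration, so caller settings never reach it. */
  static String resolveFileSystemBefore() {
    StubConfiguration conf = new StubConfiguration();
    return conf.get("fs.default.name");
  }

  /** After: the caller supplies the configuration, so its settings are honoured. */
  static String resolveFileSystemAfter(StubConfiguration conf) {
    return conf.get("fs.default.name");
  }

  public static void main(String[] args) {
    // An application that manages its own configuration.
    StubConfiguration appConf = new StubConfiguration();
    appConf.set("fs.default.name", "hdfs://namenode.example:8020");

    System.out.println(resolveFileSystemBefore());
    System.out.println(resolveFileSystemAfter(appConf));
  }
}
```

With the real Hadoop classes, the "after" form would presumably pass the caller's Configuration on to the FileSystem lookup instead of reading a key directly.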

      This problem may also exist beyond the Clustering module, but this issue addresses only the Clustering code.

        Attachments

        1. patch-mahout-651.txt
          13 kB
          Robert Mahfoud

            People

            • Assignee:
              srowen Sean R. Owen
              Reporter:
              rmahfoud Robert Mahfoud
