CHUKWA-488: Hadoop cannot find custom Demux class

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.5.0
    • Component/s: MR Data Processors
    • Labels:
      None
    • Environment:

      Linux x86-64
      Java 1.6.0_20

      Description

      During Hadoop's map phase I get ClassNotFoundException errors: Hadoop cannot find my class org.apache.hadoop.chukwa.extraction.demux.processor.mapper.XmlBasedDemux, which I've packaged in a JAR named data-collection-demux-0.1.jar.

      The cause appears to be the values of these two properties in the Hadoop job configuration:

      <property>
          <name>mapred.job.classpath.files</name>
          <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
      </property>
      <property>
          <name>mapred.cache.files</name>
          <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
      </property>
      

      The root cause seems to be that the call to DistributedCache.addFileToClassPath is passed a Path in URI form (hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar), whereas the DistributedCache API expects a filesystem-based path (/chukwa/demux/data-collection-demux-0.1.jar). I'm not sure why, but the FileStatus objects returned by FileSystem.listStatus carry a URI-based path instead of a filesystem-based one.
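
To make the difference between the two forms concrete, here is a minimal sketch using only java.net.URI from the JDK. The class and method names (QualifiedPathDemo, pathOnly) are hypothetical and are not part of Chukwa or Hadoop. In Hadoop itself the comparable step would presumably be Path.toUri().getPath(), but that is an assumption about how one might apply it, not what Demux.java does.

```java
import java.net.URI;

// Hypothetical helper, for illustration only; not part of Chukwa or Hadoop.
class QualifiedPathDemo {

    // Returns only the path component of a location, dropping any
    // scheme and authority (e.g. "hdfs://localhost:9000").
    static String pathOnly(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        String qualified =
            "hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar";
        // URI form, as it appears in the job configuration above.
        System.out.println(qualified);
        // Filesystem-based form, with scheme and authority removed.
        System.out.println(pathOnly(qualified));
    }
}
```

A location that is already in filesystem form, such as /chukwa/demux/data-collection-demux-0.1.jar, passes through pathOnly unchanged.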

      I kludged the Demux class's addParsers method to strip the "hdfs://localhost:9000" portion of the string, and now my class is found. I will try to provide a patch today that reads the value of Hadoop's fs.default.name and strips it from the path used in Demux.java.
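
The approach described above (read fs.default.name, then strip it from the JAR location) could look roughly like the following. This is a sketch under stated assumptions, not the attached patch: the class and method names (DefaultFsPrefixStripper, stripDefaultFs) are hypothetical, and the default filesystem name is passed in as a plain string, whereas in a real job it would presumably be obtained from the Hadoop Configuration, e.g. conf.get("fs.default.name").

```java
// Hypothetical helper, for illustration only; not the contents of Demux.diff.
class DefaultFsPrefixStripper {

    // Removes the default filesystem prefix (e.g. "hdfs://localhost:9000")
    // from a location, leaving a filesystem-based path. Locations that do
    // not start with the prefix are returned unchanged.
    static String stripDefaultFs(String location, String defaultFs) {
        // Tolerate a trailing slash on the configured filesystem name.
        String prefix = defaultFs.endsWith("/")
            ? defaultFs.substring(0, defaultFs.length() - 1)
            : defaultFs;
        if (!prefix.isEmpty() && location.startsWith(prefix + "/")) {
            return location.substring(prefix.length());
        }
        return location;
    }

    public static void main(String[] args) {
        // Assumed value of fs.default.name, taken from the URLs in this report.
        String defaultFs = "hdfs://localhost:9000";
        String jar = "hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar";
        System.out.println(stripDefaultFs(jar, defaultFs));
    }
}
```

Matching on the prefix followed by "/" avoids stripping a lookalike authority such as hdfs://localhost:90001.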

      1. Demux.diff (0.9 kB, Kirk True)

        Activity

        kirktrue Kirk True added a comment -

        Not sure if this is the correct way to fix this, but it works on my development environment with default settings in Chukwa and Hadoop.

        eyang Eric Yang added a comment -

        +1 Looks good, and works on my test environment.

        eyang Eric Yang added a comment -

        I just committed this. Thanks Kirk

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #373 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/373/)
        CHUKWA-488. Filter user customized jar file path URL. (Kirk True via Eric Yang)


          People

          • Assignee: Unassigned
          • Reporter: Kirk True (kirktrue)
