Hadoop Common / HADOOP-3706

CompositeInputFormat: Unable to wrap custom InputFormats

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.17.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      I am unable to use a custom InputFormat with CompositeInputFormat because the classloader used by Parser cannot find my class.

      To reproduce (I also have an example program, if that's preferred):
      1) Create a custom InputFormat (I made a copy of SequenceFileInputFormat and named it MyInputFormat)
      2) Create a program using CompositeInputFormat [Set "mapred.join.expr" to CompositeInputFormat.compose("outer", MyInputFormat.class, plist)]
      3) Create jar file
      4) Run job (must be via the jar - the problem cannot be reproduced in Local mode)

      Doing so causes the following exception:

      Caused by: java.io.IOException
      	at org.apache.hadoop.mapred.join.Parser$WNode.parse(Parser.java:274)
      	at org.apache.hadoop.mapred.join.Parser.reduce(Parser.java:463)
      	at org.apache.hadoop.mapred.join.Parser.parse(Parser.java:481)
      	at org.apache.hadoop.mapred.join.CompositeInputFormat.setFormat(CompositeInputFormat.java:77)
      	at org.apache.hadoop.mapred.join.CompositeInputFormat.validateInput(CompositeInputFormat.java:118)
      
      Caused by: java.lang.ClassNotFoundException: my.custom.input.format.MyInputFormat
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
      	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
      	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
      	at java.lang.Class.forName0(Native Method)
      	at java.lang.Class.forName(Class.java:164)
      	at org.apache.hadoop.mapred.join.Parser$WNode.parse(Parser.java:270)
      

      Should the line on Parser.java:271 be something like:

      jobConf.getClassByName(sb.toString());
      

      instead of:

      Class.forName(sb.toString()).asSubclass(InputFormat.class)
      

      to ensure the correct classloader is used?
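
      To illustrate why the choice of classloader matters, here is a minimal, self-contained sketch that does not use Hadoop. Format, MyFormat, and ClassLoaderLookup are hypothetical stand-ins invented for this example; the "restricted" loader plays the role of a loader that cannot see the job jar, and the "owning" loader plays the role of the loader a job configuration would supply. It only demonstrates the general Java behaviour, not the actual Parser code or the attached patch.

```java
// Illustrative stand-ins only: Format and MyFormat are hypothetical and are
// not Hadoop classes. No Hadoop dependency is needed to run this.
final class ClassLoaderLookup {

    /** Hypothetical stand-in for an InputFormat-like interface. */
    public interface Format {}

    /** Hypothetical stand-in for a user-defined format shipped in a job jar. */
    public static class MyFormat implements Format {}

    /** Resolve a class by name through an explicitly chosen classloader. */
    public static Class<? extends Format> resolve(String name, ClassLoader loader)
            throws ClassNotFoundException {
        // The three-argument forName looks the class up via 'loader' instead of
        // the loader that defined the calling class.
        return Class.forName(name, true, loader).asSubclass(Format.class);
    }

    /** Returns true if 'loader' can find the named class, false otherwise. */
    public static boolean canResolve(String name, ClassLoader loader) {
        try {
            resolve(name, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = MyFormat.class.getName();

        // A loader with no parent delegates only to the bootstrap loader, so it
        // cannot see application classes (analogous to not seeing the job jar).
        ClassLoader restricted = new ClassLoader((ClassLoader) null) {};

        // The loader that actually defined MyFormat can always find it.
        ClassLoader owning = ClassLoaderLookup.class.getClassLoader();

        System.out.println("restricted loader finds class: " + canResolve(name, restricted));
        System.out.println("owning loader finds class: " + canResolve(name, owning));
    }
}
```

      The same class name resolves or fails depending only on which loader performs the lookup, which is consistent with the ClassNotFoundException above when the lookup does not go through the job's classloader.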

        Attachments

        1. HADOOP-3706-0.patch
          0.8 kB
          Jingkei Ly

            People

            • Assignee:
              jly Jingkei Ly
              Reporter:
              jly Jingkei Ly