Hadoop Common / HADOOP-3299

org.apache.hadoop.mapred.join.CompositeInputFormat does not initialize text input formats with the configuration, resulting in a NullPointerException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.3
    • Fix Version/s: 0.18.0
    • Component/s: io
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Changed the TextInputFormat and KeyValueTextInputFormat classes to initialize the compressionCodecs member variable before dereferencing it.
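
The guard described in the release note can be sketched roughly as follows. This is a minimal, self-contained illustration, not the committed Hadoop code: `CodecRegistry`, `SketchTextInputFormat`, and the `Map`-based configuration are hypothetical stand-ins for the codec lookup, the input format, and the job configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a codec lookup table; not a Hadoop class.
class CodecRegistry {
    private final Map<String, String> codecsBySuffix = new HashMap<>();

    CodecRegistry(Map<String, String> conf) {
        // Assumption for this sketch: the configuration maps file suffixes to codec names.
        codecsBySuffix.putAll(conf);
    }

    // Returns the codec name for the file's suffix, or null if none is registered.
    String getCodec(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? null : codecsBySuffix.get(fileName.substring(dot));
    }
}

// Hypothetical stand-in for an input format with a lazily initialized codec field.
class SketchTextInputFormat {
    private CodecRegistry compressionCodecs; // stays null if configure() is never called

    void configure(Map<String, String> conf) {
        compressionCodecs = new CodecRegistry(conf);
    }

    boolean isSplitable(String fileName, Map<String, String> conf) {
        // Guard: initialize the member before dereferencing it.
        if (compressionCodecs == null) {
            compressionCodecs = new CodecRegistry(conf);
        }
        // In this sketch, a file with a registered codec is treated as not splitable.
        return compressionCodecs.getCodec(fileName) == null;
    }
}

class LazyInitDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(".gz", "gzip");
        // configure() is deliberately skipped; the guard prevents the NullPointerException.
        SketchTextInputFormat format = new SketchTextInputFormat();
        System.out.println(format.isSplitable("part-00000", conf));
        System.out.println(format.isSplitable("part-00000.gz", conf));
    }
}
```

The guard keeps `isSplitable` safe whether or not the caller configured the instance first, at the cost of one null check per call.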

      Description

      The input formats are not initialized with the Configuration object before the isSplitable method is called.

      bin/hadoop jar hadoop-0.16.3-examples.jar join -r 1 -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat -outFormat org.apache.hadoop.mapred.TextOutputFormat -joinOp outer datajoin/input datajoin/output -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.mapred.join.TupleWritable
      08/04/22 15:05:33 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
      Job started: Tue Apr 22 15:05:33 GMT-08:00 2008
      08/04/22 15:05:33 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
      08/04/22 15:05:33 INFO mapred.FileInputFormat: Total input paths to process : 2
      java.lang.NullPointerException
      at org.apache.hadoop.mapred.KeyValueTextInputFormat.isSplitable(KeyValueTextInputFormat.java:44)
      at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:185)
      at org.apache.hadoop.mapred.join.Parser$WNode.getSplits(Parser.java:304)
      at org.apache.hadoop.mapred.join.Parser$CNode.getSplits(Parser.java:374)
      at org.apache.hadoop.mapred.join.CompositeInputFormat.getSplits(CompositeInputFormat.java:129)
      at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:542)
      at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:803)
      at org.apache.hadoop.examples.Join.run(Join.java:149)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      at org.apache.hadoop.examples.Join.main(Join.java:158)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
      at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
      at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:52)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:155)

      1. text.diff
        1 kB
        Jason
      2. keyvaluetext.diff
        1 kB
        Jason
      3. 3299-0.patch
        7 kB
        Chris Douglas

        Activity

        Jason added a comment -

        Ensure that compressionCodecs is initialized in isSplitable before dereferencing it.

        Jason added a comment -

        Ensure that compressionCodecs is initialized in isSplitable before dereferencing it.

        Chris Douglas added a comment -

        The issue is with o.a.h.mapred.join.Parser. When it instantiates an InputFormat through ReflectionUtils::newInstance, it passes in a null JobConf, so the InputFormat (if Configurable or JobConfigurable) remains uninitialized.
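
To illustrate the failure mode, here is a simplified, self-contained sketch. `SketchConfigurable`, `SketchFormat`, and `SketchReflection.newInstance` are hypothetical stand-ins, not Hadoop's interfaces or its `ReflectionUtils`; the sketch assumes only that a reflective factory calls the configure hook when it is handed a non-null configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "configurable" contract for this sketch.
interface SketchConfigurable {
    void configure(Map<String, String> conf);
}

// Hypothetical input format whose field is set only in configure().
class SketchFormat implements SketchConfigurable {
    private Map<String, String> codecs; // null until configure() runs

    public SketchFormat() {}

    @Override
    public void configure(Map<String, String> conf) {
        codecs = conf;
    }

    boolean isSplitable(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String suffix = dot < 0 ? "" : fileName.substring(dot);
        // Dereferences codecs without a guard, mirroring the reported failure.
        return !codecs.containsKey(suffix);
    }
}

class SketchReflection {
    // Instantiates cls reflectively; configures it only when conf is non-null.
    static <T> T newInstance(Class<T> cls, Map<String, String> conf) {
        try {
            T obj = cls.getDeclaredConstructor().newInstance();
            if (conf != null && obj instanceof SketchConfigurable) {
                ((SketchConfigurable) obj).configure(conf);
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

class NullConfDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(".gz", "gzip");

        // Configured instance: the codec map is present.
        SketchFormat configured = SketchReflection.newInstance(SketchFormat.class, conf);
        System.out.println(configured.isSplitable("input.gz"));

        // Null configuration: configure() is skipped and the field stays null.
        SketchFormat unconfigured = SketchReflection.newInstance(SketchFormat.class, null);
        try {
            unconfigured.isSplitable("input.gz");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from unconfigured instance");
        }
    }
}
```

Passing the real configuration at instantiation time addresses the cause at its source, whereas a null check inside `isSplitable` only protects that one method.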

        The test case includes a separate class, ConfigurableInputFormat, because the Parser doesn't support inner classes (i.e., '$' is unexpected in its grammar).
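
For context, the JVM binary name of a nested class joins the outer and nested class names with `$`, which is the character the comment says the Parser's grammar rejects. The sketch below uses only standard `java.lang.Class` methods; `Outer`, `Inner`, and `BinaryNameDemo` are hypothetical names chosen for illustration.

```java
// Hypothetical outer class with a static nested class.
class Outer {
    static class Inner {}
}

class BinaryNameDemo {
    // Returns the binary name that would have to appear in a class-name string.
    static String binaryName(Class<?> cls) {
        return cls.getName();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        String name = binaryName(Outer.Inner.class);
        // The nested class's binary name contains a '$' separator.
        System.out.println(name + " contains '$': " + name.contains("$"));

        // Class.forName expects this binary name, not the dotted source form.
        Class<?> loaded = Class.forName(name);
        System.out.println(loaded == Outer.Inner.class);
    }
}
```

A top-level class has no `$` in its name, which is consistent with the test using a separate class instead of a nested one.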

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12380974/3299-0.patch
        against trunk revision 645773.

        @author +1. The patch does not contain any @author tags.

        tests included +1. The patch appears to include 6 new or modified tests.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit +1. The applied patch does not generate any new release audit warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2334/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2334/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2334/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2334/console

        This message is automatically generated.

        Owen O'Malley added a comment -

        I just committed this. Thanks, Chris!

        Hudson added a comment -

        Integrated in Hadoop-trunk #484 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/484/ )

          People

          • Assignee: Chris Douglas
          • Reporter: Jason
          • Votes: 0
          • Watchers: 0
