Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2.0
    • Fix Version/s: 5.0.0
    • Component/s: core
    • Labels:
      None

      Description

      If we define argument for the Spark action like this

              <spark xmlns="uri:oozie:spark-action:0.2">
                  <job-tracker>${jobTracker}</job-tracker>
                  <name-node>${nameNode}</name-node>
                  <configuration>
                      <property>
                          <name>oozie.use.system.libpath</name>
                          <value>true</value>
                      </property>
                      <property>
                          <name></name>
                          <value></value>
                      </property>
                  </configuration>
                  <master>yarn-cluster</master>
                  <mode>cluster</mode>
                  <name>Something</name>
                  <class>package.of.my.Class</class>
                  <jar>${nameNode}/myjar.jar</jar>
                  <arg></arg>    <---  cause of the NPE
              </spark>
      

      then we get a NullPointerException in LauncherMapper:

      Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], exception invoking main(), null
      java.lang.NullPointerException
      	at org.apache.oozie.action.hadoop.LauncherMapper.printArgs(LauncherMapper.java:627)
      	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:212)
      	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
      ...
      
      1. OOZIE-2748-001.patch
        3 kB
        Peter Bacsko
      2. OOZIE-2748-002.patch
        3 kB
        Peter Bacsko
      3. OOZIE-2748-003.patch
        4 kB
        Peter Bacsko
      4. OOZIE-2748-004.patch
        7 kB
        Peter Bacsko
      5. OOZIE-2748-005.patch
        7 kB
        Peter Bacsko
      6. OOZIE-2748-006.patch
        7 kB
        Peter Bacsko

        Issue Links

          Activity

          Hide
          pbacsko Peter Bacsko added a comment - - edited

          I suspect the problem is that we have the args[] array with a single "null" element if we use <args></args>.

          Show
          pbacsko Peter Bacsko added a comment - - edited I suspect the problem is that we have the args[] array with a single "null" element if we use <args></args> .
          Hide
          pbacsko Peter Bacsko added a comment -

          Other actions are affected too, I added a testcase which covers this scenario for Java action.

          Show
          pbacsko Peter Bacsko added a comment - Other actions are affected too, I added a testcase which covers this scenario for Java action.
          Hide
          jackson Christopher Jackson added a comment - - edited

          Just ran into this myself with an EL function that was returning a null. Cheers for the patch and unit tests.

          Show
          jackson Christopher Jackson added a comment - - edited Just ran into this myself with an EL function that was returning a null. Cheers for the patch and unit tests.
          Hide
          abhishekbafna Abhishek Bafna added a comment -
                      if (arg == null) {
                          arg = "";   // handle <args></args> case
                      }
          

          Would it be better to set it as "null" instead of "" empty string. It will also inform the user about a argument/parameter being null. Thanks.

          Show
          abhishekbafna Abhishek Bafna added a comment - if (arg == null ) { arg = ""; // handle <args></args> case } Would it be better to set it as "null" instead of "" empty string. It will also inform the user about a argument/parameter being null. Thanks.
          Hide
          pbacsko Peter Bacsko added a comment -

          Makes sense, using "null".

          Show
          pbacsko Peter Bacsko added a comment - Makes sense, using "null".
          Hide
          andras.piros Andras Piros added a comment -

          +1 (non-binding)

          Show
          andras.piros Andras Piros added a comment - +1 (non-binding)
          Hide
          rkanter Robert Kanter added a comment -

          If we use "null", might it not cause problems for some programs? Imagine passing "null" as an argument to something. I think users wouldn't be expecting to see "null" anywhere.

          I think the right approach here would be to simply drop the arguments if they're empty. That would also allow users to do something like this:

          ...
          <arg>${foo}</arg>
          ...
          

          And from job.properties, they can set foo to a value to pass it along or to an empty string to not pass an argument.
          This behavior is also more consistent with Hadoop Configuration in that empty values are dropped.

          If we want to alert the user to the fact that they have empty <arg>, we could just print out a warning message.

          Thoughts?

          Show
          rkanter Robert Kanter added a comment - If we use "null", might it not cause problems for some programs? Imagine passing "null" as an argument to something. I think users wouldn't be expecting to see "null" anywhere. I think the right approach here would be to simply drop the arguments if they're empty. That would also allow users to do something like this: ... <arg> ${foo} </arg> ... And from job.properties, they can set foo to a value to pass it along or to an empty string to not pass an argument. This behavior is also more consistent with Hadoop Configuration in that empty values are dropped. If we want to alert the user to the fact that they have empty <arg> , we could just print out a warning message. Thoughts?
          Hide
          abhishekbafna Abhishek Bafna added a comment -

          If we drop the null arguments, then number and position of argument could differ than what other application might be expecting. This can cause some unexpected results for user/application.

          May be we can put the null check into Oozie codebase and pass the null to the application, Now it will be applications responsibility to handle the incoming null values, along with warning the user about received null in the arguments. Thanks.

          Show
          abhishekbafna Abhishek Bafna added a comment - If we drop the null arguments, then number and position of argument could differ than what other application might be expecting. This can cause some unexpected results for user/application. May be we can put the null check into Oozie codebase and pass the null to the application, Now it will be applications responsibility to handle the incoming null values, along with warning the user about received null in the arguments. Thanks.
          Hide
          rkanter Robert Kanter added a comment -

          If we drop the null arguments, then number and position of argument could differ than what other application might be expecting.

          If you think about it like a command line thing, then passing something like

          <arg>AAA</arg>
          <arg></arg>
          <arg>CCC</arg>
          

          would be equivalent to

          spark-submit AAA CCC
          

          right? The empty argument wouldn't even be there. Even if you put an extra space, programs typically split on whitespace, so it would be the same. If we put in "null", then we're doing this:

          spark-submit AAA null CCC
          

          Now it will be applications responsibility to handle the incoming null values

          I'm pretty sure no programs will be able to properly handle this. And users will blame Oozie because they didn't put in "null".

          Show
          rkanter Robert Kanter added a comment - If we drop the null arguments, then number and position of argument could differ than what other application might be expecting. If you think about it like a command line thing, then passing something like <arg> AAA </arg> <arg> </arg> <arg> CCC </arg> would be equivalent to spark-submit AAA CCC right? The empty argument wouldn't even be there. Even if you put an extra space, programs typically split on whitespace, so it would be the same. If we put in "null", then we're doing this: spark-submit AAA null CCC Now it will be applications responsibility to handle the incoming null values I'm pretty sure no programs will be able to properly handle this. And users will blame Oozie because they didn't put in "null".
          Hide
          abhishekbafna Abhishek Bafna added a comment -

          Assuming end-application does not have number of arguments and null checks in place.

          AAA CCC Vs AAA null CCC, application will use the argument CCC instead of null, for performing the operation and some operation could be hard to recover back (e.g. delete a dir/file). Though user application will any way fail with ArrayIndexOutOfBounds given one argument is missing. [This is a specific type of use case.]

          In either case users will blame Oozie. It is just that, passing a null seems to be safer than removing the argument. Thanks.

          Show
          abhishekbafna Abhishek Bafna added a comment - Assuming end-application does not have number of arguments and null checks in place. AAA CCC Vs AAA null CCC , application will use the argument CCC instead of null , for performing the operation and some operation could be hard to recover back (e.g. delete a dir/file). Though user application will any way fail with ArrayIndexOutOfBounds given one argument is missing. [This is a specific type of use case.] In either case users will blame Oozie. It is just that, passing a null seems to be safer than removing the argument. Thanks.
          Hide
          pbacsko Peter Bacsko added a comment - - edited

          Honestly I misunderstood this thing a little bit - I thought there are only a single <arg> element in the XML, not a list of <arg>. So I was a bit confused

          Anyway if that's the case, I do agree with the decision that we should drop nulls. I think programs usually scan through the args[] and 99% of them are unprepared to handle nulls. I will update the patch soon.

          Show
          pbacsko Peter Bacsko added a comment - - edited Honestly I misunderstood this thing a little bit - I thought there are only a single <arg> element in the XML, not a list of <arg> . So I was a bit confused Anyway if that's the case, I do agree with the decision that we should drop nulls. I think programs usually scan through the args[] and 99% of them are unprepared to handle nulls. I will update the patch soon.
          Hide
          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2748

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          +1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . +1 the patch does not introduce any line longer than 132
          . +1 the patch does adds/modifies 2 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 There are no new bugs found in total.
          . +1 There are no new bugs found in [core].
          . +1 There are no new bugs found in [docs].
          . +1 There are no new bugs found in [tools].
          . +1 There are no new bugs found in [server].
          . +1 There are no new bugs found in [examples].
          . +1 There are no new bugs found in [sharelib/oozie].
          . +1 There are no new bugs found in [sharelib/streaming].
          . +1 There are no new bugs found in [sharelib/spark].
          . +1 There are no new bugs found in [sharelib/pig].
          . +1 There are no new bugs found in [sharelib/hcatalog].
          . +1 There are no new bugs found in [sharelib/sqoop].
          . +1 There are no new bugs found in [sharelib/hive2].
          . +1 There are no new bugs found in [sharelib/hive].
          . +1 There are no new bugs found in [sharelib/distcp].
          . +1 There are no new bugs found in [hadooplibs/hadoop-utils-2].
          . +1 There are no new bugs found in [client].
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS
          . Tests run: 1832
          . Tests failed: 1
          . Tests errors: 0

          . The patch failed the following testcases:

          . testMemoryUsageAndSpeed(org.apache.oozie.service.TestPartitionDependencyManagerEhcache)

          . Tests failing with errors:
          .

          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/3487/

          Show
          hadoopqa Hadoop QA added a comment - Testing JIRA OOZIE-2748 Cleaning local git workspace ---------------------------- +1 PATCH_APPLIES +1 CLEAN +1 RAW_PATCH_ANALYSIS . +1 the patch does not introduce any @author tags . +1 the patch does not introduce any tabs . +1 the patch does not introduce any trailing spaces . +1 the patch does not introduce any line longer than 132 . +1 the patch does adds/modifies 2 testcase(s) +1 RAT . +1 the patch does not seem to introduce new RAT warnings +1 JAVADOC . +1 the patch does not seem to introduce new Javadoc warnings +1 COMPILE . +1 HEAD compiles . +1 patch compiles . +1 the patch does not seem to introduce new javac warnings +1 There are no new bugs found in total. . +1 There are no new bugs found in [core] . . +1 There are no new bugs found in [docs] . . +1 There are no new bugs found in [tools] . . +1 There are no new bugs found in [server] . . +1 There are no new bugs found in [examples] . . +1 There are no new bugs found in [sharelib/oozie] . . +1 There are no new bugs found in [sharelib/streaming] . . +1 There are no new bugs found in [sharelib/spark] . . +1 There are no new bugs found in [sharelib/pig] . . +1 There are no new bugs found in [sharelib/hcatalog] . . +1 There are no new bugs found in [sharelib/sqoop] . . +1 There are no new bugs found in [sharelib/hive2] . . +1 There are no new bugs found in [sharelib/hive] . . +1 There are no new bugs found in [sharelib/distcp] . . +1 There are no new bugs found in [hadooplibs/hadoop-utils-2] . . +1 There are no new bugs found in [client] . +1 BACKWARDS_COMPATIBILITY . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations . +1 the patch does not modify JPA files -1 TESTS . Tests run: 1832 . Tests failed: 1 . Tests errors: 0 . The patch failed the following testcases: . testMemoryUsageAndSpeed(org.apache.oozie.service.TestPartitionDependencyManagerEhcache) . Tests failing with errors: . +1 DISTRO . +1 distro tarball builds with the patch ---------------------------- -1 Overall result, please check the reported -1(s) The full output of the test-patch run is available at . https://builds.apache.org/job/oozie-trunk-precommit-build/3487/
          Hide
          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2748

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          -1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . -1 the patch contains 1 line(s) with trailing spaces
          . -1 the patch contains 1 line(s) longer than 132 characters
          . +1 the patch does adds/modifies 2 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 There are no new bugs found in total.
          . +1 There are no new bugs found in [server].
          . +1 There are no new bugs found in [client].
          . +1 There are no new bugs found in [core].
          . +1 There are no new bugs found in [docs].
          . +1 There are no new bugs found in [hadooplibs/hadoop-utils-2].
          . +1 There are no new bugs found in [tools].
          . +1 There are no new bugs found in [examples].
          . +1 There are no new bugs found in [sharelib/streaming].
          . +1 There are no new bugs found in [sharelib/sqoop].
          . +1 There are no new bugs found in [sharelib/distcp].
          . +1 There are no new bugs found in [sharelib/oozie].
          . +1 There are no new bugs found in [sharelib/hcatalog].
          . +1 There are no new bugs found in [sharelib/hive].
          . +1 There are no new bugs found in [sharelib/hive2].
          . +1 There are no new bugs found in [sharelib/pig].
          . +1 There are no new bugs found in [sharelib/spark].
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          +1 TESTS
          . Tests run: 1836
          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/3495/

          Show
          hadoopqa Hadoop QA added a comment - Testing JIRA OOZIE-2748 Cleaning local git workspace ---------------------------- +1 PATCH_APPLIES +1 CLEAN -1 RAW_PATCH_ANALYSIS . +1 the patch does not introduce any @author tags . +1 the patch does not introduce any tabs . -1 the patch contains 1 line(s) with trailing spaces . -1 the patch contains 1 line(s) longer than 132 characters . +1 the patch does adds/modifies 2 testcase(s) +1 RAT . +1 the patch does not seem to introduce new RAT warnings +1 JAVADOC . +1 the patch does not seem to introduce new Javadoc warnings +1 COMPILE . +1 HEAD compiles . +1 patch compiles . +1 the patch does not seem to introduce new javac warnings +1 There are no new bugs found in total. . +1 There are no new bugs found in [server] . . +1 There are no new bugs found in [client] . . +1 There are no new bugs found in [core] . . +1 There are no new bugs found in [docs] . . +1 There are no new bugs found in [hadooplibs/hadoop-utils-2] . . +1 There are no new bugs found in [tools] . . +1 There are no new bugs found in [examples] . . +1 There are no new bugs found in [sharelib/streaming] . . +1 There are no new bugs found in [sharelib/sqoop] . . +1 There are no new bugs found in [sharelib/distcp] . . +1 There are no new bugs found in [sharelib/oozie] . . +1 There are no new bugs found in [sharelib/hcatalog] . . +1 There are no new bugs found in [sharelib/hive] . . +1 There are no new bugs found in [sharelib/hive2] . . +1 There are no new bugs found in [sharelib/pig] . . +1 There are no new bugs found in [sharelib/spark] . +1 BACKWARDS_COMPATIBILITY . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations . +1 the patch does not modify JPA files +1 TESTS . Tests run: 1836 +1 DISTRO . +1 distro tarball builds with the patch ---------------------------- -1 Overall result, please check the reported -1(s) The full output of the test-patch run is available at . https://builds.apache.org/job/oozie-trunk-precommit-build/3495/
          Hide
          andras.piros Andras Piros added a comment -

          +1 (non-binding)

          Show
          andras.piros Andras Piros added a comment - +1 (non-binding)
          Hide
          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2748

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          +1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . +1 the patch does not introduce any line longer than 132
          . +1 the patch does adds/modifies 2 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 There are no new bugs found in total.
          . +1 There are no new bugs found in [docs].
          . +1 There are no new bugs found in [server].
          . +1 There are no new bugs found in [client].
          . +1 There are no new bugs found in [tools].
          . +1 There are no new bugs found in [core].
          . +1 There are no new bugs found in [examples].
          . +1 There are no new bugs found in [sharelib/streaming].
          . +1 There are no new bugs found in [sharelib/hive2].
          . +1 There are no new bugs found in [sharelib/distcp].
          . +1 There are no new bugs found in [sharelib/hive].
          . +1 There are no new bugs found in [sharelib/pig].
          . +1 There are no new bugs found in [sharelib/oozie].
          . +1 There are no new bugs found in [sharelib/sqoop].
          . +1 There are no new bugs found in [sharelib/spark].
          . +1 There are no new bugs found in [sharelib/hcatalog].
          . +1 There are no new bugs found in [hadooplibs/hadoop-utils-2].
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS
          . Tests run: 1836
          . Tests failed: 1
          . Tests errors: 0

          . The patch failed the following testcases:

          . testMemoryUsageAndSpeed(org.apache.oozie.service.TestPartitionDependencyManagerEhcache)

          . Tests failing with errors:
          .

          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/3499/

          Show
          hadoopqa Hadoop QA added a comment - Testing JIRA OOZIE-2748 Cleaning local git workspace ---------------------------- +1 PATCH_APPLIES +1 CLEAN +1 RAW_PATCH_ANALYSIS . +1 the patch does not introduce any @author tags . +1 the patch does not introduce any tabs . +1 the patch does not introduce any trailing spaces . +1 the patch does not introduce any line longer than 132 . +1 the patch does adds/modifies 2 testcase(s) +1 RAT . +1 the patch does not seem to introduce new RAT warnings +1 JAVADOC . +1 the patch does not seem to introduce new Javadoc warnings +1 COMPILE . +1 HEAD compiles . +1 patch compiles . +1 the patch does not seem to introduce new javac warnings +1 There are no new bugs found in total. . +1 There are no new bugs found in [docs] . . +1 There are no new bugs found in [server] . . +1 There are no new bugs found in [client] . . +1 There are no new bugs found in [tools] . . +1 There are no new bugs found in [core] . . +1 There are no new bugs found in [examples] . . +1 There are no new bugs found in [sharelib/streaming] . . +1 There are no new bugs found in [sharelib/hive2] . . +1 There are no new bugs found in [sharelib/distcp] . . +1 There are no new bugs found in [sharelib/hive] . . +1 There are no new bugs found in [sharelib/pig] . . +1 There are no new bugs found in [sharelib/oozie] . . +1 There are no new bugs found in [sharelib/sqoop] . . +1 There are no new bugs found in [sharelib/spark] . . +1 There are no new bugs found in [sharelib/hcatalog] . . +1 There are no new bugs found in [hadooplibs/hadoop-utils-2] . +1 BACKWARDS_COMPATIBILITY . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations . +1 the patch does not modify JPA files -1 TESTS . Tests run: 1836 . Tests failed: 1 . Tests errors: 0 . The patch failed the following testcases: . testMemoryUsageAndSpeed(org.apache.oozie.service.TestPartitionDependencyManagerEhcache) . Tests failing with errors: . +1 DISTRO . +1 distro tarball builds with the patch ---------------------------- -1 Overall result, please check the reported -1(s) The full output of the test-patch run is available at . https://builds.apache.org/job/oozie-trunk-precommit-build/3499/
          Hide
          pbacsko Peter Bacsko added a comment -

          Test failure is unrelated

          Show
          pbacsko Peter Bacsko added a comment - Test failure is unrelated
          Hide
          rkanter Robert Kanter added a comment -

          +1

          Show
          rkanter Robert Kanter added a comment - +1
          Hide
          rkanter Robert Kanter added a comment -

          Thanks Peter Bacsko. Committed to master!

          Show
          rkanter Robert Kanter added a comment - Thanks Peter Bacsko . Committed to master!

            People

            • Assignee:
              pbacsko Peter Bacsko
              Reporter:
              pbacsko Peter Bacsko
            • Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development