Hadoop Map/Reduce
MAPREDUCE-4076

Stream job fails with ZipException when using the yarn jar command

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha, 3.0.0
    • Fix Version/s: 2.0.0-alpha
    • Component/s: mrv2
    • Labels:
      None

      Description

      Stream job fails with ZipException when run with the yarn jar command, but executes successfully with the hadoop jar command.

      linux-f330:/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin # ./yarn jar ../share/hadoop/tools/lib/hadoop-streaming-0.24.0-SNAPSHOT.jar -input /hadoop -output /test/output/1 -mapper cat -reducer wc
      packageJobJar: [] [/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin/$%7Bhadoop.home.dir%7D/hadoop-$%7Buser.name%7D/hadoop-unjar4241129353499211360/] /tmp/streamjob7683981905208294893.jar tmpDir=null
      Exception in thread "main" java.io.IOException: java.util.zip.ZipException: ZIP file must have at least one entry
              at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:82)
              at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:707)
              at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:948)
              at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:127)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
              at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:200)
      

        Activity

        Devaraj K added a comment -

        With the 'yarn jar' command, RunJar.java creates the temp directory named by the configuration property "hadoop.tmp.dir" if it does not already exist. The value read from the conf object is ${hadoop.home.dir}/hadoop-${user.name}. The variables are not replaced with system properties because the 'hadoop.home.dir' system property is not set, so a temp dir with the literal name ${hadoop.home.dir}/hadoop-${user.name} is created in the current dir.
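
The substitution behaviour described above can be sketched as follows. This is a simplified stand-in, not Hadoop's Configuration code: the class name VarExpansionSketch, the method expand, and the path /opt/hadoop are hypothetical, and stopping at the first unbound variable is an assumption chosen to match the directory name seen in the log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: expands ${name} from system properties only.
class VarExpansionSketch {
    // Matches ${name}; the name may not contain '}', '$', or whitespace.
    private static final Pattern VAR = Pattern.compile("\\$\\{[^}$\\s]+\\}");

    static String expand(String expr) {
        String eval = expr;
        // Cap the number of passes so a self-referencing value cannot loop forever.
        for (int pass = 0; pass < 20; pass++) {
            Matcher m = VAR.matcher(eval);
            if (!m.find()) {
                return eval; // nothing left to substitute
            }
            String name = eval.substring(m.start() + 2, m.end() - 1);
            String val = System.getProperty(name);
            if (val == null) {
                // Assumption: give up at the first unbound variable and
                // return the text as-is, later variables included.
                return eval;
            }
            eval = eval.substring(0, m.start()) + val + eval.substring(m.end());
        }
        return eval;
    }

    public static void main(String[] args) {
        String raw = "${hadoop.home.dir}/hadoop-${user.name}";

        // Without the system property, the value stays literal.
        System.clearProperty("hadoop.home.dir");
        System.out.println(expand(raw)); // ${hadoop.home.dir}/hadoop-${user.name}

        // With the property set (hypothetical path), both variables resolve.
        System.setProperty("hadoop.home.dir", "/opt/hadoop");
        System.out.println(expand(raw));
    }
}
```

Under this assumption, the unset 'hadoop.home.dir' also leaves ${user.name} unexpanded, even though 'user.name' is normally available as a system property.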

        StreamJob unjars and keeps the classes in the directory current-dir/${hadoop.home.dir}/hadoop-${user.name}. It then looks up "org/apache/hadoop/streaming/StreamJob.class" in the classpath and gets the path current-dir/$%7Bhadoop.home.dir%7D/hadoop-$%7Buser.name%7D/hadoop-unjar8421477351848586067/, because the special chars in the directory name are URL-encoded. Finally, merging from this path into the job jar file fails.
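
The %7B and %7D in that path are the percent-encoded forms of '{' and '}'. A minimal sketch of how a literal directory name picks them up when it is turned into a URI follows. The class and method names are hypothetical, and the sketch uses java.io.File.toURI for illustration rather than the classpath lookup that StreamJob performs.

```java
import java.io.File;

// Hypothetical demo: shows the percent-encoding applied to '{' and '}'.
class EncodedDirNameDemo {
    // Returns the percent-encoded path component of the file's URI.
    static String encodedPath(String name) {
        return new File(name).toURI().getRawPath();
    }

    public static void main(String[] args) {
        // Literal directory name left behind when the variables are not substituted.
        String literal = "${hadoop.home.dir}/hadoop-${user.name}";

        // The printed path contains $%7Bhadoop.home.dir%7D/hadoop-$%7Buser.name%7D:
        // '{' and '}' are escaped, '$' is left as-is.
        System.out.println(encodedPath(literal));
    }
}
```

A path in this encoded form no longer names the directory that actually exists on disk, which is consistent with the empty file list in the packageJobJar line of the log.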

        With 'hadoop jar', the property is read as $HADOOP_HOME/hadoop-username, because the 'hadoop.home.dir' and 'user.name' properties are both available and are substituted. The temp dir is created properly, the same path is used for the later steps, and the job works fine.

        I have attached a patch that addresses this problem by setting the hadoop.home.dir system property in the yarn script.

        Robert Joseph Evans added a comment -

        The patch looks OK; I just have one concern. Inside hadoop-config.sh there is the line

        HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.home.dir=$HADOOP_PREFIX"

        This should be sourced by yarn-config.sh. Why is that not sufficient to set hadoop.home.dir?

        I am also a bit confused about where hadoop.home.dir is coming from and why this is a Blocker. It appears to come from your core-site.xml, because hadoop.tmp.dir defaults to /tmp/${user}-hadoop. Is this critical, or can you update your config as a workaround?

        Robert Joseph Evans added a comment -

        OK, I just looked and answered my own question: HADOOP_OPTS is not the same as YARN_OPTS.

        I don't really like that hadoop.home.dir is set to YARN_HOME if you run it with yarn and to HADOOP_PREFIX if you run it with hadoop, but that is something for a different JIRA. +1

        Robert Joseph Evans added a comment -

        Thanks Devaraj, I just put this into trunk and branch-2.

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #2118 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2118/)
        MAPREDUCE-4076. Stream job fails with ZipException when use yarn jar command (Devaraj K via bobby) (Revision 1312003)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1312003
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/bin/yarn
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #2044 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2044/)
        MAPREDUCE-4076. Stream job fails with ZipException when use yarn jar command (Devaraj K via bobby) (Revision 1312003)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1312003
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/bin/yarn
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #2057 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2057/)
        MAPREDUCE-4076. Stream job fails with ZipException when use yarn jar command (Devaraj K via bobby) (Revision 1312003)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1312003
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/bin/yarn
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1011 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1011/)
        MAPREDUCE-4076. Stream job fails with ZipException when use yarn jar command (Devaraj K via bobby) (Revision 1312003)

        Result = FAILURE
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1312003
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/bin/yarn
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1046 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1046/)
        MAPREDUCE-4076. Stream job fails with ZipException when use yarn jar command (Devaraj K via bobby) (Revision 1312003)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1312003
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/bin/yarn

          People

          • Assignee: Devaraj K
          • Reporter: Devaraj K
          • Votes: 0
          • Watchers: 3
