Kylin / KYLIN-629

Kylin fails to run MapReduce jobs if mapreduce.application.classpath is not set in mapred-site.xml


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v0.7.1
    • Fix Version/s: v0.7.1
    • Component/s: Job Engine
    • Labels:
      None

      Description

      I deployed a 0.7.1 snapshot on our Hadoop cluster and synced a table to Kylin, which triggers Kylin to submit a Hadoop job that calculates the cardinality of the table's fields. The job failed with the following error:

      15/03/10 01:05:25 INFO mapreduce.Job: Job job_1425075571333_139544 failed with state FAILED due to: Application application_1425075571333_139544 failed 2 times due to AM Container for appattempt_1425075571333_139544_000002 exited with exitCode: 1 due to: Exception from container-launch:
      org.apache.hadoop.util.Shell$ExitCodeException:
      at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
      at org.apache.hadoop.util.Shell.run(Shell.java:418)
      at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
      at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:279)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)

      After adding more logging to AbstractHadoopJob.setJobClasspath, I got the following debug messages:
      [pool-7-thread-1]:[2015-03-10 00:33:17,739][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:146)] - append job jar: /export/home/b_kylin/kylin_ii/lib/kylin-job-0.7.1-SNAPSHOT.jar
      [pool-7-thread-1]:[2015-03-10 00:33:17,740][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:152)] - append kylin.hive.dependency: /apache/hive/conf:/apache/hive/lib/*:/apache/hive-0.13.0.2.1.3.0-563/hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.3.0-563.jar to mapreduce.application.classpath
      [pool-7-thread-1]:[2015-03-10 00:33:17,740][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:164)] - Hadoop job classpath is: /apache/hive/conf,/apache/hive/lib/*,/apache/hive-0.13.0.2.1.3.0-563/hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.3.0-563.jar

      The log shows that the job configuration's "mapreduce.application.classpath" was empty before the Hive dependencies were appended. After appending, only the Hive entries are on the classpath, which causes the Hadoop job to fail.

      After some searching, I found the explanation of "mapreduce.application.classpath" in:
      https://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

      This explains the failure: when the property is not set in mapred-site.xml, the appended Hive dependency replaces the default classpath instead of extending it, which is not expected.
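
One way to avoid the overwrite is to fall back to the default classpath whenever the property is empty, and only then append the Hive entries. The following is a minimal sketch of that idea, not the actual Kylin patch. It uses a plain Map in place of Hadoop's Configuration so it runs without Hadoop on the classpath. The class name, the method name, and the ASSUMED_DEFAULT value (taken from the Linux default described in mapred-default.xml) are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

public class ClasspathSketch {
    static final String KEY = "mapreduce.application.classpath";

    // Assumption: the Linux default entries described in mapred-default.xml.
    // A real fix should read the default from Hadoop rather than hard-code it.
    static final String ASSUMED_DEFAULT =
        "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,"
        + "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*";

    // Hypothetical helper: appends the Hive dependency to the job classpath
    // without losing the default entries when the property is unset or empty.
    static String appendHiveDependency(Map<String, String> conf, String hiveDependency) {
        String current = conf.get(KEY);
        if (current == null || current.trim().isEmpty()) {
            // Property missing from mapred-site.xml: start from the default.
            current = ASSUMED_DEFAULT;
        }
        // kylin.hive.dependency is colon-separated in the log above, while the
        // classpath property is comma-separated, so convert the separators.
        String updated = current + "," + hiveDependency.replace(":", ",");
        conf.put(KEY, updated);
        return updated;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>(); // property not set
        String hive = "/apache/hive/conf:/apache/hive/lib/*";
        System.out.println("Hadoop job classpath is: " + appendHiveDependency(conf, hive));
    }
}
```

With this approach a cluster that does set the property in mapred-site.xml keeps its own value, and the Hive entries are simply added after it.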

            People

            • Assignee: Shao Feng Shi (shaofengshi)
            • Reporter: Shao Feng Shi (shaofengshi)