Spark / SPARK-11227

Spark 1.5+ in HDFS HA mode throws java.net.UnknownHostException: nameservice1

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0, 1.5.1, 1.6.1, 2.0.0
    • Fix Version/s: 2.0.1, 2.1.0
    • Component/s: Spark Core
    • Labels: None
    • Environment:
      OS: CentOS 6.6
      Memory: 28G
      CPU: 8
      Mesos: 0.22.0
      HDFS: Hadoop 2.6.0-CDH5.4.0 (built by Cloudera Manager)

    Description

      When I run a jar containing a Spark job on a cluster with HDFS HA, Mesos, and Spark 1.5.1, the job throws "java.net.UnknownHostException: nameservice1" and fails.

      I ran the following in a terminal:

      /opt/spark/bin/spark-submit \
        --class com.example.Job /jobs/job-assembly-1.0.0.jar
      

      The job then logged the following message:

      15/10/21 15:22:12 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, spark003.example.com): java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
              at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
              at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:312)
              at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:178)
              at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:665)
              at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:601)
              at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
              at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
              at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
              at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
              at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
              at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
              at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
              at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
              at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:436)
              at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:409)
              at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1016)
              at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1016)
              at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
              at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
              at scala.Option.map(Option.scala:145)
              at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
              at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
              at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
              at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
              at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
              at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
              at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
              at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
              at org.apache.spark.scheduler.Task.run(Task.scala:88)
              at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: java.net.UnknownHostException: nameservice1
              ... 41 more
      

      However, when I switched the Spark cluster from 1.5.1 to 1.4.0 and ran the same job, it completed successfully.
      In addition, when I disabled High Availability on HDFS and ran the job, it also completed successfully.

      So I think Spark 1.5 and higher have a bug in how they handle an HDFS HA nameservice.

      Note: I tried both of the following packages on my cluster, and both failed.

      • spark-1.5.1-bin-hadoop2.6.tgz
      • spark-1.5.1-bin-without-hadoop.tgz

      Only spark-1.4.0-bin-hadoop2.6.tgz succeeded.
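
A note for anyone triaging a similar trace: the stack passes through NameNodeProxies.createNonHAProxy, which suggests the HDFS client in the executor treated nameservice1 as an ordinary host name rather than a logical HA nameservice. That is what happens when the Hadoop configuration visible to the client has no failover proxy provider set for that nameservice. The sketch below is a rough diagnostic, not part of the original report: it checks whether an hdfs-site.xml defines the standard HA client properties. The function name missing_ha_keys and the one-property-name-per-<name>-element layout are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report which HDFS HA client properties are absent
# from an hdfs-site.xml file. Prints one "missing: <key>" line per absent
# property and returns 1 if any are missing, 0 otherwise.
# Usage: missing_ha_keys <path-to-hdfs-site.xml> <nameservice>
missing_ha_keys() {
  conf_file=$1
  ns=$2
  missing=0
  for key in \
    "dfs.nameservices" \
    "dfs.ha.namenodes.$ns" \
    "dfs.client.failover.proxy.provider.$ns"
  do
    # -F matches a fixed string, so dots in the key are not regex wildcards.
    if ! grep -qF "<name>$key</name>" "$conf_file" 2>/dev/null; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

On an executor host this might be run as `missing_ha_keys "$HADOOP_CONF_DIR/hdfs-site.xml" nameservice1`, assuming HADOOP_CONF_DIR points at the client configuration. An empty result only shows that the file on disk is complete. It does not show that the configuration actually reaches the task, which is the part this issue calls into question.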

          People

            Assignee: Kousuke Saruta (sarutak)
            Reporter: Yuri Saito (x1)