SPARK-7029: Unable to use Hive built-in functions in Spark SQL


Details

    • Type: Test
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 1.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Trying to use Hive built-in functions in Spark SQL; the job fails with a ClassNotFoundException (stack trace and scheduler log below).
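
      For reference, Hive built-ins are reached through HiveContext rather than the plain SQLContext. A minimal sketch of that pattern on Spark 1.2 follows (the reporter's actual code is not attached, so the function, table name, and output path are placeholders):

      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.sql.hive.HiveContext

      val sc = new SparkContext(new SparkConf().setAppName("hive-builtin-example"))
      // HiveContext exposes Hive's built-in functions; it requires a Spark build
      // with Hive support (-Phive) and the Hive jars on the classpath.
      val hiveContext = new HiveContext(sc)

      // concat_ws is a Hive built-in backed by a GenericUDF implementation
      // (org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcatWS).
      val result = hiveContext.sql("SELECT concat_ws('-', name, city) FROM people")

      // SchemaRDD is an RDD[Row], so it can be written out as in the log below.
      result.saveAsTextFile("/tmp/hive-builtin-output")
      sc.stop()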

      Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.udf.generic.GenericUDF
      at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
      ... 160 more
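
      One quick way to tell whether the class is missing on the driver, the executors, or both is to probe the classpath on each side; a minimal sketch, assuming an existing SparkContext named sc (not part of the original report):

      // Probe on the driver JVM.
      val onDriver =
        try { Class.forName("org.apache.hadoop.hive.ql.udf.generic.GenericUDF"); "found" }
        catch { case _: ClassNotFoundException => "missing" }
      println(s"driver: $onDriver")

      // Run the same probe inside tasks so it executes on the worker JVMs.
      val onExecutors = sc.parallelize(1 to 4, 4).map { _ =>
        try { Class.forName("org.apache.hadoop.hive.ql.udf.generic.GenericUDF"); "found" }
        catch { case _: ClassNotFoundException => "missing" }
      }.distinct().collect()
      println(s"executors: ${onExecutors.mkString(", ")}")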

      2015-04-20 18:00:45,663 INFO [sparkDriver-akka.actor.default-dispatcher-15] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.1 in stage 1.0 (TID 3, , PROCESS_LOCAL, 1107 bytes)
      2015-04-20 18:00:45,988 ERROR [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSchedulerImpl (Logging.scala:logError(75)) - Lost executor 0 on : remote Akka client disassociated
      2015-04-20 18:00:45,994 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Re-queueing tasks for 0 from TaskSet 1.0
      2015-04-20 18:00:45,995 INFO [sparkDriver-akka.actor.default-dispatcher-7] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor updated: app-20150420180040-0003/0 is now EXITED (Command exited with code 50)
      2015-04-20 18:00:45,998 INFO [sparkDriver-akka.actor.default-dispatcher-7] cluster.SparkDeploySchedulerBackend (Logging.scala:logInfo(59)) - Executor app-20150420180040-0003/0 removed: Command exited with code 50
      2015-04-20 18:00:46,003 WARN [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSetManager (Logging.scala:logWarning(71)) - Lost task 0.1 in stage 1.0 (TID 3, ): ExecutorLostFailure (executor 0 lost)
      2015-04-20 18:00:46,005 ERROR [sparkDriver-akka.actor.default-dispatcher-4] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 0
      2015-04-20 18:00:46,005 ERROR [sparkDriver-akka.actor.default-dispatcher-4] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 0
      2015-04-20 18:00:46,005 INFO [sparkDriver-akka.actor.default-dispatcher-19] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Executor lost: 0 (epoch 1)
      2015-04-20 18:00:46,005 ERROR [sparkDriver-akka.actor.default-dispatcher-4] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 0
      2015-04-20 18:00:46,005 INFO [sparkDriver-akka.actor.default-dispatcher-7] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor added: app-20150420180040-0003/1 on worker-20150420155831- -7078 ( :7078) with 4 cores
      2015-04-20 18:00:46,006 INFO [sparkDriver-akka.actor.default-dispatcher-7] cluster.SparkDeploySchedulerBackend (Logging.scala:logInfo(59)) - Granted executor ID app-20150420180040-0003/1 on hostPort :7078 with 4 cores, 512.0 MB RAM
      2015-04-20 18:00:46,006 INFO [sparkDriver-akka.actor.default-dispatcher-17] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Trying to remove executor 0 from BlockManagerMaster.
      2015-04-20 18:00:46,006 INFO [sparkDriver-akka.actor.default-dispatcher-7] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor updated: app-20150420180040-0003/1 is now LOADING
      2015-04-20 18:00:46,007 INFO [sparkDriver-akka.actor.default-dispatcher-17] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Removing block manager BlockManagerId(0, , 40573)
      2015-04-20 18:00:46,008 ERROR [sparkDriver-akka.actor.default-dispatcher-5] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 0
      2015-04-20 18:00:46,008 INFO [sparkDriver-akka.actor.default-dispatcher-19] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Removed 0 successfully in removeExecutor
      2015-04-20 18:00:46,010 INFO [sparkDriver-akka.actor.default-dispatcher-19] scheduler.Stage (Logging.scala:logInfo(59)) - Stage 0 is now unavailable on executor 0 (0/2, false)
      2015-04-20 18:00:46,011 ERROR [sparkDriver-akka.actor.default-dispatcher-6] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 0
      2015-04-20 18:00:46,012 INFO [sparkDriver-akka.actor.default-dispatcher-6] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor updated: app-20150420180040-0003/1 is now RUNNING
      2015-04-20 18:00:47,215 INFO [sparkDriver-akka.actor.default-dispatcher-6] cluster.SparkDeploySchedulerBackend (Logging.scala:logInfo(59)) - Registered executor: Actor[akka.tcp://sparkExecutor@ :38892/user/Executor#-1691085362] with ID 1
      2015-04-20 18:00:47,216 INFO [sparkDriver-akka.actor.default-dispatcher-6] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.2 in stage 1.0 (TID 4, , PROCESS_LOCAL, 1107 bytes)
      2015-04-20 18:00:47,480 INFO [sparkDriver-akka.actor.default-dispatcher-19] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Registering block manager :58200 with 265.4 MB RAM, BlockManagerId(1, , 58200)
      2015-04-20 18:00:47,688 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerInfo (Logging.scala:logInfo(59)) - Added broadcast_2_piece0 in memory on :58200 (size: 45.8 KB, free: 265.4 MB)
      2015-04-20 18:00:47,916 INFO [task-result-getter-3] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Lost task 0.2 in stage 1.0 (TID 4) on executor : java.lang.NoClassDefFoundError (Lorg/apache/hadoop/hive/ql/udf/generic/GenericUDF [duplicate 1]
      2015-04-20 18:00:47,917 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Starting task 0.3 in stage 1.0 (TID 5, , PROCESS_LOCAL, 1107 bytes)
      2015-04-20 18:00:48,246 ERROR [sparkDriver-akka.actor.default-dispatcher-19] scheduler.TaskSchedulerImpl (Logging.scala:logError(75)) - Lost executor 1 on : remote Akka client disassociated
      2015-04-20 18:00:48,246 INFO [sparkDriver-akka.actor.default-dispatcher-19] scheduler.TaskSetManager (Logging.scala:logInfo(59)) - Re-queueing tasks for 1 from TaskSet 1.0
      2015-04-20 18:00:48,246 WARN [sparkDriver-akka.actor.default-dispatcher-19] scheduler.TaskSetManager (Logging.scala:logWarning(71)) - Lost task 0.3 in stage 1.0 (TID 5, ): ExecutorLostFailure (executor 1 lost)
      2015-04-20 18:00:48,251 ERROR [sparkDriver-akka.actor.default-dispatcher-19] scheduler.TaskSetManager (Logging.scala:logError(75)) - Task 0 in stage 1.0 failed 4 times; aborting job
      2015-04-20 18:00:48,254 INFO [sparkDriver-akka.actor.default-dispatcher-5] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor updated: app-20150420180040-0003/1 is now EXITED (Command exited with code 50)
      2015-04-20 18:00:48,254 INFO [sparkDriver-akka.actor.default-dispatcher-5] cluster.SparkDeploySchedulerBackend (Logging.scala:logInfo(59)) - Executor app-20150420180040-0003/1 removed: Command exited with code 50
      2015-04-20 18:00:48,258 INFO [sparkDriver-akka.actor.default-dispatcher-19] scheduler.TaskSchedulerImpl (Logging.scala:logInfo(59)) - Removed TaskSet 1.0, whose tasks have all completed, from pool
      2015-04-20 18:00:48,258 ERROR [sparkDriver-akka.actor.default-dispatcher-19] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 1
      2015-04-20 18:00:48,260 ERROR [sparkDriver-akka.actor.default-dispatcher-19] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 1
      2015-04-20 18:00:48,260 ERROR [sparkDriver-akka.actor.default-dispatcher-19] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 1
      2015-04-20 18:00:48,260 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.TaskSchedulerImpl (Logging.scala:logInfo(59)) - Cancelling stage 1
      2015-04-20 18:00:48,260 INFO [sparkDriver-akka.actor.default-dispatcher-5] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor added: app-20150420180040-0003/2 on worker-20150420155831- -7078 ( :7078) with 4 cores
      2015-04-20 18:00:48,260 INFO [sparkDriver-akka.actor.default-dispatcher-5] cluster.SparkDeploySchedulerBackend (Logging.scala:logInfo(59)) - Granted executor ID app-20150420180040-0003/2 on hostPort :7078 with 4 cores, 512.0 MB RAM
      2015-04-20 18:00:48,261 INFO [sparkDriver-akka.actor.default-dispatcher-5] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor updated: app-20150420180040-0003/2 is now LOADING
      2015-04-20 18:00:48,265 ERROR [sparkDriver-akka.actor.default-dispatcher-19] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 1
      2015-04-20 18:00:48,267 INFO [main] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Job 0 failed: saveAsTextFile at JavaSchemaRDD.scala:42, took 4.8417
      Exception in thread "main" 2015-04-20 18:00:48,267 INFO [sparkDriver-akka.actor.default-dispatcher-16] client.AppClient$ClientActor (Logging.scala:logInfo(59)) - Executor updated: app-20150420180040-0003/2 is now RUNNING
      2015-04-20 18:00:48,268 ERROR [sparkDriver-akka.actor.default-dispatcher-20] cluster.SparkDeploySchedulerBackend (Logging.scala:logError(75)) - Asked to remove non-existent executor 1
      org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 5 6.igatecorp.com): ExecutorLostFailure (executor 1 lost)
      Driver stacktrace:
      at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
      at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
      at akka.actor.ActorCell.invoke(ActorCell.scala:456)
      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
      at akka.dispatch.Mailbox.run(Mailbox.scala:219)
      at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
      at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      2015-04-20 18:00:48,272 INFO [sparkDriver-akka.actor.default-dispatcher-4] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Executor lost: 1 (epoch 2)
      2015-04-20 18:00:48,272 INFO [sparkDriver-akka.actor.default-dispatcher-3] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Trying to remove executor 1 from BlockManagerMaster.
      2015-04-20 18:00:48,272 INFO [sparkDriver-akka.actor.default-dispatcher-3] storage.BlockManagerMasterActor (Logging.scala:logInfo(59)) - Removing block manager BlockManagerId(1, , 58200)
      2015-04-20 18:00:48,273 INFO [sparkDriver-akka.actor.default-dispatcher-4] storage.BlockManagerMaster (Logging.scala:logInfo(59)) - Removed 1 successfully in removeExecutor
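
      The executors exit with NoClassDefFoundError for GenericUDF, which usually means the Hive jars never reached the executor classpath. One common way to ship them in a standalone deployment is sketched below; the jar paths and versions are placeholders, and building Spark with -Phive or passing the jars to spark-submit via --jars achieves the same effect:

      import org.apache.spark.{SparkConf, SparkContext}

      // Placeholder paths: hive-exec is the jar that actually contains
      // org.apache.hadoop.hive.ql.udf.generic.GenericUDF.
      val hiveJars = Seq(
        "/path/to/hive-exec-0.13.1.jar",
        "/path/to/hive-metastore-0.13.1.jar")

      val conf = new SparkConf()
        .setAppName("hive-builtin-example")
        // setJars distributes the listed jars to every executor at startup,
        // so the Hive classes resolve on the workers as well as on the driver.
        .setJars(hiveJars)

      val sc = new SparkContext(conf)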


          People

            Assignee: Unassigned
            Reporter: Aditya Parmar (adii.parmar)
            Votes: 0
            Watchers: 2
