Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-21431

Unable to parallelize a list in Scala

    XMLWordPrintableJSON

Details

    • Task
    • Status: Resolved
    • Major
    • Resolution: Not A Problem
    • 2.1.0
    • None
    • None

    Description

      I am using RabbitMQ spark streaming connector to process AVRO messages and I am unable to parallelize a list in scala, getting java.lang.NullPointerException. The same worked when I was using Kafka spark connector
      messages.foreachRDD( rdd => {
      for(avroLine <- rdd)

      { val record = Injection.injection.invert(avroLine.getBytes).get val field1Value = record.get("username") val jsonStrings=Seq(record.toString()) val newRow = sqlContext.sparkContext.parallelize(Seq(record.toString())) }

      })

      Output: jsonStrings...List(

      {"username": "user_118", "tweet": "tweet_218", "timestamp": 18}

      )

      Exception:

      Caused by: java.lang.NullPointerException
      at com.capitalone.AvroConsumer$$anonfun$main$1$$anonfun$apply$1.apply(AvroConsumer.scala:83)
      at com.capitalone.AvroConsumer$$anonfun$main$1$$anonfun$apply$1.apply(AvroConsumer.scala:74)
      at scala.collection.Iterator$class.foreach(Iterator.scala:893)
      at org.apache.spark.util.CompletionIterator.foreach(CompletionIterator.scala:26)
      at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:917)
      at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:917)
      at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
      at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
      at org.apache.spark.scheduler.Task.run(Task.scala:99)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
      17/07/16 19:11:00 WARN TaskSetManager: Lost task 0.0 in stage 47.0 (TID 47, localhost, executor driver): java.lang.NullPointerException
      at com.capitalone.AVROMqStreaming$$anonfun$main$1$$anonfun$apply$1.apply(AVROMqStreaming.scala:56)
      at com.capitalone.AVROMqStreaming$$anonfun$main$1$$anonfun$apply$1.apply(AVROMqStreaming.scala:46)
      at scala.collection.Iterator$class.foreach(Iterator.scala:893)
      at org.apache.spark.util.CompletionIterator.foreach(CompletionIterator.scala:26)
      at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:917)
      at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:917)
      at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
      at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
      at org.apache.spark.scheduler.Task.run(Task.scala:99)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)

      Thanks in Advance!!

      Attachments

        Activity

          People

            Unassigned Unassigned
            mg2729 Mounika
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: