TOREE-457

Spark context seems corrupted after loading Kafka libraries


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: Kernel
    • Labels: None

    Description

      I am trying to set up a Jupyter notebook (Apache Toree, Scala) to access Kafka logs from Spark Streaming.
      First I add the dependencies using %AddDeps:

      %AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0
      Marking org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.0 for download
      Preparing to fetch from:
      -> file:/tmp/toree_add_deps8235567186565695423/
      -> https://repo1.maven.org/maven2
      -> New file at /tmp/toree_add_deps8235567186565695423/https/repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-10_2.11/2.2.0/spark-streaming-kafka-0-10_2.11-2.2.0.jar
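
      Note that %AddDeps fetches only the artifact named on the line; Toree's %AddDeps also accepts a --transitive flag that additionally pulls in the artifact's own dependencies (e.g. kafka-clients), which the Kafka integration needs at runtime. A variant of the command with that flag:

      %AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0 --transitive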
      

      After that I am able to successfully import some of the necessary libraries:

      
      import org.apache.spark.SparkConf
      import org.apache.spark.streaming._
      import org.apache.spark.streaming.kafka010._
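
      These imports are the prelude to the usual Kafka 0-10 direct-stream setup. For reference, a minimal sketch of what the notebook is building toward, assuming a hypothetical broker at localhost:9092 and a topic named "logs" (both placeholders):

      import org.apache.kafka.common.serialization.StringDeserializer
      import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
      import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

      // In a Toree notebook, sc is the kernel-provided SparkContext.
      val ssc = new StreamingContext(sc, Seconds(2))

      // Consumer configuration; broker address, group id and topic are assumptions.
      val kafkaParams = Map[String, Object](
        "bootstrap.servers" -> "localhost:9092",
        "key.deserializer" -> classOf[StringDeserializer],
        "value.deserializer" -> classOf[StringDeserializer],
        "group.id" -> "toree-example",
        "auto.offset.reset" -> "latest",
        "enable.auto.commit" -> (false: java.lang.Boolean)
      )

      // Direct stream over the topic, printing each record's value per batch.
      val stream = KafkaUtils.createDirectStream[String, String](
        ssc, PreferConsistent, Subscribe[String, String](Seq("logs"), kafkaParams))
      stream.map(record => record.value).print()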
      

      However, the code fails when I try to create the streaming context:

      val ssc = new StreamingContext(sc, Seconds(2))
      
      Name: Compile Error
      Message: <console>:38: error: overloaded method constructor StreamingContext with alternatives:
        (path: String,sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext)org.apache.spark.streaming.StreamingContext <and>
        (path: String,hadoopConf: org.apache.hadoop.conf.Configuration)org.apache.spark.streaming.StreamingContext <and>
        (conf: org.apache.spark.SparkConf,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext <and>
        (sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
       cannot be applied to (org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext, org.apache.spark.streaming.Duration)
             val ssc = new StreamingContext(sc, Seconds(2))
                       ^
      StackTrace: 
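
      The repeated "org.apache.spark." prefix in the type names suggests the compiler is resolving SparkContext through a different classloader than the one that created sc; the Spark classes fetched by %AddDeps appear to shadow the kernel's own. A minimal diagnostic sketch (plain JVM reflection, nothing Toree-specific) to compare where the two classes are loaded from:

      // Jar that backs the live sc instance
      println(sc.getClass.getProtectionDomain.getCodeSource)
      // Jar that backs the SparkContext class the compiler resolves
      println(classOf[org.apache.spark.SparkContext].getProtectionDomain.getCodeSource)
      // Differing code sources would indicate a duplicate spark-core on the classpath.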
      
      

      I have tried this in the Jupyter Docker image
      https://github.com/jupyter/docker-stacks/tree/master/all-spark-notebook
      and in a Spark cluster set up on Google Cloud Platform, with the same results.
      Thanks


          People

            Assignee: Unassigned
            Reporter: fxoSa (fxoSdev)
            Votes: 0
            Watchers: 2
