Description
I am trying to set up a Jupyter notebook (Apache Toree, Scala) to access Kafka logs from Spark Streaming.
First I add the dependency using the %AddDeps magic:
```
%AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0
```

```
Marking org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.0 for download
Preparing to fetch from:
-> file:/tmp/toree_add_deps8235567186565695423/
-> https://repo1.maven.org/maven2
-> New file at /tmp/toree_add_deps8235567186565695423/https/repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-10_2.11/2.2.0/spark-streaming-kafka-0-10_2.11-2.2.0.jar
```
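(Note that the fetch log above shows only the single streaming-kafka jar being downloaded. If transitive dependencies such as kafka-clients are also needed, %AddDeps supports a `--transitive` flag; a variant of the command would be:)

```
%AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0 --transitive
```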
After that, part of the necessary libraries import successfully:
```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka010._
```
However, the code fails when I try to create the streaming context:
```scala
val ssc = new StreamingContext(sc, Seconds(2))
```

```
Name: Compile Error
Message: <console>:38: error: overloaded method constructor StreamingContext with alternatives:
  (path: String,sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext)org.apache.spark.streaming.StreamingContext <and>
  (path: String,hadoopConf: org.apache.hadoop.conf.Configuration)org.apache.spark.streaming.StreamingContext <and>
  (conf: org.apache.spark.SparkConf,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext <and>
  (sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
 cannot be applied to (org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext, org.apache.spark.streaming.Duration)
       val ssc = new StreamingContext(sc, Seconds(2))
                 ^
StackTrace:
```
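Note that in the error the type of `sc` is printed with a repeated package prefix (`org.apache.spark.org.apache.spark.…SparkContext`), so the REPL appears to be resolving `SparkContext` through a shadowed import rather than as the actual class of `sc`. For reference, one workaround I considered (untested here; the app name and master are placeholders) is to rebuild the context from a fresh `SparkConf` instead of passing the kernel-provided `sc`:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Only one SparkContext may run per JVM, so stop the kernel-provided one first.
sc.stop()

// Placeholder settings; adjust appName/master for the actual cluster.
val conf = new SparkConf()
  .setAppName("kafka-streaming-test")
  .setMaster("local[*]")

// Constructing from a SparkConf avoids passing the mismatched `sc` instance.
val ssc = new StreamingContext(conf, Seconds(2))
```

With a working `ssc`, the next step would be `KafkaUtils.createDirectStream` from the kafka010 package imported above.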
I have tried this both in the Jupyter Docker image
https://github.com/jupyter/docker-stacks/tree/master/all-spark-notebook
and in a Spark cluster set up on Google Cloud Platform, with the same results.
Thanks