Details
- Type: Task
- Status: Resolved
- Priority: Major
- Resolution: Invalid
- Environment: Spark running on EMR 4.3 with Hadoop 2.7 and Spark 1.6.0
Description
> config <- spark_config()
> config$`sparklyr.shell.driver-memory` <- "4G"
> config$`sparklyr.shell.executor-memory` <- "4G"
> sc <- spark_connect(master = "yarn-client", config = config)
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 27.0 failed 4 times, most recent failure: Lost task 0.3 in stage 27.0 (TID 1941, ip- .ec2.internal): org.apache.spark.SparkException: Values to assemble cannot be null.
at org.apache.spark.ml.feature.VectorAssembler$$anonfun$assemble$1.apply(VectorAssembler.scala:154)
at org.apache.spark.ml.feature.VectorAssembler$$anonfun$assemble$1.apply(VectorAssembler.scala:137)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at org.apache.spark.ml.feature.VectorAssembler$.assemble(VectorAssembler.scala:137)
at org.apache.spark.ml.feature.VectorAssembler$$anonfun$3.apply(VectorAssembler.scala:95)
at org.apache.spark.ml.feature.VectorAssembler$$anonfun$3.apply(VectorAssembler.scala:94)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
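The `Values to assemble cannot be null` error above is thrown by Spark's `VectorAssembler` when any input feature column contains a NULL/NA value, so it comes from a later `ml_*` model-fitting call, not from `spark_connect` itself. A minimal sketch of one way to avoid it, assuming a hypothetical table and columns (`nycflights13::flights`, `dep_delay`, `distance`) that are not part of this report:

```r
library(sparklyr)
library(dplyr)

config <- spark_config()
config$`sparklyr.shell.driver-memory` <- "4G"
config$`sparklyr.shell.executor-memory` <- "4G"
sc <- spark_connect(master = "yarn-client", config = config)

# Example data; substitute your own Spark table here.
flights_tbl <- copy_to(sc, nycflights13::flights, "flights")

# Drop rows with missing feature values before fitting, so VectorAssembler
# never sees a NULL in the assembled feature columns.
model <- flights_tbl %>%
  filter(!is.na(dep_delay), !is.na(distance)) %>%
  ml_linear_regression(dep_delay ~ distance)
```

Imputing missing values (for example with a per-column mean) is an alternative to dropping rows when the data loss would be significant.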