Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Cannot Reproduce
-
1.4.1
-
None
-
None
-
Windows 10, spark - 1.4.1,
Description
When attempting to train a machine learning model using ALS in Spark's MLLib (1.4) on windows, Pyspark always terminates with a StackoverflowError. I tried adding the checkpoint as described in http://stackoverflow.com/a/31484461/36130 – doesn't seem to help.
Here's the training code and stack trace:
ranks = [8, 12] lambdas = [0.1, 10.0] numIters = [10, 20] bestModel = None bestValidationRmse = float("inf") bestRank = 0 bestLambda = -1.0 bestNumIter = -1 for rank, lmbda, numIter in itertools.product(ranks, lambdas, numIters): ALS.checkpointInterval = 2 model = ALS.train(training, rank, numIter, lmbda) validationRmse = computeRmse(model, validation, numValidation) if (validationRmse < bestValidationRmse): bestModel = model bestValidationRmse = validationRmse bestRank = rank bestLambda = lmbda bestNumIter = numIter testRmse = computeRmse(bestModel, test, numTest)
Stacktrace:
15/08/27 02:02:58 ERROR Executor: Exception in task 3.0 in stage 56.0 (TID 127)
java.lang.StackOverflowError
at java.io.ObjectInputStream$BlockDataInputStream.readInt(Unknown Source)
at java.io.ObjectInputStream.readHandle(Unknown Source)
at java.io.ObjectInputStream.readClassDesc(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)