Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Not Applicable
None
Environment: pyspark with local Spark 2.1
Description
Do we have undocumented size limits for RDDConverterUtilsExt.convertPy4JArrayToMB?
The simple script below works for 23,100 rows, while 46,900 rows fail. It reproduces the problem easily and consistently.
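For scale, a back-of-envelope estimate of the payload involved (my assumption, based on the trace below, is that the whole matrix is shipped to the JVM as a single Base64-encoded string of 8-byte doubles through Py4J):

# Rough payload estimate; assumes one Base64-encoded transfer of doubles.
for nr in (23100, 46900):
    raw = nr * 784 * 8             # total bytes of raw double data
    b64 = raw * 4 // 3             # Base64 inflates payloads by ~4/3
    print("%d rows: %.0f MB raw, %.0f MB Base64" % (nr, raw / 2.0**20, b64 / 2.0**20))
# 23100 rows: 138 MB raw, 184 MB Base64  (works)
# 46900 rows: 281 MB raw, 374 MB Base64  (fails)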
START:
$pyspark --master local --jars $SYSTEMML_HOME/SystemML.jar --driver-memory 8G --executor-memory 2G
PYTHON SCRIPT:
from systemml import MLContext, dml
import pandas as pd
sc.version
ml = MLContext(sc)
print "Spark Version:", sc.version
print "SystemML Version:", ml.version()
print "SystemML Built-Time:", ml.buildTime()
# note: nr = 23100 works, while nr = 46900 fails
nr = 46900
X_pd = pd.DataFrame(range(1, (nr*784)+1,1),dtype=float).values.reshape(nr,784)
script ="""
write(X, $Xfile, format="csv")
"""
# "$Xfile" is passed via a **-dict since it is not a valid Python keyword;
# the path value here is a hypothetical placeholder
prog = dml(script).input(X=X_pd).input(**{"$Xfile": "X.csv"})
ml.execute(prog)
OUTPUT:
Spark Version: 2.1.0
SystemML Version: 0.14.0-incubating-SNAPSHOT
SystemML Built-Time: 2017-03-03 07:33:40 UTC
---------------------------------------------------------------------------
Py4JError Traceback (most recent call last)
.......
Py4JError: An error occurred while calling z:org.apache.sysml.runtime.instructions.spark.utils.RDDConverterUtilsExt.convertPy4JArrayToMB. Trace:
java.lang.NegativeArraySizeException
at py4j.Base64.decode(Base64.java:321)
at py4j.Protocol.getBytes(Protocol.java:173)
at py4j.Protocol.getObject(Protocol.java:294)
at py4j.commands.AbstractCommand.getArguments(AbstractCommand.java:82)
at py4j.commands.CallCommand.execute(CallCommand.java:77)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
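A possible workaround, sketched here under the assumption that MLContext also accepts Spark DataFrame inputs (as the MLContext API advertises): route the matrix through Spark's own serialization instead of a single Base64-encoded Py4J argument. The "X.csv" path is again a hypothetical placeholder.
PYTHON SCRIPT (WORKAROUND SKETCH):
from systemml import MLContext, dml
import pandas as pd

ml = MLContext(sc)
nr = 46900
X_pd = pd.DataFrame(range(1, (nr * 784) + 1, 1), dtype=float).values.reshape(nr, 784)

# Ship the matrix as a Spark DataFrame so it is transferred via Spark's
# serialization rather than one huge Base64-encoded Py4J byte string.
X_df = spark.createDataFrame(pd.DataFrame(X_pd))

script = """
write(X, $Xfile, format="csv")
"""
prog = dml(script).input(X=X_df).input(**{"$Xfile": "X.csv"})  # placeholder path
ml.execute(prog)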