Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Cannot Reproduce
-
2.3.1
-
None
-
None
-
Spark in standalone mode, with 48 cores available.
spark-defaults.conf as below:
spark.pyshark.python /usr/bin/python3.6
spark.driver.memory 4g
spark.executor.memory 8g
Other configurations are at default.
Description
The dataset I used is attached.
First, I loaded a dataframe from local disk:
df = spark.read.load('report_dataset')
There are about 200 partitions stored in s3, and the largest partition is 28.37MB.
After the data was loaded, I executed "df.take(1)" to test the dataframe, and the expected output was printed: "[Row(s3_link='https://dcm-ul-phy.s3-china-1.eecloud.nsn-net.net/normal/run2/pool1/Tests.NbIot.NBCellSetupDelete.LTE3374_CellSetup_4x5M_2RX_3CELevel_Loop100.html', sequences=[364, 15, 184, 34, 524, 49, 30, 527, 44, 366, 125, 85, 69, 524, 49, 389, 575, 29, 179, 447, 168, 3, 223, 116, 573, 524, 49, 30, 527, 56, 366, 125, 85, 524, 118, 295, 440, 123, 389, 32, 575, 529, 192, 524, 49, 389, 575, 29, 179, 29, 140, 268, 96, 508, 389, 32, 575, 529, 192, 524, 49, 389, 575, 29, 179, 180, 451, 69, 286, 524, 49, 389, 575, 29, 42, 553, 451, 37, 125, 524, 49, 389, 575, 29, 42, 553, 451, 37, 125, 524, 49, 389, 575, 29, 42, 553, 451, 368, 125, 88, 588, 524, 49, 389, 575, 29, 42, 553, 451, 368, 125, 88, 588, 524, 49, 389, 575, 29, 42, 553, 451, 368, 125, 88, 588, 524, 49, 389], next_word=575, line_num=12)]"
Then I tried to convert the dataframe to a local iterator and print one row for testing, using the code below:
for row in df.toLocalIterator():
    print(row)
    break
But no output was printed after that code executed.
Then I executed "df.take(1)" again and the error below was reported:
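For comparison, a minimal sketch of a way to look at the first row without leaving a local iterator half-read. It assumes, as the linked SPARK-23961 suggests but this report does not confirm, that abandoning the iterator early is what triggers the failure. The helper name peek_rows is hypothetical; it only relies on the DataFrame methods limit() and toLocalIterator().

```python
def peek_rows(df, n=1):
    """Return the first n rows of df as a list (hypothetical helper).

    Assumption: the problem comes from breaking out of toLocalIterator()
    before it is exhausted, so this sketch never abandons the iterator.
    """
    # Limit on the Spark side first, so the local iterator carries at most n rows.
    limited = df.limit(n)
    # list() drains the iterator completely; nothing is left half-read.
    return list(limited.toLocalIterator())
```

With the dataframe from this report it would be called as "for row in peek_rows(df): print(row)". Whether this avoids the error on 2.3.1 has not been verified here.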
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1159, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
During handling of the above exception, another exception occurred:
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1159, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 985, in send_command
response = connection.send_command(command)
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1164, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:37735)
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-7-3959105b378f>", line 1, in <module>
df.take(1)
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 504, in take
return self.limit(num).collect()
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 493, in limit
jdf = self._jdf.limit(num)
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/protocol.py", line 336, in get_return_value
format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o29.limit
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 1863, in showtraceback
stb = value.render_traceback()
AttributeError: 'Py4JError' object has no attribute 'render_traceback'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 929, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1067, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:37735)
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-7-3959105b378f>", line 1, in <module>
df.take(1)
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 504, in take
return self.limit(num).collect()
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 493, in limit
jdf = self._jdf.limit(num)
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/protocol.py", line 336, in get_return_value
format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o29.limit
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 1863, in showtraceback
stb = value.render_traceback()
AttributeError: 'Py4JError' object has no attribute 'render_traceback'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 929, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1067, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:37735)
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-7-3959105b378f>", line 1, in <module>
df.take(1)
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 504, in take
return self.limit(num).collect()
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 493, in limit
jdf = self._jdf.limit(num)
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/protocol.py", line 336, in get_return_value
format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o29.limit
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 1863, in showtraceback
stb = value.render_traceback()
AttributeError: 'Py4JError' object has no attribute 'render_traceback'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 929, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py", line 1067, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
---------------------------------------------------------------------------
Py4JError Traceback (most recent call last)
<ipython-input-7-3959105b378f> in <module>()
----> 1 df.take(1)

/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py in take(self, num)
    502 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
    503 """
--> 504 return self.limit(num).collect()
    505
    506 @since(1.3)

/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/dataframe.py in limit(self, num)
    491 []
    492 """
--> 493 jdf = self._jdf.limit(num)
    494 return DataFrame(jdf, self.sql_ctx)
    495

/opt/k2-v02/lib/python3.6/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1255 answer = self.gateway_client.send_command(command)
   1256 return_value = get_return_value(
-> 1257 answer, self.gateway_client, self.target_id, self.name)
   1258
   1259 for temp_arg in temp_args:

/opt/k2-v02/lib/python3.6/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
     61 def deco(*a, **kw):
     62 try:
---> 63 return f(*a, **kw)
     64 except py4j.protocol.Py4JJavaError as e:
     65 s = e.java_exception.toString()

/opt/k2-v02/lib/python3.6/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    334 raise Py4JError(
    335 "An error occurred while calling {0}{1}{2}".
--> 336 format(target_id, ".", name))
    337 else:
    338 type = answer[1]

Py4JError: An error occurred while calling o29.limit
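The "Answer from Java side is empty" and "Connection refused" messages above suggest, though they do not prove, that nothing was listening on the gateway port any more when "df.take(1)" ran. A minimal standard-library sketch for checking that from the Python side follows; the function name gateway_reachable is hypothetical, and the host and port would be the ones shown in the log (127.0.0.1:37735). An open port only shows that something is listening there, not that it is a healthy Py4J gateway.

```python
import socket


def gateway_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper: it only tests that something accepts connections
    on the port; it does not speak the Py4J protocol.
    """
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For this report the call would be gateway_reachable("127.0.0.1", 37735); a False result would be consistent with the driver JVM having exited, which usually means the session has to be restarted.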
Attachments
Issue Links
- duplicates SPARK-23961 pyspark toLocalIterator throws an exception (Resolved)