Details
Type: Bug
Status: Open
Priority: Blocker
Resolution: Unresolved
Affects Version/s: 3.3.0
Fix Version/s: None
Description
The change in SPARK-37820 shipped in Spark 3.3 and alters the output of base64. The change itself is fine, but it should not happen between minor versions.
Spark 3.2
>>> spark.sql(f"""SELECT base64('{'a' * 58}') AS base64""").collect()[0][0]
'YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQ=='
Note the different output in Spark 3.3: a \r\n line break is inserted after the first 76 characters of the encoded string.
Spark 3.3
>>> spark.sql(f"""SELECT base64('{'a' * 58}') AS base64""").collect()[0][0]
'YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFh\r\nYQ=='
The Spark 3.2 output decodes fine with the base64 utility on my machine, but the Spark 3.3 output does not:
$ pbpaste | base64 --decode
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa%
$ pbpaste | base64 --decode
base64: stdin: (null): error decoding base64 input stream
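The difference can be reproduced without a Spark cluster. The following is a minimal sketch in plain Python, not Spark's actual implementation: the helper names (encode_unchunked, encode_mime_chunked, decode_strict, strip_line_breaks) are hypothetical, and the 76-character width with a \r\n separator is an assumption inferred from the Spark 3.3 output above. The strict decoder stands in for any decoder that rejects characters outside the base64 alphabet; it is not a model of the base64 command-line tool.

```python
import base64
import binascii


def encode_unchunked(data: bytes) -> str:
    """Encode to a single base64 line, as in the Spark 3.2 output."""
    return base64.b64encode(data).decode("ascii")


def encode_mime_chunked(data: bytes, width: int = 76, sep: str = "\r\n") -> str:
    """Encode, then join fixed-width pieces with a separator.

    Assumption: width=76 and sep='\\r\\n' mirror the Spark 3.3 output.
    """
    flat = encode_unchunked(data)
    return sep.join(flat[i:i + width] for i in range(0, len(flat), width))


def decode_strict(text: str) -> bytes:
    """Decode, raising binascii.Error on any non-alphabet character."""
    return base64.b64decode(text, validate=True)


def strip_line_breaks(text: str) -> str:
    """Remove CR and LF so a strict decoder accepts chunked input."""
    return text.replace("\r", "").replace("\n", "")


payload = b"a" * 58
flat = encode_unchunked(payload)
chunked = encode_mime_chunked(payload)

print(repr(flat))
print(repr(chunked))

# The single-line form round-trips through the strict decoder.
print(decode_strict(flat) == payload)

# The chunked form is rejected because of the embedded line break.
try:
    decode_strict(chunked)
except binascii.Error as exc:
    print("strict decode failed:", exc)

# Stripping the line breaks first restores the round trip.
print(decode_strict(strip_line_breaks(chunked)) == payload)
```

Stripping the line breaks before decoding is one possible consumer-side workaround; it does not address the change in behavior between versions that this issue reports.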