When running in cluster mode, the driver may have different memory settings (and other configs) than the executors. Also, if Kryo is used, strings cannot be collected back to the driver:
>>> sqlContext.range(10).selectExpr("repeat(cast(id as string), 9)").show()
+----------------------------+
|repeat(cast(id as string),9)|
+----------------------------+
|                           0|
|                           1|
|                           2|
|                           3|
|                           4|
|                           5|
|                           6|
|                           7|
|                           8|
|                           9|
+----------------------------+
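For reference, the conditions described above could be set up with a configuration along these lines. This is only a sketch: the property names are standard Spark settings, but the memory sizes are assumed values, chosen so that the executor heap is above 32G (the threshold named in SPARK-14524) while the driver heap is well below it. It has not been confirmed as a reproduction.

```properties
# spark-defaults.conf (illustrative values, not taken from the report)

# Run the driver inside the cluster rather than on the submitting machine.
spark.submit.deployMode   cluster

# Driver and executors deliberately get different heap sizes.
# 4g and 40g are assumptions; the point is that only one side exceeds 32G.
spark.driver.memory       4g
spark.executor.memory     40g

# Use Kryo instead of the default Java serializer.
spark.serializer          org.apache.spark.serializer.KryoSerializer
```

The same properties can also be passed to spark-submit with --conf key=value.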
- duplicates
- SPARK-11657 Bad Dataframe data read from parquet (Resolved)
- SPARK-14524 In SparkSQL, it can't be select column of String type because of UTF8String when setting more than 32G for executors. (Closed)
- links to