Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.6.2
- Fix Version/s: None
- Component/s: None
Description
I have reproduced the issue on Spark 1.6.2 and the latest 1.6.3-SNAPSHOT code; the same code works correctly on Spark 1.6.1.
I have a notebook on Databricks Community Edition that demonstrates the issue. The notebook depends on the library com.databricks:spark-csv_2.10:1.4.0.
The code uses some custom logic to join 4 dataframes. Calling show on the joined dataframe displays the data as expected, but after calling .cache the data is blanked. A minimal sketch of the repro shape is shown below.
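The following is a hypothetical sketch of the shape of the repro, not the reporter's actual notebook code. It assumes four CSV inputs loaded via the spark-csv data source; the paths, dataframe names, and the join key "id" are illustrative.

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Load a CSV file with the spark-csv data source (com.databricks:spark-csv_2.10:1.4.0).
def load(path: String) =
  sqlContext.read
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .load(path)

// Illustrative inputs; the actual notebook uses four dataframes built with custom code.
val df1 = load("/data/a.csv")
val df2 = load("/data/b.csv")
val df3 = load("/data/c.csv")
val df4 = load("/data/d.csv")

// Join the four dataframes on a shared key column (assumed here to be "id").
val joined = df1.join(df2, "id").join(df3, "id").join(df4, "id")

joined.show()   // rows display as expected
joined.cache()
joined.show()   // on 1.6.2 / 1.6.3-SNAPSHOT the rows come back blank; on 1.6.1 they do not
```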
Attachments
Issue Links
- duplicates SPARK-16664: Spark 1.6.2 - Persist call on Data frames with more than 200 columns is wiping out the data. (Resolved)