Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.4.8, 3.3.1
Fix Version/s: None
Description
We recently hit the thread-safety issue described in [SPARK-31511|https://issues.apache.org/jira/browse/SPARK-31511] in production. The version we run, 2.4.3, does not include the fix, and the bug led to incorrect data. I consider this a serious error: the data broadcast by the Driver is inconsistent with the data on the Executor. The Executor should verify the data as it reads it: the numKeys and numValues read from the file header should match the number of keys and values actually read. This check should be added so that wrong data is never used in a computation.
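
The proposed check can be illustrated with a minimal, language-neutral sketch. It is not Spark's code and does not reproduce Spark's actual serialization format: the byte layout (a header holding numKeys and numValues, followed by one record per key) and the names `write_relation` and `read_relation` are assumptions made up for this illustration. The point is only that the reader counts what it really reads and refuses the data if the counts disagree with the header.

```python
import io
import struct

# Hypothetical layout, assumed for illustration only (not Spark's real format):
#   header : numKeys (int64), numValues (int64)
#   record : key (int64), value count (int32), then that many int64 values
HEADER = struct.Struct(">qq")
KEY = struct.Struct(">qi")
VALUE = struct.Struct(">q")


def write_relation(mapping):
    """Serialize {key: [values]} with a header that declares the counts."""
    out = io.BytesIO()
    num_values = sum(len(values) for values in mapping.values())
    out.write(HEADER.pack(len(mapping), num_values))
    for key, values in mapping.items():
        out.write(KEY.pack(key, len(values)))
        for value in values:
            out.write(VALUE.pack(value))
    return out.getvalue()


def read_relation(payload):
    """Deserialize and verify the header counts against the data actually read."""
    try:
        num_keys, num_values = HEADER.unpack_from(payload, 0)
        offset = HEADER.size
        mapping = {}
        keys_read = 0
        values_read = 0
        while offset < len(payload):
            key, count = KEY.unpack_from(payload, offset)
            offset += KEY.size
            values = []
            for _ in range(count):
                (value,) = VALUE.unpack_from(payload, offset)
                offset += VALUE.size
                values.append(value)
            mapping[key] = values
            keys_read += 1
            values_read += len(values)
    except struct.error as exc:
        # The payload ended in the middle of a header or record.
        raise ValueError(f"truncated payload: {exc}") from exc

    # The consistency check proposed in this issue: fail instead of computing on bad data.
    if keys_read != num_keys or values_read != num_values:
        raise ValueError(
            f"header declares {num_keys} keys / {num_values} values, "
            f"but {keys_read} keys / {values_read} values were read"
        )
    return mapping
```

In this sketch a mismatch raises an error on the reading side, so a corrupted or partially written payload is rejected before any computation uses it. Whether the real fix should fail the task or trigger a re-fetch of the broadcast is a design choice left open here.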