Description
The JobConf object created in DStream.saveAsHadoopFiles is used concurrently in multiple places:
- The JobConf is updated by RDD.saveAsHadoopFile() before the job is launched.
- The JobConf is serialized as part of the DStream checkpoints.
These concurrent accesses (one thread updating the JobConf while another thread serializes it) can lead to a ConcurrentModificationException in the underlying Java HashMap used by the internal Hadoop Configuration object.
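The failure mode can be illustrated with a minimal, self-contained Java sketch. It is not Spark or Hadoop code: a plain HashMap stands in for the Configuration's internal map, the class name ConfCopyDemo and its methods are hypothetical, and the "other thread" is simulated by a callback that fires in the middle of iteration so that the outcome is deterministic. The second method shows one common remedy, serializing a private snapshot instead of the shared object; this is an assumption for illustration, not necessarily the fix adopted for this issue.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class ConfCopyDemo {
    // Stand-in for serialization: walks every entry of the map.
    // The midway hook runs after the first entry is read and imitates
    // an update arriving from another thread.
    static int serialize(Map<String, String> props, Runnable midway) {
        int written = 0;
        for (Map.Entry<String, String> entry : props.entrySet()) {
            if (written == 0) {
                midway.run();
            }
            written++;
        }
        return written;
    }

    // Hypothetical configuration contents used by both scenarios.
    static Map<String, String> newConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put("key.one", "1");
        conf.put("key.two", "2");
        conf.put("key.three", "3");
        return conf;
    }

    // Shared object: the update lands while the same map is being iterated,
    // so the fail-fast iterator throws.
    static boolean sharedFails() {
        Map<String, String> shared = newConf();
        try {
            serialize(shared, () -> shared.put("output.dir", "/tmp/out"));
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Defensive copy: the snapshot is iterated while the update goes to the
    // original, so serialization completes and sees the three copied entries.
    static int copySucceeds() {
        Map<String, String> shared = newConf();
        Map<String, String> snapshot = new HashMap<>(shared);
        return serialize(snapshot, () -> shared.put("output.dir", "/tmp/out"));
    }

    public static void main(String[] args) {
        System.out.println("shared map failed: " + sharedFails());
        System.out.println("entries written from snapshot: " + copySucceeds());
    }
}
```

In a real multi-threaded setting the snapshot itself has to be taken at a point where no other thread is mutating the original, for example by giving each job its own copy before it is launched.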