Details
- Type: Bug
- Status: Resolved
- Priority: P2
- Resolution: Fixed
Description
Problem
Dataflow Java batch jobs with large side inputs intermittently throw NullPointerException or IllegalStateException.
- The NullPointerException is thrown at IsmReaderImpl.overKeyComponents.
- The IllegalStateException is thrown at IsmReaderImpl.initializeForKeyedRead.
(All error logs from the Dataflow job are here.)
Hypothesis
initializeForKeyedRead is not synchronized, so multiple threads can enter it at the same time, initialize the index for the same shard, and update indexPerShard without synchronization. overKeyComponents also reads indexPerShard without synchronization. Because indexPerShard is a plain HashMap, which is not thread-safe, these concurrent accesses can cause the NullPointerException and IllegalStateException above.
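The suspected pattern can be sketched as follows. This is a simplified, hypothetical stand-in and not the actual IsmReaderImpl code: the class name UnsafeShardIndexes, the long[] index type, and the loadIndex helper are assumptions made for illustration. Only the method and field names come from this report.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the reader; not the real IsmReaderImpl. */
class UnsafeShardIndexes {
  // Plain HashMap shared by every thread that reads the side input, with no lock.
  private final Map<Integer, long[]> indexPerShard = new HashMap<>();

  /** Check-then-act with no synchronization: two threads can both pass the check. */
  void initializeForKeyedRead(Integer shardId) {
    if (!indexPerShard.containsKey(shardId)) {
      // Both threads may reach this put and mutate the HashMap concurrently.
      indexPerShard.put(shardId, loadIndex(shardId));
    }
  }

  /** Unsynchronized read: may run while another thread is modifying the map. */
  long[] overKeyComponents(Integer shardId) {
    return indexPerShard.get(shardId);
  }

  // Assumed placeholder for reading a shard's index from the file.
  private long[] loadIndex(Integer shardId) {
    return new long[] {shardId};
  }
}
```

Run from a single thread, this behaves correctly. With several threads, the HashMap documentation gives no guarantees for unsynchronized structural modification, so a reader can see a missing entry for a shard that another thread believes it has already initialized, which would be consistent with the intermittent exceptions.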
Suggestion
I think changing the type of indexPerShard to a thread-safe map (e.g. ConcurrentHashMap) would fix this issue.
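A minimal sketch of what the suggestion could look like, under the same assumptions as the description (hypothetical class SafeShardIndexes, long[] as the index type, and a placeholder loadIndex helper). It is not the change that was actually merged. The loads counter exists only so the example can show how many times a shard index is loaded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified stand-in for the reader; not the real IsmReaderImpl. */
class SafeShardIndexes {
  // Thread-safe map: individual reads and writes need no external lock.
  private final Map<Integer, long[]> indexPerShard = new ConcurrentHashMap<>();

  // Illustration only: counts how often a shard index is actually loaded.
  final AtomicInteger loads = new AtomicInteger();

  /** Atomically loads the index for a shard the first time it is requested. */
  long[] initializeForKeyedRead(Integer shardId) {
    return indexPerShard.computeIfAbsent(shardId, id -> loadIndex(id));
  }

  /** Safe to call concurrently with initializeForKeyedRead. */
  long[] overKeyComponents(Integer shardId) {
    return indexPerShard.get(shardId);
  }

  // Assumed placeholder for reading a shard's index from the file.
  private long[] loadIndex(Integer shardId) {
    loads.incrementAndGet();
    return new long[] {shardId};
  }
}
```

Changing only the map type makes each get and put safe, but a separate containsKey check followed by put could still let two threads load the same shard. ConcurrentHashMap.computeIfAbsent performs the check and the insert as one atomic step, so the loader runs at most once per key.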