Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Fix Version: 1.0.0
Description
I ran a microbenchmark in which concurrent clients read chunks from a DataNode.
As the number of clients grows, accessing a concurrent hash map introduces significant overhead, and the overhead grows exponentially with the number of clients.
@VisibleForTesting
static <T> T processFileExclusively(Path path, Supplier<T> op) {
  for (;;) {
    if (LOCKS.add(path)) {
      break;
    }
  }
  try {
    return op.get();
  } finally {
    LOCKS.remove(path);
  }
}
In my test, 64 concurrent clients reading chunks from a 1-disk DataNode caused the DN to spend nearly half of its time adding entries to the LOCKS object (a concurrent hash map).
Given that HDFS DataNodes commonly serve tens of thousands of incoming client connections, I expect an Ozone DataNode to see similar traffic at scale.
We should fix this code.
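One common remedy for this kind of contention is lock striping: instead of all threads spinning on a single shared set, each path hashes to one lock in a fixed array, so clients working on different files rarely touch the same lock and waiters block instead of busy-waiting. The sketch below is a hypothetical illustration of that idea, not the fix that was actually committed; the class name, stripe count, and hash mixing are all assumptions.

import java.nio.file.Path;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch: striped locks instead of a spin loop on a shared set.
final class StripedFileLocks {
  // Power of two so stripe selection is a cheap bit mask.
  private static final int STRIPES = 64;
  private static final ReentrantLock[] LOCKS = new ReentrantLock[STRIPES];
  static {
    for (int i = 0; i < STRIPES; i++) {
      LOCKS[i] = new ReentrantLock();
    }
  }

  static <T> T processFileExclusively(Path path, Supplier<T> op) {
    // Spread the hash bits so nearby paths don't all land on one stripe.
    int h = path.hashCode();
    ReentrantLock lock = LOCKS[(h ^ (h >>> 16)) & (STRIPES - 1)];
    lock.lock();  // blocks instead of spinning when the stripe is held
    try {
      return op.get();
    } finally {
      lock.unlock();
    }
  }
}

The trade-off is that two distinct paths can hash to the same stripe and serialize unnecessarily, but with a reasonable stripe count that collision cost is far cheaper than 64 clients busy-waiting on one concurrent set.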
Attachments
Issue Links
- is broken by
  - HDDS-2026 Overlapping chunk region cannot be read concurrently (Resolved)
- links to