Details
Type: Sub-task
Status: Patch Available
Priority: Major
Resolution: Unresolved
2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0
None
Description
While profiling a 1 TB TeraGen job on a Hadoop 2.8.2 cluster (Google Dataproc, 2 workers, GCS connector), I saw that the FileSystem.Statistics code paths accounted for 5.58% of wall time and 26.5% of CPU time of the total execution time.
After switching the FileSystem.Statistics implementation to LongAdder, these code paths dropped to 0.006% of wall time and 0.104% of CPU time of the total execution time.
Total job runtime decreased from 66 mins to 61 mins.
These results are not conclusive, because I didn't run the benchmark multiple times and average the results. Regardless of the performance gains, switching to LongAdder simplifies the code and reduces its complexity.
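To illustrate the idea, here is a minimal sketch of what LongAdder-backed counters could look like. It is not the attached patch and not the real FileSystem.Statistics class: the class name AdderStatistics and its method names are hypothetical, chosen only for this example. It assumes Java 8 or later, where java.util.concurrent.atomic.LongAdder is available.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical stand-in for per-filesystem statistics counters.
 * Not the Hadoop FileSystem.Statistics class; names are illustrative only.
 */
final class AdderStatistics {
    // LongAdder spreads concurrent updates over internal cells,
    // so many threads can increment without contending on one variable.
    private final LongAdder bytesRead = new LongAdder();
    private final LongAdder bytesWritten = new LongAdder();
    private final LongAdder readOps = new LongAdder();

    void incrementBytesRead(long count) {
        bytesRead.add(count);
    }

    void incrementBytesWritten(long count) {
        bytesWritten.add(count);
    }

    void incrementReadOps(int count) {
        readOps.add(count);
    }

    // sum() adds up the internal cells; it is not an atomic snapshot
    // while updates are in flight, which is usually acceptable for statistics.
    long getBytesRead() {
        return bytesRead.sum();
    }

    long getBytesWritten() {
        return bytesWritten.sum();
    }

    long getReadOps() {
        return readOps.sum();
    }

    void reset() {
        bytesRead.reset();
        bytesWritten.reset();
        readOps.reset();
    }
}
```

Each counter in this sketch is a single field with an add and a sum, which is the kind of simplification the description refers to. Whether it reproduces the profiled gains on other workloads would need to be measured.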
Attachments
Issue Links
- is related to: HADOOP-13065 Add a new interface for retrieving FS and FC Statistics (Resolved)
- links to