Details
- New Feature
- Status: Resolved
- Major
- Resolution: Duplicate
- 3.0.0
- None
- None
Description
Renames badly hurt performance in workloads such as Teragen/Terasort that use Hadoop's default output committers.
Stocator states that it does not create the temporary directories at all and still preserves Hadoop's fault tolerance. I integrated its code into s3a and added a switch that applies when creating files. With it, I got a 5x performance gain in Teragen and a 15% performance improvement in Terasort.
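To illustrate why the rename step is costly on an object store, here is a minimal toy sketch, not s3a or Stocator code. It assumes a store with no native rename, so a rename becomes a copy followed by a delete for every object. `ToyObjectStore`, `commit_with_rename`, and `commit_direct` are hypothetical names made up for this example, and the sketch ignores fault tolerance entirely.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store (hypothetical, not the S3 or s3a API)."""

    def __init__(self):
        self.objects = {}       # key -> bytes
        self.bytes_written = 0  # total bytes uploaded or copied

    def put(self, key, data):
        self.objects[key] = data
        self.bytes_written += len(data)

    def rename_prefix(self, src, dst):
        # Assumption: no native rename, so each object is copied to its
        # new key and the old key is then deleted.
        for key in [k for k in self.objects if k.startswith(src)]:
            self.put(dst + key[len(src):], self.objects[key])
            del self.objects[key]


def commit_with_rename(store, parts):
    # Write under a temporary prefix, then rename into the final location.
    for name, data in parts.items():
        store.put("out/_temporary/" + name, data)
    store.rename_prefix("out/_temporary/", "out/")


def commit_direct(store, parts):
    # Write straight to the final keys; no temporary directory, no rename.
    for name, data in parts.items():
        store.put("out/" + name, data)


parts = {"part-00000": b"x" * 1000, "part-00001": b"y" * 1000}
renamed, direct = ToyObjectStore(), ToyObjectStore()
commit_with_rename(renamed, parts)
commit_direct(direct, parts)
print(renamed.bytes_written, direct.bytes_written)  # 4000 2000
```

Both paths end with the same objects under `out/`, but the rename path moves every byte twice in this model. A real committer also has to handle failed and duplicate task attempts, which this sketch leaves out.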
Issue Links
- duplicates HADOOP-13786 Add S3A committers for zero-rename commits to S3 endpoints (Resolved)