Description
Terasort is very slow on S3 because the sort stage still uses the classic rename-to-commit algorithm, even though teragen and the reporting can use the new committer.
Reason: org.apache.hadoop.examples.terasort.TeraOutputFormat overrides getOutputCommitter even though it doesn't need to.
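To illustrate why an unnecessary override has this effect, here is a minimal self-contained sketch. It is a simplified model, not Hadoop's actual classes: BaseOutputFormat, OverridingFormat, InheritingFormat, the scheme registry and the committer names are all hypothetical stand-ins. It assumes the base class picks a committer from the output path's filesystem scheme, and shows that a subclass overriding getOutputCommitter bypasses that selection.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class CommitterSelectionSketch {

    // Hypothetical base format: chooses a committer by the output path's scheme.
    static class BaseOutputFormat {
        private final Map<String, String> committersByScheme;

        BaseOutputFormat(Map<String, String> committersByScheme) {
            this.committersByScheme = new HashMap<>(committersByScheme);
        }

        String getOutputCommitter(String outputPath) {
            String scheme = URI.create(outputPath).getScheme();
            // Fall back to the classic rename-based committer when no
            // scheme-specific committer is registered.
            return committersByScheme.getOrDefault(scheme, "rename-based");
        }
    }

    // Models the problem: the override hard-codes the classic committer,
    // so the scheme-specific selection in the base class is never consulted.
    static class OverridingFormat extends BaseOutputFormat {
        OverridingFormat(Map<String, String> committersByScheme) {
            super(committersByScheme);
        }

        @Override
        String getOutputCommitter(String outputPath) {
            return "rename-based";
        }
    }

    // Models the fix: no override, so the base class selection applies.
    static class InheritingFormat extends BaseOutputFormat {
        InheritingFormat(Map<String, String> committersByScheme) {
            super(committersByScheme);
        }
    }

    // Hypothetical registry mapping the "s3a" scheme to its own committer.
    static Map<String, String> demoRegistry() {
        Map<String, String> registry = new HashMap<>();
        registry.put("s3a", "s3a-specific");
        return registry;
    }

    public static void main(String[] args) {
        String out = "s3a://bucket/terasort-out";
        System.out.println(new OverridingFormat(demoRegistry()).getOutputCommitter(out));
        System.out.println(new InheritingFormat(demoRegistry()).getOutputCommitter(out));
    }
}
```

In this model the overriding format returns the rename-based committer for an s3a path, while the format that simply inherits the base method gets the scheme-specific one. Removing the override is enough to change the result.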
Issue Links
- Is contained by: HADOOP-16058 S3A tests to include Terasort (Resolved)
- is duplicated by: HADOOP-16153 Allow TeraGen to use schema-specific output committer (Resolved)
- is part of: HADOOP-16058 S3A tests to include Terasort (Resolved)