Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
2.6.5
None
None
Description
When distcp is used to copy a large number of small files, the NameNode naturally becomes a bottleneck.
The current distcp code is not optimized to minimize NameNode calls. We should restructure the code to make as few NameNode calls as possible, which should speed up the copy of small files.
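As a rough illustration of the kind of saving meant here, the sketch below compares one metadata request per file against one request per directory. It is not the distcp code path and does not use the Hadoop API: `FakeNameNode`, `getFileSize`, `listDirectory` and the `/src/part-N` paths are all hypothetical names made up for this example, and the "NameNode" is just an in-memory map with a call counter. A real listing of a very large directory may need more than one call, so treat the counts as illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NameNodeCallSketch {

    /** Hypothetical stand-in for a NameNode: an in-memory map that counts requests. */
    static class FakeNameNode {
        private final Map<String, Long> fileSizes = new TreeMap<>();
        int calls = 0;

        void addFile(String path, long size) {
            fileSizes.put(path, size);
        }

        /** One request per file (assumes the path was added earlier). */
        long getFileSize(String path) {
            calls++;
            return fileSizes.get(path);
        }

        /** One request returns the sizes of every file under the directory. */
        Map<String, Long> listDirectory(String dir) {
            calls++;
            String prefix = dir.endsWith("/") ? dir : dir + "/";
            Map<String, Long> result = new TreeMap<>();
            for (Map.Entry<String, Long> e : fileSizes.entrySet()) {
                if (e.getKey().startsWith(prefix)) {
                    result.put(e.getKey(), e.getValue());
                }
            }
            return result;
        }
    }

    /** Sums file sizes with one metadata request per file. */
    static long totalBytesPerFile(FakeNameNode nn, List<String> paths) {
        long total = 0;
        for (String p : paths) {
            total += nn.getFileSize(p);
        }
        return total;
    }

    /** Sums file sizes with a single metadata request for the whole directory. */
    static long totalBytesBatched(FakeNameNode nn, String dir) {
        long total = 0;
        for (long size : nn.listDirectory(dir).values()) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            String p = "/src/part-" + i;
            nn.addFile(p, 10L);
            paths.add(p);
        }

        long perFileTotal = totalBytesPerFile(nn, paths);
        int perFileCalls = nn.calls;

        nn.calls = 0; // reset the counter before the second strategy
        long batchedTotal = totalBytesBatched(nn, "/src");
        int batchedCalls = nn.calls;

        System.out.println("per-file: bytes=" + perFileTotal + " calls=" + perFileCalls);
        System.out.println("batched:  bytes=" + batchedTotal + " calls=" + batchedCalls);
    }
}
```

Both strategies compute the same total; only the number of simulated metadata requests differs, which is the cost this issue proposes to reduce.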
Attachments
Issue Links
- is depended upon by
  - HADOOP-15788 Improve Distcp for long-haul/cloud deployments (Open)