Details
Type: Improvement
Status: Resolved
Priority: Critical
Resolution: Fixed
Fix Version/s: 3.3.1, 3.4.0
Description
Currently, DistCp uses a producer-consumer setup while building the listing. The input queue and the output queue are both unbounded, so the listStatus results held in memory can grow very large.
Relevant code part:
The listing proceeds as a breadth-first traversal (it uses a queue instead of the earlier stack), so if the files sit at a lower depth, it opens up the entire tree before it starts processing them.
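To illustrate why the queue-based traversal holds so much more in memory than the stack-based one, here is a minimal sketch on a synthetic tree. It is not DistCp's listing code: the class `TraversalFootprint`, its methods, and the uniform `fanout`/`depth` tree are all hypothetical, and a pending entry is reduced to the level of a directory that still has to be listed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; names and tree shape are assumptions,
// not taken from DistCp.
class TraversalFootprint {

    // Peak number of pending directories with a FIFO queue (breadth-first).
    // Every directory above `depth` has `fanout` subdirectories.
    static long peakBreadthFirst(int fanout, int depth) {
        Deque<Integer> pending = new ArrayDeque<>();
        pending.addLast(0); // root directory at level 0
        long peak = 1;
        while (!pending.isEmpty()) {
            int level = pending.pollFirst(); // oldest entry first
            if (level < depth) {
                for (int i = 0; i < fanout; i++) {
                    pending.addLast(level + 1);
                }
            }
            peak = Math.max(peak, pending.size());
        }
        return peak;
    }

    // Peak number of pending directories with a LIFO stack (depth-first).
    static long peakDepthFirst(int fanout, int depth) {
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(0); // root directory at level 0
        long peak = 1;
        while (!pending.isEmpty()) {
            int level = pending.pop(); // newest entry first
            if (level < depth) {
                for (int i = 0; i < fanout; i++) {
                    pending.push(level + 1);
                }
            }
            peak = Math.max(peak, pending.size());
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("queue peak: " + peakBreadthFirst(10, 4));
        System.out.println("stack peak: " + peakDepthFirst(10, 4));
    }
}
```

With a fanout of 10 and a depth of 4, the queue peaks at 10000 pending entries (the whole deepest level), while the stack peaks at 37. The queue's footprint follows the widest level of the tree, whereas the stack's footprint grows only with fanout times depth.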
Attachments
Issue Links
causes
- HADOOP-17628 Distcp contract test is really slow with ABFS and S3A; timing out (Resolved)
- HBASE-25900 HBoss tests compile/failure against Hadoop 3.3.1 (Resolved)
is related to
- HADOOP-11827 Speed-up distcp buildListing() using threadpool (Resolved)
relates to
- HADOOP-17558 DistCp: Reduce memory usage using a fixed size ThreadPoolExecutor (Open)
- HIVE-24852 Add support for Snapshots during external table replication (Resolved)
links to