Description
PIG-4074 added support for mapred.submit.replication, which sets the replication factor for files added to the distributed cache. The purpose is to keep a large number of task attempts from downloading the same file from HDFS at once during localization and slowing down because of contention over too few replicas. The replication factor was set correctly for distributed cache files, but registered jars are added to HDFS through a different code path that does not use the submit replication factor. As a result, localization time for jobs can increase by as much as 10 minutes (at which point the tasks are killed).
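To illustrate the gap described above, here is a minimal, self-contained sketch. It is not Pig or Hadoop code: the class SubmitReplicationSketch, its methods, the in-memory maps standing in for the job configuration and HDFS, and the fallback value of 3 are all assumptions made for illustration. It shows both shipping paths reading the same mapred.submit.replication setting, plus a rough estimate of how many task attempts share each replica.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the job configuration and HDFS; not actual Pig or Hadoop code.
class SubmitReplicationSketch {
    static final String SUBMIT_REPLICATION = "mapred.submit.replication";
    // Assumed fallback for this sketch when the property is unset.
    static final int FALLBACK_REPLICATION = 3;

    // Stand-in for the job configuration (property name -> value).
    final Map<String, String> conf = new HashMap<>();
    // Stand-in for HDFS metadata (file path -> replication factor).
    final Map<String, Integer> replication = new HashMap<>();

    // Reads the submit replication factor, falling back when it is not configured.
    int submitReplication() {
        String value = conf.get(SUBMIT_REPLICATION);
        return value == null ? FALLBACK_REPLICATION : Integer.parseInt(value);
    }

    // Path taken by files added to the distributed cache.
    void shipCacheFile(String path) {
        replication.put(path, submitReplication());
    }

    // Path taken by registered jars; the point is that it must apply the same factor.
    void shipRegisteredJar(String path) {
        replication.put(path, submitReplication());
    }

    // Rough upper bound on concurrent readers per replica, assuming an even spread.
    static int readersPerReplica(int taskAttempts, int replicas) {
        return (taskAttempts + replicas - 1) / replicas;
    }
}
```

Under these assumptions, 3000 task attempts localizing a jar with 3 replicas means about 1000 readers per replica, while a submit replication of 10 brings that down to about 300. In Hadoop itself the corresponding step would presumably go through FileSystem.setReplication(Path, short); the actual change in Pig may look different.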