Pig / PIG-4850

Registered jars do not use submit replication

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: impl
    • Labels: None

    Description

      PIG-4074 added support for mapred.submit.replication, which sets the replication factor for files added to the distributed cache. A higher factor keeps a large number of task attempts from downloading the same HDFS file at once during localization and slowing down through contention over too few replicas. Files added to the distributed cache got the correct replication factor, but registered jars are added to HDFS through a different code path that did not use the submit replication factor. As a result, localization time for jobs can increase by as much as 10 minutes (at which point the tasks are killed).
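
The failure mode described above can be sketched with a small in-memory model. This is not Pig's or Hadoop's actual code: `FakeFs`, `shipToCache`, `submitReplication`, and the paths are hypothetical names invented for illustration, and the fallback values (10 for the submit replication, 3 for the file system default) are assumptions of this sketch. The point is only that every upload path, including the one for registered jars, should go through the same helper that applies the configured factor.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified in-memory model of the idea; hypothetical names throughout.
final class SubmitReplicationSketch {
    static final String SUBMIT_REPLICATION_KEY = "mapred.submit.replication";
    // Assumed fallbacks for this sketch only.
    static final short DEFAULT_SUBMIT_REPLICATION = 10;
    static final short DEFAULT_FS_REPLICATION = 3;

    /** Hypothetical stand-in for a distributed file system: path -> replication factor. */
    static final class FakeFs {
        final Map<String, Short> replication = new HashMap<>();

        // A plain copy leaves the file at the file system's default replication.
        void copyFromLocal(String dst) {
            replication.put(dst, DEFAULT_FS_REPLICATION);
        }

        void setReplication(String dst, short factor) {
            replication.put(dst, factor);
        }
    }

    /** Reads the submit replication factor from a key/value configuration. */
    static short submitReplication(Map<String, String> conf) {
        String value = conf.get(SUBMIT_REPLICATION_KEY);
        return value == null ? DEFAULT_SUBMIT_REPLICATION : Short.parseShort(value.trim());
    }

    /** Shared upload helper: copy the file, then raise its replication factor. */
    static void shipToCache(FakeFs fs, Map<String, String> conf, String dst) {
        fs.copyFromLocal(dst);
        fs.setReplication(dst, submitReplication(conf));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SUBMIT_REPLICATION_KEY, "20");

        FakeFs fs = new FakeFs();
        // Cache files and registered jars use the same helper.
        shipToCache(fs, conf, "/tmp/cache/lookup.dat");
        shipToCache(fs, conf, "/tmp/cache/myudfs.jar");

        System.out.println("lookup.dat replication: " + fs.replication.get("/tmp/cache/lookup.dat"));
        System.out.println("myudfs.jar replication: " + fs.replication.get("/tmp/cache/myudfs.jar"));
    }
}
```

In this model, the bug corresponds to a jar upload path that calls `copyFromLocal` directly and skips `setReplication`, leaving the jar at the lower default factor.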

      Attachments

        1. PIG-4850.1.patch
          1 kB
          Ryan Blue

          People

            Assignee: Ryan Blue (rdblue)
            Reporter: Ryan Blue (rdblue)
            Votes: 0
            Watchers: 2
