  1. Hadoop Map/Reduce
  2. MAPREDUCE-2011

Reduce number of getFileStatus calls made from every task (TaskDistributedCache) setup

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: distributed-cache
    • Labels:
      None

      Description

      On our cluster, we had jobs with 20 distributed cache entries and very short-lived tasks. About 500 map tasks were launched per second, and each task setup checked every entry, resulting in 10,000 getFileStatus calls to the namenode (20 entries × 500 tasks). The namenode can handle this load, but I am asking whether we can reduce it somehow.
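
      The figure in the description is simple multiplication. The sketch below works through it and then models one possible mitigation: memoizing the status lookup so that tasks sharing a cache reuse a single result per path. It is an illustration in Python, not Hadoop code. `CountingNamenode`, `StatusCache`, `run_tasks`, and the `/cache/file-N` paths are hypothetical names introduced here, and the sketch assumes all 500 tasks can share one cache, which would not hold across separate nodes.

```python
from typing import Callable, Dict

CACHE_ENTRIES = 20      # distributed cache entries per job (from the report)
TASKS_PER_SECOND = 500  # map tasks launched per second (from the report)


class CountingNamenode:
    """Hypothetical stand-in for the namenode; it only counts lookups."""

    def __init__(self) -> None:
        self.calls = 0

    def get_file_status(self, path: str) -> dict:
        self.calls += 1
        return {"path": path, "mtime": 0}  # placeholder status record


class StatusCache:
    """Hypothetical memoizing wrapper: one backend lookup per distinct path."""

    def __init__(self, lookup: Callable[[str], dict]) -> None:
        self._lookup = lookup
        self._seen: Dict[str, dict] = {}

    def get_file_status(self, path: str) -> dict:
        if path not in self._seen:
            self._seen[path] = self._lookup(path)
        return self._seen[path]


def run_tasks(lookup: Callable[[str], dict]) -> None:
    """Simulate one second of task setup: every task checks every entry."""
    # Paths are made up for illustration only.
    paths = [f"/cache/file-{i}" for i in range(CACHE_ENTRIES)]
    for _ in range(TASKS_PER_SECOND):
        for path in paths:
            lookup(path)


# Every task asks the namenode directly, as described in the report.
uncached = CountingNamenode()
run_tasks(uncached.get_file_status)

# All tasks go through one shared memoizing cache (an assumption).
cached = CountingNamenode()
run_tasks(StatusCache(cached.get_file_status).get_file_status)

print(uncached.calls)  # 10000
print(cached.calls)    # 20
```

      A cache like this trades freshness for load: a file changed after the first lookup would go unnoticed until the cache is cleared, so any real change along these lines would need an invalidation rule.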

            People

            • Assignee:
              Unassigned
            • Reporter:
              Koji Noguchi (knoguchi)
            • Votes:
              0
            • Watchers:
              8
