  1. Hadoop Map/Reduce
  2. MAPREDUCE-2011

Reduce the number of getFileStatus calls made from every task (TaskDistributedCache) setup

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: distributed-cache
    • Labels: None

    Description

      On our cluster, we had jobs with 20 distributed cache files and very short-lived tasks. About 500 map tasks were launched per second, which resulted in 10,000 getFileStatus calls per second to the NameNode (20 files × 500 tasks). The NameNode can handle this load, but I am asking whether we can reduce it somehow.
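
      To make the arithmetic concrete, here is a minimal sketch of one way the call count could shrink: memoizing per-path status lookups for a short time-to-live. This is not Hadoop code and not something the issue settled on (it was resolved as Won't Fix). `CachedStatusLookup`, `fake_get_file_status`, `simulate`, the 60-second TTL, and the `/cache/file-N` paths are all hypothetical names and values chosen for illustration. The simulation runs in a single process, so it models at best one node; a real cluster spreads tasks over many nodes, and a cached status can be stale if a cache file changes within the TTL.

```python
import time


class CachedStatusLookup:
    """Memoize per-path status lookups for a short time-to-live (TTL).

    Hypothetical helper: `fetch` stands in for a remote getFileStatus call.
    """

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # path -> (time fetched, status)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no remote call
        status = self._fetch(path)
        self._entries[path] = (now, status)
        return status


def simulate(tasks=500, cache_files=20):
    """Count remote calls for one second of task launches, without and with the cache."""
    calls = 0

    def fake_get_file_status(path):
        # Stand-in for the remote call; it only counts invocations.
        nonlocal calls
        calls += 1
        return {"path": path, "mtime": 0}

    paths = [f"/cache/file-{i}" for i in range(cache_files)]

    # Without a cache, every task setup asks about every cache file.
    for _ in range(tasks):
        for p in paths:
            fake_get_file_status(p)
    uncached = calls

    # With the cache, a fixed clock keeps every entry inside its TTL.
    calls = 0
    lookup = CachedStatusLookup(fake_get_file_status, ttl_seconds=60.0, clock=lambda: 0.0)
    for _ in range(tasks):
        for p in paths:
            lookup.get(p)
    return uncached, calls


if __name__ == "__main__":
    uncached, cached = simulate()
    print(f"uncached calls: {uncached}, cached calls: {cached}")
```

      With the defaults, the uncached loop makes one call per task per file, while the cached loop makes one call per file. The trade-off is freshness: any check that relies on getFileStatus to detect a changed cache file would only see the change after the TTL expires.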

          People

            Assignee: Unassigned
            Reporter: Koji Noguchi (knoguchi)
            Votes: 0
            Watchers: 8

            Dates

              Created:
              Updated:
              Resolved: