  1. Hadoop Map/Reduce
  2. MAPREDUCE-6219

Reduce memory required for FileInputFormat located status optimization

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.1.1-beta
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      MAPREDUCE-1981 introduced an optimization that drastically reduces the number of namenode operations required to compute input splits when processing a directory. However, the optimization requires more memory, because it retains the full LocatedFileStatus object for every input file while the splits are computed. This can lead to surprising behavior for users: specifying a directory as input can run the job client out of heap space, while specifying directory/* as the input spec lets the same job run within the original heap size.
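      The memory tradeoff described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Hadoop code: `BlockLocation`, `LocatedStatus`, `Split`, `list_located`, `splits_retaining_all`, and `splits_streaming` are all hypothetical names invented for this example, and the "peak" figure simply counts how many block-location records the listing phase keeps alive at once. It contrasts collecting every located status before building splits with converting each status to splits as it arrives.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class BlockLocation:
    # Hypothetical stand-in for per-block location metadata.
    offset: int
    length: int
    hosts: Tuple[str, ...]


@dataclass
class LocatedStatus:
    # Hypothetical stand-in for a file status that carries block locations.
    path: str
    length: int
    blocks: List[BlockLocation]


@dataclass
class Split:
    # Hypothetical stand-in for an input split (one per block here).
    path: str
    start: int
    length: int
    hosts: Tuple[str, ...]


def list_located(num_files: int, blocks_per_file: int,
                 block_size: int) -> Iterator[LocatedStatus]:
    """Simulated directory listing that yields one located status per file."""
    for i in range(num_files):
        blocks = [
            BlockLocation(b * block_size, block_size, (f"host{(i + b) % 3}",))
            for b in range(blocks_per_file)
        ]
        yield LocatedStatus(f"/input/part-{i:05d}",
                            blocks_per_file * block_size, blocks)


def splits_retaining_all(listing: Iterable[LocatedStatus]):
    """Collect every located status first, then build the splits."""
    statuses = list(listing)  # all statuses and their block lists stay live
    peak = sum(len(s.blocks) for s in statuses)
    splits = [Split(s.path, b.offset, b.length, b.hosts)
              for s in statuses for b in s.blocks]
    return splits, peak


def splits_streaming(listing: Iterable[LocatedStatus]):
    """Build splits for each status as it arrives, then let the status go."""
    splits: List[Split] = []
    peak = 0
    for s in listing:
        peak = max(peak, len(s.blocks))  # only one status is live at a time
        splits.extend(Split(s.path, b.offset, b.length, b.hosts)
                      for b in s.blocks)
    return splits, peak
```

      In this model both functions produce identical splits; for 100 files of 4 blocks each, the first holds 400 block-location records during listing while the second holds at most 4. The splits themselves still carry host information in both cases, so the saving applies only to the retained status objects, and how closely this maps onto the real job client is an assumption.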

              People

              • Assignee:
                Unassigned
                Reporter:
                jlowe Jason Darrell Lowe
              • Votes:
                0
                Watchers:
                6

                Dates

                • Created:
                  Updated: