Hadoop Map/Reduce / MAPREDUCE-7016

Avoid making separate RPC calls for FileStatus and block locations in FileInputFormat


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      FileInputFormat::getSplits uses FileSystem::globStatus to determine its inputs. When the glob matches directories, each one is traversed and its files are returned as LocatedFileStatus instances that already carry their block locations. When the glob matches files directly, however, each is returned as a plain FileStatus, and a second RPC is required to obtain its block locations.
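
      The cost difference described above can be illustrated with a small, self-contained toy model. This is not Hadoop code: `CountingFs` and all of its methods are hypothetical stand-ins that only count simulated round trips, and the sketch assumes that a glob and a located directory listing each cost a single call, which a real file system may not guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not Hadoop code): counts simulated round trips per listing strategy. */
public class SplitRpcSketch {

    /** Hypothetical stand-in for a remote file system that counts its calls. */
    static class CountingFs {
        private final List<String> files;
        int rpcs = 0;

        CountingFs(List<String> files) {
            this.files = files;
        }

        // Models a glob that matches files: one call, statuses WITHOUT locations.
        List<String> globFiles() {
            rpcs++;
            return new ArrayList<>(files);
        }

        // Models a glob that matches a single directory: one call.
        String globDirectory() {
            rpcs++;
            return "dir";
        }

        // Models a per-file block location lookup: one call per file.
        String blockLocations(String file) {
            rpcs++;
            return file + "@hosts";
        }

        // Models a located listing of a directory: statuses and locations together
        // (assumed here to cost one call).
        List<String> listLocated() {
            rpcs++;
            List<String> located = new ArrayList<>();
            for (String f : files) {
                located.add(f + "@hosts");
            }
            return located;
        }
    }

    /** Glob matches files: one glob call plus one location lookup per file. */
    static int rpcsWhenGlobMatchesFiles(List<String> files) {
        CountingFs fs = new CountingFs(files);
        for (String f : fs.globFiles()) {
            fs.blockLocations(f);
        }
        return fs.rpcs;
    }

    /** Glob matches a directory: one glob call plus one located listing. */
    static int rpcsWhenGlobMatchesDirectory(List<String> files) {
        CountingFs fs = new CountingFs(files);
        fs.globDirectory();
        fs.listLocated();
        return fs.rpcs;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("part-0", "part-1", "part-2");
        System.out.println("glob matches files:     " + rpcsWhenGlobMatchesFiles(files));
        System.out.println("glob matches directory: " + rpcsWhenGlobMatchesDirectory(files));
    }
}
```

      Under these assumptions, the file case grows linearly with the number of matched files, while the directory case stays constant. That gap is what this issue proposes to close.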


          People

            Assignee: Unassigned
            Reporter: Christopher Douglas (cdouglas)
            Votes: 0
            Watchers: 5
