  1. Hadoop Common
  2. HADOOP-1296

Improve interface to FileSystem.getFileCacheHints

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • None
    • None
    • Component/s: fs
    • None

    Description

      The FileSystem interface offers only a very limited way to find where a file's data is located. The current method is:

      String[][] getFileCacheHints(Path file, long start, long len) throws IOException

      which returns a list of "block info" entries, each consisting of a list of host names. Because the hints don't say where the block boundaries are, map/reduce has to call the name node once for each split. I propose that we fix the naming a bit and change it to:

      public class BlockInfo implements Writable {
        public long getStart();      // offset in the file where the block starts
        public String[] getHosts();  // host names that hold the block
      }

      BlockInfo[] getFileHints(Path file, long start, long len) throws IOException;

      With this interface, map/reduce can query the entire file and get the locations of all of its blocks in a single call.
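
      The single-call idea can be illustrated with a minimal, self-contained sketch. It is not the Hadoop implementation: the BlockInfo here is a plain stand-in for the proposed class (the Writable plumbing is left out), and the FileHintsSketch class, the hostsForOffset helper, the host names, and the 64 MB block size are all hypothetical. It assumes the array returned by one getFileHints call over the whole file is non-empty and sorted by start offset, so each split can be matched to its block locally without further name node calls.

```java
final class FileHintsSketch {

    /** Hypothetical stand-in for the proposed BlockInfo (no Writable support). */
    static final class BlockInfo {
        private final long start;
        private final String[] hosts;

        BlockInfo(long start, String[] hosts) {
            this.start = start;
            this.hosts = hosts;
        }

        long getStart() { return start; }
        String[] getHosts() { return hosts; }
    }

    /**
     * Returns the hosts of the block that contains the given file offset.
     * Assumes blocks is non-empty, sorted by start offset, and that
     * blocks[0].getStart() <= offset.
     */
    static String[] hostsForOffset(BlockInfo[] blocks, long offset) {
        int lo = 0;
        int hi = blocks.length - 1;
        // Binary search for the last block whose start is <= offset.
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (blocks[mid].getStart() <= offset) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return blocks[lo].getHosts();
    }

    public static void main(String[] args) {
        long blockSize = 64L << 20; // assumed 64 MB blocks, for illustration only
        // Pretend this array came back from one getFileHints call on the whole file.
        BlockInfo[] blocks = {
            new BlockInfo(0, new String[] {"hostA", "hostB"}),
            new BlockInfo(blockSize, new String[] {"hostB", "hostC"}),
            new BlockInfo(2 * blockSize, new String[] {"hostA", "hostC"}),
        };

        long fileLength = 160L << 20; // hypothetical file length
        long splitSize = 32L << 20;   // hypothetical split size
        // Every split is resolved from the cached array, with no extra remote calls.
        for (long off = 0; off < fileLength; off += splitSize) {
            System.out.println("split at " + off + " -> "
                + String.join(",", hostsForOffset(blocks, off)));
        }
    }
}
```

      With the current String[][] result, the same lookup is not possible on the client side, because the host lists carry no start offsets to search on.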

            People

              Assignee: Dhruba Borthakur (dhruba)
              Reporter: Owen O'Malley (omalley)
              Votes: 0
              Watchers: 0