  1. Hadoop Common
  2. HADOOP-451

Add a Split interface

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.2
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels: None

    Description

      The InputFormat interface has a method:

      FileSplit[] getSplits();

      This should change to:

      Split[] getSplits();

      The Split interface would look like:

      public interface Split extends Writable {

        /**
         * Returns a list of hosts that contain this split.
         * This is only used to optimize task placement, so it may be empty.
         */
        String[] getLocations(FileSystem fs);

        /**
         * The relative, estimated cost of operating on this split, typically the
         * size of the data in the split. Used to prioritize tasks in a job
         * (high-cost tasks are run first).
         */
        long getCost();
      }
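
      As a rough illustration of how the proposed interface might be implemented and used, here is a minimal, self-contained sketch. It is not taken from the attached patches. `Writable` and `FileSystem` below are local stand-ins for the Hadoop types so the example compiles on its own, and `ByteRangeSplit`, `SplitSketch`, `sortByCostDescending`, and `copyOf` are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Stand-in for org.apache.hadoop.io.Writable so the sketch is self-contained.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Empty placeholder for org.apache.hadoop.fs.FileSystem (assumption: unused here).
class FileSystem {}

// The interface as proposed in this issue.
interface Split extends Writable {
    String[] getLocations(FileSystem fs);
    long getCost();
}

// Hypothetical implementation: a byte range of a named file.
class ByteRangeSplit implements Split {
    private String path = "";
    private long start;
    private long length;
    // Placement hints only; this sketch chooses not to serialize them.
    private String[] hosts = new String[0];

    ByteRangeSplit() {} // no-arg constructor, filled in later by readFields

    ByteRangeSplit(String path, long start, long length, String[] hosts) {
        this.path = path;
        this.start = start;
        this.length = length;
        this.hosts = hosts.clone();
    }

    @Override
    public String[] getLocations(FileSystem fs) {
        return hosts.clone(); // may be empty
    }

    @Override
    public long getCost() {
        return length; // cost is estimated as the size of the data
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(start);
        out.writeLong(length);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        path = in.readUTF();
        start = in.readLong();
        length = in.readLong();
    }
}

class SplitSketch {
    // Orders splits so that the highest-cost ones come first.
    static void sortByCostDescending(Split[] splits) {
        Arrays.sort(splits, (a, b) -> Long.compare(b.getCost(), a.getCost()));
    }

    // Serializes a split and reads it back into a fresh instance.
    static ByteRangeSplit copyOf(ByteRangeSplit original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            ByteRangeSplit copy = new ByteRangeSplit();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      A scheduler that follows the cost hint could call `SplitSketch.sortByCostDescending(splits)` before handing out tasks. Because `getLocations` may return an empty array, callers should treat the host list as a hint rather than a requirement.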

      Attachments

        1. input-split.patch
          47 kB
          Owen O'Malley
        2. input-split-2.patch
          47 kB
          Owen O'Malley

          People

            Assignee: Owen O'Malley (omalley)
            Reporter: Doug Cutting (cutting)
            Votes: 2
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: