Apache Tez / TEZ-2879

While grouping splits, allow an alternate list of preferred locations to be provided per split

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.2
    • Component/s: None
    • Labels: None

    Description

      Split locations, at least for FileInputSplits, are generally tied to the HDFS locations where the split's data resides.

      There are situations in which these locations are not necessarily the best place to process the split, for example:

      • Clusters where compute and storage are separate.
      • Systems which cache data, where cache affinity is more important than HDFS locality.

      Providing an alternate list of preferred locations per split allows grouping by those preferred locations, instead of always grouping by the locations specified in the split.
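
      The idea can be sketched in plain Java. This is a simplified illustration, not the Tez implementation: the `Split` class, the `groupByLocation` method, and the host names are all hypothetical, and the grouping here only looks at the first location of each split, whereas a real grouper would also weigh split sizes and multiple locations. The point is only that the grouping step takes a pluggable function that supplies locations, so a caller can substitute cache-affinity hosts for the locations the split reports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: none of these types are Tez or Hadoop classes.
final class SplitGroupingSketch {

    /** Hypothetical stand-in for an input split and the hosts it reports. */
    static final class Split {
        final String name;
        final List<String> reportedLocations;

        Split(String name, String... reportedLocations) {
            this.name = name;
            this.reportedLocations = Arrays.asList(reportedLocations);
        }
    }

    /**
     * Groups split names by the first host returned by locationProvider.
     * Splits for which the provider returns no host go into an "<any>" group.
     */
    static Map<String, List<String>> groupByLocation(
            List<Split> splits, Function<Split, List<String>> locationProvider) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Split split : splits) {
            List<String> locations = locationProvider.apply(split);
            String host = (locations == null || locations.isEmpty())
                    ? "<any>"
                    : locations.get(0);
            groups.computeIfAbsent(host, h -> new ArrayList<>()).add(split.name);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Split> splits = Arrays.asList(
                new Split("s1", "datanode1"),
                new Split("s2", "datanode2"),
                new Split("s3", "datanode1"));

        // Default behaviour: group by the locations the split itself reports.
        Map<String, List<String>> byReported =
                groupByLocation(splits, s -> s.reportedLocations);

        // Alternate behaviour: group by an assumed cache-affinity mapping.
        Map<String, String> cacheHostBySplit = new HashMap<>();
        cacheHostBySplit.put("s1", "cache-a");
        cacheHostBySplit.put("s2", "cache-a");
        cacheHostBySplit.put("s3", "cache-b");
        Map<String, List<String>> byCache = groupByLocation(
                splits,
                s -> Collections.singletonList(cacheHostBySplit.get(s.name)));

        System.out.println("Grouped by reported locations: " + byReported);
        System.out.println("Grouped by cache affinity:     " + byCache);
    }
}
```

      With the reported locations, s1 and s3 land in the same group because both report datanode1. With the assumed cache mapping, s1 and s2 are grouped together instead, even though they reside on different datanodes.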

      Attachments

        1. TEZ-2879.2.txt
          25 kB
          Siddharth Seth
        2. TEZ-2879.1.txt
          25 kB
          Siddharth Seth
        3. TEZ-2879.1.wip.txt
          14 kB
          Siddharth Seth

          People

            Assignee: Siddharth Seth (sseth)
            Reporter: Siddharth Seth (sseth)
            Votes: 0
            Watchers: 4
