  1. Hadoop HDFS
  2. HDFS-12534

Provide logical BlockLocations for EC files for better split calculation

      Description

      I talked to Marcelo Masiero Vanzin and Alexander Behm some more about split calculation with EC. It turns out HDFS-12222 was resolved prematurely. Applications depend on the BlockLocations returned by HDFS to understand where the split points are. The current scheme of returning one BlockLocation per block group loses this information.

      We should change this to provide logical blocks: divide the file length by the block size and return one BlockLocation per resulting logical block, each with a matching virtual offset and length.
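
      The offset and length arithmetic described above can be sketched as follows. This is only an illustration under stated assumptions, not the proposed patch: `LogicalBlocks`, `Range`, and `split` are hypothetical names, `Range` stands in for just the offset and length fields of a BlockLocation, and the question of which hosts to report for each logical block is not addressed here.

```java
import java.util.ArrayList;
import java.util.List;

public class LogicalBlocks {
    /** Hypothetical stand-in for the offset/length part of a BlockLocation. */
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Splits a file of fileLength bytes into logical blocks of at most
     * blockSize bytes. The last block is shorter when the file length is
     * not a multiple of the block size. An empty file yields no ranges.
     */
    static List<Range> split(long fileLength, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            long length = Math.min(blockSize, fileLength - offset);
            ranges.add(new Range(offset, length));
        }
        return ranges;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Example: a 300 MB file with a 128 MB block size (values chosen for illustration).
        for (Range r : split(300 * mb, 128 * mb)) {
            System.out.println("offset=" + r.offset + " length=" + r.length);
        }
    }
}
```

      With ranges like these, an application that schedules one split per BlockLocation would see the same boundaries for an EC file as it would for a replicated file of the same length and block size.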

      I'm not marking this as incompatible, since changing it this way would in fact make it more compatible from the perspective of applications that are scheduling against replicated files. Thus, it'd be good for beta1 if possible, but okay for later too.

            People

            • Assignee: Unassigned
            • Reporter: Andrew Wang (andrew.wang)