Apache Ozone / HDDS-7065

BlockStorage interface should return IDs for getStorageIds


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ozone Filesystem
    • Labels: None

    Description

      The Hadoop BlockLocation class defines several methods that Impala relies on:

      • getNames
      • getHosts
      • getCachedHosts
      • getOffset
      • getLength
      • getStorageIds

      Impala uses the result of getStorageIds to identify individual disks so that it can balance load across multiple disks on a node. Ozone currently returns null values from getStorageIds, so Impala cannot tell the disks on a node apart.
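      To illustrate why null storage IDs matter, here is a minimal sketch of how a scheduler might derive a per-disk key from the two arrays. This is not Impala's code: the class DiskKeys, the method diskKeys, and the "host/storageId" key format are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DiskKeys {
    /**
     * Returns one scheduling key per replica.
     * With usable storage IDs the key is "host/storageId" (a hypothetical format);
     * otherwise it degrades to the host alone, so every disk on a node looks the same.
     */
    public static List<String> diskKeys(String[] hosts, String[] storageIds) {
        // Storage IDs are only usable if present and index-aligned with hosts.
        boolean usable = storageIds != null && storageIds.length == hosts.length;
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < hosts.length; i++) {
            if (usable && storageIds[i] != null) {
                keys.add(hosts[i] + "/" + storageIds[i]);
            } else {
                keys.add(hosts[i]);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        String[] hosts = {"dn1", "dn1"};
        // Two replicas on different volumes of the same node are distinguishable.
        System.out.println(diskKeys(hosts, new String[] {"vol-a", "vol-b"}));
        // Without storage IDs both replicas collapse onto the same key.
        System.out.println(diskKeys(hosts, null));
    }
}
```

      With storage IDs the two replicas map to distinct keys; without them both collapse to "dn1", which is the situation described above.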

      Ozone should populate storage IDs (via setStorageIds) on the BlockLocation objects it returns, so that Impala can schedule reads to separate disks accurately.

      Impala expects the arrays returned by getHosts and getStorageIds to be the same size. Presumably the indexes should correspond as well, so if a block is stored on two different disks on the same host, that host should appear twice in getHosts, once per storage ID. Storage IDs don't make much sense with erasure coding, so they can be omitted there.
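      The index alignment described above could be sketched as follows. This is an assumption about one possible approach, not Ozone's implementation: the Replica holder and the align method are hypothetical, and the code avoids Hadoop imports so it stays self-contained. The resulting arrays are the kind of values that could then be passed to BlockLocation's setHosts and setStorageIds.

```java
import java.util.List;

public class StorageIdAlignment {
    /** Hypothetical holder for one replica: its datanode host and the ID of the volume storing it. */
    public static final class Replica {
        final String host;
        final String storageId;

        public Replica(String host, String storageId) {
            this.host = host;
            this.storageId = storageId;
        }
    }

    /**
     * Builds two parallel arrays: result[0] holds hosts, result[1] holds storage IDs.
     * Entry i of each array describes the same replica, so a host that stores the
     * block on two disks appears twice in result[0].
     */
    public static String[][] align(List<Replica> replicas) {
        String[] hosts = new String[replicas.size()];
        String[] storageIds = new String[replicas.size()];
        for (int i = 0; i < replicas.size(); i++) {
            hosts[i] = replicas.get(i).host;
            storageIds[i] = replicas.get(i).storageId;
        }
        return new String[][] {hosts, storageIds};
    }

    public static void main(String[] args) {
        String[][] aligned = align(List.of(
            new Replica("dn1", "vol-a"),
            new Replica("dn1", "vol-b"),
            new Replica("dn2", "vol-c")));
        // Both arrays have one entry per replica, in the same order.
        System.out.println(String.join(",", aligned[0]));
        System.out.println(String.join(",", aligned[1]));
    }
}
```

      Emitting one host entry per replica, rather than de-duplicating hosts, is what keeps the two arrays the same length.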

            People

              Assignee: Unassigned
              Reporter: Michael Smith (MikaelSmith)
              Votes: 0
              Watchers: 3
