Hadoop HDFS / HDFS-1353

Remove most of getBlockLocation optimization

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.22.0
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      <This description is not valid. See comment.>
      HDFS-1081 reduced the number of block access tokens (BATs) created in a single call to getBlockLocations, since creating them is expensive. However, that JIRA deferred a further optimization that it made possible: sending a single block access token across the wire (and maintaining a single BAT on the client side). This JIRA implements that optimization. Since a single BAT is generated for all the blocks, we write just that one BAT to the wire rather than n BATs for n blocks, as is currently done. This turns out to be a useful optimization for files with very large numbers of blocks, because the new lone BAT is much larger than an individual BAT was previously.
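
      The wire-size argument in the description above can be illustrated with a small, self-contained Java sketch. It is not HDFS code: the class BatWireSketch, both method names, the byte layouts, and the token sizes are hypothetical, chosen only to compare "one token per block" with "one shared token for all blocks". Note also that the description itself is flagged as no longer valid, so this shows the idea as originally described, not what was committed.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; none of these names come from the HDFS code base.
public class BatWireSketch {

    // Assumed "per-block" layout: block count, then for each block
    // its ID, the token length, and the token bytes.
    public static int perBlockWireSize(long[] blockIds, byte[] tokenPerBlock) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(blockIds.length);
            for (long id : blockIds) {
                out.writeLong(id);
                out.writeInt(tokenPerBlock.length);
                out.write(tokenPerBlock);
            }
            out.flush();
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed "shared token" layout: token length and token bytes once,
    // then the block count and the bare block IDs.
    public static int sharedTokenWireSize(long[] blockIds, byte[] sharedToken) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(sharedToken.length);
            out.write(sharedToken);
            out.writeInt(blockIds.length);
            for (long id : blockIds) {
                out.writeLong(id);
            }
            out.flush();
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] blockIds = new long[1000];   // placeholder IDs (all zero)
        byte[] smallToken = new byte[10];   // made-up per-block token size
        byte[] largeToken = new byte[40];   // made-up shared token size
        System.out.println("per-block bytes: " + perBlockWireSize(blockIds, smallToken));
        System.out.println("shared bytes:    " + sharedTokenWireSize(blockIds, largeToken));
    }
}
```

      With these made-up sizes, the shared layout is slightly larger for a handful of blocks and smaller once the block count grows, which matches the description's point that the benefit appears for files with very many blocks. Real token sizes depend on the token format and are not modeled here.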

      1. HDFS-1353-y20-2.patch (13 kB, Jakob Homan)
      2. HDFS-1353-y20.patch (22 kB, Jakob Homan)
      3. HDFS-1353-optmized-wire-not-to-be-committed.patch (17 kB, Jakob Homan)
      4. HDFS-1353.patch (17 kB, Jakob Homan)
      5. Benchmarking results.xlsx (73 kB, Jakob Homan)

            People

            • Assignee: Jakob Homan
            • Reporter: Jakob Homan
            • Votes: 0
            • Watchers: 4