  1. Hadoop HDFS
  2. HDFS-1353

Remove most of getBlockLocation optimization

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.22.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      <This description is not valid. See comment.>
      HDFS-1081 reduced the number of block access tokens (BATs) created in a single call to getBlockLocations, since creating them is an expensive operation. However, that JIRA deferred a further optimization that it made possible: sending only a single block access token across the wire (and maintaining a single BAT on the client side). This JIRA is for implementing that optimization. Since a single BAT is generated for all the blocks, we write that one BAT to the wire rather than writing n BATs for n blocks, as is currently done. This turns out to be a useful optimization for files with very large numbers of blocks, because the new single BAT is much larger than an individual BAT was previously.
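
      The wire-size argument in the description above can be illustrated with a rough sketch. This is a toy model only, not Hadoop's actual wire format or API: the token layout (a count followed by the block IDs it covers) and the function names encode_token, encode_repeated and encode_once are all assumptions made up for illustration.

```python
import struct
from typing import List


def encode_token(block_ids: List[int]) -> bytes:
    """Toy token (assumed layout): a 4-byte count, then one 8-byte ID per covered block."""
    return struct.pack(f">I{len(block_ids)}q", len(block_ids), *block_ids)


def encode_repeated(block_ids: List[int]) -> bytes:
    """Write the shared token once per block: n copies for n blocks."""
    token = encode_token(block_ids)
    return b"".join(struct.pack(">q", b) + token for b in block_ids)


def encode_once(block_ids: List[int]) -> bytes:
    """Write the shared token a single time, followed by the bare block IDs."""
    token = encode_token(block_ids)
    return token + b"".join(struct.pack(">q", b) for b in block_ids)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        ids = list(range(n))
        print(n, len(encode_repeated(ids)), len(encode_once(ids)))
```

      In this model the repeated form grows quadratically with the number of blocks, because each of the n copies of the token itself lists n blocks, while the write-once form grows linearly. Whether the real tokens behave this way depends on their actual contents, which this sketch does not attempt to reproduce.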

        Attachments

        1. Benchmarking results.xlsx
          73 kB
          Jakob Homan
        2. HDFS-1353.patch
          17 kB
          Jakob Homan
        3. HDFS-1353-optmized-wire-not-to-be-committed.patch
          17 kB
          Jakob Homan
        4. HDFS-1353-y20.patch
          22 kB
          Jakob Homan
        5. HDFS-1353-y20-2.patch
          13 kB
          Jakob Homan

              People

              • Assignee:
                jghoman Jakob Homan
                Reporter:
                jghoman Jakob Homan
              • Votes:
                0
                Watchers:
                4
