  1. Hadoop HDFS
  2. HDFS-1353

Remove most of getBlockLocation optimization

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.22.0
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      <This description is not valid. See comment.>
      HDFS-1081 reduced the number of block access tokens (BATs) created in a single call to getBlockLocations, because creating them is expensive. However, that JIRA deferred a further optimization that it made possible: sending a single block access token across the wire and maintaining a single BAT on the client side. This JIRA implements that optimization. Since a single BAT is generated for all the blocks, we write that one BAT to the wire, rather than writing n BATs for n blocks as is currently done. This turns out to be a useful optimization for files with very large numbers of blocks, because the new lone BAT is much larger than an individual BAT was previously.
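
To make the size argument in the description concrete, here is a rough Python sketch that compares the bytes written when each block carries its own token against the bytes written when one shared token is sent once. Everything in it is an assumption made for illustration: the function names, the 40-byte fixed token overhead, and the layout (fixed header followed by 64-bit block IDs) are hypothetical and are not Hadoop's actual wire format or API.

```python
import struct

# Hypothetical sizes, chosen only for illustration; not Hadoop's real format.
TOKEN_OVERHEAD = 40  # assumed fixed bytes per token (expiry, key id, MAC, ...)


def encode_token(block_ids):
    """Encode a toy token: a fixed-size header plus the block IDs it covers."""
    header = bytes(TOKEN_OVERHEAD)
    # Each block ID is packed as a big-endian signed 64-bit integer (8 bytes).
    body = struct.pack(f">{len(block_ids)}q", *block_ids)
    return header + body


def wire_size_per_block_tokens(block_ids):
    """Bytes written when every block is sent with its own one-block token."""
    return sum(len(encode_token([block_id])) for block_id in block_ids)


def wire_size_single_token(block_ids):
    """Bytes written when one token listing every block ID is sent once."""
    return len(encode_token(block_ids))


if __name__ == "__main__":
    blocks = list(range(1000))
    print("per-block tokens:", wire_size_per_block_tokens(blocks))
    print("single token:    ", wire_size_single_token(blocks))
```

Under these assumed sizes the fixed overhead is paid once instead of n times, while the single token itself grows with the number of blocks it covers. Whether that trade-off holds in practice depends on the real token layout, which this sketch does not model.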

      Attachments

        1. Benchmarking results.xlsx (73 kB, Jakob Homan)
        2. HDFS-1353.patch (17 kB, Jakob Homan)
        3. HDFS-1353-optmized-wire-not-to-be-committed.patch (17 kB, Jakob Homan)
        4. HDFS-1353-y20.patch (22 kB, Jakob Homan)
        5. HDFS-1353-y20-2.patch (13 kB, Jakob Homan)

            People

              Assignee: Jakob Homan (jghoman)
              Reporter: Jakob Homan (jghoman)
              Votes: 0
              Watchers: 4
