Hadoop Common / HADOOP-2148

Inefficient FSDataset.getBlockFile()

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels: None

    Description

      FSDataset.getBlockFile() first verifies that the block is valid and then returns the file name corresponding to the block.
      In doing so, it performs the data-node blockMap lookup twice, although only one lookup is needed.
      This matters because the data-node blockMap is large.
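
      The difference between the two lookup patterns can be sketched as follows. This is a minimal illustration, not the actual FSDataset code or the attached patch: the class BlockLookupSketch, its methods, and the use of a plain HashMap keyed by block ID are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the data-node block map (not the real FSDataset).
class BlockLookupSketch {
    private final Map<Long, File> blockMap = new HashMap<>();

    void addBlock(long blockId, File file) {
        blockMap.put(blockId, file);
    }

    // Pattern described in the issue: validate first, then fetch.
    // This touches the map twice for every valid block.
    File getBlockFileTwice(long blockId) throws IOException {
        if (!blockMap.containsKey(blockId)) {      // lookup 1
            throw new IOException("Block " + blockId + " is not valid.");
        }
        return blockMap.get(blockId);              // lookup 2
    }

    // Single lookup: fetch once and treat a missing entry as invalid.
    // Assumes the map never stores null values.
    File getBlockFileOnce(long blockId) throws IOException {
        File file = blockMap.get(blockId);         // the only lookup
        if (file == null) {
            throw new IOException("Block " + blockId + " is not valid.");
        }
        return file;
    }
}
```

      Both methods return the same result for the same input; the second simply avoids hashing and probing the map a second time.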

      Another observation is that data-nodes do not need the blockMap at all: file names can be derived from the block IDs,
      so there is no need to hold the Block to File mapping in memory.
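
      One way to picture deriving a file from a block ID without any in-memory map is sketched below. The directory scheme (two "subdir" levels taken from bits of the ID) and the names BlockPathSketch and blockFile are hypothetical and chosen only for illustration; the issue does not specify a layout. The "blk_" file-name prefix follows the usual naming of block files on a data-node.

```java
import java.io.File;

// Hypothetical sketch: compute a block's file purely from its ID.
class BlockPathSketch {
    static final String BLOCK_FILE_PREFIX = "blk_";

    // Assumed layout: <dataDir>/subdir<d1>/subdir<d2>/blk_<blockId>,
    // where d1 and d2 are two bytes taken from the block ID.
    static File blockFile(File dataDir, long blockId) {
        int d1 = (int) ((blockId >>> 16) & 0xFF);
        int d2 = (int) ((blockId >>> 8) & 0xFF);
        File dir = new File(new File(dataDir, "subdir" + d1), "subdir" + d2);
        return new File(dir, BLOCK_FILE_PREFIX + blockId);
    }
}
```

      Under such a scheme, checking whether a block is valid becomes a file-system existence check on the derived path rather than a map lookup, trading memory for disk metadata access.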

      Attachments

        1. getBlockFile.patch
          2 kB
          Konstantin Shvachko
        2. getBlockFile1.patch
          4 kB
          Konstantin Shvachko

          People

            Assignee: Konstantin Shvachko
            Reporter: Konstantin Shvachko