  1. HBase
  2. HBASE-911

Minimize filesystem footprint

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid

    Description

      This issue is about looking into how much filesystem space HBase uses. Daniel Ploeg suggests that HBase is profligate in its use of space in HDFS. Given that the default block size is 64MB, and that every store file HBase writes is accompanied by an index file and a very small metadata file, that's 3*64MB even if the file is empty (TODO: Prove this). The situation is aggravated by the fact that HBase flushes whatever is in memory every 30 minutes to minimize loss in the absence of appends; this makes for lots of small files.
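
      To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. It takes the issue's premise at face value (each file costs a full 64MB block, which the description itself marks as unproven) and ignores compactions, so the byte figure is only an upper bound under that assumption. The class and method names are hypothetical and not part of HBase.

```java
public class FlushFootprint {
    static final long MB = 1024L * 1024L;

    // Number of periodic flushes in one day for a given flush interval.
    static long flushesPerDay(long flushIntervalMinutes) {
        return (24L * 60L) / flushIntervalMinutes;
    }

    // Files written per day by one store: each flush writes filesPerFlush files
    // (store file + index file + metadata file = 3 in the issue's description).
    static long filesPerDay(long flushIntervalMinutes, int filesPerFlush) {
        return flushesPerDay(flushIntervalMinutes) * filesPerFlush;
    }

    // Upper bound in bytes IF every file consumed a whole block.
    // This is the issue's unproven premise, not an established fact.
    static long worstCaseBytesPerDay(long flushIntervalMinutes, int filesPerFlush, long blockSizeBytes) {
        return filesPerDay(flushIntervalMinutes, filesPerFlush) * blockSizeBytes;
    }

    public static void main(String[] args) {
        long files = filesPerDay(30, 3);
        long bytes = worstCaseBytesPerDay(30, 3, 64 * MB);
        System.out.println("files per store per day: " + files);               // 144
        System.out.println("worst-case MB per store per day: " + bytes / MB);  // 9216
    }
}
```

      Whatever the per-file cost turns out to be, the file count (144 per store per day before compaction, in this sketch) is what the "lots of small files" remark refers to.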

      The solution to the above is to implement append, so that the optional flush is not necessary, and to adopt a file format that aggregates info, index and data all in one file. Short-term, we should set the block size on the info/metadata file down to 4k or some such small size, and look into doing likewise for the mapfile index.
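
      As an illustration of what "info, index and data all in one file" could look like, here is a minimal sketch of a single-file layout: data first, then index, then metadata, then a fixed-size trailer recording where the index and metadata start. This is an assumed toy layout for discussion only; it is not the format HBase uses, and the class name, method names and trailer layout are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SingleFileStore {
    // Trailer holds two big-endian longs: index offset and metadata offset.
    // The data section always starts at offset 0, so it needs no entry.
    static final int TRAILER_BYTES = 2 * Long.BYTES;

    // Concatenate the three sections and append the trailer.
    static byte[] write(byte[] data, byte[] index, byte[] meta) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.write(data);
            long indexOffset = out.size();   // bytes written so far
            out.write(index);
            long metaOffset = out.size();
            out.write(meta);
            out.writeLong(indexOffset);
            out.writeLong(metaOffset);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read {indexOffset, metaOffset} from the last TRAILER_BYTES of the file.
    static long[] readTrailer(byte[] file) {
        ByteBuffer t = ByteBuffer.wrap(file, file.length - TRAILER_BYTES, TRAILER_BYTES);
        return new long[] { t.getLong(), t.getLong() };
    }

    static byte[] readData(byte[] file) {
        return Arrays.copyOfRange(file, 0, (int) readTrailer(file)[0]);
    }

    static byte[] readIndex(byte[] file) {
        long[] t = readTrailer(file);
        return Arrays.copyOfRange(file, (int) t[0], (int) t[1]);
    }

    static byte[] readMeta(byte[] file) {
        return Arrays.copyOfRange(file, (int) readTrailer(file)[1], file.length - TRAILER_BYTES);
    }

    public static void main(String[] args) {
        byte[] file = write(
            "row1=a;row2=b".getBytes(StandardCharsets.UTF_8),
            "row1@0;row2@7".getBytes(StandardCharsets.UTF_8),
            "entries=2".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(readIndex(file), StandardCharsets.UTF_8));
        System.out.println(new String(readMeta(file), StandardCharsets.UTF_8));
    }
}
```

      With a layout along these lines, one flush would produce one file instead of three, and a reader can locate the index and metadata from the trailer without any side files.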

          People

            Assignee: Unassigned
            Reporter: Michael Stack (stack)
