  1. HBase
  2. HBASE-4147

StoreFile query usage report

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem

    Description

      Detailed information on what HBase is doing in terms of reads is hard to come by.

      A periodic StoreFile query report would be useful. Specifically, it could run on a configured interval (e.g., every 30 or 60 seconds) and dump its output to the log files.

      The report would list every StoreFile accessed during the reporting period (the Path also tells us the region, CF, and table), the number of times each StoreFile was accessed, the size of the StoreFile, and the total time (ms) spent processing that StoreFile.
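
      As a rough illustration of the report described above, here is a minimal sketch in plain Java. It is not HBase code: the class StoreFileQueryReport, its methods, the output line format, and the example path are all hypothetical, and System.out stands in for the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: none of these names exist in HBase.
final class StoreFileQueryReport {

    /** Per-StoreFile counters for one reporting period. */
    static final class Stats {
        final LongAdder accesses = new LongAdder();
        final LongAdder totalMillis = new LongAdder();
        volatile long sizeBytes;
    }

    private final Map<String, Stats> statsByPath = new ConcurrentHashMap<>();

    /** Would be called by the read path each time a StoreFile is consulted. */
    void record(String storeFilePath, long sizeBytes, long elapsedMillis) {
        Stats s = statsByPath.computeIfAbsent(storeFilePath, p -> new Stats());
        s.accesses.increment();
        s.totalMillis.add(elapsedMillis);
        s.sizeBytes = sizeBytes;
    }

    /**
     * Formats one line per StoreFile and clears the counters for the next
     * period. A record() racing with drain() may be missed; this sketch
     * accepts that for a summary report.
     */
    List<String> drain() {
        List<String> lines = new ArrayList<>();
        for (String path : new ArrayList<>(statsByPath.keySet())) {
            Stats s = statsByPath.remove(path);
            if (s == null) {
                continue;
            }
            lines.add(String.format(Locale.ROOT,
                "path=%s accesses=%d sizeBytes=%d totalMs=%d",
                path, s.accesses.sum(), s.sizeBytes, s.totalMillis.sum()));
        }
        return lines;
    }

    /** Dumps the report on a fixed interval (e.g., 30 or 60 seconds). */
    ScheduledExecutorService start(long intervalSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(() -> drain().forEach(System.out::println),
            intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return ses;
    }
}
```

      In this sketch the hot path only touches a concurrent map and two adders, which is one way to keep the per-read overhead small; whether that cost is acceptable in a real RegionServer would need measuring.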

      Even this level of summary would be useful for detecting which tables & CFs are being accessed the most, and including the StoreFile would provide insight into relative "uncompaction" (i.e., lots of StoreFiles).
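
      To make the table/CF roll-up concrete, here is a small sketch of how per-StoreFile access counts could be summarized. It assumes, purely for illustration, that a StoreFile path looks like /hbase/<table>/<region>/<cf>/<storefile>; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: assumes paths of the form
// /hbase/<table>/<region>/<cf>/<storefile>.
final class StoreFileUsageSummary {

    /** Access total and StoreFile count for one table/CF pair. */
    static final class Totals {
        long accesses;
        int storeFiles;
    }

    /** Returns "table/cf", taken from the path segments of the assumed layout. */
    static String tableAndCf(String storeFilePath) {
        // Segments: ["", "hbase", table, region, cf, storefile]
        String[] parts = storeFilePath.split("/");
        if (parts.length != 6) {
            throw new IllegalArgumentException("unexpected path: " + storeFilePath);
        }
        return parts[2] + "/" + parts[4];
    }

    /** Rolls per-StoreFile access counts up to table/CF, across all regions. */
    static Map<String, Totals> summarize(Map<String, Long> accessesByPath) {
        Map<String, Totals> out = new TreeMap<>();
        for (Map.Entry<String, Long> e : accessesByPath.entrySet()) {
            Totals t = out.computeIfAbsent(tableAndCf(e.getKey()), k -> new Totals());
            t.accesses += e.getValue();
            t.storeFiles++;
        }
        return out;
    }
}
```

      A high storeFiles value relative to other table/CF pairs would be the "uncompaction" hint; keeping the region segment in the key would give the count per region and CF instead.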

      I think log output, as opposed to a UI, is an important facet of this. I'm assuming that users will slice and dice this data on their own, so I think we should skip any kind of admin view for now (i.e., no new JSPs, no new APIs to expose this data). Just getting this into the log file would be a big improvement.

      Will this have a non-zero performance impact? Yes. Hopefully small, but yes it will. However, flying a plane without any instrumentation isn't fun.

      Attachments

        1. hbase_4147_storefilereport_2011_08_10.pdf
          125 kB
          Doug Meil
        2. hbase_4147_storefilereport.pdf
          130 kB
          Doug Meil

          People

            Assignee: Unassigned
            Reporter: Doug Meil (dmeil)
            Votes: 1
            Watchers: 6
