HBase / HBASE-17756

We should have better introspection of HFiles


Details

    • Type: Brainstorming
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: HFile
    • Labels: None

    Description

      Stack suggested using DataSketches (https://datasketches.github.io) to write additional statistics to HFiles. These statistics could improve our split decisions, help with troubleshooting, or enable other analysis without having to perform full table scans. The statistics could eventually be stored as part of the HFile itself, but as a first step we could improve visibility into the data by adding some statistics to HFilePrettyPrinter.
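
      To make the idea concrete, here is a minimal sketch of the kind of summary that could be built while an HFile is written or printed. It is not the DataSketches API and not existing HBase code: the class `RowKeySample` and its methods are hypothetical, and plain reservoir sampling (Algorithm R) stands in for a real quantiles sketch. It only illustrates how a small, fixed-size summary of row keys could suggest a split point without a full scan.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/**
 * Hypothetical helper (not part of HBase or DataSketches): keeps a
 * fixed-size uniform sample of row keys using reservoir sampling.
 */
final class RowKeySample {
    private final int capacity;
    private final List<String> sample;
    private final Random random;
    private long seen = 0;

    RowKeySample(int capacity, long seed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.sample = new ArrayList<>(capacity);
        this.random = new Random(seed);
    }

    /** Offers one row key; every key seen so far is retained with equal probability. */
    void update(String rowKey) {
        seen++;
        if (sample.size() < capacity) {
            sample.add(rowKey);
        } else {
            // Pick a slot uniformly in [0, seen); replace only if it falls in the reservoir.
            long slot = (long) (random.nextDouble() * seen);
            if (slot < capacity) {
                sample.set((int) slot, rowKey);
            }
        }
    }

    /** Number of keys offered so far. */
    long getSeen() {
        return seen;
    }

    /** Approximate key at the given rank fraction in [0, 1]; 0.5 is a candidate split point. */
    String quantile(double fraction) {
        if (sample.isEmpty()) {
            throw new IllegalStateException("no keys have been offered");
        }
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        List<String> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        int index = (int) Math.min(sorted.size() - 1, (long) (fraction * sorted.size()));
        return sorted.get(index);
    }
}
```

      A caller would invoke `update` once per row key and then ask for `quantile(0.5)` as an approximate midpoint. A real implementation would more likely use a mergeable sketch from DataSketches, so that per-HFile summaries could be combined across a region.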

          People

            Assignee: Rushabh Shah (shahrs87)
            Reporter: Esteban Gutierrez (esteban)
            Votes: 1
            Watchers: 16

            Dates

              Created:
              Updated:
              Resolved:
