HBase / HBASE-10323

Auto detect data block encoding in HFileOutputFormat


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.99.0
    • Component/s: mapreduce
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Currently, the data block encoding of the table has to be specified explicitly through the config parameter "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk load. This option is easily missed, is not documented, and works differently from compression, block size and bloom filter type, which are auto-detected.
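      To illustrate the manual step described above, here is a minimal sketch of setting and reading that key. It uses `java.util.Properties` as a stand-in for a Hadoop `Configuration`, and a local `Encoding` enum that mirrors a subset of the names in HBase's `DataBlockEncoding`. The class name, the helper `configuredEncoding`, and the assumption that a missing key means no encoding is applied are illustrative only, not taken from the patch.

```java
import java.util.Properties;

final class ManualEncodingConfig {
    // Config key quoted in the issue description.
    static final String ENCODING_KEY =
        "hbase.mapreduce.hfileoutputformat.datablock.encoding";

    // Hypothetical stand-in for HBase's DataBlockEncoding (subset of names).
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Returns the explicitly configured encoding.
    // Assumption: an absent key is treated as "no encoding".
    static Encoding configuredEncoding(Properties conf) {
        String name = conf.getProperty(ENCODING_KEY);
        return name == null ? Encoding.NONE : Encoding.valueOf(name);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        // Key forgotten: the job silently falls back to the default.
        System.out.println(configuredEncoding(conf));

        // Key set by hand to match the table's encoding.
        conf.setProperty(ENCODING_KEY, "FAST_DIFF");
        System.out.println(configuredEncoding(conf));
    }
}
```

      The first call shows why the option is easy to miss: nothing fails when the key is absent, so the written files can end up with a different encoding than the table.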

      The proposed solution is to auto-detect the data block encoding, as is already done for those other parameters.

      The current patch does the following:
      1. Automatically detects the data block encoding in HFileOutputFormat.
      2. Keeps the legacy option of manually specifying the data block encoding as a way to override auto-detection.
      3. Moves string conf parsing to the start of the program so that it fails fast at startup instead of failing during record writes. This also makes the internals of the program type safe.
      4. Adds missing doc strings and unit tests for the code that serializes and deserializes the config parameters for bloom filter type, block size and data block encoding.
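
      The four points above can be sketched roughly as follows. This is not the code from the attached patches: the class, the method names, the `family=value&family=value` string layout, and the choice to apply the manual override to every column family are assumptions made for illustration. The `Encoding` enum is again a local stand-in for HBase's `DataBlockEncoding`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class EncodingAutoDetectSketch {
    // Hypothetical stand-in for HBase's DataBlockEncoding (subset of names).
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Serializes the detected per-family encodings into one config string.
    static String serialize(Map<String, Encoding> perFamily) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Encoding> e : perFamily.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue().name()));
        }
        return sb.toString();
    }

    // Parses the config string into typed values. Called once at startup,
    // so a bad value throws here rather than during record writes.
    static Map<String, Encoding> deserialize(String s) {
        Map<String, Encoding> out = new LinkedHashMap<>();
        if (s == null || s.isEmpty()) return out;
        for (String pair : s.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + pair);
            }
            // Encoding.valueOf throws IllegalArgumentException on unknown names.
            out.put(dec(kv[0]), Encoding.valueOf(dec(kv[1])));
        }
        return out;
    }

    // The manual override, when present, wins over the detected value.
    static Encoding resolve(String family, Map<String, Encoding> detected,
                            Encoding override) {
        if (override != null) return override;
        Encoding e = detected.get(family);
        return e == null ? Encoding.NONE : e;
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    private static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, Encoding> detected = new LinkedHashMap<>();
        detected.put("cf1", Encoding.FAST_DIFF);
        detected.put("cf2", Encoding.PREFIX);

        String conf = serialize(detected);
        System.out.println(conf); // prints "cf1=FAST_DIFF&cf2=PREFIX"

        Map<String, Encoding> parsed = deserialize(conf);
        System.out.println(resolve("cf1", parsed, null));          // detected value
        System.out.println(resolve("cf1", parsed, Encoding.DIFF)); // override wins
    }
}
```

      Parsing into an enum up front is what gives both the fail-fast behaviour and the type safety mentioned in point 3: after `deserialize` returns, the rest of the program never handles raw encoding strings.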

      Attachments

        1. HBASE_10323-0.94.15-v1.patch
          32 kB
          Ishan Chhabra
        2. HBASE_10323-0.94.15-v2.patch
          33 kB
          Ishan Chhabra
        3. HBASE_10323-0.94.15-v3.patch
          33 kB
          Ishan Chhabra
        4. HBASE_10323-0.94.15-v4.patch
          33 kB
          Ishan Chhabra
        5. HBASE_10323-0.94.15-v5.patch
          34 kB
          Ishan Chhabra
        6. HBASE_10323-trunk-v1.patch
          33 kB
          Ishan Chhabra
        7. HBASE_10323-trunk-v2.patch
          33 kB
          Ishan Chhabra
        8. HBASE_10323-trunk-v3.patch
          31 kB
          Ishan Chhabra
        9. HBASE_10323-trunk-v4.patch
          33 kB
          Ishan Chhabra

          People

            Assignee: Ishan Chhabra (ishanc)
            Reporter: Ishan Chhabra (ishanc)
            Votes: 0
            Watchers: 11
