  1. HBase
  2. HBASE-20361

Non-successive TableInputSplits may wrongly be merged by auto balancing feature


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: mapreduce
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The TableInputFormatBase class lets users exclude specific splits from the list returned by TableInputFormatBase#getSplits by overriding TableInputFormatBase#includeRegionInSplit.
      It also offers a feature called "auto balancing", which mitigates data skew by splitting large splits and merging small splits.
      If a user overrides TableInputFormatBase#includeRegionInSplit, the i-th split and the (i+1)-th split may not be successive (that is, the i-th split's end key is smaller than the (i+1)-th split's start key).
      If the user also enables the auto balancing feature, non-successive splits can be merged, which means that the excluded splits lying between the merged non-successive splits are "revived".

      To avoid such cases, we should not merge non-successive splits.
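
      The check described above can be sketched as follows. This is a minimal, self-contained illustration and not the code from the attached patch: the Split class, the isSuccessive and mergeSmallSplits methods, and the targetSize parameter are hypothetical stand-ins for TableInputSplit and the auto balancing logic. Two splits are treated as successive only when the end key of the first equals the start key of the second.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SuccessiveSplitMerge {

    /** Hypothetical stand-in for TableInputSplit: a key range plus an estimated size. */
    static final class Split {
        final byte[] startKey;
        final byte[] endKey;
        final long size;

        Split(byte[] startKey, byte[] endKey, long size) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.size = size;
        }

        Split(String startKey, String endKey, long size) {
            this(startKey.getBytes(StandardCharsets.UTF_8),
                 endKey.getBytes(StandardCharsets.UTF_8), size);
        }

        @Override
        public String toString() {
            return "[" + new String(startKey, StandardCharsets.UTF_8) + ", "
                + new String(endKey, StandardCharsets.UTF_8) + ") size=" + size;
        }
    }

    /** Two splits are successive when the first ends exactly where the second starts. */
    static boolean isSuccessive(Split first, Split second) {
        return Arrays.equals(first.endKey, second.startKey);
    }

    /**
     * Merges runs of small splits (input sorted by start key) while the combined
     * size stays within targetSize. A run is cut as soon as the next split is not
     * successive, so a key range that was excluded earlier is never covered again.
     */
    static List<Split> mergeSmallSplits(List<Split> splits, long targetSize) {
        List<Split> result = new ArrayList<>();
        int i = 0;
        while (i < splits.size()) {
            Split merged = splits.get(i++);
            while (i < splits.size()) {
                Split next = splits.get(i);
                if (!isSuccessive(merged, next) || merged.size + next.size > targetSize) {
                    break;
                }
                merged = new Split(merged.startKey, next.endKey, merged.size + next.size);
                i++;
            }
            result.add(merged);
        }
        return result;
    }

    public static void main(String[] args) {
        // Regions [a,b), [b,c), [c,d), [d,e); assume [b,c) was excluded beforehand.
        List<Split> splits = Arrays.asList(
            new Split("a", "b", 1), new Split("c", "d", 1), new Split("d", "e", 1));
        for (Split s : mergeSmallSplits(splits, 10)) {
            System.out.println(s);
        }
        // prints "[a, b) size=1" and then "[c, e) size=2"
    }
}
```

      In this sketch the splits [a, b) and [c, d) are left separate even though both are small, because merging them would produce [a, d) and bring the excluded range [b, c) back into the job.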

        Attachments

        1. HBASE-20361.master.001.patch
          12 kB
          Yuki Tawara
        2. HBASE-20361.master.002.patch
          12 kB
          Yuki Tawara


            People

            • Assignee:
              twyuki Yuki Tawara
            • Reporter:
              yktawara Yuki Tawara
