Description
When I tried a manual region split from the HBase shell, I found that the split command handles hex split keys incorrectly.
Here is an example.
I executed hbase(main):003:0> split 'tsdb', "\x00\x00\xC3"
I expected it to split at the 3-byte key "\x00\x00\xC3", but it actually split at the 5-byte key "\x00\x00\xEF\xBF\xBD".
I tested more split keys and found a pattern:
- If all bytes in the split key are between "\x00" and "\x7F", it works as expected and splits at exactly the key specified.
- If any byte is between "\x80" and "\xFF", it works incorrectly. Whatever the byte is, it is interpreted as "\xEF\xBF\xBD". For example, specifying the split key "\x00\xA0\x00\xB0" actually splits at "\x00\xEF\xBF\xBD\x00\xEF\xBF\xBD".
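The pattern above matches what happens when raw key bytes are decoded as UTF-8 text and then encoded back to bytes: a byte that does not form a valid UTF-8 sequence is replaced by U+FFFD, whose UTF-8 encoding is "\xEF\xBF\xBD". That the shell does this conversion is an assumption, not something this report confirms. The class and method names below (SplitKeyRoundTrip, roundTrip, toHex) are hypothetical and are not part of HBase. A minimal sketch that reproduces the same byte sequences:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not HBase code: shows how a UTF-8 round trip
// mangles key bytes that are not valid UTF-8.
public class SplitKeyRoundTrip {

    // Decode the raw key bytes as UTF-8 text, then encode the text back to bytes.
    // Malformed input is replaced with U+FFFD during decoding.
    static byte[] roundTrip(byte[] key) {
        String asText = new String(key, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes in the same "\xNN" notation used in this report.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[][] keys = {
            {0x00, 0x00, 0x7F},                      // all bytes <= 0x7F
            {0x00, 0x00, (byte) 0xC3},               // first example in this report
            {0x00, (byte) 0xA0, 0x00, (byte) 0xB0},  // second example in this report
        };
        for (byte[] key : keys) {
            System.out.println(toHex(key) + " -> " + toHex(roundTrip(key)));
        }
    }
}
```

Under this assumption, the first key survives unchanged, while the other two come back as "\x00\x00\xEF\xBF\xBD" and "\x00\xEF\xBF\xBD\x00\xEF\xBF\xBD", the same keys observed in the actual splits.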
I'm running HBase 0.94.8, r1485407, on both the server side and the client side.
Attachments
Issue Links
- is related to
  - HBASE-15357 TableInputFormatBase getSplitKey does not handle signed bytes correctly (Resolved)
- relates to
  - HBASE-17461 HBase shell "major_compact" command should properly convert "table_or_region_name" parameter to java byte array properly before simply calling "HBaseAdmin.majorCompact" method (Resolved)