Well, it is not a "bug" in HBase. HBase only provides int -> byte conversion as a convenience, and it seems that Bytes.toBytes(int) and its siblings only guarantee lexicographic ordering for unsigned numbers. We can definitely add something like Bytes.toSignedBytes() to HBase so that signed numbers are also sorted correctly in lexicographic order.
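To make the ordering problem concrete, here is a minimal sketch that does not depend on HBase. It assumes Bytes.toBytes(int) writes a big-endian two's-complement value (reproduced here with ByteBuffer) and that keys are compared byte by byte as unsigned values. The toSignedBytes() method is hypothetical: it is one common way to get an order-preserving encoding (flip the sign bit), not an existing HBase API.

```java
import java.nio.ByteBuffer;

final class SignedOrderSketch {
    // Big-endian two's-complement encoding; assumed to match the layout
    // that Bytes.toBytes(int) produces.
    static byte[] toBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Hypothetical "signed" encoding: flipping the sign bit maps
    // Integer.MIN_VALUE to 0x00000000 and Integer.MAX_VALUE to 0xFFFFFFFF,
    // so unsigned byte order matches signed numeric order.
    static byte[] toSignedBytes(int v) {
        return toBytes(v ^ Integer.MIN_VALUE);
    }

    // Lexicographic comparison that treats every byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 (FF FF FF FF) sorts after 1 (00 00 00 01).
        System.out.println(compareUnsigned(toBytes(-1), toBytes(1)) > 0);
        // Sign-flipped encoding: -1 sorts before 1.
        System.out.println(compareUnsigned(toSignedBytes(-1), toSignedBytes(1)) < 0);
    }
}
```

Data already written with the plain encoding cannot be read back with the sign-flipped one, which is why a separate storage type is needed rather than a change to the existing behavior.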
As for Hive, I think Ashutosh is right that we have to keep supporting data already in HBase that was serialized through Bytes.toBytes(). So I would suggest we add another storage type (hbase.table.default.storage.type), like "signedbinary", which would do the Hive-specific signed byte conversion.
So, we would have:
- cf:col#string : serialize as string
- cf:col#binary : serialize as binary, compatible with Bytes.toBytes()
- cf:col#signedBinary : serialize as signed binary
I would also note that people might be interested in custom ser/de from Hive types to bytes, but I am not sure how feasible that would be to implement.