Details
- Improvement
- Status: Resolved
- Normal
- Resolution: Fixed
- None
- None
Description
UTF8Type converts both byte arrays into Strings and then compares them. This is unnecessary and slow, because UTF-8 encoded strings are already directly comparable as bytes: higher code points yield higher initial and subsequent bytes, provided the bytes are compared as unsigned values. One can therefore safely use BytesType.compare() for UTF-8. Maybe UTF8Type should be a subclass that only overrides getString().
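To illustrate the ordering argument, here is a minimal sketch of an unsigned, byte-wise comparison next to a code point comparison of the decoded strings. The class `Utf8ByteOrder` and both methods are hypothetical helpers written for this example; they are not the actual `BytesType` or `UTF8Type` code. One caveat worth keeping in mind: byte order matches code point order, whereas `String.compareTo` compares UTF-16 code units, so the two can disagree for characters outside the Basic Multilingual Plane.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not Cassandra's BytesType implementation.
final class Utf8ByteOrder {

    // Lexicographic comparison that treats every byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    // Reference comparison of two Strings by Unicode code point.
    static int compareCodePoints(String s, String t) {
        int i = 0;
        int j = 0;
        while (i < s.length() && j < t.length()) {
            int cs = s.codePointAt(i);
            int ct = t.codePointAt(j);
            if (cs != ct) {
                return Integer.compare(cs, ct);
            }
            i += Character.charCount(cs);
            j += Character.charCount(ct);
        }
        // One string is a prefix of the other (or they are equal).
        return Integer.compare(s.length() - i, t.length() - j);
    }

    // Convenience wrapper: encode both strings as UTF-8, then compare the bytes.
    static int compareAsUtf8(String s, String t) {
        return compareUnsigned(s.getBytes(StandardCharsets.UTF_8),
                               t.getBytes(StandardCharsets.UTF_8));
    }
}
```

For valid UTF-8 input, `compareAsUtf8` and `compareCodePoints` should always agree in sign, without the byte-wise path ever building a String. The `& 0xFF` mask matters: Java bytes are signed, so a plain `a[i] - b[i]` would sort every non-ASCII lead byte before ASCII.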
BTW, it's also dangerous to silently ignore invalid byte sequences. At this point the byte array should already contain valid UTF-8.
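As a sketch of what rejecting invalid input could look like, the snippet below uses the standard `CharsetDecoder` with `CodingErrorAction.REPORT`, which throws instead of substituting a replacement character. The class `StrictUtf8` and its method names are hypothetical and only serve the example; how and where Cassandra actually validates is not stated in this ticket.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Cassandra.
final class StrictUtf8 {

    // Decodes UTF-8 and throws on malformed input instead of replacing it.
    static String decode(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Returns true only if the whole array is well-formed UTF-8.
    static boolean isValid(byte[] bytes) {
        try {
            decode(bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

By contrast, `new String(bytes, StandardCharsets.UTF_8)` replaces malformed sequences with U+FFFD, so two different invalid byte arrays can decode to the same String and compare as equal.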
Attachments
Issue Links
- is related to: CASSANDRA-1196 Invalid UTF-8 keys [for legacy OPP] should cause exceptions (Resolved)