Description
SedesHelper.writeChararray writes the string with writeUTF, but BinInterSedesTupleRawComparator reads it back with str1 = new String(bb1.array(), bb1.position(), casz1, BinInterSedes.UTF8); (see https://github.com/apache/pig/blob/e0c5f265c68491395d8303c86195445be3d8aecf/src/org/apache/pig/data/BinInterSedes.java#L959-L964). For some reason this works fine on my Mac (with both JDK 7 and JDK 8) but not on Linux. I am not sure of the actual cause and have not dug into it. I suspect either the charset environment or the specific JDK 8 update, which differs between my Mac and Linux machines.
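For illustration, here is a minimal sketch of one way the write and read paths can disagree. It is not taken from the Pig code base and is not confirmed as the cause of the Mac/Linux difference. It relies only on documented JDK behavior: DataOutputStream.writeUTF emits a 2-byte length followed by modified UTF-8, which encodes supplementary characters as two 3-byte surrogates, whereas standard UTF-8 uses a single 4-byte sequence. The class and method names (ModifiedUtf8Demo, writeUtfPayload, decodeAsStandardUtf8, readUtfRoundTrip) are hypothetical, and StandardCharsets.UTF_8 stands in for BinInterSedes.UTF8 on the assumption that the latter names the standard UTF-8 charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Pig.
final class ModifiedUtf8Demo {

    // Bytes produced by writeUTF, without the 2-byte length prefix.
    static byte[] writeUtfPayload(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Decode the payload as standard UTF-8, mirroring the new String(...) call
    // quoted above (assumption: BinInterSedes.UTF8 means standard UTF-8).
    static String decodeAsStandardUtf8(byte[] payload) {
        return new String(payload, 0, payload.length, StandardCharsets.UTF_8);
    }

    // Write with writeUTF and read back with the matching readUTF.
    static String readUtfRoundTrip(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        String ascii = "pig";
        String supplementary = "a\uD83D\uDE00"; // 'a' followed by U+1F600

        for (String s : new String[] {ascii, supplementary}) {
            byte[] payload = writeUtfPayload(s);
            int standardLength = s.getBytes(StandardCharsets.UTF_8).length;
            System.out.println("writeUTF payload bytes: " + payload.length
                + ", standard UTF-8 bytes: " + standardLength);
            System.out.println("  new String(..., UTF-8) matches original: "
                + decodeAsStandardUtf8(payload).equals(s));
            System.out.println("  readUTF matches original: "
                + readUtfRoundTrip(s).equals(s));
        }
    }
}
```

For plain ASCII the two encodings coincide, which could explain why such a mismatch goes unnoticed until a chararray contains characters outside the Basic Multilingual Plane (or an embedded U+0000, which modified UTF-8 encodes as two bytes).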
Attachments
Issue Links
- relates to PIG-4914 Add testcase for join with special characters in chararray (Open)
Attaching the patch. The patch is simple, but I have not been able to come up with a test case for it and had postponed uploading it for a long time. I have been struggling with the test case due to two issues.
This patch is important and needs to go into the release. I will create a separate JIRA to add a test case later.