Description
The following snippet was run on the latest types branch (utf8.txt is attached):
A = load 'utf8.txt' using PigStorage() as (t1: chararray);
illustrate A;
It produces this output:
-------------------------------
A | t1: bytearray cn: 1 |
-------------------------------
gabriella?? |
-------------------------------
Three observations:
1) t1 should be typed chararray, not bytearray.
2) "cn: 1" should be removed from the display.
3) The value of t1 is shown as "username??"; the characters after the username are not displayed properly.
Replacing illustrate with dump gives:
A = load 'utf8.txt' using PigStorage() as (t1: chararray);
dump A;
(david?)
(rachel?)
(jessica?)
(sarah?)
(katie?)
(wendy?)
(david?)
(priscilla?)
(oscar?)
(xavier?)
..some more.
The UTF-8 characters that follow each username are not displayed correctly; each one is replaced by a ?.
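For reference, the same symptom can be reproduced in plain Java outside of Pig. The sketch below is only an illustration of one way a ? substitution can arise on the JVM (encoding text with a charset that cannot represent a character); it is an assumption about the kind of cause, not a confirmed diagnosis of this bug. The class QuestionMarkDemo, its methods, and the sample string are hypothetical and are not taken from Pig or from utf8.txt.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Pig.
class QuestionMarkDemo {

    // Round-trip the text through US-ASCII bytes.
    // String.getBytes(Charset) replaces every unmappable character with the
    // charset's replacement byte, which is '?' for US-ASCII.
    static String viaAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Round-trip the text through UTF-8 bytes, which can represent it fully.
    static String viaUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up sample: "david" followed by one non-ASCII character (U+00E9).
        String name = "david\u00e9";

        // Booleans are printed instead of the strings so the result does not
        // depend on the console's own encoding.
        System.out.println(viaAscii(name).equals("david?")); // true
        System.out.println(viaUtf8(name).equals(name));      // true
    }
}
```

If something similar happens anywhere between reading utf8.txt and printing the tuples, the usernames would survive while the trailing characters turn into ?, which matches the dump output above.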