Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Duplicate
Version: 0.7.0
Description
Consider the following data (\t and \n stand for the tab and newline characters):
1\t ( hello , bye ) \n
1\t( hello , bye )a\n
2 \t (good , bye)\n
The following script gives the results below:
a = load 'junk' as (i:int, t:tuple(s:chararray, r:chararray)); dump a;
(1,( hello , bye ))
(1,( hello , bye ))
(2,(good , bye))
The current bytesToTuple implementation discards any leading and trailing characters outside the tuple delimiters and parses the tuple anyway. I think it should instead treat any leading or trailing characters (including spaces) around the delimiters as an indication of a malformed tuple and return null.
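To make the proposed behaviour concrete, here is a minimal sketch of strict tuple parsing. It is not Pig's bytesToTuple code: the class StrictTupleSketch and the method parseTuple are hypothetical names, the input is assumed to be the already-split field text, and only flat tuples of chararray fields are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Pig's actual bytesToTuple implementation.
public class StrictTupleSketch {

    /**
     * Returns the fields of a flat tuple, or null when the text is not
     * exactly '(' ... ')' with nothing, not even a space, before or after.
     */
    public static List<String> parseTuple(String field) {
        if (field == null || field.length() < 2
                || field.charAt(0) != '('
                || field.charAt(field.length() - 1) != ')') {
            return null; // stray leading or trailing characters: malformed
        }
        String body = field.substring(1, field.length() - 1);
        // Assumption: nested values are out of scope here, so reject them.
        if (body.indexOf('(') >= 0 || body.indexOf(')') >= 0) {
            return null;
        }
        List<String> fields = new ArrayList<>();
        if (body.isEmpty()) {
            return fields; // "()" is an empty tuple
        }
        // Interior spaces are kept as part of each chararray field.
        for (String part : body.split(",", -1)) {
            fields.add(part);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Tuple fields taken from the three sample rows above.
        System.out.println(parseTuple(" ( hello , bye ) ")); // prints null
        System.out.println(parseTuple("( hello , bye )a"));  // prints null
        System.out.println(parseTuple(" (good , bye)"));     // prints null
        // A well-formed field parses into two fields.
        System.out.println(parseTuple("(good , bye)").size()); // prints 2
    }
}
```

Under this sketch all three sample rows would yield a null tuple, since each tuple field has a space or another character outside the parentheses.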
Also, in the code, consumeBag() should handle the special case of {} itself rather than delegating it to consumeTuple().
In consumeBag(), null tuples should not be skipped.
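The two consumeBag() points can be illustrated with a similar sketch. Again, this is not Pig's code: StrictBagSketch, parseBag and parseTuple are hypothetical names, the bag is assumed to arrive as a string of flat tuples separated by commas, and a malformed tuple is represented by a null entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Pig's actual consumeBag()/consumeTuple() code.
public class StrictBagSketch {

    /** Flat tuple parser: returns null for anything but exactly "(...)". */
    public static List<String> parseTuple(String text) {
        if (text.length() < 2 || text.charAt(0) != '('
                || text.charAt(text.length() - 1) != ')') {
            return null;
        }
        String body = text.substring(1, text.length() - 1);
        if (body.indexOf('(') >= 0 || body.indexOf(')') >= 0) {
            return null; // nested values are out of scope for this sketch
        }
        if (body.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(body.split(",", -1)));
    }

    /** Returns the bag's tuples, or null when the bag itself is malformed. */
    public static List<List<String>> parseBag(String field) {
        if (field == null || field.length() < 2
                || field.charAt(0) != '{'
                || field.charAt(field.length() - 1) != '}') {
            return null;
        }
        List<List<String>> bag = new ArrayList<>();
        String body = field.substring(1, field.length() - 1);
        if (body.isEmpty()) {
            return bag; // "{}" is handled here, never passed to the tuple parser
        }
        int depth = 0;
        int start = 0;
        for (int i = 0; i <= body.length(); i++) {
            boolean atEnd = (i == body.length());
            char c = atEnd ? ',' : body.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            }
            // Split on commas outside parentheses, and once more at the end.
            if (atEnd || (c == ',' && depth == 0)) {
                // A malformed item becomes a null entry instead of being skipped.
                bag.add(parseTuple(body.substring(start, i)));
                start = i + 1;
            }
        }
        return bag;
    }

    public static void main(String[] args) {
        System.out.println(parseBag("{}").size());                 // prints 0
        System.out.println(parseBag("{(a,b),junk,(c,d)}").size()); // prints 3
    }
}
```

Keeping the null entry preserves the position and count of the tuples in the bag, so a caller can tell that one element was malformed rather than silently seeing a shorter bag.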