Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
None
-
None
-
None
Description
PigStorage allows overriding the default field delimiter ('\t'), but does not allow overriding the record delimiter ('\n').
It is a valid use case that fields contain new lines, e.g. because they are contents of a document/web page. It is possible for the user to create a custom load/store UDF to achieve that, but that is extra work on the user, many users will have to do it , and that udf would be the exact code duplicate of the PigStorage except for the delimiter.
Thus, PigStorage() should allow to configure both field and record separators.