Impala incorrectly stores a string value with newlines as multiple rows, rather than a single row with newlines:
[example.com:21000] > create table try (text string);
Query: create table try (text string)
[example.com:21000] > insert into try values ('foo
Query: insert into try values ('foo
Inserted 1 rows in 2.46s
[example.com:21000] > select * from try;
Query: select * from try
Query finished, fetching results ...
| text |
| foo |
| bar |
| baz |
Returned 3 row(s) in 0.42s
As you can see, Impala reports that it inserted one row, but selecting from the table returns three rows. I looked at the text file generated for this table, and it looks like this:
So I think newlines inside string values are not escaped when the row is written to the text file, which means each embedded newline is later read back as a row delimiter.
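To illustrate the suspected mechanism, here is a minimal Python sketch of a newline-delimited text format. It is not Impala's actual writer or reader code; the function names (`write_rows`, `read_rows`, `unescape`) and the backslash escaping scheme are assumptions made up for this example. It only shows how an unescaped newline turns one value into three rows, and how escaping on write and unescaping on read would keep it as one.

```python
def write_rows(rows, escape=False):
    """Serialize single-column rows as text, one row per line.

    With escape=True, backslashes and embedded newlines are escaped so
    that a newline inside a value is not mistaken for a row delimiter.
    (Hypothetical scheme, for illustration only.)
    """
    out = []
    for value in rows:
        if escape:
            value = value.replace("\\", "\\\\").replace("\n", "\\n")
        out.append(value + "\n")
    return "".join(out)


def unescape(text):
    """Reverse the escaping applied by write_rows(escape=True)."""
    result = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            result.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            result.append(text[i])
            i += 1
    return "".join(result)


def read_rows(data, escaped=False):
    """Split the file contents on newlines, one row per line."""
    lines = data.split("\n")[:-1]  # drop the empty piece after the final newline
    return [unescape(line) for line in lines] if escaped else lines


value = "foo\nbar\nbaz"

# Without escaping, the single inserted value comes back as three rows.
naive_rows = read_rows(write_rows([value]))

# With escaping, the value round-trips as one row with its newlines intact.
escaped_rows = read_rows(write_rows([value], escape=True), escaped=True)
```

Here `naive_rows` has three entries, matching what the select above returns, while `escaped_rows` has a single entry equal to the original string.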