I've been using the /update/csv option to bulk import large amounts of data with great success, but I believe I've found a corner case in the CSV parsing when the field is a multi-valued string field containing a new-line character.
As soon as you specify f.[fieldname].split=true&f.[fieldname].separator=[something], the split parsing for that multi-valued field stops at the first line break.
My managed schema:
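(The original schema attachment isn't included here; the fragment below is a hypothetical reconstruction of the relevant field definitions, using the stock "string"/"strings" field types, based on the field names used in this report.)

```
<field name="test1_strs" type="strings" indexed="true" stored="true" multiValued="true"/>
<field name="test2_strs" type="strings" indexed="true" stored="true" multiValued="true"/>
<field name="test3_str"  type="string"  indexed="true" stored="true"/>
```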
Example POST URL; I'm using ! as the split character for test1_strs and test2_strs:
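(The original URL isn't included here; a request of this shape, with a placeholder host and core name, matches the parameters described above.)

```
http://localhost:8983/solr/mycore/update/csv?commit=true&f.test1_strs.split=true&f.test1_strs.separator=!&f.test2_strs.split=true&f.test2_strs.separator=!
```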
CSV content (notice the new-lines are included but enclosed in double quotes; these new-lines need to be preserved as-is):
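(The original CSV attachment isn't included here; the snippet below is a hypothetical stand-in with the same shape: a quoted multi-valued cell containing both an embedded new-line and the ! separator.)

```
id,test1_strs,test2_strs,test3_str
"1","first line
still first value!second value","a!b","one line
another line"
```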
Resulting Solr Doc:
Note that in the single-valued test3_str the new-line is correctly preserved as \r\n (or just \n when the request is made from code instead of manually).
test2_strs shows that the multi-value split on ! worked correctly.
test1_strs stops processing at the first value's new-line instead of at the actual separator that follows it.
Expected values should look like:
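To make the expected semantics concrete, here's a small Python sketch (the field name and values are hypothetical, not the original test data). The CSV layer should hand the whole quoted cell, embedded new-line included, to the split step, which should then split only on the declared separator:

```python
import csv
import io

# A quoted CSV cell containing both an embedded new-line and the
# '!' separator (hypothetical stand-in for the attached CSV).
raw = 'id,test1_strs\n"1","first line\nstill first value!second value"\n'

row = next(csv.DictReader(io.StringIO(raw)))
cell = row["test1_strs"]

# The CSV layer keeps the embedded new-line inside the quoted cell...
assert "\n" in cell

# ...so splitting on the declared separator should yield two values,
# the first of which still contains its new-line intact.
values = cell.split("!")
print(values)  # ['first line\nstill first value', 'second value']
```

This is the behavior I'd expect from the /update/csv handler: the per-field split should operate on the already-unquoted cell value, after embedded new-lines have been preserved.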
I've tried pre-escaping the line breaks, but all that gives me is the literal escaped new-line stored in Solr, which would need to be post-processed on the consuming end to turn it back into \r\n (or \n) and would be nontrivial to do. Solr handles \n just fine in all other cases, so I consider preserving it here to be the expected behavior.