> if more aggressive unescaping is required - we can always provide unescape UDFs. That would be much better since we could have some standard semantics for the unescaping (json/html/xml etc)
Let's say the content of the file contains something like "\\005", and also "\005". After the unescaping, both of them will be "\005", so a UDF won't be able to unescape it further.
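To make the ambiguity concrete, here is a minimal sketch (my own function name and semantics, assuming an unescaper that only reverses our escaping of "\" and passes unknown sequences through unchanged):

```python
def unescape_minimal(s: str) -> str:
    # Collapse "\\" to "\"; leave any other byte, including an unknown
    # sequence like "\0", exactly as-is.
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s) and s[i + 1] == "\\":
            out.append("\\")
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

print(unescape_minimal("\\\\005"))  # \005
print(unescape_minimal("\\005"))    # \005  -- identical, the distinction is lost
```

Once both inputs have converged on "\005", no downstream UDF can tell which one the file originally contained.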
I think it's OK that the escaping and unescaping are not exactly the same, given that a string escaped using our logic will be unescaped back to the original value.
The extra logic in unescaping is only used to deal with "impossible" cases - the cases that will never happen using our own escaping logic. In those cases we can either throw an error or "guess" what the user means. My guess is that most other escaping logic will escape "\" to "\\" (if they escape anything using "\", they have to escape "\" too), and they don't do special handling based on the actual separator (most of them escape all special characters). So if there actually is a "\005", it is most likely the ASCII code 5.
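That guess could be sketched as an aggressive unescape that decodes three octal digits after a backslash (the function name and regex are my own illustration, not the actual implementation):

```python
import re

def unescape_aggressive(s: str) -> str:
    # "\\" -> "\", and "\ooo" (three octal digits) -> the character with
    # that code, on the guess that a foreign escaper meant an octal code.
    def repl(m):
        body = m.group(1)
        return "\\" if body == "\\" else chr(int(body, 8))
    return re.sub(r"\\(\\|[0-7]{3})", repl, s)

result = unescape_aggressive("\\005")  # the character with ASCII code 5
```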
I am also OK if we just escape (and unescape symmetrically) all characters outside the range of 32-127, plus "\" and all separators. That makes the logic simpler. But the question remains what to do if we see an "impossible" sequence of characters in the escaped stream, e.g. "\040" (0x20, 32, space).
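A sketch of that simpler scheme (the separator set here is an assumption for illustration): everything outside 32-127, plus "\" and the separators, becomes a three-digit octal sequence. Since space (32) is inside the unescaped range, our own escaper would never emit "\040", which is exactly why seeing it in a stream is the "impossible" case:

```python
def escape_simple(s: str, separators: str = "\x01\x02\x03") -> str:
    # Escape "\", all separators, and any character outside 32-127
    # as a backslash followed by three octal digits.
    out = []
    for ch in s:
        code = ord(ch)
        if ch == "\\" or ch in separators or not 32 <= code <= 127:
            out.append("\\%03o" % code)
        else:
            out.append(ch)
    return "".join(out)

print(escape_simple("a\\b\x05c"))  # a\134b\005c
```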