Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
My interpretation of the "string" in quoted_strings_can_be_null is that it refers to the unparsed CSV input string, not the output data type.
So when converting:
"one","two","three"
"1","2","3"
"4","","6"
We should get...
[1, 4], [2, None], [3, 6]
...currently we get...
[1, 4], ['2', None], [3, 6]
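To make the expected semantics concrete, here is a minimal pure-Python sketch of the proposed interpretation. It is not Arrow's implementation: parse_field, convert_column and read_columns are hypothetical helpers, the splitter assumes no embedded commas or newlines, and unquoted empty fields are assumed to always count as null.

```python
def parse_field(raw):
    """Return (text, quoted) for one raw CSV field (hypothetical helper)."""
    quoted = len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"'
    return (raw[1:-1] if quoted else raw), quoted


def convert_column(fields, quoted_strings_can_be_null=True):
    """Infer int vs. string for one column of (text, quoted) pairs."""

    def is_null(text, quoted):
        # Assumption: an unquoted empty field is always null; a quoted
        # empty field is null only when the option is enabled.
        return text == "" and (not quoted or quoted_strings_can_be_null)

    non_null = [text for text, quoted in fields if not is_null(text, quoted)]
    try:
        for text in non_null:
            int(text)
        as_int = True
    except ValueError:
        as_int = False

    out = []
    for text, quoted in fields:
        if is_null(text, quoted):
            out.append(None)
        elif as_int:
            out.append(int(text))
        else:
            out.append(text)
    return out


def read_columns(data, quoted_strings_can_be_null=True):
    """Split simple CSV text into converted columns, skipping the header."""
    rows = [line.split(",") for line in data.splitlines() if line]
    header, body = rows[0], rows[1:]
    return [
        convert_column([parse_field(row[i]) for row in body],
                       quoted_strings_can_be_null)
        for i in range(len(header))
    ]


data = '"one","two","three"\n"1","2","3"\n"4","","6"'
print(read_columns(data))  # [[1, 4], [2, None], [3, 6]]
```

The point of the sketch is that the null check happens on the raw input field before type inference, so a quoted empty field does not force the whole column to string.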
In pandas the above string parses to...
>>> import io, pandas
>>> f = io.BytesIO(b'"one","two","three"\n"1","2","3"\n"4","","6"')
>>> pandas.read_csv(f)
   one  two  three
0    1  2.0      3
1    4  NaN      6
So this change brings us closer to pandas, which is probably a good thing.
Inspired by: https://github.com/apache/arrow/issues/10892