Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 0.17.1
- CPython 3.8.2, MacOS Mojave 10.14.6
Description
I am trying to read a JSON file using an explicit schema, but the schema appears to be ignored. Moreover, if my schema contains a field that is not present in the JSON file, the output table contains all the fields in the JSON file plus the schema fields that were not found in the file.
A minimal example:
import pyarrow as pa
from pyarrow import json

# allowing for type inference
print(json.read_json('tmp.json'))
# prints:
# pyarrow.Table
# foo: string
# baz: string

# using an explicit schema that would read only "foo"
schema = pa.schema([('foo', pa.string())])
print(json.read_json('tmp.json', parse_options=json.ParseOptions(explicit_schema=schema)))
# prints:
# pyarrow.Table
# foo: string
# baz: string

# using an explicit schema that would read only "not_a_field",
# which is not present in the json file
schema = pa.schema([('not_a_field', pa.string())])
print(json.read_json('tmp.json', parse_options=json.ParseOptions(explicit_schema=schema)))
# prints:
# pyarrow.Table
# not_a_field: string
# foo: string
# baz: string
And the tmp.json file looks like:
{"foo": "bar", "baz": "1"}