Apache Arrow / ARROW-10955

[C++] Reading empty json lists results in invalid non-nullable null type

Details

    Description

      We're using Arrow to convert JSON to Parquet, and our JSON occasionally contains empty lists. Reading such JSON into an Arrow table and then writing that table to Parquet currently fails. We noticed this issue in our C++ Arrow code, but it also reproduces from Python.

      Minimal repro:

      input.json:

      {"foo": []}

       

      convert.py:
      import pyarrow.json
      import pyarrow.parquet

      t = pyarrow.json.read_json("input.json")
      pyarrow.parquet.write_table(t, "out.parquet")
       

      Produces:

      Traceback (most recent call last):
        File "repro.py", line 5, in <module>
          pyarrow.parquet.write_table(t, "out.parquet")
        File "env/lib/python3.8/site-packages/pyarrow/parquet.py", line 1717, in write_table
          with ParquetWriter(
        File "env/lib/python3.8/site-packages/pyarrow/parquet.py", line 554, in __init__
          self.writer = _parquet.ParquetWriter(
        File "pyarrow/parquet.pyx", line 1409, in pyarrow._parquet.ParquetWriter.__cinit__
        File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
      pyarrow.lib.ArrowInvalid: NullType Arrow field must be nullable
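
The failure depends on a list field never having an element from which a value type could be inferred. As a rough diagnostic, the sketch below scans line-delimited JSON with only the standard library and reports top-level fields whose values are always empty lists. The helper name `find_untyped_list_fields` is hypothetical and not part of pyarrow, and the code assumes every record is a JSON object on its own line.

```python
import json


def find_untyped_list_fields(lines):
    """Return names of top-level fields whose values are only ever empty lists.

    Hypothetical helper for illustration; assumes each non-blank line is a
    JSON object.
    """
    saw_empty = set()   # fields seen at least once as []
    saw_typed = set()   # fields seen as a non-empty list or another non-null value
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        for name, value in record.items():
            if isinstance(value, list) and not value:
                saw_empty.add(name)
            elif value is not None:
                saw_typed.add(name)
    return sorted(saw_empty - saw_typed)


# The repro input from this report: "foo" never has an element to infer from.
print(find_untyped_list_fields(['{"foo": []}']))  # prints ['foo']

# A later record with elements gives "foo" an inferable element type.
print(find_untyped_list_fields(['{"foo": []}', '{"foo": [1, 2]}']))  # prints []
```

Fields reported by such a check are the ones likely to hit the error above. Possible ways to handle them include dropping the fields or supplying an explicit type before conversion, though whether either avoids the error on an affected version has not been verified here.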

       

            People

              apitrou Antoine Pitrou
              pgoldsborough Peter Goldsborough
              Votes: 0
              Watchers: 5

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 0.5h