  Apache Arrow / ARROW-6006

[C++] Empty IPC streams containing a dictionary are corrupt

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: C++

    Description

       

      #include <memory>

      #include <arrow/api.h>
      #include <arrow/ipc/api.h>
      #include <arrow/io/api.h>
      
      // Abort the process with the error message if a call fails.
      void check(arrow::Status status) {
          if (!status.ok()) {
              status.Abort();
          }
      }
      
      int main() {
          // Schema with a single dictionary-encoded string field.
          auto type = arrow::dictionary(arrow::int8(), arrow::utf8());
          auto f0 = arrow::field("f0", type);
          auto schema = arrow::schema({f0});
      
          // Write a stream that contains the schema but no record batches.
          std::shared_ptr<arrow::io::BufferOutputStream> os;
          check(arrow::io::BufferOutputStream::Create(0, arrow::default_memory_pool(), &os));
      
          std::shared_ptr<arrow::ipc::RecordBatchWriter> writer;
          check(arrow::ipc::RecordBatchStreamWriter::Open(&*os, schema, &writer));
          check(writer->Close());
      
          // Read the stream back from the in-memory buffer.
          std::shared_ptr<arrow::Buffer> buffer;
          check(os->Finish(&buffer));
          arrow::io::BufferReader is(buffer);
      
          std::shared_ptr<arrow::ipc::RecordBatchReader> reader;
          check(arrow::ipc::RecordBatchStreamReader::Open(&is, &reader));
      
          // Attempt to read the first batch; the stream holds none.
          std::shared_ptr<arrow::RecordBatch> batch;
          check(reader->ReadNext(&batch));
      }
      

       

      -- Arrow Fatal Error --
      Invalid: Expected message in stream, was null or length 0

      It seems like this was caused by https://github.com/apache/arrow/commit/e68ca7f9aed876a1afcad81a417afb87c94ee951, which moved the dictionary values from the DataType to the array itself.

      I initially thought I could work around this by writing a zero-length table, but that doesn't seem to work either.
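      The error message suggests that the reader hit the end of the stream while still expecting a message. Since dictionary values now travel with the arrays, a stream with no batches would carry no dictionary for f0. The following is a toy model of that situation and not Arrow's reader: `ToyStreamReader`, `ToyMessage`, `ReadOutcome`, and the `tolerate_empty_stream` flag are all hypothetical names introduced for illustration. It sketches one plausible way a reader could treat a schema-only stream as empty instead of invalid.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model only: these types are hypothetical and are not Arrow classes.
enum class ToyMessage { Dictionary, RecordBatch };
enum class ReadOutcome { Batch, EndOfStream, Invalid };

class ToyStreamReader {
 public:
  // `messages` is everything that follows the schema message in the stream.
  // `num_dictionaries` is how many dictionary fields the schema declares.
  ToyStreamReader(std::vector<ToyMessage> messages, std::size_t num_dictionaries,
                  bool tolerate_empty_stream)
      : messages_(std::move(messages)),
        num_dictionaries_(num_dictionaries),
        tolerate_empty_stream_(tolerate_empty_stream) {}

  ReadOutcome ReadNext() {
    if (!dictionaries_read_) {
      dictionaries_read_ = true;
      for (std::size_t i = 0; i < num_dictionaries_; ++i) {
        if (pos_ == messages_.size()) {
          // The stream ended before dictionary i arrived. If not even the
          // first dictionary was seen, the stream may simply hold no data.
          if (i == 0 && tolerate_empty_stream_) return ReadOutcome::EndOfStream;
          return ReadOutcome::Invalid;
        }
        if (messages_[pos_] != ToyMessage::Dictionary) return ReadOutcome::Invalid;
        ++pos_;
      }
    }
    if (pos_ == messages_.size()) return ReadOutcome::EndOfStream;
    if (messages_[pos_] != ToyMessage::RecordBatch) return ReadOutcome::Invalid;
    ++pos_;
    return ReadOutcome::Batch;
  }

 private:
  std::vector<ToyMessage> messages_;
  std::size_t num_dictionaries_;
  bool tolerate_empty_stream_;
  bool dictionaries_read_ = false;
  std::size_t pos_ = 0;
};

// Usage: a schema-only stream with one dictionary field.
void CheckEmptyStreamBehaviour() {
  ToyStreamReader strict({}, 1, /*tolerate_empty_stream=*/false);
  assert(strict.ReadNext() == ReadOutcome::Invalid);  // mirrors the reported failure

  ToyStreamReader tolerant({}, 1, /*tolerate_empty_stream=*/true);
  assert(tolerant.ReadNext() == ReadOutcome::EndOfStream);  // treated as empty
}
```

      In this sketch a stream that ends after some but not all of the expected dictionaries is still reported as invalid, so only the "no data at all" case is relaxed.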

       

            People

              Assignee: Wes McKinney (wesm)
              Reporter: Steven Fackler (sfackler)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 40m