Apache Arrow / ARROW-8749

[C++] IpcFormatWriter writes dictionary batches with wrong ID

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0, 0.17.0
    • Fix Version/s: 2.0.0
    • Component/s: C++
    • Labels: None

      Description

      IpcFormatWriter assigns dictionary IDs once, when it writes the schema message. When it later writes the dictionary batches, it re-collects the dictionaries from the given record batch and assigns IDs a second time. As a result, the IDs in the dictionary batches do not match the IDs in the schema. For example, with 5 dictionaries, the first dictionary is assigned ID 0 in the schema, but its dictionary batch is written with ID 5.

      The following test reproduces the problem. It fails with "'_error_or_value11.status()' failed with Key error: No record of dictionary type with id 9":

      TEST_F(TestMetadata, DoPutDictionaries) {
        // Write the schema and all dictionary-encoded batches to an in-memory stream.
        ASSERT_OK_AND_ASSIGN(auto sink, arrow::io::BufferOutputStream::Create());
        std::shared_ptr<Schema> schema = ExampleDictSchema();
        BatchVector expected_batches;
        ASSERT_OK(ExampleDictBatches(&expected_batches));
        ASSERT_OK_AND_ASSIGN(auto writer, arrow::ipc::NewStreamWriter(sink.get(), schema));
        for (auto& batch : expected_batches) {
          ASSERT_OK(writer->WriteRecordBatch(*batch));
        }
        // Read the stream back and compare it with what was written.
        ASSERT_OK_AND_ASSIGN(auto buf, sink->Finish());
        arrow::io::BufferReader source(buf);
        ASSERT_OK_AND_ASSIGN(auto reader, arrow::ipc::RecordBatchStreamReader::Open(&source));
        AssertSchemaEqual(schema, reader->schema());
        for (auto& batch : expected_batches) {
          // Reading a batch is where the dictionary ID lookup fails.
          ASSERT_OK_AND_ASSIGN(auto actual, reader->Next());
          AssertBatchesEqual(*actual, *batch);
        }
      }
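
      The double assignment described above can be illustrated with a small standalone model. This is only a sketch and does not use Arrow: ToyDictionaryMemo, CollectIds, SimulateWriter and CheckToyModel are hypothetical names invented for illustration, and the "reuse" path shows one possible remedy, not necessarily how the issue was actually fixed. It assumes five dictionary fields, matching the example in the description.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the writer's dictionary bookkeeping (not an Arrow class).
class ToyDictionaryMemo {
 public:
  // Hands out the next unused ID on every call, even for a field seen before.
  int64_t AssignFreshId(const std::string& field) {
    const int64_t id = next_id_++;
    ids_[field] = id;
    return id;
  }

  // Reuses the ID already recorded for a field; assigns one only on first sight.
  int64_t GetOrAssignId(const std::string& field) {
    auto it = ids_.find(field);
    if (it != ids_.end()) return it->second;
    return AssignFreshId(field);
  }

 private:
  int64_t next_id_ = 0;
  std::unordered_map<std::string, int64_t> ids_;
};

// One pass over the dictionary fields, as done for the schema and again for a batch.
std::vector<int64_t> CollectIds(ToyDictionaryMemo& memo,
                                const std::vector<std::string>& fields, bool reuse) {
  std::vector<int64_t> out;
  for (const auto& f : fields) {
    out.push_back(reuse ? memo.GetOrAssignId(f) : memo.AssignFreshId(f));
  }
  return out;
}

struct WriterIds {
  std::vector<int64_t> schema_ids;  // IDs recorded when the schema is written
  std::vector<int64_t> batch_ids;   // IDs stamped on the dictionary batches
};

// Models a writer that collects dictionaries twice: once for the schema and
// once when the first record batch arrives.
WriterIds SimulateWriter(bool reuse_ids_on_second_pass) {
  const std::vector<std::string> fields = {"a", "b", "c", "d", "e"};
  ToyDictionaryMemo memo;
  WriterIds ids;
  ids.schema_ids = CollectIds(memo, fields, /*reuse=*/false);
  ids.batch_ids = CollectIds(memo, fields, reuse_ids_on_second_pass);
  return ids;
}

// Checks the behaviour described in the report against the toy model.
void CheckToyModel() {
  const WriterIds buggy = SimulateWriter(/*reuse_ids_on_second_pass=*/false);
  assert(buggy.schema_ids.front() == 0);  // schema says the first dictionary is ID 0
  assert(buggy.batch_ids.front() == 5);   // but its batch is written with ID 5

  const WriterIds consistent = SimulateWriter(/*reuse_ids_on_second_pass=*/true);
  assert(consistent.schema_ids == consistent.batch_ids);
}
```

      In this model, a reader that only knows the schema's IDs (0 through 4) would find no entry for any ID written by the second pass (5 through 9), which is the kind of mismatch the key error above reports.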

              People

              • Assignee: Antoine Pitrou (apitrou)
              • Reporter: David Li (lidavidm)