Apache Arrow / ARROW-9969

[C++] RecordBatchBuilder yields invalid result with dictionary fields

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.1
    • Fix Version/s: 2.0.0
    • Component/s: C++

    Description

      The record batch builder takes a schema as input and uses that schema when creating the record batch.

      However, when one or more fields are dictionary-typed, the actual data type is not known until the dictionary builder is flushed, so it often does not match the initial schema. The builder needs to update the schema to reflect the data type actually produced.

      This problem is easily reproduced by providing a schema with a field of type dictionary(int16(), utf8()) and adding a single row. The resulting column has type dictionary(int8(), utf8()) instead.
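
      The mismatch can be illustrated with a small model that does not use Arrow itself. It assumes the dictionary builder picks the narrowest signed integer type able to hold the indices seen so far, which is consistent with the int8 result reported above but is not confirmed in this issue. MinimalIndexBits, DictionaryTypeName and SchemaMismatch are hypothetical helpers written for this sketch, not Arrow APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical helper (not an Arrow API): the narrowest signed integer width,
// in bits, able to hold dictionary indices 0 .. num_values - 1.
// Assumption: the builder chooses the index type this way at flush time.
inline int MinimalIndexBits(std::int64_t num_values) {
  assert(num_values >= 0);
  const std::int64_t max_index = num_values > 0 ? num_values - 1 : 0;
  if (max_index <= std::numeric_limits<std::int8_t>::max()) return 8;
  if (max_index <= std::numeric_limits<std::int16_t>::max()) return 16;
  if (max_index <= std::numeric_limits<std::int32_t>::max()) return 32;
  return 64;
}

// Hypothetical helper: spell the type in the notation used in this issue.
inline std::string DictionaryTypeName(int index_bits) {
  return "dictionary(int" + std::to_string(index_bits) + "(), utf8())";
}

// True when the index width declared in the schema differs from the width
// the model would produce for the given number of distinct values.
inline bool SchemaMismatch(int declared_bits, std::int64_t num_values) {
  return declared_bits != MinimalIndexBits(num_values);
}
```

      Under this model, a schema declaring int16 indices with a single distinct value gives SchemaMismatch(16, 1) == true, matching the reported case. The declared and produced types only agree once the dictionary holds more than 128 distinct values.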

            People

              Assignee: Troels Nielsen (troels)
              Reporter: Pierre Belzile (belzilep)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 20m