Apache Arrow / ARROW-17459

[C++] Support nested data conversions for chunked array

Details

    • Type: New Feature
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++
    • Labels: None

    Description

      `FileReaderImpl::ReadRowGroup` fails with the error "Nested data conversions not implemented for chunked array outputs". The error is raised in [ChunksToSingle](https://github.com/apache/arrow/blob/7f6b074b84b1ca519b7c5fc7da318e8d47d44278/cpp/src/parquet/arrow/reader.cc#L95).
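
To make the failure mode concrete, here is a minimal, self-contained sketch of the check as I understand it from the linked line: a column that decodes to zero or one chunk is accepted, and anything with more than one chunk is rejected with the error above. `Array`, `ChunkedArray`, and `ChunksToSingleSketch` are hypothetical stand-ins written for illustration. They are not the Arrow types or the Arrow implementation, and the real function reports a `Status` rather than throwing.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for arrow::Array: one contiguous run of values.
struct Array {
  std::vector<std::string> values;
};

// Hypothetical stand-in for arrow::ChunkedArray: a column split into chunks.
struct ChunkedArray {
  std::vector<std::shared_ptr<Array>> chunks;
};

// Simplified model of the check (assumption: mirrors the linked code's logic).
// A nested reader (map/list/struct) needs its child values as ONE array,
// so a child column that was decoded into several chunks is rejected.
std::shared_ptr<Array> ChunksToSingleSketch(const ChunkedArray& chunked) {
  switch (chunked.chunks.size()) {
    case 0:
      // No chunks: hand back an empty array.
      return std::make_shared<Array>();
    case 1:
      // Exactly one chunk: use it as-is.
      return chunked.chunks[0];
    default:
      // More than one chunk: this is the path that produces the reported error.
      throw std::runtime_error(
          "Nested data conversions not implemented for chunked array outputs");
  }
}
```

Under this reading, the error does not depend on the schema alone: it appears only when the `key` or `value` column of `fields_map` comes back from the decoder as more than one chunk for the row group being read.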

      The data schema is:

        optional group fields_map (MAP) = 217 {
          repeated group key_value {
            required binary key (STRING) = 218;
            optional binary value (STRING) = 219;
          }
        }
      fields_map.key_value.value-> Size In Bytes: 13243589 Size In Ratio: 0.20541047
      fields_map.key_value.key-> Size In Bytes: 3008860 Size In Ratio: 0.046667963
      

      Is there a way to work around this issue in the C++ library?

      In any case, I am willing to implement this, but I would need some guidance: I am very new to Parquet (I only started reading about it yesterday).

       

      Probably related to: https://issues.apache.org/jira/browse/ARROW-10958

            People

              arthurpassos Arthur Passos